Commit f92bfca

[AArch64] All bits of an exact right shift are demanded (#97448)
When building a vector that contains zero elements, AArch64 ISel replaces those elements with `undef` if a right shift shifts them out. However, if the right shift is exact, these elements must stay zero; otherwise we would introduce undefined behavior. This should allow #92528 to be recommitted.
1 parent 4339d2e commit f92bfca
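
To make the reasoning in the commit message concrete, here is a minimal sketch in plain C++ (C++17). It is not LLVM code: `lshrExact` and `buildLane` are hypothetical helpers, `std::nullopt` stands in for a poison result, and only the logical shift is modelled (the arithmetic case follows the same rule for shifted-out bits). It assumes a 32-bit lane and a shift amount below 32.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model, not LLVM code: std::nullopt stands in for poison.
// An `exact` logical right shift is only well defined when every
// shifted-out bit is zero. Assumes 0 < Shift < 32.
constexpr std::optional<uint32_t> lshrExact(uint32_t Value, unsigned Shift) {
  const uint32_t ShiftedOut = Value & ((uint32_t{1} << Shift) - 1);
  if (ShiftedOut != 0)
    return std::nullopt; // a set bit was shifted out
  return Value >> Shift;
}

// Builds one 32-bit lane: the payload byte lands in the top byte, and the
// three low bytes come from LowBytes (zero in the original IR).
constexpr uint32_t buildLane(uint8_t Payload, uint32_t LowBytes) {
  return (uint32_t{Payload} << 24) | (LowBytes & 0x00FFFFFFu);
}

// With zero low bytes, the exact shift yields the payload.
static_assert(lshrExact(buildLane(0xAB, 0), 24) == 0xABu);

// If the zero bytes are replaced by arbitrary contents (what `undef`
// permits), the same exact shift may no longer be well defined.
static_assert(!lshrExact(buildLane(0xAB, 0x000001u), 24).has_value());
```

In this model the low three bytes only look irrelevant for a plain shift; once the shift carries `exact`, their value decides whether the result is defined at all.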

2 files changed: +39, -0 lines changed

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Lines changed: 4 additions & 0 deletions
@@ -22142,6 +22142,10 @@ static SDValue performVectorShiftCombine(SDNode *N,
   if (DCI.DAG.ComputeNumSignBits(Op.getOperand(0)) > ShiftImm)
     return Op.getOperand(0);
 
+  // If the shift is exact, the shifted out bits matter.
+  if (N->getFlags().hasExact())
+    return SDValue();
+
   APInt ShiftedOutBits = APInt::getLowBitsSet(OpScalarSize, ShiftImm);
   APInt DemandedMask = ~ShiftedOutBits;
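
The mask arithmetic in the hunk above can be illustrated with plain integers. This is only a sketch under stated assumptions: `lowBitsSet` is a hypothetical stand-in for `APInt::getLowBitsSet` restricted to 32-bit elements with `0 < ShiftImm < 32`, and `demandedMask` is an illustrative helper that expresses the commit title ("all bits are demanded") as a mask. The actual patch simply bails out of the combine with `return SDValue();` instead of computing a mask for exact shifts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for APInt::getLowBitsSet(32, ShiftImm).
// Assumes 0 < ShiftImm < 32.
constexpr uint32_t lowBitsSet(unsigned ShiftImm) {
  return (uint32_t{1} << ShiftImm) - 1;
}

// Illustrative helper, not part of the patch: which bits of the shift
// operand matter to the result.
constexpr uint32_t demandedMask(unsigned ShiftImm, bool IsExact) {
  if (IsExact)
    return 0xFFFFFFFFu;         // exact: shifted-out bits matter too
  return ~lowBitsSet(ShiftImm); // otherwise only the surviving bits
}

// For the shift by 24 used in the test, only the top byte is demanded
// unless the shift is exact.
static_assert(lowBitsSet(24) == 0x00FFFFFFu);
static_assert(demandedMask(24, /*IsExact=*/false) == 0xFF000000u);
static_assert(demandedMask(24, /*IsExact=*/true) == 0xFFFFFFFFu);
```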

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s | FileCheck %s
+target triple = "aarch64-linux"
+
+define <2 x i32> @f(i8 %0, i8 %1) {
+; CHECK-LABEL: f:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    movi v0.2d, #0000000000000000
+; CHECK-NEXT:    mov v0.b[3], w0
+; CHECK-NEXT:    mov v0.b[7], w1
+; CHECK-NEXT:    sshr v0.2s, v0.2s, #24
+; CHECK-NEXT:    ret
+  %3 = insertelement <2 x i8> poison, i8 %0, i64 0
+  %4 = insertelement <2 x i8> %3, i8 %1, i64 1
+  %5 = shufflevector <2 x i8> %4, <2 x i8> <i8 0, i8 poison>, <8 x i32> <i32 2, i32 2, i32 2, i32 0, i32 2, i32 2, i32 2, i32 1>
+  %6 = bitcast <8 x i8> %5 to <2 x i32>
+  %7 = ashr exact <2 x i32> %6, <i32 24, i32 24>
+  ret <2 x i32> %7
+}
+
+define <2 x i32> @g(i8 %0, i8 %1) {
+; CHECK-LABEL: g:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    movi v0.2d, #0000000000000000
+; CHECK-NEXT:    mov v0.b[3], w0
+; CHECK-NEXT:    mov v0.b[7], w1
+; CHECK-NEXT:    ushr v0.2s, v0.2s, #24
+; CHECK-NEXT:    ret
+  %3 = insertelement <2 x i8> poison, i8 %0, i64 0
+  %4 = insertelement <2 x i8> %3, i8 %1, i64 1
+  %5 = shufflevector <2 x i8> %4, <2 x i8> <i8 0, i8 poison>, <8 x i32> <i32 2, i32 2, i32 2, i32 0, i32 2, i32 2, i32 2, i32 1>
+  %6 = bitcast <8 x i8> %5 to <2 x i32>
+  %7 = lshr exact <2 x i32> %6, <i32 24, i32 24>
+  ret <2 x i32> %7
+}
