[DAG] computeKnownBits - abds(x, y) will be zero in the upper bits if x and y are sign-extended #94448

Merged: RKSimon merged 2 commits into llvm:main from abds_zero_upper on Jun 5, 2024

RKSimon (Collaborator) commented Jun 5, 2024

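As reported on #94442 - if x and y both have more than one sign bit, then the upper bits of their absolute difference abds(x, y) are guaranteed to be zero: with N the smaller of the two sign-bit counts, the top N - 1 bits are known zero.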

Sibling PR to #94382

Alive2: https://alive2.llvm.org/ce/z/7_z2Vc

Fixes #94442
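
The bound can be checked by brute force at a small width. The sketch below is illustrative only and is not part of the patch: `numSignBits`, `abds8`, `highBitsMask` and `upperBitsAlwaysZero` are hypothetical helpers that model 8-bit values, whereas the real change works on `KnownBits` and `ComputeNumSignBits` in SelectionDAG. The reasoning it exercises: if both operands have at least N sign bits in a W-bit type, each lies in [-2^(W-N), 2^(W-N) - 1], so their absolute difference is below 2^(W-N+1) and its top N - 1 bits are zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Number of leading bits equal to the sign bit, counting the sign bit itself.
// Hypothetical exact 8-bit model; ComputeNumSignBits only gives a lower bound.
unsigned numSignBits(int8_t v) {
  uint8_t u = static_cast<uint8_t>(v);
  unsigned sign = u >> 7;
  unsigned n = 1;
  while (n < 8 && ((u >> (7 - n)) & 1u) == sign)
    ++n;
  return n;
}

// Signed absolute difference: computed in a wider type, truncated to 8 bits.
uint8_t abds8(int8_t x, int8_t y) {
  return static_cast<uint8_t>(std::abs(int(x) - int(y)));
}

// Mask with the top n bits set, for n in [0, 8].
uint8_t highBitsMask(unsigned n) {
  return static_cast<uint8_t>(0xFF00u >> n);
}

// Exhaustively checks that the top min(signbits(x), signbits(y)) - 1 bits
// of abds8(x, y) are zero for every pair of 8-bit inputs.
bool upperBitsAlwaysZero() {
  for (int x = -128; x <= 127; ++x)
    for (int y = -128; y <= 127; ++y) {
      auto sx = static_cast<int8_t>(x), sy = static_cast<int8_t>(y);
      unsigned n = std::min(numSignBits(sx), numSignBits(sy));
      if (abds8(sx, sy) & highBitsMask(n - 1))
        return false;
    }
  return true;
}
```

`upperBitsAlwaysZero()` walks all 65536 operand pairs and returns true when the property holds for each of them. The subtraction of 1 matters: `abds8(-128, 127)` is 255, so operands with a single sign bit give no known-zero bits at all.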

llvmbot added the backend:AArch64 and llvm:SelectionDAG labels Jun 5, 2024
llvmbot (Member) commented Jun 5, 2024

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-aarch64

Author: Simon Pilgrim (RKSimon)

Full diff: https://github.com/llvm/llvm-project/pull/94448.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+5)
  • (modified) llvm/test/CodeGen/AArch64/neon-abd.ll (+34)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 414c724b94f7b..698f4e0f9fc41 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -3477,6 +3477,11 @@ KnownBits SelectionDAG::computeKnownBits(SDValue Op, const APInt &DemandedElts,
     Known = computeKnownBits(Op.getOperand(1), DemandedElts, Depth + 1);
     Known2 = computeKnownBits(Op.getOperand(0), DemandedElts, Depth + 1);
     Known = KnownBits::abds(Known, Known2);
+    Known.Zero.setHighBits(
+        std::min(
+            ComputeNumSignBits(Op.getOperand(0), DemandedElts, Depth + 1),
+            ComputeNumSignBits(Op.getOperand(1), DemandedElts, Depth + 1)) -
+        1);
     break;
   }
   case ISD::UMUL_LOHI: {
diff --git a/llvm/test/CodeGen/AArch64/neon-abd.ll b/llvm/test/CodeGen/AArch64/neon-abd.ll
index 901cb8adc23f0..f743bae84053d 100644
--- a/llvm/test/CodeGen/AArch64/neon-abd.ll
+++ b/llvm/test/CodeGen/AArch64/neon-abd.ll
@@ -554,6 +554,40 @@ define <16 x i8> @umaxmin_v16i8_com1(<16 x i8> %0, <16 x i8> %1) {
   ret <16 x i8> %sub
 }
 
+; (abds x, y) upper bits are known zero if x and y have extra sign bits
+define <4 x i16> @combine_sabd_4h_zerosign(<4 x i16> %a, <4 x i16> %b) #0 {
+; CHECK-LABEL: combine_sabd_4h_zerosign:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    movi v0.2d, #0000000000000000
+; CHECK-NEXT:    ret
+  %a.ext = ashr <4 x i16> %a, <i16 7, i16 8, i16 9, i16 10>
+  %b.ext = ashr <4 x i16> %b, <i16 11, i16 12, i16 13, i16 14>
+  %max = tail call <4 x i16> @llvm.smax.v4i16(<4 x i16> %a.ext, <4 x i16> %b.ext)
+  %min = tail call <4 x i16> @llvm.smin.v4i16(<4 x i16> %a.ext, <4 x i16> %b.ext)
+  %sub = sub <4 x i16> %max, %min
+  %mask = and <4 x i16> %sub, <i16 32768, i16 32768, i16 32768, i16 32768>
+  ret <4 x i16> %mask
+}
+
+; negative test - mask extends beyond known zero bits
+define <2 x i32> @combine_sabd_2s_zerosign_negative(<2 x i32> %a, <2 x i32> %b) {
+; CHECK-LABEL: combine_sabd_2s_zerosign_negative:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    sshr v0.2s, v0.2s, #3
+; CHECK-NEXT:    sshr v1.2s, v1.2s, #15
+; CHECK-NEXT:    mvni v2.2s, #7, msl #16
+; CHECK-NEXT:    sabd v0.2s, v0.2s, v1.2s
+; CHECK-NEXT:    and v0.8b, v0.8b, v2.8b
+; CHECK-NEXT:    ret
+  %a.ext = ashr <2 x i32> %a, <i32 3, i32 3>
+  %b.ext = ashr <2 x i32> %b, <i32 15, i32 15>
+  %max = tail call <2 x i32> @llvm.smax.v2i32(<2 x i32> %a.ext, <2 x i32> %b.ext)
+  %min = tail call <2 x i32> @llvm.smin.v2i32(<2 x i32> %a.ext, <2 x i32> %b.ext)
+  %sub = sub <2 x i32> %max, %min
+  %mask = and <2 x i32> %sub, <i32 -524288, i32 -524288> ; 0xFFF80000
+  ret <2 x i32> %mask
+}
+
 declare <8 x i8> @llvm.abs.v8i8(<8 x i8>, i1)
 declare <16 x i8> @llvm.abs.v16i8(<16 x i8>, i1)
 

Comment on lines 3482 to 3483
ComputeNumSignBits(Op.getOperand(0), DemandedElts, Depth + 1),
ComputeNumSignBits(Op.getOperand(1), DemandedElts, Depth + 1)) -
Contributor review comment:

Maybe do operand(1) first since it is supposed to be canonically simpler? Maybe skip the second call if the first call returned 1?

Contributor review comment:

But the logic looks sound to me.
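
A minimal sketch of the short-circuit suggested in the review, under stated assumptions: `minSignBitsShortCircuit` is a hypothetical helper, and the two callables stand in for the `ComputeNumSignBits` calls on operand 1 and operand 0. The diff shown here calls both unconditionally; whether the merged commit adopted the suggestion is not visible in this thread.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not LLVM API. `first` models the query on the operand
// expected to be simpler (operand 1); `second` models the other query.
// Each callable returns a sign-bit count that is at least 1.
template <typename FirstFn, typename SecondFn>
unsigned minSignBitsShortCircuit(FirstFn first, SecondFn second) {
  unsigned n = first();
  if (n == 1)
    return 1; // min(1, anything) is 1, so the second query cannot help
  return std::min(n, second());
}
```

With this shape the known-zero count is still the result minus one, so a return value of 1 means no high bits are set in `Known.Zero` and the second recursive query is avoided.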

RKSimon added 2 commits June 5, 2024 10:49
… x and y are sign-extended

As reported on llvm#94442 - if x and y both have more than one sign bit, then the upper bits of their absolute difference are guaranteed to be zero

Alive2: https://alive2.llvm.org/ce/z/7_z2Vc
@RKSimon RKSimon force-pushed the abds_zero_upper branch from a85a14a to f6add38 Compare June 5, 2024 09:51
@RKSimon RKSimon merged commit 54b20cb into llvm:main Jun 5, 2024
7 checks passed
@RKSimon RKSimon deleted the abds_zero_upper branch June 5, 2024 10:58
Linked issue: [DAG] computeKnownBits - ISD::ABDS is zero in the high bits if the input has multiple sign bits