[InstCombine] Handle trunc i1 pattern in eq-of-parts fold #112704

nikic · 2024-10-17T13:05:52Z

Equality/inequality of the low bit can be represented by (trunc (xor x, y) to i1), possibly with an extra not. We have to handle this in the eq-of-parts fold now that we no longer canonicalize this to a masked icmp.

Proofs: https://alive2.llvm.org/ce/z/qidkzq

Fixes #110919.

llvmbot · 2024-10-17T13:06:30Z

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

Equality/inequality of the low bit can be represented by (trunc (xor x, y) to i1), possibly with an extra not. We have to handle this in the eq-of-parts fold now that we no longer canonicalize this to a masked icmp.

Proofs: https://alive2.llvm.org/ce/z/qidkzq

Fixes #110919.

Full diff: https://github.com/llvm/llvm-project/pull/112704.diff

3 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (+17-6)
(modified) llvm/lib/Transforms/InstCombine/InstCombineInternal.h (+1-1)
(modified) llvm/test/Transforms/InstCombine/eq-of-parts.ll (+2-9)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 8112255a0b6c45..629ac859c06e35 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -1185,14 +1185,25 @@ static Value *extractIntPart(const IntPart &P, IRBuilderBase &Builder) {
 /// (icmp eq X0, Y0) & (icmp eq X1, Y1) -> icmp eq X01, Y01
 /// (icmp ne X0, Y0) | (icmp ne X1, Y1) -> icmp ne X01, Y01
 /// where X0, X1 and Y0, Y1 are adjacent parts extracted from an integer.
-Value *InstCombinerImpl::foldEqOfParts(ICmpInst *Cmp0, ICmpInst *Cmp1,
-                                       bool IsAnd) {
+Value *InstCombinerImpl::foldEqOfParts(Value *Cmp0, Value *Cmp1, bool IsAnd) {
   if (!Cmp0->hasOneUse() || !Cmp1->hasOneUse())
     return nullptr;
 
   CmpInst::Predicate Pred = IsAnd ? CmpInst::ICMP_EQ : CmpInst::ICMP_NE;
-  auto GetMatchPart = [&](ICmpInst *Cmp,
+  auto GetMatchPart = [&](Value *CmpV,
                           unsigned OpNo) -> std::optional<IntPart> {
+    Value *X, *Y;
+    // icmp ne (and x, 1), (and y, 1) <=> trunc (xor x, y) to i1
+    // icmp eq (and x, 1), (and y, 1) <=> not (trunc (xor x, y) to i1)
+    if (Pred == CmpInst::ICMP_NE
+            ? match(CmpV, m_Trunc(m_Xor(m_Value(X), m_Value(Y))))
+            : match(CmpV, m_Not(m_Trunc(m_Xor(m_Value(X), m_Value(Y))))))
+      return {{OpNo == 0 ? X : Y, 0, 1}};
+
+    auto *Cmp = dyn_cast<ICmpInst>(CmpV);
+    if (!Cmp)
+      return std::nullopt;
+
     if (Pred == Cmp->getPredicate())
       return matchIntPart(Cmp->getOperand(OpNo));
 
@@ -3413,9 +3424,6 @@ Value *InstCombinerImpl::foldAndOrOfICmps(ICmpInst *LHS, ICmpInst *RHS,
       return X;
   }
 
-  if (Value *X = foldEqOfParts(LHS, RHS, IsAnd))
-    return X;
-
   // (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0)
   // (icmp eq A, 0) & (icmp eq B, 0) --> (icmp eq (A|B), 0)
   // TODO: Remove this and below when foldLogOpOfMaskedICmps can handle undefs.
@@ -3538,6 +3546,9 @@ Value *InstCombinerImpl::foldBooleanAndOr(Value *LHS, Value *RHS,
       if (Value *Res = foldLogicOfFCmps(LHSCmp, RHSCmp, IsAnd, IsLogical))
         return Res;
 
+  if (Value *Res = foldEqOfParts(LHS, RHS, IsAnd))
+    return Res;
+
   return nullptr;
 }
 
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
index 7a060cdab2d37d..6c38725939f2a6 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+++ b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
@@ -411,7 +411,7 @@ class LLVM_LIBRARY_VISIBILITY InstCombinerImpl final
                           bool IsAnd, bool IsLogical = false);
   Value *foldXorOfICmps(ICmpInst *LHS, ICmpInst *RHS, BinaryOperator &Xor);
 
-  Value *foldEqOfParts(ICmpInst *Cmp0, ICmpInst *Cmp1, bool IsAnd);
+  Value *foldEqOfParts(Value *Cmp0, Value *Cmp1, bool IsAnd);
 
   Value *foldAndOrOfICmpsUsingRanges(ICmpInst *ICmp1, ICmpInst *ICmp2,
                                      bool IsAnd);
diff --git a/llvm/test/Transforms/InstCombine/eq-of-parts.ll b/llvm/test/Transforms/InstCombine/eq-of-parts.ll
index afe5d6af1fcd49..5bcfef812c6bb6 100644
--- a/llvm/test/Transforms/InstCombine/eq-of-parts.ll
+++ b/llvm/test/Transforms/InstCombine/eq-of-parts.ll
@@ -1441,11 +1441,7 @@ define i1 @ne_optimized_highbits_cmp_todo_overlapping(i32 %x, i32 %y) {
 
 define i1 @and_trunc_i1(i8 %a1, i8 %a2) {
 ; CHECK-LABEL: @and_trunc_i1(
-; CHECK-NEXT:    [[XOR:%.*]] = xor i8 [[A1:%.*]], [[A2:%.*]]
-; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[XOR]], 2
-; CHECK-NEXT:    [[LOBIT:%.*]] = trunc i8 [[XOR]] to i1
-; CHECK-NEXT:    [[LOBIT_INV:%.*]] = xor i1 [[LOBIT]], true
-; CHECK-NEXT:    [[AND:%.*]] = and i1 [[CMP]], [[LOBIT_INV]]
+; CHECK-NEXT:    [[AND:%.*]] = icmp eq i8 [[A1:%.*]], [[A2:%.*]]
 ; CHECK-NEXT:    ret i1 [[AND]]
 ;
   %xor = xor i8 %a1, %a2
@@ -1494,10 +1490,7 @@ define i1 @and_trunc_i1_wrong_operands(i8 %a1, i8 %a2, i8 %a3) {
 
 define i1 @or_trunc_i1(i64 %a1, i64 %a2) {
 ; CHECK-LABEL: @or_trunc_i1(
-; CHECK-NEXT:    [[XOR:%.*]] = xor i64 [[A2:%.*]], [[A1:%.*]]
-; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i64 [[XOR]], 1
-; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i64 [[XOR]] to i1
-; CHECK-NEXT:    [[OR:%.*]] = or i1 [[CMP]], [[TRUNC]]
+; CHECK-NEXT:    [[OR:%.*]] = icmp ne i64 [[A2:%.*]], [[A1:%.*]]
 ; CHECK-NEXT:    ret i1 [[OR]]
 ;
   %xor = xor i64 %a2, %a1

goldsteinn · 2024-10-17T14:15:32Z

LGTM (edit: assuming) no regressions on Yingwei's benchmarks.

Also note, some further missing patterns: https://godbolt.org/z/b79Gnrxnx

PR Link: llvm/llvm-project#112704

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

We currently support simple reassociation for foldAndOrOfICmps(). Support the same for foldLogicOfFCmps() by going through the common foldBooleanAndOr() helper. This will also resolve the regression on llvm#112704, which is also due to missing reassoc support. I had to adjust one fold to add support for FMF flag preservation, otherwise there would be test regressions. There is a separate fold (reassociateFCmps) handling reassociation for *just* that specific case and it preserves FMF. Unfortuantely it's not rendered entirely redundant by this patch, because it handles one more level of reassociation as well.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

We currently support simple reassociation for foldAndOrOfICmps(). Support the same for foldLogicOfFCmps() by going through the common foldBooleanAndOr() helper. This will also resolve the regression on llvm#112704, which is also due to missing reassoc support. I had to adjust one fold to add support for FMF flag preservation, otherwise there would be test regressions. There is a separate fold (reassociateFCmps) handling reassociation for *just* that specific case and it preserves FMF. Unfortuantely it's not rendered entirely redundant by this patch, because it handles one more level of reassociation as well.

We currently support simple reassociation for foldAndOrOfICmps(). Support the same for foldLogicOfFCmps() by going through the common foldBooleanAndOr() helper. This will also resolve the regression on #112704, which is also due to missing reassoc support. I had to adjust one fold to add support for FMF flag preservation, otherwise there would be test regressions. There is a separate fold (reassociateFCmps) handling reassociation for *just* that specific case and it preserves FMF. Unfortunately it's not rendered entirely redundant by this patch, because it handles one more level of reassociation as well.

To guard against regression from #112704.

Equality/inequality of the low bit can be represented by `(trunc (xor x, y) to i1)`, possibly with an extra not. We have to handle this in the eq-of-parts fold now that we no longer canonicalize this to a masked icmp. Proofs: https://alive2.llvm.org/ce/z/qidkzq Fixes llvm#110919.

nikic requested review from dtcxzyw and goldsteinn October 17, 2024 13:05

llvmbot added the llvm:transforms label Oct 17, 2024

dtcxzyw mentioned this pull request Oct 17, 2024

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

goldsteinn approved these changes Oct 17, 2024

View reviewed changes

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Oct 17, 2024

pre-commit: test PR112704

e9ca818

PR Link: llvm/llvm-project#112704

dtcxzyw mentioned this pull request Oct 17, 2024

pre-commit: test PR112704 dtcxzyw/llvm-opt-benchmark#1511

Closed

dtcxzyw reviewed Oct 18, 2024

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp Show resolved Hide resolved

nikic mentioned this pull request Nov 13, 2024

[InstCombine] Support reassoc for foldLogicOfFCmps #116065

Merged

andjo403 reviewed Nov 16, 2024

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp Show resolved Hide resolved

nikic added a commit that referenced this pull request Nov 25, 2024

[InstCombine] Add extra test for eq of parts fold (NFC)

321fe74

To guard against regression from #112704.

nikic force-pushed the instcombine-eq-parts-trunc branch from a9402ae to dd6eff5 Compare November 25, 2024 09:31

llvmbot added the llvm:instcombine label Nov 25, 2024

add assert

7117fbb

dtcxzyw mentioned this pull request Nov 25, 2024

pre-commit: PR112704 dtcxzyw/llvm-opt-benchmark#1751

Closed

nikic merged commit e477989 into llvm:main Nov 25, 2024
8 checks passed

nikic deleted the instcombine-eq-parts-trunc branch November 25, 2024 10:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Handle trunc i1 pattern in eq-of-parts fold #112704

[InstCombine] Handle trunc i1 pattern in eq-of-parts fold #112704

nikic commented Oct 17, 2024

llvmbot commented Oct 17, 2024

goldsteinn commented Oct 17, 2024 •

edited

Loading

[InstCombine] Handle trunc i1 pattern in eq-of-parts fold #112704

[InstCombine] Handle trunc i1 pattern in eq-of-parts fold #112704

Conversation

nikic commented Oct 17, 2024

llvmbot commented Oct 17, 2024

goldsteinn commented Oct 17, 2024 • edited Loading

goldsteinn commented Oct 17, 2024 •

edited

Loading