[InstCombine] Fold (trunc X to i1) & !iszero(X & Pow2) -> (X & (1 | Pow2)) == (1 | Pow2) #119196

Closed

Conversation

andjo403
Contributor

@andjo403 andjo403 commented Dec 9, 2024

Folds the patterns:
not(trunc X to i1) | iszero(X & Pow2) -> (X & (1 | Pow2)) != (1 | Pow2)
(trunc X to i1) & !iszero(X & Pow2) -> (X & (1 | Pow2)) == (1 | Pow2)

This is the same pattern that foldAndOrOfICmpsOfAndWithPow2 handles, except that the !iszero(X & 1) has already been folded to trunc X to i1 (or iszero(X & 1) to not(trunc X to i1)). In both cases a pair of single-bit tests (bit 0 from the trunc, plus the Pow2 bit) collapses into one masked compare.

Around 100 files are updated in the llvm-opt-benchmark, with roughly 6 files showing some regressions, e.g. https://alive2.llvm.org/ce/z/SQLjda and https://alive2.llvm.org/ce/z/_Vu4yB. I do not know how to handle the regressions; it feels like every fold I add to fix one regression introduces new ones.

Proof: https://alive2.llvm.org/ce/z/ofzAyQ
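As an illustration of why the equivalence holds (the alive2 link above is the actual proof), here is a brute-force C++ check over all i8 values, with Pow2 = 8 assumed for concreteness to match the constant tests below:

```cpp
// Exhaustive check of both folds for i8 with an assumed Pow2 = 8.
// Illustration only; the alive2 proof above covers the general case.
#include <cassert>
#include <cstdint>

int main() {
  const uint8_t Pow2 = 8, Mask = 1 | Pow2; // Mask = 9
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    bool Trunc = X & 1;            // trunc X to i1 (low bit)
    bool IsZero = (X & Pow2) == 0; // iszero(X & Pow2)
    // (trunc X to i1) & !iszero(X & Pow2)  ->  (X & Mask) == Mask
    assert((Trunc && !IsZero) == ((X & Mask) == Mask));
    // not(trunc X to i1) | iszero(X & Pow2)  ->  (X & Mask) != Mask
    assert((!Trunc || IsZero) == ((X & Mask) != Mask));
  }
  return 0;
}
```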

… Pow2)) != (1 | Pow2)

also folds (trunc X to i1) & !iszero(X & Pow2) -> (X & (1 | Pow2)) == (1 | Pow2)
@andjo403 andjo403 requested a review from nikic as a code owner December 9, 2024 10:44
@andjo403 andjo403 requested review from dtcxzyw and nikic and removed request for nikic December 9, 2024 10:44
@llvmbot
Member

llvmbot commented Dec 9, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Andreas Jonson (andjo403)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/119196.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (+38)
  • (modified) llvm/test/Transforms/InstCombine/onehot_merge.ll (+292)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index b4033fc2a418a1..da58bedbce8b34 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -805,6 +805,40 @@ Value *InstCombinerImpl::foldAndOrOfICmpsOfAndWithPow2(ICmpInst *LHS,
   return nullptr;
 }
 
+// Fold not(trunc X to i1) | iszero(X & Pow2)
+//  -> (X & (1 | Pow2)) != (1 | Pow2)
+// Fold (trunc X to i1) & !iszero(X & Pow2)
+//  -> (X & (1 | Pow2)) == (1 | Pow2)
+static Value *foldTruncAndOrICmpOfAndWithPow2(InstCombiner::BuilderTy &Builder,
+                                              Value *LHS, Value *RHS,
+                                              bool IsAnd, bool IsLogical,
+                                              const SimplifyQuery &Q) {
+  CmpInst::Predicate Pred = IsAnd ? CmpInst::ICMP_NE : CmpInst::ICMP_EQ;
+
+  if (isa<ICmpInst>(LHS))
+    std::swap(LHS, RHS);
+
+  Value *X, *Pow2;
+
+  if ((IsAnd ? match(LHS, m_Trunc(m_Value(X)))
+             : match(LHS, m_Not(m_Trunc(m_Value(X))))) &&
+      match(RHS, m_SpecificICmp(Pred, m_c_And(m_Specific(X), m_Value(Pow2)),
+                                m_ZeroInt())) &&
+      isKnownToBeAPowerOfTwo(Pow2, Q.DL, /*OrZero=*/false, /*Depth=*/0, Q.AC,
+                             Q.CxtI, Q.DT)) {
+    // If this is a logical and/or, then we must prevent propagation of a
+    // poison value from the RHS by inserting freeze.
+    if (IsLogical)
+      Pow2 = Builder.CreateFreeze(Pow2);
+    Value *Mask = Builder.CreateOr(ConstantInt::get(Pow2->getType(), 1), Pow2);
+    Value *Masked = Builder.CreateAnd(X, Mask);
+    auto NewPred = IsAnd ? CmpInst::ICMP_EQ : CmpInst::ICMP_NE;
+    return Builder.CreateICmp(NewPred, Masked, Mask);
+  }
+
+  return nullptr;
+}
+
 /// General pattern:
 ///   X & Y
 ///
@@ -3541,6 +3575,10 @@ Value *InstCombinerImpl::foldBooleanAndOr(Value *LHS, Value *RHS,
 
   if (Value *Res = foldEqOfParts(LHS, RHS, IsAnd))
     return Res;
+  const SimplifyQuery Q = SQ.getWithInstruction(&I);
+  if (Value *Res = foldTruncAndOrICmpOfAndWithPow2(Builder, LHS, RHS, IsAnd,
+                                                   IsLogical, Q))
+    return Res;
 
   return nullptr;
 }
diff --git a/llvm/test/Transforms/InstCombine/onehot_merge.ll b/llvm/test/Transforms/InstCombine/onehot_merge.ll
index 2e57597455c2cd..e4586a231535a3 100644
--- a/llvm/test/Transforms/InstCombine/onehot_merge.ll
+++ b/llvm/test/Transforms/InstCombine/onehot_merge.ll
@@ -1143,3 +1143,295 @@ define i1 @foo1_and_signbit_lshr_without_shifting_signbit_not_pwr2_logical(i32 %
   %or = select i1 %t2, i1 true, i1 %t4
   ret i1 %or
 }
+
+define i1 @trunc_or_icmp_consts(i8 %k) {
+; CHECK-LABEL: @trunc_or_icmp_consts(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i8 [[K:%.*]], 9
+; CHECK-NEXT:    [[OR:%.*]] = icmp ne i8 [[TMP1]], 9
+; CHECK-NEXT:    ret i1 [[OR]]
+;
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %and = and i8 %k, 8
+  %icmp = icmp eq i8 %and, 0
+  %ret = or i1 %not, %icmp
+  ret i1 %ret
+}
+
+define i1 @icmp_or_trunc_consts(i8 %k) {
+; CHECK-LABEL: @icmp_or_trunc_consts(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i8 [[K:%.*]], 9
+; CHECK-NEXT:    [[OR:%.*]] = icmp ne i8 [[TMP1]], 9
+; CHECK-NEXT:    ret i1 [[OR]]
+;
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %and = and i8 %k, 8
+  %icmp = icmp eq i8 %and, 0
+  %ret = or i1 %icmp, %not
+  ret i1 %ret
+}
+
+define i1 @trunc_or_icmp(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_or_icmp(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = or i8 [[T]], 1
+; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[K:%.*]], [[TMP1]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp ne i8 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp eq i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %ret = or i1 %icmp, %not
+  ret i1 %ret
+}
+
+define i1 @trunc_logical_or_icmp(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_logical_or_icmp(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = or i8 [[T]], 1
+; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[K:%.*]], [[TMP1]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp ne i8 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp eq i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %ret = select i1 %icmp, i1 true, i1 %not
+  ret i1 %ret
+}
+
+define i1 @icmp_logical_or_trunc(i8 %k, i8 %c1) {
+; CHECK-LABEL: @icmp_logical_or_trunc(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = freeze i8 [[T]]
+; CHECK-NEXT:    [[TMP2:%.*]] = or i8 [[TMP1]], 1
+; CHECK-NEXT:    [[TMP3:%.*]] = and i8 [[K:%.*]], [[TMP2]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp ne i8 [[TMP3]], [[TMP2]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp eq i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %ret = select i1 %not, i1 true, i1 %icmp
+  ret i1 %ret
+}
+
+define <2 x i1> @trunc_or_icmp_vec(<2 x i8> %k, <2 x i8> %c1) {
+; CHECK-LABEL: @trunc_or_icmp_vec(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw <2 x i8> splat (i8 1), [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = or <2 x i8> [[T]], splat (i8 1)
+; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i8> [[K:%.*]], [[TMP1]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp ne <2 x i8> [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret <2 x i1> [[RET]]
+;
+  %t = shl <2 x i8> <i8 1, i8 1>, %c1
+  %t1 = and <2 x i8> %t, %k
+  %icmp = icmp eq <2 x i8> %t1, zeroinitializer
+  %trunc = trunc <2 x i8> %k to <2 x i1>
+  %not = xor <2 x i1> %trunc, <i1 true, i1 true>
+  %ret = or <2 x i1> %icmp, %not
+  ret <2 x i1> %ret
+}
+
+define i1 @neg_trunc_or_icmp_not_pow2(i8 %k, i8 %c1) {
+; CHECK-LABEL: @neg_trunc_or_icmp_not_pow2(
+; CHECK-NEXT:    [[T1:%.*]] = and i8 [[C1:%.*]], [[K:%.*]]
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp eq i8 [[T1]], 0
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[K]] to i1
+; CHECK-NEXT:    [[NOT:%.*]] = xor i1 [[TRUNC]], true
+; CHECK-NEXT:    [[RET:%.*]] = or i1 [[ICMP]], [[NOT]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t1 = and i8 %c1, %k
+  %icmp = icmp eq i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %ret = or i1 %icmp, %not
+  ret i1 %ret
+}
+
+define i1 @neg_trunc_or_icmp_not_trunc(i8 %k, i8 %c1) {
+; CHECK-LABEL: @neg_trunc_or_icmp_not_trunc(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[T1:%.*]] = and i8 [[T]], [[K:%.*]]
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp eq i8 [[T1]], 0
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[K]] to i1
+; CHECK-NEXT:    [[RET:%.*]] = or i1 [[ICMP]], [[TRUNC]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp eq i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %ret = or i1 %icmp, %trunc
+  ret i1 %ret
+}
+
+define i1 @neg_trunc_or_icmp_ne(i8 %k, i8 %c1) {
+; CHECK-LABEL: @neg_trunc_or_icmp_ne(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[T1:%.*]] = and i8 [[T]], [[K:%.*]]
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp ne i8 [[T1]], 0
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[K]] to i1
+; CHECK-NEXT:    [[NOT:%.*]] = xor i1 [[TRUNC]], true
+; CHECK-NEXT:    [[RET:%.*]] = or i1 [[ICMP]], [[NOT]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp ne i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %ret = or i1 %icmp, %not
+  ret i1 %ret
+}
+
+define i1 @trunc_and_icmp_consts(i8 %k) {
+; CHECK-LABEL: @trunc_and_icmp_consts(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i8 [[K:%.*]], 9
+; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[TMP1]], 9
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %trunc = trunc i8 %k to i1
+  %and = and i8 %k, 8
+  %icmp = icmp ne i8 %and, 0
+  %ret = and i1 %trunc, %icmp
+  ret i1 %ret
+}
+
+define i1 @icmp_and_trunc_consts(i8 %k) {
+; CHECK-LABEL: @icmp_and_trunc_consts(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i8 [[K:%.*]], 9
+; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[TMP1]], 9
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %trunc = trunc i8 %k to i1
+  %and = and i8 %k, 8
+  %icmp = icmp ne i8 %and, 0
+  %ret = and i1 %icmp, %trunc
+  ret i1 %ret
+}
+
+define i1 @trunc_and_icmp(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_and_icmp(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = or i8 [[T]], 1
+; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[K:%.*]], [[TMP1]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp ne i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %ret = and i1 %icmp, %trunc
+  ret i1 %ret
+}
+
+define i1 @trunc_logical_and_icmp(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_logical_and_icmp(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = or i8 [[T]], 1
+; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[K:%.*]], [[TMP1]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp ne i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %ret = select i1 %icmp, i1 %trunc, i1 false
+  ret i1 %ret
+}
+
+define i1 @icmp_logical_and_trunc(i8 %k, i8 %c1) {
+; CHECK-LABEL: @icmp_logical_and_trunc(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = freeze i8 [[T]]
+; CHECK-NEXT:    [[TMP2:%.*]] = or i8 [[TMP1]], 1
+; CHECK-NEXT:    [[TMP3:%.*]] = and i8 [[K:%.*]], [[TMP2]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp eq i8 [[TMP3]], [[TMP2]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp ne i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %ret = select i1 %trunc, i1 %icmp, i1 false
+  ret i1 %ret
+}
+
+define <2 x i1> @trunc_and_icmp_vec(<2 x i8> %k, <2 x i8> %c1) {
+; CHECK-LABEL: @trunc_and_icmp_vec(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw <2 x i8> splat (i8 1), [[C1:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = or <2 x i8> [[T]], splat (i8 1)
+; CHECK-NEXT:    [[TMP2:%.*]] = and <2 x i8> [[K:%.*]], [[TMP1]]
+; CHECK-NEXT:    [[RET:%.*]] = icmp eq <2 x i8> [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret <2 x i1> [[RET]]
+;
+  %t = shl <2 x i8> <i8 1, i8 1>, %c1
+  %t1 = and <2 x i8> %t, %k
+  %icmp = icmp ne <2 x i8> %t1, zeroinitializer
+  %trunc = trunc <2 x i8> %k to <2 x i1>
+  %ret = and <2 x i1> %icmp, %trunc
+  ret <2 x i1> %ret
+}
+
+define i1 @trunc_and_icmp_not_pow2(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_and_icmp_not_pow2(
+; CHECK-NEXT:    [[T1:%.*]] = and i8 [[C1:%.*]], [[K:%.*]]
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp ne i8 [[T1]], 0
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[K]] to i1
+; CHECK-NEXT:    [[RET:%.*]] = and i1 [[ICMP]], [[TRUNC]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t1 = and i8 %c1, %k
+  %icmp = icmp ne i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %ret = and i1 %icmp, %trunc
+  ret i1 %ret
+}
+
+define i1 @trunc_and_icmp_not_trunc(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_and_icmp_not_trunc(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[T1:%.*]] = and i8 [[T]], [[K:%.*]]
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp ne i8 [[T1]], 0
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[K]] to i1
+; CHECK-NEXT:    [[NOT:%.*]] = xor i1 [[TRUNC]], true
+; CHECK-NEXT:    [[RET:%.*]] = and i1 [[ICMP]], [[NOT]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp ne i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %not = xor i1 %trunc, true
+  %ret = and i1 %icmp, %not
+  ret i1 %ret
+}
+
+define i1 @trunc_and_icmp_eq(i8 %k, i8 %c1) {
+; CHECK-LABEL: @trunc_and_icmp_eq(
+; CHECK-NEXT:    [[T:%.*]] = shl nuw i8 1, [[C1:%.*]]
+; CHECK-NEXT:    [[T1:%.*]] = and i8 [[T]], [[K:%.*]]
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp eq i8 [[T1]], 0
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[K]] to i1
+; CHECK-NEXT:    [[RET:%.*]] = and i1 [[ICMP]], [[TRUNC]]
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %t = shl i8 1, %c1
+  %t1 = and i8 %t, %k
+  %icmp = icmp eq i8 %t1, 0
+  %trunc = trunc i8 %k to i1
+  %ret = and i1 %icmp, %trunc
+  ret i1 %ret
+}

@andjo403
Contributor Author

andjo403 commented Dec 9, 2024

This fixes a regression due to #84628.

Member

@dtcxzyw dtcxzyw left a comment

Can you adjust foldLogOpOfMaskedICmps to convert trunc X to i1 back into (X & 1) != 0?

BTW, I skimmed through the comments in #84628. It seems to cause many performance regressions. It would be better to revert #84628 and find another way to unblock #83829.
cc @nikic @YanWQ-monad @dianqk @goldsteinn

@andjo403
Contributor Author

Can you adjust foldLogOpOfMaskedICmps to convert trunc X to i1 back into (X & 1) != 0?

I do not understand what you mean by this.

@dtcxzyw
Member

dtcxzyw commented Dec 11, 2024

Can you adjust foldLogOpOfMaskedICmps to convert trunc X to i1 back into (X & 1) != 0?

I do not understand what you mean by this.

I mean that we should pass a Value instead of an ICmpInst into foldLogOpOfMaskedICmps; then we can handle this pattern in decomposeBitTestICmp and reuse the remaining fold logic.

BTW, I recommend reverting #84628. Let's wait for inputs from other folks.

@dianqk
Member

dianqk commented Dec 12, 2024

I only know a little about InstCombine. Would it be possible to canonicalize (X & 1) != 0 to trunc X to i1, and replace all (X & 1) != 0 pattern matching with trunc X to i1?
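To make that concrete, here is a minimal sketch (the isLowBitTest helper is hypothetical, not code from the thread; only the PatternMatch calls are real API) of how a fold's matcher changes when (X & 1) != 0 is canonicalized to trunc X to i1:

```cpp
// Sketch only: recognize a low-bit test in both its old icmp form and the
// canonical trunc form.
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

static bool isLowBitTest(Value *V, Value *&X) {
  // Old form: (X & 1) != 0
  if (match(V, m_SpecificICmp(ICmpInst::ICMP_NE,
                              m_And(m_Value(X), m_One()), m_ZeroInt())))
    return true;
  // Canonical form after #84628: trunc X to i1
  return V->getType()->isIntOrIntVectorTy(1) && match(V, m_Trunc(m_Value(X)));
}
```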

@andjo403
Contributor Author

I mean that we should pass a Value instead of an ICmpInst into foldLogOpOfMaskedICmps; then we can handle this pattern in decomposeBitTestICmp and reuse the remaining fold logic.

This fold is not covered by foldLogOpOfMaskedICmps, so that is another fold that needs to handle trunc X to i1.

One downside of reverting #84628 is that it will be harder to find this type of missing fold if the change is applied again once the regressions are handled.

@andjo403
Contributor Author

I think that I can close this PR, as it looks better to move the fold done in foldAndOrOfICmpsOfAndWithPow2 into foldLogOpOfMaskedICmps; then the trunc X to i1 pattern only needs to be handled in getMaskedTypeForICmpPair.

But the question remains whether I should try to find/fix the regressions caused by removing the trunc X to i1 -> (X & 1) != 0 fold, or, as @dtcxzyw suggests, revert #84628.

@andjo403
Contributor Author

andjo403 commented Jan 2, 2025

Ping. See the summary in the last comment.

@dtcxzyw
Member

dtcxzyw commented Jan 2, 2025

@nikic Any thoughts?

@andjo403
Contributor Author

andjo403 commented Jan 6, 2025

To provide better information, I have searched the code for the places where I think trunc X to i1 needs to be handled to avoid regressions from #84628.

One reason to add handling of the trunc pattern is that vectors are already folded to trunc, so those will be better optimized.

I found about 13 places that I think need some work:

computeKnownBitsFromCond for dominating conditions in ValueTracking; in addition, findValuesAffectedByCondition only adds trunc for assumes.

computeKnownBitsFromContext for assumes in ValueTracking; handled in #118406, but it still has the ephemeral-value problem for the not(trunc) pattern.

getValueFromCondition in LazyValueInfo.

simplifySelectWithFakeICmpEq in InstructionSimplify.

foldSelectICmpAnd in InstCombineSelect.

foldSelectICmpAndAnd in InstCombineSelect; this looks like it would result in more instructions, so maybe not. Perhaps that is also the reason decomposeBitTestICmp is not handled in this function.

foldSelectICmpAndZeroShl in InstCombineSelect; unclear why decomposeBitTestICmp is not handled in this function, as it looks like it would save instructions.

foldSelectICmpAndBinOp in InstCombineSelect; maybe there should be a common foldSelectICmpBitTest function holding the bit-test code so it is not repeated.

The fold to an alignment assume bundle in InstCombineCalls.

getMaskedTypeForICmpPair in InstCombineAndOrXor; if foldAndOrOfICmpsOfAndWithPow2 is moved into foldLogOpOfMaskedICmps, that is the only fold in this file that needs an update.

matchLeftShift in HexagonLoopIdiomRecognition.

detectShiftUntilBitTestIdiom in LoopIdiomRecognize.

Many of these places already use decomposeBitTestICmp, so it may be good to add a decomposeBitTest that takes only a Value as input, handles trunc itself, and calls decomposeBitTestICmp for icmps.
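A minimal sketch of that idea (the BitTest struct and decomposeBitTest helper are assumptions for illustration, not existing LLVM API):

```cpp
// Hypothetical decomposeBitTest that takes a plain Value: handle the
// trunc-to-i1 forms directly and defer real icmps to the existing
// decomposeBitTestICmp logic (elided here).
#include "llvm/ADT/APInt.h"
#include "llvm/IR/PatternMatch.h"
#include <optional>
using namespace llvm;
using namespace llvm::PatternMatch;

struct BitTest {
  Value *X;                 // value whose bit is tested
  APInt Mask;               // the tested bit(s)
  ICmpInst::Predicate Pred; // ICMP_NE: bit set, ICMP_EQ: bit clear
};

static std::optional<BitTest> decomposeBitTest(Value *Cond) {
  Value *X;
  // trunc X to i1  ==  (X & 1) != 0
  if (match(Cond, m_Trunc(m_Value(X))))
    return BitTest{X, APInt(X->getType()->getScalarSizeInBits(), 1),
                   ICmpInst::ICMP_NE};
  // not(trunc X to i1)  ==  (X & 1) == 0
  if (match(Cond, m_Not(m_Trunc(m_Value(X)))))
    return BitTest{X, APInt(X->getType()->getScalarSizeInBits(), 1),
                   ICmpInst::ICMP_EQ};
  // Otherwise fall back to decomposeBitTestICmp for icmp conditions.
  return std::nullopt;
}
```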

@dtcxzyw
Member

dtcxzyw commented Jan 7, 2025

Many of these places already use decomposeBitTestICmp, so it may be good to add a decomposeBitTest that takes only a Value as input, handles trunc itself, and calls decomposeBitTestICmp for icmps.

Thank you for the detailed investigation! I agree that we can continue to migrate to trunc X to i1.

@dtcxzyw
Member

dtcxzyw commented Jan 7, 2025

I think that I can close this PR, as it looks better to move the fold done in foldAndOrOfICmpsOfAndWithPow2 into foldLogOpOfMaskedICmps; then the trunc X to i1 pattern only needs to be handled in getMaskedTypeForICmpPair.

Agree.

@andjo403
Contributor Author

andjo403 commented Jan 7, 2025

Good, then I will continue with new PRs.
