Skip to content

LICM: hoist BO assoc when BinOp is in RHS #107072

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 4, 2024

Conversation

artagnon
Copy link
Contributor

@artagnon artagnon commented Sep 3, 2024

Extend hoistBOAssociation smoothly to handle the case when the inner BinaryOperator is in the RHS of the outer BinaryOperator. This completes the generalization of hoistBOAssociation, and the only limitation after this patch is the fact that only Add and Mul are hoisted.

@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Ramkumar Ramachandra (artagnon)

Changes

Extend hoistBOAssociation smoothly to handle the case when the inner BinaryOperator is in the RHS of the outer BinaryOperator. This completes the generalization of hoistBOAssociation, and the only limitation after this patch is the fact that only Add and Mul are hoisted.


Patch is 42.94 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107072.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LICM.cpp (+48-28)
  • (modified) llvm/test/Transforms/LICM/hoist-binop.ll (+869-21)
  • (modified) llvm/test/Transforms/LICM/sink-foldable.ll (+1-2)
  • (modified) llvm/test/Transforms/LICM/update-scev-after-hoist.ll (+27-7)
diff --git a/llvm/lib/Transforms/Scalar/LICM.cpp b/llvm/lib/Transforms/Scalar/LICM.cpp
index 86c7dceffc5245..d6dee41dfefa23 100644
--- a/llvm/lib/Transforms/Scalar/LICM.cpp
+++ b/llvm/lib/Transforms/Scalar/LICM.cpp
@@ -2801,55 +2801,75 @@ static bool hoistMulAddAssociation(Instruction &I, Loop &L,
   return true;
 }
 
-/// Reassociate associative binary expressions of the form
+/// Reassociate binary expressions of the form
 ///
-/// 1. "(LV op C1) op C2" ==> "LV op (C1 op C2)"
+/// 1. "(LV op C1) op C2" ==> "LV op (C1 op C2)" if op is an associative BinOp
+/// 2. "(C1 op LV) op C2" ==> "LV op (C1 op C2)" if op is a commutative BinOp
+/// 3. "C2 op (C1 op LV)" ==> "(C2 op C1) op LV" if op an associative BinOp
+/// 4. "C2 op (LV op C1)" ==> "(C2 op C1) op LV" if op is a commutative BinOp
 ///
-/// where op is an associative binary op, LV is a loop variant, and C1 and C2
-/// are loop invariants that we want to hoist.
-///
-/// TODO: This can be extended to more cases such as
-/// 2. "C1 op (C2 op LV)" ==> "(C1 op C2) op LV"
-/// 3. "(C1 op LV) op C2" ==> "LV op (C1 op C2)" if op is commutative
-/// 4. "C1 op (LV op C2)" ==> "(C1 op C2) op LV" if op is commutative
+/// where LV is a loop variant, and C1 and C2 are loop invariants that we want
+/// to hoist.
 static bool hoistBOAssociation(Instruction &I, Loop &L,
                                ICFLoopSafetyInfo &SafetyInfo,
                                MemorySSAUpdater &MSSAU, AssumptionCache *AC,
                                DominatorTree *DT) {
-  auto *BO = dyn_cast<BinaryOperator>(&I);
-  if (!BO || !BO->isAssociative())
-    return false;
+  using namespace PatternMatch;
 
-  // Only fold ADDs for now.
+  Value *LV, *C1, *C2;
+  if (!match(&I, m_c_BinOp(m_BinOp(m_Value(LV), m_Value(C1)), m_Value(C2))) ||
+      !L.isLoopInvariant(C2))
+    return false;
+  auto *BO = cast<BinaryOperator>(&I);
+  auto *C = dyn_cast<BinaryOperator>(BO->getOperand(1));
+  bool BinOpInRHS = C && C->getOperand(0) == LV;
+  auto *BO0 = cast<BinaryOperator>(BO->getOperand(BinOpInRHS));
   Instruction::BinaryOps Opcode = BO->getOpcode();
-  if (Opcode != Instruction::Add)
+  if (BO0->getOpcode() != Opcode)
     return false;
-
-  auto *BO0 = dyn_cast<BinaryOperator>(BO->getOperand(0));
-  if (!BO0 || BO0->getOpcode() != Opcode || !BO0->isAssociative() ||
-      BO0->hasNUsesOrMore(3))
+  bool BothAssociative = BO->isAssociative() && BO0->isAssociative();
+  bool BothCommutative = BO->isCommutative() && BO0->isCommutative();
+  if (!L.isLoopInvariant(LV) && L.isLoopInvariant(C1)) {
+    if ((!BinOpInRHS && !BothAssociative) || (BinOpInRHS && !BothCommutative))
+      return false;
+  } else if (L.isLoopInvariant(LV) && !L.isLoopInvariant(C1)) {
+    if ((!BinOpInRHS && !BothCommutative) || (BinOpInRHS && !BothAssociative))
+      return false;
+    std::swap(LV, C1);
+  } else
     return false;
 
-  // Transform: "(LV op C1) op C2" ==> "LV op (C1 op C2)"
-  Value *LV = BO0->getOperand(0);
-  Value *C1 = BO0->getOperand(1);
-  Value *C2 = BO->getOperand(1);
+  // TODO: Only hoist ADDs and MULs for now.
+  if (Opcode != Instruction::Add && Opcode != Instruction::Mul)
+    return false;
 
-  if (L.isLoopInvariant(LV) || !L.isLoopInvariant(C1) || !L.isLoopInvariant(C2))
+  // This is a heuristic to ensure that code-size doesn't blow up.
+  if (BO0->hasNUsesOrMore(3))
     return false;
 
   auto *Preheader = L.getLoopPreheader();
   assert(Preheader && "Loop is not in simplify form?");
 
-  auto *Inv = BinaryOperator::Create(Opcode, C1, C2, "invariant.op",
-                                     Preheader->getTerminator()->getIterator());
-  auto *NewBO = BinaryOperator::Create(
-      Opcode, LV, Inv, BO->getName() + ".reass", BO->getIterator());
+  // To create C2 op C1, instead of C1 op C2.
+  if (BinOpInRHS)
+    std::swap(C1, C2);
+
+  IRBuilder<> Builder(Preheader->getTerminator());
+  auto *Inv = Builder.CreateBinOp(Opcode, C1, C2, "invariant.op");
+
+  auto *NewBO =
+      BinOpInRHS
+          ? BinaryOperator::Create(Opcode, Inv, LV, BO->getName() + ".reass",
+                                   BO->getIterator())
+          : BinaryOperator::Create(Opcode, LV, Inv, BO->getName() + ".reass",
+                                   BO->getIterator());
 
   // Copy NUW for ADDs if both instructions have it.
   if (Opcode == Instruction::Add && BO->hasNoUnsignedWrap() &&
       BO0->hasNoUnsignedWrap()) {
-    Inv->setHasNoUnsignedWrap(true);
+    // If the constant-folder didn't kick in, and a new Instruction was created.
+    if (auto *I = dyn_cast<Instruction>(Inv))
+      I->setHasNoUnsignedWrap(true);
     NewBO->setHasNoUnsignedWrap(true);
   }
 
diff --git a/llvm/test/Transforms/LICM/hoist-binop.ll b/llvm/test/Transforms/LICM/hoist-binop.ll
index 7a309ea294fc85..b3dbcb2f3d5ddc 100644
--- a/llvm/test/Transforms/LICM/hoist-binop.ll
+++ b/llvm/test/Transforms/LICM/hoist-binop.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt -S -passes=licm < %s | FileCheck %s
 
-; Fold ADD and remove old op if unused.
+; Hoist ADD and remove old op if unused.
 define void @add_one_use(i64 %c1, i64 %c2) {
 ; CHECK-LABEL: @add_one_use(
 ; CHECK-NEXT:  entry:
@@ -22,8 +22,161 @@ loop:
   br label %loop
 }
 
-; Fold ADD and copy NUW if both ops have it.
-; https://alive2.llvm.org/ce/z/bPAT7Z
+; Hoist ADD and remove old op if unused.
+; Version where operands are commuted.
+define void @add_one_use_comm(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_one_use_comm(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INDEX]], [[INVARIANT_OP]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add i64 %c1, %index
+  %index.next = add i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist ADD and remove old op if unused.
+; Another version where operands are commuted.
+define void @add_one_use_comm2(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_one_use_comm2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add i64 %index, %c1
+  %index.next = add i64 %c2, %step.add
+  br label %loop
+}
+
+; Hoist ADD and remove old op if unused.
+; Another version where operands are commuted.
+define void @add_one_use_comm3(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_one_use_comm3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add i64 %c1, %index
+  %index.next = add i64 %c2, %step.add
+  br label %loop
+}
+
+; Hoist MUL and remove old op if unused.
+define void @mul_one_use(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_one_use(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[FACTOR_OP_MUL:%.*]] = mul i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[STEP_ADD_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD_REASS]] = mul i64 [[INDEX]], [[FACTOR_OP_MUL]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul i64 %index, %c1
+  %index.next = mul i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist MUL and remove old op if unused.
+; Version where operands are commuted.
+define void @mul_one_use_comm(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_one_use_comm(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[FACTOR_OP_MUL:%.*]] = mul i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[STEP_ADD_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD_REASS]] = mul i64 [[FACTOR_OP_MUL]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul i64 %c1, %index
+  %index.next = mul i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist MUL and remove old op if unused.
+; Another version where operands are commuted.
+define void @mul_one_use_comm2(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_one_use_comm2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[FACTOR_OP_MUL:%.*]] = mul i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[STEP_ADD_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD_REASS]] = mul i64 [[INDEX]], [[FACTOR_OP_MUL]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul i64 %index, %c1
+  %index.next = mul i64 %c2, %step.add
+  br label %loop
+}
+
+; Hoist MUL and remove old op if unused.
+; Another version where operands are commuted.
+define void @mul_one_use_comm3(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_one_use_comm3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[FACTOR_OP_MUL:%.*]] = mul i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[STEP_ADD_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD_REASS]] = mul i64 [[FACTOR_OP_MUL]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul i64 %c1, %index
+  %index.next = mul i64 %c2, %step.add
+  br label %loop
+}
+
+
+; Hoist ADD and copy NUW if both ops have it.
 define void @add_nuw(i64 %c1, i64 %c2) {
 ; CHECK-LABEL: @add_nuw(
 ; CHECK-NEXT:  entry:
@@ -47,7 +200,180 @@ loop:
   br label %loop
 }
 
-; Fold ADD but don't copy NUW if only one op has it.
+; Hoist ADD and copy NUW if both ops have it.
+; Version where operands are commuted.
+define void @add_nuw_comm(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_nuw_comm(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add nuw i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add nuw i64 [[C1]], [[INDEX]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add nuw i64 [[INDEX]], [[INVARIANT_OP]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add nuw i64 %c1, %index
+  call void @use(i64 %step.add)
+  %index.next = add nuw i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist ADD and copy NUW if both ops have it.
+; Another version where operands are commuted.
+define void @add_nuw_comm2(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_nuw_comm2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add nuw i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add nuw i64 [[INDEX]], [[C1]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add nuw i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add nuw i64 %index, %c1
+  call void @use(i64 %step.add)
+  %index.next = add nuw i64 %c2, %step.add
+  br label %loop
+}
+
+; Hoist ADD and copy NUW if both ops have it.
+; Another version where operands are commuted.
+define void @add_nuw_comm3(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_nuw_comm3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add nuw i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add nuw i64 [[C1]], [[INDEX]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add nuw i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add nuw i64 %c1, %index
+  call void @use(i64 %step.add)
+  %index.next = add nuw i64 %c2, %step.add
+  br label %loop
+}
+; Hoist MUL and drop NUW if both ops have it.
+define void @mul_nuw(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_nuw(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = mul i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = mul nuw i64 [[INDEX]], [[C1]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = mul i64 [[INDEX]], [[INVARIANT_OP]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul nuw i64 %index, %c1
+  call void @use(i64 %step.add)
+  %index.next = mul nuw i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist MUL and drop NUW if both ops have it.
+; Version where operands are commuted.
+define void @mul_nuw_comm(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_nuw_comm(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = mul i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = mul nuw i64 [[C1]], [[INDEX]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = mul i64 [[INDEX]], [[INVARIANT_OP]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul nuw i64 %c1, %index
+  call void @use(i64 %step.add)
+  %index.next = mul nuw i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist MUL and drop NUW if both ops have it.
+; Another version where operands are commuted.
+define void @mul_nuw_comm2(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_nuw_comm2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = mul i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = mul nuw i64 [[INDEX]], [[C1]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = mul i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul nuw i64 %index, %c1
+  call void @use(i64 %step.add)
+  %index.next = mul nuw i64 %c2, %step.add
+  br label %loop
+}
+
+; Hoist MUL and drop NUW if both ops have it.
+; Another version where operands are commuted.
+define void @mul_nuw_comm3(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @mul_nuw_comm3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = mul i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = mul nuw i64 [[C1]], [[INDEX]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = mul i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = mul nuw i64 %c1, %index
+  call void @use(i64 %step.add)
+  %index.next = mul nuw i64 %c2, %step.add
+  br label %loop
+}
+
+; Hoist ADD but don't copy NUW if only one op has it.
 define void @add_no_nuw(i64 %c1, i64 %c2) {
 ; CHECK-LABEL: @add_no_nuw(
 ; CHECK-NEXT:  entry:
@@ -71,15 +397,16 @@ loop:
   br label %loop
 }
 
-; Fold ADD but don't copy NSW if one op has it.
-define void @add_no_nsw(i64 %c1, i64 %c2) {
-; CHECK-LABEL: @add_no_nsw(
+; Hoist ADD but don't copy NUW if only one op has it.
+; Version where operands are commuted.
+define void @add_no_nuw_comm(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_no_nuw_comm(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C1:%.*]], [[C2:%.*]]
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
-; CHECK-NEXT:    [[STEP_ADD:%.*]] = add i64 [[INDEX]], [[C1]]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add i64 [[C1]], [[INDEX]]
 ; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
 ; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INDEX]], [[INVARIANT_OP]]
 ; CHECK-NEXT:    br label [[LOOP]]
@@ -87,25 +414,51 @@ define void @add_no_nsw(i64 %c1, i64 %c2) {
 entry:
   br label %loop
 
+loop:
+  %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
+  %step.add = add i64 %c1, %index
+  call void @use(i64 %step.add)
+  %index.next = add nuw i64 %step.add, %c2
+  br label %loop
+}
+
+; Hoist ADD but don't copy NUW if only one op has it.
+; Another version where operands are commuted.
+define void @add_no_nuw_comm2(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_no_nuw_comm2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C2:%.*]], [[C1:%.*]]
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add i64 [[INDEX]], [[C1]]
+; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INVARIANT_OP]], [[INDEX]]
+; CHECK-NEXT:    br label [[LOOP]]
+;
+entry:
+  br label %loop
+
 loop:
   %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
   %step.add = add i64 %index, %c1
   call void @use(i64 %step.add)
-  %index.next = add nsw i64 %step.add, %c2
+  %index.next = add nuw i64 %c2, %step.add
   br label %loop
 }
 
-; Fold ADD but don't copy NSW even if both ops have it.
-define void @add_no_nsw_2(i64 %c1, i64 %c2) {
-; CHECK-LABEL: @add_no_nsw_2(
+; Hoist ADD but don't copy NUW if only one op has it.
+; Another version where operands are commuted.
+define void @add_no_nuw_comm3(i64 %c1, i64 %c2) {
+; CHECK-LABEL: @add_no_nuw_comm3(
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C1:%.*]], [[C2:%.*]]
+; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add i64 [[C2:%.*]], [[C1:%.*]]
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT_REASS:%.*]], [[LOOP]] ]
-; CHECK-NEXT:    [[STEP_ADD:%.*]] = add nsw i64 [[INDEX]], [[C1]]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add i64 [[C1]], [[INDEX]]
 ; CHECK-NEXT:    call void @use(i64 [[STEP_ADD]])
-; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INDEX]], [[INVARIANT_OP]]
+; CHECK-NEXT:    [[INDEX_NEXT_REASS]] = add i64 [[INVARIANT_OP]], [[INDEX]]
 ; CHECK-NEXT:    br label [[LOOP]]
 ;
 entry:
@@ -113,13 +466,508 @@ entry:
 
 loop:
   %index = phi i64 [ 0, %entry ], [ %index.next, %loop ]
-  %step.add = add nsw i64 %index, %c1
+  %step.add = add i64 %c...
[truncated]

@artagnon artagnon force-pushed the licm-boassoc-fullcomm branch 2 times, most recently from 499eaa5 to d7f2c17 Compare September 4, 2024 12:21
Copy link

github-actions bot commented Sep 4, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Extend hoistBOAssociation smoothly to handle the case when the inner
BinaryOperator is in the RHS of the outer BinaryOperator. This completes
the generalization of hoistBOAssociation, and the only limitation after
this patch is the fact that only Add and Mul are hoisted.
@artagnon artagnon force-pushed the licm-boassoc-fullcomm branch from d7f2c17 to 2bc99bf Compare September 4, 2024 12:28
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

A caveat here is that this will not work correctly if both operands are binary operators. So a more general approach would be to instead check whether the LHS is loop invariant and then swap. Unlikely to matter though.

@artagnon
Copy link
Contributor Author

artagnon commented Sep 4, 2024

A caveat here is that this will not work correctly if both operands are binary operators. So a more general approach would be to instead check whether the LHS is loop invariant and then swap. Unlikely to matter though.

Thanks for catching this oversight. I've pushed a fix along with some tests: could you kindly check it?

@artagnon artagnon force-pushed the licm-boassoc-fullcomm branch from 9096b02 to e575d84 Compare September 4, 2024 17:15
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@artagnon artagnon merged commit 16900d3 into llvm:main Sep 4, 2024
8 checks passed
@artagnon artagnon deleted the licm-boassoc-fullcomm branch September 4, 2024 21:01
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 4, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/5469

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp   -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/openmp/runtime/test -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/openmp/runtime/test/ompt /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/openmp/runtime/test -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/openmp/runtime/test/ompt /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants