-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[LV] Create in-loop sub reductions #147026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-vectorizers Author: Sam Tebbs (SamTebbs33) ChangesThis PR allows the loop vectorizer to handle in-loop sub reductions by forming a normal in-loop add reduction with a negated input. Full diff: https://github.com/llvm/llvm-project/pull/147026.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index b275b1064cef2..2b7d06e0c682e 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -1263,6 +1263,10 @@ RecurrenceDescriptor::getReductionOpChain(PHINode *Phi, Loop *L) const {
if (isFMulAddIntrinsic(Cur))
return true;
+ // Recognize a sub reduction. It gets canonicalized to add(sub (0, ...)).
+ if (Cur->getOpcode() == Instruction::Sub && getOpcode() == Instruction::Add)
+ return true;
+
return Cur->getOpcode() == getOpcode();
};
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 907839711a39c..1cfbcf1336620 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9144,6 +9144,14 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
CurrentLinkI->getFastMathFlags());
LinkVPBB->insert(FMulRecipe, CurrentLink->getIterator());
VecOp = FMulRecipe;
+ } else if (PhiR->isInLoop() && Kind == RecurKind::Add &&
+ CurrentLinkI->getOpcode() == Instruction::Sub) {
+ Type *PhiTy = PhiR->getUnderlyingValue()->getType();
+ auto *Zero = Plan->getOrAddLiveIn(ConstantInt::get(PhiTy, 0));
+ VPWidenRecipe *Sub = new VPWidenRecipe(
+ *CurrentLinkI, {Zero, CurrentLink->getOperand(1)});
+ LinkVPBB->insert(Sub, CurrentLink->getIterator());
+ VecOp = Sub;
} else {
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind)) {
if (isa<VPWidenRecipe>(CurrentLink)) {
diff --git a/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll b/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
index e762c9ff81322..4c6e0e487d0dd 100644
--- a/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
+++ b/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
@@ -627,7 +627,7 @@ for.end: ; preds = %for.body, %entry
ret float %result.0.lcssa
}
-; Sub we can create a reduction, but not inloop
+; Sub we can create a reduction inloop
define i32 @reduction_sub_lhs(ptr noalias nocapture %A) {
; CHECK-LABEL: @reduction_sub_lhs(
; CHECK-NEXT: entry:
@@ -636,15 +636,16 @@ define i32 @reduction_sub_lhs(ptr noalias nocapture %A) {
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
; CHECK: vector.body:
; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_PHI:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i64 [[INDEX]]
; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP0]], align 4
-; CHECK-NEXT: [[TMP1]] = sub <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]]
+; CHECK-NEXT: [[TMP4:%.*]] = sub nsw <4 x i32> zeroinitializer, [[WIDE_LOAD]]
+; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[TMP4]])
+; CHECK-NEXT: [[TMP3]] = add i32 [[TMP1]], [[VEC_PHI]]
; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], 256
; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]
; CHECK: middle.block:
-; CHECK-NEXT: [[TMP3:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[TMP1]])
; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]
; CHECK: scalar.ph:
; CHECK-NEXT: br label [[FOR_BODY:%.*]]
|
@llvm/pr-subscribers-llvm-transforms Author: Sam Tebbs (SamTebbs33) ChangesThis PR allows the loop vectorizer to handle in-loop sub reductions by forming a normal in-loop add reduction with a negated input. Full diff: https://github.com/llvm/llvm-project/pull/147026.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index b275b1064cef2..2b7d06e0c682e 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -1263,6 +1263,10 @@ RecurrenceDescriptor::getReductionOpChain(PHINode *Phi, Loop *L) const {
if (isFMulAddIntrinsic(Cur))
return true;
+ // Recognize a sub reduction. It gets canonicalized to add(sub (0, ...)).
+ if (Cur->getOpcode() == Instruction::Sub && getOpcode() == Instruction::Add)
+ return true;
+
return Cur->getOpcode() == getOpcode();
};
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 907839711a39c..1cfbcf1336620 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9144,6 +9144,14 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
CurrentLinkI->getFastMathFlags());
LinkVPBB->insert(FMulRecipe, CurrentLink->getIterator());
VecOp = FMulRecipe;
+ } else if (PhiR->isInLoop() && Kind == RecurKind::Add &&
+ CurrentLinkI->getOpcode() == Instruction::Sub) {
+ Type *PhiTy = PhiR->getUnderlyingValue()->getType();
+ auto *Zero = Plan->getOrAddLiveIn(ConstantInt::get(PhiTy, 0));
+ VPWidenRecipe *Sub = new VPWidenRecipe(
+ *CurrentLinkI, {Zero, CurrentLink->getOperand(1)});
+ LinkVPBB->insert(Sub, CurrentLink->getIterator());
+ VecOp = Sub;
} else {
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind)) {
if (isa<VPWidenRecipe>(CurrentLink)) {
diff --git a/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll b/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
index e762c9ff81322..4c6e0e487d0dd 100644
--- a/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
+++ b/llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
@@ -627,7 +627,7 @@ for.end: ; preds = %for.body, %entry
ret float %result.0.lcssa
}
-; Sub we can create a reduction, but not inloop
+; Sub we can create a reduction inloop
define i32 @reduction_sub_lhs(ptr noalias nocapture %A) {
; CHECK-LABEL: @reduction_sub_lhs(
; CHECK-NEXT: entry:
@@ -636,15 +636,16 @@ define i32 @reduction_sub_lhs(ptr noalias nocapture %A) {
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
; CHECK: vector.body:
; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_PHI:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i64 [[INDEX]]
; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP0]], align 4
-; CHECK-NEXT: [[TMP1]] = sub <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]]
+; CHECK-NEXT: [[TMP4:%.*]] = sub nsw <4 x i32> zeroinitializer, [[WIDE_LOAD]]
+; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[TMP4]])
+; CHECK-NEXT: [[TMP3]] = add i32 [[TMP1]], [[VEC_PHI]]
; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], 256
; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]
; CHECK: middle.block:
-; CHECK-NEXT: [[TMP3:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[TMP1]])
; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]
; CHECK: scalar.ph:
; CHECK-NEXT: br label [[FOR_BODY:%.*]]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi - Could this add a sub into the middle of an extending reduction?
(It might be good to transform a sub reduction into a add reduction + negate at the end of the loop, to allow the more efficient sum reductions to trigger).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with nit addressed.
@davemgreen wrote:
Hi - Could this add a sub into the middle of an extending reduction?
(It might be good to transform a sub reduction into a add reduction + negate at the end of the loop, to allow the more efficient sum reductions to trigger).
That's not a bad idea, although that might be better left for a future PR. The reason for this change is to align the implementations of partial and in-loop reductions. I'm also not entirely sure if we necessarily want to do this for partial reductions, because some of the cdot
patterns rely on the input being negated inside the loop.
; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i64 [[INDEX]] | ||
; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP0]], align 4 | ||
; CHECK-NEXT: [[TMP1]] = sub <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]] | ||
; CHECK-NEXT: [[TMP4:%.*]] = sub nsw <4 x i32> zeroinitializer, [[WIDE_LOAD]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious, where does the nsw
come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's coming from CurrentLinkI
in https://github.com/llvm/llvm-project/pull/147026/files#diff-da321d454a7246f8ae276bf1db2782bf26b5210b8133cb59e4d7fd45d0905decR9151 .
EDIT: I mis-read which sub this was, so I don't actually know where it's coming from!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out I had the right hunch and it's been removed by my latest commits.
@@ -627,7 +627,7 @@ for.end: ; preds = %for.body, %entry | |||
ret float %result.0.lcssa | |||
} | |||
|
|||
; Sub we can create a reduction, but not inloop | |||
; Sub we can create a reduction inloop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We can create an in-loop reduction for sub-reductions by negating the input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -9144,6 +9144,14 @@ void LoopVectorizationPlanner::adjustRecipesForReductions( | |||
CurrentLinkI->getFastMathFlags()); | |||
LinkVPBB->insert(FMulRecipe, CurrentLink->getIterator()); | |||
VecOp = FMulRecipe; | |||
} else if (PhiR->isInLoop() && Kind == RecurKind::Add && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this cannot be classified as Sub reduction? Would be good to update the RecurrenceDescriptor to clarify that RecurKind::Add can be a sub-reduction.
Would be great if we could avoid having RecurKind meanings depend on whether it's in loop or not,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a reason so I'll experiment with adding RecurKind::Sub 👍 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Let me know what you think.
Type *PhiTy = PhiR->getUnderlyingValue()->getType(); | ||
auto *Zero = Plan->getOrAddLiveIn(ConstantInt::get(PhiTy, 0)); | ||
VPWidenRecipe *Sub = new VPWidenRecipe( | ||
*CurrentLinkI, {Zero, CurrentLink->getOperand(1)}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct to use the flags from the original sub, e.g. if the operand is INT_MIN it will sign wrap I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
The first of these appears to hit the assert about differences in the cost model. https://godbolt.org/z/GraEdK6Er |
Yep, the new created instruction shouldn't take |
llvm/lib/Analysis/IVDescriptors.cpp
Outdated
// Normally the recur kind is expected to stay the same across all | ||
// reduction instructions. Add and sub can appear in chained reductions so | ||
// accept a sub if the recur kind is add, and vice versa. | ||
if (Kind == RecurKind::Add && Cur->getOpcode() == Instruction::Sub) | ||
Kind = RecurKind::Sub; | ||
else if (Kind == RecurKind::Sub && Cur->getOpcode() == Instruction::Add) | ||
Kind = RecurKind::Add; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little confused by changing the recur kind here. I tried removing this change and updating isRecurrenceInstr()
to:
case Instruction::Sub:
return InstDesc(Kind == RecurKind::Sub, I);
case Instruction::Add:
return InstDesc(Kind == RecurKind::Sub || Kind == RecurKind::Add, I);
And it seems no LV test changed, but maybe I've missed something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention, but this also requires updating isReductionPHI()
to attempt to match Add
reductions first:
if (AddReductionVar(Phi, RecurKind::Add, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {
LLVM_DEBUG(dbgs() << "Found an ADD reduction PHI." << *Phi << "\n");
return true;
}
if (AddReductionVar(Phi, RecurKind::Sub, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {
LLVM_DEBUG(dbgs() << "Found a SUB reduction PHI." << *Phi << "\n");
return true;
}
So if a reduction is all adds, it's matched as an Add
reduction. If it's all subs (or a mix of add/sub) it's matched as a Sub
reduction. Though maybe that's wrong? Is the intention that the reduction kind matches the last operation in the chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, that's a cleaner solution, thanks.
} else if (PhiR->isInLoop() && Kind == RecurKind::Sub && | ||
CurrentLinkI->getOpcode() == Instruction::Sub) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is no-longer tested (removed it and all tests passed), and at does somewhat confuse me (maybe just needs a comment). I'd think you'd negate the input for a sub
in an add
reduction, not a sub
in a sub
reduction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is something to do with the input of the llvm.vector.reduce.add
in the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes that's right, it needs to be checking for an Add RecurKind.
llvm/lib/Analysis/IVDescriptors.cpp
Outdated
if ((Kind != RecurKind::Sub && !Cur->isCommutative()) && !IsAPhi && | ||
!isa<SelectInst>(Cur) && !isa<ICmpInst>(Cur) && !isa<FCmpInst>(Cur) && | ||
!VisitedInsts.count(dyn_cast<Instruction>(Cur->getOperand(0)))) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this restriction be lifted? I don't think the new vector code is correct for @reduction_sub_rhs
:
Sub reduction on RHS:
s0 = x0 - 0
s1 = x1 - s0
s2 = x2 - s1
s3 = x3 - s2
= x3 - x2 + x1 - x0 + 0
!= vec_reduce_add((x0,x1,x2,x3) - (0, 0, 0, 0)) = x0 + x1 + x2 +x3
Sub reduction on LHS:
s0 = 0 - x0
s1 = s0 - x1
s2 = s1 - x2
s3 = s2 - x3
= 0 - x0 - x1 - x2 - x3
= vec_reduce_add((0, 0, 0, 0) - (x0,x1,x2,x3))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at the output IR for that test and it seemed equivalent but tracing the loop itself reveals that you're right and it's not the same. I've added this restriction back now, thanks!
@@ -644,14 +644,14 @@ define i32 @reduction_sub_lhs(ptr noalias nocapture %A) { | |||
; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], 256 | |||
; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]] | |||
; CHECK: middle.block: | |||
; CHECK-NEXT: [[TMP3:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[TMP1]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are now missing the negating the operand in the vector loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've traced how the scalar and vector loops act and they are equivalent as they are, but I think I should have removed the change I made to the comment since it doesn't create an in-loop reduction.
Data = [1, 2, 3, 4, 5, 6, 7, 8]
Scalar loop:
Iteration | Operation | New Phi
1 | 0 - 1 | -1
2 | -1 - 2 | -3
3 | -3 - 3 | -6
4 | -6 - 4 | -10
5 | -10 - 5 | -15
6 | -15 - 6 | -21
7 | -21 - 7 | -28
8 | -28 - 8 | -36
Vector loop:
Iteration | Operation | New Phi
1 | [0, 0, 0, 0] - [1, 2, 3, 4] | [-1, -2, -3, -4]
2 | [-1, -2, -3, -4] - [5, 6, 7, 8] | [-6, -8, -10, -12]
reduce.add([-6, -8, -10, -12]) = -36
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks.
@@ -9144,6 +9144,15 @@ void LoopVectorizationPlanner::adjustRecipesForReductions( | |||
CurrentLinkI->getFastMathFlags()); | |||
LinkVPBB->insert(FMulRecipe, CurrentLink->getIterator()); | |||
VecOp = FMulRecipe; | |||
} else if (PhiR->isInLoop() && Kind == RecurKind::Add && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this still looking for RecurdKind::Add
and why does this code still need to look at CurrentLink->getOpcode()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
@@ -627,7 +627,7 @@ for.end: ; preds = %for.body, %entry | |||
ret float %result.0.lcssa | |||
} | |||
|
|||
; Sub we can create a reduction, but not inloop | |||
; We can create an in-loop reduction for sub-reductions by negating the input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test with an interleave factor higher than 1?
I just tried building this test with -mllvm -prefer-inloop-reductions -mllvm -force-vector-interleave=2
:
int foo(int *src, int start, int n) {
for(int i=0; i<n; ++i)
start -= src[i];
return start;
}
and this resulted in a segmentation fault.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thank you. And I've added an interleaving run line to the inloop reduction test to make sure nothing breaks the other tests.
This PR allows the loop vectorizer to handle sub reductions by forming a normal add reduction with a negated input.
de3da9e
to
64aa25f
Compare
You can test this locally with the following command:git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/include/llvm/Analysis/IVDescriptors.h llvm/lib/Analysis/IVDescriptors.cpp llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp llvm/lib/Transforms/Utils/LoopUtils.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp llvm/test/Transforms/LoopVectorize/reduction-inloop.ll The following files introduce new uses of undef:
Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields In tests, avoid using For example, this is considered a bad practice: define void @fn() {
...
br i1 undef, ...
} Please use the following instead: define void @fn(i1 %cond) {
...
br i1 %cond, ...
} Please refer to the Undefined Behavior Manual for more information. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
This PR allows the loop vectorizer to handle in-loop sub reductions by forming a normal in-loop add reduction with a negated input.