-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[LV] Don't skip instrs with side-effects in reg pressure computation. #126415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-powerpc Author: Florian Hahn (fhahn) ChangescalculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users. This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores. Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects. Full diff: https://github.com/llvm/llvm-project/pull/126415.diff 5 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index dacee6445072aab..5f31a186cfd6e7c 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5253,7 +5253,7 @@ LoopVectorizationCostModel::calculateRegisterUsage(ArrayRef<ElementCount> VFs) {
OpenIntervals.erase(ToRemove);
// Ignore instructions that are never used within the loop.
- if (!Ends.count(I))
+ if (!Ends.count(I) && !I->mayHaveSideEffects())
continue;
// Skip ignored values.
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
index 8239d32445c1054..961ef64acd4a46c 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
@@ -14,8 +14,9 @@
define void @get_invariant_reg_usage(ptr %z) {
; CHECK: LV: Checking a loop in 'get_invariant_reg_usage'
; CHECK: LV(REG): VF = vscale x 16
-; CHECK-NEXT: LV(REG): Found max usage: 1 item
+; CHECK-NEXT: LV(REG): Found max usage: 2 item
; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 3 registers
+; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 1 registers
; CHECK-NEXT: LV(REG): Found invariant usage: 2 item
; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers
; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 8 registers
diff --git a/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll
index f45a2f0f5b7e8f3..021ef0d543a18ef 100644
--- a/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll
@@ -15,8 +15,9 @@ define void @bar(ptr %A, i32 signext %n) {
; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: LoongArch::GPRRC, 1 registers
; CHECK-SCALAR-NEXT: LV: The target has 30 registers of LoongArch::GPRRC register class
; CHECK-SCALAR-NEXT: LV: The target has 32 registers of LoongArch::FPRRC register class
-; CHECK-VECTOR: LV(REG): Found max usage: 1 item
+; CHECK-VECTOR: LV(REG): Found max usage: 2 item
; CHECK-VECTOR-NEXT: LV(REG): RegisterClass: LoongArch::VRRC, 3 registers
+; CHECK-VECTOR-NEXT: LV(REG): RegisterClass: LoongArch::GPRRC, 1 registers
; CHECK-VECTOR-NEXT: LV(REG): Found invariant usage: 1 item
; CHECK-VECTOR-NEXT: LV(REG): RegisterClass: LoongArch::GPRRC, 1 registers
; CHECK-VECTOR-NEXT: LV: The target has 32 registers of LoongArch::VRRC register class
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll
index aac9aff3d391f5d..db4b580a3967718 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll
@@ -179,7 +179,7 @@ define void @double_(ptr nocapture %A, i32 %n) nounwind uwtable ssp {
;CHECK-PWR9: LV(REG): VF = 1
;CHECK-PWR9: LV(REG): Found max usage: 2 item
-;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers
+;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 3 registers
;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 5 registers
;CHECK-PWR9: LV(REG): Found invariant usage: 1 item
;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 1 registers
@@ -248,7 +248,7 @@ define void @fp16_(ptr nocapture readonly %pIn, ptr nocapture %pOut, i32 %numRow
;CHECK-LABEL: fp16_
;CHECK: LV(REG): VF = 1
;CHECK: LV(REG): Found max usage: 2 item
-;CHECK: LV(REG): RegisterClass: PPC::GPRRC, 4 registers
+;CHECK: LV(REG): RegisterClass: PPC::GPRRC, 5 registers
;CHECK: LV(REG): RegisterClass: PPC::VSXRC, 2 registers
entry:
%tmp.0.extract.trunc = trunc i32 %scale.coerce to i16
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
index f630f4f21e065ff..4e563c242ee848c 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
@@ -128,8 +128,9 @@ define void @vector_reverse_i64(ptr nocapture noundef writeonly %A, ptr nocaptur
; CHECK-NEXT: LV(REG): At #5 Interval # 3
; CHECK-NEXT: LV(REG): At #6 Interval # 3
; CHECK-NEXT: LV(REG): At #7 Interval # 3
-; CHECK-NEXT: LV(REG): At #9 Interval # 1
-; CHECK-NEXT: LV(REG): At #10 Interval # 2
+; CHECK-NEXT: LV(REG): At #8 Interval # 3
+; CHECK-NEXT: LV(REG): At #9 Interval # 2
+; CHECK-NEXT: LV(REG): At #10 Interval # 3
; CHECK-NEXT: LV(REG): VF = vscale x 4
; CHECK-NEXT: LV(REG): Found max usage: 2 item
; CHECK-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
@@ -377,8 +378,9 @@ define void @vector_reverse_f32(ptr nocapture noundef writeonly %A, ptr nocaptur
; CHECK-NEXT: LV(REG): At #5 Interval # 3
; CHECK-NEXT: LV(REG): At #6 Interval # 3
; CHECK-NEXT: LV(REG): At #7 Interval # 3
-; CHECK-NEXT: LV(REG): At #9 Interval # 1
-; CHECK-NEXT: LV(REG): At #10 Interval # 2
+; CHECK-NEXT: LV(REG): At #8 Interval # 3
+; CHECK-NEXT: LV(REG): At #9 Interval # 2
+; CHECK-NEXT: LV(REG): At #10 Interval # 3
; CHECK-NEXT: LV(REG): VF = vscale x 4
; CHECK-NEXT: LV(REG): Found max usage: 2 item
; CHECK-NEXT: LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense, LGTM.
@@ -5253,7 +5253,7 @@ LoopVectorizationCostModel::calculateRegisterUsage(ArrayRef<ElementCount> VFs) { | |||
OpenIntervals.erase(ToRemove); | |||
|
|||
// Ignore instructions that are never used within the loop. | |||
if (!Ends.count(I)) | |||
if (!Ends.count(I) && !I->mayHaveSideEffects()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this may also include calls to intrinsics or functions that return void?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, any instruction that is not explicitly used and has side-effects. So here we don't skip intrinsics any more or calls that return void.
Instructions that should not count as 'real' users, e.g. llvm.assume() calls are skipped below (via ValuesToIgnore).
calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users. This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores. Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects.
40dd9b2
to
c214984
Compare
…omputation. (#126415) calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users. This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores. Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects. PR: llvm/llvm-project#126415
Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on #126415 PR: #126437
…26437) Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on llvm/llvm-project#126415 PR: llvm/llvm-project#126437
Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on llvm#126415 PR: llvm#126437
Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on llvm#126415 PR: llvm#126437
calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users.
This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores.
Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects.