[LV] Don't skip instrs with side-effects in reg pressure computation. #126415

fhahn · 2025-02-09T11:22:01Z

calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users.

This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores.

Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects.

llvmbot · 2025-02-09T11:22:33Z

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-backend-powerpc

Author: Florian Hahn (fhahn)

Changes

calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users.

This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores.

Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects.

Full diff: https://github.com/llvm/llvm-project/pull/126415.diff

5 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll (+2-1)
(modified) llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll (+2-1)
(modified) llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll (+2-2)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll (+6-4)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index dacee6445072aab..5f31a186cfd6e7c 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5253,7 +5253,7 @@ LoopVectorizationCostModel::calculateRegisterUsage(ArrayRef<ElementCount> VFs) {
       OpenIntervals.erase(ToRemove);
 
     // Ignore instructions that are never used within the loop.
-    if (!Ends.count(I))
+    if (!Ends.count(I) && !I->mayHaveSideEffects())
       continue;
 
     // Skip ignored values.
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
index 8239d32445c1054..961ef64acd4a46c 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
@@ -14,8 +14,9 @@
 define void @get_invariant_reg_usage(ptr %z) {
 ; CHECK: LV: Checking a loop in 'get_invariant_reg_usage'
 ; CHECK: LV(REG): VF = vscale x 16
-; CHECK-NEXT: LV(REG): Found max usage: 1 item
+; CHECK-NEXT: LV(REG): Found max usage: 2 item
 ; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 3 registers
+; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 1 registers 
 ; CHECK-NEXT: LV(REG): Found invariant usage: 2 item
 ; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers
 ; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 8 registers 
diff --git a/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll
index f45a2f0f5b7e8f3..021ef0d543a18ef 100644
--- a/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/LoongArch/reg-usage.ll
@@ -15,8 +15,9 @@ define void @bar(ptr %A, i32 signext %n) {
 ; CHECK-SCALAR-NEXT: LV(REG): RegisterClass: LoongArch::GPRRC, 1 registers
 ; CHECK-SCALAR-NEXT: LV: The target has 30 registers of LoongArch::GPRRC register class
 ; CHECK-SCALAR-NEXT: LV: The target has 32 registers of LoongArch::FPRRC register class
-; CHECK-VECTOR:      LV(REG): Found max usage: 1 item
+; CHECK-VECTOR:      LV(REG): Found max usage: 2 item
 ; CHECK-VECTOR-NEXT: LV(REG): RegisterClass: LoongArch::VRRC, 3 registers
+; CHECK-VECTOR-NEXT: LV(REG): RegisterClass: LoongArch::GPRRC, 1 registers
 ; CHECK-VECTOR-NEXT: LV(REG): Found invariant usage: 1 item
 ; CHECK-VECTOR-NEXT: LV(REG): RegisterClass: LoongArch::GPRRC, 1 registers
 ; CHECK-VECTOR-NEXT: LV: The target has 32 registers of LoongArch::VRRC register class
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll
index aac9aff3d391f5d..db4b580a3967718 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll
@@ -179,7 +179,7 @@ define void @double_(ptr nocapture %A, i32 %n) nounwind uwtable ssp {
 
 ;CHECK-PWR9: LV(REG): VF = 1
 ;CHECK-PWR9: LV(REG): Found max usage: 2 item
-;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers
+;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 3 registers
 ;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 5 registers
 ;CHECK-PWR9: LV(REG): Found invariant usage: 1 item
 ;CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 1 registers
@@ -248,7 +248,7 @@ define void @fp16_(ptr nocapture readonly %pIn, ptr nocapture %pOut, i32 %numRow
 ;CHECK-LABEL: fp16_
 ;CHECK: LV(REG): VF = 1
 ;CHECK: LV(REG): Found max usage: 2 item
-;CHECK: LV(REG): RegisterClass: PPC::GPRRC, 4 registers
+;CHECK: LV(REG): RegisterClass: PPC::GPRRC, 5 registers
 ;CHECK: LV(REG): RegisterClass: PPC::VSXRC, 2 registers
 entry:
   %tmp.0.extract.trunc = trunc i32 %scale.coerce to i16
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
index f630f4f21e065ff..4e563c242ee848c 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
@@ -128,8 +128,9 @@ define void @vector_reverse_i64(ptr nocapture noundef writeonly %A, ptr nocaptur
 ; CHECK-NEXT:  LV(REG): At #5 Interval # 3
 ; CHECK-NEXT:  LV(REG): At #6 Interval # 3
 ; CHECK-NEXT:  LV(REG): At #7 Interval # 3
-; CHECK-NEXT:  LV(REG): At #9 Interval # 1
-; CHECK-NEXT:  LV(REG): At #10 Interval # 2
+; CHECK-NEXT:  LV(REG): At #8 Interval # 3
+; CHECK-NEXT:  LV(REG): At #9 Interval # 2
+; CHECK-NEXT:  LV(REG): At #10 Interval # 3
 ; CHECK-NEXT:  LV(REG): VF = vscale x 4
 ; CHECK-NEXT:  LV(REG): Found max usage: 2 item
 ; CHECK-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 3 registers
@@ -377,8 +378,9 @@ define void @vector_reverse_f32(ptr nocapture noundef writeonly %A, ptr nocaptur
 ; CHECK-NEXT:  LV(REG): At #5 Interval # 3
 ; CHECK-NEXT:  LV(REG): At #6 Interval # 3
 ; CHECK-NEXT:  LV(REG): At #7 Interval # 3
-; CHECK-NEXT:  LV(REG): At #9 Interval # 1
-; CHECK-NEXT:  LV(REG): At #10 Interval # 2
+; CHECK-NEXT:  LV(REG): At #8 Interval # 3
+; CHECK-NEXT:  LV(REG): At #9 Interval # 2
+; CHECK-NEXT:  LV(REG): At #10 Interval # 3
 ; CHECK-NEXT:  LV(REG): VF = vscale x 4
 ; CHECK-NEXT:  LV(REG): Found max usage: 2 item
 ; CHECK-NEXT:  LV(REG): RegisterClass: RISCV::GPRRC, 3 registers

SamTebbs33

That makes a lot of sense, LGTM.

david-arm · 2025-02-11T13:42:11Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ -5253,7 +5253,7 @@ LoopVectorizationCostModel::calculateRegisterUsage(ArrayRef<ElementCount> VFs) {
      OpenIntervals.erase(ToRemove);

    // Ignore instructions that are never used within the loop.
-    if (!Ends.count(I))
+    if (!Ends.count(I) && !I->mayHaveSideEffects())


I guess this may also include calls to intrinsics or functions that return void?

Yes, any instruction that is not explicitly used and has side-effects. So here we don't skip intrinsics any more or calls that return void.

Instructions that should not count as 'real' users, e.g. llvm.assume() calls are skipped below (via ValuesToIgnore).

calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users. This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores. Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects.

…omputation. (#126415) calculateRegisterUsage adds end points for each user of an instruction to Ends and ignores instructions not added to it, i.e. instructions with no users. This means things like stores aren't included, which in turn means values that are only used in stores are also not included for consideration. This means we underestimate the register usage in cases where the only users are things like stores. Update the code to don't skip instructions without users (i.e. not in Ends) if they have side-effects. PR: llvm/llvm-project#126415

Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on #126415 PR: #126437

…26437) Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on llvm/llvm-project#126415 PR: llvm/llvm-project#126437

Add a version of calculateRegisterUsage that works estimates register usage for a VPlan. This mostly just ports the existing code, with some updates to figure out what recipes will generate vectors vs scalars. There are number of changes in the computed register usages, but they should be more accurate w.r.t. to the generated vector code. There are the following changes: * Scalar usage increases in most cases by 1, as we always create a scalar canonical IV, which is alive across the loop and is not considered by the legacy implementation * Output is ordered by insertion, now scalar registers are added first due the canonical IV phi. * Using the VPlan, we now also more precisely know if an induction will be vectorized or scalarized. Depends on llvm#126415 PR: llvm#126437

fhahn requested review from SamTebbs33, ayalz and david-arm February 9, 2025 11:22

llvmbot added backend:RISC-V backend:PowerPC vectorizers llvm:transforms labels Feb 9, 2025

fhahn mentioned this pull request Feb 9, 2025

[LV] Compute register usage for interleaving on VPlan. #126437

Merged

SamTebbs33 approved these changes Feb 10, 2025

View reviewed changes

david-arm reviewed Feb 11, 2025

View reviewed changes

fhahn force-pushed the lv-reg-usage-sideeffects branch from 40dd9b2 to c214984 Compare March 19, 2025 12:20

!fixup extend comment

c58a2e6

fhahn merged commit 11b8699 into llvm:main Mar 19, 2025
9 of 10 checks passed

fhahn deleted the lv-reg-usage-sideeffects branch March 19, 2025 15:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LV] Don't skip instrs with side-effects in reg pressure computation. #126415

[LV] Don't skip instrs with side-effects in reg pressure computation. #126415

Uh oh!

fhahn commented Feb 9, 2025

Uh oh!

llvmbot commented Feb 9, 2025 •

edited

Loading

Uh oh!

SamTebbs33 left a comment

Uh oh!

david-arm Feb 11, 2025

Uh oh!

fhahn Mar 10, 2025

Uh oh!

Uh oh!

Uh oh!

[LV] Don't skip instrs with side-effects in reg pressure computation. #126415

[LV] Don't skip instrs with side-effects in reg pressure computation. #126415

Uh oh!

Conversation

fhahn commented Feb 9, 2025

Uh oh!

llvmbot commented Feb 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SamTebbs33 left a comment

Choose a reason for hiding this comment

Uh oh!

david-arm Feb 11, 2025

Choose a reason for hiding this comment

Uh oh!

fhahn Mar 10, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Feb 9, 2025 •

edited

Loading