[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #106530

fhahn · 2024-08-29T11:30:38Z

Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2).
If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot not wrap, so unsigned max represents an upper bound.

llvmbot · 2024-08-29T11:31:09Z

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2).
If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot not wrap, so unsigned max represents an upper bound.

Full diff: https://github.com/llvm/llvm-project/pull/106530.diff

2 Files Affected:

(modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+19-6)
(modified) llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll (+2-4)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 980f142f113265..a06490fdd06fa8 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -219,13 +219,29 @@ static std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
   const SCEV *ScStart;
   const SCEV *ScEnd;
 
+  auto &DL = Lp->getHeader()->getDataLayout();
+  Type *IdxTy = DL.getIndexType(PtrExpr->getType());
+  const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
   if (SE->isLoopInvariant(PtrExpr, Lp)) {
     ScStart = ScEnd = PtrExpr;
   } else if (auto *AR = dyn_cast<SCEVAddRecExpr>(PtrExpr)) {
-    const SCEV *Ex = PSE.getSymbolicMaxBackedgeTakenCount();
-
     ScStart = AR->getStart();
-    ScEnd = AR->evaluateAtIteration(Ex, *SE);
+    const SCEV *BTC = PSE.getBackedgeTakenCount();
+    if (!isa<SCEVCouldNotCompute>(BTC))
+      ScEnd = AR->evaluateAtIteration(BTC, *SE);
+    else {
+      // Evaluating AR at MaxBTC may wrap and create an expression that is less
+      // than the start of the AddRec due to wrapping (for example consider
+      // MaxBTC = -2). If that's the case, set ScEnd to -(EltSize + 1). ScEnd
+      // will get incremented by EltSize before returning, so this effectively
+      // sets ScEnd to unsigned max. Note that LAA separately checks that
+      // accesses cannot not wrap, so unsigned max represents an upper bound.
+      const SCEV *MaxBTC = PSE.getSymbolicMaxBackedgeTakenCount();
+      ScEnd = AR->evaluateAtIteration(MaxBTC, *SE);
+      if (!SE->isKnownNonNegative(SE->getMinusSCEV(ScEnd, ScStart)))
+        ScEnd = SE->getNegativeSCEV(
+            SE->getAddExpr(EltSizeSCEV, SE->getOne(EltSizeSCEV->getType())));
+    }
     const SCEV *Step = AR->getStepRecurrence(*SE);
 
     // For expressions with negative step, the upper bound is ScStart and the
@@ -247,9 +263,6 @@ static std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
   assert(SE->isLoopInvariant(ScEnd, Lp)&& "ScEnd needs to be invariant");
 
   // Add the size of the pointed element to ScEnd.
-  auto &DL = Lp->getHeader()->getDataLayout();
-  Type *IdxTy = DL.getIndexType(PtrExpr->getType());
-  const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
   ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
 
   Iter->second = {ScStart, ScEnd};
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll b/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll
index b9951c7ed02ec0..6da807749ab909 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll
@@ -3,7 +3,6 @@
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
-; FIXME: Start == End for access group with AddRec.
 define void @runtime_checks_with_symbolic_max_btc_neg_1(ptr %P, ptr %S, i32 %x, i32 %y) {
 ; CHECK-LABEL: 'runtime_checks_with_symbolic_max_btc_neg_1'
 ; CHECK-NEXT:    loop:
@@ -17,7 +16,7 @@ define void @runtime_checks_with_symbolic_max_btc_neg_1(ptr %P, ptr %S, i32 %x,
 ; CHECK-NEXT:        ptr %S
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP1]]:
-; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: ((4 * %y) + %P))
+; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: -1)
 ; CHECK-NEXT:            Member: {((4 * %y) + %P),+,4}<%loop>
 ; CHECK-NEXT:        Group [[GRP2]]:
 ; CHECK-NEXT:          (Low: %S High: (4 + %S))
@@ -44,7 +43,6 @@ exit:
   ret void
 }
 
-; FIXME: Start > End for access group with AddRec.
 define void @runtime_check_with_symbolic_max_btc_neg_2(ptr %P, ptr %S, i32 %x, i32 %y) {
 ; CHECK-LABEL: 'runtime_check_with_symbolic_max_btc_neg_2'
 ; CHECK-NEXT:    loop:
@@ -58,7 +56,7 @@ define void @runtime_check_with_symbolic_max_btc_neg_2(ptr %P, ptr %S, i32 %x, i
 ; CHECK-NEXT:        ptr %S
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP3]]:
-; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: (-4 + (4 * %y) + %P))
+; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: -1)
 ; CHECK-NEXT:            Member: {((4 * %y) + %P),+,4}<%loop>
 ; CHECK-NEXT:        Group [[GRP4]]:
 ; CHECK-NEXT:          (Low: %S High: (4 + %S))

david-arm · 2024-08-29T13:29:56Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

-    ScEnd = AR->evaluateAtIteration(Ex, *SE);
+    const SCEV *BTC = PSE.getBackedgeTakenCount();
+    if (!isa<SCEVCouldNotCompute>(BTC))
+      ScEnd = AR->evaluateAtIteration(BTC, *SE);


Why do we not have to worry about wrapping for this case? The value for BTC may not be constant and may also be an expression I think.

IIUC it should be safe because we know the loop takes the backedge exactly BTC times. LAA separately checks if the pointers used for accesses can warp. If the evaluation wraps, the loop either executes UB (memory access with poison pointer) or the wraped/poisoned pointer is not used..

Added a comment explaining the reasoning

I guess I probably don't understand LoopAccessAnalysis well enough here, but I tried out this patch and ran SCEV analysis on this loop:

loop: %iv = phi i32 [ %y, %entry ], [ %iv.next, %loop ] %gep1.iv = getelementptr inbounds i32, ptr %S, i32 %iv %load = load i32, ptr %gep1.iv, align 4 %gep2.iv = getelementptr inbounds i32, ptr %P, i32 %iv store i32 %load, ptr %gep2.iv, align 4 %iv.next = add nsw i32 %iv, 1 %c.2 = icmp slt i32 %iv.next, 2147483647 br i1 %c.2, label %loop, label %exit

and with the exact BTC the debug output looks like this:

Grouped accesses: Group 0xaaaaf5a0b678: (Low: ((4 * %y) + %P) High: (-4 + %P)) Member: {((4 * %y) + %P),+,4}<%loop> Group 0xaaaaf5a0b6a8: (Low: ((4 * %y) + %S) High: (-4 + %S)) Member: {((4 * %y) + %S),+,4}<%loop>

The High value has still wrapped and looks wrong. Are you saying that some code in LoopAccessAnalysis will bail out before even attempting to use the High value if the backedge taken count is exact, whereas for a symbolic backedge taken count we take a different code path and will still attempt to use the High value?

The maximum size of an allocated object is half the pointer index space (in this case 2147483647) and the pointer past the object must also not wrap. The base pointer of the inbounds GEP must also be inbounds, so the last accessed address can be P + 2147483646, so the address one past the object is P + 2147483647

For the test above, the last accessed address would be P + 2147483646 * 4, so that would be UB.

I added variants to the new version of the PR that use GEPs of i8, where one case is valid and doesn't wrap (and has a correct upper bound) and one case where we just wrap by 1 and have an incorrect bound but the input is UB.

https://github.com/llvm/llvm-project/pull/128061/files#diff-c6a2d04947e1dbcdf611a1a351b05824360a86c5827c88867422d241f0fe6394

I hope I didn't miss anything

Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2). If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot not wrap, so unsigned max represents an upper bound.

fhahn

Unfortunately I deleted the branch by accident which closed the PR and now I am not able to open it again.

I created a new version here: #128061

fhahn · 2025-02-20T20:43:23Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

-    ScEnd = AR->evaluateAtIteration(Ex, *SE);
+    const SCEV *BTC = PSE.getBackedgeTakenCount();
+    if (!isa<SCEVCouldNotCompute>(BTC))
+      ScEnd = AR->evaluateAtIteration(BTC, *SE);


The maximum size of an allocated object is half the pointer index space (in this case 2147483647) and the pointer past the object must also not wrap. The base pointer of the inbounds GEP must also be inbounds, so the last accessed address can be P + 2147483646, so the address one past the object is P + 2147483647

For the test above, the last accessed address would be P + 2147483646 * 4, so that would be UB.

I added variants to the new version of the PR that use GEPs of i8, where one case is valid and doesn't wrap (and has a correct upper bound) and one case where we just wrap by 1 and have an incorrect bound but the input is UB.

https://github.com/llvm/llvm-project/pull/128061/files#diff-c6a2d04947e1dbcdf611a1a351b05824360a86c5827c88867422d241f0fe6394

I hope I didn't miss anything

fhahn requested review from preames, Meinersbur, anemet and david-arm August 29, 2024 11:30

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Aug 29, 2024

david-arm reviewed Aug 29, 2024

View reviewed changes

fhahn added 2 commits August 29, 2024 18:07

!fixup docment why evaluating at BTC should be safe

c037350

fhahn force-pushed the laa-symbolic-max-btc-overflow branch from 85e4af4 to c037350 Compare August 29, 2024 19:25

fhahn closed this Nov 22, 2024

fhahn deleted the laa-symbolic-max-btc-overflow branch November 22, 2024 20:21

fhahn restored the laa-symbolic-max-btc-overflow branch November 22, 2024 20:21

fhahn mentioned this pull request Feb 20, 2025

[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #128061

Merged

fhahn commented Feb 20, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #106530

[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #106530

Uh oh!

fhahn commented Aug 29, 2024

Uh oh!

llvmbot commented Aug 29, 2024

Uh oh!

david-arm Aug 29, 2024

Uh oh!

fhahn Aug 29, 2024

Uh oh!

david-arm Sep 2, 2024

Uh oh!

fhahn Feb 20, 2025

Uh oh!

fhahn left a comment

Uh oh!

fhahn Feb 20, 2025

Uh oh!

Uh oh!

[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #106530

[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #106530

Uh oh!

Conversation

fhahn commented Aug 29, 2024

Uh oh!

llvmbot commented Aug 29, 2024

Uh oh!

david-arm Aug 29, 2024

Choose a reason for hiding this comment

Uh oh!

fhahn Aug 29, 2024

Choose a reason for hiding this comment

Uh oh!

david-arm Sep 2, 2024

Choose a reason for hiding this comment

Uh oh!

fhahn Feb 20, 2025

Choose a reason for hiding this comment

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

fhahn Feb 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!