Skip to content

[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. #106530

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Aug 29, 2024

Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2).
If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot not wrap, so unsigned max represents an upper bound.

@llvmbot
Copy link
Member

llvmbot commented Aug 29, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2).
If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot not wrap, so unsigned max represents an upper bound.


Full diff: https://github.com/llvm/llvm-project/pull/106530.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+19-6)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll (+2-4)
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 980f142f113265..a06490fdd06fa8 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -219,13 +219,29 @@ static std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
   const SCEV *ScStart;
   const SCEV *ScEnd;
 
+  auto &DL = Lp->getHeader()->getDataLayout();
+  Type *IdxTy = DL.getIndexType(PtrExpr->getType());
+  const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
   if (SE->isLoopInvariant(PtrExpr, Lp)) {
     ScStart = ScEnd = PtrExpr;
   } else if (auto *AR = dyn_cast<SCEVAddRecExpr>(PtrExpr)) {
-    const SCEV *Ex = PSE.getSymbolicMaxBackedgeTakenCount();
-
     ScStart = AR->getStart();
-    ScEnd = AR->evaluateAtIteration(Ex, *SE);
+    const SCEV *BTC = PSE.getBackedgeTakenCount();
+    if (!isa<SCEVCouldNotCompute>(BTC))
+      ScEnd = AR->evaluateAtIteration(BTC, *SE);
+    else {
+      // Evaluating AR at MaxBTC may wrap and create an expression that is less
+      // than the start of the AddRec due to wrapping (for example consider
+      // MaxBTC = -2). If that's the case, set ScEnd to -(EltSize + 1). ScEnd
+      // will get incremented by EltSize before returning, so this effectively
+      // sets ScEnd to unsigned max. Note that LAA separately checks that
+      // accesses cannot not wrap, so unsigned max represents an upper bound.
+      const SCEV *MaxBTC = PSE.getSymbolicMaxBackedgeTakenCount();
+      ScEnd = AR->evaluateAtIteration(MaxBTC, *SE);
+      if (!SE->isKnownNonNegative(SE->getMinusSCEV(ScEnd, ScStart)))
+        ScEnd = SE->getNegativeSCEV(
+            SE->getAddExpr(EltSizeSCEV, SE->getOne(EltSizeSCEV->getType())));
+    }
     const SCEV *Step = AR->getStepRecurrence(*SE);
 
     // For expressions with negative step, the upper bound is ScStart and the
@@ -247,9 +263,6 @@ static std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
   assert(SE->isLoopInvariant(ScEnd, Lp)&& "ScEnd needs to be invariant");
 
   // Add the size of the pointed element to ScEnd.
-  auto &DL = Lp->getHeader()->getDataLayout();
-  Type *IdxTy = DL.getIndexType(PtrExpr->getType());
-  const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
   ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
 
   Iter->second = {ScStart, ScEnd};
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll b/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll
index b9951c7ed02ec0..6da807749ab909 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/evaluate-at-symbolic-max-backedge-taken-count-may-wrap.ll
@@ -3,7 +3,6 @@
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
-; FIXME: Start == End for access group with AddRec.
 define void @runtime_checks_with_symbolic_max_btc_neg_1(ptr %P, ptr %S, i32 %x, i32 %y) {
 ; CHECK-LABEL: 'runtime_checks_with_symbolic_max_btc_neg_1'
 ; CHECK-NEXT:    loop:
@@ -17,7 +16,7 @@ define void @runtime_checks_with_symbolic_max_btc_neg_1(ptr %P, ptr %S, i32 %x,
 ; CHECK-NEXT:        ptr %S
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP1]]:
-; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: ((4 * %y) + %P))
+; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: -1)
 ; CHECK-NEXT:            Member: {((4 * %y) + %P),+,4}<%loop>
 ; CHECK-NEXT:        Group [[GRP2]]:
 ; CHECK-NEXT:          (Low: %S High: (4 + %S))
@@ -44,7 +43,6 @@ exit:
   ret void
 }
 
-; FIXME: Start > End for access group with AddRec.
 define void @runtime_check_with_symbolic_max_btc_neg_2(ptr %P, ptr %S, i32 %x, i32 %y) {
 ; CHECK-LABEL: 'runtime_check_with_symbolic_max_btc_neg_2'
 ; CHECK-NEXT:    loop:
@@ -58,7 +56,7 @@ define void @runtime_check_with_symbolic_max_btc_neg_2(ptr %P, ptr %S, i32 %x, i
 ; CHECK-NEXT:        ptr %S
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP3]]:
-; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: (-4 + (4 * %y) + %P))
+; CHECK-NEXT:          (Low: ((4 * %y) + %P) High: -1)
 ; CHECK-NEXT:            Member: {((4 * %y) + %P),+,4}<%loop>
 ; CHECK-NEXT:        Group [[GRP4]]:
 ; CHECK-NEXT:          (Low: %S High: (4 + %S))

ScEnd = AR->evaluateAtIteration(Ex, *SE);
const SCEV *BTC = PSE.getBackedgeTakenCount();
if (!isa<SCEVCouldNotCompute>(BTC))
ScEnd = AR->evaluateAtIteration(BTC, *SE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we not have to worry about wrapping for this case? The value for BTC may not be constant and may also be an expression I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC it should be safe because we know the loop takes the backedge exactly BTC times. LAA separately checks if the pointers used for accesses can warp. If the evaluation wraps, the loop either executes UB (memory access with poison pointer) or the wraped/poisoned pointer is not used..

Added a comment explaining the reasoning

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I probably don't understand LoopAccessAnalysis well enough here, but I tried out this patch and ran SCEV analysis on this loop:

loop:
  %iv = phi i32 [ %y, %entry ], [ %iv.next, %loop ]
  %gep1.iv = getelementptr inbounds i32, ptr %S, i32 %iv
  %load = load i32, ptr %gep1.iv, align 4
  %gep2.iv = getelementptr inbounds i32, ptr %P, i32 %iv
  store i32 %load, ptr %gep2.iv, align 4
  %iv.next = add nsw i32 %iv, 1
  %c.2 = icmp slt i32 %iv.next, 2147483647
  br i1 %c.2, label %loop, label %exit

and with the exact BTC the debug output looks like this:

    Grouped accesses:
      Group 0xaaaaf5a0b678:
        (Low: ((4 * %y) + %P) High: (-4 + %P))
          Member: {((4 * %y) + %P),+,4}<%loop>
      Group 0xaaaaf5a0b6a8:
        (Low: ((4 * %y) + %S) High: (-4 + %S))
          Member: {((4 * %y) + %S),+,4}<%loop>

The High value has still wrapped and looks wrong. Are you saying that some code in LoopAccessAnalysis will bail out before even attempting to use the High value if the backedge taken count is exact, whereas for a symbolic backedge taken count we take a different code path and will still attempt to use the High value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maximum size of an allocated object is half the pointer index space (in this case 2147483647) and the pointer past the object must also not wrap. The base pointer of the inbounds GEP must also be inbounds, so the last accessed address can be P + 2147483646, so the address one past the object is P + 2147483647

For the test above, the last accessed address would be P + 2147483646 * 4, so that would be UB.

I added variants to the new version of the PR that use GEPs of i8, where one case is valid and doesn't wrap (and has a correct upper bound) and one case where we just wrap by 1 and have an incorrect bound but the input is UB.

https://github.com/llvm/llvm-project/pull/128061/files#diff-c6a2d04947e1dbcdf611a1a351b05824360a86c5827c88867422d241f0fe6394

I hope I didn't miss anything

fhahn added 2 commits August 29, 2024 18:07
Evaluating AR at the symbolic max BTC may wrap and create an expression
that is less than the start of the AddRec due to wrapping (for example
consider MaxBTC = -2).
If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get
incremented by EltSize before returning, so this effectively
sets ScEnd to unsigned max. Note that LAA separately checks that
accesses cannot not wrap, so unsigned max represents an upper bound.
@fhahn fhahn force-pushed the laa-symbolic-max-btc-overflow branch from 85e4af4 to c037350 Compare August 29, 2024 19:25
@fhahn fhahn closed this Nov 22, 2024
@fhahn fhahn deleted the laa-symbolic-max-btc-overflow branch November 22, 2024 20:21
@fhahn fhahn restored the laa-symbolic-max-btc-overflow branch November 22, 2024 20:21
Copy link
Contributor Author

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I deleted the branch by accident which closed the PR and now I am not able to open it again.

I created a new version here: #128061

ScEnd = AR->evaluateAtIteration(Ex, *SE);
const SCEV *BTC = PSE.getBackedgeTakenCount();
if (!isa<SCEVCouldNotCompute>(BTC))
ScEnd = AR->evaluateAtIteration(BTC, *SE);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maximum size of an allocated object is half the pointer index space (in this case 2147483647) and the pointer past the object must also not wrap. The base pointer of the inbounds GEP must also be inbounds, so the last accessed address can be P + 2147483646, so the address one past the object is P + 2147483647

For the test above, the last accessed address would be P + 2147483646 * 4, so that would be UB.

I added variants to the new version of the PR that use GEPs of i8, where one case is valid and doesn't wrap (and has a correct upper bound) and one case where we just wrap by 1 and have an incorrect bound but the input is UB.

https://github.com/llvm/llvm-project/pull/128061/files#diff-c6a2d04947e1dbcdf611a1a351b05824360a86c5827c88867422d241f0fe6394

I hope I didn't miss anything

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants