Skip to content

[DA] disambiguate evolution of base addresses #116628

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 30, 2025
Merged

[DA] disambiguate evolution of base addresses #116628

merged 1 commit into from
Jan 30, 2025

Conversation

sebpop
Copy link
Contributor

@sebpop sebpop commented Nov 18, 2024

This patch fixes two bugs:
#41488
#53942

The dependence analysis assumes that the base address of array accesses is invariant across loop iterations. In both bugs the base address evolves following loop iterations: the base address flip-flops between two different memory objects.

Based on the scalar evolution of base addresses, the patch adds code to separate the 3 alias cases {must, no, may}-alias where the base addresses are identical at every iteration, never the same, and unknown.

@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2024

@llvm/pr-subscribers-testing-tools

@llvm/pr-subscribers-llvm-analysis

Author: Sebastian Pop (sebpop)

Changes

This patch fixes two bugs:
#41488
#53942

The dependence analysis assumes that the base address of array accesses is invariant across loop iterations. In both bugs the base address evolves following loop iterations: the base address flip-flops between two different memory objects.

Based on the scalar evolution of base addresses, the patch adds code to separate the 3 alias cases {must, no, may}-alias where the base addresses are identical at every iteration, never the same, and unknown.


Full diff: https://github.com/llvm/llvm-project/pull/116628.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/DependenceAnalysis.cpp (+59-21)
  • (added) llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll (+136)
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index a4a98ea0bae146..cd7dd481c16513 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -712,10 +712,60 @@ void Dependence::dump(raw_ostream &OS) const {
 // tbaa, non-overlapping regions etc), then it is known there is no dependecy.
 // Otherwise the underlying objects are checked to see if they point to
 // different identifiable objects.
-static AliasResult underlyingObjectsAlias(AAResults *AA,
-                                          const DataLayout &DL,
-                                          const MemoryLocation &LocA,
-                                          const MemoryLocation &LocB) {
+static AliasResult underlyingObjectsAlias(AAResults *AA, LoopInfo *LI,
+                                          ScalarEvolution *SE, Instruction *A,
+                                          Instruction *B) {
+  const MemoryLocation &LocA = MemoryLocation::get(A);
+  const MemoryLocation &LocB = MemoryLocation::get(B);
+
+  // Check the underlying objects are the same
+  const Value *AObj = getUnderlyingObject(LocA.Ptr);
+  const Value *BObj = getUnderlyingObject(LocB.Ptr);
+
+  // If the underlying objects are the same, they must alias.
+  if (AObj == BObj)
+    return AliasResult::MustAlias;
+
+  if (auto *APhi = dyn_cast<PHINode>(AObj)) {
+    if (auto *BPhi = dyn_cast<PHINode>(BObj)) {
+      Loop *ALoop = LI->getLoopFor(APhi->getParent());
+      Loop *BLoop = LI->getLoopFor(BPhi->getParent());
+      if (ALoop == BLoop) {
+        auto *SCEVa = SE->getSCEV(const_cast<Value *>(AObj));
+        auto *SCEVb = SE->getSCEV(const_cast<Value *>(BObj));
+
+        // If the SCEVs are the same, they must alias.
+        if (SCEVa == SCEVb)
+          return AliasResult::MustAlias;
+
+        // If SCEV cannot analyze one of the values, then they may alias.
+        if (isa<SCEVUnknown>(SCEVa) || isa<SCEVUnknown>(SCEVb))
+          return AliasResult::MayAlias;
+
+        // Check whether the start values alias.
+        const SCEV *StartA = SCEVa;
+        while (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(StartA))
+          StartA = AR->getStart();
+
+        const SCEV *StartB = SCEVb;
+        while (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(StartB))
+          StartB = AR->getStart();
+
+        if (const SCEVUnknown *UA = dyn_cast<SCEVUnknown>(StartA)) {
+          if (const SCEVUnknown *UB = dyn_cast<SCEVUnknown>(StartB)) {
+            MemoryLocation LocAS =
+                MemoryLocation::getBeforeOrAfter(UA->getValue());
+            MemoryLocation LocBS =
+                MemoryLocation::getBeforeOrAfter(UB->getValue());
+            if (AA->isNoAlias(LocAS, LocBS))
+              return AliasResult::NoAlias;
+          }
+        }
+        return AliasResult::MayAlias;
+      }
+    }
+  }
+
   // Check the original locations (minus size) for noalias, which can happen for
   // tbaa, incompatible underlying object locations, etc.
   MemoryLocation LocAS =
@@ -725,14 +775,6 @@ static AliasResult underlyingObjectsAlias(AAResults *AA,
   if (AA->isNoAlias(LocAS, LocBS))
     return AliasResult::NoAlias;
 
-  // Check the underlying objects are the same
-  const Value *AObj = getUnderlyingObject(LocA.Ptr);
-  const Value *BObj = getUnderlyingObject(LocB.Ptr);
-
-  // If the underlying objects are the same, they must alias
-  if (AObj == BObj)
-    return AliasResult::MustAlias;
-
   // We may have hit the recursion limit for underlying objects, or have
   // underlying objects where we don't know they will alias.
   if (!isIdentifiedObject(AObj) || !isIdentifiedObject(BObj))
@@ -743,7 +785,6 @@ static AliasResult underlyingObjectsAlias(AAResults *AA,
   return AliasResult::NoAlias;
 }
 
-
 // Returns true if the load or store can be analyzed. Atomic and volatile
 // operations have properties which this analysis does not understand.
 static
@@ -3606,9 +3647,7 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst,
   Value *SrcPtr = getLoadStorePointerOperand(Src);
   Value *DstPtr = getLoadStorePointerOperand(Dst);
 
-  switch (underlyingObjectsAlias(AA, F->getDataLayout(),
-                                 MemoryLocation::get(Dst),
-                                 MemoryLocation::get(Src))) {
+  switch (underlyingObjectsAlias(AA, LI, SE, Dst, Src)) {
   case AliasResult::MayAlias:
   case AliasResult::PartialAlias:
     // cannot analyse objects if we don't understand their aliasing.
@@ -4030,11 +4069,8 @@ const SCEV *DependenceInfo::getSplitIteration(const Dependence &Dep,
   assert(Dst->mayReadFromMemory() || Dst->mayWriteToMemory());
   assert(isLoadOrStore(Src));
   assert(isLoadOrStore(Dst));
-  Value *SrcPtr = getLoadStorePointerOperand(Src);
-  Value *DstPtr = getLoadStorePointerOperand(Dst);
-  assert(underlyingObjectsAlias(
-             AA, F->getDataLayout(), MemoryLocation::get(Dst),
-             MemoryLocation::get(Src)) == AliasResult::MustAlias);
+  assert(underlyingObjectsAlias(AA, LI, SE, Dst, Src) ==
+         AliasResult::MustAlias);
 
   // establish loop nesting levels
   establishNestingLevels(Src, Dst);
@@ -4043,6 +4079,8 @@ const SCEV *DependenceInfo::getSplitIteration(const Dependence &Dep,
 
   unsigned Pairs = 1;
   SmallVector<Subscript, 2> Pair(Pairs);
+  Value *SrcPtr = getLoadStorePointerOperand(Src);
+  Value *DstPtr = getLoadStorePointerOperand(Dst);
   const SCEV *SrcSCEV = SE->getSCEV(SrcPtr);
   const SCEV *DstSCEV = SE->getSCEV(DstPtr);
   Pair[0].Src = SrcSCEV;
diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
new file mode 100644
index 00000000000000..2d80b2493a389b
--- /dev/null
+++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
@@ -0,0 +1,136 @@
+; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa 2>&1 \
+; RUN: | FileCheck %s
+
+; Check that dependence analysis correctly handles flip-flop of base addresses.
+; Bug 41488 - https://github.com/llvm/llvm-project/issues/41488
+
+; CHECK-LABEL: bug41488_test1
+; CHECK-NOT: da analyze - none!
+
+define float @bug41488_test1() {
+entry:
+  %g = alloca float, align 4
+  %h = alloca float, align 4
+  br label %for.body
+
+for.body:
+  %p = phi float* [ %g, %entry ], [ %q, %for.body ]
+  %q = phi float* [ %h, %entry ], [ %p, %for.body ]
+  %0 = load float, float* %p, align 4
+  store float undef, float* %q, align 4
+  %branch_cond = fcmp ugt float %0, 0.0
+  br i1 %branch_cond, label %for.cond.cleanup, label %for.body
+
+for.cond.cleanup:
+  ret float undef
+}
+
+; CHECK-LABEL: bug41488_test2
+; CHECK-NOT: da analyze - none!
+
+define void @bug41488_test2(i32 %n) {
+entry:
+  %g = alloca float, align 4
+  %h = alloca float, align 4
+  br label %for.body
+
+for.body:
+  %i = phi i32 [0, %entry ], [ %inc, %for.body ]
+  %p = phi float* [ %g, %entry ], [ %q, %for.body ]
+  %q = phi float* [ %h, %entry ], [ %p, %for.body ]
+  %0 = load float, float* %p, align 4
+  store float 0.0, float* %q, align 4
+  %inc = add nuw i32 %i, 1
+  %branch_cond = icmp ult i32 %i, %n
+  br i1 %branch_cond, label %for.body, label %for.cond.cleanup
+
+for.cond.cleanup:
+  ret void
+}
+
+; Bug 53942 - https://github.com/llvm/llvm-project/issues/53942
+; CHECK-LABEL: bug53942_foo
+; CHECK-NOT: da analyze - none!
+
+define void @bug53942_foo(i32 noundef %n, ptr noalias nocapture noundef writeonly %A, ptr noalias nocapture noundef %B) {
+entry:
+  %cmp8 = icmp sgt i32 %n, 1
+  br i1 %cmp8, label %for.body.preheader, label %for.cond.cleanup
+
+for.body.preheader:                               ; preds = %entry
+  %wide.trip.count = zext nneg i32 %n to i64
+  br label %for.body
+
+for.cond.cleanup:                                 ; preds = %for.body, %entry
+  ret void
+
+for.body:                                         ; preds = %for.body.preheader, %for.body
+  %indvars.iv = phi i64 [ 1, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
+  %ptr1.011 = phi ptr [ %A, %for.body.preheader ], [ %ptr2.09, %for.body ]
+  %ptr2.09 = phi ptr [ %B, %for.body.preheader ], [ %ptr1.011, %for.body ]
+  %.pre = load double, ptr %B, align 8
+  %arrayidx2 = getelementptr inbounds double, ptr %ptr1.011, i64 %indvars.iv
+  store double %.pre, ptr %arrayidx2, align 8
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
+  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
+}
+
+
+; Bug 53942 - https://github.com/llvm/llvm-project/issues/53942
+; CHECK-LABEL: bug53942_bar
+; CHECK-NOT: da analyze - none!
+
+define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias noundef %B) {
+entry:
+  br label %for.cond
+
+for.cond:                                         ; preds = %for.inc, %entry
+  %i.0 = phi i32 [ 1, %entry ], [ %inc, %for.inc ]
+  %cmp = icmp slt i32 %i.0, %n
+  br i1 %cmp, label %for.body, label %for.cond.cleanup
+
+for.cond.cleanup:                                 ; preds = %for.cond
+  br label %for.end
+
+for.body:                                         ; preds = %for.cond
+  %and = and i32 %i.0, 2
+  %tobool.not = icmp eq i32 %and, 0
+  br i1 %tobool.not, label %cond.false, label %cond.true
+
+cond.true:                                        ; preds = %for.body
+  br label %cond.end
+
+cond.false:                                       ; preds = %for.body
+  br label %cond.end
+
+cond.end:                                         ; preds = %cond.false, %cond.true
+  %cond = phi ptr [ %A, %cond.true ], [ %B, %cond.false ]
+  %and1 = and i32 %i.0, 2
+  %tobool2.not = icmp eq i32 %and1, 0
+  br i1 %tobool2.not, label %cond.false4, label %cond.true3
+
+cond.true3:                                       ; preds = %cond.end
+  br label %cond.end5
+
+cond.false4:                                      ; preds = %cond.end
+  br label %cond.end5
+
+cond.end5:                                        ; preds = %cond.false4, %cond.true3
+  %cond6 = phi ptr [ %B, %cond.true3 ], [ %A, %cond.false4 ]
+  %sub = add nsw i32 %i.0, -1
+  %idxprom = sext i32 %sub to i64
+  %arrayidx = getelementptr inbounds double, ptr %cond6, i64 %idxprom
+  %0 = load double, ptr %arrayidx, align 8
+  %idxprom7 = zext nneg i32 %i.0 to i64
+  %arrayidx8 = getelementptr inbounds double, ptr %cond, i64 %idxprom7
+  store double %0, ptr %arrayidx8, align 8
+  br label %for.inc
+
+for.inc:                                          ; preds = %cond.end5
+  %inc = add nuw nsw i32 %i.0, 1
+  br label %for.cond
+
+for.end:                                          ; preds = %for.cond.cleanup
+  ret void
+}

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the overall approach here isn't right. This is basically trying to mitigate a specific case in the BasicAA implementation, but doesn't really provide a guarantee that we can't hit the same underlying issue through a different pathway. I would expect that it's possible to construct a similar problematic case using selects instead of phis.

I think the correct way to address this issue would be to enable the cross-iteration mode of AA, which is also what LAA does. However, it currently doesn't actually catch the examples here, which I'm addressing in #116802.

Though I should also note that doing this alone is not sufficient to make this code correct, because it also makes an additional incorrect assumptions about AATags. In particular, this will handle scoped noalias metadata incorrectly for cases where the metadata is only valid within one loop iteration. It is necessary to inspect noalias.scope intrinsics to disambiguate these cases.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that #116802 has landed, I think you should be able to fix the issue using BatchAAResults::enableCrossIterationMode.

@sebpop
Copy link
Contributor Author

sebpop commented Dec 12, 2024

I see 2 extra errors when using BatchAA in cross iter mode:

  LLVM :: Analysis/DependenceAnalysis/AA.ll
  LLVM :: Analysis/DependenceAnalysis/NonAffineExpr.ll

Both fails are in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1620 where the dyn_cast fails for some reason on a LoadInst.
@nikic could you please have a look? It's in the patch you pointed out above.

Copy link

github-actions bot commented Dec 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

github-actions bot commented Dec 12, 2024

✅ With the latest revision this PR passed the undef deprecator.

@nikic
Copy link
Contributor

nikic commented Dec 12, 2024

@sebpop Just from a quick look, it's probably because IR is modified during BatchAA lifetime and the cache becomes invalid. To start with you can just create the BatchAA directly inside underlyingObjectsAlias (without any batching). It would be nice to use it in a larger scope for compile-time reasons, but it requires some more care.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@@ -0,0 +1,136 @@
; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa 2>&1 \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use the update_analyze_checks support you added recently?

sjoerdmeijer added a commit to sjoerdmeijer/llvm-project that referenced this pull request Jan 29, 2025
This is a work in progress patch to enable loop-interchange by default
and is a continuation of the RFC:

https://discourse.llvm.org/t/enabling-loop-interchange/82589

Basically, we promised to fix any compile-time and correctness issues in
the different components involved here (loop-interchange and dependence
analaysis.) before discussing enabling interchange by default. We think
are close to complete this; I would like to explain where we are and
wanted to check if there are any thoughts or concerns. A quick overview
of the correctness and compile-time improvements that we have made include:

Correctness:
- [LoopInterchange] Remove 'S' Scalar Dependencies (llvm#119345)
- [LoopInterchange] Fix overflow in cost calculation (llvm#111807)
- [LoopInterchange] Handle LE and GE correctly (PR llvm#124901) @kasuga-fj
- [DA] disambiguate evolution of base addresses (llvm#116628)

Compile-times:
- [LoopInterchange] Constrain number of load/stores in a loop (llvm#118973)
- [LoopInterchange] Bail out early if minimum loop nest is not met (llvm#115128)
- [LoopInterchange] Hoist isComputableLoopNest() in the control flow (llvm#124247)

And in terms of remaining work, we think we are very close to fixing
these depenence analysis issues:
- [DA] do not handle array accesses of different offsets (llvm#123436)
- [DA] Dependence analysis does not handle array accesses of different sizes (llvm#116630)
- [DA] use NSW arithmetic llvm#116632

The compile-time increase with a geomean increase of 0.19% looks good
(after committing llvm#124247), I think:

  stage1-O3:
  Benchmark
  kimwitu++        +0.10%
  sqlite3          +0.14%
  consumer-typeset +0.07%
  Bullet           +0.06%
  tramp3d-v4       +0.21%
  mafft            +0.39%
  ClamAVi          +0.06%
  lencod           +0.61%
  SPASS            +0.17%
  7zip             +0.08%
  geomean          +0.19%

See also:
http://llvm-compile-time-tracker.com/compare.php?from=19a7fe03b4f58c4f73ea91d5e63bc4c6e61f987b&to=b24f1367d68ee675ea93ecda4939208c6b68ae4b&stat=instructions%3Au

We might want to look into lencod to see if we can improve more, but not
sure it is strictly necessary.
This patch fixes two bugs:
llvm#41488
llvm#53942

The dependence analysis assumes that the base address of array accesses is
invariant across loop iterations. In both bugs the base address evolves
following loop iterations: the base address flip-flops between two different
memory objects.

The patch uses the cross iteration mode of alias analysis to disambiguate the
base objects.
@sebpop sebpop merged commit 4b2d615 into llvm:main Jan 30, 2025
8 checks passed
@sebpop sebpop deleted the da-1 branch January 30, 2025 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants