Skip to content

[LoopVectorize] Use new getUniqueLatchExitBlock routine #108231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 6, 2024

Conversation

david-arm
Copy link
Contributor

@david-arm david-arm commented Sep 11, 2024

With PR #88385 I am introducing support for vectorising more loops with early exits that don't require a scalar epilogue. As such, if a loop doesn't have a unique exit block it will not automatically imply we require a scalar epilogue. Also, in all places in the code today where we use the variable LoopExitBlock we actually mean the exit block from the latch. Therefore, it seemed reasonable to add a new getUniqueLatchExitBlock that allows the caller to determine the exit block taken from the latch and use this instead of getUniqueExitBlock. I also renamed LoopExitBlock to be LatchExitBlock. I feel this not only better reflects how the variable is used today, but also prepares the code for PR #88385.

While doing this I also noticed that one of the comments in requiresScalarEpilogue is wrong when we require a scalar epilogue, i.e. when we're not exiting from the latch block. This doesn't always imply we have multiple exits, e.g. see the test in

Transforms/LoopVectorize/unroll_nonlatch.ll

where the latch unconditionally branches back to the only exiting block.

@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-support

Author: David Sherwood (david-arm)

Changes

With PR #88385 I am introducing support for vectorising more loops with early exits that don't require a scalar epilogue. As such, if a loop doesn't have a unique exit block it will not automatically imply we require a scalar epilogue. Also, in all places in the code today where we use the variable LoopExitBlock we actually mean the exit block from the latch. Therefore, it seemed reasonable to add a new getUniqueLatchExitBlock that allows the caller to determine the exit block taken from the latch and use this instead of getUniqueExitBlock. I also renamed LoopExitBlock to be LatchExitBlock. I feel this not only better reflects how the variable is used today, but also prepares the code for PR #88385.

While doing this I also noticed that one of the comments in requiresScalarEpilogue is wrong when we require a scalar epilogue, i.e. when we're not exiting from the latch block. This doesn't always imply we have multiple exits, e.g. see the test in

Transforms/LoopVectorize/unroll_nonlatch.ll

where the latch unconditionally branches back to the only exiting block.


Full diff: https://github.com/llvm/llvm-project/pull/108231.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Support/GenericLoopInfo.h (+4)
  • (modified) llvm/include/llvm/Support/GenericLoopInfoImpl.h (+10)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+13-13)
diff --git a/llvm/include/llvm/Support/GenericLoopInfo.h b/llvm/include/llvm/Support/GenericLoopInfo.h
index d560ca648132c9..0fa13e2a3d0e15 100644
--- a/llvm/include/llvm/Support/GenericLoopInfo.h
+++ b/llvm/include/llvm/Support/GenericLoopInfo.h
@@ -294,6 +294,10 @@ template <class BlockT, class LoopT> class LoopBase {
   /// Otherwise return null.
   BlockT *getUniqueExitBlock() const;
 
+  /// Return the unique exit block for the latch, or null if there are multiple
+  /// different exit blocks.
+  BlockT *getUniqueLatchExitBlock() const;
+
   /// Return true if this loop does not have any exit blocks.
   bool hasNoExitBlocks() const;
 
diff --git a/llvm/include/llvm/Support/GenericLoopInfoImpl.h b/llvm/include/llvm/Support/GenericLoopInfoImpl.h
index d19022729ace32..4945ea30950d23 100644
--- a/llvm/include/llvm/Support/GenericLoopInfoImpl.h
+++ b/llvm/include/llvm/Support/GenericLoopInfoImpl.h
@@ -159,6 +159,16 @@ BlockT *LoopBase<BlockT, LoopT>::getUniqueExitBlock() const {
   return getExitBlockHelper(this, true).first;
 }
 
+template <class BlockT, class LoopT>
+BlockT *LoopBase<BlockT, LoopT>::getUniqueLatchExitBlock() const {
+  const BlockT *Latch = getLoopLatch();
+  assert(Latch && "Latch block must exists");
+  SmallVector<BlockT *, 4> ExitBlocks;
+  getUniqueExitBlocksHelper(this, ExitBlocks,
+                            [Latch](const BlockT *BB) { return BB == Latch; });
+  return ExitBlocks.size() == 1 ? ExitBlocks[0] : nullptr;
+}
+
 /// getExitEdges - Return all pairs of (_inside_block_,_outside_block_).
 template <class BlockT, class LoopT>
 void LoopBase<BlockT, LoopT>::getExitEdges(
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index a8722db654f5c9..04a350f1f8d9cb 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -652,9 +652,8 @@ class InnerLoopVectorizer {
   /// Middle Block between the vector and the scalar.
   BasicBlock *LoopMiddleBlock;
 
-  /// The unique ExitBlock of the scalar loop if one exists.  Note that
-  /// there can be multiple exiting edges reaching this block.
-  BasicBlock *LoopExitBlock;
+  /// The exit block from the loop latch, if one exists.
+  BasicBlock *LatchExitBlock;
 
   /// The scalar loop body.
   BasicBlock *LoopScalarBody;
@@ -1390,6 +1389,8 @@ class LoopVectorizationCostModel {
 
   /// Returns true if we're required to use a scalar epilogue for at least
   /// the final iteration of the original loop.
+  /// TODO: It would be good to cache the result and avoid recalculating,
+  /// which will also avoid reprinting the same debug message many times.
   bool requiresScalarEpilogue(bool IsVectorizing) const {
     if (!isScalarEpilogueAllowed()) {
       LLVM_DEBUG(dbgs() << "LV: Loop does not require scalar epilogue\n");
@@ -1398,8 +1399,8 @@ class LoopVectorizationCostModel {
     // If we might exit from anywhere but the latch, must run the exiting
     // iteration in scalar form.
     if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
-      LLVM_DEBUG(
-          dbgs() << "LV: Loop requires scalar epilogue: multiple exits\n");
+      LLVM_DEBUG(dbgs() << "LV: Loop requires scalar epilogue: not exiting "
+                           "from latch block\n");
       return true;
     }
     if (IsVectorizing && InterleaveInfo.requiresScalarEpilogue()) {
@@ -2043,8 +2044,7 @@ class GeneratedRTChecks {
   /// adjusts the branches to branch to the vector preheader or \p Bypass,
   /// depending on the generated condition.
   BasicBlock *emitSCEVChecks(BasicBlock *Bypass,
-                             BasicBlock *LoopVectorPreHeader,
-                             BasicBlock *LoopExitBlock) {
+                             BasicBlock *LoopVectorPreHeader) {
     if (!SCEVCheckCond)
       return nullptr;
 
@@ -2516,7 +2516,7 @@ void InnerLoopVectorizer::emitIterationCountCheck(BasicBlock *Bypass) {
 
 BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) {
   BasicBlock *const SCEVCheckBlock =
-      RTChecks.emitSCEVChecks(Bypass, LoopVectorPreHeader, LoopExitBlock);
+      RTChecks.emitSCEVChecks(Bypass, LoopVectorPreHeader);
   if (!SCEVCheckBlock)
     return nullptr;
 
@@ -2533,7 +2533,7 @@ BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) {
       // If there is an epilogue which must run, there's no edge from the
       // middle block to exit blocks  and thus no need to update the immediate
       // dominator of the exit blocks.
-      DT->changeImmediateDominator(LoopExitBlock, SCEVCheckBlock);
+      DT->changeImmediateDominator(LatchExitBlock, SCEVCheckBlock);
   }
 
   LoopBypassBlocks.push_back(SCEVCheckBlock);
@@ -2581,9 +2581,9 @@ void InnerLoopVectorizer::createVectorLoopSkeleton(StringRef Prefix) {
   LoopScalarBody = OrigLoop->getHeader();
   LoopVectorPreHeader = OrigLoop->getLoopPreheader();
   assert(LoopVectorPreHeader && "Invalid loop structure");
-  LoopExitBlock = OrigLoop->getUniqueExitBlock(); // may be nullptr
-  assert((LoopExitBlock || Cost->requiresScalarEpilogue(VF.isVector())) &&
-         "multiple exit loop without required epilogue?");
+  LatchExitBlock = OrigLoop->getUniqueLatchExitBlock();
+  assert((LatchExitBlock || Cost->requiresScalarEpilogue(VF.isVector())) &&
+         "Expected loop without exit from latch to require an epilogue");
 
   LoopMiddleBlock =
       SplitBlock(LoopVectorPreHeader, LoopVectorPreHeader->getTerminator(), DT,
@@ -7788,7 +7788,7 @@ EpilogueVectorizerEpilogueLoop::createEpilogueVectorizedLoopSkeleton(
     // If there is an epilogue which must run, there's no edge from the
     // middle block to exit blocks  and thus no need to update the immediate
     // dominator of the exit blocks.
-    DT->changeImmediateDominator(LoopExitBlock,
+    DT->changeImmediateDominator(LatchExitBlock,
                                  EPI.EpilogueIterationCountCheck);
 
   // Keep track of bypass blocks, as they feed start values to the induction and

Copy link
Contributor

@Dinistro Dinistro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable, but I'm not too familiar with LoopVectorize, so someone else should accept this.

@david-arm
Copy link
Contributor Author

Rebase

With PR llvm#88385 I am introducing support for vectorising more
loops with early exits that don't require a scalar epilogue.
As such, if a loop doesn't have a unique exit block it will
not automatically imply we require a scalar epilogue. Also,
in the only place in the code today where we use the variable
LoopExitBlock we actually mean the exit block from the latch.
Therefore, it seemed reasonable to add a new
getUniqueLatchExitBlock helper that allows the caller to
determine the exit block taken from the latch and use this
instead of getUniqueExitBlock. I also removed LoopExitBlock
since it was only used in one place.

While doing this I also noticed that one of the comments in
requiresScalarEpilogue is wrong when we require a scalar
epilogue, i.e. when we're not exiting from the latch block.
This doesn't always imply we have multiple exits, e.g. see
the test in

Transforms/LoopVectorize/unroll_nonlatch.ll

where the latch unconditionally branches back to the only
exiting block.
@david-arm
Copy link
Contributor Author

Ping

BlockT *LoopBase<BlockT, LoopT>::getUniqueLatchExitBlock() const {
const BlockT *Latch = getLoopLatch();
assert(Latch && "Latch block must exists");
SmallVector<BlockT *, 4> ExitBlocks;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SmallVector<BlockT *, 4> ExitBlocks;
auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {
assert(!AllowRepeats && "Unexpected parameter value.");
return !contains(BB) ? BB : nullptr;
};
return find_singleton<BlockT>(children<BlockT *>(Latch), IsExitBlock);

Requires changing Latch to be a non-const pointer (since there isn't a version of children with const), but if we already have the Latch block there's no need to search for it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

* Use find_singleton in getUniqueLatchExitBlock.
Copy link
Collaborator

@huntergr-arm huntergr-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, looks like this helps to nicely avoid code changes in follow-up changes.

@@ -294,6 +294,10 @@ template <class BlockT, class LoopT> class LoopBase {
/// Otherwise return null.
BlockT *getUniqueExitBlock() const;

/// Return the unique exit block for the latch, or null if there are multiple
/// different exit blocks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or the latch is not exiting?

BlockT *LoopBase<BlockT, LoopT>::getUniqueLatchExitBlock() const {
BlockT *Latch = getLoopLatch();
assert(Latch && "Latch block must exists");
auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: sufficient to capture this?

Suggested change
auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {
auto IsExitBlock = [this](BlockT *BB, bool AllowRepeats) -> BlockT * {

@david-arm
Copy link
Contributor Author

Looks like the build is failing due to an issue with git checkout. I ran "make check-all" downstream and it seems to pass.

@david-arm david-arm merged commit d77a36e into llvm:main Nov 6, 2024
5 of 7 checks passed
@david-arm david-arm deleted the loop_latch_exit branch January 28, 2025 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants