[LoopVectorize] Use new getUniqueLatchExitBlock routine #108231

david-arm · 2024-09-11T14:09:15Z

With PR #88385 I am introducing support for vectorising more loops with early exits that don't require a scalar epilogue. As such, if a loop doesn't have a unique exit block it will not automatically imply we require a scalar epilogue. Also, in all places in the code today where we use the variable LoopExitBlock we actually mean the exit block from the latch. Therefore, it seemed reasonable to add a new getUniqueLatchExitBlock that allows the caller to determine the exit block taken from the latch and use this instead of getUniqueExitBlock. I also renamed LoopExitBlock to be LatchExitBlock. I feel this not only better reflects how the variable is used today, but also prepares the code for PR #88385.

While doing this I also noticed that one of the comments in requiresScalarEpilogue is wrong when we require a scalar epilogue, i.e. when we're not exiting from the latch block. This doesn't always imply we have multiple exits, e.g. see the test in

Transforms/LoopVectorize/unroll_nonlatch.ll

where the latch unconditionally branches back to the only exiting block.

llvmbot · 2024-09-11T14:09:49Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-support

Author: David Sherwood (david-arm)

Changes

With PR #88385 I am introducing support for vectorising more loops with early exits that don't require a scalar epilogue. As such, if a loop doesn't have a unique exit block it will not automatically imply we require a scalar epilogue. Also, in all places in the code today where we use the variable LoopExitBlock we actually mean the exit block from the latch. Therefore, it seemed reasonable to add a new getUniqueLatchExitBlock that allows the caller to determine the exit block taken from the latch and use this instead of getUniqueExitBlock. I also renamed LoopExitBlock to be LatchExitBlock. I feel this not only better reflects how the variable is used today, but also prepares the code for PR #88385.

While doing this I also noticed that one of the comments in requiresScalarEpilogue is wrong when we require a scalar epilogue, i.e. when we're not exiting from the latch block. This doesn't always imply we have multiple exits, e.g. see the test in

Transforms/LoopVectorize/unroll_nonlatch.ll

where the latch unconditionally branches back to the only exiting block.

Full diff: https://github.com/llvm/llvm-project/pull/108231.diff

3 Files Affected:

(modified) llvm/include/llvm/Support/GenericLoopInfo.h (+4)
(modified) llvm/include/llvm/Support/GenericLoopInfoImpl.h (+10)
(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+13-13)

diff --git a/llvm/include/llvm/Support/GenericLoopInfo.h b/llvm/include/llvm/Support/GenericLoopInfo.h
index d560ca648132c9..0fa13e2a3d0e15 100644
--- a/llvm/include/llvm/Support/GenericLoopInfo.h
+++ b/llvm/include/llvm/Support/GenericLoopInfo.h
@@ -294,6 +294,10 @@ template <class BlockT, class LoopT> class LoopBase {
   /// Otherwise return null.
   BlockT *getUniqueExitBlock() const;
 
+  /// Return the unique exit block for the latch, or null if there are multiple
+  /// different exit blocks.
+  BlockT *getUniqueLatchExitBlock() const;
+
   /// Return true if this loop does not have any exit blocks.
   bool hasNoExitBlocks() const;
 
diff --git a/llvm/include/llvm/Support/GenericLoopInfoImpl.h b/llvm/include/llvm/Support/GenericLoopInfoImpl.h
index d19022729ace32..4945ea30950d23 100644
--- a/llvm/include/llvm/Support/GenericLoopInfoImpl.h
+++ b/llvm/include/llvm/Support/GenericLoopInfoImpl.h
@@ -159,6 +159,16 @@ BlockT *LoopBase<BlockT, LoopT>::getUniqueExitBlock() const {
   return getExitBlockHelper(this, true).first;
 }
 
+template <class BlockT, class LoopT>
+BlockT *LoopBase<BlockT, LoopT>::getUniqueLatchExitBlock() const {
+  const BlockT *Latch = getLoopLatch();
+  assert(Latch && "Latch block must exists");
+  SmallVector<BlockT *, 4> ExitBlocks;
+  getUniqueExitBlocksHelper(this, ExitBlocks,
+                            [Latch](const BlockT *BB) { return BB == Latch; });
+  return ExitBlocks.size() == 1 ? ExitBlocks[0] : nullptr;
+}
+
 /// getExitEdges - Return all pairs of (_inside_block_,_outside_block_).
 template <class BlockT, class LoopT>
 void LoopBase<BlockT, LoopT>::getExitEdges(
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index a8722db654f5c9..04a350f1f8d9cb 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -652,9 +652,8 @@ class InnerLoopVectorizer {
   /// Middle Block between the vector and the scalar.
   BasicBlock *LoopMiddleBlock;
 
-  /// The unique ExitBlock of the scalar loop if one exists.  Note that
-  /// there can be multiple exiting edges reaching this block.
-  BasicBlock *LoopExitBlock;
+  /// The exit block from the loop latch, if one exists.
+  BasicBlock *LatchExitBlock;
 
   /// The scalar loop body.
   BasicBlock *LoopScalarBody;
@@ -1390,6 +1389,8 @@ class LoopVectorizationCostModel {
 
   /// Returns true if we're required to use a scalar epilogue for at least
   /// the final iteration of the original loop.
+  /// TODO: It would be good to cache the result and avoid recalculating,
+  /// which will also avoid reprinting the same debug message many times.
   bool requiresScalarEpilogue(bool IsVectorizing) const {
     if (!isScalarEpilogueAllowed()) {
       LLVM_DEBUG(dbgs() << "LV: Loop does not require scalar epilogue\n");
@@ -1398,8 +1399,8 @@ class LoopVectorizationCostModel {
     // If we might exit from anywhere but the latch, must run the exiting
     // iteration in scalar form.
     if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
-      LLVM_DEBUG(
-          dbgs() << "LV: Loop requires scalar epilogue: multiple exits\n");
+      LLVM_DEBUG(dbgs() << "LV: Loop requires scalar epilogue: not exiting "
+                           "from latch block\n");
       return true;
     }
     if (IsVectorizing && InterleaveInfo.requiresScalarEpilogue()) {
@@ -2043,8 +2044,7 @@ class GeneratedRTChecks {
   /// adjusts the branches to branch to the vector preheader or \p Bypass,
   /// depending on the generated condition.
   BasicBlock *emitSCEVChecks(BasicBlock *Bypass,
-                             BasicBlock *LoopVectorPreHeader,
-                             BasicBlock *LoopExitBlock) {
+                             BasicBlock *LoopVectorPreHeader) {
     if (!SCEVCheckCond)
       return nullptr;
 
@@ -2516,7 +2516,7 @@ void InnerLoopVectorizer::emitIterationCountCheck(BasicBlock *Bypass) {
 
 BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) {
   BasicBlock *const SCEVCheckBlock =
-      RTChecks.emitSCEVChecks(Bypass, LoopVectorPreHeader, LoopExitBlock);
+      RTChecks.emitSCEVChecks(Bypass, LoopVectorPreHeader);
   if (!SCEVCheckBlock)
     return nullptr;
 
@@ -2533,7 +2533,7 @@ BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) {
       // If there is an epilogue which must run, there's no edge from the
       // middle block to exit blocks  and thus no need to update the immediate
       // dominator of the exit blocks.
-      DT->changeImmediateDominator(LoopExitBlock, SCEVCheckBlock);
+      DT->changeImmediateDominator(LatchExitBlock, SCEVCheckBlock);
   }
 
   LoopBypassBlocks.push_back(SCEVCheckBlock);
@@ -2581,9 +2581,9 @@ void InnerLoopVectorizer::createVectorLoopSkeleton(StringRef Prefix) {
   LoopScalarBody = OrigLoop->getHeader();
   LoopVectorPreHeader = OrigLoop->getLoopPreheader();
   assert(LoopVectorPreHeader && "Invalid loop structure");
-  LoopExitBlock = OrigLoop->getUniqueExitBlock(); // may be nullptr
-  assert((LoopExitBlock || Cost->requiresScalarEpilogue(VF.isVector())) &&
-         "multiple exit loop without required epilogue?");
+  LatchExitBlock = OrigLoop->getUniqueLatchExitBlock();
+  assert((LatchExitBlock || Cost->requiresScalarEpilogue(VF.isVector())) &&
+         "Expected loop without exit from latch to require an epilogue");
 
   LoopMiddleBlock =
       SplitBlock(LoopVectorPreHeader, LoopVectorPreHeader->getTerminator(), DT,
@@ -7788,7 +7788,7 @@ EpilogueVectorizerEpilogueLoop::createEpilogueVectorizedLoopSkeleton(
     // If there is an epilogue which must run, there's no edge from the
     // middle block to exit blocks  and thus no need to update the immediate
     // dominator of the exit blocks.
-    DT->changeImmediateDominator(LoopExitBlock,
+    DT->changeImmediateDominator(LatchExitBlock,
                                  EPI.EpilogueIterationCountCheck);
 
   // Keep track of bypass blocks, as they feed start values to the induction and

Dinistro

Looks reasonable, but I'm not too familiar with LoopVectorize, so someone else should accept this.

david-arm · 2024-10-07T13:25:59Z

Rebase

With PR llvm#88385 I am introducing support for vectorising more loops with early exits that don't require a scalar epilogue. As such, if a loop doesn't have a unique exit block it will not automatically imply we require a scalar epilogue. Also, in the only place in the code today where we use the variable LoopExitBlock we actually mean the exit block from the latch. Therefore, it seemed reasonable to add a new getUniqueLatchExitBlock helper that allows the caller to determine the exit block taken from the latch and use this instead of getUniqueExitBlock. I also removed LoopExitBlock since it was only used in one place. While doing this I also noticed that one of the comments in requiresScalarEpilogue is wrong when we require a scalar epilogue, i.e. when we're not exiting from the latch block. This doesn't always imply we have multiple exits, e.g. see the test in Transforms/LoopVectorize/unroll_nonlatch.ll where the latch unconditionally branches back to the only exiting block.

david-arm · 2024-11-04T11:56:55Z

Ping

huntergr-arm · 2024-11-04T13:41:02Z

llvm/include/llvm/Support/GenericLoopInfoImpl.h

+BlockT *LoopBase<BlockT, LoopT>::getUniqueLatchExitBlock() const {
+  const BlockT *Latch = getLoopLatch();
+  assert(Latch && "Latch block must exists");
+  SmallVector<BlockT *, 4> ExitBlocks;


Suggested change

SmallVector<BlockT *, 4> ExitBlocks;

auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {

assert(!AllowRepeats && "Unexpected parameter value.");

return !contains(BB) ? BB : nullptr;

};

return find_singleton<BlockT>(children<BlockT *>(Latch), IsExitBlock);

Requires changing Latch to be a non-const pointer (since there isn't a version of children with const), but if we already have the Latch block there's no need to search for it again.

* Use find_singleton in getUniqueLatchExitBlock.

huntergr-arm

LGTM

fhahn

LGTM, looks like this helps to nicely avoid code changes in follow-up changes.

fhahn · 2024-11-05T21:40:05Z

llvm/include/llvm/Support/GenericLoopInfo.h

@@ -294,6 +294,10 @@ template <class BlockT, class LoopT> class LoopBase {
  /// Otherwise return null.
  BlockT *getUniqueExitBlock() const;

+  /// Return the unique exit block for the latch, or null if there are multiple
+  /// different exit blocks.


or the latch is not exiting?

fhahn · 2024-11-05T21:42:15Z

llvm/include/llvm/Support/GenericLoopInfoImpl.h

+BlockT *LoopBase<BlockT, LoopT>::getUniqueLatchExitBlock() const {
+  BlockT *Latch = getLoopLatch();
+  assert(Latch && "Latch block must exists");
+  auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {


nit: sufficient to capture this?

Suggested change

auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {

auto IsExitBlock = [this](BlockT *BB, bool AllowRepeats) -> BlockT * {

david-arm · 2024-11-06T10:27:54Z

Looks like the build is failing due to an issue with git checkout. I ran "make check-all" downstream and it seems to pass.

david-arm requested review from fhahn, Dinistro, igogo-x86 and huntergr-arm September 11, 2024 14:09

llvmbot added vectorizers llvm:support llvm:transforms labels Sep 11, 2024

Dinistro reviewed Sep 12, 2024

View reviewed changes

david-arm mentioned this pull request Sep 20, 2024

[LoopVectorize] Add support for vectorisation of more early exit loops #88385

Closed

david-arm force-pushed the loop_latch_exit branch from 36c2311 to b3db33b Compare October 7, 2024 13:25

david-arm mentioned this pull request Oct 22, 2024

[VPlan] Dispatch to multiple exit blocks via middle blocks. #112138

Merged

david-arm force-pushed the loop_latch_exit branch from b3db33b to 6962059 Compare November 4, 2024 11:56

huntergr-arm reviewed Nov 4, 2024

View reviewed changes

Address review comment

35e3a47

* Use find_singleton in getUniqueLatchExitBlock.

huntergr-arm approved these changes Nov 5, 2024

View reviewed changes

fhahn approved these changes Nov 5, 2024

View reviewed changes

Address review comments

63434e1

david-arm merged commit d77a36e into llvm:main Nov 6, 2024
5 of 7 checks passed

david-arm deleted the loop_latch_exit branch January 28, 2025 11:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LoopVectorize] Use new getUniqueLatchExitBlock routine #108231

[LoopVectorize] Use new getUniqueLatchExitBlock routine #108231

Uh oh!

david-arm commented Sep 11, 2024 •

edited

Loading

Uh oh!

llvmbot commented Sep 11, 2024 •

edited

Loading

Uh oh!

Dinistro left a comment

Uh oh!

david-arm commented Oct 7, 2024

Uh oh!

david-arm commented Nov 4, 2024

Uh oh!

huntergr-arm Nov 4, 2024

Uh oh!

david-arm Nov 5, 2024

Uh oh!

huntergr-arm left a comment

Uh oh!

fhahn left a comment

Uh oh!

fhahn Nov 5, 2024

Uh oh!

fhahn Nov 5, 2024

Uh oh!

david-arm commented Nov 6, 2024

Uh oh!

Uh oh!

Uh oh!

-  SmallVector<BlockT *, 4> ExitBlocks;
+  auto IsExitBlock = [&](BlockT *BB, bool AllowRepeats) -> BlockT * {
+    assert(!AllowRepeats && "Unexpected parameter value.");
+    return !contains(BB) ? BB : nullptr;
+  };
+  return find_singleton<BlockT>(children<BlockT *>(Latch), IsExitBlock);

	auto IsExitBlock = [&](BlockT BB, bool AllowRepeats) -> BlockT {
	auto IsExitBlock = [this](BlockT BB, bool AllowRepeats) -> BlockT {

[LoopVectorize] Use new getUniqueLatchExitBlock routine #108231

[LoopVectorize] Use new getUniqueLatchExitBlock routine #108231

Uh oh!

Conversation

david-arm commented Sep 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Dinistro left a comment

Choose a reason for hiding this comment

Uh oh!

david-arm commented Oct 7, 2024

Uh oh!

david-arm commented Nov 4, 2024

Uh oh!

huntergr-arm Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

david-arm Nov 5, 2024

Choose a reason for hiding this comment

Uh oh!

huntergr-arm left a comment

Choose a reason for hiding this comment

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

fhahn Nov 5, 2024

Choose a reason for hiding this comment

Uh oh!

fhahn Nov 5, 2024

Choose a reason for hiding this comment

Uh oh!

david-arm commented Nov 6, 2024

Uh oh!

Uh oh!

Uh oh!

david-arm commented Sep 11, 2024 •

edited

Loading

llvmbot commented Sep 11, 2024 •

edited

Loading