-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[SandboxVectorizer] Define SeedBundle: a set of instructions to be vectorized [retry] #111073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…vectorized Seed collection will assemble instructions to be vectorized into SeedBundles. This data structure is not intended to be used directly, but will be the basis for load bundles, store bundles, and so on.
@llvm/pr-subscribers-llvm-transforms Author: None (Sterling-Augustine) Changes[Retry 110696 with a proper rebase.] Seed collection will assemble instructions to be vectorized into SeedBundles. This data structure is not intended to be used directly, but will be the basis for load bundles, store bundles, and so on. Full diff: https://github.com/llvm/llvm-project/pull/111073.diff 5 Files Affected:
diff --git a/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h b/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h
new file mode 100644
index 00000000000000..12dd35386018dd
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h
@@ -0,0 +1,133 @@
+//===- SeedCollector.h ------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// This file contains the mechanism for collecting the seed instructions that
+// are used as starting points for forming the vectorization graph.
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_SEEDCOLLECTOR_H
+#define LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_SEEDCOLLECTOR_H
+
+#include "llvm/ADT/BitVector.h"
+#include "llvm/ADT/iterator_range.h"
+#include "llvm/Analysis/ScalarEvolution.h"
+#include "llvm/SandboxIR/Instruction.h"
+#include "llvm/SandboxIR/Utils.h"
+#include "llvm/SandboxIR/Value.h"
+#include <iterator>
+#include <memory>
+
+namespace llvm::sandboxir {
+
+/// A set of candidate Instructions for vectorizing together.
+class SeedBundle {
+public:
+ using SeedList = SmallVector<Instruction *>;
+ /// Initialize a bundle with \p I.
+ explicit SeedBundle(Instruction *I) { insertAt(begin(), I); }
+ explicit SeedBundle(SeedList &&L) : Seeds(std::move(L)) {
+ for (auto &S : Seeds)
+ NumUnusedBits += Utils::getNumBits(S);
+ }
+ /// No need to allow copies.
+ SeedBundle(const SeedBundle &) = delete;
+ SeedBundle &operator=(const SeedBundle &) = delete;
+ virtual ~SeedBundle() {}
+
+ using iterator = SeedList::iterator;
+ using const_iterator = SeedList::const_iterator;
+ iterator begin() { return Seeds.begin(); }
+ iterator end() { return Seeds.end(); }
+ const_iterator begin() const { return Seeds.begin(); }
+ const_iterator end() const { return Seeds.end(); }
+
+ Instruction *operator[](unsigned Idx) const { return Seeds[Idx]; }
+
+ /// Insert \p I into position \p P. Clients should choose Pos
+ /// by symbol, symbol-offset, and program order (which depends if scheduling
+ /// bottom-up or top-down).
+ void insertAt(iterator Pos, Instruction *I) {
+#ifdef EXPENSIVE_CHECKS
+ for (auto Itr : Seeds) {
+ assert(*Itr != I && "Attempt to insert an instruction twice.");
+ }
+#endif
+ Seeds.insert(Pos, I);
+ NumUnusedBits += Utils::getNumBits(I);
+ }
+
+ unsigned getFirstUnusedElementIdx() const {
+ for (unsigned ElmIdx : seq<unsigned>(0, Seeds.size()))
+ if (!isUsed(ElmIdx))
+ return ElmIdx;
+ return Seeds.size();
+ }
+ /// Marks instruction \p I "used" within the bundle. Clients
+ /// use this property when assembling a vectorized instruction from
+ /// the seeds in a bundle. This allows constant time evaluation
+ /// and "removal" from the list.
+ void setUsed(Instruction *I) {
+ auto It = std::find(begin(), end(), I);
+ assert(It != end() && "Instruction not in the bundle!");
+ auto Idx = It - begin();
+ setUsed(Idx, 1, /*VerifyUnused=*/false);
+ }
+
+ void setUsed(unsigned ElementIdx, unsigned Sz = 1, bool VerifyUnused = true) {
+ if (ElementIdx + Sz >= UsedLanes.size())
+ UsedLanes.resize(ElementIdx + Sz);
+ for (unsigned Idx : seq<unsigned>(ElementIdx, ElementIdx + Sz)) {
+ assert((!VerifyUnused || !UsedLanes.test(Idx)) &&
+ "Already marked as used!");
+ UsedLanes.set(Idx);
+ UsedLaneCount++;
+ }
+ NumUnusedBits -= Utils::getNumBits(Seeds[ElementIdx]);
+ }
+ /// \Returns whether or not \p Element has been used.
+ bool isUsed(unsigned Element) const {
+ return Element < UsedLanes.size() && UsedLanes.test(Element);
+ }
+ bool allUsed() const { return UsedLaneCount == Seeds.size(); }
+ unsigned getNumUnusedBits() const { return NumUnusedBits; }
+
+ /// \Returns a slice of seed elements, starting at the element \p StartIdx,
+ /// with a total size <= \p MaxVecRegBits, or an empty slice if the
+ /// requirements cannot be met . If \p ForcePowOf2 is true, then the returned
+ /// slice will have a total number of bits that is a power of 2.
+ MutableArrayRef<SeedList> getSlice(unsigned StartIdx, unsigned MaxVecRegBits,
+ bool ForcePowOf2);
+
+protected:
+ SeedList Seeds;
+ /// The lanes that we have already vectorized.
+ BitVector UsedLanes;
+ /// Tracks used lanes for constant-time accessor.
+ unsigned UsedLaneCount = 0;
+ /// Tracks the remaining bits available to vectorize
+ unsigned NumUnusedBits = 0;
+
+public:
+#ifndef NDEBUG
+ void dump(raw_ostream &OS) const {
+ for (auto [ElmIdx, I] : enumerate(*this)) {
+ OS.indent(2) << ElmIdx << ". ";
+ if (isUsed(ElmIdx))
+ OS << "[USED]";
+ else
+ OS << *I;
+ OS << "\n";
+ }
+ }
+ LLVM_DUMP_METHOD void dump() const {
+ dump(dbgs());
+ dbgs() << "\n";
+ }
+#endif // NDEBUG
+};
+} // namespace llvm::sandboxir
+#endif // LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_SEEDCOLLECTOR_H
diff --git a/llvm/lib/Transforms/Vectorize/CMakeLists.txt b/llvm/lib/Transforms/Vectorize/CMakeLists.txt
index eeff4a9f6a8bae..887c2089c5a520 100644
--- a/llvm/lib/Transforms/Vectorize/CMakeLists.txt
+++ b/llvm/lib/Transforms/Vectorize/CMakeLists.txt
@@ -6,6 +6,7 @@ add_llvm_component_library(LLVMVectorize
SandboxVectorizer/DependencyGraph.cpp
SandboxVectorizer/Passes/BottomUpVec.cpp
SandboxVectorizer/SandboxVectorizer.cpp
+ SandboxVectorizer/SeedCollector.cpp
SLPVectorizer.cpp
Vectorize.cpp
VectorCombine.cpp
diff --git a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SeedCollector.cpp b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SeedCollector.cpp
new file mode 100644
index 00000000000000..56d6dc2fd0038b
--- /dev/null
+++ b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SeedCollector.cpp
@@ -0,0 +1,62 @@
+//===- SeedCollection.cpp -0000000----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/Analysis/LoopAccessAnalysis.h"
+#include "llvm/Analysis/ValueTracking.h"
+#include "llvm/IR/BasicBlock.h"
+#include "llvm/IR/Type.h"
+#include "llvm/SandboxIR/Instruction.h"
+#include "llvm/SandboxIR/Utils.h"
+#include "llvm/Support/Debug.h"
+#include <span>
+
+using namespace llvm;
+namespace llvm::sandboxir {
+
+MutableArrayRef<SeedBundle::SeedList>
+SeedBundle::getSlice(unsigned StartIdx, unsigned MaxVecRegBits,
+ bool ForcePowerOf2) {
+ // Use uint32_t here for compatibility with IsPowerOf2_32
+
+ // BitCount tracks the size of the working slice. From that we can tell
+ // when the working slice's size is a power-of-two and when it exceeds
+ // the legal size in MaxVecBits.
+ uint32_t BitCount = 0;
+ uint32_t NumElements = 0;
+ // Can't start a slice with a used instruction.
+ assert(!isUsed(StartIdx) && "Expected unused at StartIdx");
+ for (auto S : make_range(Seeds.begin() + StartIdx, Seeds.end())) {
+ uint32_t InstBits = Utils::getNumBits(S);
+ // Stop if this instruction is used, or if adding it puts the slice over
+ // the limit.
+ if (isUsed(StartIdx + NumElements) || BitCount + InstBits > MaxVecRegBits)
+ break;
+ NumElements++;
+ BitCount += Utils::getNumBits(S);
+ }
+ // Most slices will already be power-of-two-sized. But this one isn't, remove
+ // instructions until it is. This could be tracked in the loop above but the
+ // logic is harder to follow. TODO: Move if performance is unacceptable.
+ if (ForcePowerOf2) {
+ while (!isPowerOf2_32(BitCount) && NumElements > 1) {
+ BitCount -= Utils::getNumBits(Seeds[StartIdx + NumElements - 1]);
+ NumElements--;
+ }
+ }
+
+ // Return any non-empty slice
+ if (NumElements > 1)
+ return MutableArrayRef<SeedBundle::SeedList>(&Seeds + StartIdx,
+ NumElements);
+ else
+ return {};
+}
+
+} // namespace llvm::sandboxir
diff --git a/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt
index 9f1a3409c0c394..dcd7232db5f60c 100644
--- a/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt
+++ b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt
@@ -11,4 +11,5 @@ add_llvm_unittest(SandboxVectorizerTests
DependencyGraphTest.cpp
IntervalTest.cpp
LegalityTest.cpp
- )
+ SeedCollectorTest.cpp
+)
diff --git a/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SeedCollectorTest.cpp b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SeedCollectorTest.cpp
new file mode 100644
index 00000000000000..9ed7c4017d35ed
--- /dev/null
+++ b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SeedCollectorTest.cpp
@@ -0,0 +1,115 @@
+//===- SeedCollectorTest.cpp ----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h"
+#include "llvm/AsmParser/Parser.h"
+#include "llvm/SandboxIR/Function.h"
+#include "llvm/SandboxIR/Instruction.h"
+#include "llvm/Support/SourceMgr.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+struct SeedBundleTest : public testing::Test {
+ LLVMContext C;
+ std::unique_ptr<Module> M;
+
+ void parseIR(LLVMContext &C, const char *IR) {
+ SMDiagnostic Err;
+ M = parseAssemblyString(IR, Err, C);
+ if (!M)
+ Err.print("LegalityTest", errs());
+ }
+};
+
+TEST_F(SeedBundleTest, SeedBundle) {
+ parseIR(C, R"IR(
+define void @foo(float %v0, i32 %i0, i16 %i1, i8 %i2) {
+bb:
+ %add0 = fadd float %v0, %v0
+ %add1 = fadd float %v0, %v0
+ %add2 = add i8 %i2, %i2
+ %add3 = add i16 %i1, %i1
+ %add4 = add i32 %i0, %i0
+ %add5 = add i16 %i1, %i1
+ %add6 = add i8 %i2, %i2
+ %add7 = add i8 %i2, %i2
+ ret void
+}
+)IR");
+ Function &LLVMF = *M->getFunction("foo");
+ sandboxir::Context Ctx(C);
+ auto &F = *Ctx.createFunction(&LLVMF);
+ DataLayout DL(M->getDataLayout());
+ auto *BB = &*F.begin();
+ auto It = BB->begin();
+ auto *I0 = &*It++;
+ auto *I1 = &*It++;
+ // Assume first two instructions are identical in the number of bits.
+ const unsigned IOBits = sandboxir::Utils::getNumBits(I0, DL);
+ // Constructor
+ sandboxir::SeedBundle SBO(I0);
+ EXPECT_EQ(*SBO.begin(), I0);
+ // getNumUnusedBits after constructor
+ EXPECT_EQ(SBO.getNumUnusedBits(), IOBits);
+ // setUsed
+ SBO.setUsed(I0);
+ // allUsed
+ EXPECT_TRUE(SBO.allUsed());
+ // isUsed
+ EXPECT_TRUE(SBO.isUsed(0));
+ // getNumUnusedBits after setUsed
+ EXPECT_EQ(SBO.getNumUnusedBits(), 0u);
+ // insertAt
+ SBO.insertAt(SBO.end(), I1);
+ EXPECT_NE(*SBO.begin(), I1);
+ // getNumUnusedBits after insertAt
+ EXPECT_EQ(SBO.getNumUnusedBits(), IOBits);
+ // allUsed
+ EXPECT_FALSE(SBO.allUsed());
+ // getFirstUnusedElement
+ EXPECT_EQ(SBO.getFirstUnusedElementIdx(), 1u);
+
+ sandboxir::SeedBundle::SeedList Seeds;
+ // add2 through add7
+ Seeds.push_back(&*It++);
+ Seeds.push_back(&*It++);
+ Seeds.push_back(&*It++);
+ Seeds.push_back(&*It++);
+ Seeds.push_back(&*It++);
+ Seeds.push_back(&*It++);
+ unsigned BundleBits = 0;
+ for (auto &S : Seeds)
+ BundleBits += sandboxir::Utils::getNumBits(S);
+ // Ensure the instructions are as expected.
+ EXPECT_EQ(BundleBits, 88u);
+ // Constructor
+ sandboxir::SeedBundle SB1(std::move(Seeds));
+ // getNumUnusedBits after constructor
+ EXPECT_EQ(SB1.getNumUnusedBits(), BundleBits);
+ // setUsed with index
+ SB1.setUsed(1);
+ // getFirstUnusedElementIdx
+ EXPECT_EQ(SB1.getFirstUnusedElementIdx(), 0u);
+ SB1.setUsed(unsigned(0));
+ // getFirstUnusedElementIdx not at end
+ EXPECT_EQ(SB1.getFirstUnusedElementIdx(), 2u);
+ // getSlice
+ auto Slice0 = SB1.getSlice(2, /* MaxVecRegBits */ 64,
+ /* ForcePowerOf2 */ true);
+ EXPECT_EQ(Slice0.size(), 4u);
+ SB1.setUsed(2);
+ auto Slice1 = SB1.getSlice(3, /* MaxVecRegBits */ 64,
+ /* ForcePowerOf2 */ false);
+ EXPECT_EQ(Slice1.size(), 3u);
+ // getSlice empty case
+ SB1.setUsed(3);
+ auto Slice2 = SB1.getSlice(4, /* MaxVecRegBits */ 8,
+ /* ForcePowerOf2 */ true);
+ EXPECT_EQ(Slice2.size(), 0u);
+}
|
The only comment not directly addressed is about the need for the typedef: "The SeedList type is used for the internal vector but it's also used in the constructor, which I find a bit confusing. If we make this private, would there be a real need to have a separate typedef for the type used in the constructor ? I don't think we would really need one." My response: The type is also returned as a MutableArrayReference by getSlice. So the question here would be three-fold: How do we expect clients to iterate over slices? I'm not particularly attached, but once I find myself using the same type in multiple places, I like a name for it, personally. |
I rewrote getSlice from scratch, so might be worth another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the typedef, the user of getSlice()
expects some kind of array/vector/iterable object. The type itself shouldn't matter, but even if it does we should provide a typedef for
MutableArrayRef<SeedList>
, not the nested SeedList
.
// instructions until it is. This could be tracked in the loop above but the | ||
// logic is harder to follow. TODO: Move if performance is unacceptable. | ||
if (ForcePowerOf2) { | ||
while (!isPowerOf2_32(BitCount) && NumElements > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could avoid iterating until we get a power of 2 with the help of a function that gives you the floor power-of-2 value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewrote this to do it in the above loop. Seems pretty clean now.
// getSlice | ||
auto Slice0 = SB1.getSlice(2, /* MaxVecRegBits */ 64, | ||
/* ForcePowerOf2 */ true); | ||
EXPECT_EQ(Slice0.size(), 4u); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we check the contents of the Slice? Something like:
EXPECT_THAT(Slice0, testing::ElementsAre(I2, I3, I4, I5))`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Removed the typedef. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, this version of getSlice()
is so much better!
I think we may need couple more checks to exercise getSlice()
for power-of-2 a bit more (see comments), but other than that it looks good.
EXPECT_EQ(SB1.getFirstUnusedElementIdx(), 2u); | ||
// getSlice | ||
auto Slice0 = SB1.getSlice(2, /* MaxVecRegBits */ 64, | ||
/* ForcePowerOf2 */ true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't getSlice(2, 64, false)
return Insts[2], Insts[3], Insts[4] ,Insts[5]
too?
Perhaps we also need a test for getSlice(2, 72, true)
and getSlice(2, 80, true)
to make sure that they both return Insts[2], Insts[3], Insts[4] ,Insts[5]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it will. Added the additional tests.
} | ||
} | ||
if (ForcePowerOf2) { | ||
NumElements = NumElementsPowerOfTwo; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't this potentially make the slice contain a "used" instruction that we didn't check for? e.g. 0-2 are unused but 3 is used, force power of 2 will make the slice contain 3 which is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The early exit on line 42 guarantees that there will never be a used instruction between StartIdx and NumElements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see, we only set NumElementsPowerOfTwo
if the seed is unused
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/7198 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/4700 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/177/builds/6257 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/7773 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/9673 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/140/builds/8269 Here is the relevant piece of the build log for the reference
|
The error I'm seeing on the OpenMP SLES builder is this
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/174/builds/6426 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/6717 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/4745 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/76/builds/3472 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/4177 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/72/builds/4098 Here is the relevant piece of the build log for the reference
|
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/187/builds/1718 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/134/builds/6581 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/6744 Here is the relevant piece of the build log for the reference
|
[Retry 110696 with a proper rebase.]
Seed collection will assemble instructions to be vectorized into SeedBundles. This data structure is not intended to be used directly, but will be the basis for load bundles, store bundles, and so on.