Skip to content
This repository was archived by the owner on Feb 5, 2019. It is now read-only.

Commit 1cbf1da

Browse files
committed
[PM/Unswitch] Fix a nasty bug in the new PM's unswitch introduced in
r335553 with the non-trivial unswitching of switches. The code correctly updated most aspects of the CFG and analyses, but missed some crucial aspects: 1) When multiple cases have the same successor, we unswitch that a single time and replace the switch with a direct branch. The CFG here is correct, but the target of this direct branch may have had a PHI node with multiple entries in it. 2) When we still have to clone a successor of the switch into an unswitched copy of the loop, we'll delete potentially multiple edges entering this successor, not just one. 3) We also have to delete multiple edges entering the successors in the original loop when they have to be retained. 4) When the "retained successor" *also* occurs as a case successor, we just assert failed everywhere. This doesn't happen very easily because its always valid to simply drop the case -- the retained successor for switches is always the default successor. However, it is likely possible through some contrivance of different loop passes, unrolling, and simplifying for this to occur in practice and certainly there is nothing "invalid" about the IR so this pass needs to handle it. 5) In the case of #4, we also will replace these multiple edges with a direct branch much like in #1 and need to collapse the entries in any PHI nodes to a single enrty. All of this stems from the delightful fact that the same successor can show up in multiple parts of the switch terminator, and each of these are considered a distinct edge for the purpose of PHI nodes (and iterating the successors and predecessors) but not for unswitching itself, the dominator tree, or many other things. For the record, I intensely dislike this "feature" of the IR in large part because of the complexity it causes in passes like this. We already have a ton of logic building sets and handling duplicates, and we just had to add a bunch more. I've added a complex test case that covers all five of the above failure modes. I've also added a variation on it where #4 and #5 occur in loop exit, adding fun where we have an LCSSA PHI node with "multiple entries" despite have dedicated exits. There were no additional issues found by this, but it seems a useful corner case to cover with testing. One thing that working on all of this code has made painfully clear for me as well is how amazingly inefficient our PHI node representation is (in terms of the in-memory data structures and the APIs used to update them). This code has truly marvelous complexity bounds because every time we remove an entry from a PHI node we do a linear scan to find it and then a linear update to the data structure to remove it. We could in theory batch all of the PHI node updates into a single linear walk of the operands making this much more efficient, but the APIs fight hard against this and the fact that we have to handle duplicates in the peculiar manner we do (removing all but one in some cases) makes even implementing that very tedious and annoying. Anyways, none of this is new here or specific to loop unswitching. All code in LLVM that updates PHI node operands suffers from these problems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336536 91177308-0d34-0410-b5e6-96231b3b80d8
1 parent 5070d36 commit 1cbf1da

File tree

2 files changed

+499
-92
lines changed

2 files changed

+499
-92
lines changed

lib/Transforms/Scalar/SimpleLoopUnswitch.cpp

+81-26
Original file line numberDiff line numberDiff line change
@@ -949,19 +949,6 @@ static BasicBlock *buildClonedLoopBlocks(
949949
AC.registerAssumption(II);
950950
}
951951

952-
// Remove the cloned parent as a predecessor of the cloned continue successor
953-
// if we did in fact clone it.
954-
auto *ClonedParentBB = cast<BasicBlock>(VMap.lookup(ParentBB));
955-
if (auto *ClonedContinueSuccBB =
956-
cast_or_null<BasicBlock>(VMap.lookup(ContinueSuccBB)))
957-
ClonedContinueSuccBB->removePredecessor(ClonedParentBB,
958-
/*DontDeleteUselessPHIs*/ true);
959-
// Replace the cloned branch with an unconditional branch to the cloned
960-
// unswitched successor.
961-
auto *ClonedSuccBB = cast<BasicBlock>(VMap.lookup(UnswitchedSuccBB));
962-
ClonedParentBB->getTerminator()->eraseFromParent();
963-
BranchInst::Create(ClonedSuccBB, ClonedParentBB);
964-
965952
// Update any PHI nodes in the cloned successors of the skipped blocks to not
966953
// have spurious incoming values.
967954
for (auto *LoopBB : L.blocks())
@@ -971,6 +958,45 @@ static BasicBlock *buildClonedLoopBlocks(
971958
for (PHINode &PN : ClonedSuccBB->phis())
972959
PN.removeIncomingValue(LoopBB, /*DeletePHIIfEmpty*/ false);
973960

961+
// Remove the cloned parent as a predecessor of any successor we ended up
962+
// cloning other than the unswitched one.
963+
auto *ClonedParentBB = cast<BasicBlock>(VMap.lookup(ParentBB));
964+
for (auto *SuccBB : successors(ParentBB)) {
965+
if (SuccBB == UnswitchedSuccBB)
966+
continue;
967+
968+
auto *ClonedSuccBB = cast_or_null<BasicBlock>(VMap.lookup(SuccBB));
969+
if (!ClonedSuccBB)
970+
continue;
971+
972+
ClonedSuccBB->removePredecessor(ClonedParentBB,
973+
/*DontDeleteUselessPHIs*/ true);
974+
}
975+
976+
// Replace the cloned branch with an unconditional branch to the cloned
977+
// unswitched successor.
978+
auto *ClonedSuccBB = cast<BasicBlock>(VMap.lookup(UnswitchedSuccBB));
979+
ClonedParentBB->getTerminator()->eraseFromParent();
980+
BranchInst::Create(ClonedSuccBB, ClonedParentBB);
981+
982+
// If there are duplicate entries in the PHI nodes because of multiple edges
983+
// to the unswitched successor, we need to nuke all but one as we replaced it
984+
// with a direct branch.
985+
for (PHINode &PN : ClonedSuccBB->phis()) {
986+
bool Found = false;
987+
// Loop over the incoming operands backwards so we can easily delete as we
988+
// go without invalidating the index.
989+
for (int i = PN.getNumOperands() - 1; i >= 0; --i) {
990+
if (PN.getIncomingBlock(i) != ClonedParentBB)
991+
continue;
992+
if (!Found) {
993+
Found = true;
994+
continue;
995+
}
996+
PN.removeIncomingValue(i, /*DeletePHIIfEmpty*/ false);
997+
}
998+
}
999+
9741000
// Record the domtree updates for the new blocks.
9751001
SmallPtrSet<BasicBlock *, 4> SuccSet;
9761002
for (auto *ClonedBB : NewBlocks) {
@@ -1779,7 +1805,8 @@ static bool unswitchNontrivialInvariants(
17791805
UnswitchedSuccBBs.insert(BI->getSuccessor(ClonedSucc));
17801806
else
17811807
for (auto Case : SI->cases())
1782-
UnswitchedSuccBBs.insert(Case.getCaseSuccessor());
1808+
if (Case.getCaseSuccessor() != RetainedSuccBB)
1809+
UnswitchedSuccBBs.insert(Case.getCaseSuccessor());
17831810

17841811
assert(!UnswitchedSuccBBs.count(RetainedSuccBB) &&
17851812
"Should not unswitch the same successor we are retaining!");
@@ -1873,19 +1900,44 @@ static bool unswitchNontrivialInvariants(
18731900
// nuke the initial terminator placed in the split block.
18741901
SplitBB->getTerminator()->eraseFromParent();
18751902
if (FullUnswitch) {
1876-
for (BasicBlock *SuccBB : UnswitchedSuccBBs) {
1903+
// First we need to unhook the successor relationship as we'll be replacing
1904+
// the terminator with a direct branch. This is much simpler for branches
1905+
// than switches so we handle those first.
1906+
if (BI) {
18771907
// Remove the parent as a predecessor of the unswitched successor.
1878-
SuccBB->removePredecessor(ParentBB,
1879-
/*DontDeleteUselessPHIs*/ true);
1880-
DTUpdates.push_back({DominatorTree::Delete, ParentBB, SuccBB});
1908+
assert(UnswitchedSuccBBs.size() == 1 &&
1909+
"Only one possible unswitched block for a branch!");
1910+
BasicBlock *UnswitchedSuccBB = *UnswitchedSuccBBs.begin();
1911+
UnswitchedSuccBB->removePredecessor(ParentBB,
1912+
/*DontDeleteUselessPHIs*/ true);
1913+
DTUpdates.push_back({DominatorTree::Delete, ParentBB, UnswitchedSuccBB});
1914+
} else {
1915+
// Note that we actually want to remove the parent block as a predecessor
1916+
// of *every* case successor. The case successor is either unswitched,
1917+
// completely eliminating an edge from the parent to that successor, or it
1918+
// is a duplicate edge to the retained successor as the retained successor
1919+
// is always the default successor and as we'll replace this with a direct
1920+
// branch we no longer need the duplicate entries in the PHI nodes.
1921+
assert(SI->getDefaultDest() == RetainedSuccBB &&
1922+
"Not retaining default successor!");
1923+
for (auto &Case : SI->cases())
1924+
Case.getCaseSuccessor()->removePredecessor(
1925+
ParentBB,
1926+
/*DontDeleteUselessPHIs*/ true);
1927+
1928+
// We need to use the set to populate domtree updates as even when there
1929+
// are multiple cases pointing at the same successor we only want to
1930+
// remove and insert one edge in the domtree.
1931+
for (BasicBlock *SuccBB : UnswitchedSuccBBs)
1932+
DTUpdates.push_back({DominatorTree::Delete, ParentBB, SuccBB});
18811933
}
18821934

1883-
// Now splice the terminator from the original loop and rewrite its
1884-
// successors.
1935+
// Now that we've unhooked the successor relationship, splice the terminator
1936+
// from the original loop to the split.
18851937
SplitBB->getInstList().splice(SplitBB->end(), ParentBB->getInstList(), TI);
1938+
1939+
// Now wire up the terminator to the preheaders.
18861940
if (BI) {
1887-
assert(UnswitchedSuccBBs.size() == 1 &&
1888-
"Only one possible unswitched block for a branch!");
18891941
BasicBlock *ClonedPH = ClonedPHs.begin()->second;
18901942
BI->setSuccessor(ClonedSucc, ClonedPH);
18911943
BI->setSuccessor(1 - ClonedSucc, LoopPH);
@@ -1894,16 +1946,19 @@ static bool unswitchNontrivialInvariants(
18941946
assert(SI && "Must either be a branch or switch!");
18951947

18961948
// Walk the cases and directly update their successors.
1949+
SI->setDefaultDest(LoopPH);
18971950
for (auto &Case : SI->cases())
1898-
Case.setSuccessor(ClonedPHs.find(Case.getCaseSuccessor())->second);
1951+
if (Case.getCaseSuccessor() == RetainedSuccBB)
1952+
Case.setSuccessor(LoopPH);
1953+
else
1954+
Case.setSuccessor(ClonedPHs.find(Case.getCaseSuccessor())->second);
1955+
18991956
// We need to use the set to populate domtree updates as even when there
19001957
// are multiple cases pointing at the same successor we only want to
1901-
// insert one edge in the domtree.
1958+
// remove and insert one edge in the domtree.
19021959
for (BasicBlock *SuccBB : UnswitchedSuccBBs)
19031960
DTUpdates.push_back(
19041961
{DominatorTree::Insert, SplitBB, ClonedPHs.find(SuccBB)->second});
1905-
1906-
SI->setDefaultDest(LoopPH);
19071962
}
19081963

19091964
// Create a new unconditional branch to the continuing block (as opposed to

0 commit comments

Comments
 (0)