Commit 245e607

[LoopSink] Exit loop finding BBs to sink into early when possible (NFC) (#101115)
As noted in the comments, findBBsToSinkInto is O(UseBBs.size() * ColdLoopBBs.size()). A very large function with a huge loop was incurring a high compile time in this code. The size of the ColdLoopBBs set was over 14K. There is a limit on the size of the UseBBs set, but not the ColdLoopBBs (and adding a limit for the latter actually slowed down some later passes).

This change exits the loop early once we detect that there is no further refinement possible for the BBsToSinkInto set. This is possible because the ColdLoopBBs set is sorted in ascending magnitude of frequency.

This cut down the LoopSinkPass time by around 33% (78s to just over 50s).
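
To illustrate the early exit described above, here is a minimal, self-contained sketch of the search loop. It is not the LLVM implementation: blocks are modeled as integer indices into a hypothetical `Freq` vector, `Dominates` is a caller-supplied stand-in for the dominator tree query, and a plain sum (`sumFreq`) stands in for the pass's `adjustedSumFreq`. The `EarlyExit` flag and the `ColdBlocksVisited` counter exist only so the two behaviors can be compared.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <vector>

// Hypothetical model: a block is an index into Freq, and Dominates(A, B)
// answers whether block A dominates block B. Neither is an LLVM API.
using DominatesFn = std::function<bool(int, int)>;

struct SinkResult {
  std::set<int> Blocks;  // final set of blocks to sink into
  int ColdBlocksVisited; // number of cold blocks the loop examined
};

// Simplification: a plain sum stands in for the pass's adjustedSumFreq.
template <typename Range>
std::uint64_t sumFreq(const Range &Blocks,
                      const std::vector<std::uint64_t> &Freq) {
  std::uint64_t Total = 0;
  for (int B : Blocks)
    Total += Freq[static_cast<std::size_t>(B)];
  return Total;
}

SinkResult findBlocksToSinkInto(const std::vector<std::uint64_t> &Freq,
                                const std::set<int> &UseBBs,
                                std::vector<int> ColdBBs,
                                const DominatesFn &Dominates, bool EarlyExit) {
  for (int B : ColdBBs)
    assert(B >= 0 && static_cast<std::size_t>(B) < Freq.size());

  // The early exit relies on visiting cold blocks in ascending frequency.
  std::stable_sort(ColdBBs.begin(), ColdBBs.end(), [&](int A, int B) {
    return Freq[static_cast<std::size_t>(A)] <
           Freq[static_cast<std::size_t>(B)];
  });

  SinkResult R{UseBBs, 0};
  for (int Cold : ColdBBs) {
    ++R.ColdBlocksVisited;
    const std::uint64_t ColdFreq = Freq[static_cast<std::size_t>(Cold)];

    // Collect the current sink targets that this cold block dominates.
    std::vector<int> Dominated;
    for (int B : R.Blocks)
      if (Dominates(Cold, B))
        Dominated.push_back(B);
    if (Dominated.empty())
      continue;

    // Replace the dominated targets with the single colder block.
    if (sumFreq(Dominated, Freq) > ColdFreq) {
      for (int B : Dominated)
        R.Blocks.erase(B);
      R.Blocks.insert(Cold);
      continue;
    }

    // No later (hotter) cold block can beat any subset of the current set.
    if (EarlyExit && sumFreq(R.Blocks, Freq) <= ColdFreq)
      break;
  }
  return R;
}
```

In this sketch, running the function with `EarlyExit` set to `true` and to `false` on the same inputs should yield the same `Blocks` set, which mirrors the NFC claim in the commit title; only `ColdBlocksVisited` differs.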
1 parent 9843843 commit 245e607

llvm/lib/Transforms/Scalar/LoopSink.cpp

Lines changed: 16 additions & 0 deletions
@@ -144,7 +144,23 @@ findBBsToSinkInto(const Loop &L, const SmallPtrSetImpl<BasicBlock *> &UseBBs,
         BBsToSinkInto.erase(DominatedBB);
       }
       BBsToSinkInto.insert(ColdestBB);
+      continue;
     }
+    // Otherwise, see if we can stop the search through the cold BBs early.
+    // Since the ColdLoopBBs list is sorted in increasing magnitude of
+    // frequency the cold BB frequencies can only get larger. The
+    // BBsToSinkInto set can only get smaller and have a smaller
+    // adjustedSumFreq, due to the earlier checking. So once we find a cold BB
+    // with a frequency at least as large as the adjustedSumFreq of the
+    // current BBsToSinkInto set, the earlier frequency check can never be
+    // true for a future iteration. Note we could do check this more
+    // aggressively earlier, but in practice this ended up being more
+    // expensive overall (added checking to the critical path through the loop
+    // that often ended up continuing early due to an empty
+    // BBsDominatedByColdestBB set, and the frequency check there was false
+    // most of the time anyway).
+    if (adjustedSumFreq(BBsToSinkInto, BFI) <= BFI.getBlockFreq(ColdestBB))
+      break;
   }
 
   // Can't sink into blocks that have no valid insertion point.
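
The reasoning in the added comment can be spelled out as a short argument. This is a sketch rather than a proof taken from the commit: the symbols $f$, $A$, $S$, $D$, and $c_k$ are introduced here for illustration, and it assumes `adjustedSumFreq` is monotone under subset inclusion (a property of a helper that this diff does not show).

```latex
% f(b): BFI.getBlockFreq(b);  A(X): adjustedSumFreq(X, BFI)
% S: the current BBsToSinkInto set
% c_1, ..., c_n: ColdLoopBBs in sorted order, so f(c_1) <= ... <= f(c_n)
% Assumption: D \subseteq S implies A(D) <= A(S).
\begin{align*}
&\text{Suppose that at iteration } j \text{ the new check holds: } A(S) \le f(c_j). \\
&\text{For any } k > j \text{ and any dominated subset } D \subseteq S: \\
&\qquad A(D) \;\le\; A(S) \;\le\; f(c_j) \;\le\; f(c_k). \\
&\text{Hence the replacement test } A(D) > f(c_k) \text{ is false, so } S \text{ is unchanged at iteration } k, \\
&\text{and by induction the same holds for every remaining cold block.}
\end{align*}
```

Under that assumption, breaking out of the loop at iteration $j$ leaves `BBsToSinkInto` exactly as the full scan would, which is consistent with the change being marked NFC.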
