Skip to content

Commit 4d57ba1

Browse files
committed
[lsr][term-fold] Restrict expansion budget, particularly for profiled loops
This is a follow up to an item I noted in my submission comment for e947f95. I don't have a real world example where this is triggering unprofitably, but avoiding the transform when we estimate the loop to be short running from profiling seems quite reasonable. A couple of observations: * Because we already checked for affine addrecs, the worst expansion we're going to see is the unknown trip count * unknown step case (both are loop invariant). I picked a fallback cost (2 * SCEV's default expansion budget of 4) which was high enough to allow this. These cases will end up being somewhat rare anyways as finding one where we can prove legality takes some work. * Short running loops with constant trip counts produce either constant expressions (if the step is constant) or multiplications by small constant (if the step is unknown). Both of these are cheap by any reasonable budget and thus not wroth checking for explicitly. * Estimated trip counts come from profile data, loops without profile data will not have one. Overall, this doesn't end up making much of a difference in practical behavior of the transform. Probably still worth landing, but I expect this to be pretty low impact. Possible future work (which I have no plans of doing at this time): * Ppick the lowest cost alternate IV if we have multiple ones under the budget. * Use SCEV to prove a constant upper bound on the trip count, and restrict the transform for unknown-but-small trip counts. * I think we can now remove the affine restriction if we wanted. Would need to think carefully about legality on this one. I don't have examples where any of these would be profitable in practice, but we'll see what comes up in the future.
1 parent 5282202 commit 4d57ba1

File tree

1 file changed

+20
-2
lines changed

1 file changed

+20
-2
lines changed

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp

Lines changed: 20 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6762,7 +6762,7 @@ static llvm::PHINode *GetInductionVariable(const Loop &L, ScalarEvolution &SE,
67626762

67636763
static std::optional<std::tuple<PHINode *, PHINode *, const SCEV *, bool>>
67646764
canFoldTermCondOfLoop(Loop *L, ScalarEvolution &SE, DominatorTree &DT,
6765-
const LoopInfo &LI) {
6765+
const LoopInfo &LI, const TargetTransformInfo &TTI) {
67666766
if (!L->isInnermost()) {
67676767
LLVM_DEBUG(dbgs() << "Cannot fold on non-innermost loop\n");
67686768
return std::nullopt;
@@ -6813,13 +6813,23 @@ canFoldTermCondOfLoop(Loop *L, ScalarEvolution &SE, DominatorTree &DT,
68136813
if (!isAlmostDeadIV(ToFold, LoopLatch, TermCond))
68146814
return std::nullopt;
68156815

6816+
// Inserting instructions in the preheader has a runtime cost, scale
6817+
// the allowed cost with the loops trip count as best we can.
6818+
const unsigned ExpansionBudget = [&]() {
6819+
if (std::optional<unsigned> SmallTC = getLoopEstimatedTripCount(L))
6820+
return std::min(2*SCEVCheapExpansionBudget, *SmallTC);
6821+
// Unknown trip count, assume long running by default.
6822+
return 2*SCEVCheapExpansionBudget;
6823+
}();
6824+
68166825
const SCEV *BECount = SE.getBackedgeTakenCount(L);
68176826
const DataLayout &DL = L->getHeader()->getModule()->getDataLayout();
68186827
SCEVExpander Expander(SE, DL, "lsr_fold_term_cond");
68196828

68206829
PHINode *ToHelpFold = nullptr;
68216830
const SCEV *TermValueS = nullptr;
68226831
bool MustDropPoison = false;
6832+
auto InsertPt = L->getLoopPreheader()->getTerminator();
68236833
for (PHINode &PN : L->getHeader()->phis()) {
68246834
if (ToFold == &PN)
68256835
continue;
@@ -6861,6 +6871,14 @@ canFoldTermCondOfLoop(Loop *L, ScalarEvolution &SE, DominatorTree &DT,
68616871
continue;
68626872
}
68636873

6874+
if (Expander.isHighCostExpansion(TermValueSLocal, L, ExpansionBudget,
6875+
&TTI, InsertPt)) {
6876+
LLVM_DEBUG(
6877+
dbgs() << "Is too expensive to expand terminating value for phi node"
6878+
<< PN << "\n");
6879+
continue;
6880+
}
6881+
68646882
// The candidate IV may have been otherwise dead and poison from the
68656883
// very first iteration. If we can't disprove that, we can't use the IV.
68666884
if (!mustExecuteUBIfPoisonOnPathTo(&PN, LoopLatch->getTerminator(), &DT)) {
@@ -6986,7 +7004,7 @@ static bool ReduceLoopStrength(Loop *L, IVUsers &IU, ScalarEvolution &SE,
69867004
}();
69877005

69887006
if (EnableFormTerm) {
6989-
if (auto Opt = canFoldTermCondOfLoop(L, SE, DT, LI)) {
7007+
if (auto Opt = canFoldTermCondOfLoop(L, SE, DT, LI, TTI)) {
69907008
auto [ToFold, ToHelpFold, TermValueS, MustDrop] = *Opt;
69917009

69927010
Changed = true;

0 commit comments

Comments
 (0)