[InlineCost] Cache collectEphemeralValues() to save compile time #130210

Merged
merged 2 commits into from
Mar 20, 2025
60 changes: 60 additions & 0 deletions llvm/include/llvm/Analysis/EphemeralValuesCache.h
@@ -0,0 +1,60 @@
//===- llvm/Analysis/EphemeralValuesCache.h ---------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This pass caches ephemeral values, i.e., values that are only used by
// @llvm.assume intrinsics, for cheap access after the initial collection.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_ANALYSIS_EPHEMERALVALUESCACHE_H
#define LLVM_ANALYSIS_EPHEMERALVALUESCACHE_H

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/PassManager.h"

namespace llvm {

class Function;
class AssumptionCache;
class Value;

/// A cache of ephemeral values within a function.
class EphemeralValuesCache {
  SmallPtrSet<const Value *, 32> EphValues;
  Function &F;
  AssumptionCache &AC;
  bool Collected = false;

  void collectEphemeralValues();

public:
  EphemeralValuesCache(Function &F, AssumptionCache &AC) : F(F), AC(AC) {}
  void clear() {
    EphValues.clear();
    Collected = false;
  }
  const SmallPtrSetImpl<const Value *> &ephValues() {
Contributor:

Why do we need an extra layer of indirection? Can't we just eagerly calculate the ephemeral values in run()?

Contributor Author:

Do you mean removing the indirection by incorporating the functionality of the EphemeralValuesCache inside the EphemeralValuesAnalysis class? Isn't it better to follow the two-class design where you implement the analysis in one class and have a separate wrapper class for the pass, in case someone needs to use the analysis without a pass?

As for the eager vs. lazy approach, my guess is that lazy is preferable because it may save you some compile time if you don't end up calling the analysis, but I don't feel strongly about it.

Contributor:

CodeMetrics::collectEphemeralValues() is the API if someone wants to calculate ephemeral values; I don't think we need another layer on top of that. I think EphemeralValuesAnalysis::Result could just be SmallPtrSet<const Value *, 32>.

The pass should only call FAM.getAnalysis if it's going to use it, so the laziness should be done at the caller level.

    if (!Collected)
      collectEphemeralValues();
    return EphValues;
  }
};
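
The lazy-collection pattern used by EphemeralValuesCache above can be sketched without any LLVM dependencies. This is a minimal illustration, not the patch's code: ToyFunction, ToyEphemeralCache, numCollections(), and the rule that "ephemeral" means "negative" are hypothetical stand-ins chosen so the example compiles on its own. It shows that the set is collected on first access only, and that a cached set stays stale after the underlying function changes until clear() is called.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a function body: just a list of integers.
struct ToyFunction {
  std::vector<int> Values;
};

// Mirrors the shape of EphemeralValuesCache: collect on first access, then
// reuse the stored set until clear() is called.
class ToyEphemeralCache {
  std::set<int> EphValues;
  const ToyFunction &F;
  bool Collected = false;
  unsigned NumCollections = 0; // Instrumentation for this sketch only.

  // Stand-in for the expensive step; "ephemeral" here means "negative".
  void collect() {
    ++NumCollections;
    for (int V : F.Values)
      if (V < 0)
        EphValues.insert(V);
    Collected = true;
  }

public:
  explicit ToyEphemeralCache(const ToyFunction &F) : F(F) {}

  void clear() {
    EphValues.clear();
    Collected = false;
  }

  const std::set<int> &ephValues() {
    if (!Collected)
      collect();
    return EphValues;
  }

  unsigned numCollections() const { return NumCollections; }
};

inline void lazyCacheExample() {
  ToyFunction F{{1, -2, 3, -4}};
  ToyEphemeralCache Cache(F);
  assert(Cache.numCollections() == 0); // Nothing is computed before first use.
  Cache.ephValues();
  Cache.ephValues();
  assert(Cache.numCollections() == 1); // The second access reused the set.

  F.Values.push_back(-5);                // Mutate the underlying "function".
  assert(Cache.ephValues().size() == 2); // Stale until explicitly cleared.
  Cache.clear();
  assert(Cache.ephValues().size() == 3); // Recollected after clear().
}
```

Calling lazyCacheExample() from a main() exercises both properties; the stale-read step is the situation the clear() call in Inliner.cpp is meant to avoid.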

class EphemeralValuesAnalysis
    : public AnalysisInfoMixin<EphemeralValuesAnalysis> {
  friend AnalysisInfoMixin<EphemeralValuesAnalysis>;
  static AnalysisKey Key;

public:
  using Result = EphemeralValuesCache;
  Result run(Function &F, FunctionAnalysisManager &FAM);
};
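
The alternative raised in the review thread above (make the analysis result the set itself, computed eagerly in run(), and rely on the caller only requesting it when needed) can be sketched in a toy setting. ToyAnalysisManager is a hypothetical, heavily simplified stand-in that does not model LLVM's FunctionAnalysisManager API; ToyFunction, EphSet, and runToyEphemeralAnalysis are likewise illustrative names, and "ephemeral" again just means "negative".

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a function body.
struct ToyFunction {
  std::vector<int> Values;
};

using EphSet = std::set<int>;

// Eager analysis: the result is the set itself, with no wrapper class.
inline EphSet runToyEphemeralAnalysis(const ToyFunction &F) {
  EphSet Result;
  for (int V : F.Values)
    if (V < 0)
      Result.insert(V);
  return Result;
}

// Simplified manager: runs the analysis the first time a result is requested
// for a function and keeps it until that function is invalidated.
class ToyAnalysisManager {
  std::map<const ToyFunction *, EphSet> Results;
  unsigned NumRuns = 0; // Instrumentation for this sketch only.

public:
  const EphSet &getResult(const ToyFunction &F) {
    auto It = Results.find(&F);
    if (It == Results.end()) {
      ++NumRuns;
      It = Results.emplace(&F, runToyEphemeralAnalysis(F)).first;
    }
    return It->second;
  }

  void invalidate(const ToyFunction &F) { Results.erase(&F); }

  unsigned numRuns() const { return NumRuns; }
};

inline void callerLevelLazinessExample() {
  ToyFunction F{{-1, 2, -3}};
  ToyAnalysisManager AM;
  assert(AM.numRuns() == 0); // No request yet, so no work was done.
  assert(AM.getResult(F).size() == 2);
  AM.getResult(F);
  assert(AM.numRuns() == 1); // The second request was served from the map.
  AM.invalidate(F);
  AM.getResult(F);
  assert(AM.numRuns() == 2); // Invalidation forces a fresh run.
}
```

In this shape the laziness lives in the manager and its callers rather than in a Collected flag, which is the trade-off the two reviewers are discussing.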

} // namespace llvm

#endif // LLVM_ANALYSIS_EPHEMERALVALUESCACHE_H
34 changes: 18 additions & 16 deletions llvm/include/llvm/Analysis/InlineCost.h
@@ -31,6 +31,7 @@ class Function;
 class ProfileSummaryInfo;
 class TargetTransformInfo;
 class TargetLibraryInfo;
+class EphemeralValuesCache;

 namespace InlineConstants {
 // Various thresholds used by inline cost analysis.
@@ -273,28 +274,29 @@ int getCallsiteCost(const TargetTransformInfo &TTI, const CallBase &Call,
 ///
 /// Also note that calling this function *dynamically* computes the cost of
 /// inlining the callsite. It is an expensive, heavyweight call.
-InlineCost
-getInlineCost(CallBase &Call, const InlineParams &Params,
-              TargetTransformInfo &CalleeTTI,
-              function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
-              function_ref<const TargetLibraryInfo &(Function &)> GetTLI,
-              function_ref<BlockFrequencyInfo &(Function &)> GetBFI = nullptr,
-              ProfileSummaryInfo *PSI = nullptr,
-              OptimizationRemarkEmitter *ORE = nullptr);
+InlineCost getInlineCost(
+    CallBase &Call, const InlineParams &Params, TargetTransformInfo &CalleeTTI,
+    function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
+    function_ref<const TargetLibraryInfo &(Function &)> GetTLI,
+    function_ref<BlockFrequencyInfo &(Function &)> GetBFI = nullptr,
+    ProfileSummaryInfo *PSI = nullptr, OptimizationRemarkEmitter *ORE = nullptr,
+    function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache =
+        nullptr);

 /// Get an InlineCost with the callee explicitly specified.
 /// This allows you to calculate the cost of inlining a function via a
 /// pointer. This behaves exactly as the version with no explicit callee
 /// parameter in all other respects.
 //
-InlineCost
-getInlineCost(CallBase &Call, Function *Callee, const InlineParams &Params,
-              TargetTransformInfo &CalleeTTI,
-              function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
-              function_ref<const TargetLibraryInfo &(Function &)> GetTLI,
-              function_ref<BlockFrequencyInfo &(Function &)> GetBFI = nullptr,
-              ProfileSummaryInfo *PSI = nullptr,
-              OptimizationRemarkEmitter *ORE = nullptr);
+InlineCost getInlineCost(
+    CallBase &Call, Function *Callee, const InlineParams &Params,
+    TargetTransformInfo &CalleeTTI,
+    function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
+    function_ref<const TargetLibraryInfo &(Function &)> GetTLI,
+    function_ref<BlockFrequencyInfo &(Function &)> GetBFI = nullptr,
+    ProfileSummaryInfo *PSI = nullptr, OptimizationRemarkEmitter *ORE = nullptr,
+    function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache =
+        nullptr);

 /// Returns InlineResult::success() if the call site should be always inlined
 /// because of user directives, and the inlining is viable. Returns
1 change: 1 addition & 0 deletions llvm/lib/Analysis/CMakeLists.txt
@@ -62,6 +62,7 @@ add_llvm_component_library(LLVMAnalysis
   DominanceFrontier.cpp
   DXILResource.cpp
   DXILMetadataAnalysis.cpp
+  EphemeralValuesCache.cpp
   FunctionPropertiesAnalysis.cpp
   GlobalsModRef.cpp
   GuardUtils.cpp
28 changes: 28 additions & 0 deletions llvm/lib/Analysis/EphemeralValuesCache.cpp
@@ -0,0 +1,28 @@
//===- EphemeralValuesCache.cpp - Cache collecting ephemeral values -------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "llvm/Analysis/EphemeralValuesCache.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/CodeMetrics.h"

namespace llvm {

void EphemeralValuesCache::collectEphemeralValues() {
  CodeMetrics::collectEphemeralValues(&F, &AC, EphValues);
  Collected = true;
}

AnalysisKey EphemeralValuesAnalysis::Key;

EphemeralValuesCache
EphemeralValuesAnalysis::run(Function &F, FunctionAnalysisManager &FAM) {
  auto &AC = FAM.getResult<AssumptionAnalysis>(F);
  return EphemeralValuesCache(F, AC);
}

} // namespace llvm
8 changes: 7 additions & 1 deletion llvm/lib/Analysis/InlineAdvisor.cpp
@@ -15,6 +15,7 @@
 #include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/EphemeralValuesCache.h"
 #include "llvm/Analysis/InlineCost.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
 #include "llvm/Analysis/ProfileSummaryInfo.h"
@@ -150,6 +151,10 @@ std::optional<llvm::InlineCost> static getDefaultInlineAdvice(
   auto GetTLI = [&](Function &F) -> const TargetLibraryInfo & {
     return FAM.getResult<TargetLibraryAnalysis>(F);
   };
+  auto GetEphValuesCache =
+      [&](Function &F) -> EphemeralValuesAnalysis::Result & {
+    return FAM.getResult<EphemeralValuesAnalysis>(F);
+  };

   Function &Callee = *CB.getCalledFunction();
   auto &CalleeTTI = FAM.getResult<TargetIRAnalysis>(Callee);
@@ -158,7 +163,8 @@ std::optional<llvm::InlineCost> static getDefaultInlineAdvice(
         Callee.getContext().getDiagHandlerPtr()->isMissedOptRemarkEnabled(
             DEBUG_TYPE);
     return getInlineCost(CB, Params, CalleeTTI, GetAssumptionCache, GetTLI,
-                         GetBFI, PSI, RemarksEnabled ? &ORE : nullptr);
+                         GetBFI, PSI, RemarksEnabled ? &ORE : nullptr,
+                         GetEphValuesCache);
   };
   return llvm::shouldInline(
       CB, CalleeTTI, GetInlineCost, ORE,
49 changes: 33 additions & 16 deletions llvm/lib/Analysis/InlineCost.cpp
@@ -20,6 +20,7 @@
 #include "llvm/Analysis/BlockFrequencyInfo.h"
 #include "llvm/Analysis/CodeMetrics.h"
 #include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/Analysis/EphemeralValuesCache.h"
 #include "llvm/Analysis/InstructionSimplify.h"
 #include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/MemoryBuiltins.h"
@@ -269,6 +270,9 @@ class CallAnalyzer : public InstVisitor<CallAnalyzer, bool> {
   /// easily cacheable. Instead, use the cover function paramHasAttr.
   CallBase &CandidateCall;

+  /// Getter for the cache of ephemeral values.
+  function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache = nullptr;
+
   /// Extension points for handling callsite features.
   // Called before a basic block was analyzed.
   virtual void onBlockStart(const BasicBlock *BB) {}
@@ -462,7 +466,7 @@ class CallAnalyzer : public InstVisitor<CallAnalyzer, bool> {

   // Custom analysis routines.
   InlineResult analyzeBlock(BasicBlock *BB,
-                            SmallPtrSetImpl<const Value *> &EphValues);
+                            const SmallPtrSetImpl<const Value *> &EphValues);

   // Disable several entry points to the visitor so we don't accidentally use
   // them by declaring but not defining them here.
@@ -510,10 +514,12 @@ class CallAnalyzer : public InstVisitor<CallAnalyzer, bool> {
       function_ref<BlockFrequencyInfo &(Function &)> GetBFI = nullptr,
       function_ref<const TargetLibraryInfo &(Function &)> GetTLI = nullptr,
       ProfileSummaryInfo *PSI = nullptr,
-      OptimizationRemarkEmitter *ORE = nullptr)
+      OptimizationRemarkEmitter *ORE = nullptr,
+      function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache =
+          nullptr)
       : TTI(TTI), GetAssumptionCache(GetAssumptionCache), GetBFI(GetBFI),
         GetTLI(GetTLI), PSI(PSI), F(Callee), DL(F.getDataLayout()), ORE(ORE),
-        CandidateCall(Call) {}
+        CandidateCall(Call), GetEphValuesCache(GetEphValuesCache) {}

   InlineResult analyze();

@@ -1126,9 +1132,11 @@ class InlineCostCallAnalyzer final : public CallAnalyzer {
       function_ref<const TargetLibraryInfo &(Function &)> GetTLI = nullptr,
       ProfileSummaryInfo *PSI = nullptr,
       OptimizationRemarkEmitter *ORE = nullptr, bool BoostIndirect = true,
-      bool IgnoreThreshold = false)
+      bool IgnoreThreshold = false,
+      function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache =
+          nullptr)
       : CallAnalyzer(Callee, Call, TTI, GetAssumptionCache, GetBFI, GetTLI, PSI,
-                     ORE),
+                     ORE, GetEphValuesCache),
         ComputeFullInlineCost(OptComputeFullInlineCost ||
                               Params.ComputeFullInlineCost || ORE ||
                               isCostBenefitAnalysisEnabled()),
@@ -2566,7 +2574,7 @@ bool CallAnalyzer::visitInstruction(Instruction &I) {
 /// viable, and true if inlining remains viable.
 InlineResult
 CallAnalyzer::analyzeBlock(BasicBlock *BB,
-                           SmallPtrSetImpl<const Value *> &EphValues) {
+                           const SmallPtrSetImpl<const Value *> &EphValues) {
   for (Instruction &I : *BB) {
     // FIXME: Currently, the number of instructions in a function regardless of
     // our ability to simplify them during inline to constants or dead code,
@@ -2781,11 +2789,15 @@ InlineResult CallAnalyzer::analyze() {
   NumConstantOffsetPtrArgs = ConstantOffsetPtrs.size();
   NumAllocaArgs = SROAArgValues.size();

-  // FIXME: If a caller has multiple calls to a callee, we end up recomputing
-  // the ephemeral values multiple times (and they're completely determined by
-  // the callee, so this is purely duplicate work).
-  SmallPtrSet<const Value *, 32> EphValues;
-  CodeMetrics::collectEphemeralValues(&F, &GetAssumptionCache(F), EphValues);
+  // Collecting the ephemeral values of `F` can be expensive, so use the
+  // ephemeral values cache if available.
+  SmallPtrSet<const Value *, 32> EphValuesStorage;
+  const SmallPtrSetImpl<const Value *> *EphValues = &EphValuesStorage;
+  if (GetEphValuesCache)
+    EphValues = &GetEphValuesCache(F).ephValues();
+  else
+    CodeMetrics::collectEphemeralValues(&F, &GetAssumptionCache(F),
+                                        EphValuesStorage);

   // The worklist of live basic blocks in the callee *after* inlining. We avoid
   // adding basic blocks of the callee which can be proven to be dead for this
@@ -2824,7 +2836,7 @@ InlineResult CallAnalyzer::analyze() {

     // Analyze the cost of this block. If we blow through the threshold, this
     // returns false, and we can bail on out.
-    InlineResult IR = analyzeBlock(BB, EphValues);
+    InlineResult IR = analyzeBlock(BB, *EphValues);
     if (!IR.isSuccess())
       return IR;
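
The fallback in CallAnalyzer::analyze() above reads through a pointer that targets either the cached set or a locally collected one, so the rest of the function does not care which path was taken. A minimal standalone sketch of that pattern follows. It assumes std::function in place of llvm::function_ref, and ValueSet, CacheGetter, collectInto, and countEphemerals are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>

using ValueSet = std::set<int>;

// Hypothetical getter type. An empty std::function plays the role of the
// `nullptr` default that the patch uses for GetEphValuesCache.
using CacheGetter = std::function<const ValueSet &()>;

// Stand-in for the uncached collection path; the contents are arbitrary.
inline void collectInto(ValueSet &Out) { Out = {7, 9}; }

// Reads the values from the cache when a getter is supplied; otherwise
// collects them into local storage. Either way, the code after the branch
// only sees a pointer to a read-only set.
inline std::size_t countEphemerals(const CacheGetter &GetCache) {
  ValueSet Storage;                     // Filled only on the fallback path.
  const ValueSet *EphValues = &Storage; // Points at whichever set is in use.
  if (GetCache)
    EphValues = &GetCache();
  else
    collectInto(Storage);
  return EphValues->size();
}

inline void optionalCacheExample() {
  ValueSet Cached = {1, 2, 3};
  CacheGetter Getter = [&Cached]() -> const ValueSet & { return Cached; };
  assert(countEphemerals(Getter) == 3);        // Cached path.
  assert(countEphemerals(CacheGetter()) == 2); // Fallback path.
}
```

Keeping the storage local and switching only the pointer means callers that pass no getter behave as they did before the change, which matches the nullptr defaults added to the getInlineCost() overloads.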

@@ -2967,9 +2979,11 @@ InlineCost llvm::getInlineCost(
     function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
     function_ref<const TargetLibraryInfo &(Function &)> GetTLI,
     function_ref<BlockFrequencyInfo &(Function &)> GetBFI,
-    ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE) {
+    ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE,
+    function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache) {
   return getInlineCost(Call, Call.getCalledFunction(), Params, CalleeTTI,
-                       GetAssumptionCache, GetTLI, GetBFI, PSI, ORE);
+                       GetAssumptionCache, GetTLI, GetBFI, PSI, ORE,
+                       GetEphValuesCache);
 }

 std::optional<int> llvm::getInliningCostEstimate(
@@ -3089,7 +3103,8 @@ InlineCost llvm::getInlineCost(
     function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
     function_ref<const TargetLibraryInfo &(Function &)> GetTLI,
     function_ref<BlockFrequencyInfo &(Function &)> GetBFI,
-    ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE) {
+    ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE,
+    function_ref<EphemeralValuesCache &(Function &)> GetEphValuesCache) {

   auto UserDecision =
       llvm::getAttributeBasedInliningDecision(Call, Callee, CalleeTTI, GetTLI);
@@ -3105,7 +3120,9 @@ InlineCost llvm::getInlineCost(
                           << ")\n");

   InlineCostCallAnalyzer CA(*Callee, Call, Params, CalleeTTI,
-                            GetAssumptionCache, GetBFI, GetTLI, PSI, ORE);
+                            GetAssumptionCache, GetBFI, GetTLI, PSI, ORE,
+                            /*BoostIndirect=*/true, /*IgnoreThreshold=*/false,
+                            GetEphValuesCache);
   InlineResult ShouldInline = CA.analyze();

   LLVM_DEBUG(CA.dump());
1 change: 1 addition & 0 deletions llvm/lib/Passes/PassBuilder.cpp
@@ -39,6 +39,7 @@
 #include "llvm/Analysis/DependenceAnalysis.h"
 #include "llvm/Analysis/DomPrinter.h"
 #include "llvm/Analysis/DominanceFrontier.h"
+#include "llvm/Analysis/EphemeralValuesCache.h"
 #include "llvm/Analysis/FunctionPropertiesAnalysis.h"
 #include "llvm/Analysis/GlobalsModRef.h"
 #include "llvm/Analysis/IRSimilarityIdentifier.h"
1 change: 1 addition & 0 deletions llvm/lib/Passes/PassRegistry.def
@@ -290,6 +290,7 @@ FUNCTION_ANALYSIS("debug-ata", DebugAssignmentTrackingAnalysis())
 FUNCTION_ANALYSIS("demanded-bits", DemandedBitsAnalysis())
 FUNCTION_ANALYSIS("domfrontier", DominanceFrontierAnalysis())
 FUNCTION_ANALYSIS("domtree", DominatorTreeAnalysis())
+FUNCTION_ANALYSIS("ephemerals", EphemeralValuesAnalysis())
 FUNCTION_ANALYSIS("func-properties", FunctionPropertiesAnalysis())
 FUNCTION_ANALYSIS("machine-function-info", MachineFunctionAnalysis(TM))
 FUNCTION_ANALYSIS("gc-function", GCFunctionAnalysis())
4 changes: 4 additions & 0 deletions llvm/lib/Transforms/IPO/Inliner.cpp
@@ -26,6 +26,7 @@
 #include "llvm/Analysis/BasicAliasAnalysis.h"
 #include "llvm/Analysis/BlockFrequencyInfo.h"
 #include "llvm/Analysis/CGSCCPassManager.h"
+#include "llvm/Analysis/EphemeralValuesCache.h"
 #include "llvm/Analysis/InlineAdvisor.h"
 #include "llvm/Analysis/InlineCost.h"
 #include "llvm/Analysis/LazyCallGraph.h"
@@ -388,6 +389,9 @@ PreservedAnalyses InlinerPass::run(LazyCallGraph::SCC &InitialC,
         Advice->recordUnsuccessfulInlining(IR);
         continue;
       }
+      // TODO: Shouldn't we be invalidating all analyses on F here?
+      // The caller was modified, so invalidate Ephemeral Values.
+      FAM.getResult<EphemeralValuesAnalysis>(F).clear();
Contributor:

Hm, shouldn't we be invalidating all analyses on the function here, rather than just EphemeralValues?

Contributor Author:

I think so, but let's make this a separate patch in case it breaks something. I added a TODO.

Contributor:

Okay, that sounds reasonable.

       DidInline = true;
       InlinedCallees.insert(&Callee);
3 changes: 3 additions & 0 deletions llvm/test/Transforms/Inline/cgscc-incremental-invalidate.ll
@@ -12,15 +12,18 @@
 ; CHECK: Invalidating analysis: LoopAnalysis on test1_f
 ; CHECK: Invalidating analysis: BranchProbabilityAnalysis on test1_f
 ; CHECK: Invalidating analysis: BlockFrequencyAnalysis on test1_f
+; CHECK: Invalidating analysis: EphemeralValuesAnalysis on test1_f
 ; CHECK: Running analysis: DominatorTreeAnalysis on test1_g
 ; CHECK: Invalidating analysis: DominatorTreeAnalysis on test1_g
 ; CHECK: Invalidating analysis: LoopAnalysis on test1_g
 ; CHECK: Invalidating analysis: BranchProbabilityAnalysis on test1_g
 ; CHECK: Invalidating analysis: BlockFrequencyAnalysis on test1_g
+; CHECK: Invalidating analysis: EphemeralValuesAnalysis on test1_g
 ; CHECK: Invalidating analysis: DominatorTreeAnalysis on test1_h
 ; CHECK: Invalidating analysis: LoopAnalysis on test1_h
 ; CHECK: Invalidating analysis: BranchProbabilityAnalysis on test1_h
 ; CHECK: Invalidating analysis: BlockFrequencyAnalysis on test1_h
+; CHECK: Invalidating analysis: EphemeralValuesAnalysis on test1_h
 ; CHECK-NOT: Invalidating analysis:
 ; CHECK: Running pass: DominatorTreeVerifierPass on test1_g
 ; CHECK-NEXT: Running analysis: DominatorTreeAnalysis on test1_g
1 change: 1 addition & 0 deletions llvm/unittests/Analysis/CMakeLists.txt
@@ -27,6 +27,7 @@ set(ANALYSIS_TEST_SOURCES
   DDGTest.cpp
   DomTreeUpdaterTest.cpp
   DXILResourceTest.cpp
+  EphemeralValuesCacheTest.cpp
   GraphWriterTest.cpp
   GlobalsModRefTest.cpp
   FunctionPropertiesAnalysisTest.cpp