[Coro] Use DebugInfoCache to speed up cloning in CoroSplitPass #118630
Summary:
We can use a DebugInfoFinder from DebugInfoCache which is already primed on a compile unit to speed up collection of global debug info.

The pass could likely be another 2x+ faster if we avoid rebuilding the set of global debug info. This needs further massaging of CloneFunction and ValueMapper, though, and can be done incrementally on top of this.

Comparing performance of CoroSplitPass at various points in this stack, this is anecdata from a sample cpp file compiled with full debug info:

|                 | Baseline | IdentityMD set | Prebuilt GlobalDI | Cached CU DIFinder (cur.) |
|-----------------+----------+----------------+-------------------+---------------------------|
| CoroSplitPass   | 306ms    | 221ms          | 68ms              | 17ms                      |
| CoroCloner      | 101ms    | 72ms           | 0.5ms             | 0.5ms                     |
| CollectGlobalDI | -        | -              | 63ms              | 13ms                      |
|-----------------+----------+----------------+-------------------+---------------------------|
| Speed up        | 1x       | 1.4x           | 4.5x              | 18x                       |

Test Plan:
ninja check-llvm-unit
ninja check-llvm
Compiled a sample cpp file with time trace to get the avg. duration of the pass and inner scopes.

stack-info: PR: #118630, branch: users/artempyanykh/fast-coro-upstream/11
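To illustrate the idea in the summary, here is a minimal standalone sketch of the "copy a primed finder, then visit only what is left" pattern. It does not use LLVM at all: `Finder`, `Cache`, `cachedFinder` and `collect` are hypothetical stand-ins invented for this example (a set of node names instead of `DebugInfoFinder`, a map keyed by a compile unit name instead of `DebugInfoCache`), so treat it as a model of the approach rather than the code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a debug info finder: remembers visited nodes
// so that a node is only processed once.
struct Finder {
  std::set<std::string> Seen;
  // Returns true if the node was not seen before (i.e. real work was done).
  bool visit(const std::string &Node) { return Seen.insert(Node).second; }
};

// Hypothetical stand-in for the cache: one primed finder per compile unit.
using Cache = std::unordered_map<std::string, Finder>;

// Return the primed finder for CU, or nullptr if there is no cache or no entry.
const Finder *cachedFinder(const Cache *C, const std::string &CU) {
  if (!C)
    return nullptr;
  auto It = C->find(CU);
  return It == C->end() ? nullptr : &It->second;
}

// Start from a copy of the primed finder when one is available, then visit
// the nodes reachable from the function. NewVisits counts nodes that still
// had to be processed after priming.
Finder collect(const Cache *C, const std::string &CU,
               const std::vector<std::string> &Nodes, std::size_t &NewVisits) {
  Finder F;
  if (const Finder *Primed = cachedFinder(C, CU))
    F = *Primed; // copy, so the cached entry itself is never mutated
  NewVisits = 0;
  for (const std::string &N : Nodes)
    if (F.visit(N))
      ++NewVisits;
  return F;
}
```

With a cache entry for the compile unit, nodes already recorded there are skipped and only function-local nodes count as new work; with a null cache or a missing entry, `collect` falls back to visiting everything, which mirrors the null checks in the diff.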
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-coroutines

Author: Artem Pianykh (artempyanykh)

Changes

[Coro] Use DebugInfoCache to speed up cloning in CoroSplitPass

Summary:
The pass could likely be another 2x+ faster if we avoid rebuilding the set of global debug info.

Comparing performance of CoroSplitPass at various points in this stack, this is anecdata from a sample cpp file compiled with full debug info.

Test Plan:
Compiled a sample cpp file with time trace to get the avg. duration of the pass and inner scopes.

Full diff: https://github.com/llvm/llvm-project/pull/118630.diff

14 Files Affected:
diff --git a/llvm/include/llvm/Transforms/Coroutines/ABI.h b/llvm/include/llvm/Transforms/Coroutines/ABI.h
index 0b2d405f3caec4..2cf614b6bb1e2a 100644
--- a/llvm/include/llvm/Transforms/Coroutines/ABI.h
+++ b/llvm/include/llvm/Transforms/Coroutines/ABI.h
@@ -15,6 +15,7 @@
#ifndef LLVM_TRANSFORMS_COROUTINES_ABI_H
#define LLVM_TRANSFORMS_COROUTINES_ABI_H
+#include "llvm/Analysis/DebugInfoCache.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Transforms/Coroutines/CoroShape.h"
#include "llvm/Transforms/Coroutines/MaterializationUtils.h"
@@ -53,7 +54,8 @@ class BaseABI {
// Perform the function splitting according to the ABI.
virtual void splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) = 0;
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) = 0;
Function &F;
coro::Shape &Shape;
@@ -73,7 +75,8 @@ class SwitchABI : public BaseABI {
void splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) override;
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) override;
};
class AsyncABI : public BaseABI {
@@ -86,7 +89,8 @@ class AsyncABI : public BaseABI {
void splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) override;
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) override;
};
class AnyRetconABI : public BaseABI {
@@ -99,7 +103,8 @@ class AnyRetconABI : public BaseABI {
void splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) override;
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) override;
};
} // end namespace coro
diff --git a/llvm/lib/Analysis/CGSCCPassManager.cpp b/llvm/lib/Analysis/CGSCCPassManager.cpp
index 948bc2435ab275..3ba085cdb0be8b 100644
--- a/llvm/lib/Analysis/CGSCCPassManager.cpp
+++ b/llvm/lib/Analysis/CGSCCPassManager.cpp
@@ -14,6 +14,7 @@
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/iterator_range.h"
+#include "llvm/Analysis/DebugInfoCache.h"
#include "llvm/Analysis/LazyCallGraph.h"
#include "llvm/IR/Constant.h"
#include "llvm/IR/InstIterator.h"
@@ -139,6 +140,11 @@ ModuleToPostOrderCGSCCPassAdaptor::run(Module &M, ModuleAnalysisManager &AM) {
// Get the call graph for this module.
LazyCallGraph &CG = AM.getResult<LazyCallGraphAnalysis>(M);
+ // Prime DebugInfoCache.
+ // TODO: Currently, the only user is CoroSplitPass. Consider running
+ // conditionally.
+ AM.getResult<DebugInfoCacheAnalysis>(M);
+
// Get Function analysis manager from its proxy.
FunctionAnalysisManager &FAM =
AM.getCachedResult<FunctionAnalysisManagerModuleProxy>(M)->getManager();
@@ -350,6 +356,7 @@ ModuleToPostOrderCGSCCPassAdaptor::run(Module &M, ModuleAnalysisManager &AM) {
// analysis proxies by handling them above and in any nested pass managers.
PA.preserveSet<AllAnalysesOn<LazyCallGraph::SCC>>();
PA.preserve<LazyCallGraphAnalysis>();
+ PA.preserve<DebugInfoCacheAnalysis>();
PA.preserve<CGSCCAnalysisManagerModuleProxy>();
PA.preserve<FunctionAnalysisManagerModuleProxy>();
return PA;
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 2803b340bd22e0..96e531d498d722 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -32,6 +32,7 @@
#include "llvm/Analysis/CFG.h"
#include "llvm/Analysis/CallGraph.h"
#include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/Analysis/DebugInfoCache.h"
#include "llvm/Analysis/LazyCallGraph.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/Analysis/TargetTransformInfo.h"
@@ -79,15 +80,39 @@ using namespace llvm;
#define DEBUG_TYPE "coro-split"
namespace {
+const DebugInfoFinder *cachedDIFinder(Function &F,
+ const DebugInfoCache *DICache) {
+ if (!DICache)
+ return nullptr;
+
+ auto *SP = F.getSubprogram();
+ auto *CU = SP ? SP->getUnit() : nullptr;
+ if (!CU)
+ return nullptr;
+
+ auto Found = DICache->Result.find(CU);
+ if (Found == DICache->Result.end())
+ return nullptr;
+
+ return &Found->getSecond();
+}
+
/// Collect (a known) subset of global debug info metadata potentially used by
/// the function \p F.
///
/// This metadata set can be used to avoid cloning debug info not owned by \p F
/// and is shared among all potential clones \p F.
-void collectGlobalDebugInfo(Function &F, MetadataSetTy &GlobalDebugInfo) {
+void collectGlobalDebugInfo(Function &F, MetadataSetTy &GlobalDebugInfo,
+ const DebugInfoCache *DICache) {
TimeTraceScope FunctionScope("CollectGlobalDebugInfo");
DebugInfoFinder DIFinder;
+
+ // Copy DIFinder from cache which is primed on F's compile unit when available
+ auto *PrimedDIFinder = cachedDIFinder(F, DICache);
+ if (PrimedDIFinder)
+ DIFinder = *PrimedDIFinder;
+
DISubprogram *SPClonedWithinModule = CollectDebugInfoForCloning(
F, CloneFunctionChangeType::LocalChangesOnly, DIFinder);
@@ -1394,11 +1419,11 @@ namespace {
struct SwitchCoroutineSplitter {
static void split(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) {
+ TargetTransformInfo &TTI, const DebugInfoCache *DICache) {
assert(Shape.ABI == coro::ABI::Switch);
MetadataSetTy GlobalDebugInfo;
- collectGlobalDebugInfo(F, GlobalDebugInfo);
+ collectGlobalDebugInfo(F, GlobalDebugInfo, DICache);
// Create a resume clone by cloning the body of the original function,
// setting new entry block and replacing coro.suspend an appropriate value
@@ -1712,7 +1737,8 @@ CallInst *coro::createMustTailCall(DebugLoc Loc, Function *MustTailCallFn,
void coro::AsyncABI::splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) {
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) {
assert(Shape.ABI == coro::ABI::Async);
assert(Clones.empty());
// Reset various things that the optimizer might have decided it
@@ -1799,7 +1825,7 @@ void coro::AsyncABI::splitCoroutine(Function &F, coro::Shape &Shape,
assert(Clones.size() == Shape.CoroSuspends.size());
MetadataSetTy GlobalDebugInfo;
- collectGlobalDebugInfo(F, GlobalDebugInfo);
+ collectGlobalDebugInfo(F, GlobalDebugInfo, DICache);
for (auto [Idx, CS] : llvm::enumerate(Shape.CoroSuspends)) {
auto *Suspend = CS;
@@ -1812,7 +1838,8 @@ void coro::AsyncABI::splitCoroutine(Function &F, coro::Shape &Shape,
void coro::AnyRetconABI::splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) {
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) {
assert(Shape.ABI == coro::ABI::Retcon || Shape.ABI == coro::ABI::RetconOnce);
assert(Clones.empty());
@@ -1934,7 +1961,7 @@ void coro::AnyRetconABI::splitCoroutine(Function &F, coro::Shape &Shape,
assert(Clones.size() == Shape.CoroSuspends.size());
MetadataSetTy GlobalDebugInfo;
- collectGlobalDebugInfo(F, GlobalDebugInfo);
+ collectGlobalDebugInfo(F, GlobalDebugInfo, DICache);
for (auto [Idx, CS] : llvm::enumerate(Shape.CoroSuspends)) {
auto Suspend = CS;
@@ -1988,13 +2015,15 @@ static bool hasSafeElideCaller(Function &F) {
void coro::SwitchABI::splitCoroutine(Function &F, coro::Shape &Shape,
SmallVectorImpl<Function *> &Clones,
- TargetTransformInfo &TTI) {
- SwitchCoroutineSplitter::split(F, Shape, Clones, TTI);
+ TargetTransformInfo &TTI,
+ const DebugInfoCache *DICache) {
+ SwitchCoroutineSplitter::split(F, Shape, Clones, TTI, DICache);
}
static void doSplitCoroutine(Function &F, SmallVectorImpl<Function *> &Clones,
coro::BaseABI &ABI, TargetTransformInfo &TTI,
- bool OptimizeFrame) {
+ bool OptimizeFrame,
+ const DebugInfoCache *DICache) {
PrettyStackTraceFunction prettyStackTrace(F);
auto &Shape = ABI.Shape;
@@ -2019,7 +2048,7 @@ static void doSplitCoroutine(Function &F, SmallVectorImpl<Function *> &Clones,
if (isNoSuspendCoroutine) {
handleNoSuspendCoroutine(Shape);
} else {
- ABI.splitCoroutine(F, Shape, Clones, TTI);
+ ABI.splitCoroutine(F, Shape, Clones, TTI, DICache);
}
// Replace all the swifterror operations in the original function.
@@ -2216,6 +2245,9 @@ PreservedAnalyses CoroSplitPass::run(LazyCallGraph::SCC &C,
auto &FAM =
AM.getResult<FunctionAnalysisManagerCGSCCProxy>(C, CG).getManager();
+ const auto &MAMProxy = AM.getResult<ModuleAnalysisManagerCGSCCProxy>(C, CG);
+ const auto *DICache = MAMProxy.getCachedResult<DebugInfoCacheAnalysis>(M);
+
// Check for uses of llvm.coro.prepare.retcon/async.
SmallVector<Function *, 2> PrepareFns;
addPrepareFunction(M, PrepareFns, "llvm.coro.prepare.retcon");
@@ -2252,7 +2284,7 @@ PreservedAnalyses CoroSplitPass::run(LazyCallGraph::SCC &C,
SmallVector<Function *, 4> Clones;
auto &TTI = FAM.getResult<TargetIRAnalysis>(F);
- doSplitCoroutine(F, Clones, *ABI, TTI, OptimizeFrame);
+ doSplitCoroutine(F, Clones, *ABI, TTI, OptimizeFrame, DICache);
CurrentSCC = &updateCallGraphAfterCoroutineSplit(
*N, Shape, Clones, *CurrentSCC, CG, AM, UR, FAM);
diff --git a/llvm/test/Other/new-pass-manager.ll b/llvm/test/Other/new-pass-manager.ll
index f0fe708806f1b6..53fd6fe2a317ec 100644
--- a/llvm/test/Other/new-pass-manager.ll
+++ b/llvm/test/Other/new-pass-manager.ll
@@ -23,6 +23,7 @@
; CHECK-CGSCC-PASS-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*(FunctionAnalysisManager|AnalysisManager<.*Function.*>).*}},{{.*}}Module>
; CHECK-CGSCC-PASS-NEXT: Running analysis: LazyCallGraphAnalysis
; CHECK-CGSCC-PASS-NEXT: Running analysis: TargetLibraryAnalysis
+; CHECK-CGSCC-PASS-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-CGSCC-PASS-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-CGSCC-PASS-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph::SCC{{.*}}>
; CHECK-CGSCC-PASS-NEXT: Running pass: NoOpCGSCCPass
diff --git a/llvm/test/Other/new-pm-defaults.ll b/llvm/test/Other/new-pm-defaults.ll
index 7cf035b0c6f376..19bfec3ab718e0 100644
--- a/llvm/test/Other/new-pm-defaults.ll
+++ b/llvm/test/Other/new-pm-defaults.ll
@@ -139,6 +139,7 @@
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph::SCC{{.*}}>
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-lto-defaults.ll b/llvm/test/Other/new-pm-lto-defaults.ll
index f788db1e338a1e..8f4fa763b52098 100644
--- a/llvm/test/Other/new-pm-lto-defaults.ll
+++ b/llvm/test/Other/new-pm-lto-defaults.ll
@@ -43,6 +43,7 @@
; CHECK-O23SZ-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
; CHECK-O23SZ-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}SCC
; CHECK-O23SZ-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O23SZ-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O23SZ-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O23SZ-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>
; CHECK-O23SZ-NEXT: Running pass: PostOrderFunctionAttrsPass
diff --git a/llvm/test/Other/new-pm-pgo-preinline.ll b/llvm/test/Other/new-pm-pgo-preinline.ll
index f07a3728ba3d48..97813bb2433642 100644
--- a/llvm/test/Other/new-pm-pgo-preinline.ll
+++ b/llvm/test/Other/new-pm-pgo-preinline.ll
@@ -5,6 +5,7 @@
; CHECK-Osz-NEXT: Running analysis: InlineAdvisorAnalysis
; CHECK-Osz-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-Osz-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-Osz-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-Osz-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy on (foo)
; CHECK-Osz-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-Osz-NEXT: Running pass: InlinerPass on (foo)
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
index ed13402e1c4b15..e1ad86015fda95 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
@@ -74,6 +74,7 @@
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
index c82c34f7ff01e7..3f6c5351e8e8e0 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
@@ -62,6 +62,7 @@
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph::SCC{{.*}}>
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
index d375747547d61f..371dde305b0990 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
@@ -71,6 +71,7 @@
; CHECK-O-NEXT: Invalidating analysis: AAManager
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
index 5aacd26def2be5..860fb99525030f 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
@@ -106,6 +106,7 @@
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
index f6a94065968038..d97cc97169b56b 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
@@ -62,6 +62,7 @@
; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy on (foo)
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: InlinerPass on (foo)
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
index 48a9433d249996..d338817d076463 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
@@ -76,6 +76,7 @@
; CHECK-O-NEXT: Invalidating analysis: AAManager
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph::SCC{{.*}}>
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp b/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp
index 5c71bc8063d6c9..7212107d992638 100644
--- a/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp
+++ b/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp
@@ -7,6 +7,7 @@
//===----------------------------------------------------------------------===//
#include "llvm/Analysis/CGSCCPassManager.h"
+#include "llvm/Analysis/DebugInfoCache.h"
#include "llvm/Analysis/LazyCallGraph.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/AsmParser/Parser.h"
@@ -16,8 +17,8 @@
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
-#include "llvm/IR/PassManager.h"
#include "llvm/IR/PassInstrumentation.h"
+#include "llvm/IR/PassManager.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Transforms/Utils/CallGraphUpdater.h"
@@ -255,6 +256,7 @@ class CGSCCPassManagerTest : public ::testing::Test {
"}\n")) {
FAM.registerPass([&] { return TargetLibraryAnalysis(); });
MAM.registerPass([&] { return LazyCallGraphAnalysis(); });
+ MAM.registerPass([&] { return DebugInfoCacheAnalysis(); });
MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
// Register required pass instrumentation analysis.
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
index d375747547d61f..371dde305b0990 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
@@ -71,6 +71,7 @@
; CHECK-O-NEXT: Invalidating analysis: AAManager
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
index 5aacd26def2be5..860fb99525030f 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
@@ -106,6 +106,7 @@
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
index f6a94065968038..d97cc97169b56b 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
@@ -62,6 +62,7 @@
; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy on (foo)
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: InlinerPass on (foo)
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
index 48a9433d249996..d338817d076463 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
@@ -76,6 +76,7 @@
; CHECK-O-NEXT: Invalidating analysis: AAManager
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
+; CHECK-O-NEXT: Running analysis: DebugInfoCacheAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph::SCC{{.*}}>
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
diff --git a/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp b/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp
index 5c71bc8063d6c9..7212107d992638 100644
--- a/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp
+++ b/llvm/unittests/Analysis/CGSCCPassManagerTest.cpp
@@ -7,6 +7,7 @@
//===----------------------------------------------------------------------===//
#include "llvm/Analysis/CGSCCPassManager.h"
+#include "llvm/Analysis/DebugInfoCache.h"
#include "llvm/Analysis/LazyCallGraph.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/AsmParser/Parser.h"
@@ -16,8 +17,8 @@
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
-#include "llvm/IR/PassManager.h"
#include "llvm/IR/PassInstrumentation.h"
+#include "llvm/IR/PassManager.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Transforms/Utils/CallGraphUpdater.h"
@@ -255,6 +256,7 @@ class CGSCCPassManagerTest : public ::testing::Test {
"}\n")) {
FAM.registerPass([&] { return TargetLibraryAnalysis(); });
MAM.registerPass([&] { return LazyCallGraphAnalysis(); });
+ MAM.registerPass([&] { return DebugInfoCacheAnalysis(); });
MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
// Register required pass instrumentation analysis.
Summary:
We can use a DebugInfoFinder from DebugInfoCache which is already primed on a compile unit to speed up collection of module-level debug info.
The pass could likely be another 2x+ faster if we avoid rebuilding the set of global debug info. This needs further massaging of CloneFunction and ValueMapper, though, and can be done incrementally on top of this.
Comparing performance of CoroSplitPass at various points in this stack, this is anecdata from a sample cpp file compiled with full debug info:

|                 | Baseline | IdentityMD set | Prebuilt CommonDI | Cached CU DIFinder (cur.) |
|-----------------|----------|----------------|-------------------|---------------------------|
| CoroSplitPass   | 306ms    | 221ms          | 68ms              | 17ms                      |
| CoroCloner      | 101ms    | 72ms           | 0.5ms             | 0.5ms                     |
| CollectGlobalDI | -        | -              | 63ms              | 13ms                      |
| Speed up        | 1x       | 1.4x           | 4.5x              | 18x                       |

Test Plan:
ninja check-llvm-unit
ninja check-llvm
Compiled a sample cpp file with time trace to get the avg. duration of the pass and inner scopes.
stack-info: PR: #118630, branch: users/artempyanykh/fast-coro-upstream/11
Summary:
We can use a DebugInfoFinder from DebugInfoCache which is already primed on a compile unit to speed up collection of module-level debug info.
The pass could likely be another 2x+ faster if we avoid rebuilding the set of common debug info. This needs further massaging of CloneFunction and ValueMapper, though, and can be done incrementally on top of this.
Comparing performance of CoroSplitPass at various points in this stack, this is anecdata from a sample cpp file compiled with full debug info:

|                 | Baseline | IdentityMD set | Prebuilt CommonDI | Cached CU DIFinder (cur.) |
|-----------------|----------|----------------|-------------------|---------------------------|
| CoroSplitPass   | 306ms    | 221ms          | 68ms              | 17ms                      |
| CoroCloner      | 101ms    | 72ms           | 0.5ms             | 0.5ms                     |
| CollectCommonDI | -        | -              | 63ms              | 13ms                      |
| Speed up        | 1x       | 1.4x           | 4.5x              | 18x                       |

Test Plan:
ninja check-llvm-unit
ninja check-llvm
Compiled a sample cpp file with time trace to get the avg. duration of the pass and inner scopes.
stack-info: PR: #118630, branch: users/artempyanykh/fast-coro-upstream/11
Dropping this in favor of #129148
Stacked PRs:
[Coro] Use DebugInfoCache to speed up cloning in CoroSplitPass
Summary:
We can use a DebugInfoFinder from DebugInfoCache which is already primed on a compile unit to speed
up collection of module-level debug info.
The pass could likely be another 2x+ faster if we avoid rebuilding the set of common debug
info. This needs further massaging of CloneFunction and ValueMapper, though, and can be done
incrementally on top of this.
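The caching idea described in the summary can be sketched outside of LLVM as follows. This is a minimal illustration only: `CompileUnit`, `CollectedInfo`, and `PerUnitCache` are hypothetical names invented for this sketch and are not the `DebugInfoCache` or `DebugInfoFinder` interfaces from this PR. It assumes the expensive step is a traversal whose result depends only on the compile unit, so it can be computed once and reused for every clone.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compile unit: an ID plus the debug-info
// nodes reachable from it. This is not LLVM's DICompileUnit.
struct CompileUnit {
  int id;
  std::vector<std::string> nodes;
};

// Hypothetical stand-in for the result of an expensive traversal.
struct CollectedInfo {
  std::vector<std::string> nodes;
};

// Caches one CollectedInfo per compile unit so the traversal runs once per
// unit instead of once per cloned function.
class PerUnitCache {
public:
  using Collector = std::function<CollectedInfo(const CompileUnit &)>;

  explicit PerUnitCache(Collector collect) : collect_(std::move(collect)) {
    assert(collect_ && "a collector callback is required");
  }

  // Returns the cached result for `cu`, running the collector on a miss.
  const CollectedInfo &get(const CompileUnit &cu) {
    auto it = cache_.find(cu.id);
    if (it == cache_.end()) {
      ++misses_;
      it = cache_.emplace(cu.id, collect_(cu)).first;
    }
    return it->second;
  }

  // Number of times the collector actually ran.
  std::size_t misses() const { return misses_; }

private:
  Collector collect_;
  std::unordered_map<int, CollectedInfo> cache_;
  std::size_t misses_ = 0;
};
```

With this shape, calling `get` repeatedly for the same unit runs the collector once; later calls return the stored result. Invalidation is deliberately left out of the sketch.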
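Comparing performance of CoroSplitPass at various points in this stack, this is anecdata from a sample
cpp file compiled with full debug info:

|                 | Baseline | IdentityMD set | Prebuilt CommonDI | Cached CU DIFinder (cur.) |
|-----------------|----------|----------------|-------------------|---------------------------|
| CoroSplitPass   | 306ms    | 221ms          | 68ms              | 17ms                      |
| CoroCloner      | 101ms    | 72ms           | 0.5ms             | 0.5ms                     |
| CollectCommonDI | -        | -              | 63ms              | 13ms                      |
| Speed up        | 1x       | 1.4x           | 4.5x              | 18x                       |
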
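The speed-up figures quoted for this stack (1x, 1.4x, 4.5x, 18x) are the baseline CoroSplitPass duration of 306ms divided by the duration at each later point (221ms, 68ms, 17ms). A small sketch of that arithmetic, with `speedup` and `round1` being helper names made up for this illustration:

```cpp
#include <cassert>
#include <cmath>

// Speed-up of a measured duration relative to the baseline duration.
double speedup(double baseline_ms, double current_ms) {
  assert(current_ms > 0.0 && "duration must be positive");
  return baseline_ms / current_ms;
}

// Rounds to one decimal place, the precision the quoted figures use.
double round1(double x) { return std::round(x * 10.0) / 10.0; }
```

For example, `speedup(306.0, 17.0)` is 18, and `round1(speedup(306.0, 221.0))` is 1.4.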
Test Plan:
ninja check-llvm-unit
ninja check-llvm
Compiled a sample cpp file with time trace to get the avg. duration of the pass and inner scopes.