-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[Analysis] Add DebugInfoCache analysis #118629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute common debug info metadata (required for coroutine function cloning) much faster. Specifically, this means paying the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
77d348f
to
aa6401c
Compare
784c570
to
bdb7970
Compare
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-ir Author: Artem Pianykh (artempyanykh) ChangesStacked PRs:
[Analysis] Add DebugInfoCache analysisSummary: Test Plan: Full diff: https://github.com/llvm/llvm-project/pull/118629.diff 8 Files Affected:
diff --git a/llvm/include/llvm/Analysis/DebugInfoCache.h b/llvm/include/llvm/Analysis/DebugInfoCache.h
new file mode 100644
index 00000000000000..dbd6802c99ea01
--- /dev/null
+++ b/llvm/include/llvm/Analysis/DebugInfoCache.h
@@ -0,0 +1,50 @@
+//===- llvm/Analysis/DebugInfoCache.h - debug info cache --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains an analysis that builds a cache of debug info for each
+// DICompileUnit in a module.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_ANALYSIS_DEBUGINFOCACHE_H
+#define LLVM_ANALYSIS_DEBUGINFOCACHE_H
+
+#include "llvm/IR/DebugInfo.h"
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+/// Processes and caches debug info for each DICompileUnit in a module.
+///
+/// The result of the analysis is a set of DebugInfoFinders primed on their
+/// respective DICompileUnit. Such DebugInfoFinders can be used to speed up
+/// function cloning which otherwise requires an expensive traversal of
+/// DICompileUnit-level debug info. See an example usage in CoroSplit.
+class DebugInfoCache {
+public:
+ using DIFinderCache = SmallDenseMap<const DICompileUnit *, DebugInfoFinder>;
+ DIFinderCache Result;
+
+ DebugInfoCache(const Module &M);
+
+ bool invalidate(Module &, const PreservedAnalyses &,
+ ModuleAnalysisManager::Invalidator &);
+};
+
+class DebugInfoCacheAnalysis
+ : public AnalysisInfoMixin<DebugInfoCacheAnalysis> {
+ friend AnalysisInfoMixin<DebugInfoCacheAnalysis>;
+ static AnalysisKey Key;
+
+public:
+ using Result = DebugInfoCache;
+ Result run(Module &M, ModuleAnalysisManager &);
+};
+} // namespace llvm
+
+#endif
diff --git a/llvm/include/llvm/IR/DebugInfo.h b/llvm/include/llvm/IR/DebugInfo.h
index 73f45c3769be44..11907fbb7f20b3 100644
--- a/llvm/include/llvm/IR/DebugInfo.h
+++ b/llvm/include/llvm/IR/DebugInfo.h
@@ -120,11 +120,13 @@ class DebugInfoFinder {
/// Process subprogram.
void processSubprogram(DISubprogram *SP);
+ /// Process a compile unit.
+ void processCompileUnit(DICompileUnit *CU);
+
/// Clear all lists.
void reset();
private:
- void processCompileUnit(DICompileUnit *CU);
void processScope(DIScope *Scope);
void processType(DIType *DT);
bool addCompileUnit(DICompileUnit *CU);
diff --git a/llvm/lib/Analysis/CMakeLists.txt b/llvm/lib/Analysis/CMakeLists.txt
index 0db5b80f336cb5..db9a569e301563 100644
--- a/llvm/lib/Analysis/CMakeLists.txt
+++ b/llvm/lib/Analysis/CMakeLists.txt
@@ -52,6 +52,7 @@ add_llvm_component_library(LLVMAnalysis
DDGPrinter.cpp
ConstraintSystem.cpp
Delinearization.cpp
+ DebugInfoCache.cpp
DemandedBits.cpp
DependenceAnalysis.cpp
DependenceGraphBuilder.cpp
diff --git a/llvm/lib/Analysis/DebugInfoCache.cpp b/llvm/lib/Analysis/DebugInfoCache.cpp
new file mode 100644
index 00000000000000..c1a3e89f0a6ccf
--- /dev/null
+++ b/llvm/lib/Analysis/DebugInfoCache.cpp
@@ -0,0 +1,47 @@
+//===- llvm/Analysis/DebugInfoCache.cpp - debug info cache ----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains an analysis that builds a cache of debug info for each
+// DICompileUnit in a module.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/DebugInfoCache.h"
+#include "llvm/IR/Module.h"
+
+using namespace llvm;
+
+namespace {
+DebugInfoFinder processCompileUnit(DICompileUnit *CU) {
+ DebugInfoFinder DIFinder;
+ DIFinder.processCompileUnit(CU);
+
+ return DIFinder;
+}
+} // namespace
+
+DebugInfoCache::DebugInfoCache(const Module &M) {
+ for (const auto CU : M.debug_compile_units()) {
+ auto DIFinder = processCompileUnit(CU);
+ Result[CU] = std::move(DIFinder);
+ }
+}
+
+bool DebugInfoCache::invalidate(Module &M, const PreservedAnalyses &PA,
+ ModuleAnalysisManager::Invalidator &) {
+ // Check whether the analysis has been explicitly invalidated. Otherwise, it's
+ // stateless and remains preserved.
+ auto PAC = PA.getChecker<DebugInfoCacheAnalysis>();
+ return !PAC.preservedWhenStateless();
+}
+
+AnalysisKey DebugInfoCacheAnalysis::Key;
+
+DebugInfoCache DebugInfoCacheAnalysis::run(Module &M, ModuleAnalysisManager &) {
+ return DebugInfoCache(M);
+}
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index ba52a37df9c25d..a900be130bdeb5 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -34,6 +34,7 @@
#include "llvm/Analysis/DDGPrinter.h"
#include "llvm/Analysis/DXILMetadataAnalysis.h"
#include "llvm/Analysis/DXILResource.h"
+#include "llvm/Analysis/DebugInfoCache.h"
#include "llvm/Analysis/Delinearization.h"
#include "llvm/Analysis/DemandedBits.h"
#include "llvm/Analysis/DependenceAnalysis.h"
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 7c3798f6462a46..d697c7c8ab83b8 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -21,6 +21,7 @@
MODULE_ANALYSIS("callgraph", CallGraphAnalysis())
MODULE_ANALYSIS("collector-metadata", CollectorMetadataAnalysis())
MODULE_ANALYSIS("ctx-prof-analysis", CtxProfAnalysis())
+MODULE_ANALYSIS("debug-info-cache", DebugInfoCacheAnalysis())
MODULE_ANALYSIS("dxil-metadata", DXILMetadataAnalysis())
MODULE_ANALYSIS("dxil-resource", DXILResourceAnalysis())
MODULE_ANALYSIS("inline-advisor", InlineAdvisorAnalysis())
diff --git a/llvm/unittests/Analysis/CMakeLists.txt b/llvm/unittests/Analysis/CMakeLists.txt
index 76d16513d93417..73694a1d6ba297 100644
--- a/llvm/unittests/Analysis/CMakeLists.txt
+++ b/llvm/unittests/Analysis/CMakeLists.txt
@@ -25,6 +25,7 @@ set(ANALYSIS_TEST_SOURCES
ConstraintSystemTest.cpp
CtxProfAnalysisTest.cpp
DDGTest.cpp
+ DebugInfoCacheTest.cpp
DomTreeUpdaterTest.cpp
DXILResourceTest.cpp
GraphWriterTest.cpp
diff --git a/llvm/unittests/Analysis/DebugInfoCacheTest.cpp b/llvm/unittests/Analysis/DebugInfoCacheTest.cpp
new file mode 100644
index 00000000000000..b32e4cb543158a
--- /dev/null
+++ b/llvm/unittests/Analysis/DebugInfoCacheTest.cpp
@@ -0,0 +1,211 @@
+//===- DebugInfoCacheTest.cpp - DebugInfoCache unit tests -----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/DebugInfoCache.h"
+#include "llvm/AsmParser/Parser.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Verifier.h"
+#include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/raw_ostream.h"
+#include "gtest/gtest.h"
+
+namespace llvm {
+namespace {
+
+// Forward declare the assembly
+extern StringRef MultiCUModule;
+
+const DICompileUnit *findCU(const Module &M, StringRef FileName) {
+ for (const auto CU : M.debug_compile_units()) {
+ if (CU->getFilename() == FileName)
+ return CU;
+ }
+
+ return nullptr;
+}
+
+class DebugInfoCacheTest : public testing::Test {
+protected:
+ LLVMContext C;
+
+ std::unique_ptr<Module> makeModule(StringRef Assembly) {
+ SMDiagnostic Err;
+ auto M = parseAssemblyString(Assembly, Err, C);
+ if (!M)
+ Err.print("DebugInfoCacheTest", errs());
+
+ verifyModule(*M, &errs());
+ return M;
+ }
+};
+
+TEST_F(DebugInfoCacheTest, TestEmpty) {
+ auto M = makeModule("");
+ DebugInfoCache DIC{*M};
+ EXPECT_EQ(DIC.Result.size(), 0u);
+}
+
+TEST_F(DebugInfoCacheTest, TestMultiCU) {
+ auto M = makeModule(MultiCUModule);
+ DebugInfoCache DIC{*M};
+ EXPECT_EQ(DIC.Result.size(), 2u);
+
+ auto *File1CU = findCU(*M, "file1.cpp");
+ EXPECT_NE(File1CU, nullptr);
+
+ auto File1DIFinder = DIC.Result.find(File1CU);
+ EXPECT_NE(File1DIFinder, DIC.Result.end());
+
+ EXPECT_EQ(File1DIFinder->getSecond().compile_unit_count(), 1u);
+ EXPECT_EQ(File1DIFinder->getSecond().type_count(), 6u);
+ EXPECT_EQ(File1DIFinder->getSecond().subprogram_count(), 0u);
+ EXPECT_EQ(File1DIFinder->getSecond().scope_count(), 1u);
+
+ auto *File2CU = findCU(*M, "file2.cpp");
+ EXPECT_NE(File1CU, nullptr);
+
+ auto File2DIFinder = DIC.Result.find(File2CU);
+ EXPECT_NE(File2DIFinder, DIC.Result.end());
+
+ EXPECT_EQ(File2DIFinder->getSecond().compile_unit_count(), 1u);
+ EXPECT_EQ(File2DIFinder->getSecond().type_count(), 2u);
+ EXPECT_EQ(File2DIFinder->getSecond().subprogram_count(), 0u);
+ EXPECT_EQ(File2DIFinder->getSecond().scope_count(), 2u);
+}
+
+/* Generated roughly by
+file1.cpp:
+struct file1_extern_type1;
+struct file1_extern_type2;
+
+namespace file1 {
+typedef struct file1_type1 { int x; float y; } file1_type1;
+file1_type1 global{0, 1.};
+} // file1
+
+extern struct file1_extern_type1 *file1_extern_func1(struct
+file1_extern_type2*);
+
+file1::file1_type1 file1_func1(file1::file1_type1 x) { return x; }
+--------
+file2.cpp:
+struct file2_extern_type1;
+struct file2_extern_type2;
+
+namespace file2 {
+typedef struct file2_type1 { float x; float y; } file2_type1;
+enum class file2_type2 { opt1, opt2 };
+
+namespace inner {
+file2_type2 inner_global{file2_type2::opt2};
+} // inner
+} // file2
+
+extern struct file2_extern_type1 *file2_extern_func1(struct
+file2_extern_type2*);
+
+file2::file2_type1 file2_func1(file2::file2_type1 x, file2::file2_type2 y) {
+return x; }
+--------
+$ clang -S -emit-llvm file*.cpp
+$ llvm-link -S -o single.ll file*.ll
+*/
+StringRef MultiCUModule = R"""(
+%"struct.file1::file1_type1" = type { i32, float }
+%"struct.file2::file2_type1" = type { float, float }
+
+@_ZN5file16globalE = dso_local global %"struct.file1::file1_type1" { i32 0, float 1.000000e+00 }, align 4, !dbg !0
+@_ZN5file25inner12inner_globalE = dso_local global i32 1, align 4, !dbg !11
+
+define dso_local i64 @_Z11file1_func1N5file111file1_type1E(i64 %0) !dbg !33 {
+ %2 = alloca %"struct.file1::file1_type1", align 4
+ %3 = alloca %"struct.file1::file1_type1", align 4
+ store i64 %0, ptr %3, align 4
+ #dbg_declare(ptr %3, !37, !DIExpression(), !38)
+ call void @llvm.memcpy.p0.p0.i64(ptr align 4 %2, ptr align 4 %3, i64 8, i1 false), !dbg !39
+ %4 = load i64, ptr %2, align 4, !dbg !40
+ ret i64 %4, !dbg !40
+}
+
+declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg)
+
+define dso_local <2 x float> @_Z11file2_func1N5file211file2_type1ENS_11file2_type2E(<2 x float> %0, i32 noundef %1) !dbg !41 {
+ %3 = alloca %"struct.file2::file2_type1", align 4
+ %4 = alloca %"struct.file2::file2_type1", align 4
+ %5 = alloca i32, align 4
+ store <2 x float> %0, ptr %4, align 4
+ #dbg_declare(ptr %4, !49, !DIExpression(), !50)
+ store i32 %1, ptr %5, align 4
+ #dbg_declare(ptr %5, !51, !DIExpression(), !52)
+ call void @llvm.memcpy.p0.p0.i64(ptr align 4 %3, ptr align 4 %4, i64 8, i1 false), !dbg !53
+ %6 = load <2 x float>, ptr %3, align 4, !dbg !54
+ ret <2 x float> %6, !dbg !54
+}
+
+!llvm.dbg.cu = !{!20, !22}
+!llvm.ident = !{!25, !25}
+!llvm.module.flags = !{!26, !27, !28, !29, !30, !31, !32}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "global", linkageName: "_ZN5file16globalE", scope: !2, file: !3, line: 6, type: !4, isLocal: false, isDefinition: true)
+!2 = !DINamespace(name: "file1", scope: null)
+!3 = !DIFile(filename: "file1.cpp", directory: "")
+!4 = !DIDerivedType(tag: DW_TAG_typedef, name: "file1_type1", scope: !2, file: !3, line: 5, baseType: !5)
+!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "file1_type1", scope: !2, file: !3, line: 5, size: 64, flags: DIFlagTypePassByValue, elements: !6, identifier: "_ZTSN5file111file1_type1E")
+!6 = !{!7, !9}
+!7 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !5, file: !3, line: 5, baseType: !8, size: 32)
+!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!9 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !5, file: !3, line: 5, baseType: !10, size: 32, offset: 32)
+!10 = !DIBasicType(name: "float", size: 32, encoding: DW_ATE_float)
+!11 = !DIGlobalVariableExpression(var: !12, expr: !DIExpression())
+!12 = distinct !DIGlobalVariable(name: "inner_global", linkageName: "_ZN5file25inner12inner_globalE", scope: !13, file: !15, line: 9, type: !16, isLocal: false, isDefinition: true)
+!13 = !DINamespace(name: "inner", scope: !14)
+!14 = !DINamespace(name: "file2", scope: null)
+!15 = !DIFile(filename: "file2.cpp", directory: "")
+!16 = distinct !DICompositeType(tag: DW_TAG_enumeration_type, name: "file2_type2", scope: !14, file: !15, line: 6, baseType: !8, size: 32, flags: DIFlagEnumClass, elements: !17, identifier: "_ZTSN5file211file2_type2E")
+!17 = !{!18, !19}
+!18 = !DIEnumerator(name: "opt1", value: 0)
+!19 = !DIEnumerator(name: "opt2", value: 1)
+!20 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !21, splitDebugInlining: false, nameTableKind: None)
+!21 = !{!0}
+!22 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !15, isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !23, globals: !24, splitDebugInlining: false, nameTableKind: None)
+!23 = !{!16}
+!24 = !{!11}
+!25 = !{!"clang"}
+!26 = !{i32 7, !"Dwarf Version", i32 5}
+!27 = !{i32 2, !"Debug Info Version", i32 3}
+!28 = !{i32 1, !"wchar_size", i32 4}
+!29 = !{i32 8, !"PIC Level", i32 2}
+!30 = !{i32 7, !"PIE Level", i32 2}
+!31 = !{i32 7, !"uwtable", i32 2}
+!32 = !{i32 7, !"frame-pointer", i32 2}
+!33 = distinct !DISubprogram(name: "file1_func1", linkageName: "_Z11file1_func1N5file111file1_type1E", scope: !3, file: !3, line: 11, type: !34, scopeLine: 11, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !20, retainedNodes: !36)
+!34 = !DISubroutineType(types: !35)
+!35 = !{!4, !4}
+!36 = !{}
+!37 = !DILocalVariable(name: "x", arg: 1, scope: !33, file: !3, line: 11, type: !4)
+!38 = !DILocation(line: 11, column: 51, scope: !33)
+!39 = !DILocation(line: 11, column: 63, scope: !33)
+!40 = !DILocation(line: 11, column: 56, scope: !33)
+!41 = distinct !DISubprogram(name: "file2_func1", linkageName: "_Z11file2_func1N5file211file2_type1ENS_11file2_type2E", scope: !15, file: !15, line: 15, type: !42, scopeLine: 15, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !22, retainedNodes: !36)
+!42 = !DISubroutineType(types: !43)
+!43 = !{!44, !44, !16}
+!44 = !DIDerivedType(tag: DW_TAG_typedef, name: "file2_type1", scope: !14, file: !15, line: 5, baseType: !45)
+!45 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "file2_type1", scope: !14, file: !15, line: 5, size: 64, flags: DIFlagTypePassByValue, elements: !46, identifier: "_ZTSN5file211file2_type1E")
+!46 = !{!47, !48}
+!47 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !45, file: !15, line: 5, baseType: !10, size: 32)
+!48 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !45, file: !15, line: 5, baseType: !10, size: 32, offset: 32)
+!49 = !DILocalVariable(name: "x", arg: 1, scope: !41, file: !15, line: 15, type: !44)
+!50 = !DILocation(line: 15, column: 51, scope: !41)
+!51 = !DILocalVariable(name: "y", arg: 2, scope: !41, file: !15, line: 15, type: !16)
+!52 = !DILocation(line: 15, column: 73, scope: !41)
+!53 = !DILocation(line: 15, column: 85, scope: !41)
+!54 = !DILocation(line: 15, column: 78, scope: !41)
+)""";
+} // namespace
+} // namespace llvm
|
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
aa6401c
to
24860c1
Compare
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
bdb7970
to
22aaf36
Compare
22aaf36
to
b316815
Compare
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
48abbb2
to
77892ea
Compare
b743d6c
to
40407df
Compare
c613ae3
to
a49247f
Compare
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
40407df
to
2190fe0
Compare
2190fe0
to
1f8b67b
Compare
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
1f8b67b
to
aaeec0b
Compare
aaeec0b
to
5402d09
Compare
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
83dea70
to
5bbda30
Compare
Summary: The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This allows (future) callers like CoroSplitPass to compute global debug info metadata (required for coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing only once per compile unit, rather than once per coroutine. Test Plan: Added a smoke test for the new analysis ninja check-llvm-unit check-llvm stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
5402d09
to
54bc13d
Compare
54bc13d
to
f727512
Compare
f727512
to
7d38fe3
Compare
@felipepiovezan, @TylerNowicki could you please have a look at #118629 and #118630? These are the 2 remaining bits of #109032. |
llvm/lib/Analysis/DebugInfoCache.cpp
Outdated
// Check whether the analysis has been explicitly invalidated. Otherwise, it's | ||
// stateless and remains preserved. | ||
auto PAC = PA.getChecker<DebugInfoCacheAnalysis>(); | ||
return !PAC.preservedWhenStateless(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this invalidating the analysis when the module changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question! I would appreciate help/guidance here, @felipepiovezan. IIUC passes that may formally change the module won't necessarily invalidate the analysis:
- changing globals, enum types, imported entities in a CU would invalidate it
- adding/removing CUs would too.
However, I'm not sure what's the proper way to express this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading the docs in https://llvm.org/docs/NewPassManager.html#implementing-analysis-invalidation
What you wrote seems to say that your analysis is only invalidated when explicitly invalidated, which is certainly not true (see "If an analysis is stateless and generally shouldn’t be invalidated, use the following:"). Your analysis is invalidated whenever debug metadata is changed/added/removed by any pass.
I think we should simply not implement this method, which implies the analysis is invalidated whenever the module changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it's still a bit fuzzy, but I think I understand better now. I was worried about unnecessary invalidations, but all CoroSplit passes will share the same analysis results as part of CGSCC pipeline. So, it's ok to have the default invalidate
impl.
I'll push the fixup momentarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felipepiovezan could you please have a look at #118630 where this analysis is actually wired in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why is the copy needed? A const reference would not suffice?
// Copy DIFinder from cache which is primed on F's compile unit when available
auto *PrimedDIFinder = cachedDIFinder(F, DICache);
if (PrimedDIFinder)
DIFinder = *PrimedDIFinder;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for comments and questions @felipepiovezan! Looks like the question is whether CoroSplit would invalidate analysis results. There are tests for coroutine transforms that check debug info, but let me try to write a more targeted test.
This is not really an area I'm very familiar with, but there are some things that I found odd about that PR. For example, why do we need this?
// Prime DebugInfoCache. // TODO: Currently, the only user is CoroSplitPass. Consider running // conditionally. AM.getResult<DebugInfoCacheAnalysis>(M);
This is based on my understanding of this part of the manual. Inner level passes (like CGSCC-level) can only ask for a cached result of an outer level analysis (like Module-level). They can't ask to compute outer analysis results on demand.
Unless I explicitly run the analysis there, there won't be any results to consume from inside CoroSplit.
The 'conditional' part is based on what I saw in CoroConditionalWrapper. Since CoroSplit is the only user, we could gate the analysis. OTOH, gating could prevent the adoption of the analysis in other places. So, unclear.
I was worried about unnecessary invalidations, but all CoroSplit passes will share the same analysis results as part of CGSCC pipeline.
Isn't this dangerous though? Each CoroSplit pass will modify the Module and its metadata, so the analysis need to be invalidated after every run of CoroSplit
Let me try to write a test for it.
the pass ends with a return PreservedAnalyses::none();
I think this is for a specific unit of IR (SCC in this case)? In traces that I get the analysis runs once per ModuleToPostOrderCGSCCPassAdaptor
.
This implies we're doing a lot of unnecessary work that will be invalidated.
Should this new analysis be lazy?
Ideally! But given the caveats in the doc about on demand computations and "future concurrency", I'm not sure what's the best way to proceed; and if it'll be worth it. In my testing even on our larger sources (that otherwise take a few minutes to compile) the pass takes 60ms.
I.e. only visit a CU when it is queried on it.
This would also depend on how common multi-CU modules are.
Also, why is the copy needed? A const reference would not suffice?
// Copy DIFinder from cache which is primed on F's compile unit when available auto *PrimedDIFinder = cachedDIFinder(F, DICache); if (PrimedDIFinder) DIFinder = *PrimedDIFinder;
DIFinder is mutable, it gets hydrated by visiting debug info attached to an element, but it won't revisit what's already been visited. For a function:
- it visits debug info attached to function's DISubprogram,
- it visits debug into attached to function's CU (see this).
We use PrimedDIFinder
hydrated on a CU to cut the expensive p.2 but we can't pass it by reference (and copy instead) because it gets mutated down the line when doing p.1. It'd be great to refactor function cloning more, but at this point I'd like to validate the caching part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to summarize what may or may not be an issue here. Imagine a module with two CGSCCs, but one CU.
We have a module level analysis that calls DebugInfoFinder::processCompileUnit
on that one CU.
This happens before the first run of CoroSplit.
The first run of CoroSplit will modify a lot of module level debug information.
But the first run of CoroSplit is not allowed to modify the result of the analysis.
Will the second run of CoroSplit, on the second CGSCC, use an outdated analysis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felipepiovezan yes, thanks for the summary! I want to do some extra validation. Will report back shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felipepiovezan I ran validation on a large chunk of our internal codebase with a patched version of LLVM that had an extra set of checks to compare cached DIFinder with the one built from scratch (roughly this diff). No mismatches found.
The line of reasoning here is that in the context of CoroSplit pass this analysis result (cached DIFinder) is not invalidated since:
- it doesn't add/remove debug info attached specifically to Compile Unit (this is what gets cached by running
DebugInfoFinder::processCompileUnit
; note that these are only things reachable directly from a CU, rather than everything inside the CU), - the contents of the cachedDIFinder get identity mapped and hence are unchanged by construction.
Seems safe to have this analysis on specifically for CoroSplit at least.
With that said, I had another idea. I refactored the code a bit more and it seems that whatever is going on now with the DIFinder and collecting IdentityMD metadata set for identity mapping basically reduces to a predicate check on metadata to
identity map compile units, types, other subprograms, and lexical blocks of other subprograms.
I have a stack that implements this. Most of the stack is refactoring to prepare for the change and then cleaning up after the change. The two important PRs are:
- [NFC][Cloning] Replace IdentityMD set with a predicate in ValueMapper #129147 (set.contains replaced with
predicate(metadata)
check in ValueMapper interface). - [NFC][Cloning] Replace DIFinder usage in CloneFunctionInto with a MetadataPredicate #129148 (a generic change to CloneFunction.cpp; a predicate based on set.contains with the set extracted via DIFinder is replaced with another predicate that just checks metadata kind). This is the only non-trivial transformation in the whole stack
All existing tests pass incl. cloning and coroutine tests, which is a good indication that #129148 is functionally equivalent to the current version. @felipepiovezan could you have a look at #129148? If looks good, I'd prefer this new version to the DebugInfoCacheAnalysis.
Dropping this in favor of #129148 |
Stacked PRs:
[Analysis] Add DebugInfoCache analysis
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute common debug info metadata (required for
coroutine function cloning) much faster. Specifically, this means paying the price of DICompileUnit
processing only once per compile unit, rather than once per coroutine.
Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm