Skip to content

[Analysis] Add DebugInfoCache analysis #118629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

artempyanykh
Copy link
Contributor

@artempyanykh artempyanykh commented Dec 4, 2024

Stacked PRs:


[Analysis] Add DebugInfoCache analysis

Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute common debug info metadata (required for
coroutine function cloning) much faster. Specifically, this means paying the price of DICompileUnit
processing only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute common debug info metadata (required for
coroutine function cloning) much faster. Specifically, this means paying the price of DICompileUnit
processing only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
@llvmbot
Copy link
Member

llvmbot commented Dec 4, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-coroutines
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-ir

Author: Artem Pianykh (artempyanykh)

Changes

Stacked PRs:

  • #118630
  • ->#118629
  • #118628
  • #118627
  • #118626
  • #118625
  • #118624
  • #118623
  • #118622
  • #118621
  • #118620

[Analysis] Add DebugInfoCache analysis

Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm


Full diff: https://github.com/llvm/llvm-project/pull/118629.diff

8 Files Affected:

  • (added) llvm/include/llvm/Analysis/DebugInfoCache.h (+50)
  • (modified) llvm/include/llvm/IR/DebugInfo.h (+3-1)
  • (modified) llvm/lib/Analysis/CMakeLists.txt (+1)
  • (added) llvm/lib/Analysis/DebugInfoCache.cpp (+47)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassRegistry.def (+1)
  • (modified) llvm/unittests/Analysis/CMakeLists.txt (+1)
  • (added) llvm/unittests/Analysis/DebugInfoCacheTest.cpp (+211)
diff --git a/llvm/include/llvm/Analysis/DebugInfoCache.h b/llvm/include/llvm/Analysis/DebugInfoCache.h
new file mode 100644
index 00000000000000..dbd6802c99ea01
--- /dev/null
+++ b/llvm/include/llvm/Analysis/DebugInfoCache.h
@@ -0,0 +1,50 @@
+//===- llvm/Analysis/DebugInfoCache.h - debug info cache --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains an analysis that builds a cache of debug info for each
+// DICompileUnit in a module.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_ANALYSIS_DEBUGINFOCACHE_H
+#define LLVM_ANALYSIS_DEBUGINFOCACHE_H
+
+#include "llvm/IR/DebugInfo.h"
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+/// Processes and caches debug info for each DICompileUnit in a module.
+///
+/// The result of the analysis is a set of DebugInfoFinders primed on their
+/// respective DICompileUnit. Such DebugInfoFinders can be used to speed up
+/// function cloning which otherwise requires an expensive traversal of
+/// DICompileUnit-level debug info. See an example usage in CoroSplit.
+class DebugInfoCache {
+public:
+  using DIFinderCache = SmallDenseMap<const DICompileUnit *, DebugInfoFinder>;
+  DIFinderCache Result;
+
+  DebugInfoCache(const Module &M);
+
+  bool invalidate(Module &, const PreservedAnalyses &,
+                  ModuleAnalysisManager::Invalidator &);
+};
+
+class DebugInfoCacheAnalysis
+    : public AnalysisInfoMixin<DebugInfoCacheAnalysis> {
+  friend AnalysisInfoMixin<DebugInfoCacheAnalysis>;
+  static AnalysisKey Key;
+
+public:
+  using Result = DebugInfoCache;
+  Result run(Module &M, ModuleAnalysisManager &);
+};
+} // namespace llvm
+
+#endif
diff --git a/llvm/include/llvm/IR/DebugInfo.h b/llvm/include/llvm/IR/DebugInfo.h
index 73f45c3769be44..11907fbb7f20b3 100644
--- a/llvm/include/llvm/IR/DebugInfo.h
+++ b/llvm/include/llvm/IR/DebugInfo.h
@@ -120,11 +120,13 @@ class DebugInfoFinder {
   /// Process subprogram.
   void processSubprogram(DISubprogram *SP);
 
+  /// Process a compile unit.
+  void processCompileUnit(DICompileUnit *CU);
+
   /// Clear all lists.
   void reset();
 
 private:
-  void processCompileUnit(DICompileUnit *CU);
   void processScope(DIScope *Scope);
   void processType(DIType *DT);
   bool addCompileUnit(DICompileUnit *CU);
diff --git a/llvm/lib/Analysis/CMakeLists.txt b/llvm/lib/Analysis/CMakeLists.txt
index 0db5b80f336cb5..db9a569e301563 100644
--- a/llvm/lib/Analysis/CMakeLists.txt
+++ b/llvm/lib/Analysis/CMakeLists.txt
@@ -52,6 +52,7 @@ add_llvm_component_library(LLVMAnalysis
   DDGPrinter.cpp
   ConstraintSystem.cpp
   Delinearization.cpp
+  DebugInfoCache.cpp
   DemandedBits.cpp
   DependenceAnalysis.cpp
   DependenceGraphBuilder.cpp
diff --git a/llvm/lib/Analysis/DebugInfoCache.cpp b/llvm/lib/Analysis/DebugInfoCache.cpp
new file mode 100644
index 00000000000000..c1a3e89f0a6ccf
--- /dev/null
+++ b/llvm/lib/Analysis/DebugInfoCache.cpp
@@ -0,0 +1,47 @@
+//===- llvm/Analysis/DebugInfoCache.cpp - debug info cache ----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains an analysis that builds a cache of debug info for each
+// DICompileUnit in a module.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/DebugInfoCache.h"
+#include "llvm/IR/Module.h"
+
+using namespace llvm;
+
+namespace {
+DebugInfoFinder processCompileUnit(DICompileUnit *CU) {
+  DebugInfoFinder DIFinder;
+  DIFinder.processCompileUnit(CU);
+
+  return DIFinder;
+}
+} // namespace
+
+DebugInfoCache::DebugInfoCache(const Module &M) {
+  for (const auto CU : M.debug_compile_units()) {
+    auto DIFinder = processCompileUnit(CU);
+    Result[CU] = std::move(DIFinder);
+  }
+}
+
+bool DebugInfoCache::invalidate(Module &M, const PreservedAnalyses &PA,
+                                ModuleAnalysisManager::Invalidator &) {
+  // Check whether the analysis has been explicitly invalidated. Otherwise, it's
+  // stateless and remains preserved.
+  auto PAC = PA.getChecker<DebugInfoCacheAnalysis>();
+  return !PAC.preservedWhenStateless();
+}
+
+AnalysisKey DebugInfoCacheAnalysis::Key;
+
+DebugInfoCache DebugInfoCacheAnalysis::run(Module &M, ModuleAnalysisManager &) {
+  return DebugInfoCache(M);
+}
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index ba52a37df9c25d..a900be130bdeb5 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -34,6 +34,7 @@
 #include "llvm/Analysis/DDGPrinter.h"
 #include "llvm/Analysis/DXILMetadataAnalysis.h"
 #include "llvm/Analysis/DXILResource.h"
+#include "llvm/Analysis/DebugInfoCache.h"
 #include "llvm/Analysis/Delinearization.h"
 #include "llvm/Analysis/DemandedBits.h"
 #include "llvm/Analysis/DependenceAnalysis.h"
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 7c3798f6462a46..d697c7c8ab83b8 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -21,6 +21,7 @@
 MODULE_ANALYSIS("callgraph", CallGraphAnalysis())
 MODULE_ANALYSIS("collector-metadata", CollectorMetadataAnalysis())
 MODULE_ANALYSIS("ctx-prof-analysis", CtxProfAnalysis())
+MODULE_ANALYSIS("debug-info-cache", DebugInfoCacheAnalysis())
 MODULE_ANALYSIS("dxil-metadata", DXILMetadataAnalysis())
 MODULE_ANALYSIS("dxil-resource", DXILResourceAnalysis())
 MODULE_ANALYSIS("inline-advisor", InlineAdvisorAnalysis())
diff --git a/llvm/unittests/Analysis/CMakeLists.txt b/llvm/unittests/Analysis/CMakeLists.txt
index 76d16513d93417..73694a1d6ba297 100644
--- a/llvm/unittests/Analysis/CMakeLists.txt
+++ b/llvm/unittests/Analysis/CMakeLists.txt
@@ -25,6 +25,7 @@ set(ANALYSIS_TEST_SOURCES
   ConstraintSystemTest.cpp
   CtxProfAnalysisTest.cpp
   DDGTest.cpp
+  DebugInfoCacheTest.cpp
   DomTreeUpdaterTest.cpp
   DXILResourceTest.cpp
   GraphWriterTest.cpp
diff --git a/llvm/unittests/Analysis/DebugInfoCacheTest.cpp b/llvm/unittests/Analysis/DebugInfoCacheTest.cpp
new file mode 100644
index 00000000000000..b32e4cb543158a
--- /dev/null
+++ b/llvm/unittests/Analysis/DebugInfoCacheTest.cpp
@@ -0,0 +1,211 @@
+//===- DebugInfoCacheTest.cpp - DebugInfoCache unit tests -----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/DebugInfoCache.h"
+#include "llvm/AsmParser/Parser.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Verifier.h"
+#include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/raw_ostream.h"
+#include "gtest/gtest.h"
+
+namespace llvm {
+namespace {
+
+// Forward declare the assembly
+extern StringRef MultiCUModule;
+
+const DICompileUnit *findCU(const Module &M, StringRef FileName) {
+  for (const auto CU : M.debug_compile_units()) {
+    if (CU->getFilename() == FileName)
+      return CU;
+  }
+
+  return nullptr;
+}
+
+class DebugInfoCacheTest : public testing::Test {
+protected:
+  LLVMContext C;
+
+  std::unique_ptr<Module> makeModule(StringRef Assembly) {
+    SMDiagnostic Err;
+    auto M = parseAssemblyString(Assembly, Err, C);
+    if (!M)
+      Err.print("DebugInfoCacheTest", errs());
+
+    verifyModule(*M, &errs());
+    return M;
+  }
+};
+
+TEST_F(DebugInfoCacheTest, TestEmpty) {
+  auto M = makeModule("");
+  DebugInfoCache DIC{*M};
+  EXPECT_EQ(DIC.Result.size(), 0u);
+}
+
+TEST_F(DebugInfoCacheTest, TestMultiCU) {
+  auto M = makeModule(MultiCUModule);
+  DebugInfoCache DIC{*M};
+  EXPECT_EQ(DIC.Result.size(), 2u);
+
+  auto *File1CU = findCU(*M, "file1.cpp");
+  EXPECT_NE(File1CU, nullptr);
+
+  auto File1DIFinder = DIC.Result.find(File1CU);
+  EXPECT_NE(File1DIFinder, DIC.Result.end());
+
+  EXPECT_EQ(File1DIFinder->getSecond().compile_unit_count(), 1u);
+  EXPECT_EQ(File1DIFinder->getSecond().type_count(), 6u);
+  EXPECT_EQ(File1DIFinder->getSecond().subprogram_count(), 0u);
+  EXPECT_EQ(File1DIFinder->getSecond().scope_count(), 1u);
+
+  auto *File2CU = findCU(*M, "file2.cpp");
+  EXPECT_NE(File1CU, nullptr);
+
+  auto File2DIFinder = DIC.Result.find(File2CU);
+  EXPECT_NE(File2DIFinder, DIC.Result.end());
+
+  EXPECT_EQ(File2DIFinder->getSecond().compile_unit_count(), 1u);
+  EXPECT_EQ(File2DIFinder->getSecond().type_count(), 2u);
+  EXPECT_EQ(File2DIFinder->getSecond().subprogram_count(), 0u);
+  EXPECT_EQ(File2DIFinder->getSecond().scope_count(), 2u);
+}
+
+/* Generated roughly by
+file1.cpp:
+struct file1_extern_type1;
+struct file1_extern_type2;
+
+namespace file1 {
+typedef struct file1_type1 { int x; float y; } file1_type1;
+file1_type1 global{0, 1.};
+} // file1
+
+extern struct file1_extern_type1 *file1_extern_func1(struct
+file1_extern_type2*);
+
+file1::file1_type1 file1_func1(file1::file1_type1 x) { return x; }
+--------
+file2.cpp:
+struct file2_extern_type1;
+struct file2_extern_type2;
+
+namespace file2 {
+typedef struct file2_type1 { float x; float y; } file2_type1;
+enum class file2_type2 { opt1, opt2 };
+
+namespace inner {
+file2_type2 inner_global{file2_type2::opt2};
+} // inner
+} // file2
+
+extern struct file2_extern_type1 *file2_extern_func1(struct
+file2_extern_type2*);
+
+file2::file2_type1 file2_func1(file2::file2_type1 x, file2::file2_type2 y) {
+return x; }
+--------
+$ clang -S -emit-llvm file*.cpp
+$ llvm-link -S -o single.ll file*.ll
+*/
+StringRef MultiCUModule = R"""(
+%"struct.file1::file1_type1" = type { i32, float }
+%"struct.file2::file2_type1" = type { float, float }
+
+@_ZN5file16globalE = dso_local global %"struct.file1::file1_type1" { i32 0, float 1.000000e+00 }, align 4, !dbg !0
+@_ZN5file25inner12inner_globalE = dso_local global i32 1, align 4, !dbg !11
+
+define dso_local i64 @_Z11file1_func1N5file111file1_type1E(i64 %0) !dbg !33 {
+  %2 = alloca %"struct.file1::file1_type1", align 4
+  %3 = alloca %"struct.file1::file1_type1", align 4
+  store i64 %0, ptr %3, align 4
+    #dbg_declare(ptr %3, !37, !DIExpression(), !38)
+  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %2, ptr align 4 %3, i64 8, i1 false), !dbg !39
+  %4 = load i64, ptr %2, align 4, !dbg !40
+  ret i64 %4, !dbg !40
+}
+
+declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg)
+
+define dso_local <2 x float> @_Z11file2_func1N5file211file2_type1ENS_11file2_type2E(<2 x float> %0, i32 noundef %1) !dbg !41 {
+  %3 = alloca %"struct.file2::file2_type1", align 4
+  %4 = alloca %"struct.file2::file2_type1", align 4
+  %5 = alloca i32, align 4
+  store <2 x float> %0, ptr %4, align 4
+    #dbg_declare(ptr %4, !49, !DIExpression(), !50)
+  store i32 %1, ptr %5, align 4
+    #dbg_declare(ptr %5, !51, !DIExpression(), !52)
+  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %3, ptr align 4 %4, i64 8, i1 false), !dbg !53
+  %6 = load <2 x float>, ptr %3, align 4, !dbg !54
+  ret <2 x float> %6, !dbg !54
+}
+
+!llvm.dbg.cu = !{!20, !22}
+!llvm.ident = !{!25, !25}
+!llvm.module.flags = !{!26, !27, !28, !29, !30, !31, !32}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "global", linkageName: "_ZN5file16globalE", scope: !2, file: !3, line: 6, type: !4, isLocal: false, isDefinition: true)
+!2 = !DINamespace(name: "file1", scope: null)
+!3 = !DIFile(filename: "file1.cpp", directory: "")
+!4 = !DIDerivedType(tag: DW_TAG_typedef, name: "file1_type1", scope: !2, file: !3, line: 5, baseType: !5)
+!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "file1_type1", scope: !2, file: !3, line: 5, size: 64, flags: DIFlagTypePassByValue, elements: !6, identifier: "_ZTSN5file111file1_type1E")
+!6 = !{!7, !9}
+!7 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !5, file: !3, line: 5, baseType: !8, size: 32)
+!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!9 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !5, file: !3, line: 5, baseType: !10, size: 32, offset: 32)
+!10 = !DIBasicType(name: "float", size: 32, encoding: DW_ATE_float)
+!11 = !DIGlobalVariableExpression(var: !12, expr: !DIExpression())
+!12 = distinct !DIGlobalVariable(name: "inner_global", linkageName: "_ZN5file25inner12inner_globalE", scope: !13, file: !15, line: 9, type: !16, isLocal: false, isDefinition: true)
+!13 = !DINamespace(name: "inner", scope: !14)
+!14 = !DINamespace(name: "file2", scope: null)
+!15 = !DIFile(filename: "file2.cpp", directory: "")
+!16 = distinct !DICompositeType(tag: DW_TAG_enumeration_type, name: "file2_type2", scope: !14, file: !15, line: 6, baseType: !8, size: 32, flags: DIFlagEnumClass, elements: !17, identifier: "_ZTSN5file211file2_type2E")
+!17 = !{!18, !19}
+!18 = !DIEnumerator(name: "opt1", value: 0)
+!19 = !DIEnumerator(name: "opt2", value: 1)
+!20 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !21, splitDebugInlining: false, nameTableKind: None)
+!21 = !{!0}
+!22 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !15, isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !23, globals: !24, splitDebugInlining: false, nameTableKind: None)
+!23 = !{!16}
+!24 = !{!11}
+!25 = !{!"clang"}
+!26 = !{i32 7, !"Dwarf Version", i32 5}
+!27 = !{i32 2, !"Debug Info Version", i32 3}
+!28 = !{i32 1, !"wchar_size", i32 4}
+!29 = !{i32 8, !"PIC Level", i32 2}
+!30 = !{i32 7, !"PIE Level", i32 2}
+!31 = !{i32 7, !"uwtable", i32 2}
+!32 = !{i32 7, !"frame-pointer", i32 2}
+!33 = distinct !DISubprogram(name: "file1_func1", linkageName: "_Z11file1_func1N5file111file1_type1E", scope: !3, file: !3, line: 11, type: !34, scopeLine: 11, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !20, retainedNodes: !36)
+!34 = !DISubroutineType(types: !35)
+!35 = !{!4, !4}
+!36 = !{}
+!37 = !DILocalVariable(name: "x", arg: 1, scope: !33, file: !3, line: 11, type: !4)
+!38 = !DILocation(line: 11, column: 51, scope: !33)
+!39 = !DILocation(line: 11, column: 63, scope: !33)
+!40 = !DILocation(line: 11, column: 56, scope: !33)
+!41 = distinct !DISubprogram(name: "file2_func1", linkageName: "_Z11file2_func1N5file211file2_type1ENS_11file2_type2E", scope: !15, file: !15, line: 15, type: !42, scopeLine: 15, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !22, retainedNodes: !36)
+!42 = !DISubroutineType(types: !43)
+!43 = !{!44, !44, !16}
+!44 = !DIDerivedType(tag: DW_TAG_typedef, name: "file2_type1", scope: !14, file: !15, line: 5, baseType: !45)
+!45 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "file2_type1", scope: !14, file: !15, line: 5, size: 64, flags: DIFlagTypePassByValue, elements: !46, identifier: "_ZTSN5file211file2_type1E")
+!46 = !{!47, !48}
+!47 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !45, file: !15, line: 5, baseType: !10, size: 32)
+!48 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !45, file: !15, line: 5, baseType: !10, size: 32, offset: 32)
+!49 = !DILocalVariable(name: "x", arg: 1, scope: !41, file: !15, line: 15, type: !44)
+!50 = !DILocation(line: 15, column: 51, scope: !41)
+!51 = !DILocalVariable(name: "y", arg: 2, scope: !41, file: !15, line: 15, type: !16)
+!52 = !DILocation(line: 15, column: 73, scope: !41)
+!53 = !DILocation(line: 15, column: 85, scope: !41)
+!54 = !DILocation(line: 15, column: 78, scope: !41)
+)""";
+} // namespace
+} // namespace llvm

jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this pull request Dec 4, 2024
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/9 branch from aa6401c to 24860c1 Compare December 6, 2024 12:44
artempyanykh added a commit that referenced this pull request Dec 6, 2024
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from bdb7970 to 22aaf36 Compare December 6, 2024 12:44
@artempyanykh artempyanykh changed the base branch from users/artempyanykh/fast-coro-upstream/9 to main December 6, 2024 14:03
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from 22aaf36 to b316815 Compare December 6, 2024 14:04
@artempyanykh artempyanykh changed the base branch from main to users/artempyanykh/fast-coro-upstream/9 December 6, 2024 14:05
jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this pull request Dec 6, 2024
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/9 branch from 48abbb2 to 77892ea Compare December 9, 2024 12:41
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from b743d6c to 40407df Compare December 16, 2024 22:31
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/9 branch from c613ae3 to a49247f Compare December 17, 2024 09:00
artempyanykh added a commit that referenced this pull request Dec 17, 2024
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from 40407df to 2190fe0 Compare December 17, 2024 09:00
@artempyanykh artempyanykh changed the base branch from users/artempyanykh/fast-coro-upstream/9 to main January 9, 2025 20:48
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from 2190fe0 to 1f8b67b Compare January 9, 2025 20:48
@artempyanykh artempyanykh changed the base branch from main to users/artempyanykh/fast-coro-upstream/9 January 9, 2025 20:48
jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this pull request Jan 10, 2025
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh changed the base branch from users/artempyanykh/fast-coro-upstream/9 to main January 23, 2025 21:15
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from 1f8b67b to aaeec0b Compare January 23, 2025 21:16
@artempyanykh artempyanykh changed the base branch from main to users/artempyanykh/fast-coro-upstream/9 January 23, 2025 21:16
@artempyanykh artempyanykh changed the base branch from users/artempyanykh/fast-coro-upstream/9 to main January 23, 2025 21:55
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from aaeec0b to 5402d09 Compare January 23, 2025 21:55
@artempyanykh artempyanykh changed the base branch from main to users/artempyanykh/fast-coro-upstream/9 January 23, 2025 21:56
jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this pull request Jan 24, 2025
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: llvm/llvm-project#118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/9 branch from 83dea70 to 5bbda30 Compare January 24, 2025 08:39
artempyanykh added a commit that referenced this pull request Jan 24, 2025
Summary:
The analysis simply primes and caches DebugInfoFinders for each DICompileUnit in a module. This
allows (future) callers like CoroSplitPass to compute global debug info metadata (required for
coroutine function cloning) much faster. Specifically, pay the price of DICompileUnit processing
only once per compile unit, rather than once per coroutine.

Test Plan:
Added a smoke test for the new analysis
ninja check-llvm-unit check-llvm

stack-info: PR: #118629, branch: users/artempyanykh/fast-coro-upstream/10
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from 5402d09 to 54bc13d Compare January 24, 2025 08:40
@artempyanykh artempyanykh changed the base branch from users/artempyanykh/fast-coro-upstream/9 to main January 24, 2025 09:57
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from 54bc13d to f727512 Compare January 24, 2025 09:57
@artempyanykh artempyanykh changed the base branch from main to users/artempyanykh/fast-coro-upstream/9 January 24, 2025 09:57
Base automatically changed from users/artempyanykh/fast-coro-upstream/9 to main January 24, 2025 11:42
@artempyanykh artempyanykh force-pushed the users/artempyanykh/fast-coro-upstream/10 branch from f727512 to 7d38fe3 Compare January 24, 2025 11:43
@artempyanykh
Copy link
Contributor Author

@felipepiovezan, @TylerNowicki could you please have a look at #118629 and #118630? These are the 2 remaining bits of #109032.

// Check whether the analysis has been explicitly invalidated. Otherwise, it's
// stateless and remains preserved.
auto PAC = PA.getChecker<DebugInfoCacheAnalysis>();
return !PAC.preservedWhenStateless();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this invalidating the analysis when the module changes?

Copy link
Contributor Author

@artempyanykh artempyanykh Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question! I would appreciate help/guidance here, @felipepiovezan. IIUC passes that may formally change the module won't necessarily invalidate the analysis:

  • changing globals, enum types, imported entities in a CU would invalidate it
  • adding/removing CUs would too.

However, I'm not sure what's the proper way to express this here.

Copy link
Contributor

@felipepiovezan felipepiovezan Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the docs in https://llvm.org/docs/NewPassManager.html#implementing-analysis-invalidation

What you wrote seems to say that your analysis is only invalidated when explicitly invalidated, which is certainly not true (see "If an analysis is stateless and generally shouldn’t be invalidated, use the following:"). Your analysis is invalidated whenever debug metadata is changed/added/removed by any pass.

I think we should simply not implement this method, which implies the analysis is invalidated whenever the module changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it's still a bit fuzzy, but I think I understand better now. I was worried about unnecessary invalidations, but all CoroSplit passes will share the same analysis results as part of CGSCC pipeline. So, it's ok to have the default invalidate impl.

I'll push the fixup momentarily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felipepiovezan could you please have a look at #118630 where this analysis is actually wired in?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is the copy needed? A const reference would not suffice?

 // Copy DIFinder from cache which is primed on F's compile unit when available
   auto *PrimedDIFinder = cachedDIFinder(F, DICache);
   if (PrimedDIFinder)
     DIFinder = *PrimedDIFinder;

Copy link
Contributor Author

@artempyanykh artempyanykh Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for comments and questions @felipepiovezan! Looks like the question is whether CoroSplit would invalidate analysis results. There are tests for coroutine transforms that check debug info, but let me try to write a more targeted test.

This is not really an area I'm very familiar with, but there are some things that I found odd about that PR. For example, why do we need this?

  // Prime DebugInfoCache.
   // TODO: Currently, the only user is CoroSplitPass. Consider running
   // conditionally.
   AM.getResult<DebugInfoCacheAnalysis>(M);

This is based on my understanding of this part of the manual. Inner level passes (like CGSCC-level) can only ask for a cached result of an outer level analysis (like Module-level). They can't ask to compute outer analysis results on demand.

Unless I explicitly run the analysis there, there won't be any results to consume from inside CoroSplit.

The 'conditional' part is based on what I saw in CoroConditionalWrapper. Since CoroSplit is the only user, we could gate the analysis. OTOH, gating could prevent the adoption of the analysis in other places. So, unclear.

I was worried about unnecessary invalidations, but all CoroSplit passes will share the same analysis results as part of CGSCC pipeline.

Isn't this dangerous though? Each CoroSplit pass will modify the Module and its metadata, so the analysis need to be invalidated after every run of CoroSplit

Let me try to write a test for it.

the pass ends with a return PreservedAnalyses::none();

I think this is for a specific unit of IR (SCC in this case)? In traces that I get the analysis runs once per ModuleToPostOrderCGSCCPassAdaptor.

This implies we're doing a lot of unnecessary work that will be invalidated.
Should this new analysis be lazy?

Ideally! But given the caveats in the doc about on demand computations and "future concurrency", I'm not sure what's the best way to proceed; and if it'll be worth it. In my testing even on our larger sources (that otherwise take a few minutes to compile) the pass takes 60ms.

I.e. only visit a CU when it is queried on it.

This would also depend on how common multi-CU modules are.

Also, why is the copy needed? A const reference would not suffice?

 // Copy DIFinder from cache which is primed on F's compile unit when available
   auto *PrimedDIFinder = cachedDIFinder(F, DICache);
   if (PrimedDIFinder)
     DIFinder = *PrimedDIFinder;

DIFinder is mutable, it gets hydrated by visiting debug info attached to an element, but it won't revisit what's already been visited. For a function:

  1. it visits debug info attached to function's DISubprogram,
  2. it visits debug into attached to function's CU (see this).

We use PrimedDIFinder hydrated on a CU to cut the expensive p.2 but we can't pass it by reference (and copy instead) because it gets mutated down the line when doing p.1. It'd be great to refactor function cloning more, but at this point I'd like to validate the caching part.

Copy link
Contributor

@felipepiovezan felipepiovezan Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to summarize what may or may not be an issue here. Imagine a module with two CGSCCs, but one CU.

We have a module level analysis that calls DebugInfoFinder::processCompileUnit on that one CU.
This happens before the first run of CoroSplit.
The first run of CoroSplit will modify a lot of module level debug information.
But the first run of CoroSplit is not allowed to modify the result of the analysis.

Will the second run of CoroSplit, on the second CGSCC, use an outdated analysis?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felipepiovezan yes, thanks for the summary! I want to do some extra validation. Will report back shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felipepiovezan I ran validation on a large chunk of our internal codebase with a patched version of LLVM that had an extra set of checks to compare cached DIFinder with the one built from scratch (roughly this diff). No mismatches found.

The line of reasoning here is that in the context of CoroSplit pass this analysis result (cached DIFinder) is not invalidated since:

  • it doesn't add/remove debug info attached specifically to Compile Unit (this is what gets cached by running DebugInfoFinder::processCompileUnit; note that these are only things reachable directly from a CU, rather than everything inside the CU),
  • the contents of the cachedDIFinder get identity mapped and hence are unchanged by construction.

Seems safe to have this analysis on specifically for CoroSplit at least.


With that said, I had another idea. I refactored the code a bit more and it seems that whatever is going on now with the DIFinder and collecting IdentityMD metadata set for identity mapping basically reduces to a predicate check on metadata to

identity map compile units, types, other subprograms, and lexical blocks of other subprograms.

I have a stack that implements this. Most of the stack is refactoring to prepare for the change and then cleaning up after the change. The two important PRs are:

All existing tests pass incl. cloning and coroutine tests, which is a good indication that #129148 is functionally equivalent to the current version. @felipepiovezan could you have a look at #129148? If looks good, I'd prefer this new version to the DebugInfoCacheAnalysis.

@artempyanykh
Copy link
Contributor Author

Dropping this in favor of #129148

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants