Skip to content

[clang][deps] Store common, partially-formed invocation #65677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 7, 2023

Conversation

jansvoboda11
Copy link
Contributor

We create one CompilerInvocation for each modular dependency we discover. This means we create a lot of copies, even though most of the invocation is the same between modules. This patch makes use of the copy-on-write flavor of CompilerInvocation to share the common parts, reducing memory usage and speeding up the scan.

@jansvoboda11 jansvoboda11 requested a review from a team as a code owner September 7, 2023 21:20
CI.getHeaderSearchOpts().ModulesIgnoreMacros.clear();
}

return CI;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going through CowCompilerInvocation(const CompilerInvocation &X), right? Could we add an && version that steals the sub objects without copying them? Obviously less important than avoiding the per-module copies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'm happy to do that in a follow-up (constantly rebasing is kinda annoying).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented in #66301.

@@ -117,14 +112,37 @@ ModuleDepCollector::makeInvocationForModuleBuildWithoutOutputs(
CI.getFrontendOpts().ARCMTAction = FrontendOptions::ARCMT_None;
CI.getFrontendOpts().ObjCMTAction = FrontendOptions::ObjCMT_None;
CI.getFrontendOpts().MTMigrateDir.clear();
CI.getLangOpts().ModuleName = Deps.ID.ModuleName;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could clear this string to make sure it won't get copied.

@jansvoboda11 jansvoboda11 merged commit 9208065 into llvm:main Sep 7, 2023
@jansvoboda11 jansvoboda11 deleted the common-ci branch September 7, 2023 22:49
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Sep 12, 2023
We create one `CompilerInvocation` for each modular dependency we
discover. This means we create a lot of copies, even though most of the
invocation is the same between modules. This patch makes use of the
copy-on-write flavor of `CompilerInvocation` to share the common parts,
reducing memory usage and speeding up the scan.
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Oct 5, 2023
We create one `CompilerInvocation` for each modular dependency we
discover. This means we create a lot of copies, even though most of the
invocation is the same between modules. This patch makes use of the
copy-on-write flavor of `CompilerInvocation` to share the common parts,
reducing memory usage and speeding up the scan.
@Endilll Endilll added the clang Clang issues not falling into any other category label Jul 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 14, 2025

@llvm/pr-subscribers-clang

Author: Jan Svoboda (jansvoboda11)

Changes

We create one CompilerInvocation for each modular dependency we discover. This means we create a lot of copies, even though most of the invocation is the same between modules. This patch makes use of the copy-on-write flavor of CompilerInvocation to share the common parts, reducing memory usage and speeding up the scan.


Full diff: https://github.com/llvm/llvm-project/pull/65677.diff

2 Files Affected:

  • (modified) clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h (+14-9)
  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+71-47)
diff --git a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 914d55eadefe8..ef75c58055218 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -241,8 +241,11 @@ class ModuleDepCollector final : public DependencyCollector {
   llvm::SetVector<const Module *> DirectModularDeps;
   /// Options that control the dependency output generation.
   std::unique_ptr<DependencyOutputOptions> Opts;
-  /// The original Clang invocation passed to dependency scanner.
-  CompilerInvocation OriginalInvocation;
+  /// A Clang invocation that's based on the original TU invocation and that has
+  /// been partially transformed into one that can perform explicit build of
+  /// a discovered modular dependency. Note that this still needs to be adjusted
+  /// for each individual module.
+  CowCompilerInvocation CommonInvocation;
   /// Whether to optimize the modules' command-line arguments.
   bool OptimizeArgs;
   /// Whether to set up command-lines to load PCM files eagerly.
@@ -262,12 +265,11 @@ class ModuleDepCollector final : public DependencyCollector {
   /// Adds \p Path to \c MD.FileDeps, making it absolute if necessary.
   void addFileDep(ModuleDeps &MD, StringRef Path);
 
-  /// Constructs a CompilerInvocation that can be used to build the given
-  /// module, excluding paths to discovered modular dependencies that are yet to
-  /// be built.
-  CompilerInvocation makeInvocationForModuleBuildWithoutOutputs(
+  /// Get a Clang invocation adjusted to build the given modular dependency.
+  /// This excludes paths that are yet-to-be-provided by the build system.
+  CowCompilerInvocation getInvocationAdjustedForModuleBuildWithoutOutputs(
       const ModuleDeps &Deps,
-      llvm::function_ref<void(CompilerInvocation &)> Optimize) const;
+      llvm::function_ref<void(CowCompilerInvocation &)> Optimize) const;
 
   /// Collect module map files for given modules.
   llvm::DenseSet<const FileEntry *>
@@ -279,13 +281,16 @@ class ModuleDepCollector final : public DependencyCollector {
   /// Add module files (pcm) to the invocation, if needed.
   void addModuleFiles(CompilerInvocation &CI,
                       ArrayRef<ModuleID> ClangModuleDeps) const;
+  void addModuleFiles(CowCompilerInvocation &CI,
+                      ArrayRef<ModuleID> ClangModuleDeps) const;
 
   /// Add paths that require looking up outputs to the given dependencies.
-  void addOutputPaths(CompilerInvocation &CI, ModuleDeps &Deps);
+  void addOutputPaths(CowCompilerInvocation &CI, ModuleDeps &Deps);
 
   /// Compute the context hash for \p Deps, and create the mapping
   /// \c ModuleDepsByID[Deps.ID] = &Deps.
-  void associateWithContextHash(const CompilerInvocation &CI, ModuleDeps &Deps);
+  void associateWithContextHash(const CowCompilerInvocation &CI,
+                                ModuleDeps &Deps);
 };
 
 } // end namespace dependencies
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index e13f7c74e9b92..a7248860ad4b5 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -54,18 +54,18 @@ static std::vector<std::string> splitString(std::string S, char Separator) {
   return Result;
 }
 
-void ModuleDepCollector::addOutputPaths(CompilerInvocation &CI,
+void ModuleDepCollector::addOutputPaths(CowCompilerInvocation &CI,
                                         ModuleDeps &Deps) {
-  CI.getFrontendOpts().OutputFile =
+  CI.getMutFrontendOpts().OutputFile =
       Controller.lookupModuleOutput(Deps.ID, ModuleOutputKind::ModuleFile);
   if (!CI.getDiagnosticOpts().DiagnosticSerializationFile.empty())
-    CI.getDiagnosticOpts().DiagnosticSerializationFile =
+    CI.getMutDiagnosticOpts().DiagnosticSerializationFile =
         Controller.lookupModuleOutput(
             Deps.ID, ModuleOutputKind::DiagnosticSerializationFile);
   if (!CI.getDependencyOutputOpts().OutputFile.empty()) {
-    CI.getDependencyOutputOpts().OutputFile = Controller.lookupModuleOutput(
+    CI.getMutDependencyOutputOpts().OutputFile = Controller.lookupModuleOutput(
         Deps.ID, ModuleOutputKind::DependencyFile);
-    CI.getDependencyOutputOpts().Targets =
+    CI.getMutDependencyOutputOpts().Targets =
         splitString(Controller.lookupModuleOutput(
                         Deps.ID, ModuleOutputKind::DependencyTargets),
                     '\0');
@@ -74,18 +74,13 @@ void ModuleDepCollector::addOutputPaths(CompilerInvocation &CI,
       // Fallback to -o as dependency target, as in the driver.
       SmallString<128> Target;
       quoteMakeTarget(CI.getFrontendOpts().OutputFile, Target);
-      CI.getDependencyOutputOpts().Targets.push_back(std::string(Target));
+      CI.getMutDependencyOutputOpts().Targets.push_back(std::string(Target));
     }
   }
 }
 
-CompilerInvocation
-ModuleDepCollector::makeInvocationForModuleBuildWithoutOutputs(
-    const ModuleDeps &Deps,
-    llvm::function_ref<void(CompilerInvocation &)> Optimize) const {
-  // Make a deep copy of the original Clang invocation.
-  CompilerInvocation CI(OriginalInvocation);
-
+static CowCompilerInvocation
+makeCommonInvocationForModuleBuild(CompilerInvocation CI) {
   CI.resetNonModularOptions();
   CI.clearImplicitModuleBuildOptions();
 
@@ -117,14 +112,38 @@ ModuleDepCollector::makeInvocationForModuleBuildWithoutOutputs(
   CI.getFrontendOpts().ARCMTAction = FrontendOptions::ARCMT_None;
   CI.getFrontendOpts().ObjCMTAction = FrontendOptions::ObjCMT_None;
   CI.getFrontendOpts().MTMigrateDir.clear();
-  CI.getLangOpts().ModuleName = Deps.ID.ModuleName;
-  CI.getFrontendOpts().IsSystemModule = Deps.IsSystem;
+  CI.getLangOpts().ModuleName.clear();
+
+  // Remove any macro definitions that are explicitly ignored.
+  if (!CI.getHeaderSearchOpts().ModulesIgnoreMacros.empty()) {
+    llvm::erase_if(
+        CI.getPreprocessorOpts().Macros,
+        [&CI](const std::pair<std::string, bool> &Def) {
+          StringRef MacroDef = Def.first;
+          return CI.getHeaderSearchOpts().ModulesIgnoreMacros.contains(
+              llvm::CachedHashString(MacroDef.split('=').first));
+        });
+    // Remove the now unused option.
+    CI.getHeaderSearchOpts().ModulesIgnoreMacros.clear();
+  }
+
+  return CI;
+}
+
+CowCompilerInvocation
+ModuleDepCollector::getInvocationAdjustedForModuleBuildWithoutOutputs(
+    const ModuleDeps &Deps,
+    llvm::function_ref<void(CowCompilerInvocation &)> Optimize) const {
+  CowCompilerInvocation CI = CommonInvocation;
+
+  CI.getMutLangOpts().ModuleName = Deps.ID.ModuleName;
+  CI.getMutFrontendOpts().IsSystemModule = Deps.IsSystem;
 
   // Inputs
   InputKind ModuleMapInputKind(CI.getFrontendOpts().DashX.getLanguage(),
                                InputKind::Format::ModuleMap);
-  CI.getFrontendOpts().Inputs.emplace_back(Deps.ClangModuleMapFile,
-                                           ModuleMapInputKind);
+  CI.getMutFrontendOpts().Inputs.emplace_back(Deps.ClangModuleMapFile,
+                                              ModuleMapInputKind);
 
   auto CurrentModuleMapEntry =
       ScanInstance.getFileManager().getFile(Deps.ClangModuleMapFile);
@@ -150,36 +169,25 @@ ModuleDepCollector::makeInvocationForModuleBuildWithoutOutputs(
         !DepModuleMapFiles.contains(*ModuleMapEntry))
       continue;
 
-    CI.getFrontendOpts().ModuleMapFiles.emplace_back(ModuleMapFile);
+    CI.getMutFrontendOpts().ModuleMapFiles.emplace_back(ModuleMapFile);
   }
 
   // Report the prebuilt modules this module uses.
   for (const auto &PrebuiltModule : Deps.PrebuiltModuleDeps)
-    CI.getFrontendOpts().ModuleFiles.push_back(PrebuiltModule.PCMFile);
+    CI.getMutFrontendOpts().ModuleFiles.push_back(PrebuiltModule.PCMFile);
 
   // Add module file inputs from dependencies.
   addModuleFiles(CI, Deps.ClangModuleDeps);
 
-  // Remove any macro definitions that are explicitly ignored.
-  if (!CI.getHeaderSearchOpts().ModulesIgnoreMacros.empty()) {
-    llvm::erase_if(
-        CI.getPreprocessorOpts().Macros,
-        [&CI](const std::pair<std::string, bool> &Def) {
-          StringRef MacroDef = Def.first;
-          return CI.getHeaderSearchOpts().ModulesIgnoreMacros.contains(
-              llvm::CachedHashString(MacroDef.split('=').first));
-        });
-    // Remove the now unused option.
-    CI.getHeaderSearchOpts().ModulesIgnoreMacros.clear();
+  if (!CI.getDiagnosticOpts().SystemHeaderWarningsModules.empty()) {
+    // Apply -Wsystem-headers-in-module for the current module.
+    if (llvm::is_contained(CI.getDiagnosticOpts().SystemHeaderWarningsModules,
+                           Deps.ID.ModuleName))
+      CI.getMutDiagnosticOpts().Warnings.push_back("system-headers");
+    // Remove the now unused option(s).
+    CI.getMutDiagnosticOpts().SystemHeaderWarningsModules.clear();
   }
 
-  // Apply -Wsystem-headers-in-module for the current module.
-  if (llvm::is_contained(CI.getDiagnosticOpts().SystemHeaderWarningsModules,
-                         Deps.ID.ModuleName))
-    CI.getDiagnosticOpts().Warnings.push_back("system-headers");
-  // Remove the now unused option(s).
-  CI.getDiagnosticOpts().SystemHeaderWarningsModules.clear();
-
   Optimize(CI);
 
   return CI;
@@ -224,6 +232,19 @@ void ModuleDepCollector::addModuleFiles(
   }
 }
 
+void ModuleDepCollector::addModuleFiles(
+    CowCompilerInvocation &CI, ArrayRef<ModuleID> ClangModuleDeps) const {
+  for (const ModuleID &MID : ClangModuleDeps) {
+    std::string PCMPath =
+        Controller.lookupModuleOutput(MID, ModuleOutputKind::ModuleFile);
+    if (EagerLoadModules)
+      CI.getMutFrontendOpts().ModuleFiles.push_back(std::move(PCMPath));
+    else
+      CI.getMutHeaderSearchOpts().PrebuiltModuleFiles.insert(
+          {MID.ModuleName, std::move(PCMPath)});
+  }
+}
+
 static bool needsModules(FrontendInputFile FIF) {
   switch (FIF.getKind().getLanguage()) {
   case Language::Unknown:
@@ -264,7 +285,7 @@ void ModuleDepCollector::applyDiscoveredDependencies(CompilerInvocation &CI) {
 }
 
 static std::string getModuleContextHash(const ModuleDeps &MD,
-                                        const CompilerInvocation &CI,
+                                        const CowCompilerInvocation &CI,
                                         bool EagerLoadModules) {
   llvm::HashBuilder<llvm::TruncatedBLAKE3<16>,
                     llvm::support::endianness::native>
@@ -304,8 +325,8 @@ static std::string getModuleContextHash(const ModuleDeps &MD,
   return toString(llvm::APInt(sizeof(Words) * 8, Words), 36, /*Signed=*/false);
 }
 
-void ModuleDepCollector::associateWithContextHash(const CompilerInvocation &CI,
-                                                  ModuleDeps &Deps) {
+void ModuleDepCollector::associateWithContextHash(
+    const CowCompilerInvocation &CI, ModuleDeps &Deps) {
   Deps.ID.ContextHash = getModuleContextHash(Deps, CI, EagerLoadModules);
   bool Inserted = ModuleDepsByID.insert({Deps.ID, &Deps}).second;
   (void)Inserted;
@@ -498,12 +519,13 @@ ModuleDepCollectorPP::handleTopLevelModule(const Module *M) {
         MD.ModuleMapFileDeps.emplace_back(IFI.FilenameAsRequested);
       });
 
-  CompilerInvocation CI = MDC.makeInvocationForModuleBuildWithoutOutputs(
-      MD, [&](CompilerInvocation &BuildInvocation) {
-        if (MDC.OptimizeArgs)
-          optimizeHeaderSearchOpts(BuildInvocation.getHeaderSearchOpts(),
-                                   *MDC.ScanInstance.getASTReader(), *MF);
-      });
+  CowCompilerInvocation CI =
+      MDC.getInvocationAdjustedForModuleBuildWithoutOutputs(
+          MD, [&](CowCompilerInvocation &BuildInvocation) {
+            if (MDC.OptimizeArgs)
+              optimizeHeaderSearchOpts(BuildInvocation.getMutHeaderSearchOpts(),
+                                       *MDC.ScanInstance.getASTReader(), *MF);
+          });
 
   MDC.associateWithContextHash(CI, MD);
 
@@ -601,7 +623,9 @@ ModuleDepCollector::ModuleDepCollector(
     DependencyActionController &Controller, CompilerInvocation OriginalCI,
     bool OptimizeArgs, bool EagerLoadModules, bool IsStdModuleP1689Format)
     : ScanInstance(ScanInstance), Consumer(C), Controller(Controller),
-      Opts(std::move(Opts)), OriginalInvocation(std::move(OriginalCI)),
+      Opts(std::move(Opts)),
+      CommonInvocation(
+          makeCommonInvocationForModuleBuild(std::move(OriginalCI))),
       OptimizeArgs(OptimizeArgs), EagerLoadModules(EagerLoadModules),
       IsStdModuleP1689Format(IsStdModuleP1689Format) {}
 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants