Skip to content

IR: Store the default subtarget info in named metadata instead of the context. #98673

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

pcc
Copy link
Contributor

@pcc pcc commented Jul 12, 2024

No description provided.

Created using spr 1.3.6-beta.1
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. LTO Link time optimization (regular/full LTO or ThinLTO) llvm:ir labels Jul 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 12, 2024

@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-lto
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: None (pcc)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/98673.diff

8 Files Affected:

  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+2)
  • (modified) clang/lib/CodeGen/CodeGenAction.cpp (-6)
  • (modified) llvm/include/llvm/IR/LLVMContext.h (-16)
  • (modified) llvm/include/llvm/IR/Module.h (+18)
  • (modified) llvm/lib/IR/Function.cpp (+2-2)
  • (modified) llvm/lib/IR/LLVMContext.cpp (-16)
  • (modified) llvm/lib/IR/Module.cpp (+35)
  • (modified) llvm/lib/Linker/IRMover.cpp (+11)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index e765bbf637a66..5654530b268ca 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -1337,6 +1337,8 @@ void clang::EmitBackendOutput(
     }
   }
 
+  M->setDefaultSubtargetInfo(TOpts.CPU, llvm::join(TOpts.Features, ","));
+
   EmitAssemblyHelper AsmHelper(Diags, HeaderOpts, CGOpts, TOpts, LOpts, M, VFS);
   AsmHelper.EmitAssembly(Action, std::move(OS), BC);
 
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index e87226e60297c..6ffbbb5da0041 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -294,9 +294,6 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) {
   Ctx.setDiagnosticHandler(std::make_unique<ClangDiagnosticHandler>(
       CodeGenOpts, this));
 
-  Ctx.setDefaultTargetCPU(TargetOpts.CPU);
-  Ctx.setDefaultTargetFeatures(llvm::join(TargetOpts.Features, ","));
-
   Expected<std::unique_ptr<llvm::ToolOutputFile>> OptRecordFileOrErr =
     setupLLVMOptimizationRemarks(
       Ctx, CodeGenOpts.OptRecordFile, CodeGenOpts.OptRecordPasses,
@@ -1203,9 +1200,6 @@ void CodeGenAction::ExecuteAction() {
   Ctx.setDiagnosticHandler(
       std::make_unique<ClangDiagnosticHandler>(CodeGenOpts, &Result));
 
-  Ctx.setDefaultTargetCPU(TargetOpts.CPU);
-  Ctx.setDefaultTargetFeatures(llvm::join(TargetOpts.Features, ","));
-
   Expected<std::unique_ptr<llvm::ToolOutputFile>> OptRecordFileOrErr =
       setupLLVMOptimizationRemarks(
           Ctx, CodeGenOpts.OptRecordFile, CodeGenOpts.OptRecordPasses,
diff --git a/llvm/include/llvm/IR/LLVMContext.h b/llvm/include/llvm/IR/LLVMContext.h
index 6ffa2bdaa319a..89ad6f1572c67 100644
--- a/llvm/include/llvm/IR/LLVMContext.h
+++ b/llvm/include/llvm/IR/LLVMContext.h
@@ -327,22 +327,6 @@ class LLVMContext {
   [[deprecated("Always returns false")]]
   bool supportsTypedPointers() const;
 
-  /// Get or set the current "default" target CPU (target-cpu function
-  /// attribute). The intent is that compiler frontends will set this to a value
-  /// that reflects the attribute that a function would get "by default" without
-  /// any specific function attributes, and compiler passes will attach the
-  /// attribute to newly created functions that are not associated with a
-  /// particular function, such as global initializers.
-  /// Function::createWithDefaultAttr() will create functions with this
-  /// attribute. This function should only be called by passes that run at
-  /// compile time and not by the backend or LTO passes.
-  StringRef getDefaultTargetCPU();
-  void setDefaultTargetCPU(StringRef CPU);
-
-  /// Similar to {get,set}DefaultTargetCPU() but for default target-features.
-  StringRef getDefaultTargetFeatures();
-  void setDefaultTargetFeatures(StringRef Features);
-
 private:
   // Module needs access to the add/removeModule methods.
   friend class Module;
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h
index d2b2fe40b1ada..b390cb156aa5b 100644
--- a/llvm/include/llvm/IR/Module.h
+++ b/llvm/include/llvm/IR/Module.h
@@ -1064,6 +1064,24 @@ class LLVM_EXTERNAL_VISIBILITY Module {
 
   /// Set the target variant version build SDK version metadata.
   void setDarwinTargetVariantSDKVersion(VersionTuple Version);
+
+  /// Get the current "default" target CPU (target-cpu function
+  /// attribute). The intent is that compiler frontends will set this to a value
+  /// that reflects the attribute that a function would get "by default" without
+  /// any specific function attributes, and compiler passes will attach the
+  /// attribute to newly created functions that are not associated with a
+  /// particular function, such as global initializers.
+  /// Function::createWithDefaultAttr() will create functions with this
+  /// attribute. This function should only be called by passes that run at
+  /// compile time and not by the backend or LTO passes.
+  StringRef getDefaultTargetCPU();
+
+  /// Similar to getDefaultTargetCPU() but for default target-features.
+  StringRef getDefaultTargetFeatures();
+
+  /// Set the data returned by getDefaultTargetCPU() and
+  /// getDefaultTargetFeatures().
+  void setDefaultSubtargetInfo(StringRef CPU, StringRef Features);
 };
 
 /// Given "llvm.used" or "llvm.compiler.used" as a global name, collect the
diff --git a/llvm/lib/IR/Function.cpp b/llvm/lib/IR/Function.cpp
index 5fb348a8bbcd4..8fdb564d8afec 100644
--- a/llvm/lib/IR/Function.cpp
+++ b/llvm/lib/IR/Function.cpp
@@ -401,10 +401,10 @@ Function *Function::createWithDefaultAttr(FunctionType *Ty,
   }
   if (M->getModuleFlag("function_return_thunk_extern"))
     B.addAttribute(Attribute::FnRetThunkExtern);
-  StringRef DefaultCPU = F->getContext().getDefaultTargetCPU();
+  StringRef DefaultCPU = M->getDefaultTargetCPU();
   if (!DefaultCPU.empty())
     B.addAttribute("target-cpu", DefaultCPU);
-  StringRef DefaultFeatures = F->getContext().getDefaultTargetFeatures();
+  StringRef DefaultFeatures = M->getDefaultTargetFeatures();
   if (!DefaultFeatures.empty())
     B.addAttribute("target-features", DefaultFeatures);
   F->addFnAttrs(B);
diff --git a/llvm/lib/IR/LLVMContext.cpp b/llvm/lib/IR/LLVMContext.cpp
index 194c7e7581cfb..8120cccace40b 100644
--- a/llvm/lib/IR/LLVMContext.cpp
+++ b/llvm/lib/IR/LLVMContext.cpp
@@ -390,19 +390,3 @@ void LLVMContext::setOpaquePointers(bool Enable) const {
 bool LLVMContext::supportsTypedPointers() const {
   return false;
 }
-
-StringRef LLVMContext::getDefaultTargetCPU() {
-  return pImpl->DefaultTargetCPU;
-}
-
-void LLVMContext::setDefaultTargetCPU(StringRef CPU) {
-  pImpl->DefaultTargetCPU = CPU;
-}
-
-StringRef LLVMContext::getDefaultTargetFeatures() {
-  return pImpl->DefaultTargetFeatures;
-}
-
-void LLVMContext::setDefaultTargetFeatures(StringRef Features) {
-  pImpl->DefaultTargetFeatures = Features;
-}
diff --git a/llvm/lib/IR/Module.cpp b/llvm/lib/IR/Module.cpp
index c966c53d09baf..e315f01be61ab 100644
--- a/llvm/lib/IR/Module.cpp
+++ b/llvm/lib/IR/Module.cpp
@@ -44,6 +44,7 @@
 #include "llvm/Support/Path.h"
 #include "llvm/Support/RandomNumberGenerator.h"
 #include "llvm/Support/VersionTuple.h"
+#include "llvm/TargetParser/Triple.h"
 #include <algorithm>
 #include <cassert>
 #include <cstdint>
@@ -903,3 +904,37 @@ VersionTuple Module::getDarwinTargetVariantSDKVersion() const {
 void Module::setDarwinTargetVariantSDKVersion(VersionTuple Version) {
   addSDKVersionMD(Version, *this, "darwin.target_variant.SDK Version");
 }
+
+void Module::setDefaultSubtargetInfo(StringRef CPU, StringRef Features) {
+  auto *MD = getOrInsertNamedMetadata("llvm.subtarget.info");
+  MD->clearOperands();
+  MD->addOperand(
+      MDNode::get(getContext(), {MDString::get(getContext(), CPU),
+                                 MDString::get(getContext(), Features)}));
+}
+
+StringRef Module::getDefaultTargetCPU() {
+  auto *MD = getNamedMetadata("llvm.subtarget.info");
+  // FIXME: This function is only meant to be called pre-LTO, but the AMDGPU and
+  // NVPTX backends end up calling this function in their CtorDtorLowering
+  // passes. For now we just return an empty string to avoid breaking them.
+  if (!MD) {
+    Triple T(getTargetTriple());
+    if (T.isAMDGPU() || T.isNVPTX())
+      return "";
+    report_fatal_error("Invalid call to getDefaultTargetCPU()");
+  }
+  return cast<MDString>(MD->getOperand(0)->getOperand(0))->getString();
+}
+
+StringRef Module::getDefaultTargetFeatures() {
+  auto *MD = getNamedMetadata("llvm.subtarget.info");
+  // FIXME: See getDefaultTargetCPU().
+  if (!MD) {
+    Triple T(getTargetTriple());
+    if (T.isAMDGPU() || T.isNVPTX())
+      return "";
+    report_fatal_error("Invalid call to getDefaultTargetFeatures()");
+  }
+  return cast<MDString>(MD->getOperand(0)->getOperand(1))->getString();
+}
diff --git a/llvm/lib/Linker/IRMover.cpp b/llvm/lib/Linker/IRMover.cpp
index ba8f371127764..6717ab443473c 100644
--- a/llvm/lib/Linker/IRMover.cpp
+++ b/llvm/lib/Linker/IRMover.cpp
@@ -1240,11 +1240,22 @@ void IRLinker::linkNamedMDNodes() {
     if (IsPerformingImport && NMD.getName() == "llvm.stats")
       continue;
 
+    // Default subtarget info is only intended to be used before LTO and
+    // shouldn't be present after merging because the default subtargets may be
+    // different. Even if they were the same we wouldn't want to keep them to
+    // avoid introducing bugs that would only occur when merging modules with
+    // different subtarget info.
+    if (NMD.getName() == "llvm.subtarget.info")
+      continue;
+      
     NamedMDNode *DestNMD = DstM.getOrInsertNamedMetadata(NMD.getName());
     // Add Src elements into Dest node.
     for (const MDNode *Op : NMD.operands())
       DestNMD->addOperand(Mapper.mapMDNode(*Op));
   }
+
+  if (auto *NMD = DstM.getNamedMetadata("llvm.subtarget.info"))
+    DstM.eraseNamedMetadata(NMD);
 }
 
 /// Merge the linker flags in Src into the Dest module.

@pcc
Copy link
Contributor Author

pcc commented Jul 12, 2024

This is a WIP. Because I'm adding new metadata, several tests fail, e.g. check-clang:

Testing Time: 244.38s

Total Discovered Tests: 37911
  Skipped          :    31 (0.08%)
  Unsupported      :   102 (0.27%)
  Passed           : 37689 (99.41%)
  Expectedly Failed:    28 (0.07%)
  Failed           :    61 (0.16%)

I'm uploading this to get agreement on the direction before updating the tests.

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 3a744283f4c56b57adb2c381c0aeaf7faf5120ec 6976f169ec1befd902d0a0d0cc05b375d6ae094b --extensions h,cpp -- clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CodeGenAction.cpp llvm/include/llvm/IR/LLVMContext.h llvm/include/llvm/IR/Module.h llvm/lib/IR/Function.cpp llvm/lib/IR/LLVMContext.cpp llvm/lib/IR/Module.cpp llvm/lib/Linker/IRMover.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/IR/LLVMContext.cpp b/llvm/lib/IR/LLVMContext.cpp
index 8120cccace..15697b4c06 100644
--- a/llvm/lib/IR/LLVMContext.cpp
+++ b/llvm/lib/IR/LLVMContext.cpp
@@ -387,6 +387,4 @@ void LLVMContext::setOpaquePointers(bool Enable) const {
   assert(Enable && "Cannot disable opaque pointers");
 }
 
-bool LLVMContext::supportsTypedPointers() const {
-  return false;
-}
+bool LLVMContext::supportsTypedPointers() const { return false; }
diff --git a/llvm/lib/Linker/IRMover.cpp b/llvm/lib/Linker/IRMover.cpp
index 6717ab4434..7416f97a90 100644
--- a/llvm/lib/Linker/IRMover.cpp
+++ b/llvm/lib/Linker/IRMover.cpp
@@ -1247,7 +1247,7 @@ void IRLinker::linkNamedMDNodes() {
     // different subtarget info.
     if (NMD.getName() == "llvm.subtarget.info")
       continue;
-      
+
     NamedMDNode *DestNMD = DstM.getOrInsertNamedMetadata(NMD.getName());
     // Add Src elements into Dest node.
     for (const MDNode *Op : NMD.operands())

@pcc pcc requested review from MaskRay, efriedma-quic, rnk and jrtc27 July 12, 2024 18:16
@MaskRay
Copy link
Member

MaskRay commented Jul 12, 2024

This is a WIP. Because I'm adding new metadata, several tests fail, e.g. check-clang:

Testing Time: 244.38s

Total Discovered Tests: 37911
  Skipped          :    31 (0.08%)
  Unsupported      :   102 (0.27%)
  Passed           : 37689 (99.41%)
  Expectedly Failed:    28 (0.07%)
  Failed           :    61 (0.16%)

I'm uploading this to get agreement on the direction before updating the tests.

Thanks. Once an agreement on the direction is more formal, perhaps update the description to link to the Discourse thread and how this helps opt.

I have a shorter summary, which is hopefully useful.

If `Function::createWithDefaultAttr` reads from the serialized module flags metadata, we don't want conflict resolution logic because there is no unambiguously "correct" set of target attributes when merging two modules with different attributes.
In addition, when a translation unit has unexpected attributes, it could be benign in the non-LTO case as long as the code is unreachable.
However, conflict resolution logic could make the merged attributes unexpected.

<https://github.com/llvm/llvm-project/pull/96721> implemented a strategy by storing the target features in `LLVMContext`, which is not serialized.
`Function::createWithDefaultAttr` will load the target features from `LLVMContext`.

While 96721 works well, there is a drawback that the generate module from the LTO pre-link compilation does not contain sufficient information.
Running `opt` without the correct target features would not mimick Clang .
To address this drawback, we can use module flags metadata that is dropped during merging.

// different. Even if they were the same we wouldn't want to keep them to
// avoid introducing bugs that would only occur when merging modules with
// different subtarget info.
if (NMD.getName() == "llvm.subtarget.info")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be an overly simplistic take. Downstream we need a subset of the target features during LTO, and it wouldn't surprise me if there are cases upstream that could benefit from it too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such cases should use a separate field, such as the e_flags field that I suggested in the Discourse thread. That would more cleanly implement the semantics that the target actually wants. The metadata that I'm adding here is intended as a generic description of the singular default target CPU/features, and since that's not a concept that exists generically after LTO it gets removed from the IR at that point.

@MaskRay
Copy link
Member

MaskRay commented Aug 14, 2024

@efriedma-quic Does this work for you?

@efriedma-quic
Copy link
Collaborator

Sorry, I meant to get back to this sooner.

I'm not sure it makes sense to distinguish between "compile-time" and "link-time" in the sense this patch is doing: whether an instrumentation pass runs at compile-time or link-time is more a matter of what's convenient for a given compilation pipeline, not a fundamental semantic difference. The fundamental difference is whether we know the caller: if we know the caller, we can just take the flags from the caller. If we don't, we have to pick some "generic" flags that work for any possible caller.

If this means frontends have to do something more complicated to compute the right set of features, or we need to add new flags to clang, I think that's just the cost of doing things correctly.

Along those lines, I think we need some sort of merge rule. Some attributes can be merged: for example, whether the target has an atomic add instruction, as opposed to generic ll/sc atomics. Some attributes can't be merged, like whether a register is reserved: there can only be one correct answer.

Maybe the answer is just that we don't want a generic "target-features" list at all; we should just have individual module flags for each feature. That way each attribute can specify its own merge rule. We've already effectively started to do this with #83153: we have a bunch of attributes derived from module flags.

@pcc
Copy link
Contributor Author

pcc commented Mar 17, 2025

I think the question of which attributes to use for functions produced at LTO time is similar to the question of how we want to support things like linkonce_odr functions defined in multiple translation units with different -march flags. For the linkonce_odr functions I think we want an approach that is something like this:

  1. External functions get a (target-cpu, target-features) derived from -march and linkonce_odr functions get a (target-cpu, target-features) produced as if -march was not specified (but still incorporating things like -ffixed-$REG). Alternatively, we may create a separate set of flags that only affect external functions and keep the existing flags affecting both external and linkonce_odr functions (this latter approach seems more backwards compatible).
  2. This implies that when code generated on their own, linkonce_odr functions are compiled for the base ABI.
  3. linkonce_odr functions are allowed to be inlined into external callers in spite of the different (target-cpu, target-features) pairs.

Consider the __cfi_check function. We can think of __cfi_check as being defined in each translation unit and one copy is selected by the linker, as with linkonce_odr. There is already an (arguably incorrect) assumption that relying on the linker to select a linkonce_odr function from object files compiled with different -march flags produces a result that is correct enough, and for now we can rely on the same assumption for __cfi_check. Once the frontend is extended to produce the two (target-cpu, target-features) pairs, we can attach the linkonce_odr-intended pair to the __cfi_check function.

I'm not sure that we need a merge rule or a separate "language" for target attributes similar to the module flags that have been created so far out of necessity. An approach based on selecting a (target-cpu, target-features) pair supplied by an arbitrary TU, combined with ensuring that each of the many (target-cpu, target-features) pairs is suitable for the linkage unit as a whole, seems sufficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category llvm:ir LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants