Add basic -mtune support #98517

Merged: 1 commit, Jul 16, 2024

Conversation

AlexisPerry
Contributor

Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043.
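
The driver side of this change can be sketched without any LLVM dependency. The block below is a minimal illustration, assuming the behavior shown in the `Flang::addTargetOptions` hunk of the patch: the last `-mtune=<value>` on the command line is forwarded to the frontend as `-tune-cpu <value>`, and `native` is replaced by the host CPU name. The names `hostCPUNameForSketch` and `tuneArgsFor` are hypothetical, and the fixed return value `"generic"` stands in for `llvm::sys::getHostCPUName()`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::sys::getHostCPUName(). The real function
// inspects the machine the compiler is running on; this one returns a fixed
// placeholder so the sketch stays self-contained.
std::string hostCPUNameForSketch() { return "generic"; }

// Sketch of the driver logic: find the last "-mtune=<value>" argument and
// translate it into the two frontend arguments "-tune-cpu" and "<cpu>".
std::vector<std::string>
tuneArgsFor(const std::vector<std::string> &driverArgs) {
  const std::string prefix = "-mtune=";
  std::string value;
  bool seen = false;
  for (const std::string &arg : driverArgs) {
    if (arg.compare(0, prefix.size(), prefix) == 0) {
      value = arg.substr(prefix.size());
      seen = true; // a later occurrence overrides an earlier one
    }
  }

  std::vector<std::string> cmdArgs;
  if (!seen)
    return cmdArgs; // no -mtune given: nothing is forwarded

  cmdArgs.push_back("-tune-cpu");
  // "native" is resolved to a concrete CPU name before being forwarded.
  cmdArgs.push_back(value == "native" ? hostCPUNameForSketch() : value);
  return cmdArgs;
}
```

Calling `tuneArgsFor({"-mtune=neoverse-n1"})` in this sketch yields the pair `-tune-cpu`, `neoverse-n1`; with no `-mtune=` argument the result is empty, so the frontend sees no tuning request at all.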

@llvmbot added the clang and clang:driver labels Jul 11, 2024
@AlexisPerry requested a review from tarunprabhu July 11, 2024 18:55
@llvmbot
Member

llvmbot commented Jul 11, 2024

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-flang-fir-hlfir
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-flang-driver

Author: Alexis Perry-Holby (AlexisPerry)

Changes

Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043.


Patch is 27.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/98517.diff

26 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+4-3)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (+9-1)
  • (modified) flang/include/flang/Frontend/TargetOptions.h (+3)
  • (modified) flang/include/flang/Lower/Bridge.h (+3-3)
  • (modified) flang/include/flang/Optimizer/CodeGen/CGPasses.td (+4)
  • (modified) flang/include/flang/Optimizer/CodeGen/Target.h (+19-2)
  • (modified) flang/include/flang/Optimizer/Dialect/Support/FIRContext.h (+7)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+4-1)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (+4)
  • (modified) flang/lib/Frontend/FrontendActions.cpp (+2-1)
  • (modified) flang/lib/Lower/Bridge.cpp (+2-1)
  • (modified) flang/lib/Optimizer/CodeGen/CodeGen.cpp (+5-1)
  • (modified) flang/lib/Optimizer/CodeGen/Target.cpp (+11)
  • (modified) flang/lib/Optimizer/CodeGen/TargetRewrite.cpp (+11-1)
  • (modified) flang/lib/Optimizer/CodeGen/TypeConverter.cpp (+2-1)
  • (modified) flang/lib/Optimizer/Dialect/Support/FIRContext.cpp (+18)
  • (added) flang/test/Driver/tune-cpu-fir.f90 (+25)
  • (added) flang/test/Lower/tune-cpu-llvm.f90 (+8)
  • (modified) flang/tools/bbc/bbc.cpp (+2-1)
  • (modified) flang/tools/tco/tco.cpp (+4)
  • (modified) flang/unittests/Optimizer/FIRContextTest.cpp (+4-1)
  • (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
  • (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+5)
  • (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)
  • (added) mlir/test/Target/LLVMIR/Import/tune-cpu.ll (+16)
  • (added) mlir/test/Target/LLVMIR/tune-cpu.mlir (+14)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index cfb37b3c5b474..8d49a4708aaf0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5436,6 +5436,7 @@ def module_file_info : Flag<["-"], "module-file-info">, Flags<[]>,
   HelpText<"Provide information about a particular module file">;
 def mthumb : Flag<["-"], "mthumb">, Group<m_Group>;
 def mtune_EQ : Joined<["-"], "mtune=">, Group<m_Group>,
+  Visibility<[ClangOption, FlangOption]>,
   HelpText<"Only supported on AArch64, PowerPC, RISC-V, SPARC, SystemZ, and X86">;
 def multi__module : Flag<["-"], "multi_module">;
 def multiply__defined__unused : Separate<["-"], "multiply_defined_unused">;
@@ -6760,9 +6761,6 @@ def emit_hlfir : Flag<["-"], "emit-hlfir">, Group<Action_Group>,
 
 let Visibility = [CC1Option, CC1AsOption] in {
 
-def tune_cpu : Separate<["-"], "tune-cpu">,
-  HelpText<"Tune for a specific cpu type">,
-  MarshallingInfoString<TargetOpts<"TuneCPU">>;
 def target_abi : Separate<["-"], "target-abi">,
   HelpText<"Target a particular ABI type">,
   MarshallingInfoString<TargetOpts<"ABI">>;
@@ -6789,6 +6787,9 @@ def darwin_target_variant_triple : Separate<["-"], "darwin-target-variant-triple
 
 let Visibility = [CC1Option, CC1AsOption, FC1Option] in {
 
+def tune_cpu : Separate<["-"], "tune-cpu">,
+  HelpText<"Tune for a specific cpu type">,
+  MarshallingInfoString<TargetOpts<"TuneCPU">>;
 def target_cpu : Separate<["-"], "target-cpu">,
   HelpText<"Target a specific cpu type">,
   MarshallingInfoString<TargetOpts<"CPU">>;
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index ee8292a508f93..7e42bad258cc6 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -15,6 +15,7 @@
 #include "llvm/Frontend/Debug/Options.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Path.h"
+#include "llvm/TargetParser/Host.h"
 #include "llvm/TargetParser/RISCVISAInfo.h"
 #include "llvm/TargetParser/RISCVTargetParser.h"
 
@@ -411,6 +412,13 @@ void Flang::addTargetOptions(const ArgList &Args,
   }
 
   // TODO: Add target specific flags, ABI, mtune option etc.
+  if (const Arg *A = Args.getLastArg(options::OPT_mtune_EQ)) {
+    CmdArgs.push_back("-tune-cpu");
+    if (A->getValue() == StringRef{"native"})
+      CmdArgs.push_back(Args.MakeArgString(llvm::sys::getHostCPUName()));
+    else
+      CmdArgs.push_back(A->getValue());
+  }
 }
 
 void Flang::addOffloadOptions(Compilation &C, const InputInfoList &Inputs,
@@ -807,7 +815,7 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA,
   case CodeGenOptions::FramePointerKind::None:
     FPKeepKindStr = "-mframe-pointer=none";
     break;
-   case CodeGenOptions::FramePointerKind::Reserved:
+  case CodeGenOptions::FramePointerKind::Reserved:
     FPKeepKindStr = "-mframe-pointer=reserved";
     break;
   case CodeGenOptions::FramePointerKind::NonLeaf:
diff --git a/flang/include/flang/Frontend/TargetOptions.h b/flang/include/flang/Frontend/TargetOptions.h
index ef5d270a2185d..fa72c77a028a1 100644
--- a/flang/include/flang/Frontend/TargetOptions.h
+++ b/flang/include/flang/Frontend/TargetOptions.h
@@ -32,6 +32,9 @@ class TargetOptions {
   /// If given, the name of the target CPU to generate code for.
   std::string cpu;
 
+  /// If given, the name of the target CPU to tune code for.
+  std::string cpuToTuneFor;
+
   /// The list of target specific features to enable or disable, as written on
   /// the command line.
   std::vector<std::string> featuresAsWritten;
diff --git a/flang/include/flang/Lower/Bridge.h b/flang/include/flang/Lower/Bridge.h
index 52110b861b680..4379ed512cdf0 100644
--- a/flang/include/flang/Lower/Bridge.h
+++ b/flang/include/flang/Lower/Bridge.h
@@ -65,11 +65,11 @@ class LoweringBridge {
          const Fortran::lower::LoweringOptions &loweringOptions,
          const std::vector<Fortran::lower::EnvironmentDefault> &envDefaults,
          const Fortran::common::LanguageFeatureControl &languageFeatures,
-         const llvm::TargetMachine &targetMachine) {
+         const llvm::TargetMachine &targetMachine, llvm::StringRef tuneCPU) {
     return LoweringBridge(ctx, semanticsContext, defaultKinds, intrinsics,
                           targetCharacteristics, allCooked, triple, kindMap,
                           loweringOptions, envDefaults, languageFeatures,
-                          targetMachine);
+                          targetMachine, tuneCPU);
   }
 
   //===--------------------------------------------------------------------===//
@@ -148,7 +148,7 @@ class LoweringBridge {
       const Fortran::lower::LoweringOptions &loweringOptions,
       const std::vector<Fortran::lower::EnvironmentDefault> &envDefaults,
       const Fortran::common::LanguageFeatureControl &languageFeatures,
-      const llvm::TargetMachine &targetMachine);
+      const llvm::TargetMachine &targetMachine, const llvm::StringRef tuneCPU);
   LoweringBridge() = delete;
   LoweringBridge(const LoweringBridge &) = delete;
 
diff --git a/flang/include/flang/Optimizer/CodeGen/CGPasses.td b/flang/include/flang/Optimizer/CodeGen/CGPasses.td
index 9a4d327b33bad..989e3943882a1 100644
--- a/flang/include/flang/Optimizer/CodeGen/CGPasses.td
+++ b/flang/include/flang/Optimizer/CodeGen/CGPasses.td
@@ -31,6 +31,8 @@ def FIRToLLVMLowering : Pass<"fir-to-llvm-ir", "mlir::ModuleOp"> {
            "Override module's data layout.">,
     Option<"forcedTargetCPU", "target-cpu", "std::string", /*default=*/"",
            "Override module's target CPU.">,
+    Option<"forcedTuneCPU", "tune-cpu", "std::string", /*default=*/"",
+           "Override module's tune CPU.">,
     Option<"forcedTargetFeatures", "target-features", "std::string",
            /*default=*/"", "Override module's target features.">,
     Option<"applyTBAA", "apply-tbaa", "bool", /*default=*/"false",
@@ -68,6 +70,8 @@ def TargetRewritePass : Pass<"target-rewrite", "mlir::ModuleOp"> {
            "Override module's target triple.">,
     Option<"forcedTargetCPU", "target-cpu", "std::string", /*default=*/"",
            "Override module's target CPU.">,
+    Option<"forcedTuneCPU", "tune-cpu", "std::string", /*default=*/"",
+           "Override module's tune CPU.">,
     Option<"forcedTargetFeatures", "target-features", "std::string",
            /*default=*/"", "Override module's target features.">,
     Option<"noCharacterConversion", "no-character-conversion",
diff --git a/flang/include/flang/Optimizer/CodeGen/Target.h b/flang/include/flang/Optimizer/CodeGen/Target.h
index 3cf6a74a9adb7..2b3b2152ac80c 100644
--- a/flang/include/flang/Optimizer/CodeGen/Target.h
+++ b/flang/include/flang/Optimizer/CodeGen/Target.h
@@ -76,6 +76,11 @@ class CodeGenSpecifics {
       llvm::StringRef targetCPU, mlir::LLVM::TargetFeaturesAttr targetFeatures,
       const mlir::DataLayout &dl);
 
+  static std::unique_ptr<CodeGenSpecifics>
+  get(mlir::MLIRContext *ctx, llvm::Triple &&trp, KindMapping &&kindMap,
+      llvm::StringRef targetCPU, mlir::LLVM::TargetFeaturesAttr targetFeatures,
+      const mlir::DataLayout &dl, llvm::StringRef tuneCPU);
+
   static TypeAndAttr getTypeAndAttr(mlir::Type t) { return TypeAndAttr{t, {}}; }
 
   CodeGenSpecifics(mlir::MLIRContext *ctx, llvm::Triple &&trp,
@@ -83,7 +88,17 @@ class CodeGenSpecifics {
                    mlir::LLVM::TargetFeaturesAttr targetFeatures,
                    const mlir::DataLayout &dl)
       : context{*ctx}, triple{std::move(trp)}, kindMap{std::move(kindMap)},
-        targetCPU{targetCPU}, targetFeatures{targetFeatures}, dataLayout{&dl} {}
+        targetCPU{targetCPU}, targetFeatures{targetFeatures}, dataLayout{&dl},
+        tuneCPU{""} {}
+
+  CodeGenSpecifics(mlir::MLIRContext *ctx, llvm::Triple &&trp,
+                   KindMapping &&kindMap, llvm::StringRef targetCPU,
+                   mlir::LLVM::TargetFeaturesAttr targetFeatures,
+                   const mlir::DataLayout &dl, llvm::StringRef tuneCPU)
+      : context{*ctx}, triple{std::move(trp)}, kindMap{std::move(kindMap)},
+        targetCPU{targetCPU}, targetFeatures{targetFeatures}, dataLayout{&dl},
+        tuneCPU{tuneCPU} {}
+
   CodeGenSpecifics() = delete;
   virtual ~CodeGenSpecifics() {}
 
@@ -165,7 +180,8 @@ class CodeGenSpecifics {
   virtual unsigned char getCIntTypeWidth() const = 0;
 
   llvm::StringRef getTargetCPU() const { return targetCPU; }
-
+  llvm::StringRef getTuneCPU() const { return tuneCPU; }
+  
   mlir::LLVM::TargetFeaturesAttr getTargetFeatures() const {
     return targetFeatures;
   }
@@ -182,6 +198,7 @@ class CodeGenSpecifics {
   llvm::StringRef targetCPU;
   mlir::LLVM::TargetFeaturesAttr targetFeatures;
   const mlir::DataLayout *dataLayout = nullptr;
+  llvm::StringRef tuneCPU;
 };
 
 } // namespace fir
diff --git a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
index 059a10ce2fe51..b69f1415040ec 100644
--- a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
+++ b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
@@ -58,6 +58,13 @@ void setTargetCPU(mlir::ModuleOp mod, llvm::StringRef cpu);
 /// Get the target CPU string from the Module or return a null reference.
 llvm::StringRef getTargetCPU(mlir::ModuleOp mod);
 
+/// Set the tune CPU for the module. `cpu` must not be deallocated while
+/// module `mod` is still live.
+void setTuneCPU(mlir::ModuleOp mod, llvm::StringRef cpu);
+
+/// Get the tune CPU string from the Module or return a null reference.
+llvm::StringRef getTuneCPU(mlir::ModuleOp mod);
+  
 /// Set the target features for the module.
 void setTargetFeatures(mlir::ModuleOp mod, llvm::StringRef features);
 
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index b3ed9acad36df..786083f95e15c 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -411,7 +411,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
     Option<"unsafeFPMath", "unsafe-fp-math",
            "bool", /*default=*/"false",
            "Set the unsafe-fp-math attribute on functions in the module.">,
-  ];
+    Option<"tuneCPU", "tune-cpu",
+           "llvm::StringRef", /*default=*/"llvm::StringRef{}",
+           "Set the tune-cpu attribute on functions in the module.">,
+];
 }
 
 def AssumedRankOpConversion : Pass<"fir-assumed-rank-op", "mlir::ModuleOp"> {
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index e2d60ad46f14f..3d66a946fc946 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -431,6 +431,10 @@ static void parseTargetArgs(TargetOptions &opts, llvm::opt::ArgList &args) {
           args.getLastArg(clang::driver::options::OPT_target_cpu))
     opts.cpu = a->getValue();
 
+  if (const llvm::opt::Arg *a =
+          args.getLastArg(clang::driver::options::OPT_tune_cpu))
+    opts.cpuToTuneFor = a->getValue();
+
   for (const llvm::opt::Arg *currentArg :
        args.filtered(clang::driver::options::OPT_target_feature))
     opts.featuresAsWritten.emplace_back(currentArg->getValue());
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index a85ecd1ac71b3..5c86bd947ce73 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -297,7 +297,8 @@ bool CodeGenAction::beginSourceFileAction() {
       ci.getParsing().allCooked(), ci.getInvocation().getTargetOpts().triple,
       kindMap, ci.getInvocation().getLoweringOpts(),
       ci.getInvocation().getFrontendOpts().envDefaults,
-      ci.getInvocation().getFrontendOpts().features, targetMachine);
+      ci.getInvocation().getFrontendOpts().features, targetMachine,
+      ci.getInvocation().getTargetOpts().cpuToTuneFor);
 
   // Fetch module from lb, so we can set
   mlirModule = std::make_unique<mlir::ModuleOp>(lb.getModule());
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 3d071f6bb8d5a..b998709dccd8c 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -6020,7 +6020,7 @@ Fortran::lower::LoweringBridge::LoweringBridge(
     const Fortran::lower::LoweringOptions &loweringOptions,
     const std::vector<Fortran::lower::EnvironmentDefault> &envDefaults,
     const Fortran::common::LanguageFeatureControl &languageFeatures,
-    const llvm::TargetMachine &targetMachine)
+    const llvm::TargetMachine &targetMachine, const llvm::StringRef tuneCPU)
     : semanticsContext{semanticsContext}, defaultKinds{defaultKinds},
       intrinsics{intrinsics}, targetCharacteristics{targetCharacteristics},
       cooked{&cooked}, context{context}, kindMap{kindMap},
@@ -6077,6 +6077,7 @@ Fortran::lower::LoweringBridge::LoweringBridge(
   fir::setTargetTriple(*module.get(), triple);
   fir::setKindMapping(*module.get(), kindMap);
   fir::setTargetCPU(*module.get(), targetMachine.getTargetCPU());
+  fir::setTuneCPU(*module.get(), tuneCPU);
   fir::setTargetFeatures(*module.get(), targetMachine.getTargetFeatureString());
   fir::support::setMLIRDataLayout(*module.get(),
                                   targetMachine.createDataLayout());
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index 7483acfcd1ca7..e370a33b7c4a7 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -3618,6 +3618,9 @@ class FIRToLLVMLowering
     if (!forcedTargetCPU.empty())
       fir::setTargetCPU(mod, forcedTargetCPU);
 
+    if (!forcedTuneCPU.empty())
+      fir::setTuneCPU(mod, forcedTuneCPU);
+
     if (!forcedTargetFeatures.empty())
       fir::setTargetFeatures(mod, forcedTargetFeatures);
 
@@ -3714,7 +3717,8 @@ class FIRToLLVMLowering
       signalPassFailure();
     }
 
-    // Run pass to add comdats to functions that have weak linkage on relevant platforms
+    // Run pass to add comdats to functions that have weak linkage on relevant
+    // platforms
     if (fir::getTargetTriple(mod).supportsCOMDAT()) {
       mlir::OpPassManager comdatPM("builtin.module");
       comdatPM.addPass(mlir::LLVM::createLLVMAddComdats());
diff --git a/flang/lib/Optimizer/CodeGen/Target.cpp b/flang/lib/Optimizer/CodeGen/Target.cpp
index 652e2bddc1b89..25141102a8c43 100644
--- a/flang/lib/Optimizer/CodeGen/Target.cpp
+++ b/flang/lib/Optimizer/CodeGen/Target.cpp
@@ -1113,3 +1113,14 @@ fir::CodeGenSpecifics::get(mlir::MLIRContext *ctx, llvm::Triple &&trp,
   }
   TODO(mlir::UnknownLoc::get(ctx), "target not implemented");
 }
+
+std::unique_ptr<fir::CodeGenSpecifics> fir::CodeGenSpecifics::get(
+    mlir::MLIRContext *ctx, llvm::Triple &&trp, KindMapping &&kindMap,
+    llvm::StringRef targetCPU, mlir::LLVM::TargetFeaturesAttr targetFeatures,
+    const mlir::DataLayout &dl, llvm::StringRef tuneCPU) {
+  std::unique_ptr<fir::CodeGenSpecifics> CGS = fir::CodeGenSpecifics::get(
+      ctx, std::move(trp), std::move(kindMap), targetCPU, targetFeatures, dl);
+
+  CGS->tuneCPU = tuneCPU;
+  return CGS;
+}
diff --git a/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp b/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp
index 561d700f41220..b52f2b9325ece 100644
--- a/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp
+++ b/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp
@@ -89,6 +89,9 @@ class TargetRewrite : public fir::impl::TargetRewritePassBase<TargetRewrite> {
     if (!forcedTargetCPU.empty())
       fir::setTargetCPU(mod, forcedTargetCPU);
 
+    if (!forcedTuneCPU.empty())
+      fir::setTuneCPU(mod, forcedTuneCPU);
+
     if (!forcedTargetFeatures.empty())
       fir::setTargetFeatures(mod, forcedTargetFeatures);
 
@@ -106,7 +109,8 @@ class TargetRewrite : public fir::impl::TargetRewritePassBase<TargetRewrite> {
 
     auto specifics = fir::CodeGenSpecifics::get(
         mod.getContext(), fir::getTargetTriple(mod), fir::getKindMapping(mod),
-        fir::getTargetCPU(mod), fir::getTargetFeatures(mod), *dl);
+        fir::getTargetCPU(mod), fir::getTargetFeatures(mod), *dl,
+	fir::getTuneCPU(mod));
 
     setMembers(specifics.get(), &rewriter, &*dl);
 
@@ -672,12 +676,18 @@ class TargetRewrite : public fir::impl::TargetRewritePassBase<TargetRewrite> {
     auto targetCPU = specifics->getTargetCPU();
     mlir::StringAttr targetCPUAttr =
         targetCPU.empty() ? nullptr : mlir::StringAttr::get(ctx, targetCPU);
+    auto tuneCPU = specifics->getTuneCPU();
+    mlir::StringAttr tuneCPUAttr =
+        tuneCPU.empty() ? nullptr : mlir::StringAttr::get(ctx, tuneCPU);
     auto targetFeaturesAttr = specifics->getTargetFeatures();
 
     for (auto fn : mod.getOps<mlir::func::FuncOp>()) {
       if (targetCPUAttr)
         fn->setAttr("target_cpu", targetCPUAttr);
 
+      if (tuneCPUAttr)
+        fn->setAttr("tune_cpu", tuneCPUAttr);
+
       if (targetFeaturesAttr)
         fn->setAttr("target_features", targetFeaturesAttr);
 
diff --git a/flang/lib/Optimizer/CodeGen/TypeConverter.cpp b/flang/lib/Optimizer/CodeGen/TypeConverter.cpp
index ce86c625e082f..a28b03442fe83 100644
--- a/flang/lib/Optimizer/CodeGen/TypeConverter.cpp
+++ b/flang/lib/Optimizer/CodeGen/TypeConverter.cpp
@@ -35,7 +35,8 @@ LLVMTypeConverter::LLVMTypeConverter(mlir::ModuleOp module, bool applyTBAA,
       kindMapping(getKindMapping(module)),
       specifics(CodeGenSpecifics::get(
           module.getContext(), getTargetTriple(module), getKindMapping(module),
-          getTargetCPU(module), getTargetFeatures(module), dl)),
+          getTargetCPU(module), getTargetFeatures(module), dl,
+          getTuneCPU(module))),
       tbaaBuilder(std::make_unique<TBAABuilder>(module->getContext(), applyTBAA,
                                                 forceUnifiedTBAATree)),
       dataLayout{&dl} {
diff --git a/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp b/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
index c4d00875c45e4..1aa631cb39126 100644
--- a/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
+++ b/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
@@ -77,6 +77,24 @@ llvm::StringRef fir::getTargetCPU(mlir::ModuleOp mod) {
   return {};
 }
 
+static constexpr const char *tuneCpuName = "fir.tune_cpu";
+
+void fir::setTuneCPU(mlir::ModuleOp mod, llvm::StringRef cpu) {
+  if (cpu.empty())
+    return;
+
+  auto *ctx = mod.getContext();
+
+  mod->setAttr(tuneCpuName, mlir::StringAttr::get(ctx, cpu));
+}
+
+llvm::StringRef fir::getTuneCPU(mlir::ModuleOp mod) {
+  if (auto attr = mod->getAttrOfType<mlir::StringAttr>(tuneCpuName))
+    return attr.getValue();
+
+  return {};
+}
+
 static constexpr const char *targetFeaturesName = "fir.target_features";
 
 void fir::setTargetFeatures(mlir::ModuleOp mod, llvm::StringRef features) {
diff --git a/flang/test/Driver/tune-cpu-fir.f90 b/flang/test/Driver/tune-cpu-fir.f90
new file mode 100644
index 0000000000000..43c13b426d5d9
--- /dev/null
+++ b/flang/test/Driver/tune-cpu-fir.f90
@@ -0,0 +1,25 @@
+! RUN: %if aarch64-registered-target %{ %flang_fc1 -emit-fir -triple aarch64-unknown-linux-gnu -target-cpu aarch64 %s -o - | FileCheck %s --check-prefixes=ALL,ARMCPU %}
+! RUN: %if aarch64-registered-target %{ %flang_fc1 -emit-fir -triple aarch64-unknown-linux-gnu -tune-cpu neoverse-n1 %s -o - | FileCheck %s --check-prefixes=ALL,ARMTUNE %}
+! RUN: %if aarch64-registered-target %{ %flang_fc1 -emit-fir -triple aarch64-unknown-linux-gnu -target-cpu aarch64 -tune-cpu neoverse-n1 %s -o - | FileCheck %s --check-prefixes=ALL,ARMBOTH %}
+
+! RUN: %if x86-registered-target %{ %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu -target-cpu x86-64 %s -o - | FileCheck %s --check-prefixes=ALL,X86CPU %}
+! RUN: %if x86-registered-target %{ %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu -tune-cpu pentium4 %s -o - | FileCheck %s --check-prefixes=ALL,X86TUNE %}
+! RUN: %if x86-registered-target %{ %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu -target-cpu x86-64 -tune-cpu pentium4 %s -o - | FileCheck %s --check-prefixes=ALL,X86BOTH %}
+
+! ALL: module attributes {
+
+! ARMCPU-SAME:      fir.target_cpu = "aarch64"
+! ARMCPU-NOT:       fir.tune_cpu = "neoverse-n1"
+
+! ARMTUNE-SAME:     fir.tune_cpu = "neoverse-n1"
+
+! ARMBOTH-SAME: fir.target_cpu = "aarch64"
+! ARMBOTH-SAME: fir.tune_cpu = "neoverse-n1"  
+
+! X86CPU-SAME:      fir.target_cpu = "x86-64"
+! X86CPU-N...
[truncated]
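
The way the tune CPU travels through the FIR module can be modeled in a few lines. This is a rough sketch, assuming only what the hunks above show: `fir::setTuneCPU` ignores an empty name, `fir::getTuneCPU` returns an empty string when the attribute is absent, and `TargetRewrite` writes `tune_cpu` onto each function only when a value exists. `ModuleSketch`, `Attrs`, and `stampFunctions` are hypothetical stand-ins for `mlir::ModuleOp`, its attribute dictionary, and the loop in the pass; they are not part of the patch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: a module and its functions are reduced to plain
// string-to-string attribute maps.
using Attrs = std::map<std::string, std::string>;
struct ModuleSketch {
  Attrs attrs;
  std::vector<Attrs> functions;
};

// Modeled on fir::setTuneCPU: an empty name leaves the module untouched.
void setTuneCPU(ModuleSketch &mod, const std::string &cpu) {
  if (cpu.empty())
    return;
  mod.attrs["fir.tune_cpu"] = cpu;
}

// Modeled on fir::getTuneCPU: empty result when the attribute is absent.
std::string getTuneCPU(const ModuleSketch &mod) {
  auto it = mod.attrs.find("fir.tune_cpu");
  return it == mod.attrs.end() ? std::string{} : it->second;
}

// Modeled on the TargetRewrite hunk: copy the module-level value onto every
// function as "tune_cpu", but only if a tune CPU was actually set.
void stampFunctions(ModuleSketch &mod) {
  const std::string tune = getTuneCPU(mod);
  if (tune.empty())
    return;
  for (Attrs &fn : mod.functions)
    fn["tune_cpu"] = tune;
}
```

The point of the empty-string checks is that a compilation without `-tune-cpu` produces no `fir.tune_cpu` or `tune_cpu` attribute at all, which matches the `ARMCPU-NOT` check in the test file above.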

github-actions bot commented Jul 11, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Initial implementation for the -mtune flag in Flang.
@AlexisPerry AlexisPerry force-pushed the AlexisPerry/mtune-3 branch from 0a9bf0d to a14f7c8 Compare July 11, 2024 19:25
@AlexisPerry
Contributor Author

@banach-space If this PR is indeed ready to go, would you mind committing it on my behalf? I don't have commit privileges any more. Thanks so much for all your help! And thanks to all the reviewers on previous versions of this PR as well!

@banach-space banach-space merged commit f1d3fe7 into llvm:main Jul 16, 2024
7 checks passed
@AlexisPerry AlexisPerry deleted the AlexisPerry/mtune-3 branch July 16, 2024 16:42
DavidSpickett added a commit that referenced this pull request Jul 17, 2024
The native architecture is AArch64 here so the pentium name won't
work even if you've got the x86 backend enabled.

https://lab.llvm.org/buildbot/#/builders/17/builds/898

Pass an explicit target for each run line to fix this.

Test added in f1d3fe7 / #98517
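
The failure described in this commit message can be illustrated with a toy model. It is only a sketch, assuming that a CPU name is checked against the architecture in effect and that the architecture defaults to the host when no target is passed. The table `kKnownCPUs` and the function `tuneCPUAccepted` are hypothetical; the real lists come from each LLVM backend, and only the two CPU names used in the test appear here.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical table of tune CPU names accepted per architecture.
const std::map<std::string, std::set<std::string>> kKnownCPUs = {
    {"aarch64", {"neoverse-n1"}},
    {"x86_64", {"pentium4"}},
};

// If no explicit architecture is given, the host architecture is used, so
// the same CPU name can be accepted on one bot and rejected on another.
bool tuneCPUAccepted(const std::string &hostArch,
                     const std::string &explicitArch,
                     const std::string &cpu) {
  const std::string &arch = explicitArch.empty() ? hostArch : explicitArch;
  auto it = kKnownCPUs.find(arch);
  return it != kKnownCPUs.end() && it->second.count(cpu) != 0;
}
```

In this model, `pentium4` is rejected on an AArch64 host when no target is given and accepted once `x86_64` is passed explicitly, which is the effect of adding an explicit target to each RUN line.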
@DavidSpickett
Collaborator

This was failing on our bots that build all backends. I've fixed that in 8bf952d.

It will help you in future if you set an email address for your commits in this GitHub project. Buildbot will then use that to notify you of failures.

You can also glance at http://llvm.validation.linaro.org/ (the "Flang" section) or search for "flang" in https://lab.llvm.org/buildbot/#/builders (that will not find every builder that includes flang, but it will find at least all the ones that focus on it).

It will post on PRs too if it knows a single commit is the problem, but all the bots that would have been able to do that only build the AArch64 target and it worked fine in that case.

yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
The native architecture is AArch64 here so the pentium name won't
work even if you've got the x86 backend enabled.

https://lab.llvm.org/buildbot/#/builders/17/builds/898

Pass an explicit target for each run line to fix this.

Test added in f1d3fe7 / #98517

Differential Revision: https://phabricator.intern.facebook.com/D60251610
Labels
clang:driver, clang, flang:codegen, flang:driver, flang:fir-hlfir, flang, mlir:llvm, mlir
4 participants