Skip to content

[OpenMP] Fix passing target id features to AMDGPU offloading #94765

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 7, 2024

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Jun 7, 2024

Summary:
AMDGPU supports a target-id feature which is used to qualify targets
with different incompatible features. These are both rules and target
features. Currently, we pass -target-cpu twice when offloading to
OpenMP, and do not pass the target-id features at all. The effect was
that passing something like --offload-arch=gfx90a:xnack+ would show up
as -target-cpu=gfx90a:xnack+ -target-cpu=gfx90a. Thus ignoring the
xnack completely and passing it twice. This patch fixes that to pass it
once and then separate it like how HIP does.

Summary:
AMDGPU supports a `target-id` feature which is used to qualify targets
with different incompatible features. These are both rules and target
features. Currently, we pass `-target-cpu` twice when offloading to
OpenMP, and do not pass the target-id features at all. The effect was
that passing something like `--offload-arch=gfx90a:xnack+` would show up
as `-target-cpu=gfx90a:xnack+ -target-cpu=gfx90a`. Thus ignoring the
xnack completely and passing it twice. This patch fixes that to pass it
once and then separate it like how HIP does.
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:openmp OpenMP related changes to Clang labels Jun 7, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 7, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
AMDGPU supports a target-id feature which is used to qualify targets
with different incompatible features. These are both rules and target
features. Currently, we pass -target-cpu twice when offloading to
OpenMP, and do not pass the target-id features at all. The effect was
that passing something like --offload-arch=gfx90a:xnack+ would show up
as -target-cpu=gfx90a:xnack+ -target-cpu=gfx90a. Thus ignoring the
xnack completely and passing it twice. This patch fixes that to pass it
once and then separate it like how HIP does.


Full diff: https://github.com/llvm/llvm-project/pull/94765.diff

3 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+5-1)
  • (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp (-5)
  • (modified) clang/test/Driver/amdgpu-openmp-toolchain.c (+2-1)
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 11a98a0ec314d..20f879e2f75cb 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -645,7 +645,11 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D,
                                      std::vector<StringRef> &Features) {
   // Add target ID features to -target-feature options. No diagnostics should
   // be emitted here since invalid target ID is diagnosed at other places.
-  StringRef TargetID = Args.getLastArgValue(options::OPT_mcpu_EQ);
+  StringRef TargetID;
+  if (Args.hasArg(options::OPT_mcpu_EQ))
+    TargetID = Args.getLastArgValue(options::OPT_mcpu_EQ);
+  else if (Args.hasArg(options::OPT_march_EQ))
+    TargetID = Args.getLastArgValue(options::OPT_march_EQ);
   if (!TargetID.empty()) {
     llvm::StringMap<bool> FeatureMap;
     auto OptionalGpuArch = parseTargetID(Triple, TargetID, &FeatureMap);
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
index cca18431ff773..d17ecb15c8208 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -44,14 +44,9 @@ void AMDGPUOpenMPToolChain::addClangTargetOptions(
     Action::OffloadKind DeviceOffloadingKind) const {
   HostTC.addClangTargetOptions(DriverArgs, CC1Args, DeviceOffloadingKind);
 
-  StringRef GPUArch = DriverArgs.getLastArgValue(options::OPT_march_EQ);
-  assert(!GPUArch.empty() && "Must have an explicit GPU arch.");
-
   assert(DeviceOffloadingKind == Action::OFK_OpenMP &&
          "Only OpenMP offloading kinds are supported.");
 
-  CC1Args.push_back("-target-cpu");
-  CC1Args.push_back(DriverArgs.MakeArgStringRef(GPUArch));
   CC1Args.push_back("-fcuda-is-device");
 
   if (DriverArgs.hasArg(options::OPT_nogpulib))
diff --git a/clang/test/Driver/amdgpu-openmp-toolchain.c b/clang/test/Driver/amdgpu-openmp-toolchain.c
index ef58c2c4e3f3a..49af04acc4639 100644
--- a/clang/test/Driver/amdgpu-openmp-toolchain.c
+++ b/clang/test/Driver/amdgpu-openmp-toolchain.c
@@ -7,7 +7,7 @@
 
 // verify the tools invocations
 // CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-llvm-bc"{{.*}}"-x" "c"
-// CHECK: "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu"{{.*}}"-target-cpu" "gfx906"{{.*}}"-fcuda-is-device"{{.*}}
+// CHECK: "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu"{{.*}}"-fcuda-is-device"{{.*}}"-target-cpu" "gfx906"
 // CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-obj"
 // CHECK: clang-linker-wrapper{{.*}} "-o" "a.out"
 
@@ -63,6 +63,7 @@
 
 // RUN: %clang -### -target x86_64-pc-linux-gnu -fopenmp --offload-arch=gfx90a:sramecc-:xnack+ \
 // RUN:   -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-TARGET-ID
+// CHECK-TARGET-ID: "-cc1" "-triple" "amdgcn-amd-amdhsa" {{.*}} "-target-cpu" "gfx90a" "-target-feature" "-sramecc" "-target-feature" "+xnack"
 // CHECK-TARGET-ID: clang-offload-packager{{.*}}arch=gfx90a:sramecc-:xnack+,kind=openmp,feature=-sramecc,feature=+xnack
 
 // RUN: not %clang -### -target x86_64-pc-linux-gnu -fopenmp --offload-arch=gfx90a,gfx90a:xnack+ \

@jhuber6 jhuber6 merged commit 374f655 into llvm:main Jun 7, 2024
9 of 11 checks passed
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jun 8, 2024
Reverts: breaks taretid_multi_image
374f655 [OpenMP] Fix passing target id features to AMDGPU offloading (llvm#94765)

Change-Id: I5f56fd191a6cb0e399157bd868fd4d473683254a
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jun 9, 2024
)

Summary:
AMDGPU supports a `target-id` feature which is used to qualify targets
with different incompatible features. These are both rules and target
features. Currently, we pass `-target-cpu` twice when offloading to
OpenMP, and do not pass the target-id features at all. The effect was
that passing something like `--offload-arch=gfx90a:xnack+` would show up
as `-target-cpu=gfx90a:xnack+ -target-cpu=gfx90a`. Thus ignoring the
xnack completely and passing it twice. This patch fixes that to pass it
once and then separate it like how HIP does.

Change-Id: Ib8d9299326bdececd1de331395e4f3ae9dbea596
@HerrCai0907 HerrCai0907 mentioned this pull request Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants