Skip to content

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Oct 11, 2025

Closes #95219

Copy link
Contributor Author

arsenm commented Oct 11, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@arsenm arsenm requested review from aengelke and slinder1 October 11, 2025 17:32
@arsenm arsenm marked this pull request as ready for review October 11, 2025 17:32
@llvmbot
Copy link
Member

llvmbot commented Oct 11, 2025

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Closes #95219


Full diff: https://github.com/llvm/llvm-project/pull/163011.diff

6 Files Affected:

  • (modified) llvm/lib/TargetParser/TargetDataLayout.cpp (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/global-constant.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll (+4-4)
  • (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected (+4-4)
diff --git a/llvm/lib/TargetParser/TargetDataLayout.cpp b/llvm/lib/TargetParser/TargetDataLayout.cpp
index cea246e9527bd..950bb2bc557b4 100644
--- a/llvm/lib/TargetParser/TargetDataLayout.cpp
+++ b/llvm/lib/TargetParser/TargetDataLayout.cpp
@@ -258,7 +258,7 @@ static std::string computePowerDataLayout(const Triple &T) {
 static std::string computeAMDDataLayout(const Triple &TT) {
   if (TT.getArch() == Triple::r600) {
     // 32-bit pointers.
-    return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128"
+    return "e-m:e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128"
            "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1";
   }
 
@@ -268,7 +268,7 @@ static std::string computeAMDDataLayout(const Triple &TT) {
   // (address space 7), and 128-bit non-integral buffer resourcees (address
   // space 8) which cannot be non-trivilally accessed by LLVM memory operations
   // like getelementptr.
-  return "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
+  return "e-m:e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
          "-p7:160:256:256:32-p8:128:128:128:48-p9:192:256:256:32-i64:64-"
          "v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-"
          "v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9";
diff --git a/llvm/test/CodeGen/AMDGPU/global-constant.ll b/llvm/test/CodeGen/AMDGPU/global-constant.ll
index 866d3a1e30890..b04602aff8e6a 100644
--- a/llvm/test/CodeGen/AMDGPU/global-constant.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-constant.ll
@@ -12,21 +12,21 @@
 
 ; Non-R600 OSes use relocations.
 ; GCN-DEFAULT: s_getpc_b64 s[[[PC0_LO:[0-9]+]]:[[PC0_HI:[0-9]+]]]
-; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC0_LO]], private1@rel32@lo+4
-; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC0_HI]], private1@rel32@hi+12
+; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC0_LO]], .Lprivate1@rel32@lo+4
+; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC0_HI]], .Lprivate1@rel32@hi+12
 ; GCN-DEFAULT: s_getpc_b64 s[[[PC1_LO:[0-9]+]]:[[PC1_HI:[0-9]+]]]
-; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC1_LO]], private2@rel32@lo+4
-; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC1_HI]], private2@rel32@hi+12
+; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC1_LO]], .Lprivate2@rel32@lo+4
+; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC1_HI]], .Lprivate2@rel32@hi+12
 
 ; MESA uses absolute relocations.
-; GCN-MESA: s_add_u32 s2, private1@abs32@lo, s4
-; GCN-MESA: s_addc_u32 s3, private1@abs32@hi, s5
+; GCN-MESA: s_add_u32 s2, .Lprivate1@abs32@lo, s4
+; GCN-MESA: s_addc_u32 s3, .Lprivate1@abs32@hi, s5
 
 ; PAL uses absolute relocations.
-; GCN-PAL:    s_add_u32 s2, private1@abs32@lo, s4
-; GCN-PAL:    s_addc_u32 s3, private1@abs32@hi, s5
-; GCN-PAL:    s_add_u32 s4, private2@abs32@lo, s4
-; GCN-PAL:    s_addc_u32 s5, private2@abs32@hi, s5
+; GCN-PAL:    s_add_u32 s2, .Lprivate1@abs32@lo, s4
+; GCN-PAL:    s_addc_u32 s3, .Lprivate1@abs32@hi, s5
+; GCN-PAL:    s_add_u32 s4, .Lprivate2@abs32@lo, s4
+; GCN-PAL:    s_addc_u32 s5, .Lprivate2@abs32@hi, s5
 
 ; R600-LABEL: private_test
 define amdgpu_kernel void @private_test(i32 %index, ptr addrspace(1) %out) {
diff --git a/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll b/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
index b8cfcbf2d2665..6d55e79edbef6 100644
--- a/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
@@ -14,8 +14,8 @@
 
 ; CHECK-LABEL: private_test:
 ; CHECK: s_getpc_b64 s[[[PC_LO:[0-9]+]]:[[PC_HI:[0-9]+]]]
-; CHECK: s_add_u32 s[[ADDR_LO:[0-9]+]], s[[PC_LO]], private@rel32@lo+8
-; CHECK: s_addc_u32 s[[ADDR_HI:[0-9]+]], s[[PC_HI]], private@rel32@hi+16
+; CHECK: s_add_u32 s[[ADDR_LO:[0-9]+]], s[[PC_LO]], .Lprivate@rel32@lo+8
+; CHECK: s_addc_u32 s[[ADDR_HI:[0-9]+]], s[[PC_HI]], .Lprivate@rel32@hi+16
 ; CHECK: s_load_dword s{{[0-9]+}}, s[[[ADDR_LO]]:[[ADDR_HI]]]
 define amdgpu_kernel void @private_test(ptr addrspace(1) %out) {
   %ptr = getelementptr [256 x i32], ptr addrspace(1) @private, i32 0, i32 1
@@ -153,7 +153,7 @@ define amdgpu_kernel void @external_w_init_test(ptr addrspace(1) %out) {
   ret void
 }
 
-; CHECK: .local private
+; CHECK: .local .Lprivate
 ; CHECK: .local internal
 ; CHECK: .weak linkonce
 ; CHECK: .weak weak
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll b/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
index 63e9eef3297a1..66b795876d70e 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
@@ -315,7 +315,7 @@ define amdgpu_kernel void @test_small_memcpy_i64_global_to_global_align16(ptr ad
 
 ; FUNC-LABEL: {{^}}test_memcpy_const_string_align4:
 ; SI: s_getpc_b64
-; SI: s_add_u32 s{{[0-9]+}}, s{{[0-9]+}}, hello.align4@rel32@lo+4
+; SI: s_add_u32 s{{[0-9]+}}, s{{[0-9]+}}, .Lhello.align4@rel32@lo+4
 ; SI: s_addc_u32
 ; SI-DAG: s_load_dwordx8
 ; SI-DAG: s_load_dwordx2
diff --git a/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll b/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll
index 5ff2d82c1464f..2509497bbcde7 100644
--- a/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll
+++ b/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll
@@ -5,8 +5,8 @@ declare dso_local void @main()
 
 define dso_local void @naked() naked "frame-pointer"="all" {
 ; CHECK-LABEL: naked:
-; CHECK:       naked$local:
-; CHECK-NEXT:    .type naked$local,@function
+; CHECK:       .Lnaked$local:
+; CHECK-NEXT:    .type .Lnaked$local,@function
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:    s_getpc_b64 s[16:17]
@@ -19,8 +19,8 @@ define dso_local void @naked() naked "frame-pointer"="all" {
 
 define dso_local void @normal() "frame-pointer"="all" {
 ; CHECK-LABEL: normal:
-; CHECK:       normal$local:
-; CHECK-NEXT:    .type normal$local,@function
+; CHECK:       .Lnormal$local:
+; CHECK-NEXT:    .type .Lnormal$local,@function
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:    s_mov_b32 s16, s33
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
index 429bee4195fa9..a8c2531117f42 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
@@ -65,8 +65,8 @@ define dso_local i32 @main() #0 {
 
 attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-LABEL: check_boundaries:
-; CHECK:       check_boundaries$local:
-; CHECK-NEXT:    .type check_boundaries$local,@function
+; CHECK:       .Lcheck_boundaries$local:
+; CHECK-NEXT:    .type .Lcheck_boundaries$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
@@ -107,8 +107,8 @@ attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; CHECK-LABEL: main:
-; CHECK:       main$local:
-; CHECK-NEXT:    .type main$local,@function
+; CHECK:       .Lmain$local:
+; CHECK-NEXT:    .type .Lmain$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)

@aengelke
Copy link
Contributor

Clang still has the data layouts hard-coded in lib/Basic/Targets. In an ideal world we'd get rid of that first... some tests also include the data layout, I think that could be removed.

@arsenm arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from d4aca48 to 0a83907 Compare October 12, 2025 02:05
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 12, 2025
@arsenm arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from 0a83907 to a23bc0d Compare October 12, 2025 17:47
@arsenm
Copy link
Contributor Author

arsenm commented Oct 12, 2025

Clang still has the data layouts hard-coded in lib/Basic/Targets. In an ideal world we'd get rid of that first...

@rnk seems to have gotten as far as moving it to TargetParser, but clang isn't using it yet.

some tests also include the data layout, I think that could be removed.

Not sure if lld is one of the tools that auto-fill the datalayout from the triple.

Also it appears we're missing a proper error handling for the datalayout mismatch, these tests are hitting an assertion

@arsenm arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from a23bc0d to 22eaf7d Compare October 12, 2025 17:52
@aengelke
Copy link
Contributor

LGTM once all tests pass

Also it appears we're missing a proper error handling for the datalayout mismatch, these tests are hitting an assertion

Yeah, that's particularly annoying when working with a release build of LLVM.

Is there a use case for specifying a data layout in a module that doesn't match the implied data layout from the triple?

@arsenm arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from 22eaf7d to fc32794 Compare October 13, 2025 02:28
@arsenm arsenm enabled auto-merge (squash) October 13, 2025 02:29
@arsenm arsenm merged commit 853760b into main Oct 13, 2025
9 of 10 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/use-elf-mangling-datalayout branch October 13, 2025 03:01
DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the unfortunate byproduct of making every HIP/OpenMP compilation emit like 50 lines of warnings about conflicting data layouts with the ROCm Device libs.

@jhuber6
Copy link
Contributor

jhuber6 commented Oct 15, 2025

We probably need an exception or something in the IR linker to avoid printing warnings for a case like this. I think we do the exact same thing for the CUDA device library IR.

@arsenm
Copy link
Contributor Author

arsenm commented Oct 16, 2025

We should just delete the hardcoded datalayouts out of device libs (really we should delete the IR there altogether, but the datalayout is unnecessary)

@jhuber6
Copy link
Contributor

jhuber6 commented Oct 16, 2025

We should just delete the hardcoded datalayouts out of device libs (really we should delete the IR there altogether, but the datalayout is unnecessary)

It will take quite some time for those changes to land and get propagated from ROCm and it's unacceptable for a trivial OpenMP / HIP program to emit 50+ lines of warnings for six months.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category lld:ELF lld

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants