AMDGPU: Use ELF mangling in data layout #163011

arsenm · 2025-10-11T17:32:03Z

Closes #95219

arsenm · 2025-10-11T17:32:21Z

AMDGPU: Use ELF mangling in data layout #163011
main

This stack of pull requests is managed by Graphite.

llvmbot · 2025-10-11T17:32:38Z

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Closes #95219

Full diff: https://github.com/llvm/llvm-project/pull/163011.diff

6 Files Affected:

(modified) llvm/lib/TargetParser/TargetDataLayout.cpp (+2-2)
(modified) llvm/test/CodeGen/AMDGPU/global-constant.ll (+10-10)
(modified) llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll (+3-3)
(modified) llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll (+1-1)
(modified) llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll (+4-4)
(modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected (+4-4)

diff --git a/llvm/lib/TargetParser/TargetDataLayout.cpp b/llvm/lib/TargetParser/TargetDataLayout.cpp
index cea246e9527bd..950bb2bc557b4 100644
--- a/llvm/lib/TargetParser/TargetDataLayout.cpp
+++ b/llvm/lib/TargetParser/TargetDataLayout.cpp
@@ -258,7 +258,7 @@ static std::string computePowerDataLayout(const Triple &T) {
 static std::string computeAMDDataLayout(const Triple &TT) {
   if (TT.getArch() == Triple::r600) {
     // 32-bit pointers.
-    return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128"
+    return "e-m:e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128"
            "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1";
   }
 
@@ -268,7 +268,7 @@ static std::string computeAMDDataLayout(const Triple &TT) {
   // (address space 7), and 128-bit non-integral buffer resourcees (address
   // space 8) which cannot be non-trivilally accessed by LLVM memory operations
   // like getelementptr.
-  return "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
+  return "e-m:e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
          "-p7:160:256:256:32-p8:128:128:128:48-p9:192:256:256:32-i64:64-"
          "v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-"
          "v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9";
diff --git a/llvm/test/CodeGen/AMDGPU/global-constant.ll b/llvm/test/CodeGen/AMDGPU/global-constant.ll
index 866d3a1e30890..b04602aff8e6a 100644
--- a/llvm/test/CodeGen/AMDGPU/global-constant.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-constant.ll
@@ -12,21 +12,21 @@
 
 ; Non-R600 OSes use relocations.
 ; GCN-DEFAULT: s_getpc_b64 s[[[PC0_LO:[0-9]+]]:[[PC0_HI:[0-9]+]]]
-; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC0_LO]], private1@rel32@lo+4
-; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC0_HI]], private1@rel32@hi+12
+; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC0_LO]], .Lprivate1@rel32@lo+4
+; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC0_HI]], .Lprivate1@rel32@hi+12
 ; GCN-DEFAULT: s_getpc_b64 s[[[PC1_LO:[0-9]+]]:[[PC1_HI:[0-9]+]]]
-; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC1_LO]], private2@rel32@lo+4
-; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC1_HI]], private2@rel32@hi+12
+; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC1_LO]], .Lprivate2@rel32@lo+4
+; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC1_HI]], .Lprivate2@rel32@hi+12
 
 ; MESA uses absolute relocations.
-; GCN-MESA: s_add_u32 s2, private1@abs32@lo, s4
-; GCN-MESA: s_addc_u32 s3, private1@abs32@hi, s5
+; GCN-MESA: s_add_u32 s2, .Lprivate1@abs32@lo, s4
+; GCN-MESA: s_addc_u32 s3, .Lprivate1@abs32@hi, s5
 
 ; PAL uses absolute relocations.
-; GCN-PAL:    s_add_u32 s2, private1@abs32@lo, s4
-; GCN-PAL:    s_addc_u32 s3, private1@abs32@hi, s5
-; GCN-PAL:    s_add_u32 s4, private2@abs32@lo, s4
-; GCN-PAL:    s_addc_u32 s5, private2@abs32@hi, s5
+; GCN-PAL:    s_add_u32 s2, .Lprivate1@abs32@lo, s4
+; GCN-PAL:    s_addc_u32 s3, .Lprivate1@abs32@hi, s5
+; GCN-PAL:    s_add_u32 s4, .Lprivate2@abs32@lo, s4
+; GCN-PAL:    s_addc_u32 s5, .Lprivate2@abs32@hi, s5
 
 ; R600-LABEL: private_test
 define amdgpu_kernel void @private_test(i32 %index, ptr addrspace(1) %out) {
diff --git a/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll b/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
index b8cfcbf2d2665..6d55e79edbef6 100644
--- a/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
@@ -14,8 +14,8 @@
 
 ; CHECK-LABEL: private_test:
 ; CHECK: s_getpc_b64 s[[[PC_LO:[0-9]+]]:[[PC_HI:[0-9]+]]]
-; CHECK: s_add_u32 s[[ADDR_LO:[0-9]+]], s[[PC_LO]], private@rel32@lo+8
-; CHECK: s_addc_u32 s[[ADDR_HI:[0-9]+]], s[[PC_HI]], private@rel32@hi+16
+; CHECK: s_add_u32 s[[ADDR_LO:[0-9]+]], s[[PC_LO]], .Lprivate@rel32@lo+8
+; CHECK: s_addc_u32 s[[ADDR_HI:[0-9]+]], s[[PC_HI]], .Lprivate@rel32@hi+16
 ; CHECK: s_load_dword s{{[0-9]+}}, s[[[ADDR_LO]]:[[ADDR_HI]]]
 define amdgpu_kernel void @private_test(ptr addrspace(1) %out) {
   %ptr = getelementptr [256 x i32], ptr addrspace(1) @private, i32 0, i32 1
@@ -153,7 +153,7 @@ define amdgpu_kernel void @external_w_init_test(ptr addrspace(1) %out) {
   ret void
 }
 
-; CHECK: .local private
+; CHECK: .local .Lprivate
 ; CHECK: .local internal
 ; CHECK: .weak linkonce
 ; CHECK: .weak weak
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll b/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
index 63e9eef3297a1..66b795876d70e 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
@@ -315,7 +315,7 @@ define amdgpu_kernel void @test_small_memcpy_i64_global_to_global_align16(ptr ad
 
 ; FUNC-LABEL: {{^}}test_memcpy_const_string_align4:
 ; SI: s_getpc_b64
-; SI: s_add_u32 s{{[0-9]+}}, s{{[0-9]+}}, hello.align4@rel32@lo+4
+; SI: s_add_u32 s{{[0-9]+}}, s{{[0-9]+}}, .Lhello.align4@rel32@lo+4
 ; SI: s_addc_u32
 ; SI-DAG: s_load_dwordx8
 ; SI-DAG: s_load_dwordx2
diff --git a/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll b/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll
index 5ff2d82c1464f..2509497bbcde7 100644
--- a/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll
+++ b/llvm/test/CodeGen/AMDGPU/naked-fn-with-frame-pointer.ll
@@ -5,8 +5,8 @@ declare dso_local void @main()
 
 define dso_local void @naked() naked "frame-pointer"="all" {
 ; CHECK-LABEL: naked:
-; CHECK:       naked$local:
-; CHECK-NEXT:    .type naked$local,@function
+; CHECK:       .Lnaked$local:
+; CHECK-NEXT:    .type .Lnaked$local,@function
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:    s_getpc_b64 s[16:17]
@@ -19,8 +19,8 @@ define dso_local void @naked() naked "frame-pointer"="all" {
 
 define dso_local void @normal() "frame-pointer"="all" {
 ; CHECK-LABEL: normal:
-; CHECK:       normal$local:
-; CHECK-NEXT:    .type normal$local,@function
+; CHECK:       .Lnormal$local:
+; CHECK-NEXT:    .type .Lnormal$local,@function
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:    s_mov_b32 s16, s33
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
index 429bee4195fa9..a8c2531117f42 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
@@ -65,8 +65,8 @@ define dso_local i32 @main() #0 {
 
 attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-LABEL: check_boundaries:
-; CHECK:       check_boundaries$local:
-; CHECK-NEXT:    .type check_boundaries$local,@function
+; CHECK:       .Lcheck_boundaries$local:
+; CHECK-NEXT:    .type .Lcheck_boundaries$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
@@ -107,8 +107,8 @@ attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; CHECK-LABEL: main:
-; CHECK:       main$local:
-; CHECK-NEXT:    .type main$local,@function
+; CHECK:       .Lmain$local:
+; CHECK-NEXT:    .type .Lmain$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
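
To illustrate what the added `m:e` component changes in the test expectations, here is a minimal, self-contained sketch that does not use LLVM's APIs. The helper names (`manglingMode`, `privatePrefix`, `emittedName`) are hypothetical, and only the two cases visible in this diff are modeled: a layout without an `m:` component leaves private symbol names unprefixed, while `m:e` gives them a `.L` prefix.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: return the value of the "m:" component of a data
// layout string, or "" if there is none. Components are separated by '-'.
std::string manglingMode(const std::string &layout) {
  std::istringstream in(layout);
  std::string component;
  while (std::getline(in, component, '-')) {
    if (component.rfind("m:", 0) == 0) // component starts with "m:"
      return component.substr(2);
  }
  return "";
}

// Models only the two cases seen in this PR: ELF mangling ("e") uses ".L"
// for private symbols; with no mangling component there is no prefix.
std::string privatePrefix(const std::string &layout) {
  return manglingMode(layout) == "e" ? ".L" : "";
}

// Name as it would appear in the assembly output under this simplified model.
std::string emittedName(const std::string &layout, const std::string &name,
                        bool isPrivate) {
  assert(!name.empty() && "symbol name must not be empty");
  return (isPrivate ? privatePrefix(layout) : "") + name;
}
```

Under this model, a private global named `private1` is emitted as `private1` with the old layout and as `.Lprivate1` with the new one, which is the pattern in the updated CHECK lines. Symbols that are not private (for example `internal` in global-variable-relocs.ll) keep their names.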

aengelke · 2025-10-11T19:44:38Z

Clang still has the data layouts hard-coded in lib/Basic/Targets. In an ideal world we'd get rid of that first... Some tests also include the data layout; I think that could be removed.

arsenm · 2025-10-12T17:49:38Z

> Clang still has the data layouts hard-coded in lib/Basic/Targets. In an ideal world we'd get rid of that first...

@rnk seems to have gotten as far as moving it to TargetParser, but clang isn't using it yet.

> Some tests also include the data layout; I think that could be removed.

I'm not sure whether lld is one of the tools that auto-fills the datalayout from the triple.

Also, it appears we're missing proper error handling for the datalayout mismatch; these tests are hitting an assertion.

aengelke · 2025-10-12T17:57:46Z

LGTM once all tests pass

> Also, it appears we're missing proper error handling for the datalayout mismatch; these tests are hitting an assertion.

Yeah, that's particularly annoying when working with a release build of LLVM.

Is there a use case for specifying a data layout in a module that doesn't match the implied data layout from the triple?

jhuber6

This has the unfortunate byproduct of making every HIP/OpenMP compilation emit like 50 lines of warnings about conflicting data layouts with the ROCm Device libs.

jhuber6 · 2025-10-15T20:29:38Z

We probably need an exception or something in the IR linker to avoid printing warnings for a case like this. I think we do the exact same thing for the CUDA device library IR.
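
As a rough sketch of what such an exception could compare (this is not the IR linker's actual logic, and `stripMangling` and `layoutsDifferOnlyInMangling` are hypothetical names), two layout strings can be treated as equivalent for warning purposes when they differ only in the `m:` component:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a data layout string on '-' and drop any
// "m:" (mangling) component, keeping the remaining components in order.
std::vector<std::string> stripMangling(const std::string &layout) {
  std::vector<std::string> kept;
  std::istringstream in(layout);
  std::string component;
  while (std::getline(in, component, '-')) {
    if (component.rfind("m:", 0) != 0) // keep everything except "m:..."
      kept.push_back(component);
  }
  return kept;
}

// True when the two strings are not identical but become identical once
// the mangling component is ignored.
bool layoutsDifferOnlyInMangling(const std::string &a, const std::string &b) {
  return a != b && stripMangling(a) == stripMangling(b);
}
```

A check like this would stay quiet for a library built with the old AMDGPU layout linked against a module using the new one, while still flagging real differences such as pointer sizes. Whether suppressing the warning in that case is acceptable is a policy question this sketch does not answer.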

arsenm · 2025-10-16T00:27:23Z

We should just delete the hardcoded datalayouts from the device libs (really we should delete the IR there altogether, but the datalayout is unnecessary).

jhuber6 · 2025-10-16T01:21:44Z

> We should just delete the hardcoded datalayouts from the device libs (really we should delete the IR there altogether, but the datalayout is unnecessary).

It will take quite some time for those changes to land and propagate from ROCm, and it's unacceptable for a trivial OpenMP / HIP program to emit 50+ lines of warnings for six months.

arsenm added the backend:AMDGPU label Oct 11, 2025 — with Graphite App

arsenm requested review from aengelke and slinder1 October 11, 2025 17:32

arsenm marked this pull request as ready for review October 11, 2025 17:32

aengelke mentioned this pull request Oct 11, 2025

[AMDGPU] Use ELF mangling in data layout #95219

Closed

alexrp mentioned this pull request Oct 12, 2025

Tracking issue for the LLVM 22 upgrade ziglang/zig#24542

Open

21 tasks

arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from d4aca48 to 0a83907 Compare October 12, 2025 02:05

llvmbot added the clang and clang:frontend labels Oct 12, 2025

arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from 0a83907 to a23bc0d Compare October 12, 2025 17:47

llvmbot added the lld and lld:ELF labels Oct 12, 2025

arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from a23bc0d to 22eaf7d Compare October 12, 2025 17:52

AMDGPU: Use ELF mangling in data layout

fc32794

Closes #95219

arsenm force-pushed the users/arsenm/amdgpu/use-elf-mangling-datalayout branch from 22eaf7d to fc32794 Compare October 13, 2025 02:28

arsenm enabled auto-merge (squash) October 13, 2025 02:29

arsenm merged commit 853760b into main Oct 13, 2025
9 of 10 checks passed

arsenm deleted the users/arsenm/amdgpu/use-elf-mangling-datalayout branch October 13, 2025 03:01

DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025

AMDGPU: Use ELF mangling in data layout (llvm#163011)

b6a9802

Closes llvm#95219

akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025

AMDGPU: Use ELF mangling in data layout (llvm#163011)

9139929

Closes llvm#95219

jhuber6 reviewed Oct 15, 2025
