[AMDGPU] Use ELF mangling in data layout #95219

aengelke · 2024-06-12T09:57:08Z

Previously, the MCAsmInfo specified ELF mangling whereas the data layout specified no mangling. Avoid this inconsistency by adjusting the data layout to use ELF mangling.

For context: there's a getPrivateGlobalPrefix in the DataLayout and in the MCAsmInfo. The MAI version is mainly used in MCContext, which has no DataLayout, but determines whether a symbol gets added to the symbol table in the output. I think both versions of the prefix should be identical so that private symbols never end up in the symbol table, but they weren't identical for AMDGPU.

I'm not sure whether this is correct, causes problems down the line, etc., but I stumbled across this inconsistency and just wanted to let you know it exists. :)

Previously, the MCAsmInfo specified ELF mangling whereas the data layout specified no mangling. Avoid this inconsistency by adjusting the data layout to use ELF mangling.

llvmbot · 2024-06-12T09:57:38Z

@llvm/pr-subscribers-backend-amdgpu

Author: None (aengelke)

Changes

Previously, the MCAsmInfo specified ELF mangling whereas the data layout specified no mangling. Avoid this inconsistency by adjusting the data layout to use ELF mangling.

For context: there's a getPrivateGlobalPrefix in the DataLayout and in the MCAsmInfo. The MAI version is mainly used in MCContext, which has no DataLayout, but determines whether a symbol get's added to the symbol table in the output. I think both versions of the prefix should be identical so that private symbols never end up in the symbol table, but they weren't identical for AMDGPU.

I'm not sure whether this is correct, causes problems down the line, etc., but I stumbled across this inconsistency and just wanted to let you know it exists. :)

Full diff: https://github.com/llvm/llvm-project/pull/95219.diff

6 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+2-2)
(modified) llvm/test/CodeGen/AMDGPU/global-constant.ll (+10-10)
(modified) llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll (+3-3)
(modified) llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll (+1-1)
(modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected (+4-4)
(modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.nogenerated.expected (+4-4)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index ce997c659094a..d7574e57b3533 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -547,7 +547,7 @@ static MachineSchedRegistry GCNILPSchedRegistry(
 static StringRef computeDataLayout(const Triple &TT) {
   if (TT.getArch() == Triple::r600) {
     // 32-bit pointers.
-    return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128"
+    return "e-m:e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128"
            "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1";
   }
 
@@ -557,7 +557,7 @@ static StringRef computeDataLayout(const Triple &TT) {
   // (address space 7), and 128-bit non-integral buffer resourcees (address
   // space 8) which cannot be non-trivilally accessed by LLVM memory operations
   // like getelementptr.
-  return "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
+  return "e-m:e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
          "-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-"
          "v32:32-v48:64-v96:"
          "128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-"
diff --git a/llvm/test/CodeGen/AMDGPU/global-constant.ll b/llvm/test/CodeGen/AMDGPU/global-constant.ll
index 38b9c5df7faa1..2d5021782c5b8 100644
--- a/llvm/test/CodeGen/AMDGPU/global-constant.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-constant.ll
@@ -12,21 +12,21 @@
 
 ; Non-R600 OSes use relocations.
 ; GCN-DEFAULT: s_getpc_b64 s[[[PC0_LO:[0-9]+]]:[[PC0_HI:[0-9]+]]]
-; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC0_LO]], private1@rel32@lo+4
-; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC0_HI]], private1@rel32@hi+12
+; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC0_LO]], .Lprivate1@rel32@lo+4
+; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC0_HI]], .Lprivate1@rel32@hi+12
 ; GCN-DEFAULT: s_getpc_b64 s[[[PC1_LO:[0-9]+]]:[[PC1_HI:[0-9]+]]]
-; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC1_LO]], private2@rel32@lo+4
-; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC1_HI]], private2@rel32@hi+12
+; GCN-DEFAULT: s_add_u32 s{{[0-9]+}}, s[[PC1_LO]], .Lprivate2@rel32@lo+4
+; GCN-DEFAULT: s_addc_u32 s{{[0-9]+}}, s[[PC1_HI]], .Lprivate2@rel32@hi+12
 
 ; MESA uses absolute relocations.
-; GCN-MESA: s_add_u32 s2, private1@abs32@lo, s4
-; GCN-MESA: s_addc_u32 s3, private1@abs32@hi, s5
+; GCN-MESA: s_add_u32 s2, .Lprivate1@abs32@lo, s4
+; GCN-MESA: s_addc_u32 s3, .Lprivate1@abs32@hi, s5
 
 ; PAL uses absolute relocations.
-; GCN-PAL:    s_add_u32 s2, private1@abs32@lo, s4
-; GCN-PAL:    s_addc_u32 s3, private1@abs32@hi, s5
-; GCN-PAL:    s_add_u32 s4, private2@abs32@lo, s4
-; GCN-PAL:    s_addc_u32 s5, private2@abs32@hi, s5
+; GCN-PAL:    s_add_u32 s2, .Lprivate1@abs32@lo, s4
+; GCN-PAL:    s_addc_u32 s3, .Lprivate1@abs32@hi, s5
+; GCN-PAL:    s_add_u32 s4, .Lprivate2@abs32@lo, s4
+; GCN-PAL:    s_addc_u32 s5, .Lprivate2@abs32@hi, s5
 
 ; R600-LABEL: private_test
 define amdgpu_kernel void @private_test(i32 %index, ptr addrspace(1) %out) {
diff --git a/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll b/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
index b8cfcbf2d2665..6d55e79edbef6 100644
--- a/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
@@ -14,8 +14,8 @@
 
 ; CHECK-LABEL: private_test:
 ; CHECK: s_getpc_b64 s[[[PC_LO:[0-9]+]]:[[PC_HI:[0-9]+]]]
-; CHECK: s_add_u32 s[[ADDR_LO:[0-9]+]], s[[PC_LO]], private@rel32@lo+8
-; CHECK: s_addc_u32 s[[ADDR_HI:[0-9]+]], s[[PC_HI]], private@rel32@hi+16
+; CHECK: s_add_u32 s[[ADDR_LO:[0-9]+]], s[[PC_LO]], .Lprivate@rel32@lo+8
+; CHECK: s_addc_u32 s[[ADDR_HI:[0-9]+]], s[[PC_HI]], .Lprivate@rel32@hi+16
 ; CHECK: s_load_dword s{{[0-9]+}}, s[[[ADDR_LO]]:[[ADDR_HI]]]
 define amdgpu_kernel void @private_test(ptr addrspace(1) %out) {
   %ptr = getelementptr [256 x i32], ptr addrspace(1) @private, i32 0, i32 1
@@ -153,7 +153,7 @@ define amdgpu_kernel void @external_w_init_test(ptr addrspace(1) %out) {
   ret void
 }
 
-; CHECK: .local private
+; CHECK: .local .Lprivate
 ; CHECK: .local internal
 ; CHECK: .weak linkonce
 ; CHECK: .weak weak
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll b/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
index 41e8762311306..ac64d19043a3a 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.memcpy.ll
@@ -315,7 +315,7 @@ define amdgpu_kernel void @test_small_memcpy_i64_global_to_global_align16(ptr ad
 
 ; FUNC-LABEL: {{^}}test_memcpy_const_string_align4:
 ; SI: s_getpc_b64
-; SI: s_add_u32 s{{[0-9]+}}, s{{[0-9]+}}, hello.align4@rel32@lo+4
+; SI: s_add_u32 s{{[0-9]+}}, s{{[0-9]+}}, .Lhello.align4@rel32@lo+4
 ; SI: s_addc_u32
 ; SI-DAG: s_load_dwordx8
 ; SI-DAG: s_load_dwordx2
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
index d1500e002d7e9..6d5ad3704119a 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.generated.expected
@@ -65,8 +65,8 @@ define dso_local i32 @main() #0 {
 
 attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-LABEL: check_boundaries:
-; CHECK:       check_boundaries$local:
-; CHECK-NEXT:    .type check_boundaries$local,@function
+; CHECK:       .Lcheck_boundaries$local:
+; CHECK-NEXT:    .type .Lcheck_boundaries$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
@@ -107,8 +107,8 @@ attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; CHECK-LABEL: main:
-; CHECK:       main$local:
-; CHECK-NEXT:    .type main$local,@function
+; CHECK:       .Lmain$local:
+; CHECK-NEXT:    .type .Lmain$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.nogenerated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.nogenerated.expected
index deadc4adb02c5..0d5afc6c36b60 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.nogenerated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_generated_funcs.ll.nogenerated.expected
@@ -6,8 +6,8 @@
 
 define dso_local i32 @check_boundaries() #0 {
 ; CHECK-LABEL: check_boundaries:
-; CHECK:       check_boundaries$local:
-; CHECK-NEXT:    .type check_boundaries$local,@function
+; CHECK:       .Lcheck_boundaries$local:
+; CHECK-NEXT:    .type .Lcheck_boundaries$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
@@ -84,8 +84,8 @@ define dso_local i32 @check_boundaries() #0 {
 
 define dso_local i32 @main() #0 {
 ; CHECK-LABEL: main:
-; CHECK:       main$local:
-; CHECK-NEXT:    .type main$local,@function
+; CHECK:       .Lmain$local:
+; CHECK-NEXT:    .type .Lmain$local,@function
 ; CHECK-NEXT:    .cfi_startproc
 ; CHECK-NEXT:  ; %bb.0:
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)

arsenm · 2024-06-12T10:53:43Z

What is the datalayout default, if not elf?

aengelke · 2024-06-12T10:56:33Z

The mangling mode currently is MM_None.

ssahasra · 2024-06-12T11:37:20Z

Looks straightforward enough to me, but I don't really have an opinion. @kzhuravl, do we need to check with the loader or the runtime about the default mangling?

slinder1

LGTM, I wonder if we want to mass update any tests with a datalayout, or if that isn't worth the churn?

I don't imagine this will have an impact on the loader or anything else downstream, or at least if it does it represents a bug there rather than in the compiler.

arsenm · 2025-10-11T02:15:38Z

This should be resolved for the move of the DL specification to TargetParser

Closes #95219

aengelke · 2025-10-11T19:45:22Z

Closing in favor of #163011. Modifying data layout strings remains a mess.

Closes #95219

Closes llvm#95219

[AMDGPU] Use ELF mangling in data layout

621a031

Previously, the MCAsmInfo specified ELF mangling whereas the data layout specified no mangling. Avoid this inconsistency by adjusting the data layout to use ELF mangling.

aengelke requested review from arsenm, jayfoad and nhaehnle June 12, 2024 09:57

llvmbot added the backend:AMDGPU label Jun 12, 2024

arsenm requested review from kzhuravl, slinder1, ssahasra and t-tye June 12, 2024 10:52

arsenm approved these changes Jun 12, 2024

View reviewed changes

slinder1 approved these changes Jul 11, 2025

View reviewed changes

arsenm added a commit that referenced this pull request Oct 11, 2025

AMDGPU: Use ELF mangling in data layout

d4aca48

Closes #95219

arsenm mentioned this pull request Oct 11, 2025

AMDGPU: Use ELF mangling in data layout #163011

Merged

aengelke closed this Oct 11, 2025

arsenm added a commit that referenced this pull request Oct 12, 2025

AMDGPU: Use ELF mangling in data layout

0a83907

Closes #95219

arsenm added a commit that referenced this pull request Oct 12, 2025

AMDGPU: Use ELF mangling in data layout

a23bc0d

Closes #95219

arsenm added a commit that referenced this pull request Oct 12, 2025

AMDGPU: Use ELF mangling in data layout

22eaf7d

Closes #95219

arsenm added a commit that referenced this pull request Oct 13, 2025

AMDGPU: Use ELF mangling in data layout

fc32794

Closes #95219

arsenm added a commit that referenced this pull request Oct 13, 2025

AMDGPU: Use ELF mangling in data layout (#163011)

853760b

Closes #95219

DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025

AMDGPU: Use ELF mangling in data layout (llvm#163011)

b6a9802

Closes llvm#95219

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AMDGPU] Use ELF mangling in data layout #95219

[AMDGPU] Use ELF mangling in data layout #95219

Uh oh!

aengelke commented Jun 12, 2024 •

edited by arsenm

Loading

Uh oh!

llvmbot commented Jun 12, 2024

Uh oh!

arsenm commented Jun 12, 2024

Uh oh!

aengelke commented Jun 12, 2024

Uh oh!

ssahasra commented Jun 12, 2024

Uh oh!

slinder1 left a comment

Uh oh!

arsenm commented Oct 11, 2025

Uh oh!

aengelke commented Oct 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

[AMDGPU] Use ELF mangling in data layout #95219

[AMDGPU] Use ELF mangling in data layout #95219

Uh oh!

Conversation

aengelke commented Jun 12, 2024 • edited by arsenm Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jun 12, 2024

Uh oh!

arsenm commented Jun 12, 2024

Uh oh!

aengelke commented Jun 12, 2024

Uh oh!

ssahasra commented Jun 12, 2024

Uh oh!

slinder1 left a comment

Choose a reason for hiding this comment

Uh oh!

arsenm commented Oct 11, 2025

Uh oh!

aengelke commented Oct 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

aengelke commented Jun 12, 2024 •

edited by arsenm

Loading