Skip to content

Conversation

@jwnrt
Copy link

@jwnrt jwnrt commented Apr 3, 2025

AVR uses separate address spaces for data (0) and program (1) memory, however pointer layout is only defined for address space 0 in the data layout string.

The p[n] directive defines pointer layouts where n defaults to 0. When pointer layout is unspecified it defaults to p[n]:64:64:64. (data layout reference)

AVR uses 16-bit pointers for both data and program memory (e.g. for function pointers): https://onlinedocs.microchip.com/oxy/GUID-BD1C16C8-7FA3-4D73-A4BE-241EE05EF592-en-US-7/GUID-5F57457F-3E41-400A-B56C-432188906974.html

Function pointers are 2 bytes in size.

AVR uses separate address spaces for data (0) and program (1) memory,
however pointer layout is only defined for address space 0 in the data
layout string.

The `p[n]` directive defines pointer layouts where `n` defaults to `0`.
When pointer layout is unspecified it defaults to `p[n]:64:64:64`.

AVR uses 16-bit pointers for both data and program memory (e.g. for
function pointers).
@github-actions
Copy link

github-actions bot commented Apr 3, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:transforms labels Apr 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 3, 2025

@llvm/pr-subscribers-clang

Author: James Wainwright (jwnrt)

Changes

AVR uses separate address spaces for data (0) and program (1) memory, however pointer layout is only defined for address space 0 in the data layout string.

The p[n] directive defines pointer layouts where n defaults to 0. When pointer layout is unspecified it defaults to p[n]:64:64:64. (data layout reference)

AVR uses 16-bit pointers for both data and program memory (e.g. for function pointers): https://onlinedocs.microchip.com/oxy/GUID-BD1C16C8-7FA3-4D73-A4BE-241EE05EF592-en-US-7/GUID-5F57457F-3E41-400A-B56C-432188906974.html

> Function pointers are 2 bytes in size.


Full diff: https://github.com/llvm/llvm-project/pull/134254.diff

7 Files Affected:

  • (modified) clang/lib/Basic/Targets/AVR.h (+2-1)
  • (modified) llvm/lib/Target/AVR/AVRTargetMachine.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll (+1-1)
  • (modified) llvm/test/CodeGen/AVR/shift-expand.ll (+1-1)
  • (modified) llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll (+1-1)
  • (modified) llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll (+1-1)
  • (modified) llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll (+1-1)
diff --git a/clang/lib/Basic/Targets/AVR.h b/clang/lib/Basic/Targets/AVR.h
index 2117ab58e6f30..301d405adba7e 100644
--- a/clang/lib/Basic/Targets/AVR.h
+++ b/clang/lib/Basic/Targets/AVR.h
@@ -57,7 +57,8 @@ class LLVM_LIBRARY_VISIBILITY AVRTargetInfo : public TargetInfo {
     Int16Type = SignedInt;
     Char32Type = UnsignedLong;
     SigAtomicType = SignedChar;
-    resetDataLayout("e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8");
+    resetDataLayout(
+        "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8");
   }
 
   void getTargetDefines(const LangOptions &Opts,
diff --git a/llvm/lib/Target/AVR/AVRTargetMachine.cpp b/llvm/lib/Target/AVR/AVRTargetMachine.cpp
index 579f7ac712313..915c5848a2d79 100644
--- a/llvm/lib/Target/AVR/AVRTargetMachine.cpp
+++ b/llvm/lib/Target/AVR/AVRTargetMachine.cpp
@@ -27,7 +27,7 @@
 namespace llvm {
 
 static const char *AVRDataLayout =
-    "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8";
+    "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8";
 
 /// Processes a CPU name.
 static StringRef getCPU(StringRef CPU) {
diff --git a/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll b/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
index 6444c5c30fbe2..fdf9eb39b2a10 100644
--- a/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
+++ b/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
@@ -11,7 +11,7 @@
 ; This would cause a load of uninitialized memory, not even
 ; touching the program's machine code as otherwise desired.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 ; CHECK-LABEL: load_with_no_forward_reference
 define i8 @load_with_no_forward_reference(i8 %a, i8 %b) {
diff --git a/llvm/test/CodeGen/AVR/shift-expand.ll b/llvm/test/CodeGen/AVR/shift-expand.ll
index be075e5d30394..fb5b5a14c66c8 100644
--- a/llvm/test/CodeGen/AVR/shift-expand.ll
+++ b/llvm/test/CodeGen/AVR/shift-expand.ll
@@ -5,7 +5,7 @@
 ; amount to a loop. These loops avoid generating a (non-existing) builtin such
 ; as __ashlsi3.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 target triple = "avr"
 
 define i16 @shl16(i16 %value, i16 %amount) addrspace(1) {
diff --git a/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll b/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
index 6cabc5bb8f3a9..f56097fbd6deb 100644
--- a/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
+++ b/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
@@ -4,7 +4,7 @@
 ; ArgumentPromotion should preserve the default function address space
 ; from the data layout.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 @g = common global i32 0, align 4
 
diff --git a/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll b/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll
index b588a399e5bd9..4a75fb7b6d5d1 100644
--- a/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll
+++ b/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll
@@ -5,7 +5,7 @@
 ; ArgumentPromotion should preserve the default function address space
 ; from the data layout.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 @g = common global i32 0, align 4
 
diff --git a/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll b/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll
index ddd9aaac628d5..c992e23bb3cfe 100644
--- a/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll
+++ b/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll
@@ -3,7 +3,7 @@
 ; DeadArgumentElimination should respect the function address space
 ; in the data layout.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 ; CHECK: define internal i32 @foo() addrspace(1)
 define internal i32 @foo(i32 %x) #0 {

@llvmbot
Copy link
Member

llvmbot commented Apr 3, 2025

@llvm/pr-subscribers-llvm-transforms

Author: James Wainwright (jwnrt)

Changes

AVR uses separate address spaces for data (0) and program (1) memory, however pointer layout is only defined for address space 0 in the data layout string.

The p[n] directive defines pointer layouts where n defaults to 0. When pointer layout is unspecified it defaults to p[n]:64:64:64. (data layout reference)

AVR uses 16-bit pointers for both data and program memory (e.g. for function pointers): https://onlinedocs.microchip.com/oxy/GUID-BD1C16C8-7FA3-4D73-A4BE-241EE05EF592-en-US-7/GUID-5F57457F-3E41-400A-B56C-432188906974.html

> Function pointers are 2 bytes in size.


Full diff: https://github.com/llvm/llvm-project/pull/134254.diff

7 Files Affected:

  • (modified) clang/lib/Basic/Targets/AVR.h (+2-1)
  • (modified) llvm/lib/Target/AVR/AVRTargetMachine.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll (+1-1)
  • (modified) llvm/test/CodeGen/AVR/shift-expand.ll (+1-1)
  • (modified) llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll (+1-1)
  • (modified) llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll (+1-1)
  • (modified) llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll (+1-1)
diff --git a/clang/lib/Basic/Targets/AVR.h b/clang/lib/Basic/Targets/AVR.h
index 2117ab58e6f30..301d405adba7e 100644
--- a/clang/lib/Basic/Targets/AVR.h
+++ b/clang/lib/Basic/Targets/AVR.h
@@ -57,7 +57,8 @@ class LLVM_LIBRARY_VISIBILITY AVRTargetInfo : public TargetInfo {
     Int16Type = SignedInt;
     Char32Type = UnsignedLong;
     SigAtomicType = SignedChar;
-    resetDataLayout("e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8");
+    resetDataLayout(
+        "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8");
   }
 
   void getTargetDefines(const LangOptions &Opts,
diff --git a/llvm/lib/Target/AVR/AVRTargetMachine.cpp b/llvm/lib/Target/AVR/AVRTargetMachine.cpp
index 579f7ac712313..915c5848a2d79 100644
--- a/llvm/lib/Target/AVR/AVRTargetMachine.cpp
+++ b/llvm/lib/Target/AVR/AVRTargetMachine.cpp
@@ -27,7 +27,7 @@
 namespace llvm {
 
 static const char *AVRDataLayout =
-    "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8";
+    "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8";
 
 /// Processes a CPU name.
 static StringRef getCPU(StringRef CPU) {
diff --git a/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll b/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
index 6444c5c30fbe2..fdf9eb39b2a10 100644
--- a/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
+++ b/llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
@@ -11,7 +11,7 @@
 ; This would cause a load of uninitialized memory, not even
 ; touching the program's machine code as otherwise desired.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 ; CHECK-LABEL: load_with_no_forward_reference
 define i8 @load_with_no_forward_reference(i8 %a, i8 %b) {
diff --git a/llvm/test/CodeGen/AVR/shift-expand.ll b/llvm/test/CodeGen/AVR/shift-expand.ll
index be075e5d30394..fb5b5a14c66c8 100644
--- a/llvm/test/CodeGen/AVR/shift-expand.ll
+++ b/llvm/test/CodeGen/AVR/shift-expand.ll
@@ -5,7 +5,7 @@
 ; amount to a loop. These loops avoid generating a (non-existing) builtin such
 ; as __ashlsi3.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 target triple = "avr"
 
 define i16 @shl16(i16 %value, i16 %amount) addrspace(1) {
diff --git a/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll b/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
index 6cabc5bb8f3a9..f56097fbd6deb 100644
--- a/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
+++ b/llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
@@ -4,7 +4,7 @@
 ; ArgumentPromotion should preserve the default function address space
 ; from the data layout.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 @g = common global i32 0, align 4
 
diff --git a/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll b/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll
index b588a399e5bd9..4a75fb7b6d5d1 100644
--- a/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll
+++ b/llvm/test/Transforms/Attributor/ArgumentPromotion/nonzero-address-spaces.ll
@@ -5,7 +5,7 @@
 ; ArgumentPromotion should preserve the default function address space
 ; from the data layout.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 @g = common global i32 0, align 4
 
diff --git a/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll b/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll
index ddd9aaac628d5..c992e23bb3cfe 100644
--- a/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll
+++ b/llvm/test/Transforms/DeadArgElim/nonzero-address-spaces.ll
@@ -3,7 +3,7 @@
 ; DeadArgumentElimination should respect the function address space
 ; in the data layout.
 
-target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+target datalayout = "e-P1-p0:16:8-p1:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
 
 ; CHECK: define internal i32 @foo() addrspace(1)
 define internal i32 @foo(i32 %x) #0 {

@jwnrt
Copy link
Author

jwnrt commented Apr 3, 2025

CC @benshi001

To be 100% honest I haven't tested this on AVR silicon yet. My motivation for the change is to allow for non-zero data address spaces in Rust which parses these LLVM data format specs, but would then use both 64-bit and 16-bit pointers for AVR. Apologies if this is bad practice!

@nikic nikic requested a review from benshi001 April 3, 2025 14:43
@benshi001
Copy link
Member

CC @benshi001

To be 100% honest I haven't tested this on AVR silicon yet. My motivation for the change is to allow for non-zero data address spaces in Rust which parses these LLVM data format specs, but would then use both 64-bit and 16-bit pointers for AVR. Apologies if this is bad practice!

Thanks! I will have a look later.

@benshi001
Copy link
Member

benshi001 commented Apr 8, 2025

I find there is not change in the size of addrspace 0/1 pointers, before and after your patch.

for program

// clang a.c --target=avr -mmcu=atmega2560
int gf(int a);
int (*p)(int) = gf;
int f;
int *p1 = &f;

the generated assembly is unique by before/after applying your patch.

        .type   p,@object                       ; @p
        .data
        .globl  p
p:
        .short  pm(gf)
        .size   p, 2

        .type   f,@object                       ; @f
        .section        .bss,"aw",@nobits
        .globl  f
f:
        .short  0                               ; 0x0
        .size   f, 2

        .type   p1,@object                      ; @p1
        .data
        .globl  p1
p1:
        .short  f
        .size   p1, 2

Copy link
Member

@benshi001 benshi001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please give me an example to show the necessity of this patch? Is it possible to control the rust front end to generate explicit data layout, other than change llvm's default? Since llvm-avr serves multiple languages, not only rust.

@jwnrt
Copy link
Author

jwnrt commented May 7, 2025

Sorry for the long delay.

I think you're right that there's no difference to the RTL. I need to test this properly and learn more about address spaces. In particular, I think address space zero has some special casing around it being the generic / fallback address space that interacts here.

I hope I didn't waste too much of your time here.

@jwnrt jwnrt marked this pull request as draft May 7, 2025 09:46
@benshi001 benshi001 closed this Jun 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants