Skip to content

Conversation

@AlexMaclean
Copy link
Member

the PTX target allows an alignment to be specified on both return values and parameters to allow for more efficient vectorized stores. Currently we represent these parameter alignments via the "alignstack" attribute, but must fall back to metadata for the return value. This PR allows "alignstack" on return values as well.

@llvmbot
Copy link
Member

llvmbot commented Mar 8, 2025

@llvm/pr-subscribers-backend-nvptx

@llvm/pr-subscribers-llvm-ir

Author: Alex MacLean (AlexMaclean)

Changes

the PTX target allows an alignment to be specified on both return values and parameters to allow for more efficient vectorized stores. Currently we represent these parameter alignments via the "alignstack" attribute, but must fall back to metadata for the return value. This PR allows "alignstack" on return values as well.


Full diff: https://github.com/llvm/llvm-project/pull/130439.diff

3 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+6-6)
  • (modified) llvm/include/llvm/IR/Attributes.td (+1-1)
  • (modified) llvm/test/CodeGen/NVPTX/param-overalign.ll (+8-3)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 33c85c7ba9d29..8216b15512d9f 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1616,12 +1616,12 @@ Currently, only the following parameter attributes are defined:
 
 ``alignstack(<n>)``
     This indicates the alignment that should be considered by the backend when
-    assigning this parameter to a stack slot during calling convention
-    lowering. The enforcement of the specified alignment is target-dependent,
-    as target-specific calling convention rules may override this value. This
-    attribute serves the purpose of carrying language specific alignment
-    information that is not mapped to base types in the backend (for example,
-    over-alignment specification through language attributes).
+    assigning this parameter or return value to a stack slot during calling
+    convention lowering. The enforcement of the specified alignment is
+    target-dependent, as target-specific calling convention rules may override
+    this value. This attribute serves the purpose of carrying language specific
+    alignment information that is not mapped to base types in the backend (for
+    example, over-alignment specification through language attributes).
 
 ``allocalign``
     The function parameter marked with this attribute is the alignment in bytes of the
diff --git a/llvm/include/llvm/IR/Attributes.td b/llvm/include/llvm/IR/Attributes.td
index 70b9a2c488d3e..fb94926043fc7 100644
--- a/llvm/include/llvm/IR/Attributes.td
+++ b/llvm/include/llvm/IR/Attributes.td
@@ -291,7 +291,7 @@ def SExt : EnumAttr<"signext", IntersectPreserve, [ParamAttr, RetAttr]>;
 
 /// Alignment of stack for function (3 bits)  stored as log2 of alignment with
 /// +1 bias 0 means unaligned (different from alignstack=(1)).
-def StackAlignment : IntAttr<"alignstack", IntersectPreserve, [FnAttr, ParamAttr]>;
+def StackAlignment : IntAttr<"alignstack", IntersectPreserve, [FnAttr, ParamAttr, RetAttr]>;
 
 /// Function can be speculated.
 def Speculatable : EnumAttr<"speculatable", IntersectAnd, [FnAttr]>;
diff --git a/llvm/test/CodeGen/NVPTX/param-overalign.ll b/llvm/test/CodeGen/NVPTX/param-overalign.ll
index b71b50a73f7cb..374475a29ffa1 100644
--- a/llvm/test/CodeGen/NVPTX/param-overalign.ll
+++ b/llvm/test/CodeGen/NVPTX/param-overalign.ll
@@ -45,7 +45,7 @@ define float @caller_md(float %a, float %b) {
   ret float %r
 }
 
-define float @callee_md(%struct.float2 %a) {
+define float @callee_md(%struct.float2 alignstack(8) %a) {
 ; CHECK-LABEL: .visible .func  (.param .b32 func_retval0) callee_md(
 ; CHECK-NEXT:         .param .align 8 .b8 callee_md_param_0[8]
 ; CHECK-NEXT: )
@@ -105,5 +105,10 @@ define float @callee(%struct.float2 alignstack(8) %a ) {
   ret float %2
 }
 
-!nvvm.annotations = !{!0}
-!0 = !{ptr @callee_md, !"align", i32 u0x00010008}
+define alignstack(8) %struct.float2 @aligned_return(%struct.float2 %a ) {
+; CHECK-LABEL: .visible .func  (.param .align 8 .b8 func_retval0[8]) aligned_return(
+; CHECK-NEXT:         .param .align 4 .b8 aligned_return_param_0[8]
+; CHECK-NEXT: )
+; CHECK-NEXT: {
+  ret %struct.float2 %a
+}

@AlexMaclean AlexMaclean requested a review from Artem-B March 17, 2025 16:37
@AlexMaclean
Copy link
Member Author

@nikic Ping for review when you have a moment, thanks!

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AlexMaclean AlexMaclean merged commit 0191307 into llvm:main Mar 17, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants