[NVPTX] Annotate CUDA kernel pointer arguments with .ptr .space .align attributes. #79646

Vandana2896 · 2024-01-26T21:05:09Z

The current issue is that loads and stores that could be vectorized are not vectorized in the generated PTX.

We noticed that vectorization was missing for sin, cos, and power operations compiled through LLVM, which resulted in a smaller speedup. The reason is that we currently do not generate any .ptr or .align attributes for kernel parameters in CUDA, so the required alignment information is missing and vectorization opportunities are lost.
This change adds the .align attribute for alignment information and the .ptr attribute for kernel pointer parameters, under the assumption that all kernel pointer parameters point to the global memory space. This enables vectorization and improves the speedup by ~2x.
.align is emitted only when the pointer has an explicit alignment specifier.

PTX ISA doc - https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-parameter-attribute-ptr
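To illustrate why the missing alignment information blocks vectorization, here is a minimal sketch of the kind of decision a PTX consumer has to make. `widestVectorWidth` is a hypothetical helper, not LLVM or ptxas code, and it assumes vector accesses of 2 or 4 elements with a 16-byte cap per access.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns how many elements of
// elemBytes each can be covered by a single vector access when the base
// pointer is only known to be alignBytes-aligned.
// Assumptions: vector widths are 2 or 4 elements, and one access moves at
// most 16 bytes.
unsigned widestVectorWidth(std::uint64_t alignBytes, std::uint64_t elemBytes) {
  const std::uint64_t kMaxAccessBytes = 16; // assumed cap per access
  if (alignBytes == 0 || elemBytes == 0)
    return 1; // nothing is known, fall back to scalar accesses
  for (unsigned width = 4; width > 1; width /= 2) {
    const std::uint64_t bytes = width * elemBytes;
    // The whole vector must fit in one access and start on an aligned address.
    if (bytes <= kMaxAccessBytes && alignBytes % bytes == 0)
      return width;
  }
  return 1;
}
```

Under these assumptions, a `float` pointer known to be 16-byte aligned allows a 4-wide access, while the same pointer with unknown (1-byte) alignment only allows scalar accesses.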

Vandana2896 · 2024-01-26T21:37:40Z

@Artem-B could you review this?

github-actions · 2024-02-02T03:47:37Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Vandana2896 · 2024-02-20T22:51:55Z

@Artem-B could you review this? Thanks.

Artem-B

LGTM in general, but there's a puzzle.

Artem-B

LGTM overall.

Artem-B

Just in case -- do all other LLVM tests pass?

If they do, you're good to go.

Artem-B · 2024-05-20T17:46:45Z

llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp

            // CUDA kernels assume that pointers are in global address space
            // See:
            // https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parameter-state-space
-            assert(addrSpace == 0 && "Invalid address space");


I think this assertion is still (partially?) valid. The only case when we want to emit .global is if the pointer is generic or explicitly in the global AS.
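The rule described in this comment can be sketched as a small mapping from the LLVM address-space number to the PTX state-space qualifier printed after `.ptr`. `ptrStateSpace` is a hypothetical helper written for illustration, not the code in NVPTXAsmPrinter.cpp; treating the generic address space as `.global` follows the CUDA assumption quoted in the code comment above and may differ from what was eventually merged.

```cpp
#include <string>

// NVPTX address-space numbers as they appear in LLVM IR.
enum AddressSpace : unsigned {
  Generic = 0,
  Global = 1,
  Shared = 3,
  Const = 4,
  Local = 5,
};

// Hypothetical helper: returns the state-space qualifier for a kernel pointer
// parameter, or an empty string when the address space is not recognized.
std::string ptrStateSpace(unsigned addrSpace) {
  switch (addrSpace) {
  case Generic: // assumption: CUDA kernel pointers in the generic AS are global
  case Global:
    return ".global";
  case Shared:
    return ".shared";
  case Const:
    return ".const";
  case Local:
    return ".local";
  default:
    return ""; // unknown address space: emit no qualifier
  }
}
```

With this mapping, only address spaces 0 and 1 produce `.global`; an `addrspace(3)` pointer produces `.shared` instead.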

Artem-B · 2024-05-20T17:50:24Z

llvm/test/CodeGen/NVPTX/kernel-param-align.ll

+; CHECK-LABEL: func_align
+; CHECK: .param .u64 .ptr .global .align 16 func_align_param_0,
+; CHECK: .param .u64 .ptr .global func_align_param_1,
+; CHECK: .param .u32 .ptr .global func_align_param_2


This is still wrong. addrspace(3) is not a global pointer. It's a shared AS pointer.

Speaking of which, please also add a test for explicitly annotated const and global AS pointers.

LewisCrawford · 2024-11-04T21:40:45Z

I've been asked to pick up the upstreaming of this patch, so I've created a new PR in #114874 to build on the existing commits here, and address the review feedback so far.

I'll close this PR, and we can continue the discussion over on #114874 to see whether the latest comments are suitably addressed now.

…114874)

Emit .ptr, .address-space, and .align attributes for kernel args in CUDA (previously handled only for OpenCL). This allows for more vectorization opportunities if the PTX consumer is able to know about the pointer alignments. If no alignment is explicitly specified, .align 1 will be emitted to match the LLVM IR semantics in this case.

PTX ISA doc - https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-parameter-attribute-ptr

This is a rework of the original patch proposed in #79646

---------

Co-authored-by: Vandana <[email protected]>
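As a rough illustration of the behavior described in this commit message, the sketch below formats a kernel pointer parameter declaration and falls back to `.align 1` when no alignment was specified. `kernelPtrParam` and its parameters are hypothetical names chosen for this example; the real logic lives in NVPTXAsmPrinter.cpp and is not reproduced here.

```cpp
#include <string>

// Hypothetical formatter (not the NVPTXAsmPrinter code).
//   name          parameter name, e.g. "func_align_param_0"
//   ptrBits       pointer width in bits (32 or 64)
//   space         state-space qualifier, e.g. ".global" or ".shared"
//   explicitAlign alignment from the IR, or 0 when none was specified
std::string kernelPtrParam(const std::string &name, unsigned ptrBits,
                           const std::string &space, unsigned explicitAlign) {
  // With no explicit alignment, nothing stronger than 1 byte may be assumed.
  const unsigned align = explicitAlign == 0 ? 1 : explicitAlign;
  return ".param .u" + std::to_string(ptrBits) + " .ptr " + space +
         " .align " + std::to_string(align) + " " + name;
}
```

For example, `kernelPtrParam("func_align_param_0", 64, ".global", 16)` yields `.param .u64 .ptr .global .align 16 func_align_param_0`, which has the same shape as the CHECK line in the test under review.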

Enable .ptr .global .align attributes for kernel attributes for CUDA

d5bd021

Vandana2896 changed the title ~~Enable .ptr .global .align attributes for kernel attributes for CUDA~~ [NVPTX] Enable .ptr .global .align attributes for kernel attributes for CUDA Jan 26, 2024

Artem-B reviewed Jan 26, 2024

Artem-B changed the title ~~[NVPTX] Enable .ptr .global .align attributes for kernel attributes for CUDA~~ [NVPTX] Annotate CUDA kernel pointer arguments with .ptr .space .align attributes. Jan 26, 2024

Rearrange code, add comment

a952cc1

Vandana2896 and others added 2 commits February 1, 2024 19:50

Fixed clang formatting

f5e7276

Update NVPTXAsmPrinter.cpp

761d8a0

Clang formatting reverted.

Update NVPTXAsmPrinter.cpp

3d49f30

Artem-B reviewed Feb 21, 2024

Vandana2896 added 3 commits March 11, 2024 04:28

Update .global and .align

424667b

Merge branch 'main' of https://github.com/Vandana2896/llvm-project in…

637da98

…to main

Fix comment

3912211

Artem-B approved these changes Apr 4, 2024

add addrspace

ce29dc1

Artem-B reviewed Apr 25, 2024

Artem-B reviewed Apr 25, 2024

Vandana2896 added 2 commits April 25, 2024 10:58

Fix typo

fbedea5

CHECK-LABEL

75cce1b

Artem-B reviewed Apr 25, 2024

Update testcase

9fc5d6c

Artem-B approved these changes Apr 29, 2024

Vandana2896 and others added 2 commits April 30, 2024 08:05

Merge branch 'main' into main

9171d70

upadte test

14291a9

Artem-B requested changes May 20, 2024

LewisCrawford mentioned this pull request Nov 4, 2024

Enable .ptr .global .align attributes for kernel attributes for CUDA #114874

Merged

LewisCrawford closed this Nov 4, 2024
