[AMDGPU] Add an option to completely disable kernel argument preload #153975

shiltian · 2025-08-16T19:25:20Z

Setting the existing `amdgpu-kernarg-preload-count` option to 0 does not turn kernel argument preloading off, so it cannot be used as an on/off switch. This PR adds a separate option that disables preloading entirely.

Fixes SWDEV-550147.

llvmbot · 2025-08-16T19:25:49Z

@llvm/pr-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/153975.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp (+8)
(added) llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll (+29)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp
index 984c1ee89309e..a386fe621a553 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp
@@ -37,6 +37,11 @@ static cl::opt<unsigned> KernargPreloadCount(
     "amdgpu-kernarg-preload-count",
     cl::desc("How many kernel arguments to preload onto SGPRs"), cl::init(0));
 
+static cl::opt<bool>
+    EnableKernargPreload("amdgpu-kernarg-preload",
+                         cl::desc("Enable preload kernel arguments to SGPRs"),
+                         cl::init(true));
+
 namespace {
 
 class AMDGPUPreloadKernelArgumentsLegacy : public ModulePass {
@@ -275,6 +280,9 @@ AMDGPUPreloadKernelArgumentsLegacy::AMDGPUPreloadKernelArgumentsLegacy(
     : ModulePass(ID), TM(TM) {}
 
 static bool markKernelArgsAsInreg(Module &M, const TargetMachine &TM) {
+  if (!EnableKernargPreload)
+    return false;
+
   SmallVector<Function *, 4> FunctionsToErase;
   bool Changed = false;
   for (auto &F : M) {
diff --git a/llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll b/llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll
new file mode 100644
index 0000000000000..75aaec6f1fa70
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll
@@ -0,0 +1,29 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -passes=amdgpu-preload-kernel-arguments -amdgpu-kernarg-preload=0 %s -o - | FileCheck -check-prefix=NO-PRELOAD %s
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -passes=amdgpu-preload-kernel-arguments %s -o - | FileCheck -check-prefix=DEFAULT-PRELOAD %s
+
+@g1 = protected addrspace(1) externally_initialized global i16 0, align 2
+
+define amdgpu_kernel void @test_kernel_with_zero_kernel_arg() {
+; NO-PRELOAD-LABEL: define amdgpu_kernel void @test_kernel_with_zero_kernel_arg(
+; NO-PRELOAD-SAME: ) #[[ATTR0:[0-9]+]] {
+; NO-PRELOAD-NEXT:    [[IMPLICITARG_PTR:%.*]] = call ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
+; NO-PRELOAD-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr addrspace(4) [[IMPLICITARG_PTR]], i64 12
+; NO-PRELOAD-NEXT:    [[GROUP_SIZE_X:%.*]] = load i16, ptr addrspace(4) [[GEP]], align 2
+; NO-PRELOAD-NEXT:    store i16 [[GROUP_SIZE_X]], ptr addrspace(1) @g1, align 2
+; NO-PRELOAD-NEXT:    ret void
+;
+; DEFAULT-PRELOAD-LABEL: define amdgpu_kernel void @test_kernel_with_zero_kernel_arg(
+; DEFAULT-PRELOAD-SAME: i32 inreg "amdgpu-hidden-argument" [[_HIDDEN_BLOCK_COUNT_X:%.*]], i32 inreg "amdgpu-hidden-argument" [[_HIDDEN_BLOCK_COUNT_Y:%.*]], i32 inreg "amdgpu-hidden-argument" [[_HIDDEN_BLOCK_COUNT_Z:%.*]], i16 inreg "amdgpu-hidden-argument" [[_HIDDEN_GROUP_SIZE_X:%.*]]) #[[ATTR0:[0-9]+]] {
+; DEFAULT-PRELOAD-NEXT:    [[IMPLICITARG_PTR:%.*]] = call ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
+; DEFAULT-PRELOAD-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr addrspace(4) [[IMPLICITARG_PTR]], i64 12
+; DEFAULT-PRELOAD-NEXT:    [[GROUP_SIZE_X:%.*]] = load i16, ptr addrspace(4) [[GEP]], align 2
+; DEFAULT-PRELOAD-NEXT:    store i16 [[_HIDDEN_GROUP_SIZE_X]], ptr addrspace(1) @g1, align 2
+; DEFAULT-PRELOAD-NEXT:    ret void
+;
+  %implicitarg.ptr = call ptr addrspace(4) @llvm.amdgcn.implicitarg.ptr()
+  %gep = getelementptr inbounds i8, ptr addrspace(4) %implicitarg.ptr, i64 12
+  %group_size_x = load i16, ptr addrspace(4) %gep
+  store i16 %group_size_x, ptr addrspace(1) @g1
+  ret void
+}
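
The test loads from byte offset 12 of the implicit-argument pointer, and the `DEFAULT-PRELOAD` output replaces that load with `_HIDDEN_GROUP_SIZE_X`. A minimal sketch of why those two line up, assuming the hidden arguments are laid out contiguously in the order shown in the `DEFAULT-PRELOAD-SAME` line (three `i32` block counts followed by an `i16` group size). `HiddenArgsPrefix` and `groupSizeXOffset` are hypothetical names used only for illustration; they are not LLVM types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the first four hidden kernel arguments, in the
// order they appear in the DEFAULT-PRELOAD signature of the test.
// This is an illustration only, not a type from LLVM.
struct HiddenArgsPrefix {
  std::uint32_t block_count_x; // i32, assumed offset 0
  std::uint32_t block_count_y; // i32, assumed offset 4
  std::uint32_t block_count_z; // i32, assumed offset 8
  std::uint16_t group_size_x;  // i16, assumed offset 12
};

// The test's getelementptr uses a byte offset of 12 before the i16 load.
static_assert(offsetof(HiddenArgsPrefix, group_size_x) == 12,
              "group_size_x should sit right after three 32-bit block counts");

// Byte offset of the group-size field under the assumed layout.
constexpr std::size_t groupSizeXOffset() {
  return offsetof(HiddenArgsPrefix, group_size_x);
}
```

Under that assumption, a load at offset 12 reads the X group size, which would explain why the pass has to add the three block counts as well as the group size to the kernel signature.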

tgymnich · 2025-08-16T22:34:53Z

Have you considered making amdgpu-kernarg-preload-count=0 work as intended instead? Would be less confusing.

shiltian · 2025-08-17T01:47:56Z

Yes, I have. As I understand it, amdgpu-kernarg-preload-count only controls how many explicit kernel arguments are preloaded. It doesn't affect existing inreg arguments or implicit kernel arguments.
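
To make the distinction concrete, here is a toy model of the two options as described in this thread. It is a sketch under stated assumptions, not the pass's implementation: `ToyArg` and `argsMarkedByPass` are hypothetical names, implicit arguments are treated as always eligible, and real constraints (SGPR availability, argument ordering, pre-existing `inreg` attributes) are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a kernel argument; not an LLVM class.
struct ToyArg {
  std::string name;
  bool implicit; // true for hidden/implicit kernel arguments
};

// Hypothetical model of which arguments the pass would mark for preload.
//   enablePreload models -amdgpu-kernarg-preload (the new switch).
//   preloadCount  models -amdgpu-kernarg-preload-count.
std::vector<std::string> argsMarkedByPass(const std::vector<ToyArg> &args,
                                          bool enablePreload,
                                          unsigned preloadCount) {
  std::vector<std::string> marked;
  if (!enablePreload)
    return marked; // the switch is off: the pass does nothing at all

  unsigned remaining = preloadCount;
  for (const ToyArg &arg : args) {
    if (arg.implicit) {
      // Assumption: the count does not govern implicit arguments.
      marked.push_back(arg.name);
    } else if (remaining > 0) {
      // The count only limits how many explicit arguments are taken.
      marked.push_back(arg.name);
      --remaining;
    }
  }
  return marked;
}
```

In this model a count of 0 still leaves the implicit arguments marked, which matches the `DEFAULT-PRELOAD` output of the new test, while turning the switch off marks nothing.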

tgymnich

Would be nice to come up with a name that would reflect this subtle difference. But I cannot find a better name either.

mikaelholmen · 2025-08-19T05:07:39Z

Hi @shiltian

If built with EXPENSIVE_CHECKS, then running the new testcase fails like this:

LLVM ERROR: Module changed by AMDGPUPreloadKernelArgumentsPass without invalidating analyses
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /repo/llvm/build-all-expensive/bin/opt -S -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -passes=amdgpu-preload-kernel-arguments /repo/llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll -o -
1.	Running pass "amdgpu-preload-kernel-arguments" on module "/repo/llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll"
 #0 0x000056368faf3636 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/repo/llvm/build-all-expensive/bin/opt+0x4dab636)
 #1 0x000056368faf0bc5 llvm::sys::RunSignalHandlers() (/repo/llvm/build-all-expensive/bin/opt+0x4da8bc5)
 #2 0x000056368faf4809 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x00007fbfbdae3990 __restore_rt (/lib64/libpthread.so.0+0x12990)
 #4 0x00007fbfbcb8b52f raise (/lib64/libc.so.6+0x4e52f)
 #5 0x00007fbfbcb5ee65 abort (/lib64/libc.so.6+0x21e65)
 #6 0x000056368fab9254 llvm::report_fatal_error(llvm::Twine const&, bool) (/repo/llvm/build-all-expensive/bin/opt+0x4d71254)
 #7 0x000056369120b478 void llvm::detail::UniqueFunctionBase<void, llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&>::CallImpl<llvm::PreservedCFGCheckerInstrumentation::registerCallbacks(llvm::PassInstrumentationCallbacks&, llvm::AnalysisManager<llvm::Module>&)::$_2>(void*, llvm::StringRef, llvm::Any&, llvm::PreservedAnalyses const&) StandardInstrumentations.cpp:0:0
 #8 0x000056368fd43cae llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/repo/llvm/build-all-expensive/bin/opt+0x4ffbcae)
 #9 0x00005636911d96e7 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool, bool) (/repo/llvm/build-all-expensive/bin/opt+0x64916e7)
#10 0x000056368fa8dc9f optMain (/repo/llvm/build-all-expensive/bin/opt+0x4d45c9f)
#11 0x00007fbfbcb777e5 __libc_start_main (/lib64/libc.so.6+0x3a7e5)
#12 0x000056368fa8b2ee _start (/repo/llvm/build-all-expensive/bin/opt+0x4d432ee)
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /repo/llvm/build-all-expensive/bin/FileCheck -check-prefix=DEFAULT-PRELOAD /repo/llvm/test/CodeGen/AMDGPU/disable-preload-kernargs.ll

mshockwave · 2025-08-20T22:46:40Z

The EXPENSIVE_CHECKS failure reported above was not directly caused by the changes in this patch. Rather, the new test case triggers an error that was not previously caught. I sent a fix: #154645

From #154645: #153975 added a new test, `test/CodeGen/AMDGPU/disable-preload-kernargs.ll`, that triggers an error under `LLVM_ENABLE_EXPENSIVE_CHECKS` complaining that analyses were not invalidated even though the pass made changes. The cause is that the pass only invalidates analyses when the number of explicit arguments is greater than zero, yet functions can be removed even when there are no explicit arguments, hence the missed invalidation.
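
The missed invalidation comes down to how the pass decides whether it changed the module. A minimal sketch of that logic, based only on the description in this thread: the actual change in #154645 is not shown here, and `ToyPassResult`, `changedBeforeFix` and `changedAfterFix` are hypothetical names.

```cpp
#include <cassert>

// Toy summary of what one run of the pass did; not an LLVM type.
struct ToyPassResult {
  unsigned preloadedExplicitArgs; // explicit arguments marked for preload
  unsigned erasedFunctions;       // functions removed from the module
};

// Behaviour described as the bug: only explicit arguments count as a change.
bool changedBeforeFix(const ToyPassResult &r) {
  return r.preloadedExplicitArgs > 0;
}

// One way to close the gap: removing a function is also a change.
bool changedAfterFix(const ToyPassResult &r) {
  return r.preloadedExplicitArgs > 0 || r.erasedFunctions > 0;
}
```

For a kernel with no explicit arguments whose signature was rewritten, such as the one in the new test, the first predicate reports no change while the second reports one. That is the case the expensive checks flag.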

[AMDGPU] Add an option to completely disable kernel argument preload

2a6aedc

llvmbot added the backend:AMDGPU label Aug 16, 2025

shiltian requested review from arsenm, cdevadas and kerbowa August 16, 2025 19:25

tgymnich approved these changes Aug 17, 2025

shiltian merged commit e37eff5 into main Aug 18, 2025
11 checks passed

shiltian deleted the users/shiltian/skip-amdgpu-preload-kernel-arguments branch August 18, 2025 13:44

mshockwave mentioned this pull request Aug 20, 2025

[AMDGPU] Fix uncaught changes made by AMDGPUPreloadKernelArgumentsPass #154645

Merged
