@trenouf trenouf commented Jul 11, 2025

For each function with the AMDGPU_CS_Chain calling convention that is compiled with dynamic VGPRs enabled, add a _dvgpr$ symbol whose value is the function symbol's value plus an offset. The offset encodes one less than the number of VGPR blocks used by the function (16 VGPRs per block, at most 128 VGPRs) in bits 5..3 of the symbol value. A front-end can use this when functions are chained rather than called, with a dispatcher that dynamically resizes the VGPR count before dispatching to a function.
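
To make the encoding concrete, here is a minimal sketch of how a dispatcher might split a _dvgpr$ symbol value back into an entry address and a VGPR count. It follows the description in this revision (16 VGPRs per block, field in bits 5..3). The names `ChainTarget`, `decodeDVgprSymbol`, and the constants are hypothetical and are not part of the patch, and the sketch assumes the function entry is aligned so that bits 5..3 of its address are zero.

```cpp
#include <cstdint>

// Hypothetical names: the patch does not define a dispatcher-side API.
constexpr std::uint64_t kDVgprFieldMask = 0x38; // bits 5..3 of the symbol value
constexpr unsigned kDVgprFieldShift = 3;
constexpr unsigned kVGPRsPerBlock = 16;         // block size in this revision

struct ChainTarget {
  std::uint64_t EntryAddress; // symbol value with the encoded field cleared
  unsigned NumVGPRBlocks;     // 1..8
  unsigned NumVGPRs;          // NumVGPRBlocks * kVGPRsPerBlock
};

// Assumption: the function entry address has bits 5..3 clear, so the field
// can be masked off to recover the address.
constexpr ChainTarget decodeDVgprSymbol(std::uint64_t SymbolValue) {
  // The field stores (number of blocks - 1).
  unsigned BlocksMinusOne = static_cast<unsigned>(
      (SymbolValue & kDVgprFieldMask) >> kDVgprFieldShift);
  unsigned Blocks = BlocksMinusOne + 1;
  return ChainTarget{SymbolValue & ~kDVgprFieldMask, Blocks,
                     Blocks * kVGPRsPerBlock};
}
```

For example, a symbol value of 0x1010 would decode to entry address 0x1000 with 3 blocks, that is, 48 VGPRs.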

llvmbot commented Jul 11, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Tim Renouf (trenouf)

Full diff: https://github.com/llvm/llvm-project/pull/148251.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp (+26)
  • (added) llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll (+12)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 749b9efc81378..00ed5f57967ce 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -194,6 +194,32 @@ void AMDGPUAsmPrinter::emitFunctionBodyStart() {
     return;
   }
 
+  if (STM.isDynamicVGPREnabled() &&
+      MF->getFunction().getCallingConv() == CallingConv::AMDGPU_CS_Chain) {
+    // Add a _dvgpr$ symbol, with the value of the function symbol, plus an
+    // offset encoding one less than the number of VGPR blocks used by the
+    // function (16 VGPRs per block, no more than 128) in bits 5..3 of the
+    // symbol value. This is used by a front-end to have functions that are
+    // chained rather than called, and a dispatcher that dynamically resizes
+    // the VGPR count before dispatching to a function.
+    ResourceUsage = &getAnalysis<AMDGPUResourceUsageAnalysis>();
+    const AMDGPUResourceUsageAnalysis::SIFunctionResourceInfo &Info =
+        ResourceUsage->getResourceInfo();
+    MCContext &Ctx = MF->getContext();
+    unsigned EncodedNumVGPRs = (Info.NumVGPR - 1) >> 1 & 0x38;
+    MCSymbol *CurPCSym = Ctx.createTempSymbol();
+    OutStreamer->emitLabel(CurPCSym);
+    const MCExpr *DVgprFuncVal = MCBinaryExpr::createAdd(
+        MCSymbolRefExpr::create(CurPCSym, MCSymbolRefExpr::VK_None, Ctx),
+        MCConstantExpr::create(EncodedNumVGPRs, Ctx), Ctx);
+    MCSymbol *DVgprFuncSym =
+        Ctx.getOrCreateSymbol(Twine("_dvgpr$") + MF->getFunction().getName());
+    OutStreamer->emitAssignment(DVgprFuncSym, DVgprFuncVal);
+    cast<MCSymbolELF>(DVgprFuncSym)
+        ->setBinding(
+            cast<MCSymbolELF>(getSymbol(&MF->getFunction()))->getBinding());
+  }
+
   if (!MFI.isEntryFunction())
     return;
 
diff --git a/llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll b/llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll
new file mode 100644
index 0000000000000..992963d304ead
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll
@@ -0,0 +1,12 @@
+; Test generation of _dvgpr$ symbol for an amdgpu_cs_chain function with +dynamic-vgpr.
+
+; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx1200 -asm-verbose=0 < %s | FileCheck -check-prefixes=DVGPR %s
+
+; DVGPR-LABEL: func:
+; DVGPR: .Ltmp0:
+; DVGPR: .set _dvgpr$func, .Ltmp0+{{[0-9]+}}
+
+define amdgpu_cs_chain void @func() #0 {
+  ret void
+}
+attributes #0 = { "target-features"="+dynamic-vgpr" }

@arsenm arsenm requested review from jhuber6 and rovka July 14, 2025 06:38
trenouf added 2 commits July 15, 2025 21:24
* Use new func attr;
* allow 16 or 32 block size;
* put code in its own func;
* enhance test, including anonymous func;
* fix name, visibility and linkage

trenouf commented Aug 3, 2025

@arsenm Is this ok to merge now?

@arsenm arsenm left a comment

LGTM with nits; the anonymous function test is missing a check.

trenouf commented Aug 13, 2025

#151672 breaks my tests because argument VGPRs are no longer counted as VGPR usage in a chain function. I'll have to adjust the tests to use VGPRs in another way.

@trenouf trenouf merged commit f279c47 into llvm:main Aug 15, 2025
10 checks passed