AMDGPU gfx12: Add _dvgpr$ symbols for dynamic VGPRs #148251
For each function with the AMDGPU_CS_Chain calling convention, when dynamic VGPRs are enabled, add a _dvgpr$ symbol. Its value is the value of the function symbol plus an offset that encodes, in bits 5..3, one less than the number of VGPR blocks used by the function (16 VGPRs per block, so at most 8 blocks, i.e. no more than 128 VGPRs). A front-end uses this when functions are chained rather than called: a dispatcher dynamically resizes the VGPR count before dispatching to a function.
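To make the bit layout concrete, here is a minimal standalone sketch of the encoding described above and of how a dispatcher might decode it. The helper names (`encodeVGPRBlocks`, `decodeDvgprSymbol`, `DecodedTarget`) are hypothetical and not part of the patch. The decode side additionally assumes that the function address itself has bits 5..3 clear (for example, a 64-byte-aligned entry point), which the description does not state.

```cpp
#include <cstdint>

// Hypothetical constants for illustration; the 16-VGPR block size and the
// bits 5..3 field come from the PR description.
constexpr unsigned VGPRsPerBlock = 16;
constexpr std::uint64_t BlockFieldMask = 0x38; // bits 5..3
constexpr unsigned BlockFieldShift = 3;

// Offset added to the function address: (number of blocks - 1) in bits 5..3.
// Assumes 1 <= NumVGPRs <= 128; other values are outside the described range.
constexpr std::uint64_t encodeVGPRBlocks(unsigned NumVGPRs) {
  unsigned Blocks = (NumVGPRs + VGPRsPerBlock - 1) / VGPRsPerBlock;
  return static_cast<std::uint64_t>(Blocks - 1) << BlockFieldShift;
}

// What a dispatcher could recover from a _dvgpr$ symbol value.
struct DecodedTarget {
  std::uint64_t Address; // function entry point
  unsigned NumBlocks;    // VGPR blocks to allocate before jumping
};

// Assumption (not stated in the PR): the function address has bits 5..3
// clear, so the field can be separated again with a mask.
constexpr DecodedTarget decodeDvgprSymbol(std::uint64_t SymbolValue) {
  return DecodedTarget{
      SymbolValue & ~BlockFieldMask,
      static_cast<unsigned>((SymbolValue & BlockFieldMask) >> BlockFieldShift) +
          1};
}

// Spot checks: 16 VGPRs fit in one block, 17 need two, 128 need eight.
static_assert(encodeVGPRBlocks(16) == 0x00, "1 block encodes as 0");
static_assert(encodeVGPRBlocks(17) == 0x08, "2 blocks encode as 1 << 3");
static_assert(encodeVGPRBlocks(128) == 0x38, "8 blocks encode as 7 << 3");
```

For VGPR counts from 1 to 128 this produces the same value as the more compact expression `(NumVGPR - 1) >> 1 & 0x38` used in the diff; the sketch spells out the block rounding so that the two steps (round up to blocks, place in bits 5..3) are visible separately.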
@llvm/pr-subscribers-backend-amdgpu
Author: Tim Renouf (trenouf)
Full diff: https://github.com/llvm/llvm-project/pull/148251.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 749b9efc81378..00ed5f57967ce 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -194,6 +194,32 @@ void AMDGPUAsmPrinter::emitFunctionBodyStart() {
     return;
   }
+  if (STM.isDynamicVGPREnabled() &&
+      MF->getFunction().getCallingConv() == CallingConv::AMDGPU_CS_Chain) {
+    // Add a _dvgpr$ symbol, with the value of the function symbol, plus an
+    // offset encoding one less than the number of VGPR blocks used by the
+    // function (16 VGPRs per block, no more than 128) in bits 5..3 of the
+    // symbol value. This is used by a front-end to have functions that are
+    // chained rather than called, and a dispatcher that dynamically resizes
+    // the VGPR count before dispatching to a function.
+    ResourceUsage = &getAnalysis<AMDGPUResourceUsageAnalysis>();
+    const AMDGPUResourceUsageAnalysis::SIFunctionResourceInfo &Info =
+        ResourceUsage->getResourceInfo();
+    MCContext &Ctx = MF->getContext();
+    unsigned EncodedNumVGPRs = (Info.NumVGPR - 1) >> 1 & 0x38;
+    MCSymbol *CurPCSym = Ctx.createTempSymbol();
+    OutStreamer->emitLabel(CurPCSym);
+    const MCExpr *DVgprFuncVal = MCBinaryExpr::createAdd(
+        MCSymbolRefExpr::create(CurPCSym, MCSymbolRefExpr::VK_None, Ctx),
+        MCConstantExpr::create(EncodedNumVGPRs, Ctx), Ctx);
+    MCSymbol *DVgprFuncSym =
+        Ctx.getOrCreateSymbol(Twine("_dvgpr$") + MF->getFunction().getName());
+    OutStreamer->emitAssignment(DVgprFuncSym, DVgprFuncVal);
+    cast<MCSymbolELF>(DVgprFuncSym)
+        ->setBinding(
+            cast<MCSymbolELF>(getSymbol(&MF->getFunction()))->getBinding());
+  }
+
   if (!MFI.isEntryFunction())
     return;
diff --git a/llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll b/llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll
new file mode 100644
index 0000000000000..992963d304ead
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll
@@ -0,0 +1,12 @@
+; Test generation of _dvgpr$ symbol for an amdgpu_cs_chain function with +dynamic-vgpr.
+
+; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx1200 -asm-verbose=0 < %s | FileCheck -check-prefixes=DVGPR %s
+
+; DVGPR-LABEL: func:
+; DVGPR: .Ltmp0:
+; DVGPR: .set _dvgpr$func, .Ltmp0+{{[0-9]+}}
+
+define amdgpu_cs_chain void @func() #0 {
+  ret void
+}
+attributes #0 = { "target-features"="+dynamic-vgpr" }
* Use new func attr
* allow 16 or 32 block size
* put code in its own func
* enhance test, including anonymous func
* fix name, visibility and linkage
@arsenm Is this ok to merge now?
llvm/test/CodeGen/AMDGPU/dvgpr_sym_fail_too_many_block_size_16.ll
arsenm left a comment:
lgtm with nits, missed check of anonymous function test
#151672 stops my tests working because arg VGPRs are no longer counted as VGPR usage in a chain function. I'll have to adjust the tests to use VGPRs in another way.
... which stops including args in VGPR usage