[AMDGPU] Add GFX12 wave register names with WAVE_ prefix #144352
Thank you for submitting a Pull Request (PR) to the LLVM Project!
This PR will be automatically labeled and the relevant teams will be notified.
If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In that case, you can instead tag reviewers by name in a comment.
If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment "Ping". The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.
If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-mc
Author: Aleksandar Spasojevic (aleksandar-amd)
Changes
Rename canonical register names with WAVE_ prefix for GFX12
Patch is 22.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144352.diff
5 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUAsmUtils.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUAsmUtils.cpp
index e433b85489e6e..78f9f65f0a0a1 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUAsmUtils.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUAsmUtils.cpp
@@ -167,7 +167,14 @@ namespace Hwreg {
// NOLINTBEGIN
// clang-format off
static constexpr CustomOperand Operands[] = {
- {{""}},
+ // GFX12+ renamed registers
+ {{"HW_REG_WAVE_MODE"}, ID_MODE, isGFX12Plus},
+ {{"HW_REG_WAVE_STATUS"}, ID_STATUS, isGFX12Plus},
+ {{"HW_REG_WAVE_GPR_ALLOC"}, ID_GPR_ALLOC, isGFX12Plus},
+ {{"HW_REG_WAVE_LDS_ALLOC"}, ID_LDS_ALLOC, isGFX12Plus},
+ {{"HW_REG_WAVE_HW_ID1"}, ID_HW_ID1, isGFX12Plus},
+ {{"HW_REG_WAVE_HW_ID2"}, ID_HW_ID2, isGFX12Plus},
+
{{"HW_REG_MODE"}, ID_MODE},
{{"HW_REG_STATUS"}, ID_STATUS},
{{"HW_REG_TRAPSTS"}, ID_TRAPSTS, isNotGFX12Plus},
@@ -196,25 +203,25 @@ static constexpr CustomOperand Operands[] = {
{{""}},
{{"HW_REG_PERF_SNAPSHOT_DATA"}, ID_PERF_SNAPSHOT_DATA_gfx11, isGFX11},
{{""}},
- {{"HW_REG_SHADER_CYCLES"}, ID_SHADER_CYCLES, isGFX10_3_GFX11},
- {{"HW_REG_SHADER_CYCLES_HI"}, ID_SHADER_CYCLES_HI, isGFX12Plus},
- {{"HW_REG_DVGPR_ALLOC_LO"}, ID_DVGPR_ALLOC_LO, isGFX12Plus},
- {{"HW_REG_DVGPR_ALLOC_HI"}, ID_DVGPR_ALLOC_HI, isGFX12Plus},
+ {{"HW_REG_SHADER_CYCLES"}, ID_SHADER_CYCLES, isGFX10_3_GFX11},
+ {{"HW_REG_SHADER_CYCLES_HI"}, ID_SHADER_CYCLES_HI, isGFX12Plus},
+ {{"HW_REG_WAVE_DVGPR_ALLOC_LO"}, ID_DVGPR_ALLOC_LO, isGFX12Plus},
+ {{"HW_REG_WAVE_DVGPR_ALLOC_HI"}, ID_DVGPR_ALLOC_HI, isGFX12Plus},
// Register numbers reused in GFX11
{{"HW_REG_PERF_SNAPSHOT_PC_LO"}, ID_PERF_SNAPSHOT_PC_LO_gfx11, isGFX11},
{{"HW_REG_PERF_SNAPSHOT_PC_HI"}, ID_PERF_SNAPSHOT_PC_HI_gfx11, isGFX11},
// Register numbers reused in GFX12+
- {{"HW_REG_STATE_PRIV"}, ID_STATE_PRIV, isGFX12Plus},
- {{"HW_REG_PERF_SNAPSHOT_DATA1"}, ID_PERF_SNAPSHOT_DATA1, isGFX12Plus},
- {{"HW_REG_PERF_SNAPSHOT_DATA2"}, ID_PERF_SNAPSHOT_DATA2, isGFX12Plus},
- {{"HW_REG_EXCP_FLAG_PRIV"}, ID_EXCP_FLAG_PRIV, isGFX12Plus},
- {{"HW_REG_EXCP_FLAG_USER"}, ID_EXCP_FLAG_USER, isGFX12Plus},
- {{"HW_REG_TRAP_CTRL"}, ID_TRAP_CTRL, isGFX12Plus},
- {{"HW_REG_SCRATCH_BASE_LO"}, ID_FLAT_SCR_LO, isGFX12Plus},
- {{"HW_REG_SCRATCH_BASE_HI"}, ID_FLAT_SCR_HI, isGFX12Plus},
- {{"HW_REG_SHADER_CYCLES_LO"}, ID_SHADER_CYCLES, isGFX12Plus},
+ {{"HW_REG_WAVE_STATE_PRIV"}, ID_STATE_PRIV, isGFX12Plus},
+ {{"HW_REG_PERF_SNAPSHOT_DATA1"}, ID_PERF_SNAPSHOT_DATA1, isGFX12Plus},
+ {{"HW_REG_PERF_SNAPSHOT_DATA2"}, ID_PERF_SNAPSHOT_DATA2, isGFX12Plus},
+ {{"HW_REG_WAVE_EXCP_FLAG_PRIV"}, ID_EXCP_FLAG_PRIV, isGFX12Plus},
+ {{"HW_REG_WAVE_EXCP_FLAG_USER"}, ID_EXCP_FLAG_USER, isGFX12Plus},
+ {{"HW_REG_WAVE_TRAP_CTRL"}, ID_TRAP_CTRL, isGFX12Plus},
+ {{"HW_REG_WAVE_SCRATCH_BASE_LO"}, ID_FLAT_SCR_LO, isGFX12Plus},
+ {{"HW_REG_WAVE_SCRATCH_BASE_HI"}, ID_FLAT_SCR_HI, isGFX12Plus},
+ {{"HW_REG_SHADER_CYCLES_LO"}, ID_SHADER_CYCLES, isGFX12Plus},
// GFX942 specific registers
{{"HW_REG_XCC_ID"}, ID_XCC_ID, isGFX940},
@@ -224,7 +231,15 @@ static constexpr CustomOperand Operands[] = {
{{"HW_REG_SQ_PERF_SNAPSHOT_PC_HI"}, ID_SQ_PERF_SNAPSHOT_PC_HI, isGFX940},
// Aliases
- {{"HW_REG_HW_ID"}, ID_HW_ID1, isGFX10},
+ {{"HW_REG_HW_ID"}, ID_HW_ID1, isGFX10},
+ {{"HW_REG_STATE_PRIV"}, ID_STATE_PRIV, isGFX12Plus},
+ {{"HW_REG_EXCP_FLAG_PRIV"}, ID_EXCP_FLAG_PRIV, isGFX12Plus},
+ {{"HW_REG_EXCP_FLAG_USER"}, ID_EXCP_FLAG_USER, isGFX12Plus},
+ {{"HW_REG_TRAP_CTRL"}, ID_TRAP_CTRL, isGFX12Plus},
+ {{"HW_REG_SCRATCH_BASE_LO"}, ID_FLAT_SCR_LO, isGFX12Plus},
+ {{"HW_REG_SCRATCH_BASE_HI"}, ID_FLAT_SCR_HI, isGFX12Plus},
+ {{"HW_REG_DVGPR_ALLOC_LO"}, ID_DVGPR_ALLOC_LO, isGFX12Plus},
+ {{"HW_REG_DVGPR_ALLOC_HI"}, ID_DVGPR_ALLOC_HI, isGFX12Plus},
};
// clang-format on
// NOLINTEND
diff --git a/llvm/test/CodeGen/AMDGPU/dynamic-vgpr-reserve-stack-for-cwsr.ll b/llvm/test/CodeGen/AMDGPU/dynamic-vgpr-reserve-stack-for-cwsr.ll
index 2d253c9484309..e79fa9eae1d47 100644
--- a/llvm/test/CodeGen/AMDGPU/dynamic-vgpr-reserve-stack-for-cwsr.ll
+++ b/llvm/test/CodeGen/AMDGPU/dynamic-vgpr-reserve-stack-for-cwsr.ll
@@ -7,7 +7,7 @@
define amdgpu_cs void @amdgpu_cs() #0 {
; CHECK-LABEL: amdgpu_cs:
; CHECK: ; %bb.0:
-; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; CHECK-NEXT: s_cmp_lg_u32 0, s33
; CHECK-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -19,7 +19,7 @@ define amdgpu_cs void @amdgpu_cs() #0 {
define amdgpu_kernel void @kernel() #0 {
; CHECK-LABEL: kernel:
; CHECK: ; %bb.0:
-; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; CHECK-NEXT: s_cmp_lg_u32 0, s33
; CHECK-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -31,7 +31,7 @@ define amdgpu_kernel void @kernel() #0 {
define amdgpu_cs void @with_local() #0 {
; CHECK-TRUE16-LABEL: with_local:
; CHECK-TRUE16: ; %bb.0:
-; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-TRUE16-NEXT: v_mov_b16_e32 v0.l, 13
; CHECK-TRUE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-TRUE16-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -42,7 +42,7 @@ define amdgpu_cs void @with_local() #0 {
;
; CHECK-FAKE16-LABEL: with_local:
; CHECK-FAKE16: ; %bb.0:
-; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-FAKE16-NEXT: v_mov_b32_e32 v0, 13
; CHECK-FAKE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-FAKE16-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -60,7 +60,7 @@ define amdgpu_cs void @with_local() #0 {
define amdgpu_cs void @with_calls_inline_const() #0 {
; CHECK-TRUE16-LABEL: with_calls_inline_const:
; CHECK-TRUE16: ; %bb.0:
-; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-TRUE16-NEXT: v_mov_b16_e32 v0.l, 15
; CHECK-TRUE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-TRUE16-NEXT: s_mov_b32 s1, callee@abs32@hi
@@ -76,7 +76,7 @@ define amdgpu_cs void @with_calls_inline_const() #0 {
;
; CHECK-FAKE16-LABEL: with_calls_inline_const:
; CHECK-FAKE16: ; %bb.0:
-; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-FAKE16-NEXT: v_mov_b32_e32 v0, 15
; CHECK-FAKE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-FAKE16-NEXT: s_mov_b32 s1, callee@abs32@hi
@@ -100,7 +100,7 @@ define amdgpu_cs void @with_calls_inline_const() #0 {
define amdgpu_cs void @with_calls_no_inline_const() #0 {
; CHECK-TRUE16-LABEL: with_calls_no_inline_const:
; CHECK-TRUE16: ; %bb.0:
-; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-TRUE16-NEXT: v_mov_b16_e32 v0.l, 15
; CHECK-TRUE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-TRUE16-NEXT: s_mov_b32 s1, callee@abs32@hi
@@ -117,7 +117,7 @@ define amdgpu_cs void @with_calls_no_inline_const() #0 {
;
; CHECK-FAKE16-LABEL: with_calls_no_inline_const:
; CHECK-FAKE16: ; %bb.0:
-; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-FAKE16-NEXT: v_mov_b32_e32 v0, 15
; CHECK-FAKE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-FAKE16-NEXT: s_mov_b32 s1, callee@abs32@hi
@@ -140,7 +140,7 @@ define amdgpu_cs void @with_calls_no_inline_const() #0 {
define amdgpu_cs void @with_spills() {
; CHECK-LABEL: with_spills:
; CHECK: ; %bb.0:
-; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; CHECK-NEXT: s_cmp_lg_u32 0, s33
; CHECK-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -153,7 +153,7 @@ define amdgpu_cs void @with_spills() {
define amdgpu_cs void @realign_stack(<32 x i32> %x) #0 {
; CHECK-LABEL: realign_stack:
; CHECK: ; %bb.0:
-; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-NEXT: s_mov_b32 s1, callee@abs32@hi
; CHECK-NEXT: s_cmp_lg_u32 0, s33
; CHECK-NEXT: s_mov_b32 s0, callee@abs32@lo
@@ -182,7 +182,7 @@ define amdgpu_cs void @realign_stack(<32 x i32> %x) #0 {
define amdgpu_cs void @frame_pointer_none() #1 {
; CHECK-TRUE16-LABEL: frame_pointer_none:
; CHECK-TRUE16: ; %bb.0:
-; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-TRUE16-NEXT: v_mov_b16_e32 v0.l, 13
; CHECK-TRUE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-TRUE16-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -193,7 +193,7 @@ define amdgpu_cs void @frame_pointer_none() #1 {
;
; CHECK-FAKE16-LABEL: frame_pointer_none:
; CHECK-FAKE16: ; %bb.0:
-; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-FAKE16-NEXT: v_mov_b32_e32 v0, 13
; CHECK-FAKE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-FAKE16-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -209,7 +209,7 @@ define amdgpu_cs void @frame_pointer_none() #1 {
define amdgpu_cs void @frame_pointer_all() #2 {
; CHECK-TRUE16-LABEL: frame_pointer_all:
; CHECK-TRUE16: ; %bb.0:
-; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-TRUE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-TRUE16-NEXT: v_mov_b16_e32 v0.l, 13
; CHECK-TRUE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-TRUE16-NEXT: s_cmovk_i32 s33, 0x1c0
@@ -220,7 +220,7 @@ define amdgpu_cs void @frame_pointer_all() #2 {
;
; CHECK-FAKE16-LABEL: frame_pointer_all:
; CHECK-FAKE16: ; %bb.0:
-; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_HW_ID2, 8, 2)
+; CHECK-FAKE16-NEXT: s_getreg_b32 s33, hwreg(HW_REG_WAVE_HW_ID2, 8, 2)
; CHECK-FAKE16-NEXT: v_mov_b32_e32 v0, 13
; CHECK-FAKE16-NEXT: s_cmp_lg_u32 0, s33
; CHECK-FAKE16-NEXT: s_cmovk_i32 s33, 0x1c0
diff --git a/llvm/test/MC/AMDGPU/gfx12_asm_sopk.s b/llvm/test/MC/AMDGPU/gfx12_asm_sopk.s
index 5ce6847b9dcad..8daab3412fcb1 100644
--- a/llvm/test/MC/AMDGPU/gfx12_asm_sopk.s
+++ b/llvm/test/MC/AMDGPU/gfx12_asm_sopk.s
@@ -192,19 +192,19 @@ s_call_b64 vcc, 0x1234
s_call_b64 null, 0x1234
// GFX12: encoding: [0x34,0x12,0x7c,0xba]
-s_getreg_b32 s0, hwreg(HW_REG_MODE)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_MODE)
// GFX12: encoding: [0x01,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_STATUS)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_STATUS)
// GFX12: encoding: [0x02,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_STATE_PRIV)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_STATE_PRIV)
// GFX12: encoding: [0x04,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_GPR_ALLOC)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_GPR_ALLOC)
// GFX12: encoding: [0x05,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_LDS_ALLOC)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_LDS_ALLOC)
// GFX12: encoding: [0x06,0xf8,0x80,0xb8]
s_getreg_b32 s0, hwreg(HW_REG_IB_STS)
@@ -225,31 +225,31 @@ s_getreg_b32 s0, hwreg(HW_REG_PERF_SNAPSHOT_DATA1)
s_getreg_b32 s0, hwreg(HW_REG_PERF_SNAPSHOT_DATA2)
// GFX12: encoding: [0x10,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_EXCP_FLAG_PRIV)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV)
// GFX12: encoding: [0x11,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_EXCP_FLAG_USER)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_EXCP_FLAG_USER)
// GFX12: encoding: [0x12,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_TRAP_CTRL)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_TRAP_CTRL)
// GFX12: encoding: [0x13,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_SCRATCH_BASE_LO)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_SCRATCH_BASE_LO)
// GFX12: encoding: [0x14,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_SCRATCH_BASE_HI)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_SCRATCH_BASE_HI)
// GFX12: encoding: [0x15,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_HW_ID1)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_HW_ID1)
// GFX12: encoding: [0x17,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_HW_ID2)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_HW_ID2)
// GFX12: encoding: [0x18,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_DVGPR_ALLOC_LO)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_DVGPR_ALLOC_LO)
// GFX12: encoding: [0x1f,0xf8,0x80,0xb8]
-s_getreg_b32 s0, hwreg(HW_REG_DVGPR_ALLOC_HI)
+s_getreg_b32 s0, hwreg(HW_REG_WAVE_DVGPR_ALLOC_HI)
// GFX12: encoding: [0x20,0xf8,0x80,0xb8]
s_getreg_b32 s0, hwreg(HW_REG_SHADER_CYCLES_LO)
diff --git a/llvm/test/MC/AMDGPU/gfx12_asm_sopk_alias.s b/llvm/test/MC/AMDGPU/gfx12_asm_sopk_alias.s
index 4a25922f956d3..bd265938170f1 100644
--- a/llvm/test/MC/AMDGPU/gfx12_asm_sopk_alias.s
+++ b/llvm/test/MC/AMDGPU/gfx12_asm_sopk_alias.s
@@ -1,4 +1,46 @@
// RUN: llvm-mc -triple=amdgcn -show-encoding -mcpu=gfx1200 %s | FileCheck --check-prefix=GFX12 %s
s_addk_i32 s0, 0x1234
-// GFX12: s_addk_co_i32 s0, 0x1234 ; encoding: [0x34,0x12,0x80,0xb7]
+// GFX12: s_addk_co_i32 s0, 0x1234 ; encoding: [0x34,0x12,0x80,0xb7]
+
+s_getreg_b32 s0, hwreg(HW_REG_MODE)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_MODE) ; encoding: [0x01,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_STATUS)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_STATUS) ; encoding: [0x02,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_STATE_PRIV)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_STATE_PRIV) ; encoding: [0x04,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_GPR_ALLOC)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_GPR_ALLOC) ; encoding: [0x05,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_LDS_ALLOC)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_LDS_ALLOC) ; encoding: [0x06,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_EXCP_FLAG_PRIV)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV) ; encoding: [0x11,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_EXCP_FLAG_USER)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_EXCP_FLAG_USER) ; encoding: [0x12,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_TRAP_CTRL)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_TRAP_CTRL) ; encoding: [0x13,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_SCRATCH_BASE_LO)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_SCRATCH_BASE_LO) ; encoding: [0x14,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_SCRATCH_BASE_HI)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_SCRATCH_BASE_HI) ; encoding: [0x15,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_HW_ID1)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_HW_ID1) ; encoding: [0x17,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_HW_ID2)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_HW_ID2) ; encoding: [0x18,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_DVGPR_ALLOC_LO)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_DVGPR_ALLOC_LO) ; encoding: [0x1f,0xf8,0x80,0xb8]
+
+s_getreg_b32 s0, hwreg(HW_REG_DVGPR_ALLOC_HI)
+// GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_DVGPR_ALLOC_HI) ; encoding: [0x20,0xf8,0x80,0xb8]
diff --git a/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_sopk.txt b/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_sopk.txt
index 3e323ed69216c..718b8ef6031c2 100644
--- a/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_sopk.txt
+++ b/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_sopk.txt
@@ -76,7 +76,7 @@
# GFX12: s_getreg_b32 s0, hwreg(52, 8, 3) ; encoding: [0x34,0x12,0x80,0xb8]
0x34,0x12,0x80,0xb8
-# GFX12: s_getreg_b32 s0, hwreg(HW_REG_EXCP_FLAG_PRIV, 7, 25) ; encoding: [0xd1,0xc1,0x80,0xb8]
+# GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, 7, 25) ; encoding: [0xd1,0xc1,0x80,0xb8]
0xd1,0xc1,0x80,0xb8
# GFX12: s_getreg_b32 s105, hwreg(52, 8, 3) ; encoding: [0x34,0x12,0xe9,0xb8]
@@ -157,7 +157,7 @@
# GFX12: s_setreg_b32 hwreg(52, 8, 3), vcc_lo ; encoding: [0x34,0x12,0x6a,0xb9]
0x34,0x12,0x6a,0xb9
-# GFX12: s_setreg_b32 hwreg(HW_REG_EXCP_FLAG_PRIV, 7, 25), s0 ; encoding: [0xd1,0xc1,0x00,0xb9]
+# GFX12: s_setreg_b32 hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, 7, 25), s0 ; encoding: [0xd1,0xc1,0x00,0xb9]
0xd1,0xc1,0x00,0xb9
# GFX12: s_version 0x1234 ; encoding: [0x34,0x12,0x80,0xb0]
@@ -181,43 +181,43 @@
# GFX12: s_version ((128|UC_VERSION_W64_BIT)|UC_VERSION_W32_BIT)|UC_VERSION_MDP_BIT ; encoding: [0x80,0xe0,0x80,0xb0]
0x80,0xe0,0x80,0xb0
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_MODE), 0xaf123456 ; encoding: [0x01,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_MODE), 0xaf123456 ; encoding: [0x01,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x01,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_MODE, 31, 1), 0xaf123456 ; encoding: [0xc1,0x07,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_MODE, 31, 1), 0xaf123456 ; encoding: [0xc1,0x07,0x80,0xb9,0x56,0x34,0x12,0xaf]
0xc1,0x07,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_STATUS), 0xaf123456 ; encoding: [0x02,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_STATUS), 0xaf123456 ; encoding: [0x02,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x02,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_GPR_ALLOC), 0xaf123456 ; encoding: [0x05,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_GPR_ALLOC), 0xaf123456 ; encoding: [0x05,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x05,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_LDS_ALLOC), 0xaf123456 ; encoding: [0x06,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_LDS_ALLOC), 0xaf123456 ; encoding: [0x06,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x06,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_IB_STS), 0xaf123456 ; encoding: [0x07,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x07,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_HW_ID1), 0xaf123456 ; encoding: [0x17,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_HW_ID1), 0xaf123456 ; encoding: [0x17,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x17,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_HW_ID2), 0xaf123456 ; encoding: [0x18,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
+# GFX12: s_setreg_imm32_b32 hwreg(HW_REG_WAVE_HW_ID2), 0xaf123456 ; encoding: [0x18,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf]
0x18,0xf8,0x80,0xb9,0x56,0x34,0x12,0xaf
-# GFX12: s_getreg_b32 s0, hwreg(HW_REG_MODE) ; encoding: [0x01,0xf8,0x80,0xb8]
+# GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_MODE) ; encoding: [0x01,0xf8,0x80,0xb8]
0x01,0xf8,0x80,0xb8
-# GFX12: s_getreg_b32 s0, hwreg(HW_REG_STATUS) ; encoding: [0x02,0xf8,0x80,0xb8]
+# GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_STATUS) ; encoding: [0x02,0xf8,0x80,0xb8]
0x02,0xf8,0x80,0xb8
-# GFX12: s_getreg_b32 s0, hwreg(HW_REG_STATE_PRIV) ; encoding: [0x04,0xf8,0x80,0xb8]
+# GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_STATE_PRIV) ; encoding: [0x04,0xf8,0x80,0xb8]
0x04,0xf8,0x80,0xb8
-# GFX12: s_getreg_b32 s0, hwreg(HW_REG_GPR_ALLOC) ; encoding: [0x05,0xf8,0x80,0xb8]
+# GFX12: s_getreg_b32 s0, hwreg(HW_REG_WAVE_GPR_ALLOC) ; encoding: [0x05,0xf8,0x80,0xb8]
0x05,0xf8,0x80,0xb8
-# GFX12: s_getreg_b32 s0, hwreg(HW_REG_LDS_ALLOC) ; encoding: [0x06,0xf8,0x80,0xb8]
+# GF...
[truncated]
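The encodings in the tests above are consistent with a 16-bit immediate that packs the register ID, the bit offset, and the field width. The following is a minimal sketch under that assumption: the layout (ID in bits [5:0], offset in bits [10:6], width minus one in bits [15:11]) is inferred from the test bytes shown in this diff, and `encodeHwreg` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM. Assumed layout, inferred from the
// encodings in the tests: ID in bits [5:0], offset in bits [10:6],
// (width - 1) in bits [15:11]. The defaults model hwreg(NAME) with no
// explicit offset or width, i.e. the whole 32-bit register.
constexpr std::uint16_t encodeHwreg(unsigned Id, unsigned Offset = 0,
                                    unsigned Width = 32) {
  return static_cast<std::uint16_t>((Id & 0x3f) | ((Offset & 0x1f) << 6) |
                                    (((Width - 1) & 0x1f) << 11));
}

// The first two bytes of each test encoding are this value, low byte first.
constexpr std::uint8_t lowByte(std::uint16_t Imm) {
  return static_cast<std::uint8_t>(Imm & 0xff);
}
constexpr std::uint8_t highByte(std::uint16_t Imm) {
  return static_cast<std::uint8_t>(Imm >> 8);
}
```

Under this layout, `encodeHwreg(1)` gives `0xf801`, matching the bytes `0x01,0xf8` for `hwreg(HW_REG_WAVE_MODE)`, and `encodeHwreg(17, 7, 25)` gives `0xc1d1`, matching `0xd1,0xc1` for `hwreg(HW_REG_WAVE_EXCP_FLAG_PRIV, 7, 25)`. The rename changes only the printed name, so these bytes are the same before and after the patch.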
The original design of this table was that you look up ID values directly. That is why entry 0 is "", because we want "HW_REG_MODE" to be entry 1, because ID_MODE == 1.
Then we started getting cases where the same ID value meant different registers on different subtargets, but we still tried to keep most entries in the correct position. That is why getNameFromOperandTable first tries direct lookup (fast case) and then falls back to a table search (slow case).
Now I am wondering if we should abandon the idea of direct lookup. Instead we could keep this table ordered by ID, and always do a binary search.
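The original design described in this comment can be sketched roughly as follows. This is an illustration only: `Subtarget`, `Operand`, `DirectTable`, and `lookupName` are simplified stand-ins rather than the LLVM definitions, and the table contents are a small illustrative subset.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Simplified stand-ins for illustration; not the LLVM types.
struct Subtarget {
  bool IsGFX12Plus;
};
struct Operand {
  std::string_view Name;
  unsigned Encoding;
  bool (*Cond)(const Subtarget &); // nullptr means "valid on every subtarget"
};

inline bool isGFX12Plus(const Subtarget &S) { return S.IsGFX12Plus; }
inline bool isNotGFX12Plus(const Subtarget &S) { return !S.IsGFX12Plus; }

// Entry 0 is a placeholder so that index == encoding for most entries.
inline constexpr Operand DirectTable[] = {
    {"", 0, nullptr},
    {"HW_REG_MODE", 1, nullptr},
    {"HW_REG_STATUS", 2, nullptr},
    {"HW_REG_TRAPSTS", 3, isNotGFX12Plus},
    {"", 4, nullptr},
    // Out of position: encoding 4 is stored at index 5.
    {"HW_REG_STATE_PRIV", 4, isGFX12Plus},
};

template <std::size_t N>
std::string_view lookupName(const Operand (&Table)[N], unsigned Encoding,
                            const Subtarget &STI) {
  auto Matches = [&](const Operand &Op) {
    return Op.Encoding == Encoding && !Op.Name.empty() &&
           (!Op.Cond || Op.Cond(STI));
  };
  // Fast case: the entry sits at the index equal to its encoding.
  if (Encoding < N && Matches(Table[Encoding])) {
    return Table[Encoding].Name;
  }
  // Slow case: the entry was displaced, so scan the whole table.
  for (const Operand &Op : Table) {
    if (Matches(Op)) {
      return Op.Name;
    }
  }
  return {};
}
```

In this sketch, looking up encoding 1 hits the fast path, while encoding 4 on a GFX12+ subtarget falls through to the linear scan. The more entries are displaced, the less the placeholder rows and the fast path pay for themselves.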
See the explanatory comment here:
// A table of custom operands shall describe "primary" operand names first
I've ordered the tables by ID and implemented a binary search on them.
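A sorted table with a binary search might look like the sketch below. It is not the code from this PR: the types, the table contents, and `findName` are simplified, hypothetical stand-ins. The search uses `std::lower_bound` to find the first entry with the requested encoding, then walks the run of equal encodings to pick the first entry valid for the subtarget.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string_view>

// Simplified stand-ins for illustration; not the LLVM types.
struct Subtarget {
  bool IsGFX12Plus;
};
struct Operand {
  std::string_view Name;
  unsigned Encoding;
  bool (*Cond)(const Subtarget &); // nullptr means "valid on every subtarget"
};

inline bool isGFX12Plus(const Subtarget &S) { return S.IsGFX12Plus; }
inline bool isNotGFX12Plus(const Subtarget &S) { return !S.IsGFX12Plus; }

// Sorted by Encoding, so entries sharing an encoding are adjacent. Within a
// run, the preferred (canonical) name for a subtarget comes first.
inline constexpr Operand SortedTable[] = {
    {"HW_REG_WAVE_MODE", 1, isGFX12Plus},
    {"HW_REG_MODE", 1, nullptr},
    {"HW_REG_WAVE_STATUS", 2, isGFX12Plus},
    {"HW_REG_STATUS", 2, nullptr},
    {"HW_REG_TRAPSTS", 3, isNotGFX12Plus},
    {"HW_REG_WAVE_STATE_PRIV", 4, isGFX12Plus},
};

inline std::string_view findName(unsigned Encoding, const Subtarget &STI) {
  const Operand *End = std::end(SortedTable);
  // Binary search for the first entry whose encoding is >= Encoding.
  const Operand *First = std::lower_bound(
      std::begin(SortedTable), End, Encoding,
      [](const Operand &Op, unsigned Enc) { return Op.Encoding < Enc; });
  // Walk the entries with this encoding and return the first valid one.
  for (const Operand *It = First; It != End && It->Encoding == Encoding;
       ++It) {
    if (!It->Cond || It->Cond(STI)) {
      return It->Name;
    }
  }
  return {};
}
```

With the table sorted, no placeholder rows are needed, and the cost is a logarithmic search plus a short scan over the handful of entries that share an encoding.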
e9797d6 to 90ef8e6
    !Table[Idx].Name.empty() &&
    (!Table[Idx].Cond || Table[Idx].Cond(STI));
    auto IsValid = [&](const CustomOperand &Entry) {
    return Entry.Encoding == Encoding && !Entry.Name.empty() &&
Name is never empty, is it?
- return Entry.Encoding == Encoding && !Entry.Name.empty() &&
+ return Entry.Encoding == Encoding &&
That's true, thank you.
jayfoad left a comment
Looks good overall, thanks.
    return Idx < N && Table[Idx].Encoding == Encoding &&
    !Table[Idx].Name.empty() &&
    (!Table[Idx].Cond || Table[Idx].Cond(STI));
    auto IsValid = [&](const CustomOperand &Entry) {
There is only one use of this function. I think it would be simpler to inline it.
90ef8e6 to 323657d
jayfoad left a comment
LGTM, thanks!
    });

    // Search through entries with the same encoding to find the first valid one
    for (auto It = First; It != Table + N && It->Encoding == Encoding; ++It)
Nit: braces around this for loop.
323657d to 5e85570
Rename canonical register names with WAVE_ prefix for GFX12 - Maintain backward compatibility through aliases
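The effect of keeping the old names as aliases can be sketched as a round trip: the assembler accepts either spelling, and the printer always emits the canonical one. The sketch below is illustrative only. `Names`, `parseName`, and `printName` are hypothetical, the subtarget conditions are simplified, and the encodings 1 and 24 are taken from the test encodings in this PR (`0x01` and `0x18`).

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Simplified stand-ins for illustration; not the LLVM types.
struct Subtarget {
  bool IsGFX12Plus;
};
struct Operand {
  std::string_view Name;
  unsigned Encoding;
  bool (*Cond)(const Subtarget &); // nullptr means "valid on every subtarget"
};

inline bool isGFX12Plus(const Subtarget &S) { return S.IsGFX12Plus; }

// For each encoding, the canonical name for GFX12+ is listed before the
// older spelling, which stays accepted as an alias.
inline constexpr Operand Names[] = {
    {"HW_REG_WAVE_MODE", 1, isGFX12Plus},
    {"HW_REG_MODE", 1, nullptr},
    {"HW_REG_WAVE_HW_ID2", 24, isGFX12Plus},
    {"HW_REG_HW_ID2", 24, nullptr},
};

// Assembler direction: any name valid for the subtarget maps to its encoding.
inline std::optional<unsigned> parseName(std::string_view Name,
                                         const Subtarget &STI) {
  for (const Operand &Op : Names) {
    if (Op.Name == Name && (!Op.Cond || Op.Cond(STI))) {
      return Op.Encoding;
    }
  }
  return std::nullopt;
}

// Printer direction: the first valid entry for the encoding wins, so the
// canonical name is what gets printed.
inline std::string_view printName(unsigned Encoding, const Subtarget &STI) {
  for (const Operand &Op : Names) {
    if (Op.Encoding == Encoding && (!Op.Cond || Op.Cond(STI))) {
      return Op.Name;
    }
  }
  return {};
}
```

This mirrors what `gfx12_asm_sopk_alias.s` checks: the input `hwreg(HW_REG_MODE)` assembles to the same bytes and is printed back as `hwreg(HW_REG_WAVE_MODE)` on GFX12.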
@aleksandar-amd Congratulations on having your first Pull Request (PR) merged into the LLVM Project!
Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.
Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.
How to do this, and the rest of the post-merge process, is covered in detail here.
If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.
If you don't get any reports, no action is required from you. Your changes are working as expected, well done!