@Acim-Maravic
Contributor

Added assembler/disassembler support for the ds_bpermute_fi_b32 instruction, along with tests.

@llvmbot llvmbot added backend:AMDGPU llvm:mc Machine (object) code labels Jan 23, 2025
@Acim-Maravic Acim-Maravic requested a review from jayfoad January 23, 2025 12:31
@llvmbot
Member

llvmbot commented Jan 23, 2025

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-mc

Author: Acim Maravic (Acim-Maravic)

Changes

Added assembler/disassembler support for the ds_bpermute_fi_b32 instruction, along with tests.


Full diff: https://github.com/llvm/llvm-project/pull/124108.diff

4 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/DSInstructions.td (+2)
  • (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+1)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_ds.s (+12)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_ds.txt (+9)
diff --git a/llvm/lib/Target/AMDGPU/DSInstructions.td b/llvm/lib/Target/AMDGPU/DSInstructions.td
index ef618727258cf2..bc217e10e0fbd7 100644
--- a/llvm/lib/Target/AMDGPU/DSInstructions.td
+++ b/llvm/lib/Target/AMDGPU/DSInstructions.td
@@ -699,6 +699,7 @@ def DS_PERMUTE_B32  : DS_1A1D_PERMUTE <"ds_permute_b32",
                                        int_amdgcn_ds_permute>;
 def DS_BPERMUTE_B32 : DS_1A1D_PERMUTE <"ds_bpermute_b32",
                                        int_amdgcn_ds_bpermute>;
+def DS_BPERMUTE_FI_B32 : DS_1A1D_PERMUTE <"ds_bpermute_fi_b32">;
 }
 
 } // let SubtargetPredicate = isGFX8Plus
@@ -1264,6 +1265,7 @@ defm DS_PK_ADD_F16        : DS_Real_gfx12<0x09a>;
 defm DS_PK_ADD_RTN_F16    : DS_Real_gfx12<0x0aa>;
 defm DS_PK_ADD_BF16       : DS_Real_gfx12<0x09b>;
 defm DS_PK_ADD_RTN_BF16   : DS_Real_gfx12<0x0ab>;
+defm DS_BPERMUTE_FI_B32   : DS_Real_gfx12<0x0cd>;
 
 // New aliases added in GFX12 without renaming the instructions.
 let AssemblerPredicate = isGFX12Plus in {
diff --git a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
index 873d18e30a430a..d4015235c6c708 100644
--- a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
@@ -150,6 +150,7 @@ static bool isSendMsgTraceDataOrGDS(const SIInstrInfo &TII,
   case AMDGPU::DS_NOP:
   case AMDGPU::DS_PERMUTE_B32:
   case AMDGPU::DS_BPERMUTE_B32:
+  case AMDGPU::DS_BPERMUTE_FI_B32:
     return false;
   default:
     if (TII.isDS(MI.getOpcode())) {
diff --git a/llvm/test/MC/AMDGPU/gfx12_asm_ds.s b/llvm/test/MC/AMDGPU/gfx12_asm_ds.s
index a0e6a3a613555a..34c42affdd46cc 100644
--- a/llvm/test/MC/AMDGPU/gfx12_asm_ds.s
+++ b/llvm/test/MC/AMDGPU/gfx12_asm_ds.s
@@ -1910,3 +1910,15 @@ ds_swizzle_b32 v8, v2 offset:swizzle(BROADCAST,8,7)
 
 ds_swizzle_b32 v8, v2 offset:swizzle(BITMASK_PERM, "01pip")
 // GFX12: [0x07,0x09,0xd4,0xd8,0x02,0x00,0x00,0x08]
+
+ds_bpermute_fi_b32 v5, v1, v2
+// GFX12: encoding: [0x00,0x00,0x34,0xdb,0x01,0x02,0x00,0x05]
+
+ds_bpermute_fi_b32 v5, v1, v2 offset:65535
+// GFX12: encoding: [0xff,0xff,0x34,0xdb,0x01,0x02,0x00,0x05]
+
+ds_bpermute_fi_b32 v5, v1, v2 offset:0
+// GFX12: encoding: [0x00,0x00,0x34,0xdb,0x01,0x02,0x00,0x05]
+
+ds_bpermute_fi_b32 v255, v255, v255 offset:4
+// GFX12: encoding: [0x04,0x00,0x34,0xdb,0xff,0xff,0x00,0xff]
diff --git a/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_ds.txt b/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_ds.txt
index 080a4cab2a319d..d66748135ffd42 100644
--- a/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_ds.txt
+++ b/llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_ds.txt
@@ -3233,3 +3233,12 @@
 
 # GFX12: ds_xor_rtn_b64 v[5:6], v255, v[2:3] offset:65535 ; encoding: [0xff,0xff,0xac,0xd9,0xff,0x02,0x00,0x05]
 0xff,0xff,0xac,0xd9,0xff,0x02,0x00,0x05
+
+# GFX12: ds_bpermute_fi_b32 v5, v1, v2           ; encoding: [0x00,0x00,0x34,0xdb,0x01,0x02,0x00,0x05]
+0x00,0x00,0x34,0xdb,0x01,0x02,0x00,0x05
+
+# GFX12: ds_bpermute_fi_b32 v5, v1, v2 offset:65535 ; encoding: [0xff,0xff,0x34,0xdb,0x01,0x02,0x00,0x05]
+0xff,0xff,0x34,0xdb,0x01,0x02,0x00,0x05
+
+# GFX12: ds_bpermute_fi_b32 v255, v255, v255 offset:4 ; encoding: [0x04,0x00,0x34,0xdb,0xff,0xff,0x00,0xff]
+0x04,0x00,0x34,0xdb,0xff,0xff,0x00,0xff
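
The byte patterns in the new tests can be cross-checked by hand. The following is a minimal sketch, not LLVM code: the field layout (16-bit offset in the low bits, 8-bit opcode at bits 25:18, a 6-bit encoding tag of 0x36 at bits 31:26, then one byte each for addr, data0, data1 and vdst) is an assumption inferred from the test vectors in this diff rather than quoted from the ISA manual, and the names `encode_ds` and `decode_ds` are hypothetical helpers.

```python
# Assumed GFX12 DS field layout (inferred from the test vectors in this PR):
#   dword0: offset[15:0], op[25:18], encoding[31:26] = 0x36
#   dword1: addr[7:0], data0[15:8], data1[23:16], vdst[31:24]
DS_ENCODING = 0x36
DS_BPERMUTE_FI_B32_OP = 0xCD  # opcode taken from DS_Real_gfx12<0x0cd>


def encode_ds(op, vdst, addr, data0, offset=0, data1=0):
    """Return the 8 instruction bytes (little-endian dwords) as a list of ints."""
    dword0 = (offset & 0xFFFF) | ((op & 0xFF) << 18) | (DS_ENCODING << 26)
    dword1 = (
        (addr & 0xFF)
        | ((data0 & 0xFF) << 8)
        | ((data1 & 0xFF) << 16)
        | ((vdst & 0xFF) << 24)
    )
    return list(dword0.to_bytes(4, "little") + dword1.to_bytes(4, "little"))


def decode_ds(raw):
    """Split 8 instruction bytes back into the assumed DS fields."""
    dword0 = int.from_bytes(bytes(raw[:4]), "little")
    dword1 = int.from_bytes(bytes(raw[4:8]), "little")
    return {
        "encoding": dword0 >> 26,
        "op": (dword0 >> 18) & 0xFF,
        "offset": dword0 & 0xFFFF,
        "addr": dword1 & 0xFF,
        "data0": (dword1 >> 8) & 0xFF,
        "data1": (dword1 >> 16) & 0xFF,
        "vdst": dword1 >> 24,
    }


# ds_bpermute_fi_b32 v5, v1, v2 offset:65535
encoded = encode_ds(DS_BPERMUTE_FI_B32_OP, vdst=5, addr=1, data0=2, offset=65535)
print([hex(b) for b in encoded])
print(decode_ds(encoded))
```

Under these assumptions the operands map as `vdst, addr, data0` in assembly order, so the same helper reproduces each of the four assembler test encodings and the decoder recovers opcode 0xcd from the disassembler inputs.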

case AMDGPU::DS_NOP:
case AMDGPU::DS_PERMUTE_B32:
case AMDGPU::DS_BPERMUTE_B32:
case AMDGPU::DS_BPERMUTE_FI_B32:
Contributor

This is more than an MC change, and it is untested.

Contributor Author

Removed.

Contributor

@jayfoad jayfoad left a comment

LGTM with the hazard recognizer part removed.

Added assembler/disassembler support for ds_bpermute_fi_b32 instruction,
as well as tests.
@Acim-Maravic
Contributor Author

LGTM with the hazard recognizer part removed.

The hazard recognizer part has been removed.

@Acim-Maravic Acim-Maravic merged commit 7ddeea3 into llvm:main Jan 23, 2025
5 of 7 checks passed