Skip to content

Conversation

@broxigarchen
Copy link
Contributor

@broxigarchen broxigarchen commented Dec 19, 2024

Support true16 format for v_sat_pk_u8_i16 in MC

@broxigarchen broxigarchen force-pushed the main-merge-true16-mc-vop1-more-instruction-1 branch from 5eb527a to a12f37b Compare December 19, 2024 21:58
@broxigarchen broxigarchen force-pushed the main-merge-true16-mc-vop1-more-instruction-1 branch from a12f37b to 625ae52 Compare December 20, 2024 04:26
@broxigarchen broxigarchen marked this pull request as ready for review December 20, 2024 14:44
@llvmbot llvmbot added backend:AMDGPU llvm:mc Machine (object) code labels Dec 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 20, 2024

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-mc

Author: Brox Chen (broxigarchen)

Changes

Support true16 format for v_sat_pk_u8_i16 in MC


Patch is 101.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/120634.diff

28 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/VOP1Instructions.td (+1-1)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop1.s (+39-30)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp16.s (+37-28)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp8.s (+15-6)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err.s (+18)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_promote.s (+15-6)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s (+31-28)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1.s (+9-6)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop3_from_vop1.s (+33-30)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1.s (+36-15)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_dpp16.s (+6)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_dpp8.s (+6)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_t16_err.s (+18)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_t16_promote.s (+9)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1.s (+3)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1_dpp16.s (+3)
  • (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1_dpp8.s (+3)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop1.txt (+40-15)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop1_dpp16.txt (+40-14)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop1_dpp8.txt (+15-2)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16_from_vop1.txt (+32-14)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp8_from_vop1.txt (+8-2)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_from_vop1.txt (+34-15)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp16.txt (+36-14)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp8.txt (+11-2)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1.txt (+34-15)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1_dpp16.txt (+32-14)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1_dpp8.txt (+8-2)
diff --git a/llvm/lib/Target/AMDGPU/VOP1Instructions.td b/llvm/lib/Target/AMDGPU/VOP1Instructions.td
index 1dd39be9e8d9c7..bbb456ab739ab3 100644
--- a/llvm/lib/Target/AMDGPU/VOP1Instructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP1Instructions.td
@@ -1047,7 +1047,7 @@ defm V_RNDNE_F16_fake16      : VOP1_Real_FULL_t16_gfx11_gfx12<0x05e, "v_rndne_f1
 defm V_FRACT_F16_fake16      : VOP1_Real_FULL_t16_gfx11_gfx12<0x05f, "v_fract_f16">;
 defm V_SIN_F16_fake16        : VOP1_Real_FULL_t16_gfx11_gfx12<0x060, "v_sin_f16">;
 defm V_COS_F16_fake16        : VOP1_Real_FULL_t16_gfx11_gfx12<0x061, "v_cos_f16">;
-defm V_SAT_PK_U8_I16_fake16  : VOP1_Real_FULL_t16_gfx11_gfx12<0x062, "v_sat_pk_u8_i16">;
+defm V_SAT_PK_U8_I16         : VOP1_Real_FULL_t16_and_fake16_gfx11_gfx12<0x062, "v_sat_pk_u8_i16">;
 defm V_CVT_NORM_I16_F16      : VOP1_Real_FULL_t16_and_fake16_gfx11_gfx12<0x063, "v_cvt_norm_i16_f16">;
 defm V_CVT_NORM_U16_F16      : VOP1_Real_FULL_t16_and_fake16_gfx11_gfx12<0x064, "v_cvt_norm_u16_f16">;
 
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1.s
index 4e4dc6647daebd..4448720e6f79f8 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop1.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1.s
@@ -3236,50 +3236,59 @@ v_rsq_f64 v[5:6], src_scc
 v_rsq_f64 v[254:255], 0xaf123456
 // GFX11: v_rsq_f64_e32 v[254:255], 0xaf123456    ; encoding: [0xff,0x62,0xfc,0x7f,0x56,0x34,0x12,0xaf]
 
-v_sat_pk_u8_i16 v5, v1
-// GFX11: v_sat_pk_u8_i16_e32 v5, v1              ; encoding: [0x01,0xc5,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, v1
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, v1            ; encoding: [0x01,0xc5,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, v255
-// GFX11: v_sat_pk_u8_i16_e32 v5, v255            ; encoding: [0xff,0xc5,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, v255
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, v255          ; encoding: [0xff,0xc5,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, s1
-// GFX11: v_sat_pk_u8_i16_e32 v5, s1              ; encoding: [0x01,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, s1
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, s1            ; encoding: [0x01,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, s105
-// GFX11: v_sat_pk_u8_i16_e32 v5, s105            ; encoding: [0x69,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, s105
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, s105          ; encoding: [0x69,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, vcc_lo
-// GFX11: v_sat_pk_u8_i16_e32 v5, vcc_lo          ; encoding: [0x6a,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, vcc_lo
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, vcc_lo        ; encoding: [0x6a,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, vcc_hi
-// GFX11: v_sat_pk_u8_i16_e32 v5, vcc_hi          ; encoding: [0x6b,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, vcc_hi
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, vcc_hi        ; encoding: [0x6b,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, ttmp15
-// GFX11: v_sat_pk_u8_i16_e32 v5, ttmp15          ; encoding: [0x7b,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, ttmp15
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, ttmp15        ; encoding: [0x7b,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, m0
-// GFX11: v_sat_pk_u8_i16_e32 v5, m0              ; encoding: [0x7d,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, m0
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, m0            ; encoding: [0x7d,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, exec_lo
-// GFX11: v_sat_pk_u8_i16_e32 v5, exec_lo         ; encoding: [0x7e,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, exec_lo
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, exec_lo       ; encoding: [0x7e,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, exec_hi
-// GFX11: v_sat_pk_u8_i16_e32 v5, exec_hi         ; encoding: [0x7f,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, exec_hi
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, exec_hi       ; encoding: [0x7f,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, null
-// GFX11: v_sat_pk_u8_i16_e32 v5, null            ; encoding: [0x7c,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, null
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, null          ; encoding: [0x7c,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, -1
-// GFX11: v_sat_pk_u8_i16_e32 v5, -1              ; encoding: [0xc1,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, -1
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, -1            ; encoding: [0xc1,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, 0.5
-// GFX11: v_sat_pk_u8_i16_e32 v5, 0.5             ; encoding: [0xf0,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, 0.5
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, 0.5           ; encoding: [0xf0,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v5, src_scc
-// GFX11: v_sat_pk_u8_i16_e32 v5, src_scc         ; encoding: [0xfd,0xc4,0x0a,0x7e]
+v_sat_pk_u8_i16 v5.l, src_scc
+// GFX11: v_sat_pk_u8_i16_e32 v5.l, src_scc       ; encoding: [0xfd,0xc4,0x0a,0x7e]
 
-v_sat_pk_u8_i16 v127, 0xfe0b
-// GFX11: v_sat_pk_u8_i16_e32 v127, 0xfe0b        ; encoding: [0xff,0xc4,0xfe,0x7e,0x0b,0xfe,0x00,0x00]
+v_sat_pk_u8_i16 v127.l, 0xfe0b
+// GFX11: v_sat_pk_u8_i16_e32 v127.l, 0xfe0b      ; encoding: [0xff,0xc4,0xfe,0x7e,0x0b,0xfe,0x00,0x00]
+
+v_sat_pk_u8_i16 v127.l, 0.5
+// GFX11: v_sat_pk_u8_i16_e32 v127.l, 0.5         ; encoding: [0xf0,0xc4,0xfe,0x7e]
+
+v_sat_pk_u8_i16 v5.h, src_scc
+// GFX11: v_sat_pk_u8_i16_e32 v5.h, src_scc       ; encoding: [0xfd,0xc4,0x0a,0x7f]
+
+v_sat_pk_u8_i16 v127.h, 0xfe0b
+// GFX11: v_sat_pk_u8_i16_e32 v127.h, 0xfe0b      ; encoding: [0xff,0xc4,0xfe,0x7f,0x0b,0xfe,0x00,0x00]
 
 v_sin_f16 v5, v1
 // GFX11: v_sin_f16_e32 v5, v1                    ; encoding: [0x01,0xc1,0x0a,0x7e]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp16.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp16.s
index 98e4b29b25666e..da2a3615360a43 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp16.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp16.s
@@ -2522,47 +2522,56 @@ v_rsq_f32 v5, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
 v_rsq_f32 v255, -|v255| row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
 // GFX11: v_rsq_f32_dpp v255, -|v255| row_xmask:15 row_mask:0x3 bank_mask:0x0 fi:1 ; encoding: [0xfa,0x5c,0xfe,0x7f,0xff,0x6f,0x35,0x30]
 
-v_sat_pk_u8_i16 v5, v1 quad_perm:[3,2,1,0]
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x1b,0x00,0xff]
+v_sat_pk_u8_i16 v5.l, v1 quad_perm:[3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x1b,0x00,0xff]
 
-v_sat_pk_u8_i16 v5, v1 quad_perm:[0,1,2,3]
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0xe4,0x00,0xff]
+v_sat_pk_u8_i16 v5.l, v1 quad_perm:[0,1,2,3]
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0xe4,0x00,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_mirror
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x40,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_mirror
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x40,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_half_mirror
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x41,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_half_mirror
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x41,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_shl:1
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x01,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_shl:1
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x01,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_shl:15
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x0f,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_shl:15
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x0f,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_shr:1
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x11,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_shr:1
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x11,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_shr:15
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x1f,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_shr:15
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x1f,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_ror:1
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x21,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_ror:1
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x21,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_ror:15
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x2f,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_ror:15
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x2f,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_share:0 row_mask:0xf bank_mask:0xf
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x50,0x01,0xff]
+v_sat_pk_u8_i16 v5.l, v1 row_share:0 row_mask:0xf bank_mask:0xf
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x50,0x01,0xff]
 
-v_sat_pk_u8_i16 v5, v1 row_share:15 row_mask:0x0 bank_mask:0x1
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x5f,0x01,0x01]
+v_sat_pk_u8_i16 v5.l, v1 row_share:15 row_mask:0x0 bank_mask:0x1
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x5f,0x01,0x01]
 
-v_sat_pk_u8_i16 v5, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x60,0x09,0x13]
+v_sat_pk_u8_i16 v5.l, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 ; encoding: [0xfa,0xc4,0x0a,0x7e,0x01,0x60,0x09,0x13]
 
-v_sat_pk_u8_i16 v127, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
-// GFX11: v_sat_pk_u8_i16_dpp v127, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 fi:1 ; encoding: [0xfa,0xc4,0xfe,0x7e,0xff,0x6f,0x05,0x30]
+v_sat_pk_u8_i16 v127.l, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 fi:1
+// GFX11: v_sat_pk_u8_i16_dpp v127.l, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 fi:1 ; encoding: [0xfa,0xc4,0xfe,0x7e,0xff,0x6f,0x05,0x30]
+
+v_sat_pk_u8_i16 v127.l, v1 row_share:15 row_mask:0x0 bank_mask:0x1
+// GFX11: v_sat_pk_u8_i16_dpp v127.l, v1 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0xfa,0xc4,0xfe,0x7e,0x01,0x5f,0x01,0x01]
+
+v_sat_pk_u8_i16 v5.h, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
+// GFX11: v_sat_pk_u8_i16_dpp v5.h, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 ; encoding: [0xfa,0xc4,0x0a,0x7f,0x01,0x60,0x09,0x13]
+
+v_sat_pk_u8_i16 v127.h, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
+// GFX11: v_sat_pk_u8_i16_dpp v127.h, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 fi:1 ; encoding: [0xfa,0xc4,0xfe,0x7f,0xff,0x6f,0x05,0x30]
 
 v_sin_f16 v5, v1 quad_perm:[3,2,1,0]
 // GFX11: v_sin_f16_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0xc0,0x0a,0x7e,0x01,0x1b,0x00,0xff]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp8.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp8.s
index ab4606af2bb35c..34cb2d097b7a72 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp8.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_dpp8.s
@@ -587,14 +587,23 @@ v_rsq_f32 v5, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1
 v_rsq_f32 v255, v255 dpp8:[0,0,0,0,0,0,0,0] fi:0
 // GFX11: v_rsq_f32_dpp v255, v255 dpp8:[0,0,0,0,0,0,0,0] ; encoding: [0xe9,0x5c,0xfe,0x7f,0xff,0x00,0x00,0x00]
 
-v_sat_pk_u8_i16 v5, v1 dpp8:[7,6,5,4,3,2,1,0]
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0xc4,0x0a,0x7e,0x01,0x77,0x39,0x05]
+v_sat_pk_u8_i16 v5.l, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0xc4,0x0a,0x7e,0x01,0x77,0x39,0x05]
 
-v_sat_pk_u8_i16 v5, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1
-// GFX11: v_sat_pk_u8_i16_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1 ; encoding: [0xea,0xc4,0x0a,0x7e,0x01,0x77,0x39,0x05]
+v_sat_pk_u8_i16 v5.l, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1
+// GFX11: v_sat_pk_u8_i16_dpp v5.l, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1 ; encoding: [0xea,0xc4,0x0a,0x7e,0x01,0x77,0x39,0x05]
 
-v_sat_pk_u8_i16 v127, v255 dpp8:[0,0,0,0,0,0,0,0] fi:0
-// GFX11: v_sat_pk_u8_i16_dpp v127, v255 dpp8:[0,0,0,0,0,0,0,0] ; encoding: [0xe9,0xc4,0xfe,0x7e,0xff,0x00,0x00,0x00]
+v_sat_pk_u8_i16 v127.l, v255 dpp8:[0,0,0,0,0,0,0,0]
+// GFX11: v_sat_pk_u8_i16_dpp v127.l, v255 dpp8:[0,0,0,0,0,0,0,0] ; encoding: [0xe9,0xc4,0xfe,0x7e,0xff,0x00,0x00,0x00]
+
+v_sat_pk_u8_i16 v127.l, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_dpp v127.l, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0xc4,0xfe,0x7e,0x01,0x77,0x39,0x05]
+
+v_sat_pk_u8_i16 v5.h, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1
+// GFX11: v_sat_pk_u8_i16_dpp v5.h, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1 ; encoding: [0xea,0xc4,0x0a,0x7f,0x01,0x77,0x39,0x05]
+
+v_sat_pk_u8_i16 v127.h, v255 dpp8:[0,0,0,0,0,0,0,0] fi:0
+// GFX11: v_sat_pk_u8_i16_dpp v127.h, v255 dpp8:[0,0,0,0,0,0,0,0] ; encoding: [0xe9,0xc4,0xfe,0x7f,0xff,0x00,0x00,0x00]
 
 v_sin_f16 v5, v1 dpp8:[7,6,5,4,3,2,1,0]
 // GFX11: v_sin_f16_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0xc0,0x0a,0x7e,0x01,0x77,0x39,0x05]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err.s
index 4ae91340386b65..9c5693de3d8b1b 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err.s
@@ -716,6 +716,24 @@ v_sat_pk_u8_i16_e32 v199, v5 dpp8:[7,6,5,4,3,2,1,0]
 v_sat_pk_u8_i16_e32 v199, v5 quad_perm:[3,2,1,0]
 // GFX11: :[[@LINE-1]]:30: error: invalid operand for instruction
 
+v_sat_pk_u8_i16_e32 v199.h, v5.h
+// GFX11: :[[@LINE-1]]:21: error: invalid operand for instruction
+
+v_sat_pk_u8_i16_e32 v199.h, v5.h dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: :[[@LINE-1]]:21: error: invalid operand for instruction
+
+v_sat_pk_u8_i16_e32 v199.h, v5.h quad_perm:[3,2,1,0]
+// GFX11: :[[@LINE-1]]:21: error: invalid operand for instruction
+
+v_sat_pk_u8_i16_e32 v199.l, v5.l
+// GFX11: :[[@LINE-1]]:21: error: invalid operand for instruction
+
+v_sat_pk_u8_i16_e32 v199.l, v5.l dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: :[[@LINE-1]]:21: error: invalid operand for instruction
+
+v_sat_pk_u8_i16_e32 v199.l, v5.l quad_perm:[3,2,1,0]
+// GFX11: :[[@LINE-1]]:21: error: invalid operand for instruction
+
 v_sin_f16_e32 v128, 0xfe0b
 // GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
 
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_promote.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_promote.s
index 1d441720280ca2..fa6ab407f87c7c 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_promote.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_promote.s
@@ -1802,14 +1802,23 @@ v_rsq_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
 v_rsq_f16 v5, v199 quad_perm:[3,2,1,0]
 // GFX11: v_rsq_f16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xd6,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
 
-v_sat_pk_u8_i16 v199, v5
-// GFX11: v_sat_pk_u8_i16_e64 v199, v5            ; encoding: [0xc7,0x00,0xe2,0xd5,0x05,0x01,0x00,0x00]
+v_sat_pk_u8_i16 v199.h, v5
+// GFX11: v_sat_pk_u8_i16_e64 v199.h, v5 op_sel:[0,1] ; encoding: [0xc7,0x40,0xe2,0xd5,0x05,0x01,0x00,0x00]
 
-v_sat_pk_u8_i16 v199, v5 dpp8:[7,6,5,4,3,2,1,0]
-// GFX11: v_sat_pk_u8_i16_e64_dpp v199, v5 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xc7,0x00,0xe2,0xd5,0xe9,0x00,0x00,0x00,0x05,0x77,0x39,0x05]
+v_sat_pk_u8_i16 v199.h, v5 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_e64_dpp v199.h, v5 op_sel:[0,1] dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xc7,0x40,0xe2,0xd5,0xe9,0x00,0x00,0x00,0x05,0x77,0x39,0x05]
 
-v_sat_pk_u8_i16 v199, v5 quad_perm:[3,2,1,0]
-// GFX11: v_sat_pk_u8_i16_e64_dpp v199, v5 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xc7,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x05,0x1b,0x00,0xff]
+v_sat_pk_u8_i16 v199.h, v5 quad_perm:[3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_e64_dpp v199.h, v5 op_sel:[0,1] quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xc7,0x40,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x05,0x1b,0x00,0xff]
+
+v_sat_pk_u8_i16 v199.l, v5
+// GFX11: v_sat_pk_u8_i16_e64 v199.l, v5          ; encoding: [0xc7,0x00,0xe2,0xd5,0x05,0x01,0x00,0x00]
+
+v_sat_pk_u8_i16 v199.l, v5 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_e64_dpp v199.l, v5 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xc7,0x00,0xe2,0xd5,0xe9,0x00,0x00,0x00,0x05,0x77,0x39,0x05]
+
+v_sat_pk_u8_i16 v199.l, v5 quad_perm:[3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_e64_dpp v199.l, v5 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xc7,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x05,0x1b,0x00,0xff]
 
 v_sin_f16 v128, 0xfe0b
 // GFX11: v_sin_f16_e64 v128, 0xfe0b              ; encoding: [0x80,0x00,0xe0,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s
index f38ff6a2fdd7d3..1bd1a5c5695bcf 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s
@@ -2644,47 +2644,50 @@ v_rsq_f32_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl
 v_rsq_f32_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
 // GFX11: v_rsq_f32_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 fi:1 ; encoding: [0xff,0x81,0xae,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x05,0x30]
 
-v_sat_pk_u8_i16_e64_dpp v5, v1 quad_perm:[3,2,1,0]
-// GFX11: v_sat_pk_u8_i16_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+v_sat_pk_u8_i16_e64_dpp v5.l, v1 quad_perm:[3,2,1,0]
+// GFX11: v_sat_pk_u8_i16_e64_dpp v5.l, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 
-v_sat_pk_u8_i16_e64_dpp v5, v1 quad_perm:[0,1,2,3]
-// GFX11: v_sat_pk_u8_i16_e64_dpp v5, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+v_sat_pk_u8_i16_e64_dpp v5.l, v1 quad_perm:[0,1,2,3]
+// GFX11: v_sat_pk_u8_i16_e64_dpp v5.l, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
 
-v_sat_pk_u8_i16_e64_dpp v5, v1 row_mirror
-// GFX11: v_sat_pk_u8_i16_e64_dpp v5, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+v_sat_pk_u8_i16_e64_dpp v5.l, v1 row_mirror
+// GFX11: v_sat_pk_u8_i16_e64_dpp v5.l, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
 
-v_sat_pk_u8_i16_e64_dpp v5, v1 row_half_mirror
-// GFX11: v_sat_pk_u8_i16_e64_dpp v5, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+v_sat_pk_u8_i16_e64_dpp v5.l, v1 row_half_mirror
+// GFX11: v_sat_pk_u8_i16_e64_dpp v5.l, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
 
-v_sat_pk_u8_i16_e64_dpp v5, v1 row_shl:1
-// GFX11: v_sat_pk_u8_i16_e64_dpp v5, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+v_sat_pk_u8_i16_e64_dpp v5.l, v1 row_shl:1
+// GFX11: v_sat_pk_u8_i16_e64_dpp v5.l, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
 
-v_sat_pk_u8_i16_e64_dpp v5, v1 row_shl:15
-// GFX11: v_sat_pk_u8_i16_e64_...
[truncated]

@broxigarchen broxigarchen merged commit 322f16e into llvm:main Jan 3, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU llvm:mc Machine (object) code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants