Skip to content

Conversation

@broxigarchen
Copy link
Contributor

@broxigarchen broxigarchen commented Apr 15, 2025

This is another NFC patch.

Update mc test for a few true16 instructions by duplicating the file to fake16 versions and udpate mattr flag with +/-real-true16. Also added some fake16 file that are not properly created before

@broxigarchen broxigarchen changed the title clean up of true16 mc changes [AMDGPU][True16][MC] update a few mc test in bmm and gfx11 Apr 15, 2025
@broxigarchen broxigarchen changed the title [AMDGPU][True16][MC] update a few mc test in bmm and gfx11 [AMDGPU][True16][MC] update a few mc test for true16 Apr 15, 2025
@broxigarchen broxigarchen marked this pull request as ready for review April 15, 2025 17:02
@broxigarchen broxigarchen requested review from Sisyph and shiltian April 15, 2025 17:02
@llvmbot llvmbot added backend:AMDGPU llvm:mc Machine (object) code labels Apr 15, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Brox Chen (broxigarchen)

Changes

This is another NFC patch.

Update mc test for a few true16 instructions by duplicating the file to fake16 versions and udpate mattr flag with +/-real-true16


Patch is 46.95 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135816.diff

8 Files Affected:

  • (added) llvm/test/MC/AMDGPU/bf16_imm-fake16.s (+114)
  • (modified) llvm/test/MC/AMDGPU/bf16_imm.s (+32-32)
  • (added) llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s (+353)
  • (modified) llvm/test/MC/AMDGPU/gfx11-promotions.s (+33-33)
  • (added) llvm/test/MC/AMDGPU/gfx1150_asm_features-fake16.s (+48)
  • (modified) llvm/test/MC/AMDGPU/gfx1150_asm_features.s (+10-10)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_t16.s (+59)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop3_alias.s (+4-4)
diff --git a/llvm/test/MC/AMDGPU/bf16_imm-fake16.s b/llvm/test/MC/AMDGPU/bf16_imm-fake16.s
new file mode 100644
index 0000000000000..ee697bee6ab2d
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/bf16_imm-fake16.s
@@ -0,0 +1,114 @@
+// NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --unique --version 5
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -show-encoding %s | FileCheck %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -mattr=-real-true16 -show-encoding %s | FileCheck %s
+
+v_dot2_bf16_bf16 v5, v1, v2, 100.0
+// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8     ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xfe,0x03,0xc8,0x42,0x00,0x00]
+
+v_dot2_bf16_bf16 v2, v0, 1.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 1.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe5,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, 1.0, v0, v2
+// CHECK: v_dot2_bf16_bf16 v2, 1.0, v0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0xf2,0x00,0x0a,0x04]
+
+v_dot2_bf16_bf16 v5, v1, v2, 1.0
+// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 1.0        ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xca,0x03]
+
+v_dot2_bf16_bf16 v2, v0, -1.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -1.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe7,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 0.5, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.5, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, -0.5, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -0.5, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe3,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 2.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 2.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe9,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, -2.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -2.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xeb,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 4.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 4.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xed,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, -4.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -4.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xef,0x09,0x04]
+
+// Check 1/(2*pi) rounded value and ideomatic fp32 0.15915494 value
+// which cannot be accurately represented in bf16.
+
+v_dot2_bf16_bf16 v2, v0, 0.158203125, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 0.15915494, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 0x3e22, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, v2, 0.15915494
+// CHECK: v_dot2_bf16_bf16 v2, v0, v2, 0.15915494 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0x05,0xe2,0x03]
+
+v_dot2_f32_bf16 v2, v1, 0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0, v2           ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0x01,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 0.5, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0.5, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe1,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -0.5, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -0.5, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe3,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 1.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 1.0, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe5,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -1.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -1.0, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe7,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 2.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 2.0, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe9,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -2.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -2.0, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xeb,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 4.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 4.0, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xed,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -4.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -4.0, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xef,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 0.15915494, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0.15915494, v2  ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xf1,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 0x3e22, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0.15915494, v2  ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xf1,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, 0.5, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 0.5, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf0,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -0.5, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -0.5, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf1,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 1.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 1.0, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf2,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -1.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -1.0, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf3,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 2.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 2.0, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf4,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -2.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -2.0, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf5,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 4.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 4.0, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf6,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -4.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -4.0, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf7,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 100.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 0x42c8, v1, v2      ; encoding: [0x02,0x40,0x1a,0xcc,0xff,0x02,0x0a,0x1c,0xc8,0x42,0x00,0x00]
+
+v_dot2_f32_bf16 v2, v1, 100.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0x42c8, v2      ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xff,0x09,0x1c,0xc8,0x42,0x00,0x00]
diff --git a/llvm/test/MC/AMDGPU/bf16_imm.s b/llvm/test/MC/AMDGPU/bf16_imm.s
index 7cf18103adfe5..d79649073aa89 100644
--- a/llvm/test/MC/AMDGPU/bf16_imm.s
+++ b/llvm/test/MC/AMDGPU/bf16_imm.s
@@ -1,54 +1,54 @@
 // NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --unique --version 5
-// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck %s
-// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -show-encoding %s | FileCheck %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+real-true16 -show-encoding %s | FileCheck %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -mattr=+real-true16 -show-encoding %s | FileCheck %s
 
-v_dot2_bf16_bf16 v5, v1, v2, 100.0
-// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8     ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xfe,0x03,0xc8,0x42,0x00,0x00]
+v_dot2_bf16_bf16 v5.l, v1, v2, 100.0
+// CHECK: v_dot2_bf16_bf16 v5.l, v1, v2, 0x42c8   ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xfe,0x03,0xc8,0x42,0x00,0x00]
 
-v_dot2_bf16_bf16 v2, v0, 1.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 1.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe5,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 1.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 1.0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe5,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, 1.0, v0, v2
-// CHECK: v_dot2_bf16_bf16 v2, 1.0, v0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0xf2,0x00,0x0a,0x04]
+v_dot2_bf16_bf16 v2.l, 1.0, v0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, 1.0, v0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0xf2,0x00,0x0a,0x04]
 
-v_dot2_bf16_bf16 v5, v1, v2, 1.0
-// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 1.0        ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xca,0x03]
+v_dot2_bf16_bf16 v5.l, v1, v2, 1.0
+// CHECK: v_dot2_bf16_bf16 v5.l, v1, v2, 1.0      ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xca,0x03]
 
-v_dot2_bf16_bf16 v2, v0, -1.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -1.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe7,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -1.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -1.0, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe7,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 0.5, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.5, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0.5, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.5, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, -0.5, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -0.5, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe3,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -0.5, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -0.5, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe3,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 2.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 2.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe9,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 2.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 2.0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe9,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, -2.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -2.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xeb,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -2.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -2.0, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xeb,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 4.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 4.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xed,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 4.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 4.0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xed,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, -4.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -4.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xef,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -4.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -4.0, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xef,0x09,0x04]
 
 // Check 1/(2*pi) rounded value and ideomatic fp32 0.15915494 value
 // which cannot be accurately represented in bf16.
 
-v_dot2_bf16_bf16 v2, v0, 0.158203125, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0.158203125, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 0.15915494, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 0x3e22, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0x3e22, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, v2, 0.15915494
-// CHECK: v_dot2_bf16_bf16 v2, v0, v2, 0.15915494 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0x05,0xe2,0x03]
+v_dot2_bf16_bf16 v2.l, v0, v2, 0.15915494
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, v2, 0.15915494 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0x05,0xe2,0x03]
 
 v_dot2_f32_bf16 v2, v1, 0, v2
 // CHECK: v_dot2_f32_bf16 v2, v1, 0, v2           ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0x01,0x09,0x1c]
diff --git a/llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s b/llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s
new file mode 100644
index 0000000000000..95a52ffe103fa
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s
@@ -0,0 +1,353 @@
+// RUN: llvm-mc -triple=amdgcn -show-encoding -mcpu=gfx1100 -mattr=+wavefrontsize32,-real-true16 %s | FileCheck --check-prefix=GFX11 %s
+
+// Check opcode promotions and forced suffices.
+// 1. When a suffix is optional, check that it may be omitted.
+// 2. When a suffix is optional, check that it may be specified w/o any effect.
+// 3. When a suffix is required, check that specifying it enforces opcode promotion.
+// 4. When a suffix is required, check that omitting the suffix results in a different encoding.
+
+//===----------------------------------------------------------------------===//
+// VOP1.
+//===----------------------------------------------------------------------===//
+
+v_mov_b32 v0, v1
+// GFX11: v_mov_b32_e32 v0, v1                    ; encoding: [0x01,0x03,0x00,0x7e]
+
+v_mov_b32_e32 v0, v1
+// GFX11: v_mov_b32_e32 v0, v1                    ; encoding: [0x01,0x03,0x00,0x7e]
+
+//===----------------------------------------------------------------------===//
+// VOP2.
+//===----------------------------------------------------------------------===//
+
+v_add_f16 v5, v1, v2
+// GFX11: v_add_f16_e32 v5, v1, v2                ; encoding: [0x01,0x05,0x0a,0x64]
+
+v_add_f16_e32 v5, v1, v2
+// GFX11: v_add_f16_e32 v5, v1, v2                ; encoding: [0x01,0x05,0x0a,0x64]
+
+//===----------------------------------------------------------------------===//
+// VOPC.
+//===----------------------------------------------------------------------===//
+
+v_cmp_lt_f32 vcc_lo, v1, v2
+// GFX11: v_cmp_lt_f32_e32 vcc_lo, v1, v2         ; encoding: [0x01,0x05,0x22,0x7c]
+
+v_cmp_lt_f32_e32 vcc_lo, v1, v2
+// GFX11: v_cmp_lt_f32_e32 vcc_lo, v1, v2         ; encoding: [0x01,0x05,0x22,0x7c]
+
+//===----------------------------------------------------------------------===//
+// VOPCX.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_class_f16 v1, v2
+// GFX11: v_cmpx_class_f16_e32 v1, v2             ; encoding: [0x01,0x05,0xfa,0x7d]
+
+v_cmpx_class_f16_e32 v1, v2
+// GFX11: v_cmpx_class_f16_e32 v1, v2             ; encoding: [0x01,0x05,0xfa,0x7d]
+
+//===----------------------------------------------------------------------===//
+// VOP1.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_bfrev_b32 v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x70,0x0a,0x7e,0x01,0x77,0x39,0x05]
+
+v_bfrev_b32_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x70,0x0a,0x7e,0x01,0x77,0x39,0x05]
+
+//===----------------------------------------------------------------------===//
+// VOP1.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_bfrev_b32 v5, v1 quad_perm:[3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x70,0x0a,0x7e,0x01,0x1b,0x00,0xff]
+
+v_bfrev_b32_dpp v5, v1 quad_perm:[3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x70,0x0a,0x7e,0x01,0x1b,0x00,0xff]
+
+//===----------------------------------------------------------------------===//
+// VOP2.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_add_f16 v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0x0a,0x64,0x01,0x77,0x39,0x05]
+
+v_add_f16_dpp v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0x0a,0x64,0x01,0x77,0x39,0x05]
+
+//===----------------------------------------------------------------------===//
+// VOP2.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_add_f16 v5, v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0x0a,0x64,0x01,0x1b,0x00,0xff]
+
+v_add_f16_dpp v5, v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0x0a,0x64,0x01,0x1b,0x00,0xff]
+
+//===----------------------------------------------------------------------===//
+// VOPC.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_cmp_le_u16 v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1
+// GFX11: v_cmp_le_u16 vcc_lo, v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1 ; encoding: [0xea,0x04,0x76,0x7c,0x01,0xff,0x47,0xfa]
+
+v_cmp_le_u16_dpp v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1
+// GFX11: v_cmp_le_u16 vcc_lo, v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1 ; encoding: [0xea,0x04,0x76,0x7c,0x01,0xff,0x47,0xfa]
+
+//===----------------------------------------------------------------------===//
+// VOPC.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_cmp_gt_u16 v1, v2 row_shl:0x7 row_mask:0x0 bank_mask:0x0 fi:1
+// GFX11: v_cmp_gt_u16 vcc_lo, v1, v2 row_shl:7 row_mask:0x0 bank_mask:0x0 fi:1 ; encoding: [0xfa,0x04,0x78,0x7c,0x01,0x07,0x05,0x00]
+
+v_cmp_gt_u16_dpp v1, v2 row_shl:0x7 row_mask:0x0 bank_mask:0x0 fi:1
+// GFX11: v_cmp_gt_u16 vcc_lo, v1, v2 row_shl:7 row_mask:0x0 bank_mask:0x0 fi:1 ; encoding: [0xfa,0x04,0x78,0x7c,0x01,0x07,0x05,0x00]
+
+//===----------------------------------------------------------------------===//
+// VOPCX.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_class_f16 v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0xfa,0x7d,0x01,0x77,0x39,0x05]
+
+v_cmpx_class_f16_dpp v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0xfa,0x7d,0x01,0x77,0x39,0x05]
+
+//===----------------------------------------------------------------------===//
+// VOPCX.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_class_f16 v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0xfa,0x7d,0x01,0x1b,0x00,0xff]
+
+v_cmpx_class_f16_dpp v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0xfa,0x7d,0x01,0x1b,0x00,0xff]
+
+//===----------------------------------------------------------------------===//
+// VOP1 -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_sin_f32 v5, 0.5 mul:2
+// GFX11: v_sin_f32_e64 v5, 0.5 mul:2             ; encoding: [0x05,0x00,0xb5,0xd5,0xf0,0x00,0x00,0x08]
+
+v_sin_f32_e64 v5, 0.5 mul:2
+// GFX11: v_sin_f32_e64 v5, 0.5 mul:2             ; encoding: [0x05,0x00,0xb5,0xd5,0xf0,0x00,0x00,0x08]
+
+v_sin_f32_e64 v5, v1
+// GFX11: v_sin_f32_e64 v5, v1                    ; encoding: [0x05,0x00,0xb5,0xd5,0x01,0x01,0x00,0x00]
+
+v_sin_f32 v5, v1
+// GFX11: v_sin_f32_e32 v5, v1                    ; encoding: [0x01,0x6b,0x0a,0x7e]
+
+//===----------------------------------------------------------------------===//
+// VOP2 -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_add_f32 v5, v1, -v2
+// GFX11: v_add_f32_e64 v5, v1, -v2               ; encoding: [0x05,0x00,0x03,0xd5,0x01,0x05,0x02,0x40]
+
+v_add_f32_e64 v5, v1, -v2
+// GFX11: v_add_f32_e64 v5, v1, -v2               ; encoding: [0x05,0x00,0x03,0xd5,0x01,0x05,0x02,0x40]
+
+v_add_f32_e64 v5, v1, v2
+// GFX11: v_add_f32_e64 v5, v1, v2                ; encoding: [0x05,0x00,0x03,0xd5,0x01,0x05,0x02,0x00]
+
+v_add_f32 v5, v1, v2
+// GFX11: v_add_f32_e32 v5, v1, v2                ; encoding: [0x01,0x05,0x0a,0x06]
+
+//===----------------------------------------------------------------------===//
+// VOPC -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_cmp_f_f32 s10, -v1, v2
+// GFX11: v_cmp_f_f32_e64 s10, -v1, v2            ; encoding: [0x0a,0x00,0x10,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmp_f_f32_e64 s10, -v1, v2
+// GFX11: v_cmp_f_f32_e64 s10, -v1, v2            ; encoding: [0x0a,0x00,0x10,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmp_f_f32_e64 vcc_lo, v1, v2
+// GFX11: v_cmp_f_f32_e64 vcc_lo, v1, v2          ; encoding: [0x6a,0x00,0x10,0xd4,0x01,0x05,0x02,0x00]
+
+v_cmp_f_f32 vcc_lo, v1, v2
+// GFX11: v_cmp_f_f32_e32 vcc_lo, v1, v2          ; encoding: [0x01,0x05,0x20,0x7c]
+
+//===----------------------------------------------------------------------===//
+// VOPCX -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_f_f32 -v1, v2
+// GFX11: v_cmpx_f_f32_e64 -v1, v2                ; encoding: [0x7e,0x00,0x90,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmpx_f_f32_e64 -v1, v2
+// GFX11: v_cmpx_f_f32_e64 -v1, v2                ; encoding: [0x7e,0x00,0x90,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmpx_f_f32_e64 v1, v2
+// GFX11: v_cmpx_f_f32_e64 v1, v2                 ; encoding: [0x7e,0x00,0x90,0xd4,0x01,0x05,0x02,0x00]
+
+v_cmpx_f_f32 v1, v2
+// GFX11: v_cmpx_f_f32_e...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-mc

Author: Brox Chen (broxigarchen)

Changes

This is another NFC patch.

Update mc test for a few true16 instructions by duplicating the file to fake16 versions and udpate mattr flag with +/-real-true16


Patch is 46.95 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135816.diff

8 Files Affected:

  • (added) llvm/test/MC/AMDGPU/bf16_imm-fake16.s (+114)
  • (modified) llvm/test/MC/AMDGPU/bf16_imm.s (+32-32)
  • (added) llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s (+353)
  • (modified) llvm/test/MC/AMDGPU/gfx11-promotions.s (+33-33)
  • (added) llvm/test/MC/AMDGPU/gfx1150_asm_features-fake16.s (+48)
  • (modified) llvm/test/MC/AMDGPU/gfx1150_asm_features.s (+10-10)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_t16.s (+59)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vop3_alias.s (+4-4)
diff --git a/llvm/test/MC/AMDGPU/bf16_imm-fake16.s b/llvm/test/MC/AMDGPU/bf16_imm-fake16.s
new file mode 100644
index 0000000000000..ee697bee6ab2d
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/bf16_imm-fake16.s
@@ -0,0 +1,114 @@
+// NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --unique --version 5
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -show-encoding %s | FileCheck %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -mattr=-real-true16 -show-encoding %s | FileCheck %s
+
+v_dot2_bf16_bf16 v5, v1, v2, 100.0
+// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8     ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xfe,0x03,0xc8,0x42,0x00,0x00]
+
+v_dot2_bf16_bf16 v2, v0, 1.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 1.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe5,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, 1.0, v0, v2
+// CHECK: v_dot2_bf16_bf16 v2, 1.0, v0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0xf2,0x00,0x0a,0x04]
+
+v_dot2_bf16_bf16 v5, v1, v2, 1.0
+// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 1.0        ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xca,0x03]
+
+v_dot2_bf16_bf16 v2, v0, -1.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -1.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe7,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 0.5, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.5, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, -0.5, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -0.5, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe3,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 2.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 2.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe9,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, -2.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -2.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xeb,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 4.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 4.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xed,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, -4.0, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, -4.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xef,0x09,0x04]
+
+// Check 1/(2*pi) rounded value and ideomatic fp32 0.15915494 value
+// which cannot be accurately represented in bf16.
+
+v_dot2_bf16_bf16 v2, v0, 0.158203125, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 0.15915494, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, 0x3e22, v2
+// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+
+v_dot2_bf16_bf16 v2, v0, v2, 0.15915494
+// CHECK: v_dot2_bf16_bf16 v2, v0, v2, 0.15915494 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0x05,0xe2,0x03]
+
+v_dot2_f32_bf16 v2, v1, 0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0, v2           ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0x01,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 0.5, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0.5, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe1,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -0.5, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -0.5, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe3,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 1.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 1.0, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe5,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -1.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -1.0, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe7,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 2.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 2.0, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xe9,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -2.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -2.0, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xeb,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 4.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 4.0, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xed,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, -4.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, -4.0, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xef,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 0.15915494, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0.15915494, v2  ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xf1,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, v1, 0x3e22, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0.15915494, v2  ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xf1,0x09,0x1c]
+
+v_dot2_f32_bf16 v2, 0.5, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 0.5, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf0,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -0.5, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -0.5, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf1,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 1.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 1.0, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf2,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -1.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -1.0, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf3,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 2.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 2.0, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf4,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -2.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -2.0, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf5,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 4.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 4.0, v1, v2         ; encoding: [0x02,0x40,0x1a,0xcc,0xf6,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, -4.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, -4.0, v1, v2        ; encoding: [0x02,0x40,0x1a,0xcc,0xf7,0x02,0x0a,0x1c]
+
+v_dot2_f32_bf16 v2, 100.0, v1, v2
+// CHECK: v_dot2_f32_bf16 v2, 0x42c8, v1, v2      ; encoding: [0x02,0x40,0x1a,0xcc,0xff,0x02,0x0a,0x1c,0xc8,0x42,0x00,0x00]
+
+v_dot2_f32_bf16 v2, v1, 100.0, v2
+// CHECK: v_dot2_f32_bf16 v2, v1, 0x42c8, v2      ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0xff,0x09,0x1c,0xc8,0x42,0x00,0x00]
diff --git a/llvm/test/MC/AMDGPU/bf16_imm.s b/llvm/test/MC/AMDGPU/bf16_imm.s
index 7cf18103adfe5..d79649073aa89 100644
--- a/llvm/test/MC/AMDGPU/bf16_imm.s
+++ b/llvm/test/MC/AMDGPU/bf16_imm.s
@@ -1,54 +1,54 @@
 // NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --unique --version 5
-// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck %s
-// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -show-encoding %s | FileCheck %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+real-true16 -show-encoding %s | FileCheck %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -mattr=+real-true16 -show-encoding %s | FileCheck %s
 
-v_dot2_bf16_bf16 v5, v1, v2, 100.0
-// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8     ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xfe,0x03,0xc8,0x42,0x00,0x00]
+v_dot2_bf16_bf16 v5.l, v1, v2, 100.0
+// CHECK: v_dot2_bf16_bf16 v5.l, v1, v2, 0x42c8   ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xfe,0x03,0xc8,0x42,0x00,0x00]
 
-v_dot2_bf16_bf16 v2, v0, 1.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 1.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe5,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 1.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 1.0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe5,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, 1.0, v0, v2
-// CHECK: v_dot2_bf16_bf16 v2, 1.0, v0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0xf2,0x00,0x0a,0x04]
+v_dot2_bf16_bf16 v2.l, 1.0, v0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, 1.0, v0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0xf2,0x00,0x0a,0x04]
 
-v_dot2_bf16_bf16 v5, v1, v2, 1.0
-// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 1.0        ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xca,0x03]
+v_dot2_bf16_bf16 v5.l, v1, v2, 1.0
+// CHECK: v_dot2_bf16_bf16 v5.l, v1, v2, 1.0      ; encoding: [0x05,0x00,0x67,0xd6,0x01,0x05,0xca,0x03]
 
-v_dot2_bf16_bf16 v2, v0, -1.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -1.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe7,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -1.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -1.0, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe7,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 0.5, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.5, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0.5, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.5, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, -0.5, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -0.5, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe3,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -0.5, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -0.5, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe3,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 2.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 2.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe9,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 2.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 2.0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xe9,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, -2.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -2.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xeb,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -2.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -2.0, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xeb,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 4.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 4.0, v2        ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xed,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 4.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 4.0, v2.l    ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xed,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, -4.0, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, -4.0, v2       ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xef,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, -4.0, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, -4.0, v2.l   ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xef,0x09,0x04]
 
 // Check 1/(2*pi) rounded value and ideomatic fp32 0.15915494 value
 // which cannot be accurately represented in bf16.
 
-v_dot2_bf16_bf16 v2, v0, 0.158203125, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0.158203125, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 0.15915494, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, 0x3e22, v2
-// CHECK: v_dot2_bf16_bf16 v2, v0, 0.15915494, v2 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
+v_dot2_bf16_bf16 v2.l, v0, 0x3e22, v2.l
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, 0.15915494, v2.l ; encoding: [0x02,0x00,0x67,0xd6,0x00,0xf1,0x09,0x04]
 
-v_dot2_bf16_bf16 v2, v0, v2, 0.15915494
-// CHECK: v_dot2_bf16_bf16 v2, v0, v2, 0.15915494 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0x05,0xe2,0x03]
+v_dot2_bf16_bf16 v2.l, v0, v2, 0.15915494
+// CHECK: v_dot2_bf16_bf16 v2.l, v0, v2, 0.15915494 ; encoding: [0x02,0x00,0x67,0xd6,0x00,0x05,0xe2,0x03]
 
 v_dot2_f32_bf16 v2, v1, 0, v2
 // CHECK: v_dot2_f32_bf16 v2, v1, 0, v2           ; encoding: [0x02,0x40,0x1a,0xcc,0x01,0x01,0x09,0x1c]
diff --git a/llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s b/llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s
new file mode 100644
index 0000000000000..95a52ffe103fa
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx11-promotions-fake16.s
@@ -0,0 +1,353 @@
+// RUN: llvm-mc -triple=amdgcn -show-encoding -mcpu=gfx1100 -mattr=+wavefrontsize32,-real-true16 %s | FileCheck --check-prefix=GFX11 %s
+
+// Check opcode promotions and forced suffices.
+// 1. When a suffix is optional, check that it may be omitted.
+// 2. When a suffix is optional, check that it may be specified w/o any effect.
+// 3. When a suffix is required, check that specifying it enforces opcode promotion.
+// 4. When a suffix is required, check that omitting the suffix results in a different encoding.
+
+//===----------------------------------------------------------------------===//
+// VOP1.
+//===----------------------------------------------------------------------===//
+
+v_mov_b32 v0, v1
+// GFX11: v_mov_b32_e32 v0, v1                    ; encoding: [0x01,0x03,0x00,0x7e]
+
+v_mov_b32_e32 v0, v1
+// GFX11: v_mov_b32_e32 v0, v1                    ; encoding: [0x01,0x03,0x00,0x7e]
+
+//===----------------------------------------------------------------------===//
+// VOP2.
+//===----------------------------------------------------------------------===//
+
+v_add_f16 v5, v1, v2
+// GFX11: v_add_f16_e32 v5, v1, v2                ; encoding: [0x01,0x05,0x0a,0x64]
+
+v_add_f16_e32 v5, v1, v2
+// GFX11: v_add_f16_e32 v5, v1, v2                ; encoding: [0x01,0x05,0x0a,0x64]
+
+//===----------------------------------------------------------------------===//
+// VOPC.
+//===----------------------------------------------------------------------===//
+
+v_cmp_lt_f32 vcc_lo, v1, v2
+// GFX11: v_cmp_lt_f32_e32 vcc_lo, v1, v2         ; encoding: [0x01,0x05,0x22,0x7c]
+
+v_cmp_lt_f32_e32 vcc_lo, v1, v2
+// GFX11: v_cmp_lt_f32_e32 vcc_lo, v1, v2         ; encoding: [0x01,0x05,0x22,0x7c]
+
+//===----------------------------------------------------------------------===//
+// VOPCX.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_class_f16 v1, v2
+// GFX11: v_cmpx_class_f16_e32 v1, v2             ; encoding: [0x01,0x05,0xfa,0x7d]
+
+v_cmpx_class_f16_e32 v1, v2
+// GFX11: v_cmpx_class_f16_e32 v1, v2             ; encoding: [0x01,0x05,0xfa,0x7d]
+
+//===----------------------------------------------------------------------===//
+// VOP1.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_bfrev_b32 v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x70,0x0a,0x7e,0x01,0x77,0x39,0x05]
+
+v_bfrev_b32_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x70,0x0a,0x7e,0x01,0x77,0x39,0x05]
+
+//===----------------------------------------------------------------------===//
+// VOP1.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_bfrev_b32 v5, v1 quad_perm:[3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x70,0x0a,0x7e,0x01,0x1b,0x00,0xff]
+
+v_bfrev_b32_dpp v5, v1 quad_perm:[3,2,1,0]
+// GFX11: v_bfrev_b32_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x70,0x0a,0x7e,0x01,0x1b,0x00,0xff]
+
+//===----------------------------------------------------------------------===//
+// VOP2.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_add_f16 v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0x0a,0x64,0x01,0x77,0x39,0x05]
+
+v_add_f16_dpp v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0x0a,0x64,0x01,0x77,0x39,0x05]
+
+//===----------------------------------------------------------------------===//
+// VOP2.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_add_f16 v5, v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0x0a,0x64,0x01,0x1b,0x00,0xff]
+
+v_add_f16_dpp v5, v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_add_f16_dpp v5, v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0x0a,0x64,0x01,0x1b,0x00,0xff]
+
+//===----------------------------------------------------------------------===//
+// VOPC.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_cmp_le_u16 v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1
+// GFX11: v_cmp_le_u16 vcc_lo, v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1 ; encoding: [0xea,0x04,0x76,0x7c,0x01,0xff,0x47,0xfa]
+
+v_cmp_le_u16_dpp v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1
+// GFX11: v_cmp_le_u16 vcc_lo, v1, v2 dpp8:[7,7,7,3,4,4,6,7] fi:1 ; encoding: [0xea,0x04,0x76,0x7c,0x01,0xff,0x47,0xfa]
+
+//===----------------------------------------------------------------------===//
+// VOPC.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_cmp_gt_u16 v1, v2 row_shl:0x7 row_mask:0x0 bank_mask:0x0 fi:1
+// GFX11: v_cmp_gt_u16 vcc_lo, v1, v2 row_shl:7 row_mask:0x0 bank_mask:0x0 fi:1 ; encoding: [0xfa,0x04,0x78,0x7c,0x01,0x07,0x05,0x00]
+
+v_cmp_gt_u16_dpp v1, v2 row_shl:0x7 row_mask:0x0 bank_mask:0x0 fi:1
+// GFX11: v_cmp_gt_u16 vcc_lo, v1, v2 row_shl:7 row_mask:0x0 bank_mask:0x0 fi:1 ; encoding: [0xfa,0x04,0x78,0x7c,0x01,0x07,0x05,0x00]
+
+//===----------------------------------------------------------------------===//
+// VOPCX.DPP8.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_class_f16 v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0xfa,0x7d,0x01,0x77,0x39,0x05]
+
+v_cmpx_class_f16_dpp v1, v2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xe9,0x04,0xfa,0x7d,0x01,0x77,0x39,0x05]
+
+//===----------------------------------------------------------------------===//
+// VOPCX.DPP16.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_class_f16 v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0xfa,0x7d,0x01,0x1b,0x00,0xff]
+
+v_cmpx_class_f16_dpp v1, v2 quad_perm:[3,2,1,0]
+// GFX11: v_cmpx_class_f16 v1, v2 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xfa,0x04,0xfa,0x7d,0x01,0x1b,0x00,0xff]
+
+//===----------------------------------------------------------------------===//
+// VOP1 -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_sin_f32 v5, 0.5 mul:2
+// GFX11: v_sin_f32_e64 v5, 0.5 mul:2             ; encoding: [0x05,0x00,0xb5,0xd5,0xf0,0x00,0x00,0x08]
+
+v_sin_f32_e64 v5, 0.5 mul:2
+// GFX11: v_sin_f32_e64 v5, 0.5 mul:2             ; encoding: [0x05,0x00,0xb5,0xd5,0xf0,0x00,0x00,0x08]
+
+v_sin_f32_e64 v5, v1
+// GFX11: v_sin_f32_e64 v5, v1                    ; encoding: [0x05,0x00,0xb5,0xd5,0x01,0x01,0x00,0x00]
+
+v_sin_f32 v5, v1
+// GFX11: v_sin_f32_e32 v5, v1                    ; encoding: [0x01,0x6b,0x0a,0x7e]
+
+//===----------------------------------------------------------------------===//
+// VOP2 -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_add_f32 v5, v1, -v2
+// GFX11: v_add_f32_e64 v5, v1, -v2               ; encoding: [0x05,0x00,0x03,0xd5,0x01,0x05,0x02,0x40]
+
+v_add_f32_e64 v5, v1, -v2
+// GFX11: v_add_f32_e64 v5, v1, -v2               ; encoding: [0x05,0x00,0x03,0xd5,0x01,0x05,0x02,0x40]
+
+v_add_f32_e64 v5, v1, v2
+// GFX11: v_add_f32_e64 v5, v1, v2                ; encoding: [0x05,0x00,0x03,0xd5,0x01,0x05,0x02,0x00]
+
+v_add_f32 v5, v1, v2
+// GFX11: v_add_f32_e32 v5, v1, v2                ; encoding: [0x01,0x05,0x0a,0x06]
+
+//===----------------------------------------------------------------------===//
+// VOPC -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_cmp_f_f32 s10, -v1, v2
+// GFX11: v_cmp_f_f32_e64 s10, -v1, v2            ; encoding: [0x0a,0x00,0x10,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmp_f_f32_e64 s10, -v1, v2
+// GFX11: v_cmp_f_f32_e64 s10, -v1, v2            ; encoding: [0x0a,0x00,0x10,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmp_f_f32_e64 vcc_lo, v1, v2
+// GFX11: v_cmp_f_f32_e64 vcc_lo, v1, v2          ; encoding: [0x6a,0x00,0x10,0xd4,0x01,0x05,0x02,0x00]
+
+v_cmp_f_f32 vcc_lo, v1, v2
+// GFX11: v_cmp_f_f32_e32 vcc_lo, v1, v2          ; encoding: [0x01,0x05,0x20,0x7c]
+
+//===----------------------------------------------------------------------===//
+// VOPCX -> VOP3.
+//===----------------------------------------------------------------------===//
+
+v_cmpx_f_f32 -v1, v2
+// GFX11: v_cmpx_f_f32_e64 -v1, v2                ; encoding: [0x7e,0x00,0x90,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmpx_f_f32_e64 -v1, v2
+// GFX11: v_cmpx_f_f32_e64 -v1, v2                ; encoding: [0x7e,0x00,0x90,0xd4,0x01,0x05,0x02,0x20]
+
+v_cmpx_f_f32_e64 v1, v2
+// GFX11: v_cmpx_f_f32_e64 v1, v2                 ; encoding: [0x7e,0x00,0x90,0xd4,0x01,0x05,0x02,0x00]
+
+v_cmpx_f_f32 v1, v2
+// GFX11: v_cmpx_f_f32_e...
[truncated]

@broxigarchen broxigarchen force-pushed the main-merge-true16-mc-clean-up-4 branch from 5a21bc5 to bc2eeac Compare April 15, 2025 17:15
@broxigarchen broxigarchen force-pushed the main-merge-true16-mc-clean-up-4 branch from bc2eeac to 25dc466 Compare April 15, 2025 17:17
@@ -0,0 +1,43 @@
// RUN: not llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 %s 2>&1 | FileCheck %s -check-prefix=GCN-ERR --implicit-check-not=error: --strict-whitespace
// RUN: not llvm-mc -triple=amdgcn -mcpu=gfx1200 -mattr=-real-true16 %s 2>&1 | FileCheck %s -check-prefix=GCN-ERR --implicit-check-not=error: --strict-whitespace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also need to add -mattr=+real-true16 to llvm/test/MC/AMDGPU/gfx11_asm_vinterp_err.s?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's in a seperate PR #135823

@broxigarchen broxigarchen force-pushed the main-merge-true16-mc-clean-up-4 branch from 75d7a52 to fce2897 Compare April 15, 2025 19:44
@broxigarchen broxigarchen force-pushed the main-merge-true16-mc-clean-up-4 branch from fce2897 to 6d793ef Compare April 15, 2025 19:45
@broxigarchen broxigarchen requested a review from kosarev April 15, 2025 19:45
Copy link
Contributor

@Sisyph Sisyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@broxigarchen broxigarchen merged commit 181872f into llvm:main Apr 16, 2025
9 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU llvm:mc Machine (object) code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants