@shiltian
Contributor

The V_FMAMK instructions take src0, imm, src1 as their operands, so they need their own operand table when the VGPR MSBs are computed.

Fixes SWDEV-566579.

@shiltian shiltian requested review from arsenm and rampitec November 14, 2025 19:25
@llvmbot
Member

llvmbot commented Nov 14, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Shilei Tian (shiltian)

Full diff: https://github.com/llvm/llvm-project/pull/168107.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+29-4)
  • (modified) llvm/test/CodeGen/AMDGPU/vgpr-lowering-gfx1250.mir (+4-4)
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index 37bf2d2463ae2..6197d39b436db 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -3439,17 +3439,42 @@ getVGPRLoweringOperandTables(const MCInstrDesc &Desc) {
       AMDGPU::OpName::src0Y, AMDGPU::OpName::vsrc1Y, AMDGPU::OpName::vsrc2Y,
       AMDGPU::OpName::vdstY};
 
+  // VOP2 MADMK instructions use src0, imm, src1 scheme.
+  static const AMDGPU::OpName VOP2MADMKOps[4] = {
+      AMDGPU::OpName::src0, AMDGPU::OpName::imm, AMDGPU::OpName::src1,
+      AMDGPU::OpName::vdst};
+
   unsigned TSFlags = Desc.TSFlags;
 
   if (TSFlags &
       (SIInstrFlags::VOP1 | SIInstrFlags::VOP2 | SIInstrFlags::VOP3 |
        SIInstrFlags::VOP3P | SIInstrFlags::VOPC | SIInstrFlags::DPP)) {
+    switch (Desc.getOpcode()) {
     // LD_SCALE operands ignore MSB.
-    if (Desc.getOpcode() == AMDGPU::V_WMMA_LD_SCALE_PAIRED_B32 ||
-        Desc.getOpcode() == AMDGPU::V_WMMA_LD_SCALE_PAIRED_B32_gfx1250 ||
-        Desc.getOpcode() == AMDGPU::V_WMMA_LD_SCALE16_PAIRED_B64 ||
-        Desc.getOpcode() == AMDGPU::V_WMMA_LD_SCALE16_PAIRED_B64_gfx1250)
+    case AMDGPU::V_WMMA_LD_SCALE_PAIRED_B32:
+    case AMDGPU::V_WMMA_LD_SCALE_PAIRED_B32_gfx1250:
+    case AMDGPU::V_WMMA_LD_SCALE16_PAIRED_B64:
+    case AMDGPU::V_WMMA_LD_SCALE16_PAIRED_B64_gfx1250:
       return {};
+    case AMDGPU::V_FMAMK_F16_fake16_gfx11:
+    case AMDGPU::V_FMAMK_F16_fake16_gfx12:
+    case AMDGPU::V_FMAMK_F16_gfx10:
+    case AMDGPU::V_FMAMK_F16_t16_gfx11:
+    case AMDGPU::V_FMAMK_F16_t16_gfx12:
+    case AMDGPU::V_FMAMK_F32_gfx10:
+    case AMDGPU::V_FMAMK_F32_gfx11:
+    case AMDGPU::V_FMAMK_F32_gfx12:
+    case AMDGPU::V_FMAMK_F32_gfx940:
+    case AMDGPU::V_FMAMK_F64_gfx1250:
+    case AMDGPU::V_FMAMK_F16:
+    case AMDGPU::V_FMAMK_F16_t16:
+    case AMDGPU::V_FMAMK_F16_fake16:
+    case AMDGPU::V_FMAMK_F32:
+    case AMDGPU::V_FMAMK_F64:
+      return {VOP2MADMKOps, nullptr};
+    default:
+      break;
+    }
     return {VOPOps, nullptr};
   }
 
diff --git a/llvm/test/CodeGen/AMDGPU/vgpr-lowering-gfx1250.mir b/llvm/test/CodeGen/AMDGPU/vgpr-lowering-gfx1250.mir
index 7e1c28f8e7bbb..1d4deac42d528 100644
--- a/llvm/test/CodeGen/AMDGPU/vgpr-lowering-gfx1250.mir
+++ b/llvm/test/CodeGen/AMDGPU/vgpr-lowering-gfx1250.mir
@@ -332,19 +332,19 @@ body:             |
     ; GCN-NEXT: v_fmaak_f32 v0 /*v256*/, v1, v2 /*v258*/, 0x1
     $vgpr256 = V_FMAAK_F32 undef $vgpr1, undef $vgpr258, 1, implicit $exec, implicit $mode
 
-    ; GCN-NEXT: s_set_vgpr_msb 0x4445
+    ; GCN-NEXT: s_set_vgpr_msb 0x4451
     ; GCN-NEXT: v_fmamk_f32 v0 /*v256*/, v1 /*v257*/, 0x1, v2 /*v258*/
     $vgpr256 = V_FMAMK_F32 undef $vgpr257, 1, undef $vgpr258, implicit $exec, implicit $mode
 
-    ; GCN-NEXT: s_set_vgpr_msb 0x4505
+    ; GCN-NEXT: s_set_vgpr_msb 0x5111
     ; GCN-NEXT: v_fmamk_f32 v0, v1 /*v257*/, 0x1, v2 /*v258*/
     $vgpr0 = V_FMAMK_F32 undef $vgpr257, 1, undef $vgpr258, implicit $exec, implicit $mode
 
-    ; GCN-NEXT: s_set_vgpr_msb 0x541
+    ; GCN-NEXT: s_set_vgpr_msb 0x1141
     ; GCN-NEXT: v_fmamk_f32 v0 /*v256*/, v1 /*v257*/, 0x1, v2
     $vgpr256 = V_FMAMK_F32 undef $vgpr257, 1, undef $vgpr2, implicit $exec, implicit $mode
 
-    ; GCN-NEXT: s_set_vgpr_msb 0x4144
+    ; GCN-NEXT: s_set_vgpr_msb 0x4150
     ; GCN-NEXT: v_fmamk_f32 v0 /*v256*/, v1, 0x1, v2 /*v258*/
     $vgpr256 = V_FMAMK_F32 undef $vgpr1, 1, undef $vgpr258, implicit $exec, implicit $mode
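
The updated `s_set_vgpr_msb` values can be reproduced by hand. The sketch below assumes an encoding inferred from the expected values in this test, not taken from the ISA documentation: the low byte holds the new mode with two bits per operand-table slot (slot 0 in bits 1:0 through slot 3 in bits 7:6), and the high byte carries the previous mode. The helpers `msbOf`, `packMode`, and `setVgprMsbImm` are hypothetical names for illustration and are not part of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: MSB field of a VGPR index (v257 -> 1, v2 -> 0).
unsigned msbOf(unsigned VGPR) { return (VGPR >> 8) & 0x3; }

// Pack the MSBs of four operand-table slots, two bits per slot.
// A negative entry marks a slot that holds no VGPR (e.g. the immediate).
uint8_t packMode(const std::array<int, 4> &Slots) {
  unsigned Mode = 0;
  for (unsigned I = 0; I < 4; ++I)
    if (Slots[I] >= 0)
      Mode |= msbOf(static_cast<unsigned>(Slots[I])) << (2 * I);
  return static_cast<uint8_t>(Mode);
}

// Assumed layout of the immediate: previous mode in the high byte,
// new mode in the low byte.
uint16_t setVgprMsbImm(uint8_t Prev, uint8_t New) {
  return static_cast<uint16_t>((Prev << 8) | New);
}

// First changed case: $vgpr256 = V_FMAMK_F32 $vgpr257, 1, $vgpr258,
// following an instruction whose mode was 0x44.
void reproduceTestValues() {
  // Generic table {src0, src1, src2, vdst}: v258 lands in slot 1.
  assert(setVgprMsbImm(0x44, packMode({257, 258, -1, 256})) == 0x4445);
  // MADMK table {src0, imm, src1, vdst}: v258 lands in slot 2.
  assert(setVgprMsbImm(0x44, packMode({257, -1, 258, 256})) == 0x4451);
}
```

Under these assumptions, the only difference between the old and new values is which two-bit field receives the MSB of the last source register, which matches the four changed CHECK lines.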
 

@shiltian shiltian force-pushed the users/shiltian/fix-wrong-msb-encoding-for-fmamk-insts branch from 2f210fc to 4b01113 Compare November 14, 2025 20:00
@shiltian shiltian force-pushed the users/shiltian/fix-wrong-msb-encoding-for-fmamk-insts branch from c395b9e to e645aa1 Compare November 14, 2025 22:15
Collaborator

@rampitec rampitec left a comment

LGTM

@shiltian shiltian enabled auto-merge (squash) November 14, 2025 22:48
@shiltian shiltian merged commit 72a6ae6 into main Nov 14, 2025
9 of 10 checks passed
@shiltian shiltian deleted the users/shiltian/fix-wrong-msb-encoding-for-fmamk-insts branch November 14, 2025 22:50
@rampitec
Collaborator

rampitec commented Nov 15, 2025

BTW, the test would be much easier to read if it included the rest of the commentary about the MSB values. That would be a nice follow-up, like the ASM-SAME comments.
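
To picture the kind of commentary suggested here, a small decoder can split an `s_set_vgpr_msb` immediate into its per-slot MSB values. This is only a sketch: the field layout (two bits per slot in the low byte, previous mode in the high byte) is inferred from the expected values in the test, the slot labels are placeholders, and `describeMsb` is a hypothetical helper, not the comment format LLVM emits.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: render an s_set_vgpr_msb immediate as readable text.
// Assumes the low byte is the new mode (two bits per operand slot) and the
// high byte is the previous mode, printed here in decimal.
std::string describeMsb(uint16_t Imm) {
  unsigned Mode = Imm & 0xFF;
  unsigned Prev = (Imm >> 8) & 0xFF;
  std::string Out;
  for (unsigned I = 0; I < 4; ++I) {
    if (I != 0)
      Out += ", ";
    Out += "slot" + std::to_string(I) + "=" +
           std::to_string((Mode >> (2 * I)) & 0x3);
  }
  Out += ", prev=" + std::to_string(Prev);
  return Out;
}

// Example: the first updated CHECK line in the test, 0x4451.
void describeMsbExample() {
  assert(describeMsb(0x4451) == "slot0=1, slot1=0, slot2=1, slot3=1, prev=68");
}
```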

@shiltian
Contributor Author

> BTW, the test would be much easier to read if it included the rest of the commentary about the MSB values. That would be a nice follow-up, like the ASM-SAME comments.

I'll update them when I fix the register class for those F16 instructions.
