@hjagasiaAMD

In the MFMA rewrite pass, prevent assignment of the AGPR_32 register class to scale operands, which the instruction format does not permit.

@llvmbot

llvmbot commented Nov 20, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: None (hjagasiaAMD)

Changes

In MFMA rewrite pass, prevent AGPR_32 reg class assignment for scale operands, not permitted by instruction format.


Full diff: https://github.com/llvm/llvm-project/pull/168964.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp (+4)
  • (modified) llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-scale-to-agpr.mir (+3-3)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
index 89c16dadb4b41..b5e3187289160 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
@@ -302,6 +302,10 @@ bool AMDGPURewriteAGPRCopyMFMAImpl::attemptReassignmentsToAGPR(
     const TargetRegisterClass *EquivalentAGPRRegClass =
         TRI.getEquivalentAGPRClass(MRI.getRegClass(InterferingReg));
 
+    // Do not reassign scale operands
+    if (EquivalentAGPRRegClass == &AMDGPU::AGPR_32RegClass)
+      return false;
+
     MCPhysReg Assignable = AMDGPU::NoRegister;
     if (EquivalentAGPRRegClass->contains(PrefPhysReg) &&
         LRM.checkInterference(ReassignLI, PrefPhysReg) ==
diff --git a/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-scale-to-agpr.mir b/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-scale-to-agpr.mir
index ab56c9982753f..12be806960b67 100644
--- a/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-scale-to-agpr.mir
+++ b/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-scale-to-agpr.mir
@@ -1,6 +1,6 @@
-# RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx950 -run-pass=greedy,amdgpu-rewrite-agpr-copy-mfma -verify-machineinstrs -o - %s 2>&1 | FileCheck %s
-# CHECK: Illegal virtual register for instruction
-# CHECK: Expected a VGPR_32 register, but got a AGPR_32 register
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx950 -run-pass=greedy,amdgpu-rewrite-agpr-copy-mfma -verify-machineinstrs -o - %s 2>&1 | FileCheck %s
+# CHECK-NOT: Illegal virtual register for instruction
+# CHECK-NOT: Expected a VGPR_32 register, but got a AGPR_32 register
  
 # Test for issue in amdgpu-rewrite-agpr-copy-mfma, which reassigns scale operand
 # in vgpr_32 register to agpr_32, not permitted by instruction format.

@ronlieb ronlieb requested review from arsenm and ronlieb November 20, 2025 22:30
# CHECK: Illegal virtual register for instruction
# CHECK: Expected a VGPR_32 register, but got a AGPR_32 register
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx950 -run-pass=greedy,amdgpu-rewrite-agpr-copy-mfma -verify-machineinstrs -o - %s 2>&1 | FileCheck %s
# CHECK-NOT: Illegal virtual register for instruction

-NOT checks are close to useless, especially for checking error messages. Generate checks for the actual output instead.
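To illustrate the reviewer's point, here is a toy C++ model of substring checks. It is not FileCheck and the function names are made up for this sketch. It shows that a negative check passes on any output that lacks the exact string, including a reworded diagnostic, whereas a positive check on the expected output fails as soon as the output is wrong.

```cpp
#include <string>

// Toy stand-in for a positive check: the pattern must appear in the output.
// This is a hypothetical helper, not part of FileCheck.
bool toyCheck(const std::string &Output, const std::string &Pattern) {
  return Output.find(Pattern) != std::string::npos;
}

// Toy stand-in for a negative check: the pattern must NOT appear.
bool toyCheckNot(const std::string &Output, const std::string &Pattern) {
  return !toyCheck(Output, Pattern);
}

// The diagnostic text quoted in the test under review.
const std::string OldDiag = "Illegal virtual register for instruction";

// A hypothetical reworded diagnostic for the same failure.
const std::string RewordedOutput = "Bad virtual register class for operand";

// A hypothetical fragment of the output the fixed pass is expected to produce.
const std::string ExpectedFragment = "vgpr_32";
```

In this model `toyCheckNot(RewordedOutput, OldDiag)` succeeds even though the output still reports a failure, while `toyCheck(RewordedOutput, ExpectedFragment)` does not. That asymmetry is why checks on the actual output are the stronger test.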

TRI.getEquivalentAGPRClass(MRI.getRegClass(InterferingReg));

// Do not reassign scale operands
if (EquivalentAGPRRegClass == &AMDGPU::AGPR_32RegClass)

This doesn't seem like the right condition. It just happens that the scale operands are the only 32-bit input case. This should check the operand constraint directly instead of checking a hardcoded class equality.

@github-actions

github-actions bot commented Nov 20, 2025

🐧 Linux x64 Test Results

  • 186858 tests passed
  • 4906 tests skipped

✅ The build succeeded and all tests passed.

@hjagasiaAMD hjagasiaAMD requested a review from arsenm December 1, 2025 17:36

@ronlieb ronlieb left a comment

LGTM, but we need @arsenm to give final approval.

if (isRewriteCandidate(*MI)) {

int AGPROp = AMDGPU::getMFMASrcCVDstAGPROp(MI->getOpcode());
MachineInstrBuilder TmpMIB =

Definitely should not be creating temporary instructions

unsigned OpNo = &MO - &MI->getOperand(0);
const TargetRegisterClass *EquivalentAGPRRegClass =
TRI.getEquivalentAGPRClass(MRI.getRegClass(Reg));
const TargetRegisterClass *Allowed = TmpMI->getRegClassConstraintEffect(

You want TargetInstrInfo::getRegClass to get the static constraint of the known operand. Alternatively, you could check that the use is one of the known src0/src1 operands and not one of the _scale operands.
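The difference between the two conditions discussed in this review can be sketched with a toy model in plain C++. Nothing below is LLVM API: the structs, operand widths, and both example instructions are assumptions made up for illustration. The first function asks each operand slot whether it accepts an AGPR, in the spirit of querying the static operand constraint. The second mimics the first revision by rejecting anything that is 32 bits wide.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy stand-ins only: none of these are LLVM types or real operand tables.
struct ToyOperandConstraint {
  unsigned Bits;   // register width the operand slot expects (hypothetical)
  bool AllowsAGPR; // whether the instruction format accepts an AGPR here
};

struct ToyInstrDesc {
  std::array<ToyOperandConstraint, 4> Operands;
};

// Hypothetical scaled MFMA: wide src0/src1 accept AGPRs, scale operands do not.
const ToyInstrDesc ToyScaledMFMA = {{{
    {256, true}, // src0
    {256, true}, // src1
    {32, false}, // scale_src0
    {32, false}, // scale_src1
}}};

// Hypothetical instruction whose 32-bit operands do accept AGPRs.
const ToyInstrDesc ToyOther = {{{
    {32, true},
    {32, true},
    {32, true},
    {32, true},
}}};

// Decide per operand slot, based on what the instruction format allows.
bool allowsAGPRByConstraint(const ToyInstrDesc &Desc, std::size_t OpNo) {
  assert(OpNo < Desc.Operands.size() && "operand index out of range");
  return Desc.Operands[OpNo].AllowsAGPR;
}

// Width heuristic in the style of the first revision: reject every 32-bit case.
bool allowsAGPRByWidth(const ToyInstrDesc &Desc, std::size_t OpNo) {
  assert(OpNo < Desc.Operands.size() && "operand index out of range");
  return Desc.Operands[OpNo].Bits != 32;
}
```

Both functions agree on `ToyScaledMFMA`, which matches the reviewer's remark that the width test only works because scale operands happen to be the sole 32-bit inputs. They disagree on `ToyOther`, where the width heuristic would block a rewrite that the operand constraint permits.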

@hjagasiaAMD hjagasiaAMD requested a review from arsenm December 2, 2025 16:27
@github-actions

github-actions bot commented Dec 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@ronlieb ronlieb merged commit 2183846 into llvm:main Dec 2, 2025
10 checks passed
kcloudy0717 pushed a commit to kcloudy0717/llvm-project that referenced this pull request Dec 4, 2025
In MFMA rewrite pass, prevent AGPR_32 reg class assignment for scale
operands, not permitted by instruction format.

---------

Co-authored-by: Matt Arsenault <[email protected]>
honeygoyal pushed a commit to honeygoyal/llvm-project that referenced this pull request Dec 9, 2025
In MFMA rewrite pass, prevent AGPR_32 reg class assignment for scale
operands, not permitted by instruction format.

---------

Co-authored-by: Matt Arsenault <[email protected]>