Merged
Changes from 3 commits
25 changes: 21 additions & 4 deletions llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
@@ -97,8 +97,8 @@ class AMDGPURewriteAGPRCopyMFMAImpl {

/// Compute the register class constraints based on the uses of \p Reg,
/// excluding MFMA uses from which can be rewritten to change the register
- /// class constraint. This should be nearly identical to
- /// MachineRegisterInfo::recomputeRegClass.
+ /// class constraint. MFMA scale operands need to be constraint checked.
+ /// This should be nearly identical to MachineRegisterInfo::recomputeRegClass.

/// \p RewriteCandidates will collect the set of MFMA instructions that need
/// to have the opcode mutated to perform the replacement.
@@ -152,9 +152,26 @@ bool AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable(

// We can swap the classes of dst + src2 as a pair to AGPR, so ignore the
// effects of rewrite candidates. It just so happens that we can use
- // either AGPR or VGPR in src0/src1, so don't bother checking the
- // constraint effects of the individual operands.
+ // either AGPR or VGPR in src0/src1. We still need to check constraint
+ // effects for scale variant, which does not allow AGPR.
if (isRewriteCandidate(*MI)) {

+ int AGPROp = AMDGPU::getMFMASrcCVDstAGPROp(MI->getOpcode());
+ MachineInstrBuilder TmpMIB =
Contributor
Definitely should not be creating temporary instructions

+ BuildMI(*MI->getParent(), MI->getIterator(), MI->getDebugLoc(),
+ TII.get(AGPROp));
+ for (const MachineOperand &TmpMO : MI->operands())
+ TmpMIB.add(TmpMO);
+ MachineInstr *TmpMI = TmpMIB.getInstr();
+ unsigned OpNo = &MO - &MI->getOperand(0);
+ const TargetRegisterClass *EquivalentAGPRRegClass =
+ TRI.getEquivalentAGPRClass(MRI.getRegClass(Reg));
+ const TargetRegisterClass *Allowed = TmpMI->getRegClassConstraintEffect(
Contributor
You want TargetInstrInfo::getRegClass to get the static constraint of the known operand (alternatively, you could check that the use is one of the known src0/src1 operands and not the _scale name)
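
To make the two alternatives in this comment concrete, here is a minimal standalone sketch. It is a toy model, not LLVM code: `OpName`, `Bank`, `OperandDesc`, `ScaledMFMAOperands`, and both helper functions are hypothetical names invented for illustration, and the operand layout is an assumption based only on the diff's statement that src0/src1 accept AGPR or VGPR while the scale operands do not allow AGPR. The point is that both checks read fixed, per-operand information and never build a temporary instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model only: these types are hypothetical and are not LLVM classes.
enum class OpName { src0, src1, scale_src0, scale_src1 };
enum class Bank { VGPROnly, VGPROrAGPR };

struct OperandDesc {
  OpName Name;
  Bank Allowed; // static constraint, as an instruction description would record it
};

// Assumed layout of the source operands of a scaled MFMA (illustrative):
// src0/src1 take either bank, the scale operands are VGPR-only.
static const std::array<OperandDesc, 4> ScaledMFMAOperands = {{
    {OpName::src0, Bank::VGPROrAGPR},
    {OpName::src1, Bank::VGPROrAGPR},
    {OpName::scale_src0, Bank::VGPROnly},
    {OpName::scale_src1, Bank::VGPROnly},
}};

// Alternative 1: look up the static constraint of the operand at OpNo.
inline bool staticConstraintAllowsAGPR(std::size_t OpNo) {
  return ScaledMFMAOperands[OpNo].Allowed == Bank::VGPROrAGPR;
}

// Alternative 2: accept the use only if it is one of the known
// src0/src1 operands, which rejects the scale operands by name.
inline bool isKnownSrc0OrSrc1(std::size_t OpNo) {
  const OpName Name = ScaledMFMAOperands[OpNo].Name;
  return Name == OpName::src0 || Name == OpName::src1;
}
```

In this model a use at index 0 or 1 passes either check, while a use at index 2 or 3 (a scale operand) fails both, which is the case the new test in this PR exercises.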

+ OpNo, EquivalentAGPRRegClass, &TII, &TRI);
+ TmpMI->eraseFromParent();
+ if (!Allowed || Allowed != EquivalentAGPRRegClass)
+ return false;

const MachineOperand *VDst =
TII.getNamedOperand(*MI, AMDGPU::OpName::vdst);
const MachineOperand *Src2 =
10 changes: 6 additions & 4 deletions llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-scale-to-agpr.mir
@@ -1,7 +1,9 @@
- # RUN: not --crash llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx950 -run-pass=greedy,amdgpu-rewrite-agpr-copy-mfma -verify-machineinstrs -o - %s 2>&1 | FileCheck %s
- # CHECK: Illegal virtual register for instruction
- # CHECK: Expected a VGPR_32 register, but got a AGPR_32 register

+ # RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx950 -run-pass=greedy,amdgpu-rewrite-agpr-copy-mfma -verify-machineinstrs -o - %s 2>&1 | FileCheck %s
+ # CHECK: bb.1:
+ # CHECK: dead %{{[0-9]+}}:vreg_128_align2 = V_MFMA_SCALE_F32_16X16X128_F8F6F4_f4_f4_vgprcd_e64 %{{[0-9]+}}, %{{[0-9]+}}, %{{[0-9]+}}, 4, 4, %{{[0-9]+}}, %[[REG:[0-9]+]], 4, 0, implicit $mode, implicit $exec
+ # CHECK: %{{[0-9]+}}:agpr_32 = IMPLICIT_DEF
+ # CHECK: %[[REG]]:vgpr_32 = COPY %{{[0-9]+}}

# Test for issue in amdgpu-rewrite-agpr-copy-mfma, which reassigns scale operand
# in vgpr_32 register to agpr_32, not permitted by instruction format.
---