
Commit 410764c

[RISCV] Commute True in foldVMergeToMask (#156499)
In order to fold a vmerge into a pseudo, the pseudo's passthru needs to be the same as vmerge's false operand.

If they don't match we can try and commute the instruction if possible, e.g. here we can commute v9 and v8 to fold the vmerge:

    vsetvli zero, a0, e32, m1, ta, ma
    vfmadd.vv v9, v10, v8
    vsetvli zero, zero, e32, m1, tu, ma
    vmerge.vvm v8, v8, v9, v0

    vsetvli zero, a0, e32, m1, tu, mu
    vfmacc.vv v8, v9, v10, v0.t

Previously this wasn't possible because we did the peephole in SelectionDAG, but now that it's been migrated to MachineInstr in #144076 we can reuse the commuting infrastructure in TargetInstrInfo.

This fixes the extra vmv.v.v in the "mul" example here: #123069 (comment)

It should also allow us to remove the isel patterns described in #141885 later.
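The equivalence of the two sequences in the commit message can be checked elementwise. The following is a toy scalar model, not LLVM code: `VReg`, `Mask`, `VLMAX`, and every function name are hypothetical, and it assumes the RVV semantics `vfmadd.vv vd, vs1, vs2` = `vs1 * vd + vs2` and `vfmacc.vv vd, vs1, vs2` = `vs1 * vs2 + vd`, with tail-undisturbed and mask-undisturbed elements left unchanged.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical toy model: a vector register holds VLMAX floats.
constexpr std::size_t VLMAX = 4;
using VReg = std::array<float, VLMAX>;
using Mask = std::array<bool, VLMAX>;

// vfmadd.vv vd, vs1, vs2 (unmasked): vd[i] = vs1[i] * vd[i] + vs2[i] for i < vl.
void vfmaddVV(VReg &vd, const VReg &vs1, const VReg &vs2, std::size_t vl) {
  for (std::size_t i = 0; i < vl; ++i)
    vd[i] = std::fma(vs1[i], vd[i], vs2[i]);
}

// vmerge.vvm vd, vs2, vs1, v0 (tail undisturbed):
// vd[i] = v0[i] ? vs1[i] : vs2[i] for i < vl; elements past vl keep their value.
void vmergeVVM(VReg &vd, const VReg &vs2, const VReg &vs1, const Mask &v0,
               std::size_t vl) {
  for (std::size_t i = 0; i < vl; ++i)
    vd[i] = v0[i] ? vs1[i] : vs2[i];
}

// vfmacc.vv vd, vs1, vs2, v0.t (tu, mu):
// vd[i] = vs1[i] * vs2[i] + vd[i] where i < vl and v0[i]; otherwise unchanged.
void vfmaccVVMasked(VReg &vd, const VReg &vs1, const VReg &vs2, const Mask &v0,
                    std::size_t vl) {
  for (std::size_t i = 0; i < vl; ++i)
    if (v0[i])
      vd[i] = std::fma(vs1[i], vs2[i], vd[i]);
}

// Before the fold: vfmadd.vv v9, v10, v8 then vmerge.vvm v8, v8, v9, v0.
VReg runUnfolded(VReg v8, VReg v9, const VReg &v10, const Mask &v0,
                 std::size_t vl) {
  vfmaddVV(v9, v10, v8, vl);
  vmergeVVM(v8, v8, v9, v0, vl);
  return v8;
}

// After the fold: a single vfmacc.vv v8, v9, v10, v0.t.
VReg runFolded(VReg v8, const VReg &v9, const VReg &v10, const Mask &v0,
               std::size_t vl) {
  vfmaccVVMasked(v8, v9, v10, v0, vl);
  return v8;
}
```

Under these assumptions, `runUnfolded` and `runFolded` return the same `v8` for any inputs: active elements get `v9 * v10 + v8`, while masked-off and tail elements keep the original `v8`.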
1 parent 4ec8908 commit 410764c


4 files changed, +208 -276 lines changed

llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp

Lines changed: 22 additions & 2 deletions
@@ -745,12 +745,24 @@ bool RISCVVectorPeephole::foldVMergeToMask(MachineInstr &MI) const {
   if (PassthruReg && !isKnownSameDefs(PassthruReg, FalseReg))
     return false;
 
+  std::optional<std::pair<unsigned, unsigned>> NeedsCommute;
+
   // If True has a passthru operand then it needs to be the same as vmerge's
   // False, since False will be used for the result's passthru operand.
   Register TruePassthru = True.getOperand(True.getNumExplicitDefs()).getReg();
   if (RISCVII::isFirstDefTiedToFirstUse(True.getDesc()) && TruePassthru &&
-      !isKnownSameDefs(TruePassthru, FalseReg))
-    return false;
+      !isKnownSameDefs(TruePassthru, FalseReg)) {
+    // If True's passthru != False, check if it uses False in another operand
+    // and try to commute it.
+    int OtherIdx = True.findRegisterUseOperandIdx(FalseReg, TRI);
+    if (OtherIdx == -1)
+      return false;
+    unsigned OpIdx1 = OtherIdx;
+    unsigned OpIdx2 = True.getNumExplicitDefs();
+    if (!TII->findCommutedOpIndices(True, OpIdx1, OpIdx2))
+      return false;
+    NeedsCommute = {OpIdx1, OpIdx2};
+  }
 
   // Make sure it doesn't raise any observable fp exceptions, since changing the
   // active elements will affect how fflags is set.
@@ -796,6 +808,14 @@ bool RISCVVectorPeephole::foldVMergeToMask(MachineInstr &MI) const {
   if (!ensureDominates(MaskOp, True))
     return false;
 
+  if (NeedsCommute) {
+    auto [OpIdx1, OpIdx2] = *NeedsCommute;
+    [[maybe_unused]] bool Commuted =
+        TII->commuteInstruction(True, /*NewMI=*/false, OpIdx1, OpIdx2);
+    assert(Commuted && "Failed to commute True?");
+    Info = RISCV::lookupMaskedIntrinsicByUnmasked(True.getOpcode());
+  }
+
   True.setDesc(TII->get(Info->MaskedPseudo));
 
   // Insert the mask operand.
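
The change above records the operand indices first and only commutes after the remaining legality checks have passed, which appears intended to leave True unmodified whenever the fold bails out. A minimal standalone sketch of that check-then-commit pattern follows. It is a toy model, not the LLVM API: `ToyInst`, `canCommute`, `commute`, `foldMergeIntoTrue`, and the `LaterChecksPass` flag are all hypothetical, and the only commute it models is swapping the tied operand with the addend of an FMA, which flips `vfmadd` and `vfmacc`.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy instruction: Ops[0] is the passthru (tied to the result),
// Ops[1] is the other multiplicand, Ops[2] is the addend (vfmadd) or the
// second multiplicand (vfmacc). Values are register numbers.
struct ToyInst {
  std::string Opcode;
  std::vector<int> Ops;
  bool Masked = false;
};

// Toy commute rule: only operands 0 and 2 of an FMA can be swapped.
bool canCommute(const ToyInst &I, unsigned A, unsigned B) {
  bool IsFMA = I.Opcode == "vfmadd" || I.Opcode == "vfmacc";
  return IsFMA && std::min(A, B) == 0 && std::max(A, B) == 2;
}

// Swapping the tied operand with operand 2 turns vfmadd into vfmacc and back.
void commute(ToyInst &I, unsigned A, unsigned B) {
  std::swap(I.Ops[A], I.Ops[B]);
  I.Opcode = (I.Opcode == "vfmadd") ? "vfmacc" : "vfmadd";
}

// Try to fold a merge whose false operand is FalseReg into True.
// LaterChecksPass stands in for the checks that run after the commute query.
bool foldMergeIntoTrue(ToyInst &True, int FalseReg, bool LaterChecksPass) {
  std::optional<std::pair<unsigned, unsigned>> NeedsCommute;

  if (True.Ops[0] != FalseReg) {
    // Passthru mismatch: look for FalseReg in another operand.
    auto It = std::find(True.Ops.begin() + 1, True.Ops.end(), FalseReg);
    if (It == True.Ops.end())
      return false;
    unsigned OtherIdx = static_cast<unsigned>(It - True.Ops.begin());
    if (!canCommute(True, OtherIdx, 0))
      return false;
    NeedsCommute.emplace(OtherIdx, 0u); // Record only; do not mutate yet.
  }

  if (!LaterChecksPass)
    return false; // True is still untouched here.

  if (NeedsCommute)
    commute(True, NeedsCommute->first, NeedsCommute->second);
  True.Masked = true; // Stands in for switching to the masked pseudo.
  return true;
}
```

For example, `ToyInst{"vfmadd", {9, 10, 8}}` folded against `FalseReg = 8` becomes a masked `vfmacc` with operands `{8, 10, 9}`, while a failed later check leaves the instruction exactly as it was.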

llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll

Lines changed: 12 additions & 0 deletions
@@ -1215,3 +1215,15 @@ define <vscale x 2 x i32> @unfoldable_mismatched_sew(<vscale x 2 x i32> %passthr
   )
   ret <vscale x 2 x i32> %b
 }
+
+define <vscale x 2 x float> @commute_vfmadd(<vscale x 2 x float> %passthru, <vscale x 2 x float> %x, <vscale x 2 x float> %y, <vscale x 2 x i1> %mask, i32 zeroext %evl) {
+; CHECK-LABEL: commute_vfmadd:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, mu
+; CHECK-NEXT:    vfmacc.vv v8, v9, v10, v0.t
+; CHECK-NEXT:    ret
+  %fmul = fmul contract <vscale x 2 x float> %x, %y
+  %fadd = fadd contract <vscale x 2 x float> %fmul, %passthru
+  %merge = call <vscale x 2 x float> @llvm.vp.merge(<vscale x 2 x i1> %mask, <vscale x 2 x float> %fadd, <vscale x 2 x float> %passthru, i32 %evl)
+  ret <vscale x 2 x float> %merge
+}
