[RISCV] Commute True in foldVMergeToMask (#156499)
In order to fold a vmerge into a pseudo, the pseudo's passthru needs to
be the same as vmerge's false operand.
If they don't match, we can try to commute the instruction where possible,
e.g. here we can commute v9 and v8 to fold the vmerge:

    vsetvli zero, a0, e32, m1, ta, ma
    vfmadd.vv v9, v10, v8
    vsetvli zero, zero, e32, m1, tu, ma
    vmerge.vvm v8, v8, v9, v0

    ->

    vsetvli zero, a0, e32, m1, tu, mu
    vfmacc.vv v8, v9, v10, v0.t

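To make the equivalence of the two sequences above concrete, here is a minimal element-wise model in C++. It is only a sketch under stated assumptions: every element is treated as active (VL and tail elements are ignored), and the helper names `vfmadd`, `vmerge`, `vfmaccMasked`, `unfoldedSequence` and `foldedSequence` are hypothetical, not LLVM or intrinsic APIs. The register roles follow the example: v8 holds the passthru, v9 holds x, v10 holds y, and v0 is the mask.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mask = std::vector<bool>;

// Models unmasked "vfmadd.vv vd, vs1, vs2": vd[i] = vs1[i] * vd[i] + vs2[i].
Vec vfmadd(Vec vd, const Vec &vs1, const Vec &vs2) {
  assert(vd.size() == vs1.size() && vd.size() == vs2.size());
  for (std::size_t i = 0; i < vd.size(); ++i)
    vd[i] = std::fma(vs1[i], vd[i], vs2[i]);
  return vd;
}

// Models "vmerge.vvm vd, vs2, vs1, v0": vd[i] = v0[i] ? vs1[i] : vs2[i].
Vec vmerge(const Vec &vs2, const Vec &vs1, const Mask &v0) {
  assert(vs2.size() == vs1.size() && vs2.size() == v0.size());
  Vec vd(vs2.size());
  for (std::size_t i = 0; i < vd.size(); ++i)
    vd[i] = v0[i] ? vs1[i] : vs2[i];
  return vd;
}

// Models "vfmacc.vv vd, vs1, vs2, v0.t" with mask-undisturbed policy:
// active elements get vs1[i] * vs2[i] + vd[i], inactive elements keep vd[i].
Vec vfmaccMasked(Vec vd, const Vec &vs1, const Vec &vs2, const Mask &v0) {
  assert(vd.size() == vs1.size() && vd.size() == vs2.size() &&
         vd.size() == v0.size());
  for (std::size_t i = 0; i < vd.size(); ++i)
    if (v0[i])
      vd[i] = std::fma(vs1[i], vs2[i], vd[i]);
  return vd;
}

// Before the fold: vfmadd.vv v9, v10, v8 followed by vmerge.vvm v8, v8, v9, v0.
Vec unfoldedSequence(const Vec &passthru, const Vec &x, const Vec &y,
                     const Mask &mask) {
  Vec v9 = vfmadd(x, y, passthru);
  return vmerge(passthru, v9, mask);
}

// After the fold: a single vfmacc.vv v8, v9, v10, v0.t.
Vec foldedSequence(const Vec &passthru, const Vec &x, const Vec &y,
                   const Mask &mask) {
  return vfmaccMasked(passthru, x, y, mask);
}
```

Calling `unfoldedSequence` and `foldedSequence` with the same inputs should give identical vectors, because the commuted instruction computes the same fused product and sum while its destination now doubles as the passthru.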
Previously this wasn't possible because we did the peephole in
SelectionDAG, but now that it's been migrated to MachineInstr in #144076
we can reuse the commuting infrastructure in TargetInstrInfo.
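The decision described above (fold directly if the passthru matches the vmerge's false operand, otherwise try a commuted form first) can be sketched as a toy model in C++. This is not the LLVM implementation: `MulAdd`, `commuteTiedWithAddend`, `Folded` and `tryFoldSketch` are hypothetical names introduced for illustration, operands are plain strings instead of virtual registers, and the other legality checks a real peephole would need (VL, SEW, policy, existing masks) are deliberately omitted.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Toy model of a multiply-add pseudo whose destination is tied to one source.
//   VFMADD: result = vs1 * tied + vs2
//   VFMACC: result = vs1 * vs2  + tied
enum class Op { VFMADD, VFMACC };

struct MulAdd {
  Op op;
  std::string tied; // tied destination operand, which also acts as the passthru
  std::string vs1;
  std::string vs2;
};

// Swapping the tied operand with vs2 and flipping the opcode keeps the value:
//   VFMADD(tied=a, vs1=b, vs2=c) = b*a + c = VFMACC(tied=c, vs1=b, vs2=a)
MulAdd commuteTiedWithAddend(MulAdd mi) {
  std::swap(mi.tied, mi.vs2);
  mi.op = (mi.op == Op::VFMADD) ? Op::VFMACC : Op::VFMADD;
  return mi;
}

// Result of a successful fold: the (possibly commuted) instruction plus the
// mask taken over from the vmerge.
struct Folded {
  MulAdd inst;
  std::string mask;
};

// Hypothetical sketch: fold "vmerge falseVal, trueDef, mask" into trueDef.
// Only legal here when trueDef's passthru equals the vmerge's false operand,
// either as written or after commuting.
std::optional<Folded> tryFoldSketch(MulAdd trueDef, const std::string &falseVal,
                                    const std::string &mask) {
  assert(!falseVal.empty() && "vmerge needs a false operand");
  if (trueDef.tied != falseVal) {
    MulAdd commuted = commuteTiedWithAddend(trueDef);
    if (commuted.tied != falseVal)
      return std::nullopt; // no commuted form lines up; leave the vmerge alone
    trueDef = commuted;
  }
  return Folded{trueDef, mask};
}
```

For the example in this message, a `VFMADD` with tied operand `x`, `vs1 = y` and `vs2 = passthru`, merged against false operand `passthru`, would come back as a masked `VFMACC` tied to `passthru`. The operand order of the model's commuted instruction may differ from the assembly shown above, which does not affect the computed value.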
This fixes the extra vmv.v.v in the "mul" example here:
#123069 (comment)
It should also allow us to remove the isel patterns described in #141885
later.
llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
12 additions & 0 deletions
@@ -1215,3 +1215,15 @@ define <vscale x 2 x i32> @unfoldable_mismatched_sew(<vscale x 2 x i32> %passthr
   )
   ret <vscale x 2 x i32> %b
 }
+
+define <vscale x 2 x float> @commute_vfmadd(<vscale x 2 x float> %passthru, <vscale x 2 x float> %x, <vscale x 2 x float> %y, <vscale x 2 x i1> %mask, i32 zeroext %evl) {
+; CHECK-LABEL: commute_vfmadd:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, mu
+; CHECK-NEXT:    vfmacc.vv v8, v9, v10, v0.t
+; CHECK-NEXT:    ret
+  %fmul = fmul contract <vscale x 2 x float> %x, %y
+  %fadd = fadd contract <vscale x 2 x float> %fmul, %passthru
+  %merge = call <vscale x 2 x float> @llvm.vp.merge(<vscale x 2 x i1> %mask, <vscale x 2 x float> %fadd, <vscale x 2 x float> %passthru, i32 %evl)