Commit 44cffbe

mshockwave and topperc authored
[RISCV] Propagate SDNode flags when combining (fmul (fneg X), ...) (#169460)
In #157388, we turned `(fmul (fneg X), Y)` into `(fneg (fmul X, Y))`. However, we forgot to propagate the SDNode flags, specifically the fast-math flags, from the original FMUL to the new one. This hindered some of the subsequent FMA DAG combiner patterns that rely on the contraction flag, and as a consequence we missed some opportunities to generate negated FMA instructions like `fnmadd`.

This patch fixes the issue by propagating the flags.

---------

Co-authored-by: Craig Topper <[email protected]>
1 parent 84df446 commit 44cffbe

2 files changed: +58 -1 lines changed

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Lines changed: 2 additions & 1 deletion
@@ -20822,7 +20822,8 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
     // Undo this and sink the fneg so we match more fmsub/fnmadd patterns.
     if (sd_match(N, m_FMul(m_Value(X), m_OneUse(m_FNeg(m_Value(Y))))))
       return DAG.getNode(ISD::FNEG, DL, VT,
-                         DAG.getNode(ISD::FMUL, DL, VT, X, Y));
+                         DAG.getNode(ISD::FMUL, DL, VT, X, Y, N->getFlags()),
+                         N->getFlags());
 
     // fmul X, (copysign 1.0, Y) -> fsgnjx X, Y
     SDValue N0 = N->getOperand(0);
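The effect of the hunk above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `Flags`, `makeNode`, `sinkFNeg`, and `canFormNegatedFMA` are hypothetical stand-ins invented for illustration, not LLVM's `SDNode`, `SDNodeFlags`, or the real DAG combiner logic, whose conditions are more involved. It shows that rebuilding the FMUL without the original flags drops the contraction flag, so a later pattern that requires it no longer matches.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-ins for illustration only; these are NOT LLVM's classes.
enum class Opcode { Leaf, FNeg, FMul };

struct Flags {
  bool AllowContract = false; // models the "contract" fast-math flag
};

struct Node {
  Opcode Op;
  Flags F;
  std::shared_ptr<Node> LHS, RHS;
};

using NodePtr = std::shared_ptr<Node>;

// Hypothetical helper that builds a node, optionally with flags.
NodePtr makeNode(Opcode Op, NodePtr LHS = nullptr, NodePtr RHS = nullptr,
                 Flags F = {}) {
  return std::make_shared<Node>(Node{Op, F, std::move(LHS), std::move(RHS)});
}

// Rewrites fmul(X, fneg(Y)) into fneg(fmul(X, Y)).
// PropagateFlags = false models the behavior before this commit,
// PropagateFlags = true models the behavior after it.
NodePtr sinkFNeg(const NodePtr &N, bool PropagateFlags) {
  if (N->Op != Opcode::FMul || !N->RHS || N->RHS->Op != Opcode::FNeg)
    return N; // pattern does not apply
  Flags F = PropagateFlags ? N->F : Flags{};
  NodePtr Mul = makeNode(Opcode::FMul, N->LHS, N->RHS->LHS, F);
  return makeNode(Opcode::FNeg, Mul, nullptr, F);
}

// Toy version of a later FMA pattern (an assumption, simplified):
// it only fires when the inner fmul carries the contract flag.
bool canFormNegatedFMA(const NodePtr &N) {
  return N->Op == Opcode::FNeg && N->LHS && N->LHS->Op == Opcode::FMul &&
         N->LHS->F.AllowContract;
}
```

Building an `fmul` with `AllowContract` set whose right operand is an `fneg`, then calling `sinkFNeg` with `PropagateFlags` set to `false`, yields a tree that `canFormNegatedFMA` rejects; with `true`, the same tree is accepted.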
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple=riscv64 -mattr=+d,+m < %s | FileCheck %s
+
+; What the original PR (#169460) tried to solve can only be revealed when a specific
+; set of FMA DAG combiner patterns were skipped due to hitting some recursion limits.
+; And this test is written in a way to hit that limit.
+
+define double @fnmadd_non_trivial(ptr %p0, ptr %p1, ptr %dst, double %mul425) {
+; CHECK-LABEL: fnmadd_non_trivial:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a3, -2047
+; CHECK-NEXT:    slli a3, a3, 51
+; CHECK-NEXT:    fmv.d.x fa5, a3
+; CHECK-NEXT:    lui a3, 2049
+; CHECK-NEXT:    slli a3, a3, 39
+; CHECK-NEXT:    fmv.d.x fa4, a3
+; CHECK-NEXT:    lui a3, 8201
+; CHECK-NEXT:    slli a3, a3, 37
+; CHECK-NEXT:    fmv.d.x fa3, a3
+; CHECK-NEXT:    li a3, 1023
+; CHECK-NEXT:    fmv.d.x fa2, zero
+; CHECK-NEXT:    slli a3, a3, 52
+; CHECK-NEXT:    fsub.d fa1, fa2, fa0
+; CHECK-NEXT:    fmadd.d fa1, fa1, fa3, fa4
+; CHECK-NEXT:    fmadd.d fa4, fa0, fa3, fa4
+; CHECK-NEXT:    fmv.d.x fa3, a3
+; CHECK-NEXT:    lui a3, %hi(.LCPI0_0)
+; CHECK-NEXT:    ld a3, %lo(.LCPI0_0)(a3)
+; CHECK-NEXT:    fmul.d fa5, fa0, fa5
+; CHECK-NEXT:    fnmadd.d fa4, fa4, fa2, fa3
+; CHECK-NEXT:    fnmadd.d fa3, fa1, fa2, fa3
+; CHECK-NEXT:    sd a3, 0(a2)
+; CHECK-NEXT:    fsd fa5, 0(a0)
+; CHECK-NEXT:    fnmadd.d fa5, fa4, fa2, fa0
+; CHECK-NEXT:    fnmadd.d fa0, fa0, fa2, fa3
+; CHECK-NEXT:    fsd fa5, 0(a1)
+; CHECK-NEXT:    ret
+  store double 0x3FEE666666666666, ptr %dst, align 8
+  %mul413 = fmul double %mul425, -3.000000e+00
+  store double %mul413, ptr %p0, align 8
+  %mul428 = fmul contract double %mul425, 4.500000e+00
+  %add429 = fadd nsz contract double %mul428, 3.000000e+00
+  %mul430 = fmul contract double %add429, 0.000000e+00
+  %sub432 = fadd nsz contract double %mul430, 1.000000e+00
+  %mul433 = fmul contract double %sub432, 0.000000e+00
+  %1 = fsub nsz contract double %mul433, %mul425
+  store double %1, ptr %p1, align 8
+  %mul441 = fmul contract double %mul425, 0.000000e+00
+  %add443 = fsub double 0.000000e+00, %mul425
+  %mul446 = fmul contract double %add443, 4.500000e+00
+  %add447 = fadd nsz contract double %mul446, 3.000000e+00
+  %mul448 = fmul contract double %add447, 0.000000e+00
+  %sub450 = fadd nsz contract double %mul448, 1.000000e+00
+  %2 = fsub nsz contract double %sub450, %mul441
+  ret double %2
+}
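When reading the CHECK lines, it can help to see that the `li`/`lui` + `slli` + `fmv.d.x` sequences materialize the floating-point constants used in the IR (-3.0, 3.0, 4.5, and 1.0). The sketch below redoes that arithmetic in C++. The helper names `bitsToDouble`, `liSlli`, and `luiSlli` are hypothetical and exist only for this illustration, and the block assumes the host `double` is IEEE-754 binary64.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterprets a 64-bit pattern as a double, which is what fmv.d.x does.
// Assumes the host double is IEEE-754 binary64.
double bitsToDouble(std::uint64_t Bits) {
  double D;
  std::memcpy(&D, &Bits, sizeof D);
  return D;
}

// Models "li rd, Imm" followed by "slli rd, rd, Shift" on RV64.
std::uint64_t liSlli(std::int64_t Imm, unsigned Shift) {
  // Shift in the unsigned domain to avoid shifting a negative value.
  return static_cast<std::uint64_t>(Imm) << Shift;
}

// Models "lui rd, Imm20" followed by "slli rd, rd, Shift" on RV64.
// lui places the 20-bit immediate in bits 31:12 and sign-extends to 64 bits.
std::uint64_t luiSlli(std::uint32_t Imm20, unsigned Shift) {
  std::int64_t Lui = static_cast<std::int32_t>(Imm20 << 12);
  return static_cast<std::uint64_t>(Lui) << Shift;
}
```

Under these assumptions, `bitsToDouble(liSlli(-2047, 51))` is -3.0, `bitsToDouble(luiSlli(2049, 39))` is 3.0, `bitsToDouble(luiSlli(8201, 37))` is 4.5, and `bitsToDouble(liSlli(1023, 52))` is 1.0, matching the constants in the IR above.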