[llvm] Ensure that soft float targets don't emit fma() libcalls.
#106615
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-backend-x86

Author: Alex Rønne Petersen (alexrp)

Changes: The previous behavior could be harmful in some edge cases, such as emitting a call to fma() even when the target is using soft float. Fix this by being more accurate in isFMAFasterThanFMulAndFAdd(): return false when the subtarget is in soft-float mode.

Patch is 30.77 KiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/106615.diff

8 Files Affected:
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 4ab0433069ae66..9d721da6f0c315 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -19488,6 +19488,9 @@ bool ARMTargetLowering::allowTruncateForTailCall(Type *Ty1, Type *Ty2) const {
/// patterns (and we don't have the non-fused floating point instruction).
bool ARMTargetLowering::isFMAFasterThanFMulAndFAdd(const MachineFunction &MF,
EVT VT) const {
+ if (Subtarget->useSoftFloat())
+ return false;
+
if (!VT.isSimple())
return false;
diff --git a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
index 6f84bd6c6e4ff4..21815700643126 100644
--- a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -786,6 +786,9 @@ EVT SystemZTargetLowering::getSetCCResultType(const DataLayout &DL,
bool SystemZTargetLowering::isFMAFasterThanFMulAndFAdd(
const MachineFunction &MF, EVT VT) const {
+ if (useSoftFloat())
+ return false;
+
VT = VT.getScalarType();
if (!VT.isSimple())
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 1a6be4eb5af1ef..9d63c8cfe29ac5 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -34423,6 +34423,9 @@ bool X86TargetLowering::isVectorLoadExtDesirable(SDValue ExtVal) const {
bool X86TargetLowering::isFMAFasterThanFMulAndFAdd(const MachineFunction &MF,
EVT VT) const {
+ if (Subtarget.useSoftFloat())
+ return false;
+
if (!Subtarget.hasAnyFMA())
return false;
diff --git a/llvm/test/CodeGen/ARM/fmuladd-soft-float.ll b/llvm/test/CodeGen/ARM/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..02efce1dc07b87
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/fmuladd-soft-float.ll
@@ -0,0 +1,75 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=arm < %s | FileCheck %s -check-prefix=SOFT-FLOAT
+; RUN: llc -mtriple=arm -mattr=+vfp4d16sp < %s | FileCheck %s -check-prefix=SOFT-FLOAT-VFP32
+; RUN: llc -mtriple=arm -mattr=+vfp4d16sp,+fp64 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-VFP64
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-LABEL: fma_f32:
+; SOFT-FLOAT: @ %bb.0:
+; SOFT-FLOAT-NEXT: push {r4, lr}
+; SOFT-FLOAT-NEXT: mov r4, r2
+; SOFT-FLOAT-NEXT: bl __mulsf3
+; SOFT-FLOAT-NEXT: mov r1, r4
+; SOFT-FLOAT-NEXT: bl __addsf3
+; SOFT-FLOAT-NEXT: pop {r4, lr}
+; SOFT-FLOAT-NEXT: mov pc, lr
+;
+; SOFT-FLOAT-VFP32-LABEL: fma_f32:
+; SOFT-FLOAT-VFP32: @ %bb.0:
+; SOFT-FLOAT-VFP32-NEXT: push {r4, lr}
+; SOFT-FLOAT-VFP32-NEXT: mov r4, r2
+; SOFT-FLOAT-VFP32-NEXT: bl __mulsf3
+; SOFT-FLOAT-VFP32-NEXT: mov r1, r4
+; SOFT-FLOAT-VFP32-NEXT: bl __addsf3
+; SOFT-FLOAT-VFP32-NEXT: pop {r4, lr}
+; SOFT-FLOAT-VFP32-NEXT: mov pc, lr
+;
+; SOFT-FLOAT-VFP64-LABEL: fma_f32:
+; SOFT-FLOAT-VFP64: @ %bb.0:
+; SOFT-FLOAT-VFP64-NEXT: push {r4, lr}
+; SOFT-FLOAT-VFP64-NEXT: mov r4, r2
+; SOFT-FLOAT-VFP64-NEXT: bl __mulsf3
+; SOFT-FLOAT-VFP64-NEXT: mov r1, r4
+; SOFT-FLOAT-VFP64-NEXT: bl __addsf3
+; SOFT-FLOAT-VFP64-NEXT: pop {r4, lr}
+; SOFT-FLOAT-VFP64-NEXT: mov pc, lr
+ %1 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
+ ret float %1
+}
+
+define double @fma_f64(double %a, double %b, double %c) "use-soft-float"="true" {
+; SOFT-FLOAT-LABEL: fma_f64:
+; SOFT-FLOAT: @ %bb.0:
+; SOFT-FLOAT-NEXT: push {r11, lr}
+; SOFT-FLOAT-NEXT: bl __muldf3
+; SOFT-FLOAT-NEXT: ldr r2, [sp, #8]
+; SOFT-FLOAT-NEXT: ldr r3, [sp, #12]
+; SOFT-FLOAT-NEXT: bl __adddf3
+; SOFT-FLOAT-NEXT: pop {r11, lr}
+; SOFT-FLOAT-NEXT: mov pc, lr
+;
+; SOFT-FLOAT-VFP32-LABEL: fma_f64:
+; SOFT-FLOAT-VFP32: @ %bb.0:
+; SOFT-FLOAT-VFP32-NEXT: push {r11, lr}
+; SOFT-FLOAT-VFP32-NEXT: bl __muldf3
+; SOFT-FLOAT-VFP32-NEXT: ldr r2, [sp, #8]
+; SOFT-FLOAT-VFP32-NEXT: ldr r3, [sp, #12]
+; SOFT-FLOAT-VFP32-NEXT: bl __adddf3
+; SOFT-FLOAT-VFP32-NEXT: pop {r11, lr}
+; SOFT-FLOAT-VFP32-NEXT: mov pc, lr
+;
+; SOFT-FLOAT-VFP64-LABEL: fma_f64:
+; SOFT-FLOAT-VFP64: @ %bb.0:
+; SOFT-FLOAT-VFP64-NEXT: push {r11, lr}
+; SOFT-FLOAT-VFP64-NEXT: bl __muldf3
+; SOFT-FLOAT-VFP64-NEXT: ldr r2, [sp, #8]
+; SOFT-FLOAT-VFP64-NEXT: ldr r3, [sp, #12]
+; SOFT-FLOAT-VFP64-NEXT: bl __adddf3
+; SOFT-FLOAT-VFP64-NEXT: pop {r11, lr}
+; SOFT-FLOAT-VFP64-NEXT: mov pc, lr
+ %1 = call double @llvm.fmuladd.f64(double %a, double %b, double %c)
+ ret double %1
+}
+
+declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
+declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
diff --git a/llvm/test/CodeGen/Mips/fmuladd-soft-float.ll b/llvm/test/CodeGen/Mips/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..a8b4244b36f9a3
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/fmuladd-soft-float.ll
@@ -0,0 +1,162 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=mips < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32
+; RUN: llc -mtriple=mips -mcpu mips32r2 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32R2
+; RUN: llc -mtriple=mips64 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64
+; RUN: llc -mtriple=mips64 -mcpu mips64r2 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64R2
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f32:
+; SOFT-FLOAT-32: # %bb.0:
+; SOFT-FLOAT-32-NEXT: addiu $sp, $sp, -24
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_offset 24
+; SOFT-FLOAT-32-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
+; SOFT-FLOAT-32-NEXT: sw $16, 16($sp) # 4-byte Folded Spill
+; SOFT-FLOAT-32-NEXT: .cfi_offset 31, -4
+; SOFT-FLOAT-32-NEXT: .cfi_offset 16, -8
+; SOFT-FLOAT-32-NEXT: jal __mulsf3
+; SOFT-FLOAT-32-NEXT: move $16, $6
+; SOFT-FLOAT-32-NEXT: move $4, $2
+; SOFT-FLOAT-32-NEXT: jal __addsf3
+; SOFT-FLOAT-32-NEXT: move $5, $16
+; SOFT-FLOAT-32-NEXT: lw $16, 16($sp) # 4-byte Folded Reload
+; SOFT-FLOAT-32-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
+; SOFT-FLOAT-32-NEXT: jr $ra
+; SOFT-FLOAT-32-NEXT: addiu $sp, $sp, 24
+;
+; SOFT-FLOAT-32R2-LABEL: fma_f32:
+; SOFT-FLOAT-32R2: # %bb.0:
+; SOFT-FLOAT-32R2-NEXT: addiu $sp, $sp, -24
+; SOFT-FLOAT-32R2-NEXT: .cfi_def_cfa_offset 24
+; SOFT-FLOAT-32R2-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
+; SOFT-FLOAT-32R2-NEXT: sw $16, 16($sp) # 4-byte Folded Spill
+; SOFT-FLOAT-32R2-NEXT: .cfi_offset 31, -4
+; SOFT-FLOAT-32R2-NEXT: .cfi_offset 16, -8
+; SOFT-FLOAT-32R2-NEXT: jal __mulsf3
+; SOFT-FLOAT-32R2-NEXT: move $16, $6
+; SOFT-FLOAT-32R2-NEXT: move $4, $2
+; SOFT-FLOAT-32R2-NEXT: jal __addsf3
+; SOFT-FLOAT-32R2-NEXT: move $5, $16
+; SOFT-FLOAT-32R2-NEXT: lw $16, 16($sp) # 4-byte Folded Reload
+; SOFT-FLOAT-32R2-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
+; SOFT-FLOAT-32R2-NEXT: jr $ra
+; SOFT-FLOAT-32R2-NEXT: addiu $sp, $sp, 24
+;
+; SOFT-FLOAT-64-LABEL: fma_f32:
+; SOFT-FLOAT-64: # %bb.0:
+; SOFT-FLOAT-64-NEXT: daddiu $sp, $sp, -16
+; SOFT-FLOAT-64-NEXT: .cfi_def_cfa_offset 16
+; SOFT-FLOAT-64-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64-NEXT: sd $16, 0($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64-NEXT: .cfi_offset 31, -8
+; SOFT-FLOAT-64-NEXT: .cfi_offset 16, -16
+; SOFT-FLOAT-64-NEXT: move $16, $6
+; SOFT-FLOAT-64-NEXT: sll $4, $4, 0
+; SOFT-FLOAT-64-NEXT: jal __mulsf3
+; SOFT-FLOAT-64-NEXT: sll $5, $5, 0
+; SOFT-FLOAT-64-NEXT: sll $4, $2, 0
+; SOFT-FLOAT-64-NEXT: jal __addsf3
+; SOFT-FLOAT-64-NEXT: sll $5, $16, 0
+; SOFT-FLOAT-64-NEXT: ld $16, 0($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64-NEXT: jr $ra
+; SOFT-FLOAT-64-NEXT: daddiu $sp, $sp, 16
+;
+; SOFT-FLOAT-64R2-LABEL: fma_f32:
+; SOFT-FLOAT-64R2: # %bb.0:
+; SOFT-FLOAT-64R2-NEXT: daddiu $sp, $sp, -16
+; SOFT-FLOAT-64R2-NEXT: .cfi_def_cfa_offset 16
+; SOFT-FLOAT-64R2-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64R2-NEXT: sd $16, 0($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64R2-NEXT: .cfi_offset 31, -8
+; SOFT-FLOAT-64R2-NEXT: .cfi_offset 16, -16
+; SOFT-FLOAT-64R2-NEXT: move $16, $6
+; SOFT-FLOAT-64R2-NEXT: sll $4, $4, 0
+; SOFT-FLOAT-64R2-NEXT: jal __mulsf3
+; SOFT-FLOAT-64R2-NEXT: sll $5, $5, 0
+; SOFT-FLOAT-64R2-NEXT: sll $4, $2, 0
+; SOFT-FLOAT-64R2-NEXT: jal __addsf3
+; SOFT-FLOAT-64R2-NEXT: sll $5, $16, 0
+; SOFT-FLOAT-64R2-NEXT: ld $16, 0($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64R2-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64R2-NEXT: jr $ra
+; SOFT-FLOAT-64R2-NEXT: daddiu $sp, $sp, 16
+ %1 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
+ ret float %1
+}
+
+define double @fma_f64(double %a, double %b, double %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f64:
+; SOFT-FLOAT-32: # %bb.0:
+; SOFT-FLOAT-32-NEXT: addiu $sp, $sp, -24
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_offset 24
+; SOFT-FLOAT-32-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
+; SOFT-FLOAT-32-NEXT: .cfi_offset 31, -4
+; SOFT-FLOAT-32-NEXT: jal __muldf3
+; SOFT-FLOAT-32-NEXT: nop
+; SOFT-FLOAT-32-NEXT: move $4, $2
+; SOFT-FLOAT-32-NEXT: lw $6, 40($sp)
+; SOFT-FLOAT-32-NEXT: lw $7, 44($sp)
+; SOFT-FLOAT-32-NEXT: jal __adddf3
+; SOFT-FLOAT-32-NEXT: move $5, $3
+; SOFT-FLOAT-32-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
+; SOFT-FLOAT-32-NEXT: jr $ra
+; SOFT-FLOAT-32-NEXT: addiu $sp, $sp, 24
+;
+; SOFT-FLOAT-32R2-LABEL: fma_f64:
+; SOFT-FLOAT-32R2: # %bb.0:
+; SOFT-FLOAT-32R2-NEXT: addiu $sp, $sp, -24
+; SOFT-FLOAT-32R2-NEXT: .cfi_def_cfa_offset 24
+; SOFT-FLOAT-32R2-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
+; SOFT-FLOAT-32R2-NEXT: .cfi_offset 31, -4
+; SOFT-FLOAT-32R2-NEXT: jal __muldf3
+; SOFT-FLOAT-32R2-NEXT: nop
+; SOFT-FLOAT-32R2-NEXT: move $4, $2
+; SOFT-FLOAT-32R2-NEXT: lw $6, 40($sp)
+; SOFT-FLOAT-32R2-NEXT: lw $7, 44($sp)
+; SOFT-FLOAT-32R2-NEXT: jal __adddf3
+; SOFT-FLOAT-32R2-NEXT: move $5, $3
+; SOFT-FLOAT-32R2-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
+; SOFT-FLOAT-32R2-NEXT: jr $ra
+; SOFT-FLOAT-32R2-NEXT: addiu $sp, $sp, 24
+;
+; SOFT-FLOAT-64-LABEL: fma_f64:
+; SOFT-FLOAT-64: # %bb.0:
+; SOFT-FLOAT-64-NEXT: daddiu $sp, $sp, -16
+; SOFT-FLOAT-64-NEXT: .cfi_def_cfa_offset 16
+; SOFT-FLOAT-64-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64-NEXT: sd $16, 0($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64-NEXT: .cfi_offset 31, -8
+; SOFT-FLOAT-64-NEXT: .cfi_offset 16, -16
+; SOFT-FLOAT-64-NEXT: jal __muldf3
+; SOFT-FLOAT-64-NEXT: move $16, $6
+; SOFT-FLOAT-64-NEXT: move $4, $2
+; SOFT-FLOAT-64-NEXT: jal __adddf3
+; SOFT-FLOAT-64-NEXT: move $5, $16
+; SOFT-FLOAT-64-NEXT: ld $16, 0($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64-NEXT: jr $ra
+; SOFT-FLOAT-64-NEXT: daddiu $sp, $sp, 16
+;
+; SOFT-FLOAT-64R2-LABEL: fma_f64:
+; SOFT-FLOAT-64R2: # %bb.0:
+; SOFT-FLOAT-64R2-NEXT: daddiu $sp, $sp, -16
+; SOFT-FLOAT-64R2-NEXT: .cfi_def_cfa_offset 16
+; SOFT-FLOAT-64R2-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64R2-NEXT: sd $16, 0($sp) # 8-byte Folded Spill
+; SOFT-FLOAT-64R2-NEXT: .cfi_offset 31, -8
+; SOFT-FLOAT-64R2-NEXT: .cfi_offset 16, -16
+; SOFT-FLOAT-64R2-NEXT: jal __muldf3
+; SOFT-FLOAT-64R2-NEXT: move $16, $6
+; SOFT-FLOAT-64R2-NEXT: move $4, $2
+; SOFT-FLOAT-64R2-NEXT: jal __adddf3
+; SOFT-FLOAT-64R2-NEXT: move $5, $16
+; SOFT-FLOAT-64R2-NEXT: ld $16, 0($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64R2-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
+; SOFT-FLOAT-64R2-NEXT: jr $ra
+; SOFT-FLOAT-64R2-NEXT: daddiu $sp, $sp, 16
+ %1 = call double @llvm.fmuladd.f64(double %a, double %b, double %c)
+ ret double %1
+}
+
+declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
+declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
diff --git a/llvm/test/CodeGen/SPARC/fmuladd-soft-float.ll b/llvm/test/CodeGen/SPARC/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..3a85dd66132ee7
--- /dev/null
+++ b/llvm/test/CodeGen/SPARC/fmuladd-soft-float.ll
@@ -0,0 +1,78 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=sparc < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32
+; RUN: llc -mtriple=sparc64 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f32:
+; SOFT-FLOAT-32: .cfi_startproc
+; SOFT-FLOAT-32-NEXT: ! %bb.0:
+; SOFT-FLOAT-32-NEXT: save %sp, -96, %sp
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_register %fp
+; SOFT-FLOAT-32-NEXT: .cfi_window_save
+; SOFT-FLOAT-32-NEXT: .cfi_register %o7, %i7
+; SOFT-FLOAT-32-NEXT: mov %i0, %o0
+; SOFT-FLOAT-32-NEXT: call __mulsf3
+; SOFT-FLOAT-32-NEXT: mov %i1, %o1
+; SOFT-FLOAT-32-NEXT: call __addsf3
+; SOFT-FLOAT-32-NEXT: mov %i2, %o1
+; SOFT-FLOAT-32-NEXT: ret
+; SOFT-FLOAT-32-NEXT: restore %g0, %o0, %o0
+;
+; SOFT-FLOAT-64-LABEL: fma_f32:
+; SOFT-FLOAT-64: .cfi_startproc
+; SOFT-FLOAT-64-NEXT: ! %bb.0:
+; SOFT-FLOAT-64-NEXT: save %sp, -176, %sp
+; SOFT-FLOAT-64-NEXT: .cfi_def_cfa_register %fp
+; SOFT-FLOAT-64-NEXT: .cfi_window_save
+; SOFT-FLOAT-64-NEXT: .cfi_register %o7, %i7
+; SOFT-FLOAT-64-NEXT: srl %i0, 0, %o0
+; SOFT-FLOAT-64-NEXT: call __mulsf3
+; SOFT-FLOAT-64-NEXT: srl %i1, 0, %o1
+; SOFT-FLOAT-64-NEXT: call __addsf3
+; SOFT-FLOAT-64-NEXT: srl %i2, 0, %o1
+; SOFT-FLOAT-64-NEXT: ret
+; SOFT-FLOAT-64-NEXT: restore %g0, %o0, %o0
+ %1 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
+ ret float %1
+}
+
+define double @fma_f64(double %a, double %b, double %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f64:
+; SOFT-FLOAT-32: .cfi_startproc
+; SOFT-FLOAT-32-NEXT: ! %bb.0:
+; SOFT-FLOAT-32-NEXT: save %sp, -96, %sp
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_register %fp
+; SOFT-FLOAT-32-NEXT: .cfi_window_save
+; SOFT-FLOAT-32-NEXT: .cfi_register %o7, %i7
+; SOFT-FLOAT-32-NEXT: mov %i0, %o0
+; SOFT-FLOAT-32-NEXT: mov %i1, %o1
+; SOFT-FLOAT-32-NEXT: mov %i2, %o2
+; SOFT-FLOAT-32-NEXT: call __muldf3
+; SOFT-FLOAT-32-NEXT: mov %i3, %o3
+; SOFT-FLOAT-32-NEXT: mov %i4, %o2
+; SOFT-FLOAT-32-NEXT: call __adddf3
+; SOFT-FLOAT-32-NEXT: mov %i5, %o3
+; SOFT-FLOAT-32-NEXT: mov %o0, %i0
+; SOFT-FLOAT-32-NEXT: ret
+; SOFT-FLOAT-32-NEXT: restore %g0, %o1, %o1
+;
+; SOFT-FLOAT-64-LABEL: fma_f64:
+; SOFT-FLOAT-64: .cfi_startproc
+; SOFT-FLOAT-64-NEXT: ! %bb.0:
+; SOFT-FLOAT-64-NEXT: save %sp, -176, %sp
+; SOFT-FLOAT-64-NEXT: .cfi_def_cfa_register %fp
+; SOFT-FLOAT-64-NEXT: .cfi_window_save
+; SOFT-FLOAT-64-NEXT: .cfi_register %o7, %i7
+; SOFT-FLOAT-64-NEXT: mov %i0, %o0
+; SOFT-FLOAT-64-NEXT: call __muldf3
+; SOFT-FLOAT-64-NEXT: mov %i1, %o1
+; SOFT-FLOAT-64-NEXT: call __adddf3
+; SOFT-FLOAT-64-NEXT: mov %i2, %o1
+; SOFT-FLOAT-64-NEXT: ret
+; SOFT-FLOAT-64-NEXT: restore %g0, %o0, %o0
+ %1 = call double @llvm.fmuladd.f64(double %a, double %b, double %c)
+ ret double %1
+}
+
+declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
+declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
diff --git a/llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll b/llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..81804975f9661f
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=s390x < %s | FileCheck %s -check-prefix=SOFT-FLOAT
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-LABEL: fma_f32:
+; SOFT-FLOAT: # %bb.0:
+; SOFT-FLOAT-NEXT: stmg %r13, %r15, 104(%r15)
+; SOFT-FLOAT-NEXT: .cfi_offset %r13, -56
+; SOFT-FLOAT-NEXT: .cfi_offset %r14, -48
+; SOFT-FLOAT-NEXT: .cfi_offset %r15, -40
+; SOFT-FLOAT-NEXT: aghi %r15, -160
+; SOFT-FLOAT-NEXT: .cfi_def_cfa_offset 320
+; SOFT-FLOAT-NEXT: llgfr %r2, %r2
+; SOFT-FLOAT-NEXT: llgfr %r3, %r3
+; SOFT-FLOAT-NEXT: lr %r13, %r4
+; SOFT-FLOAT-NEXT: brasl %r14, __mulsf3@PLT
+; SOFT-FLOAT-NEXT: llgfr %r3, %r13
+; SOFT-FLOAT-NEXT: brasl %r14, __addsf3@PLT
+; SOFT-FLOAT-NEXT: # kill: def $r2l killed $r2l killed $r2d
+; SOFT-FLOAT-NEXT: lmg %r13, %r15, 264(%r15)
+; SOFT-FLOAT-NEXT: br %r14
+ %1 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
+ ret float %1
+}
+
+define double @fma_f64(double %a, double %b, double %c) "use-soft-float"="true" {
+; SOFT-FLOAT-LABEL: fma_f64:
+; SOFT-FLOAT: # %bb.0:
+; SOFT-FLOAT-NEXT: stmg %r13, %r15, 104(%r15)
+; SOFT-FLOAT-NEXT: .cfi_offset %r13, -56
+; SOFT-FLOAT-NEXT: .cfi_offset %r14, -48
+; SOFT-FLOAT-NEXT: .cfi_offset %r15, -40
+; SOFT-FLOAT-NEXT: aghi %r15, -160
+; SOFT-FLOAT-NEXT: .cfi_def_cfa_offset 320
+; SOFT-FLOAT-NEXT: lgr %r13, %r4
+; SOFT-FLOAT-NEXT: brasl %r14, __muldf3@PLT
+; SOFT-FLOAT-NEXT: lgr %r3, %r13
+; SOFT-FLOAT-NEXT: brasl %r14, __adddf3@PLT
+; SOFT-FLOAT-NEXT: lmg %r13, %r15, 264(%r15)
+; SOFT-FLOAT-NEXT: br %r14
+ %1 = call double @llvm.fmuladd.f64(double %a, double %b, double %c)
+ ret double %1
+}
+
+declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
+declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
diff --git a/llvm/test/CodeGen/X86/fmuladd-soft-float.ll b/llvm/test/CodeGen/X86/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..aa535326589033
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fmuladd-soft-float.ll
@@ -0,0 +1,288 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --extra_scrub --x86_scrub_rip --version 5
+; RUN: llc -mtriple=i386 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32
+; RUN: llc -mtriple=i386 -mattr +fma < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32-FMA
+; RUN: llc -mtriple=i386 -mattr +fma4 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32-FMA4
+; RUN: llc -mtriple=x86_64 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64
+; RUN: llc -mtriple=x86_64 -mattr +fma < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64-FMA
+; RUN: llc -mtriple=x86_64 -mattr +fma4 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64-FMA4
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f32:
+; SOFT-FLOAT-32: # %bb.0:
+; SOFT-FLOAT-32-NEXT: pushl %esi
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_offset 8
+; SOFT-FLOAT-32-NEXT: .cfi_offset %esi, -8
+; SOFT-FLOAT-32-NEXT: movl {{[0-9]+}}(%esp), %esi
+; SOFT-FLOAT-32-NEXT: pushl {{[0-9]+}}(%esp)
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: pushl {{[0-9]+}}(%esp)
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: calll __mulsf3
+; SOFT-FLOAT-32-NEXT: addl $8, %esp
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset -8
+; SOFT-FLOAT-32-NEXT: pushl %esi
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: pushl %eax
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: calll __addsf3
+; SOFT-FLOAT-32-NEXT: addl $8, %esp
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset -8
+; SOFT-FLOAT-32-NEXT: popl %esi
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: retl
+;
+; SOFT-FLOAT-32-FMA-LABEL: fma_f32:
+; SOFT-FLOAT-32-FMA: # %bb.0:
+; SOFT-FLOAT-32-FMA-NEXT: pushl %esi
+; SOFT-FLOAT-32-FMA-NEXT: .cfi_def_cfa_offset 8
+; SOFT-FLOAT-32-FMA-NEXT: .cfi_offset %esi, -8
+; SOFT-FLOAT-32-FMA-NEXT: movl {{[0-9]+}}(%esp), %esi
+; SOFT-FLOAT-32-FMA-NEXT: pushl {{[0-9]+}}(%esp)
+; SOFT-FLOAT-32-FMA-N...
[truncated]
+ ret float %1
+}
+
+define double @fma_f64(double %a, double %b, double %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f64:
+; SOFT-FLOAT-32: .cfi_startproc
+; SOFT-FLOAT-32-NEXT: ! %bb.0:
+; SOFT-FLOAT-32-NEXT: save %sp, -96, %sp
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_register %fp
+; SOFT-FLOAT-32-NEXT: .cfi_window_save
+; SOFT-FLOAT-32-NEXT: .cfi_register %o7, %i7
+; SOFT-FLOAT-32-NEXT: mov %i0, %o0
+; SOFT-FLOAT-32-NEXT: mov %i1, %o1
+; SOFT-FLOAT-32-NEXT: mov %i2, %o2
+; SOFT-FLOAT-32-NEXT: call __muldf3
+; SOFT-FLOAT-32-NEXT: mov %i3, %o3
+; SOFT-FLOAT-32-NEXT: mov %i4, %o2
+; SOFT-FLOAT-32-NEXT: call __adddf3
+; SOFT-FLOAT-32-NEXT: mov %i5, %o3
+; SOFT-FLOAT-32-NEXT: mov %o0, %i0
+; SOFT-FLOAT-32-NEXT: ret
+; SOFT-FLOAT-32-NEXT: restore %g0, %o1, %o1
+;
+; SOFT-FLOAT-64-LABEL: fma_f64:
+; SOFT-FLOAT-64: .cfi_startproc
+; SOFT-FLOAT-64-NEXT: ! %bb.0:
+; SOFT-FLOAT-64-NEXT: save %sp, -176, %sp
+; SOFT-FLOAT-64-NEXT: .cfi_def_cfa_register %fp
+; SOFT-FLOAT-64-NEXT: .cfi_window_save
+; SOFT-FLOAT-64-NEXT: .cfi_register %o7, %i7
+; SOFT-FLOAT-64-NEXT: mov %i0, %o0
+; SOFT-FLOAT-64-NEXT: call __muldf3
+; SOFT-FLOAT-64-NEXT: mov %i1, %o1
+; SOFT-FLOAT-64-NEXT: call __adddf3
+; SOFT-FLOAT-64-NEXT: mov %i2, %o1
+; SOFT-FLOAT-64-NEXT: ret
+; SOFT-FLOAT-64-NEXT: restore %g0, %o0, %o0
+ %1 = call double @llvm.fmuladd.f64(double %a, double %b, double %c)
+ ret double %1
+}
+
+declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
+declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
diff --git a/llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll b/llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..81804975f9661f
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=s390x < %s | FileCheck %s -check-prefix=SOFT-FLOAT
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-LABEL: fma_f32:
+; SOFT-FLOAT: # %bb.0:
+; SOFT-FLOAT-NEXT: stmg %r13, %r15, 104(%r15)
+; SOFT-FLOAT-NEXT: .cfi_offset %r13, -56
+; SOFT-FLOAT-NEXT: .cfi_offset %r14, -48
+; SOFT-FLOAT-NEXT: .cfi_offset %r15, -40
+; SOFT-FLOAT-NEXT: aghi %r15, -160
+; SOFT-FLOAT-NEXT: .cfi_def_cfa_offset 320
+; SOFT-FLOAT-NEXT: llgfr %r2, %r2
+; SOFT-FLOAT-NEXT: llgfr %r3, %r3
+; SOFT-FLOAT-NEXT: lr %r13, %r4
+; SOFT-FLOAT-NEXT: brasl %r14, __mulsf3@PLT
+; SOFT-FLOAT-NEXT: llgfr %r3, %r13
+; SOFT-FLOAT-NEXT: brasl %r14, __addsf3@PLT
+; SOFT-FLOAT-NEXT: # kill: def $r2l killed $r2l killed $r2d
+; SOFT-FLOAT-NEXT: lmg %r13, %r15, 264(%r15)
+; SOFT-FLOAT-NEXT: br %r14
+ %1 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
+ ret float %1
+}
+
+define double @fma_f64(double %a, double %b, double %c) "use-soft-float"="true" {
+; SOFT-FLOAT-LABEL: fma_f64:
+; SOFT-FLOAT: # %bb.0:
+; SOFT-FLOAT-NEXT: stmg %r13, %r15, 104(%r15)
+; SOFT-FLOAT-NEXT: .cfi_offset %r13, -56
+; SOFT-FLOAT-NEXT: .cfi_offset %r14, -48
+; SOFT-FLOAT-NEXT: .cfi_offset %r15, -40
+; SOFT-FLOAT-NEXT: aghi %r15, -160
+; SOFT-FLOAT-NEXT: .cfi_def_cfa_offset 320
+; SOFT-FLOAT-NEXT: lgr %r13, %r4
+; SOFT-FLOAT-NEXT: brasl %r14, __muldf3@PLT
+; SOFT-FLOAT-NEXT: lgr %r3, %r13
+; SOFT-FLOAT-NEXT: brasl %r14, __adddf3@PLT
+; SOFT-FLOAT-NEXT: lmg %r13, %r15, 264(%r15)
+; SOFT-FLOAT-NEXT: br %r14
+ %1 = call double @llvm.fmuladd.f64(double %a, double %b, double %c)
+ ret double %1
+}
+
+declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
+declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
diff --git a/llvm/test/CodeGen/X86/fmuladd-soft-float.ll b/llvm/test/CodeGen/X86/fmuladd-soft-float.ll
new file mode 100644
index 00000000000000..aa535326589033
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fmuladd-soft-float.ll
@@ -0,0 +1,288 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --extra_scrub --x86_scrub_rip --version 5
+; RUN: llc -mtriple=i386 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32
+; RUN: llc -mtriple=i386 -mattr +fma < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32-FMA
+; RUN: llc -mtriple=i386 -mattr +fma4 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-32-FMA4
+; RUN: llc -mtriple=x86_64 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64
+; RUN: llc -mtriple=x86_64 -mattr +fma < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64-FMA
+; RUN: llc -mtriple=x86_64 -mattr +fma4 < %s | FileCheck %s -check-prefix=SOFT-FLOAT-64-FMA4
+
+define float @fma_f32(float %a, float %b, float %c) "use-soft-float"="true" {
+; SOFT-FLOAT-32-LABEL: fma_f32:
+; SOFT-FLOAT-32: # %bb.0:
+; SOFT-FLOAT-32-NEXT: pushl %esi
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_offset 8
+; SOFT-FLOAT-32-NEXT: .cfi_offset %esi, -8
+; SOFT-FLOAT-32-NEXT: movl {{[0-9]+}}(%esp), %esi
+; SOFT-FLOAT-32-NEXT: pushl {{[0-9]+}}(%esp)
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: pushl {{[0-9]+}}(%esp)
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: calll __mulsf3
+; SOFT-FLOAT-32-NEXT: addl $8, %esp
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset -8
+; SOFT-FLOAT-32-NEXT: pushl %esi
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: pushl %eax
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: calll __addsf3
+; SOFT-FLOAT-32-NEXT: addl $8, %esp
+; SOFT-FLOAT-32-NEXT: .cfi_adjust_cfa_offset -8
+; SOFT-FLOAT-32-NEXT: popl %esi
+; SOFT-FLOAT-32-NEXT: .cfi_def_cfa_offset 4
+; SOFT-FLOAT-32-NEXT: retl
+;
+; SOFT-FLOAT-32-FMA-LABEL: fma_f32:
+; SOFT-FLOAT-32-FMA: # %bb.0:
+; SOFT-FLOAT-32-FMA-NEXT: pushl %esi
+; SOFT-FLOAT-32-FMA-NEXT: .cfi_def_cfa_offset 8
+; SOFT-FLOAT-32-FMA-NEXT: .cfi_offset %esi, -8
+; SOFT-FLOAT-32-FMA-NEXT: movl {{[0-9]+}}(%esp), %esi
+; SOFT-FLOAT-32-FMA-NEXT: pushl {{[0-9]+}}(%esp)
+; SOFT-FLOAT-32-FMA-N...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from 3de0e81 to 35095dc
Can you update the description in TargetLowering.h to mention this restriction?
Force-pushed from 35095dc to d56728b
Done.
Force-pushed from d56728b to 6c3c7d6
Pushed updated tests.
ping 🙂
Force-pushed from 1fbfd27 to 9d0471b
This doesn't feel like the right place for this, I would expect there to be some legality check that would fail. But there's already precedent for directly handling it here. Anyway basically lgtm
For what it's worth, what happened for the affected targets was that the
Handling this special case in
Force-pushed from 9d0471b to d1bfa9c
Updated tests to use IR
arsenm left a comment
lgtm with nit
Force-pushed from d1bfa9c to d899b81
The previous behavior could be harmful in some edge cases, such as emitting a call to fma() in the fma() implementation itself. Do this by just being more accurate in isFMAFasterThanFMulAndFAdd(). This was already done for PowerPC; this commit just extends that to Arm, z/Arch, and x86. MIPS and SPARC already got it right, but I added tests for them too, for good measure.
Force-pushed from d899b81 to 839f221
ping (CI failure looks like unrelated breakage on main)
The previous behavior could be harmful in some edge cases, such as emitting a call to `fma()` in the `fma()` implementation itself.

Do this by just being more accurate in `isFMAFasterThanFMulAndFAdd()`. This was already done for PowerPC; this commit just extends that to Arm, z/Arch, and x86. MIPS and SPARC already got it right, but I added tests for them too, for good measure.

Note: I don't have commit access.