X86: Do not return invalid cost for fp16 conversion #114128
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-x86
Author: Matthias Braun (MatzeB)
Changes: Returning invalid instruction costs when converting from/to fp16 in `X86TTIImpl::getCastInstrCost` when there is no hardware support available was triggering asserts.
Full diff: https://github.com/llvm/llvm-project/pull/114128.diff
2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index bae223243b3dc9..ef16636a2ea544 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -3068,6 +3068,13 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
if (auto KindCost = Entry->Cost[CostKind])
return *KindCost;
}
+
+ if ((ISD == ISD::FP_ROUND && SimpleDstTy == MVT::f16) ||
+ (ISD == ISD::FP_EXTEND && SimpleSrcTy == MVT::f16)) {
+ // fp16 conversions not covered yet require a libcall, return a
+ // large (arbitrary) number.
+ return InstructionCost(64);
+ }
}
// Fall back to legalized types.
@@ -3174,11 +3181,6 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
TTI::CastContextHint::None, CostKind);
}
- if (ISD == ISD::FP_ROUND && LTDest.second.getScalarType() == MVT::f16) {
- // Conversion requires a libcall.
- return InstructionCost::getInvalid();
- }
-
// TODO: Allow non-throughput costs that aren't binary.
auto AdjustCost = [&CostKind](InstructionCost Cost,
InstructionCost N = 1) -> InstructionCost {
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll b/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll
index bcea147d724f53..f23043f0c47f4a 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll
@@ -453,14 +453,9 @@ define void @fpround_v16xf32_v16xf16(ptr %s0, ptr %d0) {
;
; CHECK-F16C-LABEL: define void @fpround_v16xf32_v16xf16(
; CHECK-F16C-SAME: ptr [[S0:%.*]], ptr [[D0:%.*]]) #[[ATTR0]] {
-; CHECK-F16C-NEXT: [[S8:%.*]] = getelementptr inbounds float, ptr [[S0]], i64 8
-; CHECK-F16C-NEXT: [[D8:%.*]] = getelementptr inbounds half, ptr [[D0]], i64 8
-; CHECK-F16C-NEXT: [[TMP1:%.*]] = load <8 x float>, ptr [[S0]], align 4
-; CHECK-F16C-NEXT: [[TMP2:%.*]] = fptrunc <8 x float> [[TMP1]] to <8 x half>
-; CHECK-F16C-NEXT: [[TMP3:%.*]] = load <8 x float>, ptr [[S8]], align 4
-; CHECK-F16C-NEXT: [[TMP4:%.*]] = fptrunc <8 x float> [[TMP3]] to <8 x half>
-; CHECK-F16C-NEXT: store <8 x half> [[TMP2]], ptr [[D0]], align 2
-; CHECK-F16C-NEXT: store <8 x half> [[TMP4]], ptr [[D8]], align 2
+; CHECK-F16C-NEXT: [[TMP1:%.*]] = load <16 x float>, ptr [[S0]], align 4
+; CHECK-F16C-NEXT: [[TMP2:%.*]] = fptrunc <16 x float> [[TMP1]] to <16 x half>
+; CHECK-F16C-NEXT: store <16 x half> [[TMP2]], ptr [[D0]], align 2
; CHECK-F16C-NEXT: ret void
;
; CHECK-AVX512-LABEL: define void @fpround_v16xf32_v16xf16(
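The first hunk places the new check after the cost-table lookups and before the fall-back to legalized types. A minimal, self-contained sketch of that ordering is given below. `ToyOp`, `ToyType`, `lookupTable`, `legalizedFallbackCost`, and `castCost` are hypothetical stand-ins invented for illustration and are not LLVM APIs; only the constant 64 comes from the patch, and the table contents and fallback value are made up.

```cpp
#include <optional>

// Hypothetical stand-ins for ISD opcodes and MVT simple types.
enum class ToyOp { FpRound, FpExtend, Other };
enum class ToyType { F16, F32, F64 };

// Hypothetical cost table: pretend only the f32 -> f64 extension has an entry.
std::optional<int> lookupTable(ToyOp Op, ToyType Dst, ToyType Src) {
  if (Op == ToyOp::FpExtend && Dst == ToyType::F64 && Src == ToyType::F32)
    return 1;
  return std::nullopt;
}

// Hypothetical placeholder for the "fall back to legalized types" path.
int legalizedFallbackCost() { return 2; }

int castCost(ToyOp Op, ToyType Dst, ToyType Src) {
  // 1. A table entry, when present, wins.
  if (auto Cost = lookupTable(Op, Dst, Src))
    return *Cost;
  // 2. fp16 conversions with no table entry are modeled as a libcall:
  //    a large, arbitrary, but still valid cost (64 in the patch).
  if ((Op == ToyOp::FpRound && Dst == ToyType::F16) ||
      (Op == ToyOp::FpExtend && Src == ToyType::F16))
    return 64;
  // 3. Everything else falls through to the legalized-type logic.
  return legalizedFallbackCost();
}
```

In this sketch, `castCost(ToyOp::FpRound, ToyType::F16, ToyType::F32)` takes the second branch, because the toy table has no entry for it.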
Force-pushed from 9ae0843 to eb9f427
JoelWee
left a comment
Thanks for the quick turnaround on this Matthias! This looks like it fixes the failure in jax/.../lax_test :)
if ((ISD == ISD::FP_ROUND && SimpleDstTy == MVT::f16) ||
(ISD == ISD::FP_EXTEND && SimpleSrcTy == MVT::f16)) {
// fp16 conversions not covered by any table entries require a libcall,
// return a large (arbitrary) number.
nit: "return a large (arbitrary) number to model this."
Force-pushed from eb9f427 to 2e43458
RKSimon
left a comment
LGTM
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/7729
Returning invalid instruction costs when converting from/to fp16 in `X86TTIImpl::getCastInstrCost` when there is no hardware support available was triggering asserts. This changes the code to return a large (arbitrary) number to model the fact that libcalls are used to implement the conversion. This also simplifies the code by only reporting costs for the scalar fp16 conversion; vectorized costs being left to the fallback assuming scalarization. This is a follow-up to assertion issues reported for the changes in llvm#113195
upstream commit: 255e441
Returning invalid instruction costs when converting from/to fp16 in `X86TTIImpl::getCastInstrCost` when there is no hardware support available was triggering asserts. This changes the code to return a large (arbitrary) number to model the fact that libcalls are used to implement the conversion.

This also simplifies the code by only reporting costs for the scalar fp16 conversion; vectorized costs being left to the fallback assuming scalarization.

This is a follow-up to assertion issues reported for the changes in #113195
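To illustrate why an invalid cost can trip assertions in consumers while a large finite cost cannot, a toy model is sketched below. `ToyCost` and `vectorIsCheaper` are hypothetical names that only mimic the idea of a cost carrying a validity flag; this is not LLVM's `InstructionCost`, and every number except 64 is made up.

```cpp
#include <optional>

// Hypothetical cost type: a value plus a validity flag.
class ToyCost {
  int Value = 0;
  bool Valid = true;

public:
  ToyCost(int V) : Value(V) {}

  static ToyCost getInvalid() {
    ToyCost C(0);
    C.Valid = false;
    return C;
  }

  bool isValid() const { return Valid; }

  // The invalid state is sticky: adding anything to an invalid cost
  // yields an invalid cost.
  ToyCost operator+(const ToyCost &RHS) const {
    ToyCost R(Value + RHS.Value);
    R.Valid = Valid && RHS.Valid;
    return R;
  }

  // Callers that want the raw number must handle the invalid case.
  std::optional<int> getValue() const {
    if (!Valid)
      return std::nullopt;
    return Value;
  }
};

// A consumer comparing a vector cost against a scalar cost.
bool vectorIsCheaper(const ToyCost &Vec, const ToyCost &Scalar) {
  auto V = Vec.getValue();
  auto S = Scalar.getValue();
  // A consumer that dereferenced V or S without this check would be the
  // kind of place where an assertion fires on an invalid cost.
  if (!V || !S)
    return false;
  return *V < *S;
}
```

With a finite libcall cost such as 64, the sum `ToyCost(3) + ToyCost(64)` stays valid and comparable, whereas any sum involving `ToyCost::getInvalid()` stays invalid and every consumer has to special-case it.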