Skip to content

Conversation

@MatzeB
Copy link
Contributor

@MatzeB MatzeB commented Oct 29, 2024

Returning invalid instruction costs when converting from/to fp16 in X86TTIImpl::getCastInstrCost when there is no hardware support available was triggering asserts. This changes the code to return a large (arbitrary) number to model the fact that libcalls are used to implement the conversion.

This also simplifies the code by only reporting costs for the scalar fp16 conversion; vectorized costs being left to the fallback assuming scalarization.

This is a follow-up to assertion issues reported for the changes in #113195

@llvmbot
Copy link
Member

llvmbot commented Oct 29, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-x86

Author: Matthias Braun (MatzeB)

Changes

Returning invalid instruction when converting from/to fp16 in X86TTIImpl::getCastInstrCost when there is no hardware support available was triggering asserts. This changes the code to return a large number instead.


Full diff: https://github.com/llvm/llvm-project/pull/114128.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86TargetTransformInfo.cpp (+7-5)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll (+3-8)
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index bae223243b3dc9..ef16636a2ea544 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -3068,6 +3068,13 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
         if (auto KindCost = Entry->Cost[CostKind])
           return *KindCost;
     }
+
+    if ((ISD == ISD::FP_ROUND && SimpleDstTy == MVT::f16) ||
+        (ISD == ISD::FP_EXTEND && SimpleSrcTy == MVT::f16)) {
+      // fp16 conversions not covered yet require a libcall, return a
+      // large (arbitrary) number.
+      return InstructionCost(64);
+    }
   }
 
   // Fall back to legalized types.
@@ -3174,11 +3181,6 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
                             TTI::CastContextHint::None, CostKind);
   }
 
-  if (ISD == ISD::FP_ROUND && LTDest.second.getScalarType() == MVT::f16) {
-    // Conversion requires a libcall.
-    return InstructionCost::getInvalid();
-  }
-
   // TODO: Allow non-throughput costs that aren't binary.
   auto AdjustCost = [&CostKind](InstructionCost Cost,
                                 InstructionCost N = 1) -> InstructionCost {
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll b/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll
index bcea147d724f53..f23043f0c47f4a 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/conversion-fp16.ll
@@ -453,14 +453,9 @@ define void @fpround_v16xf32_v16xf16(ptr %s0, ptr %d0) {
 ;
 ; CHECK-F16C-LABEL: define void @fpround_v16xf32_v16xf16(
 ; CHECK-F16C-SAME: ptr [[S0:%.*]], ptr [[D0:%.*]]) #[[ATTR0]] {
-; CHECK-F16C-NEXT:    [[S8:%.*]] = getelementptr inbounds float, ptr [[S0]], i64 8
-; CHECK-F16C-NEXT:    [[D8:%.*]] = getelementptr inbounds half, ptr [[D0]], i64 8
-; CHECK-F16C-NEXT:    [[TMP1:%.*]] = load <8 x float>, ptr [[S0]], align 4
-; CHECK-F16C-NEXT:    [[TMP2:%.*]] = fptrunc <8 x float> [[TMP1]] to <8 x half>
-; CHECK-F16C-NEXT:    [[TMP3:%.*]] = load <8 x float>, ptr [[S8]], align 4
-; CHECK-F16C-NEXT:    [[TMP4:%.*]] = fptrunc <8 x float> [[TMP3]] to <8 x half>
-; CHECK-F16C-NEXT:    store <8 x half> [[TMP2]], ptr [[D0]], align 2
-; CHECK-F16C-NEXT:    store <8 x half> [[TMP4]], ptr [[D8]], align 2
+; CHECK-F16C-NEXT:    [[TMP1:%.*]] = load <16 x float>, ptr [[S0]], align 4
+; CHECK-F16C-NEXT:    [[TMP2:%.*]] = fptrunc <16 x float> [[TMP1]] to <16 x half>
+; CHECK-F16C-NEXT:    store <16 x half> [[TMP2]], ptr [[D0]], align 2
 ; CHECK-F16C-NEXT:    ret void
 ;
 ; CHECK-AVX512-LABEL: define void @fpround_v16xf32_v16xf16(

Copy link
Contributor

@JoelWee JoelWee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround on this Matthias! This looks like it fixes the failure in jax/.../lax_test :)

if ((ISD == ISD::FP_ROUND && SimpleDstTy == MVT::f16) ||
(ISD == ISD::FP_EXTEND && SimpleSrcTy == MVT::f16)) {
// fp16 conversions not covered by any table entries require a libcall,
// return a large (arbitrary) number.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "return a large (arbitrary) number to model this."

@MatzeB MatzeB force-pushed the no_invalid_f16_conversion_cost branch from eb9f427 to 2e43458 Compare October 29, 2024 21:47
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MatzeB MatzeB merged commit 255e441 into llvm:main Oct 30, 2024
8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 30, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 11 "Add check check-libc-amdgcn-amd-amdhsa".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/7729

Here is the relevant piece of the build log for the reference
Step 11 (Add check check-libc-amdgcn-amd-amdhsa) failure: 1200 seconds without output running [b'ninja', b'-j 32', b'check-libc-amdgcn-amd-amdhsa'], attempting to kill
...
[2561/2681] Linking CXX executable libc/test/src/string/libc.test.src.string.strcpy_test.__hermetic__.__build__
[2562/2681] Linking CXX executable libc/test/src/locale/libc.test.src.locale.localeconv_test.__hermetic__.__build__
[2563/2681] Linking CXX executable libc/test/src/inttypes/libc.test.src.inttypes.strtoumax_test.__hermetic__.__build__
[2564/2681] Linking CXX executable libc/test/src/stdlib/libc.test.src.stdlib.strtof_test.__hermetic__.__build__
[2565/2681] Linking CXX executable libc/test/src/string/libc.test.src.string.strncat_test.__hermetic__.__build__
[2566/2681] Linking CXX executable libc/test/src/string/libc.test.src.string.memmove_test.__hermetic__.__build__
[2567/2681] Linking CXX executable libc/test/src/inttypes/libc.test.src.inttypes.strtoimax_test.__hermetic__.__build__
[2568/2681] Linking CXX executable libc/test/src/time/libc.test.src.time.clock_gettime_test.__hermetic__.__build__
[2569/2681] Linking CXX executable libc/test/src/stdio/libc.test.src.stdio.asprintf_test.__hermetic__.__build__
[2570/2681] Linking CXX executable libc/test/src/stdio/libc.test.src.stdio.vasprintf_test.__hermetic__.__build__
command timed out: 1200 seconds without output running [b'ninja', b'-j 32', b'check-libc-amdgcn-amd-amdhsa'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1470.255292

NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Returning invalid instruction costs when converting from/to fp16 in
`X86TTIImpl::getCastInstrCost` when there is no hardware support
available was triggering asserts. This changes the code to return a
large (arbitrary) number to model the fact that libcalls are used to
implement the conversion.

This also simplifies the code by only reporting costs for the scalar
fp16 conversion; vectorized costs being left to the fallback assuming
scalarization.

This is a follow-up to assertion issues reported for the changes in
llvm#113195
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Aug 13, 2025
Returning invalid instruction costs when converting from/to fp16 in
`X86TTIImpl::getCastInstrCost` when there is no hardware support
available was triggering asserts. This changes the code to return a
large (arbitrary) number to model the fact that libcalls are used to
implement the conversion.

This also simplifies the code by only reporting costs for the scalar
fp16 conversion; vectorized costs being left to the fallback assuming
scalarization.

This is a follow-up to assertion issues reported for the changes in
llvm#113195

upstream commit: 255e441
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants