Commit 24cf336

[BACKEND] Fp8E5M2_to_Bf16 should fallback to the slow path for sm < 90 (#6858)
1 parent 8d99aa1 commit 24cf336

1 file changed: +2 -1 lines changed

third_party/nvidia/lib/TritonNVIDIAGPUToLLVM/ElementwiseOpToLLVM.cpp

Lines changed: 2 additions & 1 deletion
@@ -428,8 +428,9 @@ struct FpToFpOpConversion
          Fp16_to_Fp8E5M2_RTNE(computeCapability >= 89)},
         {{F16TyID, F8E5M2TyID, RoundingMode::RTZ}, Fp16_to_Fp8E5M2_RTZ},
         // F8 -> BF16
+        // mul{.rnd}.bf16 and mul{.rnd}.bf16x2 requires sm_90 or higher.
         {{F8E5M2TyID, BF16TyID, undefRounding},
-         Fp8E5M2_to_Bf16(computeCapability >= 89)},
+         Fp8E5M2_to_Bf16(computeCapability >= 90)},
         {{F8E4M3TyID, BF16TyID, undefRounding},
          Fp8E4M3Nv_to_Bf16(computeCapability >= 89)},
         // BF16 -> F8
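
The effect of the threshold change can be sketched outside the compiler as a capability-gated choice between two code paths. This is a minimal illustration only: `Converter`, `makeFp8E5M2ToBf16`, and `selectPath` are hypothetical names invented here, and the returned labels stand in for the code that the real converter would emit. The only details taken from the commit are the boolean argument and the `>= 90` threshold.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a conversion routine. The real Triton converters
// generate code; here each path just reports which one was chosen.
using Converter = std::function<std::string()>;

// Hypothetical factory mirroring the call shape Fp8E5M2_to_Bf16(bool):
// the flag selects the fast path or the slower fallback.
Converter makeFp8E5M2ToBf16(bool hasNativeBf16Mul) {
  if (hasNativeBf16Mul)
    return [] { return std::string("fast"); };
  return [] { return std::string("slow"); };
}

// Picks the path for a compute capability given as an integer
// (for example 89 for sm_89, 90 for sm_90).
std::string selectPath(int computeCapability) {
  // Threshold from the commit: the fast path needs sm_90 or higher.
  return makeFp8E5M2ToBf16(computeCapability >= 90)();
}
```

Under this sketch, `selectPath(89)` yields `"slow"` and `selectPath(90)` yields `"fast"`. With the previous `>= 89` condition, sm_89 would have taken the fast path even though, per the added comment, the bf16 `mul` instructions need sm_90.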