Commit fe5c3cb

Added Conditions of SM90 and ISA7.8 for Using cvt.ftz.f32.bf16 Instruction (#165774)
Updated the conditions for generating the cvt.ftz.f32.bf16 instruction to require sm_90 and PTX ISA 7.8, so that the ftz modifier is only generated when it is supported.

Co-authored-by: Justin Fargnoli <[email protected]>
1 parent 7398591 commit fe5c3cb

2 files changed: +311 -34 lines changed
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td

Lines changed: 1 addition & 1 deletion
@@ -2267,7 +2267,7 @@ def : Pat<(f32 (fpround f64:$a)), (CVT_f32_f64 $a, CvtRN)>;
 def : Pat<(f32 (fpextend f16:$a)), (CVT_f32_f16 $a, CvtNONE_FTZ)>, Requires<[doF32FTZ]>;
 def : Pat<(f32 (fpextend f16:$a)), (CVT_f32_f16 $a, CvtNONE)>;
 // fpextend bf16 -> f32
-def : Pat<(f32 (fpextend bf16:$a)), (CVT_f32_bf16 $a, CvtNONE_FTZ)>, Requires<[doF32FTZ]>;
+def : Pat<(f32 (fpextend bf16:$a)), (CVT_f32_bf16 $a, CvtNONE_FTZ)>, Requires<[doF32FTZ, hasPTX<78>, hasSM<90>]>;
 def : Pat<(f32 (fpextend bf16:$a)), (CVT_f32_bf16 $a, CvtNONE)>, Requires<[hasPTX<71>, hasSM<80>]>;
 
 // fpextend f16 -> f64
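
The effect of the two bf16 -> f32 patterns in this hunk can be sketched as a small standalone C++ model. This is not LLVM code: the enum `Bf16ExtendCvt` and the function `selectBf16ExtendCvt` are hypothetical names introduced for illustration. The sketch assumes that `hasPTX<N>` and `hasSM<N>` mean "version at least N", and that the FTZ pattern is tried before the plain one. What happens when neither pattern applies is outside this hunk, so the model only reports that case as `NoMatch`.

```cpp
// Hypothetical model of the predicates in this hunk; not part of LLVM.
enum class Bf16ExtendCvt {
  NoneFtz, // stands for (CVT_f32_bf16 $a, CvtNONE_FTZ), i.e. cvt.ftz.f32.bf16
  None,    // stands for (CVT_f32_bf16 $a, CvtNONE),     i.e. cvt.f32.bf16
  NoMatch  // neither pattern in this hunk applies
};

// ptxVersion is encoded as major * 10 + minor (78 means PTX ISA 7.8);
// smVersion is the SM number (90 means sm_90).
constexpr Bf16ExtendCvt selectBf16ExtendCvt(bool doF32FTZ,
                                            unsigned ptxVersion,
                                            unsigned smVersion) {
  // After this commit: the FTZ form additionally needs PTX >= 7.8 and SM >= 90.
  if (doF32FTZ && ptxVersion >= 78 && smVersion >= 90)
    return Bf16ExtendCvt::NoneFtz;
  // Unchanged fallback: the plain form needs PTX >= 7.1 and SM >= 80.
  if (ptxVersion >= 71 && smVersion >= 80)
    return Bf16ExtendCvt::None;
  return Bf16ExtendCvt::NoMatch;
}

// FTZ requested on sm_90 with PTX 7.8: the FTZ form is selected.
static_assert(selectBf16ExtendCvt(true, 78, 90) == Bf16ExtendCvt::NoneFtz);
// FTZ requested on sm_80 with PTX 7.8: falls back to the plain form.
static_assert(selectBf16ExtendCvt(true, 78, 80) == Bf16ExtendCvt::None);
```

Before this commit, the first condition in the model would have been just `doF32FTZ`, so a target such as sm_80 with FTZ enabled would have selected the FTZ form even though the commit message says it is not supported there.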
