Commit 82113a4

[LLVM][NVPTX] Remove nonexistent ftz ops (llvm#106100)
According to the PTX [spec](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-max), max & min instructions do not support the `ftz` modifier for `bf16` & `bf16x2` types. This PR removes them from instr info, and the non-ftz legal versions will be emitted instead.
1 parent ecd9e0b commit 82113a4

2 files changed, +349 -13 lines changed


llvm/lib/Target/NVPTX/NVPTXInstrInfo.td

Lines changed: 0 additions & 13 deletions
@@ -334,25 +334,12 @@ multiclass FMINIMUMMAXIMUM<string OpcStr, bit NaN, SDNode OpNode> {
               !strconcat(OpcStr, ".f16x2 \t$dst, $a, $b;"),
               [(set Int32Regs:$dst, (OpNode (v2f16 Int32Regs:$a), (v2f16 Int32Regs:$b)))]>,
               Requires<[useFP16Math, hasSM<80>, hasPTX<70>]>;
-   def bf16rr_ftz :
-     NVPTXInst<(outs Int16Regs:$dst),
-               (ins Int16Regs:$a, Int16Regs:$b),
-               !strconcat(OpcStr, ".ftz.bf16 \t$dst, $a, $b;"),
-               [(set Int16Regs:$dst, (OpNode (bf16 Int16Regs:$a), (bf16 Int16Regs:$b)))]>,
-               Requires<[hasBF16Math, doF32FTZ, hasSM<80>, hasPTX<70>]>;
    def bf16rr :
      NVPTXInst<(outs Int16Regs:$dst),
                (ins Int16Regs:$a, Int16Regs:$b),
                !strconcat(OpcStr, ".bf16 \t$dst, $a, $b;"),
                [(set Int16Regs:$dst, (OpNode (bf16 Int16Regs:$a), (bf16 Int16Regs:$b)))]>,
                Requires<[hasBF16Math, hasSM<80>, hasPTX<70>]>;
-
-   def bf16x2rr_ftz :
-     NVPTXInst<(outs Int32Regs:$dst),
-               (ins Int32Regs:$a, Int32Regs:$b),
-               !strconcat(OpcStr, ".ftz.bf16x2 \t$dst, $a, $b;"),
-               [(set Int32Regs:$dst, (OpNode (v2bf16 Int32Regs:$a), (v2bf16 Int32Regs:$b)))]>,
-               Requires<[hasBF16Math, hasSM<80>, hasPTX<70>, doF32FTZ]>;
    def bf16x2rr :
      NVPTXInst<(outs Int32Regs:$dst),
                (ins Int32Regs:$a, Int32Regs:$b),
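
To make the effect of the removal concrete, here is a rough toy model in Python of how predicate-gated variants could resolve to an assembly mnemonic. It is only a sketch, not LLVM's actual instruction selector: the `Variant` class, the `select` helper, and the first-match-wins ordering are all assumptions made for illustration, and the `hasSM<80>` / `hasPTX<70>` predicates are left out for brevity. Only the record names, asm suffixes, and the `hasBF16Math` / `doF32FTZ` predicate names come from the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical stand-in for one `def` record inside the multiclass."""
    name: str             # record name from the diff, e.g. "bf16rr"
    asm_suffix: str       # text appended to the opcode string
    requires: frozenset   # predicate names that must all be active


# Scalar bf16 variants as they appear before the change.
BEFORE = [
    Variant("bf16rr_ftz", ".ftz.bf16", frozenset({"hasBF16Math", "doF32FTZ"})),
    Variant("bf16rr", ".bf16", frozenset({"hasBF16Math"})),
]

# After the change the `_ftz` record is gone.
AFTER = [v for v in BEFORE if not v.name.endswith("_ftz")]


def select(variants, opc, active):
    """Return the mnemonic of the first variant whose predicates all hold.

    First-match-wins is an assumption of this toy model only.
    """
    for v in variants:
        if v.requires <= active:
            return opc + v.asm_suffix
    return None


# A function compiled with f32 flush-to-zero enabled.
ftz_on = frozenset({"hasBF16Math", "doF32FTZ"})

print(select(BEFORE, "max", ftz_on))  # max.ftz.bf16 (rejected by the PTX spec)
print(select(AFTER, "max", ftz_on))   # max.bf16
```

In this model, the only way to stop the `.ftz.bf16` spelling from being chosen when FTZ is on is to delete the record, which matches what the diff does.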
