Commit ddacd46
[AMD] Clamp Results in Downcasting to FP8E4M3 and FP8E5M2 (#7337)
There are several conversion ops on the NV side using `satfinite` mode,
but on the AMD side, some of those are in non-saturation mode. We need
to align AMD ops with NV.
For example, fp32 to OCP fp8 on MI350 is lowered to
`ROCDL::CvtScaleF32PkFp8F32Op` and eventually to
`v_cvt_scalef32_pk_fp8_f32`, which, according to the ISA, is in
non-saturation mode. On the NV side, the same conversion is lowered to
`cvt.rn.satfinite.e4m3x2.f32`, which is in saturation mode.
The affected conversions are:
| Conversion | ROCDL dialect | Instruction |
| ----------------- | ----------------------------- | -------------------------- |
| fp32 to fp8e4m3fn | ROCDL::CvtScaleF32PkFp8F32Op | v_cvt_scalef32_pk_fp8_f32 |
| fp32 to fp8e5m2 | ROCDL::CvtScaleF32PkBf8F32Op | v_cvt_scalef32_pk_bf8_f32 |
| fp16 to fp8e4m3fn | ROCDL::CvtScaleF32PkFp8F16Op | v_cvt_scalef32_pk_fp8_f16 |
| fp16 to fp8e5m2 | ROCDL::CvtScaleF32PkBf8F16Op | v_cvt_scalef32_pk_bf8_f16 |
| bf16 to fp8e4m3fn | ROCDL::CvtScaleF32PkFp8Bf16Op | v_cvt_scalef32_pk_fp8_bf16 |
| bf16 to fp8e5m2 | ROCDL::CvtScaleF32PkBf8Bf16Op | v_cvt_scalef32_pk_bf8_bf16 |
This PR fixes the issue by enabling the `FP16_OVFL` flag in the MODE
register before these conversion instructions.
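The difference between the two modes can be sketched in plain Python. This is a simplified model, not the code from this PR: the helper name `downcast_range` is hypothetical, rounding is ignored, and the non-saturating overflow results (infinity for fp8e5m2, NaN for fp8e4m3fn, which has no infinity encoding) are assumptions based on the formats' encodings rather than on the AMD ISA text. The saturating branch follows the `satfinite` behavior of clamping out-of-range magnitudes to the largest finite value.

```python
import math

# Largest finite magnitudes of the two OCP FP8 formats.
FP8_MAX = {"fp8e4m3fn": 448.0, "fp8e5m2": 57344.0}


def downcast_range(x: float, fmt: str, saturate: bool) -> float:
    """Model only the out-of-range behavior of a downcast (rounding is ignored)."""
    limit = FP8_MAX[fmt]
    if math.isnan(x) or abs(x) <= limit:
        return x  # NaN and in-range values do not depend on the mode
    if saturate:
        # Saturation mode: clamp to the largest finite value, keeping the sign.
        return math.copysign(limit, x)
    # Non-saturation mode (assumed): fp8e5m2 overflows to infinity, while
    # fp8e4m3fn has no infinity encoding, so the overflow becomes NaN.
    return math.copysign(math.inf, x) if fmt == "fp8e5m2" else math.nan
```

Under this model, `downcast_range(1000.0, "fp8e4m3fn", saturate=True)` returns `448.0`, while the same call with `saturate=False` returns NaN, which is the kind of mismatch between the two backends that the clamping removes.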
---------
Co-authored-by: ravil-mobile <[email protected]>
1 parent 677a30c commit ddacd46
File tree
3 files changed, +86 -3 lines changed
- python
  - test/unit/language
  - triton_kernels/tests
- third_party/amd/lib/TritonAMDGPUToLLVM