Commit cba8b9d
[release/2.8][ROCm][inductor] Improved fast_tanh code generation (#2803)
In the ROCm fork of PyTorch 2.8, Inductor already has codegen support
for fast_tanhf. However, the original Triton implementation of
fast_tanhf had some NaN issues.
Upstream Triton now has an improved fast_tanhf that fixes these NaN
issues. The upstream commit has been backported to the ROCm fork of
Triton (see code comments).
A bump in the Triton commit is also needed.
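The commit message does not say what caused the NaNs. As an illustration only, one common way a fast tanh can produce NaN is overflow in an exponential-based formula. The sketch below is plain Python, not the Triton or Inductor code; the helper names `exp_or_inf`, `naive_tanh`, and `stable_tanh` are hypothetical and exist only for this example.

```python
import math


def exp_or_inf(x: float) -> float:
    """exp(x) that saturates to +inf on overflow, mimicking IEEE-754 hardware exp.

    Python's math.exp raises OverflowError instead, so we convert it here.
    """
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def naive_tanh(x: float) -> float:
    # (e^{2x} - 1) / (e^{2x} + 1): for large x this becomes inf / inf = NaN.
    e2x = exp_or_inf(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)


def stable_tanh(x: float) -> float:
    # exp(-2|x|) lies in (0, 1], so it cannot overflow; restore the sign afterwards.
    t = math.exp(-2.0 * abs(x))
    return math.copysign((1.0 - t) / (1.0 + t), x)


if __name__ == "__main__":
    for value in (-1000.0, 0.5, 1000.0):
        print(value, naive_tanh(value), stable_tanh(value), math.tanh(value))
```

Rewriting the formula around exp(-2|x|) is one standard way to avoid the inf / inf case. Whether the upstream Triton fix uses this approach is an assumption that the commit message does not confirm.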
Other notes:
- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.4 backport of upstream Triton commit: ROCm/triton#900
- Similar to #2802, #2804
- Related to pytorch#162052
1 parent 63e525b commit cba8b9d
File tree
2 files changed: +8 -3 lines changed
- .ci/docker/ci_commit_pins
- torch/_inductor/codegen
Diff (hunk positions only; line contents are not available):
- First file: line 1 replaced (+1 -1).
- Second file: line 29 replaced (+1 -1); line 1235 replaced by new lines 1235-1240 (+6 -1).