Commit 9dc9120
[release/2.7][ROCm][inductor] Improved fast_tanh code generation (#2802)
In the ROCm fork of PyTorch 2.7, Inductor already has codegen support
for fast_tanhf. However, it is guarded by the
`TORCHINDUCTOR_USE_FAST_MATH` environment variable because of NaN
issues in the original Triton implementation of fast_tanhf.
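
For readers unfamiliar with this kind of guard, here is a minimal sketch of an environment-variable gate on a codegen path. It is not the Inductor code: the helper names `use_fast_math` and `tanh_codegen`, the `"1"` convention for enabling the flag, and the emitted strings are all assumptions for illustration.

```python
import os


def use_fast_math(environ=os.environ):
    # Assumption: the flag counts as enabled only when set to exactly "1".
    return environ.get("TORCHINDUCTOR_USE_FAST_MATH", "0") == "1"


def tanh_codegen(operand, environ=os.environ):
    # Hypothetical codegen helper: returns the source text of the call
    # that would be emitted for tanh(operand).
    if use_fast_math(environ):
        return f"libdevice.fast_tanhf({operand})"
    return f"libdevice.tanh({operand})"
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.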
Upstream Triton now has an improved fast_tanhf that fixes the NaN
issues. This upstream commit has been backported to the ROCm fork of
Triton (see code comments).
Thus, I have removed the conditionalization on Triton versions as well.
A bump in the Triton commit pin is also needed.
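
As background on how a fast tanh can produce NaN at all, the sketch below shows one common failure mode: overflow in `exp(2x)` turns the textbook ratio into `inf / inf`. This is an assumed illustration in plain Python, not the actual Triton kernel, and the commit message does not say that this was the specific cause of the NaN issues. The helper names are hypothetical.

```python
import math


def _exp(v):
    # Mimic IEEE hardware behavior: overflow yields +inf instead of raising.
    try:
        return math.exp(v)
    except OverflowError:
        return math.inf


def naive_tanh(x):
    # Textbook form: (e^{2x} - 1) / (e^{2x} + 1).
    # For large positive x this evaluates inf / inf, which is NaN.
    e = _exp(2.0 * x)
    return (e - 1.0) / (e + 1.0)


def stable_tanh(x):
    # Rewritten form: sign(x) * (1 - 2 / (e^{2|x|} + 1)).
    # When the exponential overflows, 2 / inf is 0 and the result saturates to 1.
    e = _exp(2.0 * abs(x))
    return math.copysign(1.0 - 2.0 / (e + 1.0), x)
```

With these definitions, `naive_tanh(1000.0)` is NaN while `stable_tanh(1000.0)` saturates to 1.0, and both agree with `math.tanh` for moderate inputs.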
Other notes:
- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.3 backport of upstream Triton commit: ROCm/triton#902
- Similar to #2803, #2804
- Related to pytorch#162052

1 parent e311287 commit 9dc9120
File tree
2 files changed, +5 -6 lines changed
- .ci/docker/ci_commit_pins
- torch/_inductor/codegen

Diff contents were not preserved. The first hunk replaces line 1; the second replaces original lines 1220-1224 with new lines 1220-1223.