[release/2.7][ROCm][inductor] Improved fast_tanh code generation #2802
Conversation
Jenkins build for 1b1fde5fcc342c2c0d3c69bf95a91501fc39b324 commit finished as FAILURE

I have confirmed that it resolves the reproducer in the Jira.
torch/_inductor/codegen/triton.py
Outdated
        return f"libdevice.tanh({x})"
    # On ROCm, always use fast_tanhf
    # Requires ROCm fork of Triton 3.3, 3.4, 3.5 or upstream Triton 3.6+
    if torch.version.hip:
2.7 uses Triton 3.3, IIUC.
We should at least support Triton 3.2 in 2.7, so let's conditionalise on triton >= (3, 3) if 3.3 supports this.
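To make the suggestion concrete, here is a minimal sketch of such a version gate. The helper name, the tuple-based version argument, and the `fast_tanhf(...)` spelling are all illustrative assumptions for this sketch, not the actual Inductor code:

```python
def tanh_codegen(x: str, is_hip: bool, triton_version: tuple) -> str:
    # Illustrative sketch only: emit the fast path only on ROCm with a
    # new-enough Triton; otherwise fall back to libdevice.tanh.
    # "fast_tanhf(...)" is a placeholder spelling, not the real intrinsic.
    if is_hip and triton_version >= (3, 3):
        return f"fast_tanhf({x})"
    return f"libdevice.tanh({x})"
```

With this shape, an older Triton (e.g. 3.2) or a non-ROCm build falls through to the existing `libdevice.tanh` codegen.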
Jenkins build for f416c7119ad1443bf022a37a8f3f21b201aa4bbc commit finished as FAILURE
In the ROCm fork of PyTorch 2.8, Inductor already has codegen support for fast_tanhf. However, the original Triton implementation of fast_tanhf had NaN issues. Upstream Triton has an improved fast_tanhf in which the NaN issues are fixed, and that commit has been backported to the ROCm fork of Triton (see code comments). A bump of the Triton commit is also needed.

Other notes:
- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.4 backport of upstream Triton commit ROCm/triton#900
- Similar to #2802, #2804
- Related to pytorch#162052
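The kind of edge case a "fast" tanh can get wrong is easy to illustrate with a naive exp-based formulation in plain Python; this is only an analogy, and the actual Triton fast_tanhf implementation and its NaN bug differ in the details:

```python
import math

def naive_fast_tanh(x: float) -> float:
    # Illustrative only: tanh(x) = (e^(2x) - 1) / (e^(2x) + 1).
    # For large |x| the intermediate exp() overflows, which is the style
    # of edge case (inf/NaN at extreme inputs) that a careless fast tanh
    # can mishandle; here we saturate to +/-1 instead.
    try:
        e2x = math.exp(2.0 * x)
    except OverflowError:
        return math.copysign(1.0, x)
    return (e2x - 1.0) / (e2x + 1.0)
```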
In the ROCm fork of PyTorch 2.9, Inductor already has codegen support for fast_tanhf. However, the original Triton implementation of fast_tanhf had NaN issues. Upstream Triton has an improved fast_tanhf in which the NaN issues are fixed, and that commit has been backported to the ROCm fork of Triton (see code comments). A bump of the Triton commit is also needed.

Other notes:
- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.5 backport of upstream Triton commit ROCm/triton#901
- Similar to #2802, #2803
- Related to pytorch#162052
In the ROCm fork of PyTorch 2.7, Inductor already has codegen support for fast_tanhf. However, it is currently guarded by the TORCHINDUCTOR_USE_FAST_MATH environment variable due to NaN issues in the original Triton implementation of fast_tanhf. Upstream Triton has an improved fast_tanhf in which the NaN issues are fixed, and that commit has been backported to the ROCm fork of Triton (see code comments).
Thus, I have also removed the conditionalization on Triton versions. A bump of the Triton commit is also needed.
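The environment-variable guard described above can be sketched as a simple check; the accepted value ("1") here is an assumption for illustration, not necessarily how Inductor parses the variable:

```python
import os

def use_fast_math() -> bool:
    # Illustrative sketch of gating fast_tanhf codegen behind the
    # TORCHINDUCTOR_USE_FAST_MATH environment variable; treating "1"
    # as enabled is an assumption of this sketch.
    return os.environ.get("TORCHINDUCTOR_USE_FAST_MATH", "0") == "1"
```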
Other notes: