@amd-sriram amd-sriram commented Nov 17, 2025

Fixes #SWDEV-565427, github issue

Motivation

Apex fails to build because the HIP version reported by hipcc --version does not match torch.version.hip.

torch.version.hip changed because of a recent commit made to fix an issue. That commit stores the ROCm version in torch.version.hip.

The solution is to fix torch.version.hip so that it uses the hipcc header values with the trailing hash removed. In addition, a torch.version.rocm variable is created to store the ROCm version.

Technical Details

Fix torch.version.hip

The HIP_VERSION variable is computed in https://github.com/ROCm/hip/blob/develop/cmake/FindHIP.cmake, which runs hipcc --version and extracts the value from the HIP version line.

e.g.

hipcc --version 
HIP version: 7.1.25421-32f9fa6ca5 

In recent Docker images, the HIP_VERSION variable contains the hash at the end.

For torch.version.hip to be parsable by the packaging library, it must not contain the hash.

import torch  
from packaging import version  
print(version.parse(torch.version.hip)) 

torch.version.hip is a variable defined in torch/version.py, which is generated by tools/generate_torch_version.py. That script is invoked during the installation process from torch/CMakeLists.txt.

Before the revert, torch.version.hip was based on the HIP_VERSION variable, so for torch.version.hip to be parsable, HIP_VERSION must be parsable as well.

The extra code in this change removes the trailing hash from the HIP_VERSION variable so that torch.version.hip can be parsed by the packaging version.parse method.
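As an illustration of the transformation described here, the following is a minimal Python sketch, not the code from this PR (the actual change lives in the build scripts). The helper name strip_hip_hash and the regular expression are assumptions made for this example. It assumes the version value has the shape shown in the hipcc output above, that is, a dotted numeric part optionally followed by "-" and a hash.

```python
import re

# Hypothetical helper, not part of the PR: keeps the leading dotted-numeric
# part of a HIP version value and drops an optional trailing "-<hash>".
_HIP_VERSION_RE = re.compile(r"^\s*(\d+(?:\.\d+)*)(?:-[0-9A-Za-z]+)?\s*$")


def strip_hip_hash(raw: str) -> str:
    """Return the HIP version without the trailing hash suffix."""
    match = _HIP_VERSION_RE.match(raw)
    if match is None:
        # Fail loudly instead of guessing at an unknown format.
        raise ValueError(f"unrecognized HIP version string: {raw!r}")
    return match.group(1)


print(strip_hip_hash("7.1.25421-32f9fa6ca5"))  # 7.1.25421
print(strip_hip_hash("7.1.25421"))             # 7.1.25421
```

A value that already lacks the hash passes through unchanged, so the same step should be safe on older images where HIP_VERSION has no suffix.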

Add torch.version.rocm

Code changes:

  • Add rocm variable to torch/version.py.tpl
  • Add code to write rocm variable in tools/generate_torch_version.py
  • Write rocm version in installation process - torch/CMakeLists.txt
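The three steps above amount to rendering one more field into the generated version module. The sketch below shows that idea in isolation. The template text, placeholder names, and the function render_version_module are assumptions for illustration and do not reproduce the real torch/version.py.tpl or tools/generate_torch_version.py.

```python
from string import Template

# Hypothetical template text; the real torch/version.py.tpl may differ.
VERSION_TEMPLATE = Template(
    "hip = $hip\n"
    "rocm = $rocm\n"
)


def render_version_module(hip_version, rocm_version):
    """Render the version fields as Python source.

    repr() writes strings as quoted literals and leaves None as None,
    so a build without ROCm would still produce valid Python.
    """
    return VERSION_TEMPLATE.substitute(
        hip=repr(hip_version),
        rocm=repr(rocm_version),
    )


# Sample values taken from the Testing section of this PR.
rendered = render_version_module("7.1.25421", "7.2.0")
print(rendered)
```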

Create a unit test to check that both torch.version.hip and torch.version.rocm are parsable.
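A rough sketch of what such a test could look like follows. It is not the test added in this PR. To keep it runnable without PyTorch or packaging installed, it checks sample strings with a simplified regular expression as a stand-in for packaging's version.parse. The names is_plain_release and TestVersionStrings are hypothetical, and a real test would read torch.version.hip and torch.version.rocm instead of the hard-coded samples.

```python
import re
import unittest

# Simplified stand-in for packaging's version.parse: accepts only dotted
# numeric release strings such as "7.1.25421".
_RELEASE_RE = re.compile(r"\d+(?:\.\d+)*")


def is_plain_release(value) -> bool:
    """Return True if value is a string made only of dot-separated integers."""
    return isinstance(value, str) and _RELEASE_RE.fullmatch(value) is not None


class TestVersionStrings(unittest.TestCase):
    # Sample values copied from the Testing section; a real test would use
    # torch.version.hip and torch.version.rocm here.
    SAMPLES = {"hip": "7.1.25421", "rocm": "7.2.0"}

    def test_samples_are_plain_releases(self):
        for name, value in self.SAMPLES.items():
            with self.subTest(name=name):
                self.assertTrue(is_plain_release(value))

    def test_hash_suffix_is_rejected(self):
        # The raw hipcc value with the trailing hash must not pass.
        self.assertFalse(is_plain_release("7.1.25421-32f9fa6ca5"))
```

Saved to a file, the sketch can be run with python -m unittest.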

Testing

Tested on docker registry-sc-harbor.amd.com/framework/compute-rocm-dkms-no-npi-hipclang:16831_ubuntu24.04_py3.12_pytorch_rocm7.1_internal_testing_5fc1aeaa

Successfully built PyTorch and Apex. Also tested the torch.version.hip parsing code shown above.

>>> import torch
>>> torch.version.hip
'7.1.25421'
>>> torch.version.rocm
'7.2.0'

rocm-repo-management-api bot commented Nov 17, 2025

Jenkins build for 36f944a73a15ecc42def83abcbb5a8b993113ad4 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@naromero77amd naromero77amd left a comment

I would recommend that we create a unit test for this API:
version.parse(torch.version.hip)
to enforce the restrictions that are needed by Apex.

@amd-sriram amd-sriram marked this pull request as draft November 17, 2025 22:02

rocm-repo-management-api bot commented Nov 18, 2025

Jenkins build for d2170fdd152ef799a9623ae12314211235d2b44b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

pruthvistony (Collaborator) commented Nov 18, 2025

This change shouldn't be pushed to the develop branch; it should be pushed to upstream/main, and the IFU should then bring it into the develop branch.
