Commit 72cf48e

pytorchbot and nWEIdia authored
[AARCH64][CD][CUDA13][Triton][PTXAS] Turn on BUILD_BUNDLE_PTXAS=1 (pytorch#164236)
[AARCH64][CD][CUDA13][Triton][PTXAS] Turn on BUILD_BUNDLE_PTXAS=1 (pytorch#163988)

See also pytorch#163972, which was intended to be this PR.

Triton (release/3.5.x) ships the CUDA 12.8 ptxas by default. This PR tries to bundle a ptxas for CUDA 13, which should help with pytorch#163801 when users run on new devices such as THOR and Spark.

Fixes pytorch#163801

Test Plan:
1. Check the binary size increase against nightly or v2.9 RC.
2. Install the binary on a working THOR machine and a GB200/GH100 machine (reproduce the original issue on THOR first), then install the binary built from this PR. The issue is expected to be gone without any additional user setting. Testing on GB200 is to ensure there is no regression.

Reference: pytorch#119750 and pytorch/builder@5c814e2

Note: With this PR, torch.compile in PyTorch is expected to find ptxas through "_set_triton_ptxas_path" in "torch/_inductor/runtime/compile_tasks.py". Use cases that do not go through "_set_triton_ptxas_path" may not be able to use the CUDA 13 ptxas binary.

However, as is, Triton does not know that this new CUDA 13 ptxas exists. So if a user assumes that pytorch/bin/ptxas is already there and deletes the ptxas shipped with Triton, then https://github.com/triton-lang/triton/blob/c6ad34f7eb42630533412d93ca2cc00a4b4f8f3c/python/triton/knobs.py#L216 would still complain that ptxas was not found, because it does not know the new one is available.

Pull Request resolved: pytorch#163988
Approved by: https://github.com/atalman

(cherry picked from commit 3b4ad4a)

Co-authored-by: Wei Wang <[email protected]>
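The lookup order described in the note can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in compile_tasks.py: `resolve_ptxas` and its `torch_root` and `env` parameters are hypothetical names. The only behavior taken from this commit is that an existing TRITON_PTXAS_PATH wins and that the bundled binary is looked up at `bin/ptxas` under the torch package directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_ptxas(torch_root: Path, env: Mapping[str, str]) -> Optional[str]:
    """Hypothetical helper: return the ptxas path that would be chosen, or None."""
    # A user-provided override always takes precedence.
    override = env.get("TRITON_PTXAS_PATH")
    if override is not None:
        return override
    # Otherwise look for a bundled binary at <torch_root>/bin/ptxas.
    candidate = torch_root / "bin" / "ptxas"
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return str(candidate)
    # Nothing usable was found; the caller is left with Triton's own ptxas.
    return None
```

Passing the environment in as a mapping keeps the sketch free of side effects, which makes the precedence easy to check in isolation.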
1 parent a21a4bf commit 72cf48e

2 files changed, +3 -1 lines changed
.ci/aarch64_linux/aarch64_ci_build.sh

Lines changed: 2 additions & 0 deletions
@@ -15,6 +15,8 @@ fi
 # Compress the fatbin with -compress-mode=size for CUDA 13
 if [[ "$DESIRED_CUDA" == *"13"* ]]; then
     export TORCH_NVCC_FLAGS="-compress-mode=size"
+    # Bundle ptxas into the cu13 wheel, see https://github.com/pytorch/pytorch/issues/163801
+    export BUILD_BUNDLE_PTXAS=1
 fi

 SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"

torch/_inductor/runtime/compile_tasks.py

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ def _reload_python_module(
 def _set_triton_ptxas_path() -> None:
     if os.environ.get("TRITON_PTXAS_PATH") is not None:
         return
-    ptxas = Path(__file__).absolute().parents[1] / "bin" / "ptxas"
+    ptxas = Path(__file__).absolute().parents[2] / "bin" / "ptxas"
     if not ptxas.exists():
         return
     if ptxas.is_file() and os.access(ptxas, os.X_OK):
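
To see why the index moves from 1 to 2, it helps to evaluate both expressions on a concrete path. The snippet below is a small sketch using only pathlib; the `/site-packages` prefix is a hypothetical install location chosen for illustration, and `old_candidate` and `new_candidate` are illustrative names.

```python
from pathlib import PurePosixPath

# Hypothetical install location of the module, for illustration only.
module = PurePosixPath("/site-packages/torch/_inductor/runtime/compile_tasks.py")

# parents[0] is the containing directory (runtime); each index goes one level up.
old_candidate = module.parents[1] / "bin" / "ptxas"
new_candidate = module.parents[2] / "bin" / "ptxas"

print(old_candidate)  # /site-packages/torch/_inductor/bin/ptxas
print(new_candidate)  # /site-packages/torch/bin/ptxas
```

With `parents[1]` the lookup lands inside `torch/_inductor`, while `parents[2]` resolves to the `torch` package directory, which matches the `pytorch/bin/ptxas` location mentioned in the commit message.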