
Commit c269ef6

juntao and claude committed
Add CUDA release variants for x86_64 and aarch64
Add two CUDA 12.6 matrix entries to the release workflow, matching the second-state/qwen3_asr_rs release structure:

- transcribe-linux-x86_64-cuda.zip — CUDA 12.6 libtorch (x86_64)
- transcribe-linux-aarch64-cuda.zip — CUDA 12.6 libtorch (Jetson/Grace)

Each CUDA variant installs cuda-toolkit-12-6 for linking (aarch64 also adds libopenblas-dev), sets -rpath-link and LD_LIBRARY_PATH so the linker resolves CUDA symbols from libtorch, and bundles the CUDA-enabled libtorch/lib/ in the release zip alongside both binaries and vocab.json.

Signed-off-by: Michael Yuan <michael@secondstate.io>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 0848c77 · commit c269ef6

File tree: 1 file changed, +43 −0 lines changed


.github/workflows/release.yml

Lines changed: 43 additions & 0 deletions
```diff
@@ -65,6 +65,15 @@ jobs:
           libtorch-url: >-
             https://github.com/second-state/libtorch-releases/releases/download/v2.7.1/libtorch-cxx11-abi-x86_64-2.7.1.tar.gz
 
+        # ── Linux x86_64 — CUDA 12.6 libtorch ────────────────────────────
+        - os: ubuntu-latest
+          name: Linux x86_64 (CUDA)
+          backend: tch
+          asset-name: transcribe-linux-x86_64-cuda
+          needs-cuda: true
+          libtorch-url: >-
+            https://github.com/second-state/libtorch-releases/releases/download/v2.7.1/libtorch-cxx11-abi-x86_64-cuda12.6-2.7.1.tar.gz
+
         # ── Linux aarch64 — CPU libtorch (SVE256, Graviton3 / Altra) ─────
         - os: ubuntu-24.04-arm
           name: Linux aarch64
@@ -73,6 +82,15 @@ jobs:
           libtorch-url: >-
             https://github.com/second-state/libtorch-releases/releases/download/v2.7.1/libtorch-cxx11-abi-aarch64-2.7.1.tar.gz
 
+        # ── Linux aarch64 — CUDA 12.6 libtorch (Jetson / Grace) ─────────
+        - os: ubuntu-24.04-arm
+          name: Linux aarch64 (CUDA)
+          backend: tch
+          asset-name: transcribe-linux-aarch64-cuda
+          needs-cuda: true
+          libtorch-url: >-
+            https://github.com/second-state/libtorch-releases/releases/download/v2.7.1/libtorch-cxx11-abi-aarch64-cuda12.6-2.7.1.tar.gz
+
         # ── macOS Apple Silicon — MLX backend ────────────────────────────
         - os: macos-15
           name: macOS Apple Silicon
@@ -87,6 +105,23 @@ jobs:
       - name: Install Rust stable
         uses: dtolnay/rust-toolchain@stable
 
+      # ── CUDA toolkit (for linking CUDA-enabled libtorch) ─────────────────
+      - name: Install CUDA toolkit (x86_64)
+        if: matrix.needs-cuda && runner.arch == 'X64'
+        run: |
+          wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
+          sudo dpkg -i cuda-keyring_1.1-1_all.deb
+          sudo apt-get update
+          sudo apt-get install -y cuda-toolkit-12-6
+
+      - name: Install CUDA toolkit and BLAS (aarch64)
+        if: matrix.needs-cuda && runner.arch == 'ARM64'
+        run: |
+          wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/sbsa/cuda-keyring_1.1-1_all.deb
+          sudo dpkg -i cuda-keyring_1.1-1_all.deb
+          sudo apt-get update
+          sudo apt-get install -y cuda-toolkit-12-6 libopenblas-dev
+
       # ── libtorch (Linux) ──────────────────────────────────────────────────
       - name: Download and extract libtorch
         if: matrix.backend == 'tch'
@@ -111,6 +146,14 @@ jobs:
           cmake --build /tmp/mlx-c/build --parallel "$(sysctl -n hw.logicalcpu)"
           sudo cmake --install /tmp/mlx-c/build
 
+      # ── Set linker flags for CUDA builds ──────────────────────────────────
+      - name: Set CUDA linker flags
+        if: matrix.needs-cuda
+        run: |
+          FLAGS="-C link-arg=-Wl,-rpath-link,/usr/local/cuda/lib64"
+          echo "RUSTFLAGS=$FLAGS" >> "$GITHUB_ENV"
+          echo "LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH}" >> "$GITHUB_ENV"
+
       # ── Build ─────────────────────────────────────────────────────────────
       - name: Build release binaries (tch-backend)
         if: matrix.backend == 'tch'
```
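The "Set CUDA linker flags" step above can also be mirrored outside CI when building the tch backend against a CUDA-enabled libtorch. A minimal local sketch, assuming the toolkit landed at the workflow's default path `/usr/local/cuda/lib64` (the `cargo build` invocation is left as a comment; the workflow's actual build step follows this environment setup):

```shell
# Sketch: reproduce the workflow's CUDA linker environment for a local build.
# Assumes cuda-toolkit-12-6 placed its libraries under /usr/local/cuda/lib64,
# as in the CI steps above.

# -rpath-link lets the linker resolve the CUDA libraries that the
# CUDA-enabled libtorch itself depends on at link time.
export RUSTFLAGS="-C link-arg=-Wl,-rpath-link,/usr/local/cuda/lib64"

# LD_LIBRARY_PATH covers the same lookup at run time.
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH}"

# cargo build --release   # then build as usual
```

Note that `-rpath-link` only helps the link step; released zips instead bundle `libtorch/lib/` so end users set `LD_LIBRARY_PATH` rather than needing the toolkit installed at a fixed path.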
