
Commit d83995d

cyang49 authored and njhill committed
Enable CUDA ARCH SM 8.9 for exllama builds
This PR enables the SM 8.9 binary build for the exllama kernels to support the L40S (Ada). As for PyTorch itself, the stock build doesn't include SM 8.9: PyTorch developers state that (1) CUDA automatically runs the SM 8.6 binary on SM 8.9 GPUs, and (2) the shipped CUDA binaries don't include SM 8.9 code. So I think we can keep using the stock pre-built PyTorch package for now.
1 parent: 11b8402 · commit: d83995d
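The PyTorch claim is easy to check at runtime. The snippet below is a minimal sketch, not part of this change, that uses PyTorch's public torch.cuda helpers to list the architectures the installed wheel was built for and the compute capability of the current GPU (expected (8, 9) on an L40S):

    import torch

    # Architectures the installed PyTorch build was compiled for,
    # e.g. ['sm_80', 'sm_86', ...]; sm_89 is typically absent from the stock wheel.
    print("Built-in CUDA archs:", torch.cuda.get_arch_list())

    # Compute capability of the current device; an L40S reports (8, 9).
    print("Device capability:", torch.cuda.get_device_capability())

If sm_89 is absent but an SM 8.6 or PTX target is present, PyTorch's own kernels still run on the L40S via the fallback described in (1).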

File tree: 1 file changed (+2 −2 lines)


Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -220,15 +220,15 @@ FROM python-builder as exllama-kernels-builder
 WORKDIR /usr/src
 
 COPY server/exllama_kernels/ .
-RUN TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX" python setup.py build
+RUN TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX;8.9" python setup.py build
 
 ## Build transformers exllamav2 kernels ########################################
 FROM python-builder as exllamav2-kernels-builder
 
 WORKDIR /usr/src
 
 COPY server/exllamav2_kernels/ .
-RUN TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX" python setup.py build
+RUN TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX;8.9" python setup.py build
 
 ## Flash attention cached build image ##########################################
 FROM base as flash-att-cache
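For context on why prefixing setup.py with TORCH_CUDA_ARCH_LIST works: torch.utils.cpp_extension reads that environment variable and turns each listed architecture into the corresponding -gencode flags passed to nvcc. The setup.py below is a minimal sketch under that assumption; the source file names are hypothetical and the real exllama_kernels setup.py may differ.

    from setuptools import setup
    from torch.utils.cpp_extension import BuildExtension, CUDAExtension

    # TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX;8.9" (set in the Dockerfile RUN line) is
    # picked up by torch.utils.cpp_extension and translated into nvcc -gencode flags.
    setup(
        name="exllama_kernels",
        ext_modules=[
            CUDAExtension(
                name="exllama_kernels",
                # Hypothetical source list, for illustration only.
                sources=["exllama_ext.cpp", "cuda_func/q4_matmul.cu"],
            )
        ],
        cmdclass={"build_ext": BuildExtension},
    )

With 8.9 added to the list, the built extension carries native Ada binaries instead of relying on the SM 8.6 binary or JIT compilation of the PTX fallback.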
