Commit 35bae11

fix gh200 tests on main (#11246)
Signed-off-by: youkaichao <[email protected]>
1 parent 88a412e commit 35bae11

2 files changed: +3 -6 lines changed
.buildkite/run-gh200-test.sh

Lines changed: 2 additions & 2 deletions
@@ -6,8 +6,8 @@ set -ex
 
 # Try building the docker image
 DOCKER_BUILDKIT=1 docker build . \
- --target test \
- -platform "linux/arm64" \
+ --target vllm-openai \
+ --platform "linux/arm64" \
 -t gh200-test \
 --build-arg max_jobs=66 \
 --build-arg nvcc_threads=2 \

docs/source/serving/deploying_with_docker.rst

Lines changed: 1 addition & 4 deletions
@@ -54,16 +54,13 @@ of PyTorch Nightly and should be considered **experimental**. Using the flag `--
 # Example of building on Nvidia GH200 server. (Memory usage: ~12GB, Build time: ~1475s / ~25 min, Image size: 7.26GB)
 $ DOCKER_BUILDKIT=1 sudo docker build . \
 --target vllm-openai \
- -platform "linux/arm64" \
+ --platform "linux/arm64" \
 -t vllm/vllm-gh200-openai:latest \
 --build-arg max_jobs=66 \
 --build-arg nvcc_threads=2 \
 --build-arg torch_cuda_arch_list="9.0+PTX" \
 --build-arg vllm_fa_cmake_gpu_arches="90-real"
 
-
-
-
 To run vLLM:
 
 .. code-block:: console
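
Both hunks fix the same typo: `-platform` was written with one dash where the long option `--platform` was intended. As a rough sketch only, a check like the one below could flag that kind of slip before it reaches CI. The helper `find_single_dash_long_opts` is hypothetical and not part of this commit. It is a plain `grep` heuristic, so it will also flag combined short flags such as `set -ex`, and its hits need a manual look.

```shell
#!/bin/sh
# Hypothetical lint helper (an assumption for illustration, not part of the
# commit): print every line that passes a multi-letter option with a single
# dash, such as `-platform`. Reads the given files, or stdin if none are given.
find_single_dash_long_opts() {
    # Match start-of-line or whitespace, one dash, then two or more
    # letters/dashes. `--platform` does not match because the character after
    # the first dash is another dash; `-t` does not match because it is a
    # single letter.
    grep -nE '(^|[[:space:]])-[a-z][a-z-]+' "$@"
}

# Demo on the three option lines from the old version of the script.
printf '%s\n' \
    '--target test \' \
    '-platform "linux/arm64" \' \
    '-t gh200-test \' | find_single_dash_long_opts
# prints: 2:-platform "linux/arm64" \
```

Run against the corrected lines, the helper prints nothing and returns a non-zero status, because `grep` found no match.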
