Supports a wide range of validated models, including the LLaMa, Mistral, and Qwen families.
## How to Use
### 0. Clone the Repository
Before proceeding with any of the steps below, make sure to clone the vLLM fork repository and navigate to the `.cd` directory. This ensures you have all necessary files and scripts for running the server or benchmarks.
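
A minimal sketch of this step is below. The fork URL is an assumption based on the Gaudi context; substitute the actual vLLM fork you are working with.

```bash
# Clone the vLLM fork (URL is an assumption -- use your actual fork)
git clone https://github.com/HabanaAI/vllm-fork.git

# Navigate to the .cd directory, which holds the server and benchmark scripts
cd vllm-fork/.cd
```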
### 1. Start the Server with Docker Compose

The recommended and easiest way to start the vLLM server is with Docker Compose. At a minimum, set the following environment variables (a usage sketch follows the list):
- `MODEL` - Select a model from the table above.
- `HF_TOKEN` - Your Hugging Face token (generate one at <https://huggingface.co>).
- `DOCKER_IMAGE` - The vLLM Docker image URL from the Gaudi or a local repository. When using the Gaudi repository, select Docker images with the `vllm-installer*` prefix in the file name.
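
A minimal sketch of launching the server is below, assuming a Compose file is present in the `.cd` directory as the steps above imply. The model name and image URL are illustrative placeholders, not confirmed values.

```bash
# Set the minimum required environment variables (values are placeholders)
export MODEL="meta-llama/Llama-3.1-8B-Instruct"
export HF_TOKEN="<your-hugging-face-token>"
export DOCKER_IMAGE="<registry>/vllm-installer-<version>:latest"

# Launch the server from the .cd directory; -d runs it in the background
docker compose up -d
```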