@@ -3,31 +3,15 @@
 Installation
 ============
 
-vLLM is a Python library that also contains some C++ and CUDA code.
-This additional code requires compilation on the user's machine.
+vLLM is a Python library that also contains pre-compiled C++ and CUDA (11.8) binaries.
 
 Requirements
 ------------
 
 * OS: Linux
-* Python: 3.8 or higher
-* CUDA: 11.0 -- 11.8
+* Python: 3.8 -- 3.11
 * GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, etc.)
 
-.. note::
-    As of now, vLLM does not support CUDA 12.
-    If you are using Hopper or Lovelace GPUs, please use CUDA 11.8 instead of CUDA 12.
-
-.. tip::
-    If you have trouble installing vLLM, we recommend using the NVIDIA PyTorch Docker image.
-
-    .. code-block:: console
-
-        $ # Pull the Docker image with CUDA 11.8.
-        $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
-
-    Inside the Docker container, please execute :code:`pip uninstall torch` before installing vLLM.
-
 Install with pip
 ----------------
 
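The supported-version bullet in the requirements hunk above (Python 3.8 -- 3.11) can be checked programmatically before attempting an install. This is an illustrative sketch — the helper name is ours, not part of vLLM:

```python
import sys


def meets_python_requirement(version_info=None):
    """Check an interpreter version against vLLM's supported range (3.8 -- 3.11).

    `version_info` defaults to the running interpreter's `sys.version_info`.
    """
    info = sys.version_info if version_info is None else version_info
    # Tuple comparison makes the inclusive range check concise.
    return (3, 8) <= tuple(info[:2]) <= (3, 11)
```

A 3.12 interpreter, for example, falls outside the range and would need a separate environment (e.g. the conda environment created in the pip-install hunk below).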
@@ -40,7 +24,7 @@ You can install vLLM using pip:
     $ conda activate myenv
 
     $ # Install vLLM.
-    $ pip install vllm  # This may take 5-10 minutes.
+    $ pip install vllm
 
 
 .. _build_from_source:
@@ -55,3 +39,11 @@ You can also build and install vLLM from source:
     $ git clone https://github.com/vllm-project/vllm.git
     $ cd vllm
     $ pip install -e .  # This may take 5-10 minutes.
+
+.. tip::
+    If you have trouble building vLLM, we recommend using the NVIDIA PyTorch Docker image.
+
+    .. code-block:: console
+
+        $ # Pull the Docker image with CUDA 11.8.
+        $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
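Whichever route is taken — `pip install vllm` or the editable source build — the result can be sanity-checked from Python without loading a model. A minimal sketch (the helper name is ours) that verifies the package resolves in the current environment:

```python
import importlib.util


def is_importable(package):
    """Return True if `package` can be found by the import machinery."""
    return importlib.util.find_spec(package) is not None


# After a successful install, is_importable("vllm") should return True
# in the environment where vLLM was installed.
```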