Commit 02c36e5

Merge pull request #11 from vllm-project/dev_experience
Installing and Developing vLLM with Ease

2 parents: 4577c6a + 70f6a15

_posts/2025-01-10-dev-experience.md

Lines changed: 120 additions & 0 deletions

---
layout: post
title: "Installing and Developing vLLM with Ease"
author: "vLLM Team"
image: /assets/logos/vllm-logo-only-light.png
---

The field of LLM inference is advancing at an unprecedented pace. With new models and features emerging weekly, the traditional software release pipeline often struggles to keep up. At vLLM, we aim to provide more than just a software package. We’re building a system—a trusted, trackable, and participatory ecosystem for LLM inference. This blog post highlights how vLLM enables users to install and develop with ease while staying at the forefront of innovation.

## TL;DR:

* Flexible and fast installation options from stable releases to nightly builds.
* Streamlined development workflow for both Python and C++/CUDA developers.
* Robust version tracking capabilities for production deployments.

## Seamless Installation of vLLM Versions

### Install Released Versions

We periodically release stable versions of vLLM to the [Python Package Index](https://pypi.org/project/vllm/), ensuring users can easily install them using standard Python package managers. For example:

```sh
pip install vllm
```

For those who prefer a faster package manager, [**uv**](https://github.com/astral-sh/uv) has been gaining traction in the vLLM community. After setting up a Python environment with uv, installing vLLM is straightforward:

```sh
uv pip install vllm
```

Refer to the [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu-cuda.html#install-released-versions) for more details on setting up [**uv**](https://github.com/astral-sh/uv). Using a simple server-grade setup (Intel 8th Gen CPU), we observe that [**uv**](https://github.com/astral-sh/uv) is 200x faster than pip:

```sh
# with cached packages, clean virtual environment
$ time pip install vllm
...
pip install vllm 59.09s user 3.82s system 83% cpu 1:15.68 total

# with cached packages, clean virtual environment
$ time uv pip install vllm
...
uv pip install vllm 0.17s user 0.57s system 193% cpu 0.383 total
```

### Install the Latest vLLM from the Main Branch

To meet the community’s need for cutting-edge features and models, we provide nightly wheels for every commit on the main branch.

**Using pip**:

```sh
pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```

Adding `--pre` ensures pip includes pre-release versions in its search.

**Using uv**:

```sh
uv pip install vllm --extra-index-url https://wheels.vllm.ai/nightly
```
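
With either tool, you can confirm that a nightly build was installed by printing the version string; nightly wheels typically carry a development (`.dev`) tag rather than a plain release number, though the exact format may vary by build:

```sh
# print the installed vLLM version; a nightly build usually carries a .dev tag
python -c "import vllm; print(vllm.__version__)"
```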

## Development Made Simple

We understand that an active, engaged developer community is the backbone of innovation. That’s why vLLM offers smooth workflows for developers, regardless of whether they’re modifying Python code or working with kernels.

### Python Developers

For Python developers who need to tweak and test vLLM’s Python code, there is no need to compile kernels. The following setup lets you start developing quickly:

```sh
git clone https://github.com/vllm-project/vllm.git
cd vllm
VLLM_USE_PRECOMPILED=1 pip install -e .
```

The `VLLM_USE_PRECOMPILED=1` flag instructs the installer to use pre-compiled CUDA kernels instead of building them from source, significantly reducing installation time. This is perfect for developers focusing on Python-level features like API improvements, model support, or integration work.

This lightweight process runs efficiently, even on a laptop. Refer to our [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu-cuda.html#python-only-build-without-compilation) for more advanced usage.
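
As a quick sanity check (a minimal sketch, not an official step from the docs), you can verify that Python resolves `vllm` from your working tree rather than from a regular site-packages install:

```sh
# the printed path should point into the cloned repository
python -c "import vllm; print(vllm.__file__)"
```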

### C++/Kernel Developers

For advanced contributors working with C++ code or CUDA kernels, we incorporate a compilation cache to minimize build time and streamline kernel development. Please check our [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu-cuda.html#full-build-with-compilation) for more details.
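
As an illustrative sketch (assuming a Linux machine with the CUDA toolkit already set up; the linked documentation is the authoritative reference), installing `ccache` before building lets repeated builds reuse previously compiled objects:

```sh
# install a compiler cache so unchanged kernels are not recompiled on rebuilds
sudo apt-get install -y ccache    # or: conda install ccache

git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .    # the first full build is slow; later rebuilds hit the cache
```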

## Track Changes with Ease

The fast-evolving nature of LLM inference means interfaces and behaviors are still stabilizing. vLLM has been integrated into many workflows, including [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), [veRL](https://github.com/volcengine/verl), [open_instruct](https://github.com/allenai/open-instruct), [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory), etc. We collaborate with these projects to stabilize interfaces and behaviors for LLM inference. To facilitate the process, we provide powerful tools for these advanced users to track changes across versions.

### Installing a Specific Commit

To simplify tracking and testing, we provide wheels for every commit on the main branch. Users can easily install any specific commit, which is particularly useful for bisecting and tracking changes (see the bisect sketch at the end of this section).

We recommend using [**uv**](https://github.com/astral-sh/uv) to install a specific commit:

```sh
# use full commit hash from the main branch
export VLLM_COMMIT=72d9c316d3f6ede485146fe5aabd4e61dbc59069
uv pip install vllm --extra-index-url https://wheels.vllm.ai/${VLLM_COMMIT}
```

In [**uv**](https://github.com/astral-sh/uv), packages in `--extra-index-url` have [higher priority than the default index](https://docs.astral.sh/uv/pip/compatibility/#packages-that-exist-on-multiple-indexes), which makes it possible to install a development version prior to the latest public release (v0.6.6.post1 at the time of writing).

In contrast, pip combines packages from `--extra-index-url` and the default index and chooses only the latest version, which makes it difficult to install a development version prior to the released one. Pip users therefore have to specify the full wheel URL, including a placeholder wheel name, to install a specific commit:

```sh
# use full commit hash from the main branch
export VLLM_COMMIT=33f460b17a54acb3b6cc0b03f4a17876cff5eafd
pip install https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
```
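
Because every main-branch commit has a corresponding wheel, per-commit installation pairs naturally with `git bisect` when hunting for the commit that changed a behavior. The following is a rough, hypothetical sketch; the good/bad commits and `my_repro.py` are placeholders you would supply yourself:

```sh
# hypothetical bisect loop: at each step, install the wheel for the commit under
# test and run your own reproduction script (exit 0 = good, non-zero = bad)
git clone https://github.com/vllm-project/vllm.git && cd vllm
git bisect start <bad-commit> <good-commit>
git bisect run bash -c '
  uv pip install --reinstall vllm \
    --extra-index-url https://wheels.vllm.ai/$(git rev-parse HEAD) &&
  python my_repro.py
'
```

Since the wheels are prebuilt, each bisect step only costs a download instead of a full recompilation.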

## Conclusion

At vLLM, our commitment extends beyond delivering high-performance software. We’re building a system that empowers trust, enables transparent tracking of changes, and invites active participation. Together, we can shape the future of AI, pushing the boundaries of innovation while making it accessible to all.

For collaboration requests or inquiries, reach out at [[email protected]](mailto:[email protected]). Join our growing community on [GitHub](https://github.com/vllm-project/vllm) or connect with us on the [vLLM Slack](https://slack.vllm.ai/). Together, let’s drive AI innovation forward.

## Acknowledgments

We extend our gratitude to the [uv community](https://docs.astral.sh/uv/) — particularly [Charlie Marsh](https://github.com/charliermarsh) — for creating a fast, innovative package manager. Special thanks to [Kevin Luu](https://github.com/khluu) (Anyscale), [Daniele Trifirò](https://github.com/dtrifiro) (Red Hat), and [Michael Goin](https://github.com/mgoin) (Neural Magic) for their invaluable contributions to streamlining workflows. [Kaichao You](https://github.com/youkaichao) and [Simon Mo](https://github.com/simon-mo) from the UC Berkeley team lead these efforts.
