https://x.com/vllm_project/status/1942450223881605593 showed that it's possible to run vLLM in a free-threaded Python interpreter. That involved a lot of custom work to get dependencies to build, and the situation now (4 months later) is a lot better. This is meant as a tracking issue for all the dependencies of vLLM - to start with on Linux x86-64 (CPU and CUDA) - and as a "work list" for getting the remaining issues with those dependencies resolved. The goal is for `uv pip install vllm` to do the right thing out of the box in a clean 3.14t environment.
Python 3.14t is necessary, both because CPython 3.14t itself is much more stable than 3.13t and because there are a number of important packages (e.g., cffi, aiohttp) that do support 3.14t but won't support 3.13t. Of course this depends on vLLM itself supporting Python 3.14 (the default, with-GIL build) first - see gh-26994 for that.
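As a quick sanity check before debugging any wheels, you can ask the interpreter itself whether it is a free-threaded build and whether the GIL is actually disabled at runtime. A minimal sketch (`sys._is_gil_enabled()` exists from CPython 3.13 on; on older versions we assume the GIL is on):

```python
import sys
import sysconfig

# True for a free-threaded (3.13t/3.14t) CPython build
is_ft_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# On 3.13+, reports whether the GIL is disabled in this process;
# older interpreters lack the attribute, so the GIL is necessarily on.
gil_enabled = sys._is_gil_enabled() if hasattr(sys, "_is_gil_enabled") else True

print(f"free-threaded build: {is_ft_build}, GIL enabled: {gil_enabled}")
```

Note that a free-threaded build can still run with the GIL enabled (e.g. via `PYTHON_GIL=1` or when an extension module forces it), which is why both checks are useful.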
This is the dependency graph for vLLM, generated from the default dependencies of the latest release (0.11.0) on PyPI. Packages containing compiled code that have free-threading wheels on PyPI are marked in green, those without in red (easier to browse in a fresh browser tab):

[dependency graph image]
PyTorch 2.9.0 has 3.14/3.14t wheels marked as "preview"; 2.10.0 in January will contain full support.
On Linux x86-64, here are all the packages from the `build`, `cpu` and `common` requirements files that don't cleanly install from PyPI:
- [ ] msgspec: has support in its `main` branch, installs fine from source
- [ ] numba: a release with support is being worked on, see numba#9928
- [ ] opencv-python-headless: no support yet, tracking issue is opencv#27933
  - To build from source: clone the repo, check out opencv-python#1051, ensure the build dependencies in `pyproject.toml` are installed (latest versions, ignore the pins), and then build with `export ENABLE_HEADLESS=1 && uv pip install . --no-build-isolation`
- [ ] outlines-core: builds fine from source with the change in outlines-core#235 for Python 3.14 support
- [ ] llguidance: ~~doesn't build~~ now builds from source on `main` after llguidance#255, needs some fixes (xref llguidance#256)
- [ ] openai-harmony: latest version on PyPI builds fine from source (xref harmony#87 for support/wheels)
- [ ] safetensors: builds from source (`main` branch)
- [ ] tokenizers: latest version on PyPI builds fine from source
- [ ] xgrammar: latest version (0.1.27) builds fine from source
And the CUDA-specific dependencies that don't yet have support:
- [ ] flashinfer-python: to investigate, no upstream issue yet
- [ ] ray: revisit after 3.14 support has landed
- [ ] xformers: no support yet, not expected soon (xref xformers#1345)
All other packages install fine. Note that some dependencies have optional C extensions that aren't yet working, so those packages will currently be pulled in as pure Python (e.g., cbor2 and protobuf). And there are packages with known thread-safety issues (e.g., msgpack, protobuf). Those packages may have wheels or seem to install fine, but do not yet have a check in the "PyPI release" column in https://py-free-threading.github.io/tracking/.
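One way to tell whether such a package was installed with its compiled accelerator or fell back to pure Python is to look for the extension module directly. A sketch for cbor2 (this assumes its accelerator module is named `_cbor2`; the module name differs per package):

```python
import importlib.util

# find_spec returns None when the module can't be located, without
# importing it; if the spec exists, the compiled accelerator is available.
has_c_ext = importlib.util.find_spec("_cbor2") is not None
print(f"cbor2 C extension available: {has_c_ext}")
```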
So as of today, everything except llguidance is installable from PyPI or from source with a modest bit of effort. Something along these lines works for `cpu`:
```shell
uv venv --python=3.14t venv-vllm
source venv-vllm/bin/activate
uv pip install torch torchaudio torchvision triton --torch-backend cpu
# Comment out `torch` in `build.txt`
uv pip install -r requirements/build.txt
# Now install the dependencies in the checklist above one by one
# You need GCC and a Rust compiler installed
# Comment out `llguidance` in `common.txt`; also need to remove some `==` pins
uv pip install -r requirements/common.txt
```
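After working through the list, a quick smoke test that the tricky dependencies actually import can look like this (a sketch; note that import names can differ from PyPI names, e.g. opencv-python-headless imports as `cv2`):

```python
import importlib


def check_imports(mods):
    """Return a dict mapping module name -> True if it imports cleanly."""
    results = {}
    for mod in mods:
        try:
            importlib.import_module(mod)
            results[mod] = True
        except ImportError:
            results[mod] = False
    return results


# Import names for the packages in the checklist above
status = check_imports(["msgspec", "numba", "cv2", "outlines_core",
                        "safetensors", "tokenizers", "xgrammar"])
for mod, ok in status.items():
    print(f"{'ok' if ok else 'MISSING'}: {mod}")
```

This only confirms the modules import; actually exercising them in threads (e.g. under `pytest` with a free-threaded interpreter) is a separate step.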