[Installation]: getting vLLM installed with a free-threaded Python interpreter (3.14t) #28762

@rgommers

Description

https://x.com/vllm_project/status/1942450223881605593 showed that it's possible to run vLLM in a free-threaded Python interpreter. That involved a lot of custom work to get dependencies to build, and the situation now (4 months later) is a lot better. This is meant as a tracking issue for all the dependencies of vLLM - to start with on Linux x86-64 (CPU and CUDA) - and as a "work list" for getting the remaining issues with those dependencies resolved. The goal is for uv pip install vllm to do the right thing out of the box in a clean 3.14t environment.
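
In other words, the desired end state is just:

uv venv --python=3.14t
uv pip install vllm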

Python 3.14t is necessary, both because CPython 3.14t itself is much more stable than 3.13t and because there are a number of important packages (e.g., cffi, aiohttp) that do support 3.14t but won't support 3.13t. Of course this depends on vLLM itself supporting Python 3.14 (the default, with-GIL build) first - see gh-26994 for that.
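
As a quick sanity check that an environment really is free-threaded, CPython 3.13+ exposes sys._is_gil_enabled():

python -c "import sys; print('GIL enabled:', sys._is_gil_enabled())"

On a 3.14t interpreter this should print False. Note that importing an extension module without free-threading support re-enables the GIL at import time (with a RuntimeWarning), so this check is also useful after installing the dependencies below.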

This is the dependency graph for vLLM, generated from the default dependencies for the latest release (0.11.0) on PyPI. Packages with compiled code that have free-threading wheels on PyPI are marked in green, those without in red (easier to browse in a fresh browser tab):

[Image: vLLM dependency graph, free-threading wheel availability marked in green/red]

PyTorch 2.9.0 has 3.14/3.14t wheels marked as "preview"; 2.10.0 in January will contain full support.

On Linux x86-64, here are all the packages from the build, cpu and common requirements files that don't cleanly install from PyPI:

  • msgspec: has support in its main branch, installs fine from source
  • numba: a release with support is being worked on, see numba#9928
  • opencv-python-headless: no support yet, tracking issue is opencv#27933
    • To build from source: clone the repo, check out opencv-python#1051, ensure the build dependencies in pyproject.toml are installed (latest versions, ignore the pins) and then build with export ENABLE_HEADLESS=1 && uv pip install . --no-build-isolation
  • outlines-core: builds fine from source with the change in outlines-core#235 for Python 3.14 support
  • llguidance: now builds from source on main after llguidance#255, but still needs some fixes (xref llguidance#256)
  • openai-harmony: latest version on PyPI builds fine from source (xref harmony#87 for support/wheels)
  • safetensors: builds from source (main branch)
  • tokenizers: latest version on PyPI builds fine from source
  • xgrammar: latest version (0.1.27) builds fine from source (see the sketch after this list for forcing source builds)
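
For the items above, two patterns cover most of the source installs: forcing a build of the latest sdist with --no-binary, and installing straight from a git branch or PR. A sketch only - the repository URLs and refs below are assumptions and will change as the linked fixes land:

# Latest sdists that build cleanly on 3.14t: force a source build
uv pip install --no-binary tokenizers tokenizers
uv pip install --no-binary xgrammar "xgrammar==0.1.27"
uv pip install --no-binary openai-harmony openai-harmony

# Development branches / PRs (URLs and refs are assumptions, adjust as needed)
uv pip install "git+https://github.com/jcrist/msgspec"
uv pip install "git+https://github.com/dottxt-ai/outlines-core@refs/pull/235/head"
uv pip install "git+https://github.com/huggingface/safetensors#subdirectory=bindings/python"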

And the CUDA-specific dependencies that don't yet have support:

  • flashinfer-python: to investigate, no upstream issue yet
  • ray: revisit after 3.14 support has landed
  • xformers: no support yet, not expected soon (xref xformers#1345)

All other packages install fine. Note that there are dependencies with optional C extensions that aren't yet working, so those packages currently get pulled in as pure Python (e.g., cbor2 and protobuf). And there are packages with known thread-safety issues (e.g., msgpack, protobuf). Those packages may have wheels or seem to install fine, but do not yet have a check in the "PyPI release" column in https://py-free-threading.github.io/tracking/.
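
To tell whether the C extension or the pure-Python fallback actually got installed, a couple of debugging-only checks (these rely on internal/unofficial details - protobuf reports its backend via an internal module, and cbor2's C extension lives in _cbor2):

python -c "from google.protobuf.internal import api_implementation; print('protobuf backend:', api_implementation.Type())"  # 'upb' = C extension, 'python' = fallback
python -c "import _cbor2" 2>/dev/null && echo "cbor2: C extension" || echo "cbor2: pure-Python fallback"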

So as of today, everything except llguidance is installable from PyPI or from source with a modest bit of effort. Something along these lines works for CPU:

uv venv --python=3.14t venv-vllm
source venv-vllm/bin/activate
uv pip install torch torchaudio torchvision triton --torch-backend cpu
# Comment out `torch` in `build.txt`
uv pip install -r requirements/build.txt

# Now install the dependencies in the checklist above one by one
# (see the source-build sketch after that list); GCC and a Rust compiler are required

# Comment out `llguidance` in `common.txt`; also loosen or remove some == pins
uv pip install -r requirements/common.txt
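
Once common.txt installs cleanly, a quick smoke test that the compiled dependencies import under the free-threaded interpreter (just a sketch, not a full test):

python -c "import sys, torch, msgspec, safetensors, tokenizers; print('imports OK; GIL enabled:', sys._is_gil_enabled())"

If any of the source-built extensions lack free-threading support, the GIL gets re-enabled at import time (with a RuntimeWarning), so this catches more than plain import errors.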
