Replies: 4 comments
-
Hello, thank you for your contribution. Moved to discussions as this is not exactly an issue. Feel free to update the vllm-for-windows branch with your code changes to generate_kernels.py, and update the README guide if you consider it appropriate.
-
You can actually escape the path here:
-
On another note: trying to build with the pinned PyTorch 2.11 nightly (torch==2.11.0.dev20260216+cu126) + CUDA 12.8 + MSVC 2022, I actually got a CUDA kernel compilation error (fixed by renaming …)
-
cu130 support?
📚 The doc issue
Hello y'all! I was following the README instructions for building from source and stumbled on multiple errors during the process, partly because I have CUDA 12.8, and partly for other reasons you'll see below.
First off, I had to clone a specific branch and change the CUDA version in the PyTorch installation.
Then, all commands ran smoothly until the last one:

```
pip install . --no-build-isolation
```

which gave me a lot of errors, so I needed some trial and error before it worked. Gemini 3 Pro helped me a lot here. Below are all the steps I had to go through to make it work. I always use Command Prompt unless I need PowerShell for some specific commands.
Create venv
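The exact commands for this step weren't preserved above; presumably it was the standard venv setup (the venv name and location are my assumption):

```shell
python -m venv venv
venv\Scripts\activate
```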
Cloned vLLM for Windows v0.11.0 instead of latest
As per @SystemPanic comment here, I used branch v0.11.0 when cloning the repo because this version does not require pytorch 2.8 built from source.
```
git clone --single-branch --branch v0.11.0 https://github.com/SystemPanic/vllm-windows.git
```

Pytorch installation
When installing pytorch, I made sure to use the cu128 versions:
```
pip install torch==2.7.1+cu128 torchaudio==2.7.1+cu128 torchvision==0.22.1+cu128 --index-url https://download.pytorch.org/whl/cu128
```

Visual Studio installation
This one is not related to the CUDA version, but I think it would be nice to add to the README.md file, because it simply says "Visual Studio 2019 or newer is required" without saying what you actually need to install.
This makes sure you'll have all the dependencies needed to compile the package.
Fix NVCC error caused by Visual Studio being too recent
I installed the latest VS version and NVCC was erroring out, so I had to add these env variables:
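The specific variables got lost in the copy above. A commonly used workaround for the NVCC/MSVC version check (my assumption, not necessarily exactly what was used here) is to pass `-allow-unsupported-compiler` to nvcc via its append-flags variable:

```shell
set NVCC_APPEND_FLAGS=-allow-unsupported-compiler
```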
Set env variables
I used these variables, but they will obviously depend on your context and machine, so don't copy them blindly, especially the cuDNN ones, because you may not want to use it or may not have it installed.
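The original variable list was lost in formatting; my reconstruction of the kind of thing that was set (paths are placeholders for a typical CUDA 12.8 install, and the cuDNN entries are omitted) looks like this:

```shell
:: Example paths only; adjust for your machine and skip cuDNN if you don't use it.
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
set CUDA_PATH=%CUDA_HOME%
set PATH=%CUDA_HOME%\bin;%PATH%
```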
Replace any cu126 mention to cu128
I replaced all "cu126" occurrences with "cu128" in all files inside the project folder before running the pip install commands.
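I did the replacement with an editor, but a small script along these lines (my own sketch, not from the original steps) can do the same sweep across the project folder:

```python
import os

def replace_in_tree(root, old, new):
    """Replace every occurrence of `old` with `new` in text files under `root`.

    Files that aren't valid UTF-8 text (e.g. binaries) are skipped.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    text = f.read()
            except (UnicodeDecodeError, OSError):
                continue
            if old in text:
                with open(path, "w", encoding="utf-8") as f:
                    f.write(text.replace(old, new))

# Usage, run from the repo root:
# replace_in_tree(".", "cu126", "cu128")
```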
Enable long paths
I was getting "filename too long" errors when compiling vLLM for Windows, so I had to enable long paths in Windows.
Powershell commands
In PowerShell running as administrator, run these commands:
```
git config --global core.longpaths true
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1
```

Rename vllm folder
I renamed my folder from C:\vllm-windows to C:\v
Edit Marlin "generate_kernels.py"
This file was erroring out because it tries to use the Linux command "rm -f", so I changed it to use Python's os library instead.
Step 1: Open the file in your preferred code editor
C:\v\csrc\quantization\gptq_marlin\generate_kernels.py
Step 2: Edit line 54
From:

```python
subprocess.call(["rm", "-f", filename])
```

To:
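The replacement snippet itself was lost in formatting above; my reconstruction of the idea (a cross-platform delete via the os module, with a helper name I made up) looks like this:

```python
import os

def remove_file_quietly(filename):
    # Stand-in for subprocess.call(["rm", "-f", filename]):
    # delete the file if it exists, do nothing if it doesn't.
    if os.path.exists(filename):
        os.remove(filename)
```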
Shorten CUDA folder path creating a junction folder
The CUDA folder path with spaces was also erroring out during the build, so I created a simpler junction folder.
In the same CMD:
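The command itself wasn't preserved; a junction can be created with `mklink /J` (the target path is my guess at a default CUDA 12.8 install, and the junction name is arbitrary):

```shell
mklink /J C:\cuda "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8"
```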
Build commands
Every time I needed to retry the build process, it was important to delete the build folder first to clear any cache.
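Putting it together, the retry loop looked like this (the delete-then-install pair is my reconstruction from the steps above):

```shell
rmdir /s /q build
pip install . --no-build-isolation
```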
Suggest a potential alternative/fix
No response
Before submitting a new issue...