Open
Description
System Information:
- OS: Windows
- GPU: NVIDIA GeForce RTX 5060 Ti
- NVIDIA Driver Version: 577.00
- CUDA Version (from nvidia-smi): 12.9
- Python Version: 3.12
- Visual Studio: Visual Studio 2019 with "Desktop development with C++" workload
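One detail the list above does not capture is the version of the CUDA Toolkit installed for the build. The version printed by nvidia-smi is the highest CUDA version the driver supports, not the toolkit that nvcc compiles with. A minimal sketch for recording the toolkit version, assuming nvcc is on PATH and prints its usual "release X.Y" line (the helper names are hypothetical):

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_toolkit_version():
    """Return the toolkit version nvcc reports, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    # Prints a (major, minor) tuple, or None when nvcc cannot be found.
    print("CUDA Toolkit (nvcc):", installed_toolkit_version())
```

If this prints None, the build most likely had no toolkit visible in that shell. If it prints a version, that number is worth adding to the report alongside the driver-side 12.9.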
Problem Description:
I am unable to get llama-cpp-python to use my GPU. When I run a script that loads a model with n_gpu_layers=-1, I get the error ggml_cuda_init: failed to initialize CUDA: (null), and all layers are loaded on the CPU.
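A quick way to separate "the wheel was built without CUDA" from "CUDA was built in but fails at runtime" is to ask the installed package whether it supports GPU offload at all. The sketch below assumes the installed llama_cpp module exposes llama_supports_gpu_offload() from its low-level bindings. It looks the function up defensively and returns None if the module or the function is unavailable (the wrapper name is hypothetical):

```python
import importlib


def gpu_offload_supported():
    """Return True/False from the installed build, or None if it cannot be queried."""
    try:
        llama_cpp = importlib.import_module("llama_cpp")
    except Exception:
        # Covers a missing package as well as a shared library that fails to load.
        return None
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if check is None:
        # Assumption: older or differently packaged builds may not export this.
        return None
    return bool(check())


if __name__ == "__main__":
    print("GPU offload supported:", gpu_offload_supported())
```

A result of False would point at the build step rather than at the driver. A result of True together with the ggml_cuda_init error would point at runtime initialization instead.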
Troubleshooting Steps Taken:
- Installed llama-cpp-python using the following command in the "x64 Native Tools Command Prompt for VS 2019" with a Python virtual environment activated:
    set CMAKE_ARGS="-DGGML_CUDA=on" && pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir
- Verified that the command completes successfully, but the resulting installation does not use the GPU.
- Tried using the deprecated LLAMA_CUBLAS flag, which resulted in a build error (as expected).
- Performed a full cleanup of the environment:
  - pip uninstall llama-cpp-python
  - pip cache purge
  - Manually deleted leftover ~* directories from site-packages.
- Reinstalled after the cleanup, but the problem persists.
- Installed PyTorch with CUDA 12.1 support (pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121) before reinstalling llama-cpp-python, but this did not resolve the issue.
- Confirmed that the correct Python interpreter and virtual environment are being used.
- The run_with_llama_cpp.py script being used is:
from llama_cpp import Llama

llm = Llama(
    model_path="models/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    n_gpu_layers=-1,
    n_ctx=4096,
    verbose=True
)

output = llm(
    "AI is going to ",
    max_tokens=32,
    stop=["."],
    echo=True
)

print(output)
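Returning to the install command in the first step: in cmd.exe, set VAR="value" && ... generally stores the double quotes, and any space before &&, as part of the variable's value. Whether that changes what the build receives has not been verified here. The sketch below only makes the raw value visible so it can be ruled in or out. It is meant to be run in the same prompt right after the set command, and the helper name is hypothetical:

```python
import os


def inspect_cmake_args(raw):
    """Return a list of notes about a CMAKE_ARGS value (empty list if nothing stands out)."""
    if raw is None:
        return ["CMAKE_ARGS is not set in this environment"]
    notes = []
    if '"' in raw:
        notes.append("value contains literal double quotes")
    if raw != raw.strip():
        notes.append("value has leading or trailing whitespace")
    if "-dggml_cuda=on" not in raw.lower():
        notes.append("-DGGML_CUDA=on not found in value")
    return notes


if __name__ == "__main__":
    raw = os.environ.get("CMAKE_ARGS")
    # repr() makes stray quotes and trailing spaces visible.
    print("CMAKE_ARGS =", repr(raw))
    for note in inspect_cmake_args(raw):
        print("note:", note)
```

If notes about quotes or whitespace appear, one variant to try is set "CMAKE_ARGS=-DGGML_CUDA=on" on its own line before running pip install, which keeps the quotes out of the value.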
Request:
Could you please provide any insights into why the CUDA initialization might be failing, or suggest any further diagnostic steps? I can provide the full verbose build log if needed.