# Current Behavior

Can't run DeepSeek-R1-Distill-Qwen-32B-GGUF; loading the model fails with a tokenizer error:

```
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
```

Model URL: https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF

# Environment and Context

- Hardware: VPS with 32 GB RAM
- OS: Debian 12
- Using the current version of llama-cpp-python

# Suggestion

Upgrade to the required version of llama.cpp (b4514, which adds the `deepseek-r1-qwen` pre-tokenizer): https://github.com/ggerganov/llama.cpp/releases/tag/b4514
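Until a llama-cpp-python release bundles llama.cpp b4514 or later, a quick way to check whether the installed package is plausibly new enough is to compare its version against the first release assumed to carry the fix. This is a minimal sketch: the `0.3.7` threshold is an assumption (verify against the llama-cpp-python changelog), and both helper functions are hypothetical names, not part of the library's API.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed first llama-cpp-python release bundling llama.cpp >= b4514;
# check the project's changelog before relying on this value.
MIN_VERSION = (0, 3, 7)

def version_at_least(installed: str, minimum: tuple) -> bool:
    """Compare a dotted version string against a minimum version tuple."""
    parts = []
    for piece in installed.split(".")[: len(minimum)]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts) >= minimum

def supports_deepseek_r1_pretokenizer() -> bool:
    """True if the installed llama-cpp-python is assumed new enough
    to know the 'deepseek-r1-qwen' pre-tokenizer type."""
    try:
        return version_at_least(version("llama-cpp-python"), MIN_VERSION)
    except PackageNotFoundError:
        return False
```

If the check fails, `pip install --upgrade llama-cpp-python` pulls the latest release; whether that release already bundles b4514 still needs to be confirmed upstream.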