### Name and Version

Literally head:

Device 0: NVIDIA GeForce RTX 3080, compute capability 8.6, VMM: yes
version: 6134 (be48528b)
built with MSVC 19.41.34120.0 for x64

Also tested at the office on Ubuntu, on head.

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

llama-server

### Command line

```shell
llama-server -fa -ngl 999999 -ctk q4_0 -ctv q4_0 -c 128000 --jinja -m Qwen3-4B-Q5_K_S.gguf --port 2483 --slots
```

### Problem description & steps to reproduce

If you call `chat.completions.create` on a reasoning model (for example, Qwen3) with `tool_choice: "required"`, it refuses to do the reasoning because of the grammar that gets applied.

### First Bad Commit

_No response_

### Relevant log output

```shell

```
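A minimal reproduction sketch for the request described above. This is an assumption-laden example, not taken from the report: the tool name `submit_answer` and the prompt are made up for illustration, and it presumes the server started with the command line above is listening on `localhost:2483`. It builds the request payload that triggers the behavior; with the OpenAI Python client the same payload would be passed to `client.chat.completions.create(...)`.

```python
import json

# Hypothetical request payload; any tool definition works, the trigger is
# tool_choice: "required" combined with a reasoning model like Qwen3.
payload = {
    "model": "Qwen3-4B-Q5_K_S",
    "messages": [
        {"role": "user", "content": "What is 2+2? Think it through first."}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "submit_answer",  # made-up tool for illustration
                "description": "Submit the final answer",
                "parameters": {
                    "type": "object",
                    "properties": {"answer": {"type": "string"}},
                    "required": ["answer"],
                },
            },
        }
    ],
    # Forces the tool-call grammar; this is what suppresses the
    # reasoning (<think>...</think>) output before the tool call.
    "tool_choice": "required",
}

# With the OpenAI client, assuming base_url="http://localhost:2483/v1":
#   client.chat.completions.create(**payload)
# Expected: reasoning tokens, then a tool call.
# Actual: the grammar constrains output to an immediate tool call only.
print(json.dumps(payload, indent=2))
```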