Commit 5fac79c
authored
Thinking model disabled assistant prefill (ggml-org#15404)
* feat: Set enable_thinking IFF not disabled and supported
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Fix inverted logic condition for prefill error
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Always parse the enable_thinking kwarg to overwrite the default value
From what I can tell, this started as a Qwen3-specific keyword, but from
the use in `chat.cpp` translates this inputs.enable_thinking to the right
thinking kwarg for the given model, this is now more of a standardized
kwarg, so it should always override the default value when sent as part of
the chat_template_kwargs field in the API.
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Don't limit tempalte expansion check to jinja
With the use_jinja check, non-jinja models would enable thinking and always
fail assistant prefill
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* feat: Add the error text to json type errors in json_value
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* feat: Explicitly reject string values for "enable_thinking"
There are too many possible "truthy" / "falsy" strings and too many
ambiguous strings that don't have a clear truthy/falsy value, so the
simplest thing to do here is to reject the request. Ideally, this would be
a 422 (Unprocessable Entity), but right now it's coming back as a 500.
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* refactor: Move logic for detecting template enable_thinking support to common
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
* fix: Use raw pointer for common chat template function
Branch: gabe-l-hart/thinking-model-disabled-agent-prefill
Signed-off-by: Gabe Goodhart <[email protected]>
---------
Signed-off-by: Gabe Goodhart <[email protected]>1 parent 408ff52 commit 5fac79c
4 files changed
+35
-4
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
163 | 163 | | |
164 | 164 | | |
165 | 165 | | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
166 | 179 | | |
167 | 180 | | |
168 | 181 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
199 | 199 | | |
200 | 200 | | |
201 | 201 | | |
| 202 | + | |
| 203 | + | |
202 | 204 | | |
203 | 205 | | |
204 | 206 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2267 | 2267 | | |
2268 | 2268 | | |
2269 | 2269 | | |
| 2270 | + | |
| 2271 | + | |
| 2272 | + | |
| 2273 | + | |
| 2274 | + | |
| 2275 | + | |
2270 | 2276 | | |
2271 | 2277 | | |
2272 | 2278 | | |
| |||
2275 | 2281 | | |
2276 | 2282 | | |
2277 | 2283 | | |
2278 | | - | |
| 2284 | + | |
2279 | 2285 | | |
2280 | 2286 | | |
2281 | 2287 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
54 | 54 | | |
55 | 55 | | |
56 | 56 | | |
57 | | - | |
58 | | - | |
| 57 | + | |
| 58 | + | |
59 | 59 | | |
60 | 60 | | |
61 | 61 | | |
| |||
708 | 708 | | |
709 | 709 | | |
710 | 710 | | |
| 711 | + | |
| 712 | + | |
| 713 | + | |
| 714 | + | |
| 715 | + | |
| 716 | + | |
| 717 | + | |
| 718 | + | |
| 719 | + | |
| 720 | + | |
711 | 721 | | |
712 | 722 | | |
713 | 723 | | |
| |||
724 | 734 | | |
725 | 735 | | |
726 | 736 | | |
727 | | - | |
| 737 | + | |
728 | 738 | | |
729 | 739 | | |
730 | 740 | | |
| |||
0 commit comments