Conversation

@aldehir (Collaborator) commented Aug 11, 2025

Problem: the current implementation generates <|channel|>analysis<|message|>... in the output and clients send it back verbatim. This causes an exception in the official gpt-oss jinja chat templates.

This PR parses out the reasoning and does one of the following:

  • reasoning_format = auto - send it in the reasoning_content field.
  • reasoning_format = none - wrap it in <think></think> and send it in the content field to avoid exceptions.

Additionally, it parses out any final channels. It does not yet support tool use; more comprehensive parsing is being worked on in #15181.
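
Roughly, the routing looks like this (simplified sketch only, not the actual chat.cpp code; the struct and function names below are made up for illustration):

    #include <string>

    // Sketch: route the parsed analysis/final channel text according to
    // reasoning_format, mirroring the behavior described above.
    enum class reasoning_format { AUTO, NONE };

    struct parsed_msg {
        std::string content;            // text from the final channel
        std::string reasoning_content;  // text from the analysis channel (auto mode)
    };

    parsed_msg route_channels(const std::string & analysis,
                              const std::string & final_text,
                              reasoning_format fmt) {
        parsed_msg out;
        if (fmt == reasoning_format::AUTO) {
            // auto: expose the reasoning separately via reasoning_content
            out.reasoning_content = analysis;
            out.content = final_text;
        } else {
            // none: wrap the reasoning in <think></think> inside content so the
            // jinja template never sees raw <|channel|> tags
            out.content = "<think>" + analysis + "</think>" + final_text;
        }
        return out;
    }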

@ggerganov @ngxson

Comment on lines -2342 to -2348
    // @ngxson : quick hack for gpt-oss, always render these tokens
    for (const auto & t : token_to_id) {
        if (t.first == "<|channel|>" || t.first == "<|message|>" || t.first == "<|start|>") {
            id_to_token[t.second].attr = LLAMA_TOKEN_ATTR_USER_DEFINED;
        }
    }

Collaborator

Are we sure about removing this? It will prevent rendering these tokens without --special.

@aldehir (Collaborator, Author) Aug 11, 2025

I am admittedly new to the code base; however, for the web server it seems that placing those tokens in preserved_tokens is sufficient to make them render.
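
Something along these lines is what I mean (rough sketch only; the struct and field names here are illustrative and may not match the actual server-side layout):

    #include <string>
    #include <vector>

    // Sketch: register the harmony markers as preserved tokens so the web
    // server renders them as literal text instead of hiding them as specials.
    struct chat_params_sketch {
        std::vector<std::string> preserved_tokens;
    };

    void preserve_gpt_oss_tokens(chat_params_sketch & params) {
        params.preserved_tokens = {
            "<|channel|>",
            "<|message|>",
            "<|start|>",
        };
    }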

I tested it with llama-cli, and I see now that it does omit them there. I will revert it.

This comment was marked as outdated.

Collaborator

In addition to this, I think <|constrain|> and <|end|> should also be added to the condition.
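
For example, roughly this (just the removed hack quoted above with the two extra tokens added, untested):

    // Sketch: also render <|constrain|> and <|end|> as user-defined tokens
    for (const auto & t : token_to_id) {
        if (t.first == "<|channel|>" || t.first == "<|message|>" ||
            t.first == "<|start|>"   || t.first == "<|constrain|>" ||
            t.first == "<|end|>") {
            id_to_token[t.second].attr = LLAMA_TOKEN_ATTR_USER_DEFINED;
        }
    }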

@ggerganov (Member)

After 6d75412, the llama-cli now crashes after the first message.

@ngxson (Collaborator) commented Aug 11, 2025

After 6d75412, the llama-cli now crashes after the first message.

I remember the new jinja template has a check that prevents certain tags from appearing in the input text. I'm not sure what the best way to fix this is; maybe we need a patch/hotfix for that.

@aldehir (Collaborator, Author) commented Aug 11, 2025

Yes, I am seeing that as well. It's the exception thrown from the jinja template:

You have passed a message containing <|channel|> tags in the content field. Instead of doing this, you should pass analysis messages...

I can revert the last commit as a temporary workaround. It seems to work with --special in f058384 as well.

@aldehir (Collaborator, Author) commented Aug 11, 2025

At least for now, I'm going to keep it as I originally had it since it is somewhat usable. I haven't explored the CLI enough to have any good input.

A comment by @aldehir has been minimized.

@ngxson (Collaborator) commented Aug 11, 2025

IIRC chat.cpp also allows patching the jinja template; see an example in common_chat_params_init_deepseek_r1.

We can temporarily patch the jinja version of harmony so that it doesn't throw an error, and later spend more time on a proper fix.
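
Roughly something like this (sketch only; the marker string and removal logic are placeholders, and a real patch would target the actual block in the harmony template):

    #include <string>

    // Hypothetical sketch: neutralize the part of the harmony jinja template
    // that throws when <|channel|> shows up in a content field, similar in
    // spirit to the template patching done for deepseek-r1.
    static std::string patch_harmony_template(std::string tmpl_src) {
        const std::string needle = "raise_exception(";  // placeholder marker
        size_t pos;
        while ((pos = tmpl_src.find(needle)) != std::string::npos) {
            size_t end = tmpl_src.find(")", pos);
            if (end == std::string::npos) {
                break;
            }
            // replace the call with an empty string literal so rendering continues
            tmpl_src.replace(pos, end + 1 - pos, "\"\"");
        }
        return tmpl_src;
    }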

@ngxson (Collaborator) commented Aug 11, 2025

Also, for visibility: I don't think we need to replace the reasoning tags with <think></think> as introduced in this PR. With the migration of the new frontend to Svelte, we will eventually support the reasoning_content field, which will be much cleaner. We plan to release this version this weekend, so let's hold off on this PR a bit.

In the meantime, having tool call support (on your other PR) is a very good feature.

@ggerganov (Member)

@ngxson I would like to have the WebUI chat experience quickly fixed to the state that was working before the jinja template updates. Do you have a suggestion for a fix? Or should I revert the GGUF models back to the old template?

@ngxson (Collaborator) commented Aug 11, 2025

@ggerganov Yes, I can try to patch out the exception in the jinja template.
