GPT-OSS: parse commentary tool calls; handle glued 'json'; add unit tests (#15102) #15158
base: master
Conversation
…x; add unit tests; tidy comments
In non-tool calls, this seems to cause the final channel to be merged into the analysis channel? Unless I'm misunderstanding something. For instance, a response of:
@@ -1319,9 +1319,55 @@ static common_chat_params common_chat_params_init_gpt_oss(const common_chat_temp
    return data;
}
static void common_chat_parse_gpt_oss(common_chat_msg_parser & builder) {
    // TODO @ngxson : this won't work with --special enabled, we should fix that
    builder.try_parse_reasoning("<|channel|>analysis<|message|>", "<|start|>assistant<|channel|>final<|message|>");
Removing try_parse_reasoning would lose the ability to capture reasoning into the reasoning_content field of the response. For gpt-oss, reasoning is the message from the analysis channel.
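For context, here is a minimal standalone sketch (plain std::string handling, not the llama.cpp parser API) of the split being described: text after the analysis header becomes reasoning, and text after the final header becomes assistant content. The tag strings are the same ones passed to try_parse_reasoning above; split_channels is a hypothetical helper used only for illustration.

```cpp
#include <string>

// Illustrative only: mirrors what try_parse_reasoning is meant to achieve for
// GPT-OSS. Everything between the analysis header and the final header is
// reasoning; everything after the final header is assistant content.
static void split_channels(const std::string & raw, std::string & reasoning, std::string & content) {
    const std::string analysis_tag = "<|channel|>analysis<|message|>";
    const std::string final_tag    = "<|start|>assistant<|channel|>final<|message|>";
    const auto a = raw.find(analysis_tag);
    const auto f = raw.find(final_tag);
    if (a != std::string::npos) {
        const auto begin = a + analysis_tag.size();
        const auto end   = (f != std::string::npos && f > begin) ? f : raw.size();
        reasoning = raw.substr(begin, end - begin);
    }
    if (f != std::string::npos) {
        content = raw.substr(f + final_tag.size());
    }
}
```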
    }
    builder.add_content(builder.consume_rest());
    return;
}
This condition block should be kept as is. When builder.syntax().parse_tool_calls is false, we just consume_rest without trying to parse tools. Move the tool processing after this block.
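For clarity, here is a structural sketch of the ordering being suggested, reusing only identifiers that already appear in the diffs above (not a compilable drop-in on its own): keep the parse_tool_calls guard first, and move the commentary/final channel handling after it.

```cpp
static void common_chat_parse_gpt_oss(common_chat_msg_parser & builder) {
    // Reasoning from the analysis channel still goes to reasoning_content.
    builder.try_parse_reasoning("<|channel|>analysis<|message|>",
                                "<|start|>assistant<|channel|>final<|message|>");

    if (!builder.syntax().parse_tool_calls) {
        // Tool parsing disabled: whatever remains is plain assistant content.
        builder.add_content(builder.consume_rest());
        return;
    }

    // ... commentary tool-call processing (regex over channel headers) goes here ...
}
```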
I'm one of the people from #15102; this PR seems to fix the crash and work a bit better with the tool calling, but it doesn't fully work. I made some additional changes which seem to enable full tool usage and a proper reasoning_content/content split, but I'm not 100% sure the fixes I made are correct, I'm very new to cpp development :) These are the changes I ended up doing:

diff --git a/common/chat.cpp b/common/chat.cpp
index 2e17f3e6..5a9b8041 100644
--- a/common/chat.cpp
+++ b/common/chat.cpp
@@ -1322,8 +1322,21 @@ static void common_chat_parse_gpt_oss(common_chat_msg_parser & builder) {
static const common_regex next_block(
"<\\|start\\|>assistant<\\|channel\\|>(?:commentary\\s+to=functions\\.([^\\s<\\|]+)(?:\\s+json)?|final)<\\|message\\|>");
+ auto handle_prelude = [&](const std::string & prelude) {
+ const std::string analysis_tag = "<|channel|>analysis<|message|>";
+ auto pos = prelude.find(analysis_tag);
+ if (pos != std::string::npos) {
+ auto reasoning = prelude.substr(pos + analysis_tag.size());
+ auto stripped = string_strip(reasoning);
+ if (!stripped.empty()) {
+ builder.add_reasoning_content(stripped);
+ }
+ }
+ };
+
if (!builder.syntax().parse_tool_calls) {
- if (auto res = builder.try_find_regex(next_block)) {
+ if (auto res = builder.try_find_regex(next_block, std::string::npos, /* add_prelude_to_content = */ false)) {
+ handle_prelude(res->prelude);
while (true) {
std::string fname = builder.str(res->groups[1]);
const std::string header = builder.str(res->groups[0]);
@@ -1336,24 +1349,29 @@ static void common_chat_parse_gpt_oss(common_chat_msg_parser & builder) {
if (!builder.try_consume_json_with_dumped_args({{}})) {
break;
}
- res = builder.try_find_regex(next_block);
+ res = builder.try_find_regex(next_block, std::string::npos, /* add_prelude_to_content = */ false);
if (!res) break;
+ handle_prelude(res->prelude);
continue;
}
builder.add_content(builder.consume_rest());
return;
}
}
- builder.add_content(builder.consume_rest());
+ // No more headers; treat the rest as potential analysis
+ auto rest = builder.consume_rest();
+ handle_prelude(rest);
return;
}
while (true) {
- auto res = builder.try_find_regex(next_block);
+ auto res = builder.try_find_regex(next_block, std::string::npos, /* add_prelude_to_content = */ false);
if (!res) {
- builder.add_content(builder.consume_rest());
+ auto rest = builder.consume_rest();
+ handle_prelude(rest);
return;
}
+ handle_prelude(res->prelude);
std::string fname = builder.str(res->groups[1]);
const std::string header = builder.str(res->groups[0]);
if (fname.size() > 4 && header.find(" json<|message|>") == std::string::npos) {

If I run this PR as-is, I get the following output:
While with the diff from above applied on top of the code from this PR, I get:
Notice the last message's reasoning_content vs content: it's now properly putting content coming from the final channel into content. Otherwise, this PR does fix the crash from #15102.

Commit is available here in case someone wants to integrate the changes or otherwise continue validating that tool calling and the whole channel splitting works correctly: https://github.com/victorb/llama.cpp/commits/fix/gpt-oss-tool-calls-parser/

(As a side note, even without my patch, a bunch of unit tests related to parsing inputs of other non-GPT-OSS models seem to start failing with the changes from this PR; not sure how or why.)
This is what I saw too, and what my patch above is trying to address. It seems to be working fine for me now (until I find a new issue, I suppose...). It would be great if you could try applying that patch on top of the changes from this PR (or use the branch linked above).
Yes, this seems much better. The biggest problem I have now is just getting the 20B to properly call a tool, but when it does, it seems to work correctly after your changes.
@victorb 👍👍👍
Getting behavior like this:

Actual (malformed):
There is another PR as well, with the same fixes but done in a slightly different way, in case people want to help out testing that too: #15181
Can confirm this has behavior that is closer to being correct (or is correct). Using this PR, plus the latest unsloth chat template, plus a custom proxy, I have gotten these models to work with Claude Code :)
Upon further testing, it's much better, but inconsistent. It's hard to determine whether the issue is with the template, the PR, or my proxy.
Whoops, yeah, I commented on the wrong one. But it was a me issue. This works great with Qwen Code CLI.
Summary
Parse GPT-OSS commentary tool calls emitted as <|start|>assistant<|channel|>commentary to=functions.NAME json<|message|>{...}. Handle the case with no space before json by trimming the "json" suffix from the function name.
Fixes
Fixes #15102: when tools are included in the request and GPT-OSS emits commentary headers (including "glued json" cases), ensure structured tool_calls are returned and final content is parsed correctly.
Behavior before
Commentary tool-call headers were not handled, sometimes leading to runtime errors or empty messages (e.g., when "json" was glued to the function name).
Behavior after
Commentary tool calls are parsed into structured tool_calls, and final channel content is captured as assistant content. "Glued json" is tolerated.
Scope and impact
Implementation details
The commentary header is matched with a regex; when "json" is glued onto the function name (no space before it), the "json" suffix is stripped from the function name. Final channel text is captured as assistant content. A standalone sketch of the trimming idea follows below.
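The trimming idea can be illustrated with a small standalone sketch. strip_glued_json is a hypothetical helper used only for illustration, not the PR's exact code; the condition mirrors the `fname.size() > 4 && header.find(" json<|message|>") == std::string::npos` check in the diff above.

```cpp
#include <string>

// If the model glued "json" onto the function name (e.g. "get_weatherjson")
// and the header carries no separate " json<|message|>" marker, strip the suffix.
static std::string strip_glued_json(const std::string & fname, const std::string & header) {
    const std::string suffix = "json";
    if (fname.size() > suffix.size() &&
        header.find(" json<|message|>") == std::string::npos &&
        fname.compare(fname.size() - suffix.size(), suffix.size(), suffix) == 0) {
        return fname.substr(0, fname.size() - suffix.size());
    }
    return fname;
}
```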
Testing
Added unit tests covering standard and glued "json" headers. The tests are parser-only (no model eval). No changes to llama-perplexity or llama-bench expected.
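As a rough idea of what the glued vs. standard cases look like, here is a minimal standalone check of the strip_glued_json sketch above (hypothetical helper; the PR's real unit tests live in the project's test suite and are not reproduced here).

```cpp
#include <cassert>
#include <string>

// Assumes the strip_glued_json sketch from the Implementation details section.
int main() {
    // Glued: "json" is fused to the function name and no " json<|message|>" marker exists.
    assert(strip_glued_json("get_weatherjson",
        "<|start|>assistant<|channel|>commentary to=functions.get_weatherjson<|message|>") == "get_weather");
    // Standard: the header carries " json<|message|>", so the name is left untouched.
    assert(strip_glued_json("get_weather",
        "<|start|>assistant<|channel|>commentary to=functions.get_weather json<|message|>") == "get_weather");
    return 0;
}
```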
Compatibility and risks
Parser-only change (consistent with existing parsing behavior). Final channel text is still returned as assistant content.
Checklist
Unit tests added: yes. Changes are parser-only.

Module
common/chat.cpp