
Conversation

@dkalinowski dkalinowski commented Aug 11, 2025

🛠 Summary

CVS-171565

{'multiple'}
{"accuracy": 0.89, "correct_count": 178, "total_count": 200}

@dkalinowski dkalinowski requested review from mzegla and dtrawins and removed request for mzegla August 11, 2025 12:58
@dkalinowski changed the title from "[WIP] Mistral tool calling unary" to "Mistral tool calling unary" Aug 12, 2025
@@ -0,0 +1,93 @@
{%- if messages[0]["role"] == "system" %}
@dtrawins (Collaborator) commented Aug 12, 2025:

We don't want to maintain this template here. Check PR #3565.

dkalinowski (Author):

Changed: now using vLLM's mistral_parallel.jinja, which gives worse accuracy.

Comment on lines 61 to 66
} else {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool calls are not generated by the model");
}
} else {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool calls are not generated by the model");
}
Collaborator:

Since we have isToolGenerated, maybe we could drop these elses and go with

if (!isToolGenerated) {
	SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool calls are not generated by the model");
}

outside of this block?

dkalinowski (Author):

not needed anymore

if (begin != parsedOutput.content.end() && *begin == '[') {
begin = skipToFirstNonWhitespaceCharacter(begin + 1, parsedOutput.content.end());
if (begin != parsedOutput.content.end() && *begin == '{') {
// If the content starts with '[{', it indicates that tool calls are present.
Collaborator:

Did you make sure Mistral does not generate any BOT token that we might not see after decoding?

dkalinowski (Author):

BOT tokens appear when we detokenize with skip_special_tokens=false. Changed the parser to rely on that.
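For illustration, a minimal fragment of that approach, reusing the decode call that appears later in this diff (the hasToolCalls name exists only in this sketch):

// Keep special tokens when decoding so the BOT tag stays visible to the parser.
std::string decoded = tokenizer.decode(generatedTokens, {ov::genai::skip_special_tokens(false)});
// The Mistral BOT tag for tool calls is expected at the very start of the output.
bool hasToolCalls = decoded.rfind("[TOOL_CALLS]", 0) == 0;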

continue;
}
ToolCall toolCall;
toolCall.id = generateRandomId(); // Generate a random ID for the tool call
Collaborator:

We can have it at the end, right before pushing to toolCalls.

dkalinowski (Author):

done
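For illustration, the reordered fragment presumably looks roughly like this (a sketch; the name field and the local variable names are assumptions, only id, arguments and generateRandomId() appear in this review):

ToolCall toolCall;
toolCall.name = name;              // assumed field and variable names
toolCall.arguments = arguments;
toolCall.id = generateRandomId();  // ID generated last, right before pushing
parsedOutput.toolCalls.push_back(toolCall);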

@dkalinowski dkalinowski requested review from dtrawins and mzegla August 18, 2025 11:08
rapidjson::Document toolsDoc;
toolsDoc.Parse(toolsString.c_str());

if (!toolsDoc.HasParseError() && toolsDoc.IsArray() && !hasMoreSpecialTags) {
Collaborator:

What is going to happen when the JSON is incomplete, for example when max_tokens is too short? I guess we should still return the tool_call response even if it is incomplete or invalid.

dkalinowski (Author):

As with the other tool parsers: when max_tokens is reached and the message is not valid JSON, it will not be put into tool calls, but into content.
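A minimal sketch of that fallback rule, built on the rapidjson check from the diff above (the function name is illustrative, not from the PR):

#include <rapidjson/document.h>
#include <string>

// If the text after the [TOOL_CALLS] tag does not parse as a JSON array
// (e.g. generation was cut off by max_tokens), keep it as plain content
// instead of emitting tool calls.
static bool isCompleteToolCallArray(const std::string& toolsString) {
    rapidjson::Document toolsDoc;
    toolsDoc.Parse(toolsString.c_str());
    return !toolsDoc.HasParseError() && toolsDoc.IsArray();
}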

@@ -52,7 +52,7 @@ def add_common_arguments(parser):
'Not effective if target device is not NPU', dest='max_prompt_len')
parser_text.add_argument('--prompt_lookup_decoding', action='store_true', help='Set pipeline to use prompt lookup decoding', dest='prompt_lookup_decoding')
parser_text.add_argument('--reasoning_parser', choices=["qwen3"], help='Set the type of the reasoning parser for reasoning content extraction', dest='reasoning_parser')
parser_text.add_argument('--tool_parser', choices=["llama3","phi4","hermes3", "qwen3"], help='Set the type of the tool parser for tool calls extraction', dest='tool_parser')
parser_text.add_argument('--tool_parser', choices=["llama3","phi4","hermes3", "qwen3","mistral"], help='Set the type of the tool parser for tool calls extraction', dest='tool_parser')
Collaborator:

unrelated, but qwen3 should not be here

dkalinowski (Author):

This is already being changed in your PR, right? I'm rebased.

return;
}

std::string decoded = tokenizer.decode(generatedTokens, {ov::genai::skip_special_tokens(false)});
Collaborator:

With that line we do double decoding, since parsedOutput already contains decoded content. I suppose the skip_special_tokens option is the reason for doing it again. Did you check the performance impact? Can we afford that?

dkalinowski (Author):

There's no double decoding anymore; I simply check the first token for BOT and treat the entire response as JSON, as you requested, and it works.
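Roughly, that flow could be sketched as below (illustrative only: the ToolCall stand-in, its name field, and the helper are assumptions pieced together from fragments visible in this review, not the merged code):

#include <rapidjson/document.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>
#include <string>
#include <vector>

struct ToolCallSketch {        // illustrative stand-in for the real ToolCall
    std::string id;
    std::string name;          // assumed field
    std::string arguments;
};

// If the decoded output (skip_special_tokens=false) starts with [TOOL_CALLS],
// treat everything after the tag as a JSON array of tool calls.
static std::vector<ToolCallSketch> parseMistralToolCalls(const std::string& decoded) {
    static const std::string botTag = "[TOOL_CALLS]";
    std::vector<ToolCallSketch> calls;
    if (decoded.rfind(botTag, 0) != 0)
        return calls;  // no BOT tag: plain content, no tool calls
    rapidjson::Document doc;
    doc.Parse(decoded.substr(botTag.size()).c_str());
    if (doc.HasParseError() || !doc.IsArray())
        return calls;  // truncated or invalid JSON: left as content by the caller
    for (const auto& call : doc.GetArray()) {
        if (!call.IsObject() || !call.HasMember("name") || !call["name"].IsString() || !call.HasMember("arguments"))
            continue;
        ToolCallSketch toolCall;
        toolCall.name = call["name"].GetString();
        rapidjson::StringBuffer sb;
        rapidjson::Writer<rapidjson::StringBuffer> writer(sb);
        call["arguments"].Accept(writer);   // serialize arguments back to a JSON string
        toolCall.arguments = sb.GetString();
        toolCall.id = "random-id";          // the real code uses generateRandomId()
        calls.push_back(toolCall);
    }
    return calls;
}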

const std::string toolsStartEnd = getParsingEndTag();

size_t toolsStartPos = decoded.find(toolsStartString);
size_t toolsEndPos = decoded.find(toolsStartEnd);
Collaborator:

Does Mistral have a dedicated tools end tag?

dkalinowski (Author):

Apparently not - the only end we saw was EOS.

std::string decoded = tokenizer.decode(generatedTokens, {ov::genai::skip_special_tokens(false)});

const std::string toolsStartString = getParsingStartTag();
const std::string toolsStartEnd = getParsingEndTag();
Collaborator:

StartEnd? The toolsStartEnd name reads oddly for an end tag.

Comment on lines 52 to 56
std::string remaining = decoded.substr(0, toolsStartPos) + decoded.substr(toolsEndPos + toolsStartEnd.length());

size_t toolsStartPos2 = remaining.find(toolsStartString);
size_t toolsEndPos2 = remaining.find(toolsStartEnd);
bool hasMoreSpecialTags = !(toolsStartPos2 == std::string::npos && toolsEndPos2 == std::string::npos);
Collaborator:

Does it make sense to do such an extraction? Did you see a real example where there are more special tokens in the generated output?

dkalinowski (Author):

Not relevant anymore with the new approach.

namespace ovms {
class MistralToolParser : public BaseOutputParser {
const std::string toolCallStartTag = "[TOOL_CALLS]";
const std::string toolCallEndTag = "</s>";
Collaborator:

Isn't it just a regular EOS token?

dkalinowski (Author):

yes, good catch

Comment on lines 47 to 50
// Tools calls are expected to be the last part of the content, so we do not specify an end tag.
const std::string& getParsingEndTag() const override {
return toolCallEndTag;
}
Collaborator:

The implementation does not match the comment. If the comment is true for Mistral, we should return an empty string here.

dkalinowski (Author):

done
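For reference, a sketch of what the corrected accessor presumably looks like after this change (based on the review exchange above, not the merged code):

// Tool calls are expected to be the last part of the content, so Mistral has
// no dedicated end tag (generation simply ends with EOS) and an empty string
// is returned here.
const std::string& getParsingEndTag() const override {
    static const std::string emptyEndTag = "";
    return emptyEndTag;
}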

std::unique_ptr<OutputParser> outputParser;

void SetUp() override {
// For Phi4 model there is only tool parser available
Collaborator:

Phi4?

dkalinowski (Author):

fixed

EXPECT_EQ(parsedOutput.toolCalls[0].arguments, "{\"arg1\":\"value1\",\"arg2\":42}");
EXPECT_EQ(parsedOutput.toolCalls[0].id.empty(), false); // ID should be generated
}
TEST_F(MistralOutputParserTest, ParseToolCallOutputWithContentOnBothSidesAndSingleToolCall) {
Collaborator:

Is it a real example?

dkalinowski (Author):

Why not? The model can produce anything.

EXPECT_EQ(parsedOutput.toolCalls[0].arguments, "{\"arg1\":\"value1\",\"arg2\":42}");
EXPECT_EQ(parsedOutput.toolCalls[0].id.empty(), false); // ID should be generated
}
TEST_F(MistralOutputParserTest, ParseToolCallOutputWithMultipleToolCallsReturnsContentOnly) {
Collaborator:

Is it a real example?

dkalinowski (Author):

Why not? The model can produce anything, so why shouldn't we unit test that?

@dkalinowski dkalinowski merged commit 17136c3 into main Aug 20, 2025
14 checks passed
@dkalinowski dkalinowski deleted the mistral branch August 20, 2025 07:54
przepeck pushed a commit that referenced this pull request Aug 21, 2025