Add tool_parser minicpm5 for MiniCPM5-1B XML tool calls #4268

@DmytroMelnyk

[Feature request] Add tool_parser: minicpm5 for MiniCPM5-1B XML tool calls

Summary

MiniCPM5-1B emits tool calls as XML-style function calls in its output. OVMS does not expose a matching tool_parser, so /v3/chat/completions responses leave the tool XML in message.content and return tool_calls: []. As a result, OpenAI-compatible agent clients see no tool calls and never execute the tools.

Expected model output (MiniCPM5)

Documented by OpenBMB: https://github.com/OpenBMB/MiniCPM#with-tool-calling

<function name="get_weather"><param name="city">Beijing</param></function>

The shipped chat_template.jinja uses the same <function> / <param> format.
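
To make the format concrete, here is a minimal Python sketch of post-generation extraction for the <function> / <param> shape shown above. It is not the SGLang or OpenBMB parser: the regular expressions and the extract_calls helper are hypothetical, and it assumes attribute values are double-quoted and that calls are not nested.

```python
import re

# Hypothetical patterns for the <function>/<param> format shown above.
# Assumption: name attributes are double-quoted and calls are not nested.
FUNCTION_RE = re.compile(r'<function\s+name="([^"]+)"\s*>(.*?)</function>', re.DOTALL)
PARAM_RE = re.compile(r'<param\s+name="([^"]+)"\s*>(.*?)</param>', re.DOTALL)


def extract_calls(text):
    """Return a list of (function_name, {param_name: raw_string_value})."""
    calls = []
    for name, body in FUNCTION_RE.findall(text):
        # Parameter values are kept as raw strings; no type coercion is done.
        params = {key: value.strip() for key, value in PARAM_RE.findall(body)}
        calls.append((name, params))
    return calls


sample = '<function name="get_weather"><param name="city">Beijing</param></function>'
print(extract_calls(sample))
```

Text without any <function> element yields an empty list, which corresponds to a plain assistant reply with no tool calls.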

Current OVMS behavior

  • Supported parsers include lfm2, hermes3, llama3, etc. (see LLM reference and --tool_parser in export/CLI).
  • There is no minicpm5 parser.
  • Using --tool_parser lfm2 with MiniCPM5-1B is incorrect: LFM2 expects <|tool_call_start|>[...]<|tool_call_end|> (Pythonic/JSON), not MiniCPM5 XML.
  • Observed with OVMS 2026.2-gpu: model may emit plausible XML in content, but API returns "tool_calls": [].

Reference implementations

Project           Parser                        Notes
SGLang            --tool-call-parser minicpm5   Recommended by OpenBMB; merged in sgl-project/sglang#25600
OpenBMB MiniCPM   vLLM-style XML parser         Reference for regex/XML extraction

SGLang does not enable structural-tag guided generation for MiniCPM5 (supports_structural_tag() -> False); post-generation extraction is the primary path.

Requested change

  1. Add minicpm5 to supported tool_parser values (CLI, graph.pbtxt, export_model.py, docs).
  2. Implement extraction in src/llm/io_processing/ (same integration point as lfm2, hermes3, etc. in output_parser.cpp).
  3. Map parsed calls to OpenAI-style tool_calls in non-streaming and streaming responses.
  4. Strip optional <think>...</think> blocks before parsing.
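
The requested behavior for non-streaming responses can be sketched in Python as follows. This is only an illustration of the intended data flow, not a proposal for the C++ code in src/llm/io_processing/: the to_openai_message helper, the regular expressions, and the call id scheme are all hypothetical, and parameter values are passed through as strings without coercion against the tool's JSON schema.

```python
import json
import re
import uuid

# Hypothetical patterns; assumes double-quoted name attributes and no nesting.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
FUNCTION_RE = re.compile(r'<function\s+name="([^"]+)"\s*>(.*?)</function>', re.DOTALL)
PARAM_RE = re.compile(r'<param\s+name="([^"]+)"\s*>(.*?)</param>', re.DOTALL)


def to_openai_message(raw_output):
    """Turn raw model text into an OpenAI-style assistant message dict."""
    # Step 4: drop reasoning blocks before looking for tool calls.
    text = THINK_RE.sub("", raw_output)

    # Steps 2-3: extract each call and map it to an OpenAI-style tool_calls entry.
    tool_calls = []
    for name, body in FUNCTION_RE.findall(text):
        arguments = dict(PARAM_RE.findall(body))  # values stay as strings
        tool_calls.append({
            "id": "call_" + uuid.uuid4().hex[:8],  # hypothetical id scheme
            "type": "function",
            "function": {"name": name, "arguments": json.dumps(arguments)},
        })

    # Remove the raw tool XML so it does not leak into content.
    content = FUNCTION_RE.sub("", text).strip()
    return {"role": "assistant", "content": content, "tool_calls": tool_calls}


raw = ('<think>Need the weather tool.</think>'
       '<function name="get_weather"><param name="city">Beijing</param></function>')
print(json.dumps(to_openai_message(raw), indent=2))
```

Note that function.arguments is serialized as a JSON string rather than a nested object, which is what OpenAI-compatible clients expect to decode.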

Out of scope for v1 (optional follow-ups):

  • enable_tool_guided_generation / XGrammar rules for MiniCPM5 XML
  • Full streaming parity with SGLang incremental parser

Minimal acceptance test

Request:

POST /v3/chat/completions
{
  "model": "<MiniCPM5-1B>",
  "messages": [{"role": "user", "content": "What is the weather in Beijing?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }],
  "tool_choice": "required",
  "enable_thinking": false
}

When the model emits:

<function name="get_weather"><param name="city">Beijing</param></function>

The response must include a non-empty message.tool_calls array in which each entry has function.name and JSON-encoded function.arguments, and content must contain no raw tool XML.
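
The acceptance criteria above could be checked with a small script like the sketch below. It assumes the usual OpenAI chat completions response shape (choices[0].message); the check_tool_call_response helper is hypothetical, and the example response is hand-written for illustration, not captured from OVMS.

```python
import json


def check_tool_call_response(response):
    """Assert the acceptance criteria on a parsed chat completions response."""
    message = response["choices"][0]["message"]

    tool_calls = message.get("tool_calls") or []
    assert tool_calls, "message.tool_calls must be non-empty"

    for call in tool_calls:
        function = call["function"]
        assert function["name"], "function.name must be set"
        # function.arguments must be a JSON string that decodes to an object.
        arguments = json.loads(function["arguments"])
        assert isinstance(arguments, dict), "arguments must decode to an object"

    content = message.get("content") or ""
    assert "<function" not in content and "<param" not in content, \
        "raw tool XML must not remain in content"
    return True


# Hand-written example of a passing response (not real OVMS output).
expected_shape = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "",
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {"name": "get_weather",
                             "arguments": "{\"city\": \"Beijing\"}"},
            }],
        }
    }]
}
check_tool_call_response(expected_shape)
```

The currently observed behavior (tool XML in content with "tool_calls": []) fails the first assertion.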

Environment

  • Model: openbmb/MiniCPM5-1B (OpenVINO IR + native chat_template.jinja)
  • Server: openvino/model_server:2026.2-gpu, --task text_generation

Thank you for considering this.
