MiniCPM/demo/minicpm4/MCP/README_en.md at main · OpenBMB/MiniCPM


Overview • Training • Model • Inference • Evaluation

News

[2025-06-05] 🚀🚀🚀 We have open-sourced MiniCPM4-MCP, built on MiniCPM4-8B, which is capable of calling a variety of MCP tools and delivers performance comparable to larger models.

🚩 Overview

MiniCPM4-MCP is an open-source on-device LLM agent model jointly developed by THUNLP, Renmin University of China, and ModelBest. It is built on MiniCPM4-8B, which has 8 billion parameters, and can solve a wide range of real-world tasks by interacting with various tools and data resources through MCP. MiniCPM4-MCP currently supports the following:

Utilization of tools across 16 MCP servers: These servers span various categories, including office, lifestyle, communication, information, and work management.
Single-tool-calling capability: It can perform single- or multi-step tool calls using a single tool that complies with MCP.
Cross-tool-calling capability: It can perform single- or multi-step tool calls using different tools that comply with MCP.

Demo Case (1 x speed):

demo.mp4

🛠️ Installation

The required package versions are listed in ./requirements.txt to ensure compatibility.

pip install -r requirements.txt

📽️ Training

We primarily adopt a learning-from-demonstration approach to train our model. The demonstrations are generated through continuous interactions between an LLM and the MCP environment, and MiniCPM learns from them through Supervised Fine-Tuning (SFT). We use LLaMA-Factory as our SFT framework, in a version adapted for MiniCPM.

Model Download

Download the model from this link.

Data Format

{
  "conversations": [
    {"from": "human", "value": "Hi, I need to convert 500 US dollars to Euros. Can you help me with that?"}, 
    {"from": "gpt", "value": "<|thought_start|>\nI will call the get_currency_exchange_rate function to convert 500 US dollars to Euros.\n<|thought_end|>\n<|tool_call_start|>\n```python\nget_currency_exchange_rate(from_currency=\"USD\",to_currency=\"EUR\",amount=500)\n```\n<|tool_call_end|>\n"}, 
    {"from": "tool", "value": "{\"converted_amount\": 425.50, \"exchange_rate\": 0.851}"}, 
    {"from": "gpt", "value": "<|thought_start|>\nThe assistant thought that the user asked for a currency conversion, which is a task that can be handled by the 'get_currency_exchange_rate' function, and the assistant has received the necessary parameters to execute this function.\n<|thought_end|>\nSure, 500 US dollars will convert to approximately 425.50 Euros. The current exchange rate is 0.851."}
  ], 
  "tools": "[{\"name\": \"get_currency_exchange_rate\", \"description\": \"Get the exchange rate between two currencies\", \"parameters\": {\"type\": \"object\", \"properties\": {\"from_currency\": {\"type\": \"string\", \"description\": \"The currency to convert from\"}, \"to_currency\": {\"type\": \"string\", \"description\": \"The currency to convert to\"}, \"amount\": {\"type\": \"number\", \"description\": \"The amount to convert\"}}, \"required\": [\"from_currency\", \"to_currency\", \"amount\"]}}, {\"name\": \"generate_random_password\", \"description\": \"Generate a random password with specified requirements\", \"parameters\": {\"type\": \"object\", \"properties\": {\"length\": {\"type\": \"integer\", \"description\": \"The length of the password\"}, \"include_numbers\": {\"type\": \"boolean\", \"description\": \"Include numbers in the password\"}, \"include_symbols\": {\"type\": \"boolean\", \"description\": \"Include symbols in the password\"}}, \"required\": [\"length\"]}}]", 
  "system": "You are a helpful assistant with access to some functions. Use them if required."}

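Note that `tools` in the record above is a JSON-encoded string rather than a nested list, so a loader has to decode it a second time. The following is a minimal sketch of such a loader, not code from this repository: `load_record` and `ALLOWED_ROLES` are hypothetical names, and the role set is assumed from the roles that appear in the sample.

```python
import json

# Roles observed in the sample record; assumed (not confirmed) to be the full set.
ALLOWED_ROLES = {"human", "gpt", "tool"}


def load_record(raw: str) -> dict:
    """Parse one training record and decode its nested `tools` string."""
    record = json.loads(raw)
    # `tools` is stored as a JSON string, so it needs a second decode.
    record["tools"] = json.loads(record["tools"])
    for turn in record["conversations"]:
        if turn["from"] not in ALLOWED_ROLES:
            raise ValueError(f"unexpected role: {turn['from']!r}")
    return record


# A shortened record in the same shape as the sample above.
example = json.dumps({
    "conversations": [
        {"from": "human", "value": "Convert 500 US dollars to Euros."},
        {"from": "tool", "value": "{\"converted_amount\": 425.50}"},
    ],
    "tools": json.dumps([{"name": "get_currency_exchange_rate", "parameters": {}}]),
    "system": "You are a helpful assistant with access to some functions. Use them if required.",
})

record = load_record(example)
print([tool["name"] for tool in record["tools"]])
```

Validating roles at load time is optional, but it catches malformed records before they reach the SFT pipeline.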
Single-Node Training

To run training on a single machine, simply use the following command:

llamafactory-cli train /path/to/config.yaml

Example:

llamafactory-cli train ./LLaMA-Factory/examples/train_full/minicpm4/sft.yaml

Distributed Training Setup (2 Nodes Example)

All nodes must have:

Identical software environments
Network connectivity (e.g., TCP port 29500 open)
Access to the same training data
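The connectivity requirement can be checked from a worker node before launching. The sketch below is an illustration under assumptions, not part of this repository: `port_is_reachable` is a hypothetical helper, and the check only succeeds while something is actually listening on the master's port (for example a temporary listener, or the training job itself once it has started).

```python
import socket


def port_is_reachable(host: str, port: int = 29500, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Self-check against a throwaway local listener (the OS picks a free port).
with socket.socket() as listener:
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    print(port_is_reachable("127.0.0.1", listener.getsockname()[1]))
```

On a real cluster, call `port_is_reachable` from the worker with the master node's IP in place of `127.0.0.1`.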

To run training across 2 machines, follow these steps:

Determine Master Node IP

On your master node (node 0), run:

hostname -I | awk '{print $1}'

Launch Training

On master node (node 0):

export MASTER_ADDR=$(hostname -I | awk '{print $1}')

FORCE_TORCHRUN=1 NNODES=2 RANK=0 MASTER_ADDR=$MASTER_ADDR MASTER_PORT=29500 \
llamafactory-cli train /path/to/config.yaml

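On worker node (node 1), first set MASTER_ADDR to the master node's IP from the "Determine Master Node IP" step (the variable exported on the master is not visible on the worker):

export MASTER_ADDR={master node IP}

FORCE_TORCHRUN=1 NNODES=2 RANK=1 MASTER_ADDR=$MASTER_ADDR MASTER_PORT=29500 \
llamafactory-cli train /path/to/config.yaml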
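The two launch commands differ only in RANK; every node must use the same MASTER_ADDR (the master's IP) and MASTER_PORT. As a rough sketch of that invariant, the hypothetical helper below assembles the command string for a given rank. It uses only the environment variables shown above, and `10.0.0.1` is a placeholder address rather than a value from this project.

```python
import shlex


def launch_command(rank: int, master_addr: str, nnodes: int = 2,
                   master_port: int = 29500,
                   config: str = "/path/to/config.yaml") -> str:
    """Build the shell command for one node of a multi-node run."""
    if not 0 <= rank < nnodes:
        raise ValueError(f"rank must be in [0, {nnodes}), got {rank}")
    env = {
        "FORCE_TORCHRUN": "1",
        "NNODES": str(nnodes),
        "RANK": str(rank),            # the only value that differs per node
        "MASTER_ADDR": master_addr,   # always the master node's IP
        "MASTER_PORT": str(master_port),
    }
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"{prefix} llamafactory-cli train {shlex.quote(config)}"


# Placeholder master IP; replace it with the address found on node 0.
for rank in range(2):
    print(launch_command(rank, "10.0.0.1"))
```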

📖 Inference

MCP Servers Deployment

The MCP Servers supported by MiniCPM4-MCP include Airbnb, Amap-Maps, Arxiv-MCP-Server, Calculator, Computer-Control-MCP, Desktop-commander, Filesystem, Github, Gaode, MCP-Code-Executor, MCP-DOCx, PPT, PPTx, Simple-Time-Server, Slack, and Whisper. Follow the instructions provided in each server's repository for successful deployment. Note that not all tools in these servers will function properly in every environment. Some tools are unstable and may return errors such as timeouts or HTTP errors. During training data construction, tools with consistently high failure rates (e.g., those for which the LLM fails to produce a successful query even after hundreds of attempts) are filtered out.

MCP Client Setup

We modified the existing MCP Client from the mcp-cli repository to enable interaction between MiniCPM and MCP Servers.
After the MCP Client performs a handshake with a Server, it retrieves a list of available tools. An example of tool information contained in this list is provided in available_tool_example.json.

Once the available tools and the user query are obtained, results can be generated with the following script:

python generate_example.py \
--tokenizer_path {path to MiniCPM4 tokenizer} \
--base_url {vllm deployment URL} \
--model {model name used in vllm deployment} \
--output_path {path to save results}

MiniCPM4 generates tool calls in the following format:

    <|tool_call_start|>
    ```python 
    read_file(path="/path/to/file")
    ```
    <|tool_call_end|>

You can build a custom parser for MiniCPM4 tool calls based on this format. The relevant parsing logic is located in generate_example.py.
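As an illustration of what such a custom parser might look like, here is a minimal sketch. It is not the logic from generate_example.py: `parse_tool_calls` and `TOOL_CALL_RE` are hypothetical names, and the sketch assumes each call is a plain Python call expression with keyword arguments whose values are literals. Arguments are read with `ast.literal_eval`, so the model output is never executed.

```python
import ast
import re

# Matches a fenced python block between the tool-call markers.
# `{3} stands for the three backticks of the code fence.
TOOL_CALL_RE = re.compile(
    r"<\|tool_call_start\|>\s*`{3}python\s*(.*?)`{3}\s*<\|tool_call_end\|>",
    re.DOTALL,
)


def parse_tool_calls(text: str) -> list[dict]:
    """Extract tool calls as {"name": ..., "arguments": {...}} dicts."""
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        tree = ast.parse(block.strip())
        for stmt in tree.body:
            # Only bare call expressions such as read_file(path="...") are handled.
            if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
                continue
            call = stmt.value
            # Keyword arguments only; positional arguments are ignored in this sketch.
            arguments = {
                kw.arg: ast.literal_eval(kw.value)
                for kw in call.keywords
                if kw.arg is not None
            }
            calls.append({"name": ast.unparse(call.func), "arguments": arguments})
    return calls


fence = "`" * 3  # built at runtime to avoid nesting code fences in this README
sample = (
    "<|tool_call_start|>\n"
    f"{fence}python\n"
    'read_file(path="/path/to/file")\n'
    f"{fence}\n"
    "<|tool_call_end|>"
)
print(parse_tool_calls(sample))
```

A production parser would also need to decide how to handle malformed output, for example by catching `SyntaxError` and `ValueError` and treating the turn as a plain-text reply.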

Since the mcp-cli repository supports the vLLM inference framework, MiniCPM4-MCP can also be integrated into mcp-cli by modifying vLLM accordingly.
Specifically, follow the instructions in this link to enable interaction between a client running the MiniCPM4-MCP model and the MCP Server.

📈 Evaluation

Once generation is complete, run the following example evaluation script:

python eval_scripts.py \
--input_path {path where the results generated by `generate` are saved}

This script evaluates how accurately the model predicts function names in single-turn tool calls. In multi-turn scenarios, the tool call generated at the current step can be evaluated by providing the ground-truth information from the previous steps, so the evaluation logic for each step is the same as in the single-turn setting.
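The results table reports three columns per model: func, param, and value. The sketch below shows one plausible way to compute such scores for parsed calls. The definitions are an assumption, not taken from eval_scripts.py: func is counted as a matching function name, param as additionally matching argument names, and value as additionally matching argument values. `score_call` and `accuracy` are hypothetical names.

```python
def score_call(pred: dict, gold: dict) -> dict:
    """Compare one predicted call with its ground truth (assumed metric definitions)."""
    func = pred["name"] == gold["name"]
    # Assumption: parameter names only count when the function name is right.
    param = func and set(pred["arguments"]) == set(gold["arguments"])
    # Assumption: values only count when names and parameters are right.
    value = param and pred["arguments"] == gold["arguments"]
    return {"func": func, "param": param, "value": value}


def accuracy(pairs: list[tuple[dict, dict]]) -> dict:
    """Return percentage accuracy per metric over (prediction, gold) pairs."""
    if not pairs:
        raise ValueError("no examples to score")
    scores = [score_call(pred, gold) for pred, gold in pairs]
    return {
        key: 100.0 * sum(s[key] for s in scores) / len(scores)
        for key in ("func", "param", "value")
    }


gold = {"name": "read_file", "arguments": {"path": "/path/to/file"}}
pairs = [
    ({"name": "read_file", "arguments": {"path": "/path/to/file"}}, gold),
    ({"name": "read_file", "arguments": {"path": "/other/file"}}, gold),
]
print(accuracy(pairs))
```

For multi-turn data, the same scoring would be applied at each step, with the earlier steps taken from the ground truth as described above.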

Evaluation Results

| MCP Server | gpt-4o func | gpt-4o param | gpt-4o value | qwen3 func | qwen3 param | qwen3 value | minicpm4 func | minicpm4 param | minicpm4 value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Airbnb | 89.3 | 67.9 | 53.6 | 92.8 | 60.7 | 50.0 | 96.4 | 67.9 | 50.0 |
| Amap-Maps | 79.8 | 77.5 | 50.0 | 74.4 | 72.0 | 41.0 | 89.3 | 85.7 | 39.9 |
| Arxiv-MCP-Server | 85.7 | 85.7 | 85.7 | 81.8 | 54.5 | 50.0 | 57.1 | 57.1 | 52.4 |
| Calculator | 100.0 | 100.0 | 20.0 | 80.0 | 80.0 | 13.3 | 100.0 | 100.0 | 6.67 |
| Computer-Control-MCP | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 86.7 |
| Desktop-Commander | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| Filesystem | 63.5 | 63.5 | 31.3 | 69.7 | 69.7 | 26.0 | 83.3 | 83.3 | 42.7 |
| Github | 92.0 | 80.0 | 58.0 | 80.5 | 50.0 | 27.7 | 62.8 | 25.7 | 17.1 |
| Gaode | 71.1 | 55.6 | 17.8 | 68.8 | 46.6 | 24.4 | 68.9 | 46.7 | 15.6 |
| MCP-Code-Executor | 85.0 | 80.0 | 70.0 | 80.0 | 80.0 | 70.0 | 90.0 | 90.0 | 65.0 |
| MCP-Docx | 95.8 | 86.7 | 67.1 | 94.9 | 81.6 | 60.1 | 95.1 | 86.6 | 76.1 |
| PPT | 72.6 | 49.8 | 40.9 | 85.9 | 50.7 | 37.5 | 91.2 | 72.1 | 56.7 |
| PPTx | 64.2 | 53.7 | 13.4 | 91.0 | 68.6 | 20.9 | 91.0 | 58.2 | 26.9 |
| Simple-Time-Server | 90.0 | 70.0 | 70.0 | 90.0 | 90.0 | 90.0 | 90.0 | 60.0 | 60.0 |
| Slack | 100.0 | 90.0 | 70.0 | 100.0 | 100.0 | 65.0 | 100.0 | 100.0 | 100.0 |
| Whisper | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 | 30.0 |
| Average | 80.2 | 70.2 | 49.1 | 83.5 | 67.7 | 43.8 | 88.3 | 76.1 | 51.2 |
