Commit 24435cb

Merge pull request #31 from MiroMindAI/binwang_dev
feat(agent): restructure project to miroflow package with v1.6 enhancements
2 parents f55fc1e + 57debe6 commit 24435cb

184 files changed: 2361 additions, 1050 deletions

.env.template

Lines changed: 72 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -1,23 +1,79 @@
11

2-
# Must have for minimal agent get started
3-
OPENROUTER_API_KEY=xxxx
2+
# ============================================================
3+
# MiroFlow Environment Configuration Template
4+
# ============================================================
5+
# Copy this file to .env and fill in the values.
6+
# Lines starting with # are optional or have defaults.
7+
# ============================================================
48

5-
SERPER_API_KEY=xxxx
6-
JINA_API_KEY=xxxx
7-
E2B_API_KEY=xxxx
89

9-
OPENAI_API_KEY=xxxx
10-
OPENAI_BASE_URL=xxxx
10+
# ------ Core LLM (OpenAI-compatible, required) ------
11+
OPENAI_API_KEY=
12+
OPENAI_BASE_URL=
1113

12-
OAI_MIROTHINKER_BASE_URL=xxxx
13-
OAI_MIROTHINKER_API_KEY=xxxx
14+
# ------ MiroThinker ------
15+
OAI_MIROTHINKER_BASE_URL=
16+
OAI_MIROTHINKER_API_KEY=
1417

15-
SUMMARY_LLM_BASE_URL=xxxx
16-
SUMMARY_LLM_API_KEY=xxxx
17-
SUMMARY_LLM_MODEL_NAME=xxxx
18+
# ------ Summary LLM (used by jina_scrape) ------
19+
SUMMARY_LLM_BASE_URL=
20+
SUMMARY_LLM_API_KEY=
21+
SUMMARY_LLM_MODEL_NAME=
1822

19-
HF_TOKEN=xxxx
23+
# ------ Search: Serper (used by searching/reading MCP servers) ------
24+
SERPER_API_KEY=
25+
# SERPER_BASE_URL= # Optional: override default Serper endpoint
2026

21-
# TencentCloud credentials for Sogou search (used by serper_sogou_search tool)
22-
TENCENTCLOUD_SECRET_ID=xxxx
23-
TENCENTCLOUD_SECRET_KEY=xxxx
27+
# ------ Search: Jina (used by searching/reading MCP servers) ------
28+
JINA_API_KEY=
29+
# JINA_BASE_URL= # Optional: override default Jina endpoint
30+
31+
# ------ Search: TencentCloud Sogou (used by serper_sogou_search) ------
32+
# TENCENTCLOUD_SECRET_ID=
33+
# TENCENTCLOUD_SECRET_KEY=
34+
35+
# ------ Code Sandbox: E2B ------
36+
E2B_API_KEY=
37+
38+
# ------ Vision MCP Server ------
39+
# Supports multiple providers; enable the ones you need.
40+
# ENABLE_OPENAI_VISION=true # Uses OPENAI_API_KEY / OPENAI_BASE_URL above
41+
# OPENAI_MODEL_NAME= # Model name for OpenAI vision
42+
# ENABLE_CLAUDE_VISION=true
43+
# ANTHROPIC_API_KEY=
44+
# ANTHROPIC_BASE_URL=
45+
# ANTHROPIC_MODEL_NAME=
46+
# GEMINI_API_KEY=
47+
# GEMINI_MODEL_NAME=
48+
49+
# ------ Vision MCP Server (Open-Source alternative) ------
50+
# VISION_API_KEY=
51+
# VISION_BASE_URL=
52+
# VISION_MODEL_NAME=
53+
54+
# ------ Reasoning MCP Server ------
55+
# Uses OPENAI / ANTHROPIC keys above by default.
56+
# OPENAI_MODEL_NAME is shared with vision.
57+
58+
# ------ Reasoning MCP Server (Open-Source alternative) ------
59+
# REASONING_API_KEY=
60+
# REASONING_BASE_URL=
61+
# REASONING_MODEL_NAME=
62+
63+
# ------ Audio MCP Server ------
64+
# Uses OPENAI_API_KEY / OPENAI_BASE_URL above by default.
65+
# OPENAI_AUDIO_MODEL_NAME=
66+
# OPENAI_TRANSCRIPTION_MODEL_NAME=
67+
68+
# ------ Audio MCP Server (Open-Source alternative) ------
69+
# WHISPER_API_KEY=
70+
# WHISPER_BASE_URL=
71+
# WHISPER_MODEL_NAME=
72+
73+
# ------ Hugging Face ------
74+
# HF_TOKEN= # Optional: for downloading benchmark datasets
75+
76+
# ------ Web App ------
77+
# MIROFLOW_HOST=0.0.0.0
78+
# MIROFLOW_PORT=8000
79+
# MIROFLOW_DEBUG=false
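
One way to sanity-check a filled-in copy of this template is to list the uncommented keys that are still empty. The sketch below is a minimal illustration and not part of MiroFlow: `parse_env` and `unfilled_keys` are hypothetical helpers, and the parser assumes only the simple `KEY=value  # comment` line shape used in the template (no quoting, no `export`).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; commented-out and blank lines are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: an inline comment starts at the first " #" after the value.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def unfilled_keys(text: str) -> list[str]:
    """Return uncommented keys whose value is still empty, in file order."""
    return [key for key, value in parse_env(text).items() if not value]


# Hypothetical sample in the same shape as the template above.
sample = (
    "# ------ Core LLM ------\n"
    "OPENAI_API_KEY=\n"
    "OPENAI_BASE_URL=https://example.invalid/v1\n"
    "# HF_TOKEN=   # Optional\n"
    "SERPER_API_KEY=abc # inline note\n"
)
print(unfilled_keys(sample))
```

To check a real file, pass its contents instead, for example `unfilled_keys(Path(".env").read_text())` with `from pathlib import Path`.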

.github/workflows/run-ruff.yml

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ jobs:
2626
uses: astral-sh/setup-uv@v5
2727

2828
- name: Install dependencies
29-
run: uv sync
29+
run: uv sync --extra dev
3030

3131
- name: Check static error
3232
run: |

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -136,6 +136,7 @@ celerybeat.pid
136136

137137
# Environments
138138
.env
139+
.env.*
139140
.envrc
140141
.venv
141142
env/
@@ -229,4 +230,4 @@ web_app/uploads/
229230
.vscode/
230231
.ruff_cache/
231232
.env
232-
.env.local
233+
.env.*
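
The `.env.*` pattern is broader than the `.env.local` entry it replaces: it also matches `.env.template`. Git's ignore rules do not apply to files that are already tracked, so a committed template is unaffected, but any new `.env.<suffix>` file will be ignored unless it is force-added or excluded with a negation rule. The sketch below approximates the matching with Python's `fnmatch`, which agrees with gitignore only for simple, slash-free patterns like these; `is_ignored` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Environment-related patterns from the .gitignore hunks above.
PATTERNS = [".env", ".env.*", ".envrc"]


def is_ignored(name: str) -> bool:
    """Approximate gitignore matching for a bare file name (no directories)."""
    return any(fnmatchcase(name, pattern) for pattern in PATTERNS)


for name in (".env", ".env.local", ".env.template", ".envrc", "env.example"):
    status = "matched" if is_ignored(name) else "not matched"
    print(f"{name}: {status}")
```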

README.md

Lines changed: 69 additions & 11 deletions
@@ -1,7 +1,7 @@
11
<div align="center">
22
<img src="docs/mkdocs/docs/assets/miroflow_logo.png" width="45%" alt="MiroFlow" />
33

4-
<h3>Open-Source Research Agent Framework with State-of-the-Art Performance</h3>
4+
<h3>Performance-First Agent Framework That Makes Any Model Better</h3>
55

66
[![DEMO](https://img.shields.io/badge/Demo-FFB300?style=for-the-badge&logo=airplayvideo&logoColor=white)](https://dr.miromind.ai/)
77
[![MODELS](https://img.shields.io/badge/Models-5EDDD2?style=for-the-badge&logo=huggingface&logoColor=ffffff&labelColor)](https://huggingface.co/miromind-ai)
@@ -13,8 +13,8 @@
1313
</div>
1414

1515
<div align="center">
16-
<strong>MiroFlow</strong> is an open-source research agent framework that achieves <strong>#1 ranking</strong> across representative benchmarks (FutureX, GAIA, HLE, xBench-DeepSearch, BrowseComp).<br>
17-
It powers <a href="https://github.com/MiroMindAI/mirothinker">MiroThinker</a>, our open-source agent foundation model with native tool-assisted reasoning.
16+
<strong>MiroFlow</strong> is the open-source agent framework that maximizes any model's agent performance — and proves it across 9+ benchmarks with reproducible results.<br>
17+
Plug in GPT-5, Claude, <a href="https://github.com/MiroMindAI/mirothinker">MiroThinker</a>, Kimi, DeepSeek, or any OpenAI-compatible model. Same tools. Same environment. Better results.
1818
</div>
1919

2020
<br>
@@ -34,22 +34,61 @@ It powers <a href="https://github.com/MiroMindAI/mirothinker">MiroThinker</a>, o
3434

3535
- **[2025-09-15]**: **MiroFlow v0.3**: Enhanced codebase architecture and significantly improved benchmark performance, boosting GPT-5's prediction accuracy for future events by 11%. MiroFlow now ranks #1 in the future prediction benchmark. See [FutureX](https://futurex-ai.github.io/).
3636
- **[2025-08-27]**: **MiroFlow v0.2**: Achieves state-of-the-art performance across [multiple agentic benchmarks](https://miromind.ai), including HLE (27.2%), HLE-Text-Only (29.5%), BrowseComp-EN (33.2%), BrowseComp-ZH (47.1%), and xBench-DeepSearch (72.0%).
37-
- **[2025-08-26]**: Released [GAIA Validation Trace](docs/public_trace.md) (73.94% pass@1) and [Gradio Demo](https://github.com/MiroMindAI/MiroThinker/tree/main/apps/gradio-demo) for local deployment.
37+
- **[2025-08-26]**: Released GAIA Validation Trace (73.94% pass@1) and [Gradio Demo](https://github.com/MiroMindAI/MiroThinker/tree/main/apps/gradio-demo) for local deployment.
3838
- **[2025-08-08]**: **MiroFlow v0.1**: Complete open-source release of the research agent framework.
3939

4040
</details>
4141

4242
---
4343

44-
## Highlights
44+
## Architecture
4545

46-
- **Reproducible State-of-the-Art Performance**: #1 ranking across [multiple representative agentic benchmarks](https://miromindai.github.io/miroflow/evaluation_overview/), including FutureX, GAIA, HLE, xBench-DeepSearch, and BrowseComp.
47-
- **High Concurrency & Reliability**: Robust concurrency management and fault-tolerant design for handling rate-limited APIs and unstable networks.
48-
- **Cost-Effective Deployment**: Run a research agent service on a single RTX 4090 with the open-source [MiroThinker](https://github.com/MiroMindAI/mirothinker) model and free tools.
46+
<div align="center">
47+
<img src="docs/mkdocs/docs/assets/miroflow_architecture_v1.6.png" width="100%" alt="MiroFlow Architecture" />
48+
</div>
4949

5050
---
5151

52-
## Performance on Benchmarks
52+
## Why MiroFlow
53+
54+
### Make Any Model Better
55+
- **Model-Agnostic Performance**: Plug in any LLM — GPT-5, Claude, MiroThinker, Kimi K2.5, DeepSeek — and get better agent performance through smart rollback, iterative reasoning, and optimized tool orchestration.
56+
- **#1 Across 9+ Benchmarks**: Reproducible state-of-the-art on FutureX, GAIA, HLE, xBench-DeepSearch, BrowseComp, and more.
57+
- **One-Line Model Switching**: Change `provider_class` and `model_name` in YAML. Same tools, same prompts, same environment.
58+
59+
### Prove It
60+
- **Standardized Evaluation**: Fair model comparison with identical infrastructure. The framework is the constant; the model is the variable.
61+
- **Automated Multi-Run Evaluation**: Parallel runs with statistical aggregation (mean, std dev, min/max). Every result reproducible from config to score.
62+
63+
### Build With It
64+
- **Skill System**: Define agent skills via `SKILL.md` — no code changes needed.
65+
- **Agent Graph**: Compose multi-agent workflows with hierarchical graphs.
66+
- **Web Application**: FastAPI + React interface out of the box.
67+
- **Plugin Architecture**: `@register` decorator — extend without touching core code.
68+
- **Zero-Code Prompts**: YAML + Jinja2 templates.
69+
- **Cost-Effective**: Single RTX 4090 with open-source [MiroThinker](https://github.com/MiroMindAI/mirothinker).
70+
71+
---
72+
73+
## Any Model, Better Results
74+
75+
### Cross-Model Performance (MiroFlow Framework)
76+
77+
| Benchmark | MiroThinker | Claude 3.7 Sonnet | Kimi K2.5 |
78+
|-----------|-------------|-------------------|-----------|
79+
| GAIA Validation (165) | **82.4%** | 73.9% | |
80+
| GAIA Text-Only (103) | **79.6%** | | 52.4% |
81+
| HLE | **27.2%** | | |
82+
| HLE Text-Only | **29.5%** | | |
83+
| BrowseComp-EN | 33.2% | | |
84+
| BrowseComp-ZH | **47.1%** | | |
85+
| xBench-DeepSearch | **72.0%** | | |
86+
| FutureX | **#1** | | |
87+
88+
> All results use the same MiroFlow tools, prompts, and infrastructure. The only variable is the model.
89+
> See the full [Model Comparison](https://miromindai.github.io/miroflow/model_comparison/) for details.
90+
91+
### Featured Results: MiroThinker
5392

5493
<div align="center">
5594
<img width="100%" alt="MiroThinker Performance" src="docs/mkdocs/docs/assets/mirothinker.png" />
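
The README text in this hunk describes multi-run evaluation with statistical aggregation (mean, std dev, min/max). A minimal sketch of that kind of summary follows. It is not MiroFlow's implementation: `aggregate` is a hypothetical helper, the scores are made-up placeholders, and the use of the sample standard deviation is an assumption, since the diff does not say which variant the framework reports.

```python
from statistics import mean, stdev


def aggregate(scores: list[float]) -> dict[str, float]:
    """Summarize per-run accuracy scores from repeated evaluation runs."""
    if not scores:
        raise ValueError("at least one run is required")
    return {
        "runs": len(scores),
        "mean": mean(scores),
        # Sample standard deviation; undefined for a single run, so report 0.0.
        "std": stdev(scores) if len(scores) > 1 else 0.0,
        "min": min(scores),
        "max": max(scores),
    }


# Placeholder scores for three hypothetical runs (not real benchmark results).
summary = aggregate([0.70, 0.74, 0.72])
print(summary)
```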
@@ -59,7 +98,7 @@ It powers <a href="https://github.com/MiroMindAI/mirothinker">MiroThinker</a>, o
5998
<img width="100%" alt="BrowseComp MiroThinker Performance" src="docs/mkdocs/docs/assets/bc-mirothinker.png" />
6099
</div>
61100

62-
Follow our detailed guides to reproduce benchmark results in our [Benchmarks Documentation](https://miromindai.github.io/miroflow/evaluation_overview/).
101+
Follow our detailed guides to reproduce any result in our [Benchmarks Documentation](https://miromindai.github.io/miroflow/evaluation_overview/).
63102

64103
---
65104

@@ -83,6 +122,25 @@ bash scripts/test_single_task.sh \
83122

84123
Expected output: `\boxed{Congo Democratic Republic}`
85124

125+
**Switch models in one line** — same tools, same prompts, different LLM:
126+
127+
```yaml
128+
# GPT-5
129+
llm:
130+
provider_class: GPT5OpenAIClient
131+
model_name: gpt-5
132+
133+
# Claude 3.7 Sonnet
134+
llm:
135+
provider_class: ClaudeAnthropicClient
136+
model_name: claude-3-7-sonnet-20250219
137+
138+
# MiroThinker (open-source, self-hosted)
139+
llm:
140+
provider_class: MiroThinkerSGLangClient
141+
model_name: mirothinker-v1.5
142+
```
143+
86144
See [full documentation](https://miromindai.github.io/miroflow/quickstart/) for web app setup, more examples, and configuration options.
87145
88146
---
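
The model-switching snippet in this hunk changes only `provider_class` and `model_name`. The sketch below shows the same idea on an in-memory config: everything except the `llm` block is carried over unchanged. `switch_model` is a hypothetical helper, and the top-level `llm` key follows the README snippet; the real config files in this commit nest `llm` under `main_agent`.

```python
from copy import deepcopy


def switch_model(config: dict, provider_class: str, model_name: str) -> dict:
    """Return a copy of config that differs only in the llm provider and model."""
    updated = deepcopy(config)  # leave the caller's config untouched
    llm = updated.setdefault("llm", {})
    llm["provider_class"] = provider_class
    llm["model_name"] = model_name
    return updated


# Illustrative config using names that appear in the README snippet.
base = {
    "llm": {"provider_class": "GPT5OpenAIClient", "model_name": "gpt-5"},
    "tools": ["config/tool/tool-reading.yaml"],
}
claude = switch_model(base, "ClaudeAnthropicClient", "claude-3-7-sonnet-20250219")
print(claude["llm"], claude["tools"] == base["tools"])
```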
@@ -104,7 +162,7 @@ If you find our work helpful, please consider citing:
104162
**MiroFlow** (Framework)
105163
```bibtex
106164
@misc{2026miroflow,
107-
title={MiroFlow: A High-Performance Open-Source Research Agent Framework},
165+
title={MiroFlow: A Performance-First Agent Framework for Any Model},
108166
author={MiroMind AI Team},
109167
howpublished={\url{https://github.com/MiroMindAI/miroflow}},
110168
year={2026}

config/agent_quickstart.yaml

Lines changed: 4 additions & 4 deletions
@@ -21,13 +21,13 @@ main_agent:
2121
max_tokens: 128000
2222
reasoning_effort: medium
2323

24-
prompt: config/prompts/standard_prompt_main_agent.yaml
24+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
2525

2626
tools:
2727
- config/tool/tool-reading.yaml
28-
# - config/tool/tool-python.yaml
29-
# - config/tool/tool-search-and-scrape-webpage.yaml
30-
# - config/tool/tool-jina-scrape-llm-summary.yaml
28+
# - config/tool/tool-code-sandbox.yaml
29+
# - config/tool/tool-serper-search.yaml
30+
# - config/tool/tool-jina-scrape.yaml
3131
#- config/tool/tool-code.yaml
3232
#- config/tool/tool-image-video.yaml
3333
# - config/tool/tool-audio.yaml # Uncomment for audio processing

config/agent_quickstart_graph.yaml

Lines changed: 6 additions & 6 deletions
@@ -21,7 +21,7 @@ main_agent:
2121
max_tokens: 128000
2222
reasoning_effort: medium
2323

24-
prompt: config/prompts/standard_prompt_main_agent.yaml
24+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
2525

2626
tools: null
2727

@@ -76,13 +76,13 @@ agent-subagent-3:
7676
_base_: config/llm/base_mirothinker.yaml
7777
prompt: config/prompts/prompt_sub_agent.yaml
7878
tools:
79-
- config/tool/tool-python.yaml
80-
- config/tool/tool-search-and-scrape-webpage.yaml
81-
- config/tool/tool-jina-scrape-llm-summary.yaml
79+
- config/tool/tool-code-sandbox.yaml
80+
- config/tool/tool-serper-search.yaml
81+
- config/tool/tool-jina-scrape.yaml
8282
tool_blacklist:
83-
- server: "tool-search-and-scrape-webpage"
83+
- server: "tool-serper-search"
8484
tool: "sogou_search"
85-
- server: "tool-python"
85+
- server: "tool-code-sandbox"
8686
tool: "download_file_from_sandbox_to_local"
8787
input_processor:
8888
- ${input-message-generator}
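
The `tool_blacklist` entries above pair a server name with a tool name, which is why the blacklist had to be updated together with the renamed tool servers. A minimal sketch of how such `(server, tool)` pairs could be applied is shown below. `apply_blacklist` is a hypothetical helper, not MiroFlow's code, and the tool names `google_search` and `run_python_code` are invented for illustration; only `sogou_search` and `download_file_from_sandbox_to_local` come from the config.

```python
def apply_blacklist(
    available: dict[str, list[str]],
    blacklist: list[dict[str, str]],
) -> dict[str, list[str]]:
    """Drop every (server, tool) pair listed in the blacklist."""
    blocked = {(entry["server"], entry["tool"]) for entry in blacklist}
    return {
        server: [tool for tool in tools if (server, tool) not in blocked]
        for server, tools in available.items()
    }


# Blacklist entries as they appear after this commit.
blacklist = [
    {"server": "tool-serper-search", "tool": "sogou_search"},
    {"server": "tool-code-sandbox", "tool": "download_file_from_sandbox_to_local"},
]
# Hypothetical tool inventory per server.
available = {
    "tool-serper-search": ["google_search", "sogou_search"],
    "tool-code-sandbox": ["run_python_code", "download_file_from_sandbox_to_local"],
}
filtered = apply_blacklist(available, blacklist)
print(filtered)
```

Because the match is on the exact server name, a blacklist entry that still says `tool-python` would silently stop filtering anything after the rename.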

config/agent_quickstart_skill.yaml

Lines changed: 3 additions & 3 deletions
@@ -26,13 +26,13 @@ main_agent:
2626
max_tokens: 128000
2727
reasoning_effort: medium
2828

29-
prompt: config/prompts/standard_prompt_main_agent.yaml
29+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
3030

3131
tools:
32-
- config/tool/tool-python.yaml
32+
- config/tool/tool-code-sandbox.yaml
3333

3434
skills:
35-
- src/skill/skills/simple_file_understanding
35+
- miroflow/skill/skills/simple_file_understanding
3636

3737
input_processor:
3838
- ${input-message-generator}

config/agent_single-test.yaml

Lines changed: 2 additions & 2 deletions
@@ -21,8 +21,8 @@ main_agent:
2121
- config/tool/tool-audio.yaml
2222
# - config/tool/tool-reasoning.yaml
2323
skills:
24-
- src/skill/skills/Today_feeling
25-
- src/skill/skills/Afternoon_feeling
24+
- miroflow/skill/skills/Today_feeling
25+
- miroflow/skill/skills/Afternoon_feeling
2626
input_processor:
2727
- ${input-hint-generator}
2828
- ${input-message-generator}
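
Local configs written against the old layout will reference paths that this commit renames. The sketch below applies the renames as plain string substitutions. The mapping is inferred from the old and new lines in the hunks above rather than taken from an official migration guide, and `migrate_config_text` is a hypothetical helper.

```python
# Old -> new names, inferred from the removed/added line pairs in this diff.
RENAMES = [
    ("standard_prompt_main_agent.yaml", "prompt_main_agent_benchmark.yaml"),
    ("tool-search-and-scrape-webpage", "tool-serper-search"),
    ("tool-jina-scrape-llm-summary", "tool-jina-scrape"),
    ("tool-python", "tool-code-sandbox"),
    ("src/skill/skills/", "miroflow/skill/skills/"),
]


def migrate_config_text(text: str) -> str:
    """Rewrite old prompt, tool, and skill references to their new names."""
    for old, new in RENAMES:
        text = text.replace(old, new)
    return text


old_config = """\
prompt: config/prompts/standard_prompt_main_agent.yaml
tools:
  - config/tool/tool-python.yaml
  - config/tool/tool-jina-scrape-llm-summary.yaml
tool_blacklist:
  - server: "tool-search-and-scrape-webpage"
    tool: "sogou_search"
skills:
  - src/skill/skills/Today_feeling
"""
print(migrate_config_text(old_config))
```

None of the new names contains an old name as a substring, so running the function twice leaves the text unchanged.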

config/standard_browsecomp-en-200_mirothinker.yaml renamed to config/benchmark_browsecomp-en-200_mirothinker.yaml

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ main_agent:
1111
max_turns: 400
1212
llm:
1313
_base_: config/llm/base_mirothinker.yaml
14-
prompt: config/prompts/standard_prompt_main_agent.yaml
14+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
1515
tools:
1616
- config/tool/tool-code-sandbox.yaml
1717
- config/tool/tool-serper-search.yaml
@@ -36,7 +36,7 @@ output-boxed-extractor:
3636
type: RegexBoxedExtractor
3737
output-exceed-max-turn-summary:
3838
type: ExceedMaxTurnSummaryGenerator
39-
prompt: config/prompts/standard_prompt_main_agent.yaml
39+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
4040
llm:
4141
_base_: config/llm/base_mirothinker.yaml
4242

config/standard_browsecomp-en_mirothinker.yaml renamed to config/benchmark_browsecomp-en_mirothinker.yaml

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ main_agent:
1111
max_turns: 400
1212
llm:
1313
_base_: config/llm/base_mirothinker.yaml
14-
prompt: config/prompts/standard_prompt_main_agent.yaml
14+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
1515
tools:
1616
- config/tool/tool-code-sandbox.yaml
1717
- config/tool/tool-serper-search.yaml
@@ -36,7 +36,7 @@ output-boxed-extractor:
3636
type: RegexBoxedExtractor
3737
output-exceed-max-turn-summary:
3838
type: ExceedMaxTurnSummaryGenerator
39-
prompt: config/prompts/standard_prompt_main_agent.yaml
39+
prompt: config/prompts/prompt_main_agent_benchmark.yaml
4040
llm:
4141
_base_: config/llm/base_mirothinker.yaml
4242
