- News & Updates
- Introduction
- Performance on Benchmarks
- Getting Started
- MiroThinker
- License & Support
- 2025-08-27: MiroFlow v0.2 - Achieves SOTA performance across multiple agentic benchmarks. Highlights include HLE 27.2%, HLE-Text-Only 29.5%, BrowseComp-EN 33.2%, BrowseComp-ZH 47.1%, and xBench-DeepSearch 72.0%.
- 2025-08-26: GAIA Validation trace released (73.94% pass@1) and Gradio demo released for local deployment.
- 2025-08-08: MiroFlow v0.1 - Framework, model, and data are now fully open-sourced!
MiroFlow is a fully open-source agent framework designed to reliably complete complex tool-use tasks. Our ecosystem includes the following key components:
- Reproducible SOTA Performance: MiroFlow consistently achieves 72.2% pass@1 (avg@3) on the GAIA benchmark. Follow our detailed guide to reproduce our released GAIA traces and verify the results.
- Advanced Data Collection: Our framework features sophisticated data collection capabilities that generate high-quality, post-training agent trace data. We've open-sourced extensive datasets through MiroVerse.
- Open Source Models: We provide fully open-sourced models that you can deploy locally and fine-tune for your specific needs. Explore our model collection at MiroThinker.
- Comprehensive Training Framework: We've open-sourced our complete SFT and DPO training recipes, available at MiroTrain.
- Reinforcement Learning Framework: Our RL training exploration and methodologies are fully available through MiroRL.
We evaluate MiroFlow on a range of benchmarks, including GAIA, HLE, BrowseComp, and xBench-DeepSearch. Meanwhile, we are working on adding more benchmarks.
| Model/Framework | GAIA Val | HLE | HLE-Text | BrowseComp-EN | BrowseComp-ZH | xBench-DeepSearch |
|---|---|---|---|---|---|---|
| MiroFlow | 82.4% | 27.2% | 29.5% | 33.2% | 47.1% | 72.0% |
| OpenAI Deep Research | 67.4% | 26.6% | - | 51.5% | 42.9% | - |
| Gemini Deep Research | - | 26.9% | - | - | - | 50+% |
| Kimi Researcher | - | - | 26.9% | - | - | 69.0% |
| WebSailor-72B | 55.4% | - | - | - | 30.1% | 55.0% |
| Manus | 73.3% | - | - | - | - | - |
| DeepSeek v3.1 | - | - | 29.8% | - | - | 71.2% |
MiroFlow achieved 81.8% pass@3, 82.4% maj. vote, 74.5% pass@1 (best@3), and 72.2% pass@1 (avg@3) on the GAIA validation set. This represents state-of-the-art (SOTA) performance among open-source agent frameworks.
Note
Our pass@1 scores are reported as both the average across three runs (avg@3) and the best score among those runs (best@3). For most other reported pass@1 results, it is unclear whether they represent an average or a best score across multiple trials (indicated with *).
To prevent agents from retrieving answers directly from Hugging Face, we disabled access to it during inference and trace collection.
We have evaluated multiple agent frameworks on GAIA. Please note that some reported results may be overstated, lack clear definitions, or not be reproducible. In contrast, reproducing MiroFlow's results is straightforward and requires only a few API keys.
MiroFlow is a high-performance, modular framework for building intelligent AI agents that achieve state-of-the-art results on complex benchmarks. It features multi-turn conversation capabilities, comprehensive tool integration, and hierarchical sub-agent support for superior task completion.
More information on our agent workflow is available in the documentation.
Tip
We recommend using uv with Python >= 3.12.
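If uv is not installed yet, one common way to get it is the standalone installer from the uv documentation (Linux/macOS; see the uv docs for Windows or pip-based installs):

```bash
# Install uv via its standalone installer
curl -LsSf https://astral.sh/uv/install.sh | sh
```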
## clone the repo
git clone https://github.com/MiroMindAI/MiroFlow
cd MiroFlow/apps/run-agent
## prepare python environment
uv sync
## copy the environment variable template and prepare your own .env file
cd MiroFlow/apps/prepare-benchmark
cp .env.template .env
# Edit .env with your actual API keys
Edit `.env` to configure environment variables:
# For downloading datasets from Hugging Face
HF_TOKEN="<your-huggingface-token>"
# [Optional] Data loading directory, by default `../../data`
DATA_DIR="../../data" # relative to this file
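Optionally, you can sanity-check the token before downloading anything. A minimal sketch, assuming `curl` is available (the `whoami-v2` endpoint is part of the Hugging Face Hub API):

```bash
# Export the variables from .env into the current shell, then query the Hub
set -a; source .env; set +a
# A valid token prints your account info; an invalid one returns an error
curl -s -H "Authorization: Bearer $HF_TOKEN" https://huggingface.co/api/whoami-v2
```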
## copy the environment variable template and prepare your own .env file
cd MiroFlow/apps/run-agent
cp .env.template .env
# Edit .env with your actual API keys
Edit `.env` to configure environment variables:
# OpenRouter, for the primary agent model
OPENROUTER_API_KEY=""
OPENROUTER_BASE_URL="https://openrouter.ai/api/v1"
# Anthropic, for vision tools
ANTHROPIC_API_KEY=""
ANTHROPIC_BASE_URL="https://api.anthropic.com"
# OpenAI, for audio tools, intent recognition, and answer extraction
OPENAI_API_KEY=""
OPENAI_BASE_URL="https://api.openai.com/v1"
# Gemini, for YouTube tasks
GEMINI_API_KEY=""
# Third party API keys
# For Google search and website scraping
SERPER_API_KEY=""
# For website scraping
JINA_API_KEY=""
# For the Linux sandbox
E2B_API_KEY=""
# [Optional] NewAPI, alternative to OpenRouter
NEWAPI_API_KEY=""
NEWAPI_BASE_URL=""
# [Optional] for network proxy, null by default
HTTPS_PROXY=""
# [Optional] Data loading directory, by default `../../data`
DATA_DIR="../../data"
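Before launching any runs, it can be worth confirming that no required key was left empty. A minimal sketch; the variable list simply mirrors the template above, so adjust it to your setup:

```bash
# Warn about any required key that is still empty in .env
for key in OPENROUTER_API_KEY ANTHROPIC_API_KEY OPENAI_API_KEY GEMINI_API_KEY SERPER_API_KEY JINA_API_KEY E2B_API_KEY; do
  grep -q "^${key}=\"\"" .env && echo "WARNING: ${key} is not set"
done
```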
Note
If you wish to use a different LLM as the primary agent model, you will need to provide the corresponding API keys.
To achieve our best benchmark results, we recommend using a pre-defined sandbox template that includes the most commonly used Python and apt packages. Please see our installation guide for detailed instructions.
If you prefer not to use a sandbox template, you can disable it by commenting out the line `template=DEFAULT_TEMPLATE_ID,` in `libs/miroflow-tool/src/miroflow/tool/mcp_servers/python_server.py` (line 145).
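As a rough sketch, the same edit can be made with `sed` (verify that the pattern still matches the current file before running; a `.bak` backup is kept):

```bash
# Comment out the sandbox template line in place
sed -i.bak 's/^\([[:space:]]*\)template=DEFAULT_TEMPLATE_ID,/\1# template=DEFAULT_TEMPLATE_ID,/' \
  libs/miroflow-tool/src/miroflow/tool/mcp_servers/python_server.py
```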
## run a task with instruction
cd MiroFlow/apps/run-agent
uv run main.py trace --task="your task description" --task_file_name="path to related task file"
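For example, a hypothetical invocation that asks the agent about a local file (the task text and file path below are placeholders, not files shipped with the repository):

```bash
# Hypothetical example; replace the task description and file path with your own
uv run main.py trace \
  --task="Summarize the main findings of the attached report in three bullet points" \
  --task_file_name="../../data/example/report.pdf"
```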
Prepare datasets according to your requirements. Some datasets may need to be downloaded manually into the `/data/<benchmark>` folder, and you should also create a corresponding `standardized_data.jsonl` metafile. We will add support for more datasets over time.
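The exact schema of `standardized_data.jsonl` is defined by the prepare-benchmark code, so the sketch below only illustrates the general shape (one JSON object per task, one object per line) with hypothetical field names; check the files generated for the supported benchmarks for the authoritative fields:

```bash
# Hypothetical entry; field names are illustrative, not the authoritative schema
mkdir -p ../../data/my-benchmark
cat > ../../data/my-benchmark/standardized_data.jsonl << 'EOF'
{"task_id": "task-0001", "task_question": "What year was the attached paper published?", "ground_truth": "2019", "file_path": "files/task-0001.pdf"}
EOF
```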
## supported benchmarks
cd MiroFlow/apps/prepare-benchmark
uv run main.py get gaia-val
uv run main.py get browsecomp-test
uv run main.py get browsecomp-zh-test
uv run main.py get hle
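After a download finishes, the dataset should appear under the data directory. The path below assumes the default `DATA_DIR` and that the folder is named after the benchmark:

```bash
# Expect the task files plus a standardized_data.jsonl metafile
ls ../../data/gaia-val/
```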
Run evaluation using the default settings. (Not parallelized; not recommended.)
## run the code
cd MiroFlow/apps/run-agent
uv run main.py common-benchmark benchmark=gaia-validation
uv run main.py common-benchmark benchmark=browsecomp
uv run main.py common-benchmark benchmark=browsecomp-zh
uv run main.py common-benchmark benchmark=hle
For parallel and multi-run evaluations, and to gain better control over environment settings using Hydra, we recommend using the provided script:
cd MiroFlow/apps/run-agent
bash ./scripts/main-worker-dual/run_evaluate_multiple_runs_gaia-validation.sh
bash ./scripts/main-worker-dual/run_evaluate_multiple_runs_browsecomp.sh
bash ./scripts/main-worker-dual/run_evaluate_multiple_runs_browsecomp-zh.sh
bash ./scripts/main-worker-dual/run_evaluate_multiple_runs_hle.sh
You can easily modify and customize these scripts to suit your needs. See Customized Configuration for more details.
MiroFlow leverages Hydra for powerful configuration management, allowing you to easily switch between different LLMs, agents, benchmarks, and pricing models using YAML configuration files. For detailed instructions on configuration management, see our configuration guide.
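With Hydra, individual settings can also be overridden directly on the command line instead of editing the YAML files. The overrides below are purely illustrative; the actual config groups and key names are defined in the repository's configuration files:

```bash
# Illustrative Hydra-style overrides; check the repo's config files for the real group and key names
uv run main.py common-benchmark benchmark=gaia-validation output_dir=logs/my-run max_turns=20
```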
This project is licensed under the Apache License 2.0 - see the LICENSE file for details. Some components may have different licenses as specified in their respective file headers.
- Benchmark Contributors for the comprehensive evaluation datasets
- Open Source Community for the tools and libraries that make this possible
- Issues: For questions or bug reports, please use GitHub Issues.
- FAQ Documentation: See faq.md for additional guidelines
@misc{2025mirothinker,
  title={MiroFlow: An Open-Source Agentic Framework for Deep Research},
  author={MiroMind AI Team},
  howpublished={\url{https://github.com/MiroMindAI/MiroFlow}},
  year={2025}
}