diff --git a/README.md b/README.md
index 617acc16..dbfd9e98 100644
--- a/README.md
+++ b/README.md
@@ -1,174 +1,115 @@
-
-MiroFlow: A Consistent Agent Framework with Reproducible Performance
-
+
+

+
+
-
-Try our demo with MiroThinker here!
-
+## 📋 Table of Contents
-## π Table of Contents
-
-- [π― Overview](#-overview)
-- [β¨ MiroFlow SOTA Performance](#-miroflow-sota-performance)
-- [π€ MiroFlow: Modular AI Agent Framework](#-miroflow-modular-ai-agent-framework)
- - [Workflow Overview](#workflow-overview)
- - [Architecture Components](#architecture-components)
- - [Core System π»](#core-system-)
- - [Tool Integration π§](#tool-integration-)
- - [Agent System π·](#agent-system-)
- - [Support Systems βοΈ](#support-systems-οΈ)
+- [📰 News & Updates](#-news--updates)
+- [📖 Introduction](#-introduction)
+- [✨ Performance on Benchmarks](#-performance-on-benchmarks)
- [π Getting Started](#-getting-started)
- - [Prerequisites](#prerequisites)
- - [Runing a single task](#runing-a-single-task)
- - [Evaluate on Benchmark](#evaluate-on-benchmark)
- - [[Optional] Customized Configuration](#optional-customized-configuration)
- [π MiroThinker](#-mirothinker)
-- [β FAQ](#-faq)
-- [π Join Our Communities!](#-join-our-communities)
+- [📄 License](#-license)
+- [🙏 Acknowledgments](#-acknowledgments)
+- [📧 Support](#-support)
-# π― Overview
-
+## 📰 News & Updates
-**MiroFlow** is a **battle-tested** agent framework that reliably completes complex tool-use tasks. We have extensively used it to generate high-quality, post-training agent trace data for **[MiroThinker](https://huggingface.co/collections/miromind-ai/mirothinker-v01-689301b6d0563321862d44a1)**, our suite of open-source agentic models. Some key features are:
+- **2025-08-26**: **New Open-Source SOTA Results** - MiroFlow has achieved state-of-the-art or highly competitive performance across multiple agentic benchmarks: **HLE 27.2%**, **HLE-Text-Only 29.5%**, **BrowseComp-EN 33.2%**, **BrowseComp-ZH 44.3%**, and **xBench-DeepSearch 72.0%**. For comprehensive analysis and detailed results, please see our [Blog](https://miromind.ai/blog/miroflow).
+- **2025-08-25**: **GAIA-Validation Trace Release** - We have released comprehensive MiroFlow execution traces achieving an overall accuracy of 73.94% (pass@1) on the GAIA validation benchmark. This is the best reproducible result we are aware of to date. Explore the traces at [Trace-GAIA-Validation](apps/public-trace/gaia-validation).
+- **2025-08-22**: **Lightweight Deployment** - Introducing streamlined deployment options for MiroThinker models with optimized resource usage and faster startup times. Try the interactive demo: [Gradio Demo](https://github.com/MiroMindAI/MiroThinker/tree/main/apps/gradio-demo).
+- **2025-08-08**: **MiroFlow v0.1 Released** - Framework, model, and data are now fully open-sourced!
-- π **Reproducible SOTA**: **MiroFlow** consistently achieves 72.2% (pass@1 average@3) on GAIA validation set. Follow our [getting-started guide](#get-start) below, or view our many runs of gaia trace on huggingfaces. If you can't reproduce our result, please open a Github issue - We take reproducibility seriously.
-- π **High Concurrency and Fault Tolerance**: **MiroFlow** scales data collection efficiently and handles rate-limited APIs and unstable network connections with ease.
-- π **Baked-in observability and evaluation**: **MiroFlow** ships with scripts for benchmarking agents and a straightforward web-ui for visualizing and debugging agent trace data.
-# β¨ MiroFlow SOTA Performance
+
+

+
-MiroFlow, equipped with Claude Sonnet 3.7 as its primary LLM, **achieved 81.8% pass@3, 82.4% maj. vote, 74.5% pass@1 (best@3), and 72.2% pass@1 (avg@3) on the GAIA validation set**. This represents **state-of-the-art (SOTA) performance** among open-source agent frameworks.
-
-> [!NOTE]
-> Our pass@1 scores are reported as both the average across three runs (avg@3) and the best score among those runs (best@3). For most other reported pass@1 results, it is unclear whether they represent an average or a best score across multiple trials (indicated with *).
+## 📖 Introduction
-To prevent agents from retrieving answers directly from Hugging Face, we disabled access to it during the inference and trace collection.
+
-*We have evaluated multiple agent frameworks on GAIA. Please note that some reported results may be overstated or lack clear definitions, and are not reproducible.*
-In contrast, reproducing MiroFlow's results is straightforward with just a few required API keys.
-
-# π€ MiroFlow: Modular AI Agent Framework
-MiroFlow is a sophisticated, modular framework for building intelligent AI agents with multi-turn conversation capabilities, comprehensive tool integration, and hierarchical sub-agent support.
+**MiroFlow** is a fully open-sourced agent framework that reliably completes complex tool-use tasks. Some key features are:
-
-
-## Workflow Overview
-
-MiroFlow handles user queries through a multi-stage and agentic process designed for flexibility and depth. The workflow is organized as follows:
-
-1. **Intent Recognition & Query Augmentation**
- LLMs analyze user input to detect intent and refine the query.
-
-2. **Planning & Task Orchestration**
- The main agent drafts an execution plan, invokes tools, and coordinates sub-agents.
-
-3. **Delegation to Sub-Agents**
- Specialized agents (e.g., agent-browsing) handle complex or domain-specific tasks. Sub-agents independently plan, act, and execute tool calls as needed.
-
-4. **Tool Access via MCP Servers**
- When external capabilities are required, agents leverage specialized tools by connecting to MCP (Model Context Protocol) servers.
-
-5. **Result Synthesis & Output Alignment**
- After task completion, a dedicated summary process synthesizes results, ensuring the output is high-quality and aligned with user instructions (or benchmark formats).
+- **Reproducible SOTA**: **MiroFlow** consistently achieves 72.2% (pass@1, avg@3) on the GAIA validation set. Follow our [getting-started guide](#-getting-started) below, or view our many GAIA trace runs on Hugging Face. If you can't reproduce our results, please open a GitHub issue; we take reproducibility seriously.
+- **High-Quality Data Collection**: We equipped the agent workflow with data collection features to generate high-quality, post-training agent trace data. We have also released part of this data and our models to the public, including [MiroThinker](https://huggingface.co/collections/miromind-ai/mirothinker-v01-689301b6d0563321862d44a1) and [MiroVerse](https://huggingface.co/datasets/miromind-ai/MiroVerse-v0.1).
+- **High Concurrency and Fault Tolerance**: **MiroFlow** scales data collection efficiently and handles rate-limited APIs and unstable network connections with ease.
+- **Baked-in Observability and Evaluation**: **MiroFlow** ships with scripts for benchmarking agents and a straightforward web UI for visualizing and debugging agent trace data.
-## Architecture Components
-All core components are located in the `MiroFlow/libs/` directory.
+## ✨ Performance on Benchmarks
-```
-MiroFlow/libs/
-βββ miroflow/
-β βββ src/miroflow/
-β βββ prebuilt/
-β β βββ pipeline.py # Pipeline: coordinates task execution
-β β βββ orchestrator.py # Orchestrator: manages LLM β tool flow
-β β βββ config/ # Hydra configs for agents, LLMs, pricing
-β βββ llm/
-β β βββ client.py # Unified LLM client
-β βββ utils/
-β β βββ io_utils.py # Output formatting utilities
-β β βββ prompt_utils.py # Prompt definitions for agents
-β β βββ tool_utils.py # Tool configuration helpers
-β βββ logging/ # Task logging & metrics
-β
-βββ miroflow-tool/
-β βββ src/miroflow/tool/
-β βββ manager.py # Tool Manager: MCP server connector
-β βββ mcp_servers/ # Individual MCP tool servers
-β βββ python_server.py # Code execution
-β βββ vision_mcp_server.py # Visual perception
-β βββ searching_mcp_server.py # Web search & retrieval
-β βββ audio_mcp_server.py # Audio transcription
-β βββ reasoning_mcp_server.py # Enhanced reasoning
-β βββ reading_mcp_server.py # Document processing
-```
+
+

+
-
+We benchmark MiroFlow on a series of agentic benchmarks, including **GAIA**, **HLE**, **BrowseComp**, and **xBench-DeepSearch**. Meanwhile, we are working on more benchmarks.
-### Core System π»
+| Model/Framework | GAIA Val | HLE | HLE-Text | BrowseComp-EN | BrowseComp-ZH | xBench-DeepSearch |
+|----------------|----------|-----|----------|----------------|----------------|-------------------|
+| **MiroFlow** | **82.4%** | **27.2%** | **29.5%** | 33.2% | **44.3%** | **72.0%** |
+| OpenAI Deep Research | 67.4% | 26.6% | - | **51.5%** | 42.9% | - |
+| Gemini Deep Research | - | 26.9% | - | - | - | 50+% |
+| Kimi Researcher | - | - | 26.9% | - | - | 69.0% |
+| WebSailor-72B | 55.4% | - | - | - | 30.1% | 55.0% |
+| Manus | 73.3% | - | - | - | - | - |
-- **Pipeline** (`./miroflow/src/miroflow/prebuilt/pipeline.py`): Main entry point that creates and manages all components, handles error recovery, and returns final results
-- **Orchestrator** (`./miroflow/src/miroflow/prebuilt/orchestrator.py`): Manages multi-turn conversations, parses tool calls, executes tools, and delegates to sub-agents
-- **LLM Client** (`./miroflow/src/miroflow/llm/client.py`): Unified interface supporting Anthropic, OpenAI, Google, Qwen, DeepSeek, and local deployments
+### GAIA-Validation
-### Tool Integration π§
+
-- **Tool Manager** (`./miroflow-tool/src/miroflow/tool/manager.py`) : Comprehensive MCP server connection manager with tool discovery, persistent connections, and error handling
+MiroFlow **achieved 81.8% pass@3, 82.4% maj. vote, 74.5% pass@1 (best@3), and 72.2% pass@1 (avg@3) on the GAIA validation set**. This represents **state-of-the-art (SOTA) performance** among open-source agent frameworks.
-- **MCP Servers** (`./miroflow-tool/src/miroflow/tool/mcp_servers/`) : Individual tool implementations built on FastMCP. Provides extensive capabilities including:
- - Code execution and analysis (`./python_server.py`)
- - Visual perception (`./vision_mcp_server.py`)
- - Web search and content retrieval (`./searching_mcp_server.py`)
- - Audio transcription (`./audio_mcp_server.py`)
- - Enhanced reasoning capabilities (`./reasoning_mcp_server.py`)
- - Document processing and analysis (`./reading_mcp_server.py`)
+> [!NOTE]
+> Our pass@1 scores are reported as both the average across three runs (avg@3) and the best score among those runs (best@3). For most other reported pass@1 results, it is unclear whether they represent an average or a best score across multiple trials (indicated with *).
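
To make these reporting conventions concrete, here is a small illustrative scoring script (the run data is made up; this is not our evaluation code, and majority vote is simplified to vote on per-task correctness rather than on answer strings):

```python
from statistics import mean

def gaia_metrics(runs):
    """Aggregate k runs over the same tasks.

    `runs` is a list of k lists of booleans: runs[i][j] is True when
    run i answered task j correctly.
    """
    n_tasks = len(runs[0])
    per_run = [sum(r) / n_tasks for r in runs]
    by_task = list(zip(*runs))  # one tuple of k outcomes per task
    return {
        "pass@1 (avg@k)": mean(per_run),           # average accuracy over runs
        "pass@1 (best@k)": max(per_run),           # best single run
        "pass@k": mean(any(t) for t in by_task),   # solved by at least one run
        "maj. vote": mean(sum(t) > len(runs) / 2 for t in by_task),
    }

# 3 hypothetical runs over 4 tasks
runs = [
    [True, True, False, True],
    [True, False, False, True],
    [True, True, True, False],
]
print(gaia_metrics(runs))
```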
-### Agent System π·
+To prevent agents from retrieving answers directly from Hugging Face, we disabled access to it during inference and trace collection.
-**Sub-Agents**
-Specialized agents designed for specific domains (e.g., `agent-browsing` for web navigation). Each sub-agent maintains dedicated tool sets and custom prompts, allowing the main agent to delegate tasks requiring specialized expertise. Agent definitions are managed through configuration files with prompts and descriptions customized in `./miroflow/src/miroflow/utils/prompt_utils.py` and `tool_utils.py`.
+*We have evaluated multiple agent frameworks on GAIA. Please note that some reported results may be overstated or lack clear definitions, and are not reproducible.*
+In contrast, reproducing MiroFlow's results is straightforward with just a few required API keys.
-### Support Systems βοΈ
-- **Configuration System** (`./miroflow/src/miroflow/prebuilt/config/`) : Hydra-powered YAML configuration for agents, LLMs, benchmarks, and pricing
+# 🤖 MiroFlow: Modular AI Agent Framework
-- **Output Formatter** (`./miroflow/src/miroflow/utils/io_utils.py`) : Intelligent response formatting that adapts to various benchmark requirements
+MiroFlow is a high-performance, modular framework for building intelligent AI agents that achieve state-of-the-art results on complex benchmarks. It features multi-turn conversation capabilities, comprehensive tool integration, and hierarchical sub-agent support for superior task completion.
-- **Task Logger** (`./miroflow/src/miroflow/logging/`) : Comprehensive logging for agent interactions, tool executions, and performance metrics
+
+

+
-### Execution Pipeline Data Flow
+More details on our agent workflow are available in [docs/workflow.md](docs/workflow.md).
-
# π Getting Started
-## Prerequisites
+### Prerequisites
> [!TIP]
> we recommend using [`uv`](https://docs.astral.sh/uv/) with `python>= 3.12`
-**Step 1:** Clone repo and prepare python environment:
+### Step 1: Clone repo and prepare python environment
```bash
## clone the repo
@@ -179,9 +120,10 @@ cd MiroFlow/apps/run-agent
uv sync
```
-**Step 2:** Set up environment dependencies:
+### Step 2: Set up environment variables
+
+#### a. Set up `MiroFlow/apps/prepare-benchmark/.env`
-a. Set up `MiroFlow/apps/prepare-benchmark/.env` by:
```bash
## copy environment variable template and prepare yours in .env file
cd MiroFlow/apps/prepare-benchmark
@@ -189,8 +131,10 @@ cd MiroFlow/apps/prepare-benchmark
# Edit .env with your actual API keys
cp .env.template .env
```
-Edit `.env` to configure environment variables:
-```
+
+Edit `.env` to configure environment variables:
+
+```env
# For downloading datasets from Hugging Face
HF_TOKEN=""
@@ -198,7 +142,8 @@ HF_TOKEN=""
DATA_DIR="../../data" # relative to this file
```
-b. Set up `MiroFlow/apps/run-agent/.env` by:
+#### b. Set up `MiroFlow/apps/run-agent/.env`
+
```bash
## copy environment variable template and prepare yours in .env file
cd MiroFlow/apps/run-agent
@@ -206,8 +151,10 @@ cd MiroFlow/apps/run-agent
# Edit .env with your actual API keys
cp .env.template .env
```
-Edit `.env` to configure environment variables:
-```
+
+Edit `.env` to configure environment variables:
+
+```env
# Using OpenRouter to provide primary agent model
OPENROUTER_API_KEY=""
OPENROUTER_BASE_URL="https://openrouter.ai/api/v1"
@@ -241,51 +188,14 @@ HTTPS_PROXY=""
DATA_DIR="../../data"
```
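
Before launching a long run, it can help to fail fast on missing keys. A minimal sketch (the required list below is an assumption; adjust it to the providers you actually configured):

```python
import os

# Names taken from the .env templates above; extend for other providers.
REQUIRED = ["OPENROUTER_API_KEY", "HF_TOKEN"]

def missing_keys(env=None):
    """Return names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

print(missing_keys({"OPENROUTER_API_KEY": "sk-...", "HF_TOKEN": ""}))  # -> ['HF_TOKEN']
```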
-If you wish to use a different LLM as the primary agent model, you will need to provide the corresponding API keys.
-
-
-**Step 3:** Prepare E2B Sandbox (Optional)
+> [!NOTE]
+> If you wish to use a different LLM as the primary agent model, you will need to provide the corresponding API keys.
> [!TIP]
-> We provide a public E2B sandbox template. Follow this step if you want to reproduce.
->
-> For the E2B sandbox service, we recommend setting up a Linux Docker image with a comprehensive set of apt and Python packages pre-installed. Without these pre-installed packages, the agent will need to spend extra steps and context installing them, resulting in reduced token efficiency.
->
-> you need to have `npm` install and `docker` running locally.
+> **Optional Local E2B Sandbox**: If you prefer to use a local E2B Sandbox installation instead of the online service, please refer to our prepared [installation guide](docs/local_e2b.md).
-1. Install `e2b` command line and login:
-
-```shell
-## install e2b
-npm install -g @e2b/cli
-## check that it is available
-which e2b
-```
-
-2. Download our pre-configured Dockerfile:
-[e2b.Dockerfile](https://github.com/MiroMindAI/MiroFlow/blob/main/docs/e2b.Dockerfile).
-
-```shell
-wget https://github.com/MiroMindAI/MiroFlow/blob/main/docs/e2b.Dockerfile
-```
-
-3. Run `e2b template build` command [check official doc here](https://e2b.dev/docs/sdk-reference/cli/v1.0.2/template), use `all_pip_apt_pkg` as the name of template.
-
-```shell
-## build the template with `docker build` locally
-E2B_ACCESS_TOKEN=${your-token}
-e2b template build -c "/root/.jupyter/start-up.sh" -n "all_pip_apt_pkg" -d ./e2b.Dockerfile
-## check that template is built successfully
-E2B_ACCESS_TOKEN=${your-token} e2b template list
-```
-
-For additional information, please see the [E2B Docker documentation](https://e2b.dev/docs/sandbox-template).
-
-
-## Runing a single task
-
-Run a single task:
+### Run a single task
```bash
## run a task with instruction
@@ -293,9 +203,7 @@ cd MiroFlow/apps/run-agent
uv run main.py trace --task="your task description" --task_file_name="path to related task file"
```
-## Evaluate on Benchmark
-
-Run prebuilt agent on the benchmark data:
+### Evaluate on Benchmark
```bash
## download data
@@ -313,127 +221,45 @@ cd MiroFlow/apps/run-agent
bash scripts/claude-sonnet-3.7/run_evaluate_multiple_runs_gaia-validation.sh
```
-## [Optional] Customized Configuration
+### Customized Configuration
-MiroFlow uses [Hydra](https://hydra.cc/) for flexible configuration management, supporting different setups for LLMs, agents, benchmarks, and pricing models.
+MiroFlow leverages [Hydra](https://hydra.cc/) for powerful configuration management, allowing you to easily switch between different LLMs, agents, benchmarks, and pricing models using YAML configuration files. For detailed instructions on configuration management, see our [configuration guide](docs/hydra_config.md).
-## Structure
-
-```
-MiroFlow/libs/miroflow/src/miroflow/prebuilt/config
-βββ config.yaml # Main configuration with defaults
-βββ agent/ # Agent configurations (tools, limits)
-βββ benchmark/ # Benchmark configurations (datasets, execution)
-βββ llm/ # Language model configurations (providers, models)
-```
-
-## Usage
-
-Run with default configuration:
-```bash
-cd MiroFlow/apps/run-agent
-uv run main.py common-benchmark
-```
-
-Default configuration is defined in
-`MiroFlow/libs/miroflow/src/miroflow/prebuilt/config/config.yaml`:
-
-```yaml
-# conf/config.yaml
-defaults:
- - llm: claude_openrouter
- - agent: miroflow
- - benchmark: gaia-validation
- - pricing: _default
-
-# Other configurations...
-```
-
-| Component | Default Value | File Path |
-|------------|----------------------|---------------------------------------------------------------------------|
-| LLM | `claude_openrouter` | `libs/miroflow/src/miroflow/prebuilt/config/llm/claude_openrouter.yaml` |
-| Agent | `miroflow` | `libs/miroflow/src/miroflow/prebuilt/config/agent/miroflow.yaml` |
-| Benchmark | `gaia-validation` | `libs/miroflow/src/miroflow/prebuilt/config/benchmark/gaia-validation.yaml` |
-
-
-## Override Configurations
-
-### Component Override
-Switch between existing configurations using the filename (without `.yaml`):
-```bash
-uv run main.py common-benchmark llm= agent= benchmark=
-```
-
-For example, if you have `conf/llm/claude_openrouter.yaml`, use `llm=claude_openrouter`
-
-
-### Parameter Override
-Override specific parameters:
-```bash
-cd MiroFlow/apps/run-agent
-uv run main.py common-benchmark llm.temperature=0.1 agent.main_agent.max_turns=30
-```
-
-## Create Custom Configurations
-
-1. **Create new config file** in the appropriate subdirectory (e.g., `conf/llm/my_config.yaml`)
-2. **Inherit from defaults** using Hydra's composition:
- ```yaml
- defaults:
- - _default # Inherit base configuration
- - _self_ # Allow self-overrides
-
- # Your custom parameters
- parameter: value
- ```
-3. **Use your config**: `uv run main.py common-benchmark component=my_config`
-
-
-# π MiroThinker
+## 🚀 MiroThinker
[MiroThinker](https://github.com/MiroMindAI/MiroThinker) (7B/14B/32B) is our suite of open-source agentic models, designed to work seamlessly with the MiroFlow framework. Our models are specifically built to handle **complex, multi-tool tasks**, leveraging the reproducible and robust foundation that MiroFlow provides.
By combining MiroFlowβs reliable orchestration with MiroThinkerβs advanced reasoning capabilities, we offer a powerful, end-to-end solution for building high-performing, reproducible AI agents.
These models are a direct result of our extensive data collection efforts, utilizing MiroFlow to generate high-quality, post-training agent trace data. This unique approach enables MiroThinker to excel in planning, executing, and reasoning through complex multi-step tasks.
-We invite the community to explore and build upon these models. For more details on the architecture and implementation, please take a look at our codebase.
-# β FAQ
+## 📄 License
-**Q: What is the estimated cost of running the GAIA validation set for a single run?**
-**A**: The cost is approximately **$450 USD** for a run without a cache. Enabling the cache can significantly reduce this cost by 50-67%, bringing it down to the **$150 - $225** range.
+This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
+## 🙏 Acknowledgments
-**Q: How long does it take to run the GAIA validation set for a single run?**
-**A**: With the `max_concurrent` parameter set to 20, a full run takes about **5 hours** to complete.
+- **Benchmark Contributors** for the comprehensive evaluation datasets
+- **Open Source Community** for the tools and libraries that make this possible
-**Q: Are all the specified APIs required?**
-**A**: **Yes.** To fully reproduce our published results, access to all the listed APIs is necessary.
+## 📧 Support
-**Q: What is the difference between MiroFlow and MiroThinker?**
-**A**: **MiroFlow** is primarily focused on interacting with proprietary models; **MiroThinker** is designed for our own open-source models.
+- Issues: For questions or bug reports, please use [GitHub Issues](https://github.com/MiroMindAI/MiroFlow/issues).
+- FAQ: See [faq.md](docs/faq.md) for additional guidance.
-We plan to merge these two projects in the future to create a single, unified platform.
-## π Join Our Communities!
+
+

+
-- Follow us on social media for timely updates!
- - [X - MiroMindAI](https://x.com/miromind_ai)
- - [RedNote - MiroMind](https://www.xiaohongshu.com/user/profile/663098830000000003033edc)
-- Join our communities:
- - [Discord server](https://discord.gg/GPqEnkzQZd)
- -
- WeChat Group
-
-
-
WeChat Bot QR Code
-

-
-
-
WeChat Group QR Code
-

-
-
-
+### References
+```bibtex
+@misc{2025miroflow,
+  title={MiroFlow: An Open-Source Agentic Framework for Deep Research},
+  author={MiroMind AI Team},
+  howpublished={\url{https://github.com/MiroMindAI/MiroFlow}},
+  year={2025}
+}
+```
diff --git a/docs/faq.md b/docs/faq.md
new file mode 100644
index 00000000..4f8c27bf
--- /dev/null
+++ b/docs/faq.md
@@ -0,0 +1,15 @@
+**Q: What is the estimated cost of running the GAIA validation set for a single run?**
+**A**: The cost is approximately **$450 USD** for a run without a cache. Enabling the cache can significantly reduce this cost by 50-67%, bringing it down to the **$150 - $225** range.
+
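+
The quoted range follows directly from the stated reduction; as a quick sanity check:

```python
base_cost = 450  # USD, single uncached run
# 50% and 67% savings from caching, the range quoted above
cached = [round(base_cost * (1 - saving)) for saving in (0.50, 0.67)]
print(cached)  # cached run cost at each end of the range, in USD
```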
+
+**Q: How long does it take to run the GAIA validation set for a single run?**
+**A**: With the `max_concurrent` parameter set to 20, a full run takes about **5 hours** to complete.
+
+**Q: Are all the specified APIs required?**
+**A**: **Yes.** To fully reproduce our published results, access to all the listed APIs is necessary.
+
+
+**Q: What is the difference between MiroFlow and MiroThinker?**
+**A**: **MiroFlow** is primarily focused on interacting with proprietary models; **MiroThinker** is designed for our own open-source models.
+
+We plan to merge these two projects in the future to create a single, unified platform.
\ No newline at end of file
diff --git a/docs/figs/09xyHJV9dkbY2yacsv4zYTBbKM.avif b/docs/figs/09xyHJV9dkbY2yacsv4zYTBbKM.avif
new file mode 100644
index 00000000..0b44e3ae
Binary files /dev/null and b/docs/figs/09xyHJV9dkbY2yacsv4zYTBbKM.avif differ
diff --git a/docs/figs/MiroFlow_logo.png b/docs/figs/MiroFlow_logo.png
new file mode 100644
index 00000000..5847b8d0
Binary files /dev/null and b/docs/figs/MiroFlow_logo.png differ
diff --git a/docs/figs/gaia_score.png b/docs/figs/gaia_score.png
index 145e8cb8..d7a9978c 100644
Binary files a/docs/figs/gaia_score.png and b/docs/figs/gaia_score.png differ
diff --git a/docs/figs/logo.png b/docs/figs/logo.png
index ea384dd3..b60b56b5 100644
Binary files a/docs/figs/logo.png and b/docs/figs/logo.png differ
diff --git a/docs/figs/miroflow_architecture.png b/docs/figs/miroflow_architecture.png
index bc032cc1..e5c3cf2f 100644
Binary files a/docs/figs/miroflow_architecture.png and b/docs/figs/miroflow_architecture.png differ
diff --git a/docs/figs/wechat-bot-qr-code.jpg b/docs/figs/wechat-bot-qr-code.jpg
deleted file mode 100644
index 52079868..00000000
Binary files a/docs/figs/wechat-bot-qr-code.jpg and /dev/null differ
diff --git a/docs/figs/wechat-group-qr-code.jpg b/docs/figs/wechat-group-qr-code.jpg
deleted file mode 100644
index 83bcaff3..00000000
Binary files a/docs/figs/wechat-group-qr-code.jpg and /dev/null differ
diff --git a/docs/hydra_config.md b/docs/hydra_config.md
new file mode 100644
index 00000000..457f2a61
--- /dev/null
+++ b/docs/hydra_config.md
@@ -0,0 +1,71 @@
+
+### Structure
+
+```
+MiroFlow/libs/miroflow/src/miroflow/prebuilt/config
+├── config.yaml   # Main configuration with defaults
+├── agent/        # Agent configurations (tools, limits)
+├── benchmark/    # Benchmark configurations (datasets, execution)
+└── llm/          # Language model configurations (providers, models)
+```
+
+### Usage
+
+Run with default configuration:
+```bash
+cd MiroFlow/apps/run-agent
+uv run main.py common-benchmark
+```
+
+Default configuration is defined in
+`MiroFlow/libs/miroflow/src/miroflow/prebuilt/config/config.yaml`:
+
+```yaml
+# conf/config.yaml
+defaults:
+ - llm: claude_openrouter
+ - agent: miroflow
+ - benchmark: gaia-validation
+ - pricing: _default
+
+# Other configurations...
+```
+
+| Component | Default Value | File Path |
+|------------|----------------------|---------------------------------------------------------------------------|
+| LLM | `claude_openrouter` | `libs/miroflow/src/miroflow/prebuilt/config/llm/claude_openrouter.yaml` |
+| Agent | `miroflow` | `libs/miroflow/src/miroflow/prebuilt/config/agent/miroflow.yaml` |
+| Benchmark | `gaia-validation` | `libs/miroflow/src/miroflow/prebuilt/config/benchmark/gaia-validation.yaml` |
+
+
+### Override Configurations
+
+#### Component Override
+Switch between existing configurations using the filename (without `.yaml`):
+```bash
+uv run main.py common-benchmark llm=<llm_config> agent=<agent_config> benchmark=<benchmark_config>
+```
+
+For example, if you have `conf/llm/claude_openrouter.yaml`, use `llm=claude_openrouter`.
+
+
+#### Parameter Override
+Override specific parameters:
+```bash
+cd MiroFlow/apps/run-agent
+uv run main.py common-benchmark llm.temperature=0.1 agent.main_agent.max_turns=30
+```
+
+### Create Custom Configurations
+
+1. **Create new config file** in the appropriate subdirectory (e.g., `conf/llm/my_config.yaml`)
+2. **Inherit from defaults** using Hydra's composition:
+ ```yaml
+ defaults:
+ - _default # Inherit base configuration
+ - _self_ # Allow self-overrides
+
+ # Your custom parameters
+ parameter: value
+ ```
+3. **Use your config**: `uv run main.py common-benchmark component=my_config`
diff --git a/docs/local_e2b.md b/docs/local_e2b.md
new file mode 100644
index 00000000..a1657059
--- /dev/null
+++ b/docs/local_e2b.md
@@ -0,0 +1,39 @@
+
+# Prepare E2B Sandbox (Optional)
+
+> [!TIP]
+> We provide a public E2B sandbox template. Follow these steps if you want to reproduce our results.
+>
+> For the E2B sandbox service, we recommend setting up a Linux Docker image with a comprehensive set of apt and Python packages pre-installed. Without these pre-installed packages, the agent will need to spend extra steps and context installing them, resulting in reduced token efficiency.
+>
+> You need to have `npm` installed and Docker running locally.
+
+
+1. Install the `e2b` command-line tool and log in:
+
+```shell
+## install e2b
+npm install -g @e2b/cli
+## check that it is available
+which e2b
+```
+
+2. Download our pre-configured Dockerfile:
+[e2b.Dockerfile](https://github.com/MiroMindAI/MiroFlow/blob/main/docs/e2b.Dockerfile).
+
+```shell
+wget https://github.com/MiroMindAI/MiroFlow/blob/main/docs/e2b.Dockerfile
+```
+
+3. Run the `e2b template build` command ([official docs](https://e2b.dev/docs/sdk-reference/cli/v1.0.2/template)), using `all_pip_apt_pkg` as the template name.
+
+```shell
+## build the template with `docker build` locally
+E2B_ACCESS_TOKEN=${your-token}
+e2b template build -c "/root/.jupyter/start-up.sh" -n "all_pip_apt_pkg" -d ./e2b.Dockerfile
+## check that template is built successfully
+E2B_ACCESS_TOKEN=${your-token} e2b template list
+```
+
+For additional information, please see the [E2B Docker documentation](https://e2b.dev/docs/sandbox-template).
+
diff --git a/docs/workflow.md b/docs/workflow.md
new file mode 100644
index 00000000..717a981b
--- /dev/null
+++ b/docs/workflow.md
@@ -0,0 +1,90 @@
+
+## Workflow Overview
+
+MiroFlow handles user queries through a multi-stage and agentic process designed for flexibility and depth. The workflow is organized as follows:
+
+1. **Intent Recognition & Query Augmentation**
+ LLMs analyze user input to detect intent and refine the query.
+
+2. **Planning & Task Orchestration**
+ The main agent drafts an execution plan, invokes tools, and coordinates sub-agents.
+
+3. **Delegation to Sub-Agents**
+ Specialized agents (e.g., agent-browsing) handle complex or domain-specific tasks. Sub-agents independently plan, act, and execute tool calls as needed.
+
+4. **Tool Access via MCP Servers**
+ When external capabilities are required, agents leverage specialized tools by connecting to MCP (Model Context Protocol) servers.
+
+5. **Result Synthesis & Output Alignment**
+ After task completion, a dedicated summary process synthesizes results, ensuring the output is high-quality and aligned with user instructions (or benchmark formats).
+
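A toy sketch of this control flow (all class, tool, and method names here are hypothetical, not MiroFlow's actual API):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    sub_agents: dict[str, "Agent"] = field(default_factory=dict)

    def run(self, query: str) -> str:
        # 1-2. intent recognition + planning, stubbed as a single string here
        steps = [f"plan({query})"]
        # 3. delegate domain-specific work to sub-agents
        for sub in self.sub_agents.values():
            steps.append(sub.run(query))
        # 4. call tools (MCP servers in the real framework)
        for tool_name, tool in self.tools.items():
            steps.append(f"{tool_name}: {tool(query)}")
        # 5. synthesize the final answer from all intermediate results
        return " | ".join(steps)

main_agent = Agent(
    name="main",
    tools={"search": lambda q: f"results for {q!r}"},
    sub_agents={"agent-browsing": Agent(name="agent-browsing")},
)
print(main_agent.run("find the paper"))
```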
+## Architecture Components
+
+All core components are located in the `MiroFlow/libs/` directory.
+
+```
+MiroFlow/libs/
+├── miroflow/
+│   └── src/miroflow/
+│       ├── prebuilt/
+│       │   ├── pipeline.py              # Pipeline: coordinates task execution
+│       │   ├── orchestrator.py          # Orchestrator: manages LLM ↔ tool flow
+│       │   └── config/                  # Hydra configs for agents, LLMs, pricing
+│       ├── llm/
+│       │   └── client.py                # Unified LLM client
+│       ├── utils/
+│       │   ├── io_utils.py              # Output formatting utilities
+│       │   ├── prompt_utils.py          # Prompt definitions for agents
+│       │   └── tool_utils.py            # Tool configuration helpers
+│       └── logging/                     # Task logging & metrics
+│
+└── miroflow-tool/
+    └── src/miroflow/tool/
+        ├── manager.py                   # Tool Manager: MCP server connector
+        └── mcp_servers/                 # Individual MCP tool servers
+            ├── python_server.py         # Code execution
+            ├── vision_mcp_server.py     # Visual perception
+            ├── searching_mcp_server.py  # Web search & retrieval
+            ├── audio_mcp_server.py      # Audio transcription
+            ├── reasoning_mcp_server.py  # Enhanced reasoning
+            └── reading_mcp_server.py    # Document processing
+```
+
+
+
+### Core System 💻
+
+- **Pipeline** (`./miroflow/src/miroflow/prebuilt/pipeline.py`): Main entry point that creates and manages all components, handles error recovery, and returns final results
+
+- **Orchestrator** (`./miroflow/src/miroflow/prebuilt/orchestrator.py`): Manages multi-turn conversations, parses tool calls, executes tools, and delegates to sub-agents
+
+- **LLM Client** (`./miroflow/src/miroflow/llm/client.py`): Unified interface supporting Anthropic, OpenAI, Google, Qwen, DeepSeek, and local deployments
+
+### Tool Integration 🔧
+
+- **Tool Manager** (`./miroflow-tool/src/miroflow/tool/manager.py`) : Comprehensive MCP server connection manager with tool discovery, persistent connections, and error handling
+
+- **MCP Servers** (`./miroflow-tool/src/miroflow/tool/mcp_servers/`) : Individual tool implementations built on FastMCP. Provides extensive capabilities including:
+ - Code execution and analysis (`./python_server.py`)
+ - Visual perception (`./vision_mcp_server.py`)
+ - Web search and content retrieval (`./searching_mcp_server.py`)
+ - Audio transcription (`./audio_mcp_server.py`)
+ - Enhanced reasoning capabilities (`./reasoning_mcp_server.py`)
+ - Document processing and analysis (`./reading_mcp_server.py`)
+
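Each MCP server is, in essence, a registry of typed tool functions behind a common dispatch interface. A dependency-free sketch of that pattern (FastMCP implements the real protocol; every name below is illustrative):

```python
from typing import Any, Callable

class ToolServer:
    """Minimal stand-in for an MCP tool server: register, then dispatch."""

    def __init__(self, name: str):
        self.name = name
        self._tools: dict[str, Callable[..., Any]] = {}

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        # register under the function's own name, decorator-style
        self._tools[fn.__name__] = fn
        return fn

    def call(self, tool_name: str, **kwargs: Any) -> Any:
        return self._tools[tool_name](**kwargs)

server = ToolServer("searching_mcp_server")

@server.tool
def search(query: str) -> str:
    return f"top hit for {query!r}"

print(server.call("search", query="GAIA benchmark"))
```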
+### Agent System 👷
+
+**Sub-Agents**
+Specialized agents designed for specific domains (e.g., `agent-browsing` for web navigation). Each sub-agent maintains dedicated tool sets and custom prompts, allowing the main agent to delegate tasks requiring specialized expertise. Agent definitions are managed through configuration files with prompts and descriptions customized in `./miroflow/src/miroflow/utils/prompt_utils.py` and `tool_utils.py`.
+
+### Support Systems ⚙️
+
+- **Configuration System** (`./miroflow/src/miroflow/prebuilt/config/`) : Hydra-powered YAML configuration for agents, LLMs, benchmarks, and pricing
+
+- **Output Formatter** (`./miroflow/src/miroflow/utils/io_utils.py`) : Intelligent response formatting that adapts to various benchmark requirements
+
+- **Task Logger** (`./miroflow/src/miroflow/logging/`) : Comprehensive logging for agent interactions, tool executions, and performance metrics
+
+### Execution Pipeline Data Flow
+
+
\ No newline at end of file