79 changes: 79 additions & 0 deletions config/agent_gaia-validation-gpt5.yaml
@@ -0,0 +1,79 @@
defaults:
- benchmark: gaia-validation
- override hydra/job_logging: none
- _self_ # Allow defining variables at the top of this file


main_agent:
prompt_class: MainAgentPrompt_GAIA
llm:
provider_class: "GPT5OpenAIClient"
model_name: "gpt-5"
async_client: true
temperature: 1.0
top_p: 1.0
min_p: 0.0
top_k: -1
max_tokens: 128000
reasoning_effort: "high"
openai_api_key: "${oc.env:OPENAI_API_KEY,???}"
openai_base_url: "${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}"
openrouter_provider: ""
disable_cache_control: true
keep_tool_result: -1
oai_tool_thinking: false

tool_config:
- tool-reasoning

max_turns: -1 # Maximum number of turns for main agent execution
max_tool_calls_per_turn: 10 # Maximum number of tool calls per turn

input_process:
hint_generation: true
hint_llm_base_url: "${oc.env:HINT_LLM_BASE_URL,https://api.openai.com/v1}"
output_process:
final_answer_extraction: true
final_answer_llm_base_url: "${oc.env:FINAL_ANSWER_LLM_BASE_URL,https://api.openai.com/v1}"

openai_api_key: "${oc.env:OPENAI_API_KEY,???}" # used for hint generation and final answer extraction
add_message_id: true
keep_tool_result: -1
chinese_context: "${oc.env:CHINESE_CONTEXT,false}"


sub_agents:
agent-worker:
prompt_class: SubAgentWorkerPrompt
llm:
provider_class: "GPT5OpenAIClient"
model_name: "gpt-5"
async_client: true
temperature: 1.0
top_p: 1.0
min_p: 0.0
top_k: -1
max_tokens: 128000
reasoning_effort: "medium"
openai_api_key: "${oc.env:OPENAI_API_KEY,???}"
openai_base_url: "${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}"
openrouter_provider: ""
disable_cache_control: true
keep_tool_result: -1
oai_tool_thinking: false

tool_config:
- tool-searching
- tool-image-video
- tool-reading
- tool-code
- tool-audio

max_turns: -1 # Maximum number of turns for sub-agent execution
max_tool_calls_per_turn: 10 # Maximum number of tool calls per turn


# Can define some top-level or default parameters here
output_dir: logs/
data_dir: "${oc.env:DATA_DIR,data}" # Points to where data is stored
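The `${oc.env:VAR,default}` interpolations above are resolved by OmegaConf at access time: the environment variable wins, and the value after the comma is the fallback. A minimal stand-in sketch of that behavior (not the real OmegaConf resolver) looks like this:

```python
# Hypothetical mimic of the "${oc.env:VAR,default}" interpolation used in
# the config above: read the variable from the environment, else return
# the default after the comma. The real resolution is done by OmegaConf.
import os
import re

def resolve_env(value: str, environ=os.environ) -> str:
    """Resolve a single ${oc.env:VAR,default} interpolation, if present."""
    match = re.fullmatch(r"\$\{oc\.env:([A-Z0-9_]+)(?:,(.*))?\}", value)
    if not match:
        return value                      # plain literal, no interpolation
    var, default = match.group(1), match.group(2)
    return environ.get(var, default)      # env value wins; default otherwise

print(resolve_env("${oc.env:OPENAI_API_KEY,???}", {"OPENAI_API_KEY": "sk-x"}))
print(resolve_env("${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}", {}))
```

With `OPENAI_API_KEY` set, the first call returns the env value; with `OPENAI_BASE_URL` unset, the second falls back to the default URL.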

5 changes: 4 additions & 1 deletion config/tool/tool-reading.yaml
@@ -2,4 +2,7 @@ name: "tool-reading"
tool_command: "python"
args:
- "-m"
- "src.tool.mcp_servers.reading_mcp_server"
env:
SERPER_API_KEY: "${oc.env:SERPER_API_KEY}"
JINA_API_KEY: "${oc.env:JINA_API_KEY}"
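The new `env` block declares variables that must be forwarded to the reading MCP server process. A hypothetical sketch (not MiroFlow's actual launcher) of how a tool entry like this could be turned into a subprocess invocation:

```python
# Hypothetical sketch: combine a tool config entry's command, args, and env
# into an MCP server subprocess launch. Names and values are illustrative.
import os
import subprocess

tool = {
    "tool_command": "python",
    "args": ["-m", "src.tool.mcp_servers.reading_mcp_server"],
    "env": {"SERPER_API_KEY": "serper-key", "JINA_API_KEY": "jina-key"},
}

env = {**os.environ, **tool["env"]}      # inherit parent env, overlay tool keys
cmd = [tool["tool_command"], *tool["args"]]
# subprocess.Popen(cmd, env=env)        # would launch the MCP server
print(cmd)
```

The overlay order matters: keys declared in the tool config take precedence over anything already in the parent environment.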
56 changes: 56 additions & 0 deletions docs/mkdocs/docs/gaia_validation_gpt5.md
@@ -0,0 +1,56 @@
# GAIA Validation - GPT-5

MiroFlow now supports GPT-5 with MCP tool invocation, providing a unified workflow for multi-step reasoning, information integration, and scalable tool coordination.

!!! info "Prerequisites"
Before proceeding, please review the [GAIA Validation Prerequisites](gaia_validation_prerequisites.md) document, which covers common setup requirements, dataset preparation, and API key configuration.

---

## Running the Evaluation

### Step 1: Dataset Preparation

Follow the [dataset preparation instructions](gaia_validation_prerequisites.md#dataset-preparation) in the prerequisites document.

### Step 2: API Keys Configuration

Configure the following API keys in your `.env` file:

```env title="GPT-5 .env Configuration"
# Search and web scraping capabilities
SERPER_API_KEY="your-serper-api-key"
JINA_API_KEY="your-jina-api-key"

# Code execution environment
E2B_API_KEY="your-e2b-api-key"

# Vision understanding capabilities
ANTHROPIC_API_KEY="your-anthropic-api-key"
GEMINI_API_KEY="your-gemini-api-key"

# Primary LLM provider, LLM judge, reasoning, and hint generation
OPENAI_API_KEY="your-openai-api-key"
OPENAI_BASE_URL="https://api.openai.com/v1"

```
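A run will fail mid-flight if any of these keys is absent, so it can help to check them up front. A small sketch (not part of MiroFlow) that reports missing keys before launching:

```python
# Sketch: fail fast if any API key required by the GPT-5 run is unset.
# The key list mirrors the .env example above.
import os

REQUIRED_KEYS = [
    "SERPER_API_KEY", "JINA_API_KEY", "E2B_API_KEY",
    "ANTHROPIC_API_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY",
]

def missing_keys(environ=os.environ):
    """Return the required keys that are unset or empty."""
    return [key for key in REQUIRED_KEYS if not environ.get(key)]

if missing_keys():
    print("Missing API keys:", missing_keys())
```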

### Step 3: Run the Evaluation

Execute the evaluation using the GPT-5 configuration:

```bash title="Run GAIA Validation with GPT-5"
uv run main.py common-benchmark \
--config_file_name=agent_gaia-validation-gpt5 \
output_dir="logs/gaia-validation-gpt5/$(date +"%Y%m%d_%H%M")"
```

### Step 4: Monitor Progress

Follow the [progress monitoring instructions](gaia_validation_prerequisites.md#progress-monitoring-and-resume) in the prerequisites document.


---

!!! info "Documentation Info"
**Last Updated:** October 2025 · **Doc Contributor:** Team @ MiroMind AI
3 changes: 2 additions & 1 deletion docs/mkdocs/docs/llm_clients_overview.md
@@ -9,6 +9,7 @@ MiroFlow supports multiple LLM providers through a unified client interface. Each
| `ClaudeAnthropicClient` | Anthropic Direct | claude-3-7-sonnet | `ANTHROPIC_API_KEY`, `ANTHROPIC_BASE_URL` |
| `ClaudeOpenRouterClient` | OpenRouter | anthropic/claude-3.7-sonnet, and other [supported models](https://openrouter.ai/models) | `OPENROUTER_API_KEY`, `OPENROUTER_BASE_URL` |
| `GPTOpenAIClient` | OpenAI | gpt-4, gpt-3.5 | `OPENAI_API_KEY`, `OPENAI_BASE_URL` |
| `GPT5OpenAIClient` | OpenAI | gpt-5 | `OPENAI_API_KEY`, `OPENAI_BASE_URL` |
| `MiroThinkerSGLangClient` | SGLang | MiroThinker series | `OAI_MIROTHINKER_API_KEY`, `OAI_MIROTHINKER_BASE_URL` |

## Basic Configuration
@@ -31,4 +32,4 @@ main_agent:
---

!!! info "Documentation Info"
**Last Updated:** October 2025 · **Doc Contributor:** Team @ MiroMind AI
46 changes: 43 additions & 3 deletions docs/mkdocs/docs/openai-gpt.md
@@ -1,8 +1,45 @@
# OpenAI GPT Models

OpenAI's latest models including GPT-5, GPT-4o and advanced reasoning models with strong coding, vision, and reasoning capabilities.

## Client Used for GPT-5

`GPT5OpenAIClient`

## Environment Setup

```bash title="Environment Variables"
export OPENAI_API_KEY="your-openai-key"
export OPENAI_BASE_URL="https://api.openai.com/v1" # optional
```

## Configuration

```yaml title="Agent Configuration"
main_agent:
llm:
provider_class: "GPT5OpenAIClient"
model_name: "gpt-5"
async_client: true
temperature: 1.0
top_p: 1.0
min_p: 0.0
top_k: -1
max_tokens: 128000
reasoning_effort: "high" # Use "high" for the main agent; sub-agents use the default "medium".
openai_api_key: "${oc.env:OPENAI_API_KEY,???}"
openai_base_url: "${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}"
```

## Usage

```bash title="Example Command"
# Create custom OpenAI config
uv run main.py trace --config_file_name=your_config_file \
--task="Your task" --task_file_name="data/file.txt"
```

## Client Used for GPT-4o

`GPTOpenAIClient`

@@ -32,7 +69,10 @@ uv run main.py trace --config_file_name=your_config_file \
--task="Your task" --task_file_name="data/file.txt"
```

!!! note "Configuration Notes"
    - `GPTOpenAIClient` also supports GPT-5, but it has not been fully validated on MiroFlow yet. We recommend using `GPT5OpenAIClient`.

---

!!! info "Documentation Info"
**Last Updated:** October 2025 · **Doc Contributor:** Team @ MiroMind AI
1 change: 1 addition & 0 deletions docs/mkdocs/mkdocs.yml
@@ -52,6 +52,7 @@ nav:
- GAIA-Validation:
- Prerequisites: gaia_validation_prerequisites.md
- Claude-3.7-Sonnet: gaia_validation_claude37sonnet.md
- GPT-5: gaia_validation_gpt5.md
- MiroThinker: gaia_validation_mirothinker.md
- GAIA-Validation-Text-Only: gaia_validation_text_only.md
- GAIA-Test: gaia_test.md
1 change: 1 addition & 0 deletions src/llm/provider_client_base.py
@@ -43,6 +43,7 @@ def __post_init__(self):
self.top_p: float = self.cfg.llm.top_p
self.min_p: float = self.cfg.llm.min_p
self.top_k: int = self.cfg.llm.top_k
self.reasoning_effort: str = self.cfg.llm.get("reasoning_effort", "medium")
self.repetition_penalty: float = self.cfg.llm.get("repetition_penalty", 1.0)
self.max_tokens: int = self.cfg.llm.max_tokens
self.max_context_length: int = self.cfg.llm.get("max_context_length", -1)
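The added `reasoning_effort` line uses the same `.get`-with-default pattern as the surrounding fields: a config block that omits the key falls back to `"medium"`. A minimal sketch of that fallback, with plain dicts standing in for the OmegaConf node:

```python
# Sketch of the fallback added above: cfg.llm.get("reasoning_effort", "medium")
# returns the configured value when present, else "medium".
def reasoning_effort(llm_cfg: dict) -> str:
    return llm_cfg.get("reasoning_effort", "medium")

print(reasoning_effort({"reasoning_effort": "high"}))  # configured: high
print(reasoning_effort({}))                            # default: medium
```

This keeps existing configs that predate the field working unchanged.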
2 changes: 1 addition & 1 deletion src/llm/providers/claude_openrouter_client.py
@@ -411,4 +411,4 @@ def _apply_cache_control(self, messages):
else:
# Other messages add directly
cached_messages.append(turn)
return list(reversed(cached_messages))
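The fragment above builds its list while walking messages newest-first, then reverses it to restore chronological order. A hypothetical, self-contained sketch of that pattern (not the actual client code, whose marking rules may differ):

```python
# Hypothetical sketch of the reverse-then-restore pattern: walk messages
# newest-first, mark the first few eligible user turns for cache control,
# then reverse to restore chronological order.
def apply_cache_control(messages, max_marks=2):
    cached, marks = [], 0
    for turn in reversed(messages):
        if turn.get("role") == "user" and marks < max_marks:
            turn = {**turn, "cache_control": {"type": "ephemeral"}}
            marks += 1
        cached.append(turn)            # accumulated newest-first
    return list(reversed(cached))      # restore original order
```

Iterating in reverse makes it easy to cap the marks at the most recent turns while still returning the list in its original order.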