Deep research has broken out as one of the most popular agent applications. This is a simple, configurable, fully open source deep research agent that works across many model providers, search tools, and MCP servers. Its performance is on par with many popular deep research agents ([see the Deep Research Bench leaderboard](https://huggingface.co/spaces/Ayanami0730/DeepResearch-Leaderboard)).
<img width="817" height="666" alt="Screenshot 2025-07-13 at 11 21 12 PM" src="https://github.com/user-attachments/assets/052f2ed3-c664-4a4f-8ec2-074349dcaa3f" />
### 🔥 Recent Updates
**August 2, 2025**: Achieved #6 ranking on the [Deep Research Bench Leaderboard](https://huggingface.co/spaces/Ayanami0730/DeepResearch-Leaderboard) with an overall score of 0.4344.
**July 30, 2025**: Read about the evolution from our original implementations to the current version in our [blog post](https://rlancemartin.github.io/2025/07/30/bitter_lesson/).
**July 16, 2025**: Read more in our [blog](https://blog.langchain.com/open-deep-research/) and watch our [video](https://www.youtube.com/watch?v=agGiWUpxkhg) for a quick overview.
### 🚀 Quickstart
1. Clone the repository and activate a virtual environment:
This will open the LangGraph Studio UI in your browser.
```
- 🚀 API: http://127.0.0.1:2024
- 🎨 Studio UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
- 📚 API Docs: http://127.0.0.1:2024/docs
```
Ask a question in the `messages` input field and click `Submit`. Select different configurations in the "Manage Assistants" tab.
### ⚙️ Configurations
#### LLM :brain:
Open Deep Research supports a wide range of LLM providers via the [init_chat_model() API](https://python.langchain.com/docs/how_to/chat_models_universal_init/). It uses LLMs for a few different tasks. See the model fields below, defined in the [configuration.py](https://github.com/langchain-ai/open_deep_research/blob/main/src/open_deep_research/configuration.py) file, for more details; they can also be set via the LangGraph Studio UI.
- **Summarization** (default: `openai:gpt-4.1-mini`): Summarizes search API results
- **Research** (default: `openai:gpt-4.1`): Powers the search agent
- **Compression** (default: `openai:gpt-4.1`): Compresses research findings
- **Final Report Model** (default: `openai:gpt-4.1`): Writes the final report
> Note: the selected models will need to support [structured outputs](https://python.langchain.com/docs/integrations/chat/) and [tool calling](https://python.langchain.com/docs/how_to/tool_calling/).
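As a minimal sketch, each task's model is specified as a `provider:model` string, the combined form that `init_chat_model()` accepts directly. The helper and dictionary below are illustrative; the field names are assumptions mirroring the defaults above, not a verbatim copy of `configuration.py`:

```python
# Illustrative sketch: "provider:model" strings used for each task.
# init_chat_model() accepts this combined form directly, e.g.
#   from langchain.chat_models import init_chat_model
#   model = init_chat_model("openai:gpt-4.1")
# The helper below only shows how the string splits into its two parts.

def split_model_string(spec: str) -> tuple[str, str]:
    """Split a 'provider:model' string into (provider, model)."""
    provider, _, model = spec.partition(":")
    return provider, model

# Defaults from the README (field names assumed, mirroring configuration.py)
DEFAULTS = {
    "summarization_model": "openai:gpt-4.1-mini",
    "research_model": "openai:gpt-4.1",
    "compression_model": "openai:gpt-4.1",
    "final_report_model": "openai:gpt-4.1",
}

for field, spec in DEFAULTS.items():
    provider, model = split_model_string(spec)
    print(f"{field}: provider={provider} model={model}")
```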
> Note: for OpenRouter, follow [this guide](https://github.com/langchain-ai/open_deep_research/issues/75#issuecomment-2811472408); for local models via Ollama, see the [setup instructions](https://github.com/langchain-ai/open_deep_research/issues/65#issuecomment-2743586318).
#### Search API :mag:
Open Deep Research supports a wide range of search tools. By default it uses the [Tavily](https://www.tavily.com/) search API, has full MCP compatibility, and works with native web search for Anthropic and OpenAI models. See the `search_api` and `mcp_config` fields in the [configuration.py](https://github.com/langchain-ai/open_deep_research/blob/main/src/open_deep_research/configuration.py) file for more details; these can also be set via the LangGraph Studio UI.
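For example, a run configuration selecting Tavily and wiring in an MCP server might look like the sketch below. This is a hypothetical illustration: the top-level field names follow the `search_api` and `mcp_config` fields mentioned above, while the server URL and tool list are placeholders, not values from the repository:

```python
# Hypothetical configuration sketch, not a verbatim schema: the field
# names follow the search_api / mcp_config fields noted above; the
# server URL and tool list below are placeholders.
config = {
    "configurable": {
        "search_api": "tavily",  # or native web search for OpenAI/Anthropic
        "mcp_config": {
            "url": "https://example.com/mcp",  # placeholder MCP server URL
            "tools": ["read_file"],            # tools the agent may call
        },
    }
}

print(config["configurable"]["search_api"])
```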
#### Other
See the fields in [configuration.py](https://github.com/langchain-ai/open_deep_research/blob/main/src/open_deep_research/configuration.py) for various other settings that customize the behavior of Open Deep Research.
**Remote MCP servers** enable distributed agent coordination and support streamable HTTP requests. Unlike local servers, they can be multi-tenant and require more complex authentication.
Remote servers can be configured as authenticated or unauthenticated and support JWT-based authentication through OAuth endpoints.
### 📊 Evaluation

Open Deep Research is configured for evaluation with [Deep Research Bench](https://huggingface.co/spaces/Ayanami0730/DeepResearch-Leaderboard). This benchmark contains 100 PhD-level research tasks (50 English, 50 Chinese) crafted by domain experts across 22 fields (e.g., Science & Tech, Business & Finance) to mirror real-world deep-research needs. It has two evaluation metrics, but the leaderboard is based on the RACE score, which uses LLM-as-a-judge (Gemini) to evaluate research reports against a golden set of expert-compiled reports across a set of criteria.
#### Usage
> Warning: Running across the 100 examples can cost ~$20-$100 depending on the model selection.
The dataset is available on [LangSmith via this link](https://smith.langchain.com/public/c5e7a6ad-fdba-478c-88e6-3a388459ce8b/d). To kick off evaluation, run the following command:
```bash
# Run comprehensive evaluation on LangSmith datasets
python tests/run_evaluate.py
```
This will provide a link to a LangSmith experiment, which will have a name `YOUR_EXPERIMENT_NAME`. Once this is done, extract the results to a JSONL file that can be submitted to the Deep Research Bench.
> Note: results for the current `main` branch use more constrained prompting to reduce token spend ~4x while still achieving a score of 0.4268.
This creates `tests/expt_results/deep_research_bench_model-name.jsonl` with the required format. Move the generated JSONL file to a local clone of the Deep Research Bench repository and follow their [Quick Start guide](https://github.com/Ayanami0730/deep_research_bench?tab=readme-ov-file#quick-start) for evaluation submission.
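The extraction step boils down to writing one JSON object per line. The sketch below only illustrates the JSONL mechanics; the `id`/`article` field names and the filename are assumptions for illustration, not the repository's actual schema:

```python
import json

# Illustrative JSONL writing sketch: one JSON object per line.
# The "id"/"article" field names and the filename are assumptions,
# not the repository's actual submission schema.
results = [
    {"id": "task-001", "article": "Example research report text..."},
    {"id": "task-002", "article": "Another research report..."},
]

with open("example_submission.jsonl", "w", encoding="utf-8") as f:
    for row in results:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

# Reading it back: each line parses independently.
with open("example_submission.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(len(rows))
```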
#### Results
| Name | Commit | Summarization | Research | Compression | Total Cost | Total Tokens | RACE Score | Experiment |
|------|--------|---------------|----------|-------------|------------|--------------|------------|------------|
| Deep Research Bench Submission | [c0a160b](https://github.com/langchain-ai/open_deep_research/commit/c0a160b57a9b5ecd4b8217c3811a14d8eff97f72) | openai:gpt-4.1-nano | openai:gpt-4.1 | openai:gpt-4.1 | $87.83 | 207,005,549 | 0.4344 | [Link](https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/datasets/6e4766ca-6[…]ons=e6647f74-ad2f-4cb9-887e-acb38b5f73c0&baseline=undefined) |
### 🚀 Deployments and Usage
#### LangGraph Studio
Follow the [quickstart](#-quickstart) to start LangGraph server locally and test the agent out on LangGraph Studio.
You can also deploy your own instance of OAP, and make your own custom agents (l
### Legacy Implementations 🏛️
The `src/legacy/` folder contains two earlier implementations that provide alternative approaches to automated research. They are less performant than the current implementation, but offer useful ideas for understanding different approaches to deep research.