diff --git a/README.md b/README.md
index 38aeb1f53..58e069315 100644
--- a/README.md
+++ b/README.md
@@ -58,6 +58,7 @@ etc.
| [Gamesense](gamesense) | 🤖 LLMOps | 🧠 LoRA, ⚡ Efficient Training | pytorch, peft, phi-2 |
| [Nightwatch AI](nightwatch-ai) | 🤖 LLMOps | 📝 Summarization, 📊 Reporting | openai, supabase, slack |
| [ResearchRadar](research-radar) | 🤖 LLMOps | 📝 Classification, 📊 Comparison | anthropic, huggingface, transformers |
+| [Deep Research](deep_research) | 🤖 LLMOps | 📝 Research, 📊 Reporting, 🔍 Web Search | anthropic, mcp, agents, openai |
| [End-to-end Computer Vision](end-to-end-computer-vision) | 👁 CV | 🔎 Object Detection, 🏷️ Labeling | pytorch, label_studio, yolov8 |
| [Magic Photobooth](magic-photobooth) | 👁 CV | 📷 Image Gen, 🎞️ Video Gen | stable-diffusion, huggingface |
| [OmniReader](omni-reader) | 👁 CV | 📑 OCR, 📊 Evaluation, ⚙️ Batch Processing | polars, litellm, openai, ollama |
diff --git a/deep_research/README.md b/deep_research/README.md
index c057d85bf..5f2b66bbe 100644
--- a/deep_research/README.md
+++ b/deep_research/README.md
@@ -13,6 +13,7 @@ The ZenML Deep Research Agent is a scalable, modular pipeline that automates in-
- Creates a structured outline based on your research query
- Researches each section through targeted web searches and LLM analysis
+- **NEW**: Performs additional MCP-powered searches using Anthropic's Model Context Protocol with Exa integration
- Iteratively refines content through reflection cycles
- Produces a comprehensive, well-formatted research report
- Visualizes the research process and report structure in the ZenML dashboard
@@ -24,7 +25,7 @@ This project transforms exploratory notebook-based research into a production-gr
The Deep Research Agent produces comprehensive, well-structured reports on any topic. Here's an example of research conducted on quantum computing:
-

+
Sample report generated by the Deep Research Agent
@@ -40,8 +41,9 @@ The pipeline uses a parallel processing architecture for efficiency and breaks d
6. **Reflection Generation**: Generate recommendations for improving research quality
7. **Human Approval** (optional): Get human approval for additional searches
8. **Execute Approved Searches**: Perform approved additional searches to fill gaps
-9. **Final Report Generation**: Compile all synthesized information into a coherent HTML report
-10. **Collect Tracing Metadata**: Gather comprehensive metrics about token usage, costs, and performance
+9. **MCP-Powered Search**: Use Anthropic's Model Context Protocol to perform additional targeted searches via Exa
+10. **Final Report Generation**: Compile all synthesized information into a coherent HTML report
+11. **Collect Tracing Metadata**: Gather comprehensive metrics about token usage, costs, and performance
This architecture enables:
- Better reproducibility and caching of intermediate results
@@ -55,6 +57,7 @@ This architecture enables:
- **LLM Integration**: Uses litellm for flexible access to various LLM providers
- **Web Research**: Utilizes Tavily API for targeted internet searches
+- **MCP Integration**: Leverages Anthropic's Model Context Protocol with Exa for enhanced research capabilities
- **ZenML Orchestration**: Manages pipeline flow, artifacts, and caching
- **Reproducibility**: Track every step, parameter, and output via ZenML
- **Visualizations**: Interactive visualizations of the research structure and progress
@@ -70,6 +73,8 @@ This architecture enables:
- ZenML installed and configured
- API key for your preferred LLM provider (configured with litellm)
- Tavily API key
+- Anthropic API key (for MCP integration)
+- Exa API key (for MCP-powered searches)
- Langfuse account for LLM tracking (optional but recommended)
### Installation
@@ -85,7 +90,8 @@ pip install -r requirements.txt
# Set up API keys
export OPENAI_API_KEY=your_openai_key # Or another LLM provider key
export TAVILY_API_KEY=your_tavily_key # For Tavily search (default)
-export EXA_API_KEY=your_exa_key # For Exa search (optional)
+export EXA_API_KEY=your_exa_key # For Exa search and MCP integration (required for MCP)
+export ANTHROPIC_API_KEY=your_anthropic_key # For MCP integration (required)
# Set up Langfuse for LLM tracking (optional)
export LANGFUSE_PUBLIC_KEY=your_public_key
@@ -227,6 +233,31 @@ python run.py --num-results 5 # Get 5 results per sea
python run.py --num-results 10 --search-provider exa # 10 results with Exa
```
+### MCP (Model Context Protocol) Integration
+
+The pipeline includes a powerful MCP integration step that uses Anthropic's Model Context Protocol to perform additional targeted searches. This step runs after the reflection phase and before final report generation, providing an extra layer of research depth.
+
+#### How MCP Works
+
+The MCP step:
+1. Receives the synthesized research data and analysis from previous steps
+2. Uses Claude (via the Anthropic API) with MCP tools to identify gaps or areas needing more research
+3. Performs targeted searches using Exa's advanced search capabilities, including:
+ - `research_paper_search`: Academic paper and research content
+ - `company_research`: Company website crawling for business information
+ - `competitor_finder`: Find company competitors
+ - `linkedin_search`: Search LinkedIn for companies and people
+ - `wikipedia_search_exa`: Wikipedia article retrieval
+ - `github_search`: GitHub repositories and issues
+
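+The step is wired into the research pipeline between the reflection/approval phase and final report generation. A simplified excerpt from `pipelines/parallel_research_pipeline.py`:
+
+```python
+# The MCP step consumes the already-synthesized and analyzed research data
+mcp_results = mcp_updates_step(
+    query_context=query_context,
+    synthesis_data=enhanced_synthesis_data,
+    analysis_data=enhanced_analysis_data,
+    langfuse_project_name=langfuse_project_name,
+)
+
+# The resulting MCPResult artifact is then passed to the final report step
+# via `mcp_results=mcp_results`.
+```
+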
+#### MCP Requirements
+
+To use the MCP integration, you need:
+- `ANTHROPIC_API_KEY`: For accessing Claude with MCP capabilities
+- `EXA_API_KEY`: For the Exa search tools used by MCP
+
+The MCP step uses Claude Sonnet 4 (`claude-sonnet-4-20250514`), which supports MCP.
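+
+Under the hood, the step calls Claude through the Anthropic Messages API with the remote Exa MCP server attached. Below is a condensed sketch of what `steps/mcp_step.py` does; the real step builds the prompt from the synthesized research data and parses the tool results:
+
+```python
+import os
+from anthropic import Anthropic
+
+client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
+exa_api_key = os.getenv("EXA_API_KEY")
+
+response = client.beta.messages.create(
+    model="claude-sonnet-4-20250514",
+    max_tokens=3000,
+    messages=[{"role": "user", "content": "<research prompt built from earlier steps>"}],
+    mcp_servers=[
+        {
+            "type": "url",
+            "url": f"https://mcp.exa.ai/mcp?exaApiKey={exa_api_key}",
+            "name": "exa-mcp-search",
+            "authorization_token": exa_api_key,
+            "tool_configuration": {
+                "enabled": True,
+                "allowed_tools": [
+                    "research_paper_search",
+                    "wikipedia_search_exa",
+                    "github_search",
+                ],
+            },
+        }
+    ],
+    betas=["mcp-client-2025-04-04"],  # required for the MCP connector beta
+)
+```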
+
### Search Providers
The pipeline supports multiple search providers for flexibility and comparison:
@@ -364,6 +395,7 @@ zenml_deep_research/
│ ├── execute_approved_searches_step.py # Execute approved searches
│ ├── generate_reflection_step.py # Generate reflection without execution
│ ├── iterative_reflection_step.py # Legacy combined reflection step
+│ ├── mcp_step.py # MCP integration for additional searches
│ ├── merge_results_step.py
│ ├── process_sub_question_step.py
│ ├── pydantic_final_report_step.py
@@ -421,16 +453,16 @@ query: "Climate change policy debates"
steps:
initial_query_decomposition_step:
parameters:
- llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"
+ llm_model: "google/gemini-2.0-flash-lite-001"
cross_viewpoint_analysis_step:
parameters:
- llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"
+ llm_model: "google/gemini-2.0-flash-lite-001"
viewpoint_categories: ["scientific", "political", "economic", "social", "ethical", "historical"]
iterative_reflection_step:
parameters:
- llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"
+ llm_model: "google/gemini-2.0-flash-lite-001"
max_additional_searches: 2
num_results_per_search: 3
@@ -442,7 +474,7 @@ steps:
pydantic_final_report_step:
parameters:
- llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"
+ llm_model: "google/gemini-2.0-flash-lite-001"
# Environment settings
settings:
diff --git a/deep_research/assets/pipeline_visualization.png b/deep_research/assets/pipeline_visualization.png
new file mode 100644
index 000000000..aa57ac24d
Binary files /dev/null and b/deep_research/assets/pipeline_visualization.png differ
diff --git a/deep_research/assets/sample_report.gif b/deep_research/assets/sample_report.gif
new file mode 100644
index 000000000..783638cfc
Binary files /dev/null and b/deep_research/assets/sample_report.gif differ
diff --git a/deep_research/configs/enhanced_research.yaml b/deep_research/configs/enhanced_research.yaml
index 0bfc0a79f..1933efa98 100644
--- a/deep_research/configs/enhanced_research.yaml
+++ b/deep_research/configs/enhanced_research.yaml
@@ -27,11 +27,11 @@ langfuse_project_name: "deep-research"
steps:
initial_query_decomposition_step:
parameters:
- llm_model: "openrouter/google/gemini-2.0-flash-lite-001"
+ llm_model: "openrouter/google/gemini-2.5-flash-preview-05-20"
cross_viewpoint_analysis_step:
parameters:
- llm_model: "openrouter/google/gemini-2.0-flash-lite-001"
+ llm_model: "openrouter/google/gemini-2.5-flash-preview-05-20"
viewpoint_categories:
[
"scientific",
@@ -44,7 +44,7 @@ steps:
generate_reflection_step:
parameters:
- llm_model: "openrouter/google/gemini-2.0-flash-lite-001"
+ llm_model: "openrouter/google/gemini-2.5-flash-preview-05-20"
get_research_approval_step:
parameters:
@@ -53,11 +53,11 @@ steps:
execute_approved_searches_step:
parameters:
- llm_model: "openrouter/google/gemini-2.0-flash-lite-001"
+ llm_model: "openrouter/google/gemini-2.5-flash-preview-05-20"
pydantic_final_report_step:
parameters:
- llm_model: "openrouter/google/gemini-2.0-flash-lite-001"
+ llm_model: "openrouter/google/gemini-2.5-flash-preview-05-20"
# Environment settings
settings:
diff --git a/deep_research/materializers/__init__.py b/deep_research/materializers/__init__.py
index 7009d5572..c51a46714 100644
--- a/deep_research/materializers/__init__.py
+++ b/deep_research/materializers/__init__.py
@@ -8,6 +8,7 @@
from .analysis_data_materializer import AnalysisDataMaterializer
from .approval_decision_materializer import ApprovalDecisionMaterializer
from .final_report_materializer import FinalReportMaterializer
+from .mcp_result_materializer import MCPResultMaterializer
from .prompt_materializer import PromptMaterializer
from .query_context_materializer import QueryContextMaterializer
from .search_data_materializer import SearchDataMaterializer
@@ -23,4 +24,5 @@
"SynthesisDataMaterializer",
"AnalysisDataMaterializer",
"FinalReportMaterializer",
+ "MCPResultMaterializer",
]
diff --git a/deep_research/materializers/mcp_result_materializer.py b/deep_research/materializers/mcp_result_materializer.py
new file mode 100644
index 000000000..d97f3c9c6
--- /dev/null
+++ b/deep_research/materializers/mcp_result_materializer.py
@@ -0,0 +1,236 @@
+"""Materializer for MCPResult with HTML and JSON visualization."""
+
+import json
+import os
+from typing import Dict
+
+from utils.css_utils import get_shared_css_tag
+from utils.pydantic_models import MCPResult
+from zenml.enums import ArtifactType, VisualizationType
+from zenml.io import fileio
+from zenml.materializers import PydanticMaterializer
+
+
+class MCPResultMaterializer(PydanticMaterializer):
+ """Materializer for MCPResult with interactive visualization."""
+
+ ASSOCIATED_TYPES = (MCPResult,)
+ ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA
+
+ def save_visualizations(
+ self, data: MCPResult
+ ) -> Dict[str, VisualizationType]:
+ """Create and save visualizations for the MCPResult.
+
+ Args:
+ data: The MCPResult to visualize
+
+ Returns:
+ Dictionary mapping file paths to visualization types
+ """
+ visualization_path = os.path.join(self.uri, "mcp_result.html")
+ html_content = self._generate_visualization_html(data)
+
+ with fileio.open(visualization_path, "w") as f:
+ f.write(html_content)
+
+ return {visualization_path: VisualizationType.HTML}
+
+ def _generate_visualization_html(self, data: MCPResult) -> str:
+ """Generate HTML visualization for the MCP result.
+
+ Args:
+ data: The MCPResult to visualize
+
+ Returns:
+ HTML string
+ """
+ # Process the raw MCP result for pretty display
+ raw_result_html = ""
+ if data.raw_mcp_result:
+ try:
+ # Try to parse as JSON for pretty formatting
+ parsed_json = json.loads(data.raw_mcp_result)
+ formatted_json = json.dumps(
+ parsed_json, indent=2, ensure_ascii=False
+ )
+ raw_result_html = f"""
+
+
Raw MCP Search Results
+
+
JSON Data
+
{self._escape_html(formatted_json)}
+
+
+ """
+ except json.JSONDecodeError:
+ # If not valid JSON, display as plain text
+ raw_result_html = f"""
+
+
Raw MCP Search Results
+
+
Raw Data
+
{self._escape_html(data.raw_mcp_result)}
+
+
+ """
+ else:
+ raw_result_html = """
+
+
Raw MCP Search Results
+
No raw search results available
+
+ """
+
+ # Display the MCP result (HTML formatted)
+ mcp_result_html = ""
+ if data.mcp_result:
+ mcp_result_html = f"""
+
+
MCP Analysis Summary
+
+ {data.mcp_result}
+
+
+ """
+ else:
+ mcp_result_html = """
+
+
MCP Analysis Summary
+
No analysis summary available
+
+ """
+
+ html = f"""
+
+
+
+ MCP Search Results
+ {get_shared_css_tag()}
+
+
+
+
+
+
+ {mcp_result_html}
+
+ {raw_result_html}
+
+
+
+ """
+
+ return html
+
+ def _escape_html(self, text: str) -> str:
+ """Escape HTML special characters in text.
+
+ Args:
+ text: Text to escape
+
+ Returns:
+ Escaped text safe for HTML display
+ """
+ return (
+ text.replace("&", "&")
+ .replace("<", "<")
+ .replace(">", ">")
+ .replace('"', """)
+ .replace("'", "'")
+ )
diff --git a/deep_research/pipelines/parallel_research_pipeline.py b/deep_research/pipelines/parallel_research_pipeline.py
index fabdc2045..5fada61fb 100644
--- a/deep_research/pipelines/parallel_research_pipeline.py
+++ b/deep_research/pipelines/parallel_research_pipeline.py
@@ -4,6 +4,7 @@
from steps.execute_approved_searches_step import execute_approved_searches_step
from steps.generate_reflection_step import generate_reflection_step
from steps.initialize_prompts_step import initialize_prompts_step
+from steps.mcp_step import mcp_updates_step
from steps.merge_results_step import merge_sub_question_results_step
from steps.process_sub_question_step import process_sub_question_step
from steps.pydantic_final_report_step import pydantic_final_report_step
@@ -142,6 +143,13 @@ def parallelized_deep_research_pipeline(
)
)
+ mcp_results = mcp_updates_step(
+ query_context=query_context,
+ synthesis_data=enhanced_synthesis_data,
+ analysis_data=enhanced_analysis_data,
+ langfuse_project_name=langfuse_project_name,
+ )
+
# Use our new Pydantic-based final report step
pydantic_final_report_step(
query_context=query_context,
@@ -152,6 +160,7 @@ def parallelized_deep_research_pipeline(
executive_summary_prompt=executive_summary_prompt,
introduction_prompt=introduction_prompt,
langfuse_project_name=langfuse_project_name,
+ mcp_results=mcp_results,
after="execute_approved_searches_step",
)
diff --git a/deep_research/requirements.txt b/deep_research/requirements.txt
index 6d2f91669..a925bd6d0 100644
--- a/deep_research/requirements.txt
+++ b/deep_research/requirements.txt
@@ -8,3 +8,4 @@ pydantic>=2.0.0
typing_extensions>=4.0.0
requests
langfuse>=2.0.0
+anthropic>=0.52.2
diff --git a/deep_research/run.py b/deep_research/run.py
index 8db45164e..c7f038bf7 100644
--- a/deep_research/run.py
+++ b/deep_research/run.py
@@ -274,16 +274,16 @@ def main(
logger.info(f"Results per search: {num_results}")
langfuse_project_name = "deep-research" # default
+ query_from_config = None # default
try:
with open(config, "r") as f:
config_data = yaml.safe_load(f)
langfuse_project_name = config_data.get(
"langfuse_project_name", "deep-research"
)
+ query_from_config = config_data.get("query", None)
except Exception as e:
- logger.warning(
- f"Could not load langfuse_project_name from config: {e}"
- )
+ logger.warning(f"Could not load config data: {e}")
# Set up the pipeline with the parallelized version as default
pipeline = parallelized_deep_research_pipeline.with_options(
@@ -291,16 +291,19 @@ def main(
)
# Execute the pipeline
- if query:
+ # Use CLI query if provided, otherwise fall back to config query
+ final_query = query or query_from_config
+
+ if final_query:
logger.info(
- f"Using query: {query} with max {max_sub_questions} parallel sub-questions"
+ f"Using query: {final_query} with max {max_sub_questions} parallel sub-questions"
)
if require_approval:
logger.info(
f"Human approval enabled with {approval_timeout}s timeout"
)
pipeline(
- query=query,
+ query=final_query,
max_sub_questions=max_sub_questions,
require_approval=require_approval,
approval_timeout=approval_timeout,
@@ -312,7 +315,7 @@ def main(
)
else:
logger.info(
- f"Using query from config file with max {max_sub_questions} parallel sub-questions"
+ f"No query provided via CLI or config. Using pipeline default with max {max_sub_questions} parallel sub-questions"
)
if require_approval:
logger.info(
diff --git a/deep_research/steps/cross_viewpoint_step.py b/deep_research/steps/cross_viewpoint_step.py
index cc844bdde..e2f678155 100644
--- a/deep_research/steps/cross_viewpoint_step.py
+++ b/deep_research/steps/cross_viewpoint_step.py
@@ -23,7 +23,7 @@ def cross_viewpoint_analysis_step(
query_context: QueryContext,
synthesis_data: SynthesisData,
viewpoint_analysis_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
viewpoint_categories: List[str] = [
"scientific",
"political",
diff --git a/deep_research/steps/execute_approved_searches_step.py b/deep_research/steps/execute_approved_searches_step.py
index f1eb86255..c849a0c40 100644
--- a/deep_research/steps/execute_approved_searches_step.py
+++ b/deep_research/steps/execute_approved_searches_step.py
@@ -44,7 +44,7 @@ def execute_approved_searches_step(
additional_synthesis_prompt: Prompt,
num_results_per_search: int = 3,
cap_search_length: int = 20000,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
search_provider: str = "tavily",
search_mode: str = "auto",
langfuse_project_name: str = "deep-research",
diff --git a/deep_research/steps/generate_reflection_step.py b/deep_research/steps/generate_reflection_step.py
index 61cf46371..9e4ff8622 100644
--- a/deep_research/steps/generate_reflection_step.py
+++ b/deep_research/steps/generate_reflection_step.py
@@ -27,7 +27,7 @@ def generate_reflection_step(
synthesis_data: SynthesisData,
analysis_data: AnalysisData,
reflection_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
langfuse_project_name: str = "deep-research",
) -> Tuple[
Annotated[AnalysisData, "analysis_data"],
diff --git a/deep_research/steps/mcp_step.py b/deep_research/steps/mcp_step.py
new file mode 100644
index 000000000..c981ffa57
--- /dev/null
+++ b/deep_research/steps/mcp_step.py
@@ -0,0 +1,197 @@
+import json
+import logging
+import os
+from typing import Annotated
+
+from anthropic import Anthropic
+from utils.prompts import MCP_PROMPT
+from utils.pydantic_models import (
+ AnalysisData,
+ MCPResult,
+ QueryContext,
+ SynthesisData,
+)
+from zenml import step
+
+logger = logging.getLogger(__name__)
+
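+# Module-level client and key: ANTHROPIC_API_KEY and EXA_API_KEY are read from the
+# environment at import time, so both must be set before the pipeline runs.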
+anthropic = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
+exa_api_key = os.getenv("EXA_API_KEY")
+
+
+def preprocess_data_for_prompt(
+ query_context, synthesis_data: SynthesisData, analysis_data: AnalysisData
+) -> dict:
+ """Preprocess Pydantic objects into JSON strings for prompt injection.
+
+ Args:
+ query_context: Either a QueryContext object or a string containing the query
+ synthesis_data: The synthesis data containing synthesized and enhanced info
+ analysis_data: The analysis data containing viewpoint analysis and reflection metadata
+
+ Returns:
+ A dictionary with preprocessed JSON strings ready for prompt formatting
+ """
+ # Handle query_context - either string or QueryContext object
+ if isinstance(query_context, str):
+ user_query = query_context
+ logger.warning("query_context is a string, not a QueryContext object")
+ else:
+ user_query = query_context.main_query
+
+ # Convert synthesized_info dict to formatted JSON
+ synthesized_info_json = json.dumps(
+ {
+ k: v.model_dump()
+ for k, v in synthesis_data.synthesized_info.items()
+ },
+ indent=2,
+ ensure_ascii=False,
+ )
+
+ # Convert enhanced_info dict to formatted JSON
+ enhanced_info_json = json.dumps(
+ {k: v.model_dump() for k, v in synthesis_data.enhanced_info.items()},
+ indent=2,
+ ensure_ascii=False,
+ )
+
+ # Convert viewpoint_analysis to formatted JSON (handle None case)
+ if analysis_data.viewpoint_analysis:
+ viewpoint_analysis_json = json.dumps(
+ analysis_data.viewpoint_analysis.model_dump(),
+ indent=2,
+ ensure_ascii=False,
+ )
+ else:
+ viewpoint_analysis_json = "No viewpoint analysis available"
+
+ # Convert reflection_metadata to formatted JSON (handle None case)
+ if analysis_data.reflection_metadata:
+ reflection_metadata_json = json.dumps(
+ analysis_data.reflection_metadata.model_dump(),
+ indent=2,
+ ensure_ascii=False,
+ )
+ else:
+ reflection_metadata_json = "No reflection metadata available"
+
+ return {
+ "user_query": user_query,
+ "synthesized_info": synthesized_info_json,
+ "enhanced_info": enhanced_info_json,
+ "viewpoint_analysis": viewpoint_analysis_json,
+ "reflection_metadata": reflection_metadata_json,
+ }
+
+
+@step
+def mcp_updates_step(
+ query_context: QueryContext,
+ synthesis_data: SynthesisData,
+ analysis_data: AnalysisData,
+ langfuse_project_name: str,
+) -> Annotated[MCPResult, "mcp_results"]:
+ """Additional MCP-driven search of Exa.
+
+ This step is used to update the synthesis and analysis data with the results of
+ the MCP tools.
+ """
+ try:
+ # Preprocess all data into JSON strings for the prompt
+ preprocessed_data = preprocess_data_for_prompt(
+ query_context, synthesis_data, analysis_data
+ )
+
+ prompt = MCP_PROMPT.format(**preprocessed_data)
+
+ logger.info(f"Making MCP request for query: {prompt}")
+
+ response = anthropic.beta.messages.create(
+ model="claude-sonnet-4-20250514", # latest version works with MCP
+ max_tokens=3000,
+ messages=[
+ {
+ "role": "user",
+ "content": prompt,
+ }
+ ],
+ mcp_servers=[
+ {
+ "type": "url",
+ "url": f"https://mcp.exa.ai/mcp?exaApiKey={exa_api_key}",
+ "name": "exa-mcp-search",
+ "authorization_token": exa_api_key,
+ "tool_configuration": {
+ "enabled": True,
+ "allowed_tools": [
+ "research_paper_search",
+ "company_research",
+ "competitor_finder",
+ "linkedin_search",
+ "wikipedia_search_exa",
+ "github_search",
+ ],
+ },
+ }
+ ],
+ betas=["mcp-client-2025-04-04"], # needed for MCP functionality
+ )
+
+ # Safe extraction of response content
+ raw_mcp_result = ""
+ mcp_result = ""
+ last_text_index = -1
+ mcp_tool_result_index = -1
+
+ if hasattr(response, "content") and response.content:
+ # First pass: find indices of relevant content
+ for i, content_item in enumerate(response.content):
+ logger.debug(
+ f"Content item {i}: type={getattr(content_item, 'type', 'unknown')}"
+ )
+
+ if hasattr(content_item, "type"):
+ if content_item.type == "mcp_tool_result":
+ mcp_tool_result_index = i
+ elif content_item.type == "text":
+ last_text_index = i
+
+ # Extract MCP tool result if found
+ if mcp_tool_result_index >= 0:
+ content_item = response.content[mcp_tool_result_index]
+ if (
+ hasattr(content_item, "content")
+ and content_item.content
+ and len(content_item.content) > 0
+ and hasattr(content_item.content[0], "text")
+ ):
+ raw_mcp_result = content_item.content[0].text
+ logger.info(
+ f"Found raw MCP result at index {mcp_tool_result_index}"
+ )
+
+ # Extract the last text result if found
+ if last_text_index >= 0:
+ content_item = response.content[last_text_index]
+ if hasattr(content_item, "text"):
+ mcp_result = content_item.text
+ logger.info(
+ f"Found MCP text result at index {last_text_index} (last text item)"
+ )
+
+ if not raw_mcp_result and not mcp_result:
+ logger.warning("No MCP results found in response")
+
+ return MCPResult(
+ raw_mcp_result=raw_mcp_result,
+ mcp_result=mcp_result,
+ )
+
+ except Exception as e:
+ logger.error(f"Error in MCP updates step: {str(e)}", exc_info=True)
+ # Return empty results on error rather than crashing
+ return MCPResult(
+ raw_mcp_result="",
+ mcp_result=f"Error occurred: {str(e)}",
+ )
diff --git a/deep_research/steps/process_sub_question_step.py b/deep_research/steps/process_sub_question_step.py
index 7ff14ab1d..826578022 100644
--- a/deep_research/steps/process_sub_question_step.py
+++ b/deep_research/steps/process_sub_question_step.py
@@ -40,8 +40,8 @@ def process_sub_question_step(
search_query_prompt: Prompt,
synthesis_prompt: Prompt,
question_index: int,
- llm_model_search: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
- llm_model_synthesis: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ llm_model_search: str = "openrouter/google/gemini-2.0-flash-lite-001",
+ llm_model_synthesis: str = "openrouter/google/gemini-2.0-flash-lite-001",
num_results_per_search: int = 3,
cap_search_length: int = 20000,
search_provider: str = "tavily",
diff --git a/deep_research/steps/pydantic_final_report_step.py b/deep_research/steps/pydantic_final_report_step.py
index 339d42fb2..0ee82fc8a 100644
--- a/deep_research/steps/pydantic_final_report_step.py
+++ b/deep_research/steps/pydantic_final_report_step.py
@@ -22,6 +22,7 @@
from utils.pydantic_models import (
AnalysisData,
FinalReport,
+ MCPResult,
Prompt,
QueryContext,
SearchData,
@@ -33,6 +34,28 @@
logger = logging.getLogger(__name__)
+def extract_mcp_content(mcp_results: MCPResult) -> str:
+ """Extract the HTML content from MCPResult object.
+
+ Args:
+ mcp_results: The MCPResult object from the MCP step
+
+ Returns:
+ The HTML-formatted MCP result, or empty string if not available
+ """
+ if not mcp_results:
+ return ""
+
+ # Prefer the processed mcp_result over raw_mcp_result
+ if mcp_results.mcp_result:
+ return mcp_results.mcp_result
+ elif mcp_results.raw_mcp_result:
+ # If only raw result is available, wrap it in basic HTML
+ return f"Raw MCP search results:
{html.escape(mcp_results.raw_mcp_result)}"
+ else:
+ return ""
+
+
def clean_html_output(html_content: str) -> str:
"""Clean HTML output from LLM to ensure proper rendering.
@@ -172,7 +195,8 @@ def generate_executive_summary(
synthesis_data: SynthesisData,
analysis_data: AnalysisData,
executive_summary_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ mcp_results: MCPResult,
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
langfuse_project_name: str = "deep-research",
) -> str:
"""Generate an executive summary using LLM based on the complete research findings.
@@ -184,6 +208,7 @@ def generate_executive_summary(
executive_summary_prompt: Prompt for generating executive summary
llm_model: The model to use for generation
langfuse_project_name: Name of the Langfuse project for tracking
+ mcp_results: The results from the MCP tools
Returns:
HTML formatted executive summary
@@ -196,6 +221,7 @@ def generate_executive_summary(
"sub_questions": query_context.sub_questions,
"key_findings": {},
"viewpoint_analysis": None,
+ "mcp_results": extract_mcp_content(mcp_results),
}
# Include key findings from synthesis data
@@ -258,7 +284,8 @@ def generate_executive_summary(
def generate_introduction(
query_context: QueryContext,
introduction_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ mcp_results: MCPResult,
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
langfuse_project_name: str = "deep-research",
) -> str:
"""Generate an introduction using LLM based on research query and sub-questions.
@@ -275,10 +302,11 @@ def generate_introduction(
logger.info("Generating introduction using LLM")
# Prepare the context
- context = f"Main Research Query: {query_context.main_query}\n\n"
- context += "Sub-questions being explored:\n"
+ context = f"## Main Research Query: \n\n{query_context.main_query}\n\n"
+ context += "## Sub-questions being explored:\n"
for i, sub_question in enumerate(query_context.sub_questions, 1):
- context += f"{i}. {sub_question}\n"
+ context += f"### {i}. {sub_question}\n"
+ context += f"## Additional Exa MCP Results: \n\n{extract_mcp_content(mcp_results)}\n"
try:
# Call LLM to generate introduction
@@ -349,7 +377,8 @@ def generate_conclusion(
synthesis_data: SynthesisData,
analysis_data: AnalysisData,
conclusion_generation_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ mcp_results: MCPResult,
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
langfuse_project_name: str = "deep-research",
) -> str:
"""Generate a comprehensive conclusion using LLM based on all research findings.
@@ -361,6 +390,7 @@ def generate_conclusion(
conclusion_generation_prompt: Prompt for generating conclusion
llm_model: The model to use for conclusion generation
langfuse_project_name: Name of the Langfuse project for tracking
+ mcp_results: The results from the MCP tools
Returns:
str: HTML-formatted conclusion content
@@ -372,6 +402,7 @@ def generate_conclusion(
"main_query": query_context.main_query,
"sub_questions": query_context.sub_questions,
"enhanced_info": {},
+ "mcp_results": extract_mcp_content(mcp_results),
}
# Include enhanced information for each sub-question
@@ -487,7 +518,8 @@ def generate_report_from_template(
conclusion_generation_prompt: Prompt,
executive_summary_prompt: Prompt,
introduction_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ mcp_results: MCPResult,
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
langfuse_project_name: str = "deep-research",
) -> str:
"""Generate a final HTML report from a static template.
@@ -504,6 +536,7 @@ def generate_report_from_template(
executive_summary_prompt: Prompt for generating executive summary
introduction_prompt: Prompt for generating introduction
llm_model: The model to use for conclusion generation
+ mcp_results: The results from the MCP tools
langfuse_project_name: Name of the Langfuse project for tracking
Returns:
@@ -699,6 +732,7 @@ def generate_report_from_template(
synthesis_data,
analysis_data,
executive_summary_prompt,
+ mcp_results,
llm_model,
langfuse_project_name,
)
@@ -709,7 +743,11 @@ def generate_report_from_template(
# Generate dynamic introduction using LLM
logger.info("Generating dynamic introduction...")
introduction_html = generate_introduction(
- query_context, introduction_prompt, llm_model, langfuse_project_name
+ query_context,
+ introduction_prompt,
+ mcp_results,
+ llm_model,
+ langfuse_project_name,
)
logger.info(f"Introduction generated: {len(introduction_html)} characters")
@@ -719,6 +757,7 @@ def generate_report_from_template(
synthesis_data,
analysis_data,
conclusion_generation_prompt,
+ mcp_results,
llm_model,
langfuse_project_name,
)
@@ -958,8 +997,9 @@ def pydantic_final_report_step(
conclusion_generation_prompt: Prompt,
executive_summary_prompt: Prompt,
introduction_prompt: Prompt,
+ mcp_results: MCPResult,
use_static_template: bool = True,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
langfuse_project_name: str = "deep-research",
) -> Tuple[
Annotated[FinalReport, "final_report"],
@@ -979,6 +1019,7 @@ def pydantic_final_report_step(
introduction_prompt: Prompt for generating introduction
use_static_template: Whether to use a static template instead of LLM generation
llm_model: The model to use for report generation with provider prefix
+ mcp_results: The results from the MCP tools
langfuse_project_name: Name of the Langfuse project for tracking
Returns:
@@ -1000,6 +1041,7 @@ def pydantic_final_report_step(
conclusion_generation_prompt,
executive_summary_prompt,
introduction_prompt,
+ mcp_results,
llm_model,
langfuse_project_name,
)
diff --git a/deep_research/steps/query_decomposition_step.py b/deep_research/steps/query_decomposition_step.py
index 053a8fd4c..b74a99af6 100644
--- a/deep_research/steps/query_decomposition_step.py
+++ b/deep_research/steps/query_decomposition_step.py
@@ -14,7 +14,7 @@
def initial_query_decomposition_step(
main_query: str,
query_decomposition_prompt: Prompt,
- llm_model: str = "sambanova/DeepSeek-R1-Distill-Llama-70B",
+ llm_model: str = "openrouter/google/gemini-2.0-flash-lite-001",
max_sub_questions: int = 8,
langfuse_project_name: str = "deep-research",
) -> Annotated[QueryContext, "query_context"]:
diff --git a/deep_research/utils/llm_utils.py b/deep_research/utils/llm_utils.py
index 54dd27eed..3bc7ba2d5 100644
--- a/deep_research/utils/llm_utils.py
+++ b/deep_research/utils/llm_utils.py
@@ -12,8 +12,11 @@
logger = logging.getLogger(__name__)
# This module uses litellm for all LLM interactions
-# Models are specified with a provider prefix (e.g., "sambanova/DeepSeek-R1-Distill-Llama-70B")
-# ALL model names require a provider prefix (e.g., "sambanova/", "openai/", "anthropic/")
+# Models are specified with a provider prefix:
+# - For Google Gemini via OpenRouter: "openrouter/google/gemini-2.0-flash-lite-001"
+# - For direct Google Gemini API: "gemini/gemini-2.0-flash-lite-001"
+# - For other providers: "sambanova/", "openai/", "anthropic/", "meta/", "aws/"
+# ALL model names require a provider prefix
litellm.callbacks = ["langfuse"]
@@ -126,20 +129,23 @@ def run_llm_completion(
"""
try:
# Ensure model name has provider prefix
- if not any(
+ # Special handling for OpenRouter models which have a nested provider
+ if model.startswith("openrouter/"):
+ # OpenRouter models are valid (e.g., openrouter/google/gemini-2.0-flash-lite-001)
+ pass
+ elif not any(
model.startswith(prefix + "/")
for prefix in [
"sambanova",
"openai",
"anthropic",
"meta",
- "google",
+ "gemini", # Direct Google Gemini API
"aws",
- "openrouter",
]
):
# Raise an error if no provider prefix is specified
- error_msg = f"Model '{model}' does not have a provider prefix. Please specify provider (e.g., 'sambanova/{model}')"
+ error_msg = f"Model '{model}' does not have a provider prefix. Please specify provider (e.g., 'gemini/{model}', 'openrouter/{model}')"
logger.error(error_msg)
raise ValueError(error_msg)
@@ -317,20 +323,23 @@ def find_most_relevant_string(
if model:
try:
# Ensure model name has provider prefix
- if not any(
+ # Special handling for OpenRouter models which have a nested provider
+ if model.startswith("openrouter/"):
+ # OpenRouter models are valid (e.g., openrouter/google/gemini-2.0-flash-lite-001)
+ pass
+ elif not any(
model.startswith(prefix + "/")
for prefix in [
"sambanova",
"openai",
"anthropic",
"meta",
- "google",
+ "gemini", # Direct Google Gemini API
"aws",
- "openrouter",
]
):
# Raise an error if no provider prefix is specified
- error_msg = f"Model '{model}' does not have a provider prefix. Please specify provider (e.g., 'sambanova/{model}')"
+ error_msg = f"Model '{model}' does not have a provider prefix. Please specify provider (e.g., 'gemini/{model}', 'openrouter/{model}')"
logger.error(error_msg)
raise ValueError(error_msg)
diff --git a/deep_research/utils/prompts.py b/deep_research/utils/prompts.py
index ced42f6fb..00cd62634 100644
--- a/deep_research/utils/prompts.py
+++ b/deep_research/utils/prompts.py
@@ -1433,3 +1433,65 @@
"""
+
+MCP_PROMPT = """This is the final stage in a multi-step research-pipeline.
+
+You will be given the following information:
+
+- the original user query
+- written text synthesis that was generated based on the search data
+- analysis data containing reflection and viewpoint analysis
+
+You have the following tools available to you:
+
+- web_search_exa: Real-time web search with content extraction
+- research_paper_search: Academic paper and research content search
+- company_research: Company website crawling for business information
+- crawling: Extract content from specific URLs
+- competitor_finder: Find company competitors
+- linkedin_search: Search LinkedIn for companies and people
+- wikipedia_search_exa: Retrieve information from Wikipedia articles
+- github_search: Search GitHub repositories and issues
+
+Please use the tools to search for anything you feel might still be needed to
+answer or to round out the research. The results of what you find will be passed
+to the final report generation and summarization step.
+
+## User Query
+
+{user_query}
+
+
+## Synthesis Data
+
+### Synthesized Info
+
+{synthesized_info}
+
+
+### Enhanced Info
+
+{enhanced_info}
+
+
+## Analysis Data
+
+### Viewpoint Analysis
+
+{viewpoint_analysis}
+
+
+### Reflection Metadata
+
+{reflection_metadata}
+
+
+Now please use the tools to search for anything you feel might still be needed to
+answer or to round out the research. The results of what you find will be passed
+to the final report generation and summarization step.
+
+Format your output as a well-structured conclusion section in HTML format with
+appropriate paragraph breaks and formatting. Use <p> tags for paragraphs and
+organize the content logically with clear transitions between the different
+aspects outlined above.
+"""
diff --git a/deep_research/utils/pydantic_models.py b/deep_research/utils/pydantic_models.py
index 822afe991..4b667b312 100644
--- a/deep_research/utils/pydantic_models.py
+++ b/deep_research/utils/pydantic_models.py
@@ -501,3 +501,20 @@ class FinalReport(BaseModel):
"frozen": False,
"validate_assignment": True,
}
+
+
+class MCPResult(BaseModel):
+ """Contains the result of the MCP update."""
+
+ raw_mcp_result: str = Field(
+ default="", description="The raw output from the Exa search"
+ )
+ mcp_result: str = Field(
+ default="", description="The LLM-processed result of the Exa search"
+ )
+
+ model_config = {
+ "extra": "ignore",
+ "frozen": False,
+ "validate_assignment": True,
+ }
diff --git a/huggingface-sagemaker/utils/misc.py b/huggingface-sagemaker/utils/misc.py
index a76229496..a4bf2437b 100644
--- a/huggingface-sagemaker/utils/misc.py
+++ b/huggingface-sagemaker/utils/misc.py
@@ -32,7 +32,7 @@ def compute_metrics(eval_pred: tuple) -> Dict[str, float]:
"""
logits, labels = eval_pred
predictions = np.argmax(logits, axis=-1)
- # calculate the mertic using the predicted and true value
+ # calculate the metric using the predicted and true value
accuracy = load_metric("accuracy").compute(
predictions=predictions, references=labels
)