An intelligent research assistant powered by LangGraph, Tavily Search, and OpenAI that automatically gathers, analyzes, and synthesizes information on any topic.
- Automated Research Pipeline: Seamlessly orchestrates web search and AI analysis
- Intelligent Information Synthesis: Uses GPT-4 to extract key insights from multiple sources
- Structured Output: Delivers research findings in a clean, organized format
- Source Attribution: Maintains references to all information sources
- Type-Safe Architecture: Built with Pydantic models for robust data validation
The agent is built as a LangGraph workflow with two main nodes:
┌─────────────┐      ┌──────────────┐      ┌─────────┐
│   Search    │ ───► │   Analyze    │ ───► │   END   │
│  (Tavily)   │      │   (OpenAI)   │      └─────────┘
└─────────────┘      └──────────────┘
- Search Node: Queries Tavily API to find the 5 most relevant sources for your research topic
- Analyze Node: Feeds all sources to GPT-4, which extracts key insights and creates a comprehensive summary
- Output: Returns a structured `ResearchReport` with insights, summary, and references
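The two-node flow above can be pictured without any framework. The sketch below is plain Python, not LangGraph itself: each node receives the state and returns a partial update that is merged back in, which approximates how state moves from Search to Analyze. The stub node bodies and the `run_pipeline` helper are hypothetical; the real nodes call Tavily and OpenAI.

```python
from typing import Callable, TypedDict


class AgentState(TypedDict, total=False):
    input_query: str
    sources: list[dict]
    report: dict


def search_stub(state: AgentState) -> AgentState:
    # Stand-in for the Tavily search node: returns one canned source.
    query = state["input_query"]
    return {"sources": [{"title": f"About {query}", "url": "https://example.com", "content": "..."}]}


def analyze_stub(state: AgentState) -> AgentState:
    # Stand-in for the OpenAI analysis node: "summarizes" by counting sources.
    sources = state["sources"]
    return {
        "report": {
            "topic": state["input_query"],
            "summary": f"{len(sources)} source(s) reviewed",
            "references": sources,
        }
    }


def run_pipeline(state: AgentState, nodes: list[Callable[[AgentState], AgentState]]) -> AgentState:
    # Run the nodes in order, merging each partial update into the state.
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_pipeline({"input_query": "quantum computing"}, [search_stub, analyze_stub])
print(final_state["report"]["summary"])
```

Because each node returns only the keys it changes, a node can be swapped or inserted without touching the others.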
- Python 3.12
- OpenAI API key
- Tavily API key
- Clone the repository:
git clone <your-repo-url>
cd langgraph-research-agent
- Install dependencies:
pip install langgraph tavily-python openai pydantic python-dotenv
- Create a `.env` file in the project root:
OPENAI_API_KEY=your_openai_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here

Run the agent:
python agent.py

Enter your research topic when prompted:
Enter your research topic: Recent advances in quantum computing
The agent will:
- Search for relevant information
- Analyze and synthesize findings
- Present key insights with source references
Searching for relevant information...
✅ Research Summary:
🧩 Topic: Recent advances in quantum computing
Key Insights:
1. Google's Willow chip achieved quantum error correction milestone
2. IBM expanded its quantum computing cloud platform
3. Quantum algorithms showing promise in drug discovery
4. Major tech companies investing billions in quantum research
5. Practical applications expected within 3-5 years
🧠 Summary:
Recent developments in quantum computing show significant progress...
References:
- Google Announces Quantum Breakthrough
https://example.com/google-quantum
- IBM Quantum Platform Update
https://example.com/ibm-quantum
...
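A report with the fields shown above could be rendered to the console roughly as follows. This is only a sketch: `format_report` is a hypothetical helper, the report is passed as a plain dict instead of the project's `ResearchReport` model, and the sample data is invented for the demonstration.

```python
def format_report(report: dict) -> str:
    """Render a report dict as console text (hypothetical helper)."""
    lines = ["Research Summary:", f"Topic: {report['topic']}", "Key Insights:"]
    # Number the insights starting from 1.
    for i, insight in enumerate(report["key_insights"], start=1):
        lines.append(f"{i}. {insight}")
    lines += ["Summary:", report["summary"], "References:"]
    # Each reference prints as a title line followed by its URL.
    for ref in report["references"]:
        lines.append(f"- {ref['title']}")
        lines.append(f"  {ref['url']}")
    return "\n".join(lines)


example_report = {
    "topic": "Example topic",
    "key_insights": ["First insight", "Second insight"],
    "summary": "A short synthesized overview.",
    "references": [{"title": "Example Source", "url": "https://example.com/source"}],
}
output = format_report(example_report)
print(output)
```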
.
├── app/
│   ├── __init__.py         # Exports `agent`, `build_graph`, and data models
│   ├── graph.py            # Assembles the LangGraph and compiles the agent
│   ├── state.py            # AgentState and Pydantic models
│   └── nodes/
│       ├── search.py       # Tavily search node
│       ├── validate.py     # Credibility scoring and validation node
│       └── analyze.py      # OpenAI summarization/analysis node
├── streamlit_app.py        # Web UI
├── research_agent.py       # Legacy monolithic module (kept for compatibility)
├── .env                    # API keys (not committed)
├── pyproject.toml          # Python dependencies and metadata
└── README.md               # This file
Using the agent in code
from app import agent  # pre-compiled graph

# or build it yourself
from app import build_graph
agent = build_graph().compile()

result = agent.invoke({"input_query": "Recent advances in quantum computing"})

Modify the `max_results` parameter in the `research_search` function:

data = tavily.search(query, max_results=10)  # Get more sources

Update the model in the `analyze_research` function:
response = client.chat.completions.create(
    model="gpt-4o",  # Use a different model
    messages=[{"role": "user", "content": prompt}],
)

Customize the analysis style by modifying the prompt template in `analyze_research`:

prompt = f"""
You are an expert researcher specializing in {domain}.
Provide a technical analysis of the following sources...
"""

`ResearchSource`: Represents a single information source
- `title`: Source title
- `url`: Source URL
- `content`: Extracted content

`ResearchReport`: Final output structure
- `topic`: Research query
- `key_insights`: List of main findings
- `summary`: Synthesized overview
- `references`: List of sources

`AgentState`: Graph state container
- `input_query`: User's research topic
- `sources`: Retrieved sources
- `report`: Generated report
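The three structures above could be declared roughly as in the sketch below. The field names come from the lists above, but the field types are assumptions, and so is the choice of a `TypedDict` for `AgentState`; the actual definitions in `app/state.py` may differ.

```python
from typing import TypedDict

from pydantic import BaseModel, ValidationError


class ResearchSource(BaseModel):
    title: str
    url: str
    content: str


class ResearchReport(BaseModel):
    topic: str
    key_insights: list[str]
    summary: str
    references: list[ResearchSource]


class AgentState(TypedDict, total=False):
    # total=False lets the state start with only input_query set.
    input_query: str
    sources: list[ResearchSource]
    report: ResearchReport


source = ResearchSource(title="Example", url="https://example.com", content="...")
report = ResearchReport(
    topic="Example topic",
    key_insights=["One finding"],
    summary="Short overview.",
    references=[source],
)

# Pydantic rejects incomplete data: this source has no content field.
try:
    ResearchSource(title="Broken", url="https://example.com")
    validation_failed = False
except ValidationError:
    validation_failed = True
print(validation_failed)
```

Validating at construction time means a malformed search result fails at the node that produced it, not later during analysis.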
`research_search(state)`: Tavily search integration
- Queries the Tavily API with the input topic
- Extracts and structures search results
- Updates state with `sources`

`analyze_research(state)`: AI-powered analysis
- Combines all source content
- Generates structured insights using GPT-4
- Creates the final `ResearchReport`
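The "extracts and structures search results" step might look like the sketch below. It assumes the Tavily response is a dict with a `results` list whose items carry `title`, `url`, and `content` keys. `extract_sources` is a hypothetical helper, and the sample response is invented so the example runs without an API call.

```python
def extract_sources(response: dict, limit: int = 5) -> list[dict]:
    """Convert a Tavily-style response into a list of source dicts."""
    sources = []
    for item in response.get("results", [])[:limit]:
        # Skip entries without a URL, since they cannot be cited.
        if not item.get("url"):
            continue
        sources.append(
            {
                "title": item.get("title", "Untitled"),
                "url": item["url"],
                "content": item.get("content", ""),
            }
        )
    return sources


# Invented sample response, shaped like the assumed Tavily output.
sample_response = {
    "results": [
        {"title": "First", "url": "https://example.com/1", "content": "alpha"},
        {"title": "No link", "url": "", "content": "dropped"},
        {"url": "https://example.com/3"},
    ]
}
sources = extract_sources(sample_response)
print(len(sources))
```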
def validate_sources(state: AgentState) -> AgentState:
    """Filter sources by relevance score."""
    # meets_criteria is a placeholder for your own filtering predicate
    filtered = [s for s in state["sources"] if meets_criteria(s)]
    return {"sources": filtered}

graph.add_node("validate", validate_sources)
graph.add_edge("search", "validate")
graph.add_edge("validate", "analyze")

def should_continue(state: AgentState) -> str:
    if len(state["sources"]) < 3:
        # Re-search if insufficient sources; cap the retries in real use,
        # or an obscure topic can loop indefinitely
        return "search"
    return "analyze"

graph.add_conditional_edges("search", should_continue)

API Key Errors: Ensure your `.env` file contains valid API keys and is in the project root.
Import Errors: Verify all dependencies are installed by rerunning the `pip install` command from the installation steps.
Model Not Found: Make sure you're using a valid OpenAI model name (e.g., gpt-4o-mini, gpt-4o)
Empty Results: Check that your Tavily API key has sufficient credits and the topic isn't too obscure
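For the API key case, a quick check like the sketch below can confirm that both variables are visible to the process. `missing_api_keys` is a hypothetical helper, not part of the project. It only inspects environment variables, so run it after the `.env` file has been loaded (for example with python-dotenv's `load_dotenv()`).

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def missing_api_keys(env=None) -> list[str]:
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


missing = missing_api_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All API keys are set.")
```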
MIT License - feel free to use this project however you'd like!
Contributions are welcome! Feel free to open issues or submit pull requests.
- Add support for multiple search providers
- Implement source credibility scoring
- Add export functionality (PDF, Markdown)
- Create web interface with Streamlit
- Add multi-language support
- Implement citation formatting (APA, MLA, Chicago)
- Add research history and caching
Built with ❤️ using LangGraph, Tavily, and OpenAI
A simple Streamlit web app is included to run the research agent with a UI.
streamlit run streamlit_app.py
Then open the URL shown in the terminal (usually http://localhost:8501).
The app will prompt for a topic, run the research pipeline, and display:
- Key insights
- A detailed summary
- Credibility metrics (high/medium/low) for sources
- Expandable reference cards with URLs, scores, and reasoning
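How the app maps a score to high, medium, or low is not described here. Purely as an illustration, the sketch below buckets a score between 0 and 1 using made-up thresholds (0.7 and 0.4). `credibility_level` and both cutoffs are assumptions, not the project's actual logic.

```python
from collections import Counter


def credibility_level(score: float, high: float = 0.7, medium: float = 0.4) -> str:
    """Map a 0-1 score to a label; the thresholds are illustrative assumptions."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Invented scores, used only to show how per-level counts could be tallied.
scores = [0.9, 0.55, 0.2, 0.75]
counts = Counter(credibility_level(s) for s in scores)
print(dict(counts))
```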