
🔬 LangGraph Research Agent

An intelligent research assistant powered by LangGraph, Tavily Search, and OpenAI that automatically gathers, analyzes, and synthesizes information on any topic.

✨ Features

  • Automated Research Pipeline: Seamlessly orchestrates web search and AI analysis
  • Intelligent Information Synthesis: Uses GPT-4 to extract key insights from multiple sources
  • Structured Output: Delivers research findings in a clean, organized format
  • Source Attribution: Maintains references to all information sources
  • Type-Safe Architecture: Built with Pydantic models for robust data validation

πŸ—οΈ Architecture

The agent is built as a LangGraph workflow with two main nodes:

┌─────────────┐      ┌──────────────┐      ┌─────────┐
│   Search    │ ───► │   Analyze    │ ───► │   END   │
│  (Tavily)   │      │  (OpenAI)    │      └─────────┘
└─────────────┘      └──────────────┘

How It Works

  1. Search Node: Queries Tavily API to find the 5 most relevant sources for your research topic
  2. Analyze Node: Feeds all sources to GPT-4, which extracts key insights and creates a comprehensive summary
  3. Output: Returns a structured ResearchReport with insights, summary, and references
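
In LangGraph terms, that pipeline can be wired up in a few lines. The following is a minimal sketch (node names match the diagram above; the actual wiring lives in app/graph.py and may differ in detail):

from langgraph.graph import StateGraph, END

from app.state import AgentState
from app.nodes.search import research_search
from app.nodes.analyze import analyze_research

def build_graph() -> StateGraph:
    graph = StateGraph(AgentState)
    graph.add_node("search", research_search)    # Tavily search node
    graph.add_node("analyze", analyze_research)  # OpenAI analysis node
    graph.set_entry_point("search")
    graph.add_edge("search", "analyze")
    graph.add_edge("analyze", END)
    return graph

agent = build_graph().compile()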

🚀 Getting Started

Prerequisites

  • Python 3.12
  • OpenAI API key
  • Tavily API key

Installation

  1. Clone the repository:

git clone <your-repo-url>
cd langgraph-research-agent

  2. Install dependencies:

pip install langgraph tavily-python openai pydantic python-dotenv

  3. Create a .env file in the project root:

OPENAI_API_KEY=your_openai_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
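
These keys are expected to be loaded at startup via python-dotenv (one of the listed dependencies). A minimal sketch of that loading step:

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root into the process environment
openai_key = os.environ["OPENAI_API_KEY"]
tavily_key = os.environ["TAVILY_API_KEY"]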

Usage

Run the agent:

python agent.py

Enter your research topic when prompted:

Enter your research topic: Recent advances in quantum computing

The agent will:

  • Search for relevant information
  • Analyze and synthesize findings
  • Present key insights with source references

Example Output

πŸ” Searching for relevant information...

✅ Research Summary:

🧩 Topic: Recent advances in quantum computing

📌 Key Insights:
  1. Google's Willow chip achieved quantum error correction milestone
  2. IBM expanded its quantum computing cloud platform
  3. Quantum algorithms showing promise in drug discovery
  4. Major tech companies investing billions in quantum research
  5. Practical applications expected within 3-5 years

🧠 Summary:
 Recent developments in quantum computing show significant progress...

📚 References:
 - Google Announces Quantum Breakthrough
   https://example.com/google-quantum
 - IBM Quantum Platform Update
   https://example.com/ibm-quantum
...

📦 Project Structure

.
├── app/
│   ├── __init__.py         # Exports `agent`, `build_graph`, and data models
│   ├── graph.py            # Assembles the LangGraph and compiles the agent
│   ├── state.py            # AgentState and Pydantic models
│   └── nodes/
│       ├── search.py       # Tavily search node
│       ├── validate.py     # Credibility scoring and validation node
│       └── analyze.py      # OpenAI summarization/analysis node
├── streamlit_app.py        # Web UI
├── research_agent.py       # Legacy monolithic module (kept for compatibility)
├── .env                    # API keys (not committed)
├── pyproject.toml          # Python dependencies and metadata
└── README.md               # This file

Using the agent in code

from app import agent  # pre-compiled graph
# or build it yourself
from app import build_graph
agent = build_graph().compile()
result = agent.invoke({"input_query": "Recent advances in quantum computing"})
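
The returned state should then contain the generated report (see the AgentState and ResearchReport descriptions under Core Components). For example, assuming those field names:

report = result["report"]
print(report.summary)
for insight in report.key_insights:
    print("-", insight)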

🔧 Configuration

Customizing Search Results

Modify the max_results parameter in the research_search function:

data = tavily.search(query, max_results=10)  # Get more sources

Changing the AI Model

Update the model in the analyze_research function:

response = client.chat.completions.create(
    model="gpt-4o",  # Use a different model
    messages=[{"role": "user", "content": prompt}],
)

Adjusting the Analysis Prompt

Customize the analysis style by modifying the prompt template in analyze_research ({domain} below is a placeholder for a variable you define yourself):

prompt = f"""
You are an expert researcher specializing in {domain}.
Provide a technical analysis of the following sources...
"""

🧩 Core Components

Data Models

ResearchSource: Represents a single information source

  • title: Source title
  • url: Source URL
  • content: Extracted content

ResearchReport: Final output structure

  • topic: Research query
  • key_insights: List of main findings
  • summary: Synthesized overview
  • references: List of sources

AgentState: Graph state container

  • input_query: User's research topic
  • sources: Retrieved sources
  • report: Generated report
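
A rough sketch of how these models could be declared in app/state.py (field names follow the lists above; the actual definitions may differ):

from typing import TypedDict
from pydantic import BaseModel

class ResearchSource(BaseModel):
    title: str      # Source title
    url: str        # Source URL
    content: str    # Extracted content

class ResearchReport(BaseModel):
    topic: str                        # Research query
    key_insights: list[str]           # List of main findings
    summary: str                      # Synthesized overview
    references: list[ResearchSource]  # List of sources

class AgentState(TypedDict, total=False):
    input_query: str                  # User's research topic
    sources: list[ResearchSource]     # Retrieved sources
    report: ResearchReport            # Generated report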

Node Functions

research_search(state): Tavily search integration

  • Queries Tavily API with the input topic
  • Extracts and structures search results
  • Updates state with sources

analyze_research(state): AI-powered analysis

  • Combines all source content
  • Generates structured insights using GPT-4
  • Creates final ResearchReport
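
Put together, the two nodes could look roughly like this (illustrative only; the real implementations live in app/nodes/search.py and app/nodes/analyze.py, and the prompt and insight parsing are simplified here):

import os
from openai import OpenAI
from tavily import TavilyClient

from app.state import AgentState, ResearchSource, ResearchReport

tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
client = OpenAI()

def research_search(state: AgentState) -> dict:
    # Query Tavily for the top sources on the research topic
    data = tavily.search(state["input_query"], max_results=5)
    sources = [
        ResearchSource(title=r["title"], url=r["url"], content=r["content"])
        for r in data["results"]
    ]
    return {"sources": sources}

def analyze_research(state: AgentState) -> dict:
    # Combine all source content and ask the model for a synthesis
    corpus = "\n\n".join(s.content for s in state["sources"])
    prompt = f"Summarize the key insights on '{state['input_query']}':\n\n{corpus}"
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    report = ResearchReport(
        topic=state["input_query"],
        key_insights=[],  # insight extraction/parsing omitted in this sketch
        summary=response.choices[0].message.content,
        references=state["sources"],
    )
    return {"report": report}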

πŸ› οΈ Extending the Agent

Adding New Nodes

def validate_sources(state: AgentState) -> AgentState:
    """Filter sources by relevance score."""
    # meets_criteria is a placeholder for whatever filtering logic you need
    filtered = [s for s in state["sources"] if meets_criteria(s)]
    return {"sources": filtered}

# Wire the new node in between search and analyze
# (replacing the direct search -> analyze edge):
graph.add_node("validate", validate_sources)
graph.add_edge("search", "validate")
graph.add_edge("validate", "analyze")

Adding Conditional Routing

def should_continue(state: AgentState) -> str:
    if len(state["sources"]) < 3:
        return "search"  # Re-search if insufficient sources
    return "analyze"

graph.add_conditional_edges("search", should_continue)

πŸ› Troubleshooting

API Key Errors: Ensure your .env file contains valid API keys and is in the project root.

Import Errors: Verify that all dependencies are installed, e.g. pip install langgraph tavily-python openai pydantic python-dotenv.

Model Not Found: Make sure you're using a valid OpenAI model name (e.g., gpt-4o-mini, gpt-4o).

Empty Results: Check that your Tavily API key has sufficient credits and that the topic isn't too obscure.

📄 License

MIT License - feel free to use this project however you'd like!

🤝 Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

💡 Future Enhancements

  • Add support for multiple search providers
  • Implement source credibility scoring
  • Add export functionality (PDF, Markdown)
  • Create web interface with Streamlit
  • Add multi-language support
  • Implement citation formatting (APA, MLA, Chicago)
  • Add research history and caching

Built with ❀️ using LangGraph, Tavily, and OpenAI

🖥️ Web Interface (Streamlit)

A simple Streamlit web app is included to run the research agent with a UI.

Run the app

streamlit run streamlit_app.py

Then open the URL shown in the terminal (usually http://localhost:8501).

The app will prompt for a topic, run the research pipeline, and display:

  • Key insights
  • A detailed summary
  • Credibility metrics (high/medium/low) for sources
  • Expandable reference cards with URLs, scores, and reasoning
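
Under the hood, streamlit_app.py only needs to wrap agent.invoke in a small UI. A simplified sketch (the real app also surfaces per-source credibility scores and reasoning):

import streamlit as st
from app import agent

st.title("LangGraph Research Agent")
topic = st.text_input("Research topic")

if st.button("Run research") and topic:
    with st.spinner("Researching..."):
        result = agent.invoke({"input_query": topic})
    report = result["report"]

    st.subheader("Key Insights")
    for insight in report.key_insights:
        st.markdown(f"- {insight}")

    st.subheader("Summary")
    st.write(report.summary)

    st.subheader("References")
    for src in report.references:
        with st.expander(src.title):
            st.write(src.url)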
