A sophisticated AI agent system built with LangGraph that combines document retrieval, weather services, and intelligent query processing. The pipeline automatically determines user intent and routes queries to appropriate services while maintaining conversation context.
- Multi-Modal Query Processing: Handles weather queries, document Q&A, and general conversations
- Smart Intent Detection: Automatically classifies user queries and routes to appropriate handlers
- Document Retrieval: PDF processing with vector database storage using ChromaDB
- Weather Integration: Real-time weather data retrieval
- Multiple Interfaces: Command-line, interactive mode, and Streamlit web UI
- LangSmith Integration: Built-in observability and debugging
- Modular Architecture: Clean separation of services and pipeline logic
- Python 3.8+
- OpenAI API key
- Weather API key (OpenWeatherMap)
- Git
- Clone the repository
git clone https://github.com/avinash00134/ai-agent-pipeline
cd ai-agent-pipeline
- Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
Create a `.env` file in the root directory with the following variables:
# Required API Keys
OPENAI_API_KEY=your_openai_api_key_here
WEATHER_API_KEY=your_weather_api_key_here
# Optional: LangSmith (for debugging and observability)
LANGCHAIN_API_KEY=your_langsmith_api_key
LANGCHAIN_PROJECT=ai-agent-pipeline
LANGCHAIN_TRACING_V2=true
# Model Configuration (optional - defaults provided)
LLM_MODEL=gpt-3.5-turbo
EMBEDDING_MODEL=all-MiniLM-L6-v2
# Vector Database (optional - defaults provided)
CHROMA_PERSIST_DIRECTORY=./chroma_db
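As a rough sketch of how `src/config.py` might consume these variables (the attribute names are illustrative and the real module may differ; `python-dotenv`'s `load_dotenv()` would be called first to populate the environment), with defaults matching those documented above:

```python
import os

# Illustrative config loader; the real src/config.py may differ.
# Defaults mirror the documented fallbacks above.
class Config:
    def __init__(self):
        self.openai_api_key = os.getenv("OPENAI_API_KEY")
        self.weather_api_key = os.getenv("WEATHER_API_KEY")
        self.llm_model = os.getenv("LLM_MODEL", "gpt-3.5-turbo")
        self.embedding_model = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
        self.chroma_persist_directory = os.getenv(
            "CHROMA_PERSIST_DIRECTORY", "./chroma_db"
        )

    def is_valid(self) -> bool:
        # Both required keys must be present for the pipeline to start.
        return bool(self.openai_api_key and self.weather_api_key)
```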
- OpenAI API Key: Get from OpenAI Platform
- Weather API Key: Get from OpenWeatherMap or similar service
- LangSmith API Key (optional): Get from LangSmith
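To illustrate what the weather lookup involves, here is a hedged sketch of building a request URL for OpenWeatherMap's public current-weather endpoint using only the standard library (the real `WeatherService` and its exact parameters may differ):

```python
from urllib.parse import urlencode

# OpenWeatherMap's current-weather endpoint (public API).
OWM_ENDPOINT = "https://api.openweathermap.org/data/2.5/weather"

def build_weather_url(city: str, api_key: str, units: str = "metric") -> str:
    """Build a request URL for current weather in the given city."""
    query = urlencode({"q": city, "appid": api_key, "units": units})
    return f"{OWM_ENDPOINT}?{query}"

# Fetching is then a single GET, e.g. with requests:
#   requests.get(build_weather_url("London", api_key)).json()
```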
Check configuration status:
python main.py --config
Load PDF documents into the vector database:
python main.py --load-pdfs document1.pdf document2.pdf folder/*.pdf
Process a single query:
python main.py --query "What's the weather in London?"
Interactive mode:
python main.py --interactive
# or simply
python main.py
Launch Streamlit web interface:
python main.py --streamlit
🤖 AI Agent Pipeline - Interactive Mode
==================================================
You can ask about:
🌤️ Weather: 'What's the weather in London?'
📄 Documents: 'What does this document say about...?'
💬 General: Any other questions
Type 'quit' to exit
==================================================
💬 Your question: What's the weather like in Paris?
🔄 Processing...
🤖 Response: The current weather in Paris is 22°C with clear skies...
🎯 Intent: weather
🌤️ Weather data retrieved for: Paris
The Streamlit interface provides a user-friendly web UI for interacting with the pipeline:
streamlit run streamlit_app.py
ai-agent-pipeline/
├── src/
│   ├── pipeline/
│   │   └── langgraph_pipeline.py
│   ├── services/
│   │   ├── pdf_service.py
│   │   ├── vector_service.py
│   │   └── weather_service.py
│   └── config.py
├── tests/
│   ├── test_pipeline/
│   │   └── test_langgraph_pipeline.py
│   ├── test_services/
│   │   ├── test_pdf_service.py
│   │   ├── test_vector_service.py
│   │   └── test_weather_service.py
│   └── conftest.py
├── main.py
├── streamlit_app.py
├── setup.py
└── README.md
The system uses LangGraph to build a stateful pipeline with four stages:
- Intent Classification: Determines whether a query is weather-related, document-related, or general
- Service Routing: Routes to appropriate service (Weather, Vector DB, or LLM)
- Response Generation: Combines retrieved data with LLM-generated responses
- Context Management: Maintains conversation state across interactions
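Conceptually, the state and routing can be reduced to the stdlib-only sketch below. The real pipeline uses a LangGraph `StateGraph` with conditional edges and asks the LLM to classify intent; the keyword checks, field names, and handler placeholders here are purely illustrative:

```python
from typing import Optional, TypedDict

class PipelineState(TypedDict):
    query: str
    intent: Optional[str]
    response: Optional[str]

def classify_intent(state: PipelineState) -> PipelineState:
    # Keyword stand-in for the LLM-based classifier.
    q = state["query"].lower()
    if any(w in q for w in ("weather", "forecast", "raining")):
        state["intent"] = "weather"
    elif any(w in q for w in ("document", "paper", "summarize")):
        state["intent"] = "document"
    else:
        state["intent"] = "general"
    return state

# Each intent routes to a service; placeholders stand in for real calls.
HANDLERS = {
    "weather": lambda s: {**s, "response": "<weather service result>"},
    "document": lambda s: {**s, "response": "<vector DB retrieval + LLM>"},
    "general": lambda s: {**s, "response": "<direct LLM answer>"},
}

def run(query: str) -> PipelineState:
    state = classify_intent({"query": query, "intent": None, "response": None})
    return HANDLERS[state["intent"]](state)
```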
- PDFService: Extracts and chunks text from PDF documents
- VectorService: Manages ChromaDB for document storage and retrieval
- WeatherService: Fetches real-time weather data
- AIAgentPipeline: Orchestrates the entire workflow
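For instance, the chunking step in `PDFService` can be pictured as a sliding window over the extracted text. The chunk size, overlap, and function name below are illustrative (the real service may use a LangChain text splitter instead):

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character chunks for embedding."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```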
# These queries will be routed to weather service
"What's the weather in Tokyo?"
"Is it raining in Seattle?"
"Show me the forecast for New York"
# These queries will search your loaded documents
"What does the document say about machine learning?"
"Summarize the key findings from the research paper"
"Find information about data privacy policies"
# These will be handled by the general LLM
"Explain quantum computing"
"Write a poem about spring"
"Help me plan a vacation"
If you've configured LangSmith, you can monitor pipeline execution:
- Visit LangSmith
- Navigate to your project (configured in `LANGCHAIN_PROJECT`)
- View detailed traces of pipeline execution, including:
- Intent classification decisions
- Service routing
- Retrieval results
- Response generation
Configuration Errors: Run the following to verify that all API keys are properly set:
python main.py --config
Vector Database Issues:
# Reset the vector database
python -c "from src.services.vector_service import VectorService; VectorService().reset_collection()"
PDF Processing Errors:
- Ensure PDFs are not password-protected
- Check file permissions
- Verify PDF files are not corrupted
- ❌ Configuration error: Missing or invalid API keys
- ❌ PDF file not found: Check file paths
- ❌ Failed to initialize pipeline: Usually configuration-related
Key dependencies include:
- `langchain` - LLM framework
- `langgraph` - Stateful pipeline orchestration
- `streamlit` - Web interface
- `chromadb` - Vector database
- `openai` - OpenAI API integration
- `requests` - HTTP requests for weather API
- `pypdf` - PDF processing
- `python-dotenv` - Environment variable management
If you encounter issues or have questions:
- Check the troubleshooting section above
- Review the configuration with `python main.py --config`
- Check the logs for detailed error messages
- Open an issue on GitHub with:
- Your configuration (without API keys)
- Error messages
- Steps to reproduce