A sophisticated AI-powered personal assistant that combines Retrieval-Augmented Generation (RAG) with a multi-agent workflow to provide intelligent, context-aware responses based on personal knowledge bases including Obsidian vaults and PDF documents.
Demo video: `docker-demo.mp4`
## Enhanced Multi-Agent RAG System
- Early Safety Validation: Questions are now evaluated for safety before processing, ensuring appropriate content filtering
- Content Quality Ranking: Automatic evaluation of retrieved information quality with multi-attempt retrieval for better answers
- Professional Response Formatting: Final answers are now polished with clear structure, proper formatting, and accessibility improvements
- Improved Error Handling: More robust error recovery and user-friendly error messages
## Web API & Containerization
- FastAPI Web Interface: Full REST API with session management and health checks
- Docker Support: Complete containerization with optimized Dockerfile and docker-compose
- AWS Lambda Ready: Mangum integration for serverless deployment
- Interactive Web Chat: HTML chat interface for browser-based interaction
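The session management mentioned above can be pictured as a small in-memory store keyed by session ID. This is a stdlib-only sketch, not the project's actual API; the `SessionStore` class and its methods are illustrative names:

```python
import uuid

class SessionStore:
    """Minimal in-memory session store (illustrative sketch only)."""

    def __init__(self):
        self._sessions = {}  # session_id -> list of (role, message) turns

    def create(self) -> str:
        # Issue a fresh opaque session identifier.
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = []
        return session_id

    def append(self, session_id: str, role: str, message: str) -> None:
        # Record one conversation turn under the session.
        self._sessions[session_id].append((role, message))

    def history(self, session_id: str):
        # Return a copy of the conversation so far.
        return list(self._sessions[session_id])

store = SessionStore()
sid = store.create()
store.append(sid, "user", "What is machine learning?")
store.append(sid, "assistant", "Machine learning is a subset of AI...")
print(len(store.history(sid)))  # 2
```

A real deployment would persist this store (or scope it per process) and expose it through the FastAPI endpoints.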
## Better Project Organization
- Modular Architecture: Clean separation between CLI and API interfaces
- Enhanced Configuration: Flexible config.json with multiple fallback paths
- Better Developer Experience: Clear separation between application logic, agents, and data management
What This Means for You:
- More reliable and safe responses to your questions
- Higher quality answers with better information filtering
- Professional-looking responses with proper formatting
- Multiple deployment options (CLI, Web API, Docker)
- Easier to customize and extend the system
- More stable and maintainable application
Getting Started with the Latest Version:

```bash
uv run src/rag_app/main.py
```

- Multi-Agent Workflow: Safety validation → Assistant → Retriever → Ranker → PR processing
- RAG Integration: Retrieval-Augmented Generation powered by Chroma vector database
- Early Safety Validation: Question safety evaluation before processing
- Content Quality Ranking: Automatic evaluation of retrieved content quality
- Obsidian & PDF Integration: Seamlessly indexes and retrieves information from Obsidian vaults and PDF documents
- Interactive Interfaces: Command-line interface and web API with real-time feedback
- LangGraph Integration: Multi-agent workflow orchestration
- Google Gemini Models: State-of-the-art LLM and embedding models
- FastAPI Web Server: Modern async web framework with automatic docs
- Docker Containerization: Ready-to-deploy containerized application
- Modular Architecture: Clean separation of concerns with dedicated modules
- Error Handling: Robust error recovery and graceful degradation
- Progress Indicators: Real-time status updates during processing
- UV Package Management: Fast and reliable dependency management
- Python 3.12+
- UV package manager
- Google Gemini API key
- Clone the repository

  ```bash
  git clone https://github.com/coletangsy/personal-rag-assistant.git
  cd personal-rag-assistant
  ```

- Install dependencies

  ```bash
  uv sync
  ```
- Set up your environment

  Create a `.env` file:

  ```
  GOOGLE_API_KEY=<your_google_api_key_here>
  ```

- Configure your knowledge sources

  Edit `config.json`:

  ```json
  {
    "llm": {
      "model": "gemini-2.5-flash",
      "temperature": 0
    },
    "vector_store": {
      "persist_directory": "./data/",
      "collection_name": "obsidian",
      "embedding_model": "models/gemini-embedding-001"
    },
    "pdf": {
      "path": ""
    },
    "obsidian": {
      "path": "/path/to/your/obsidian/vault"
    },
    "retriever": {
      "search_type": "similarity",
      "k": 3
    },
    "text_splitter": {
      "chunk_size": 1000,
      "chunk_overlap": 200
    }
  }
  ```
- Run the application

  Option 1: Command Line Interface

  ```bash
  uv run src/rag_app/main.py
  ```

  Option 2: Web API Server

  ```bash
  uv run src/app_api_handler.py
  ```

  Then visit http://localhost:8000/docs for the API documentation.

  Option 3: Docker Container

  ```bash
  # Build and run with Docker
  docker build -t personal-rag-assistant .
  docker run -p 8000:8000 -e GOOGLE_API_KEY=<your_key> personal-rag-assistant

  # Or use docker-compose
  docker-compose up
  ```
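The "multiple fallback paths" behaviour of the configuration loading can be sketched as a first-match lookup over candidate locations. The candidate paths and the `load_config` helper below are illustrative, not the project's actual code:

```python
import json
import tempfile
from pathlib import Path

def load_config(candidates=("config.json", "src/config.json", "/app/config.json")):
    """Return the first readable config among the candidate paths (sketch)."""
    for path in map(Path, candidates):
        if path.is_file():
            with path.open() as f:
                return json.load(f)
    raise FileNotFoundError("no config.json found in any fallback location")

# Demonstration with a temporary config file:
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    cfg_path.write_text(json.dumps({"llm": {"model": "gemini-2.5-flash"}}))
    cfg = load_config(candidates=(cfg_path,))
    print(cfg["llm"]["model"])  # gemini-2.5-flash
```

Checking several locations this way lets the same code run unchanged from the repository root, inside the package, or inside a container.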
```
personal-rag-assistant/
├── src/                          # Main source code
│   ├── rag_app/                  # Core RAG application
│   │   ├── main.py               # CLI application entry point
│   │   ├── agents.py             # All agent functions (safety, assistant, ranker, PR)
│   │   └── retriever_manager.py  # Vector database and document processing
│   ├── app_api_handler.py        # FastAPI web server and API endpoints
│   └── data/                     # Data storage (vector database, documents; git-ignored)
│       ├── chroma.sqlite3        # Chroma vector database
│       ├── vector_indices/       # Vector index files
│       └── harrypotter.pdf       # Sample PDF document
├── chat_interface.html           # Web chat interface
├── config.json                   # Application configuration
├── Dockerfile                    # Container configuration
├── docker-compose.yml            # Multi-container setup
├── pyproject.toml                # Project dependencies (uv)
├── uv.lock                       # Dependency lock file
├── .env                          # Environment variables
└── README.md                     # Project documentation
```
```
Configuration loaded from config.json
Initializing RAG System Components...
LLM initialized
Retriever Manager initialized
Retriever initialized
Retriever tool created
Building Enhanced RAG Agent Graph with Early Safety...
Enhanced RAG Agent with Early Safety compiled successfully
============================================================
RAG Conversation Started
Type 'exit', 'quit', or 'stop' to end the conversation
============================================================
What is your question: What is machine learning?
Processing question: 'What is machine learning?'
Safety Agent: Evaluating question safety
Safety agent checking question: 'What is machine learning?'
...
Final Answer: Machine learning is a subset of artificial intelligence...
```
- Safety Agent: Validates question safety before processing
- Assistant Agent: Generates search queries for information retrieval
- Retriever Agent: Executes searches in the knowledge base
- Ranker Agent: Evaluates quality of retrieved content
- PR Agent: Processes final answer with proper formatting and context
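The hand-off between these agents can be pictured as a plain-Python pipeline. This is a sketch only: the project wires the real agents together with LangGraph, and the stub functions below stand in for LLM calls and vector-store searches:

```python
def safety_agent(state):
    # Reject obviously unsafe questions before any retrieval happens.
    state["safe"] = "attack" not in state["question"].lower()
    return state

def assistant_agent(state):
    # Turn the question into a search query for the retriever.
    state["query"] = state["question"].rstrip("?")
    return state

def retriever_agent(state):
    # Stand-in for a Chroma similarity search over the knowledge base.
    state["documents"] = [f"doc about {state['query']}"]
    return state

def ranker_agent(state):
    # Keep only documents judged relevant enough to answer from.
    state["documents"] = [d for d in state["documents"] if state["query"] in d]
    return state

def pr_agent(state):
    # Polish the final answer for tone and formatting.
    state["answer"] = f"Answer (based on {len(state['documents'])} source): ..."
    return state

def run_pipeline(question):
    state = {"question": question}
    state = safety_agent(state)
    if not state["safe"]:
        return {"answer": "Sorry, I can't help with that."}
    for agent in (assistant_agent, retriever_agent, ranker_agent, pr_agent):
        state = agent(state)
    return state

result = run_pipeline("What is machine learning?")
print(result["answer"])
```

Running safety first, as shown, means an unsafe question short-circuits the whole chain without ever touching the knowledge base.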
| Variable | Description | Required |
|---|---|---|
| `GOOGLE_API_KEY` | API key for Google Gemini models | Yes |
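Reading the key from the `.env` file at startup might look like the following. This is a stdlib-only sketch; real code would typically use a library such as `python-dotenv` instead, and the `load_dotenv_minimal` helper is illustrative:

```python
import os
import tempfile
from pathlib import Path

def load_dotenv_minimal(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ (sketch)."""
    env_file = Path(path)
    if not env_file.is_file():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            # Existing environment variables take precedence over the file.
            os.environ.setdefault(key.strip(), value.strip())

# Demonstration with a temporary .env file:
os.environ.pop("GOOGLE_API_KEY", None)  # clean slate for the demo only
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("GOOGLE_API_KEY=demo-key-123\n")
load_dotenv_minimal(f.name)
print(os.environ["GOOGLE_API_KEY"])  # demo-key-123
```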
- LLM Settings: Model selection, temperature control
- Vector Store: Persistence directory, collection names, embedding models
- Document Sources: PDF and Obsidian vault paths
- Retriever Settings: Search type, result count (k)
- Text Processing: Chunk size and overlap for document splitting
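With `chunk_size: 1000` and `chunk_overlap: 200`, each chunk shares its last 200 characters with the start of the next one, so a sentence cut at a boundary still appears whole in one chunk. A simplified character-level splitter illustrates the idea (a sketch; the project presumably uses a standard text splitter rather than this helper):

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into overlapping character chunks (simplified sketch)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap  # advance 800 chars per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("x" * 2500, chunk_size=1000, chunk_overlap=200)
print(len(chunks))      # 3
print(len(chunks[0]))   # 1000
```

Larger chunks preserve more context per retrieval hit; larger overlap reduces boundary loss at the cost of a bigger index.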
Safety Validation:
- Early question safety evaluation
- Prevents processing of harmful or inappropriate content
- Configurable safety criteria
Document Retrieval:
- Automatic vector database initialization
- Support for multiple document formats (PDF, Markdown)
- Configurable search parameters
Content Quality Ranking:
- Automatic ranking of retrieved content
- Multi-attempt retrieval for better results
- Maximum attempt limits to prevent infinite loops
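The multi-attempt loop with a hard cap can be sketched as follows. The `MAX_ATTEMPTS` value, the query-refinement step, and the stub `retrieve`/`rank` helpers are all illustrative stand-ins for the real agents:

```python
MAX_ATTEMPTS = 3  # illustrative cap; prevents infinite retrieval loops

def retrieve(query, attempt):
    # Stand-in for a vector-store search; quality improves as the query is refined.
    return {"docs": [f"result for '{query}'"], "quality": 0.4 + 0.3 * attempt}

def rank(result):
    # Stand-in for the ranker agent's quality judgment.
    return result["quality"] >= 0.7

def retrieve_with_ranking(query):
    for attempt in range(MAX_ATTEMPTS):
        result = retrieve(query, attempt)
        if rank(result):
            return result["docs"], attempt + 1
        query = query + " (refined)"  # refine the query and try again
    return [], MAX_ATTEMPTS  # give up after the cap

docs, attempts = retrieve_with_ranking("machine learning basics")
print(attempts)  # 2
```

Bounding the loop matters: without the cap, a query that never yields high-quality documents would retry forever.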
Response Formatting:
- Final answer polishing and formatting
- Context-aware response generation
- Professional tone and accessibility
Future Plans:
- GUI Interface: Develop a web-based interface
- Advanced Caching: Implement response caching for frequently asked questions
- Web Search Integration: Add real-time web search capabilities
- Advanced Analytics: Performance monitoring and usage analytics