🤖 Personal RAG Assistant

A sophisticated AI-powered personal assistant that combines Retrieval-Augmented Generation (RAG) with a multi-agent workflow to provide intelligent, context-aware responses based on personal knowledge bases including Obsidian vaults and PDF documents.

📹 Demo

docker-demo.mp4

🆕 What's New

Latest Updates (October 2025)

🚀 Enhanced Multi-Agent RAG System

  • Early Safety Validation: Questions are now evaluated for safety before processing, ensuring appropriate content filtering
  • Content Quality Ranking: Automatic evaluation of retrieved information quality with multi-attempt retrieval for better answers
  • Professional Response Formatting: Final answers are now polished with clear structure, proper formatting, and accessibility improvements
  • Improved Error Handling: More robust error recovery and user-friendly error messages

๐ŸŒ Web API & Containerization

  • FastAPI Web Interface: Full REST API with session management and health checks
  • Docker Support: Complete containerization with optimized Dockerfile and docker-compose
  • AWS Lambda Ready: Mangum integration for serverless deployment
  • Interactive Web Chat: HTML chat interface for browser-based interaction

🔧 Better Project Organization

  • Modular Architecture: Clean separation between CLI and API interfaces
  • Enhanced Configuration: Flexible config.json with multiple fallback paths
  • Better Developer Experience: Clear separation between application logic, agents, and data management

What This Means for You:

  • More reliable and safe responses to your questions
  • Higher quality answers with better information filtering
  • Professional-looking responses with proper formatting
  • Multiple deployment options (CLI, Web API, Docker)
  • Easier to customize and extend the system
  • More stable and maintainable application

Getting Started with the Latest Version:

uv run src/rag_app/main.py

✨ Features

🎯 Core Functionality

  • Multi-Agent Workflow: Safety validation → Assistant → Retriever → Ranker → PR processing
  • RAG Integration: Retrieval-Augmented Generation powered by Chroma vector database
  • Early Safety Validation: Question safety evaluation before processing
  • Content Quality Ranking: Automatic evaluation of retrieved content quality
  • Obsidian & PDF Integration: Seamlessly indexes and retrieves information from Obsidian vaults and PDF documents
  • Interactive Interfaces: Command-line interface and web API with real-time feedback

๐Ÿ› ๏ธ Technical Features

  • LangGraph Integration: Multi-agent workflow orchestration
  • Google Gemini Models: State-of-the-art LLM and embedding models
  • FastAPI Web Server: Modern async web framework with automatic docs
  • Docker Containerization: Ready-to-deploy containerized application
  • Modular Architecture: Clean separation of concerns with dedicated modules
  • Error Handling: Robust error recovery and graceful degradation
  • Progress Indicators: Real-time status updates during processing
  • UV Package Management: Fast and reliable dependency management

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • UV package manager
  • Google Gemini API key

Installation

  1. Clone the repository

    git clone https://github.com/coletangsy/personal-rag-assistant.git
    cd personal-rag-assistant
  2. Install dependencies

    uv sync
  3. Set up your environment by creating a .env file:

    GOOGLE_API_KEY=<your_google_api_key_here>
  4. Configure your knowledge sources by editing config.json:

    {
     "llm": {
         "model": "gemini-2.5-flash",
         "temperature": 0
     },
     "vector_store": {
         "persist_directory": "./data/",
         "collection_name": "obsidian",
         "embedding_model": "models/gemini-embedding-001"
     },
     "pdf": {
         "path": ""
     },
     "obsidian": {
         "path": "/path/to/your/obsidian/vault"
     },
     "retriever": {
         "search_type": "similarity",
         "k": 3
     },
     "text_splitter": {
         "chunk_size": 1000,
         "chunk_overlap": 200
     }
    }
  5. Run the application

  • Option 1: Command Line Interface

    uv run src/rag_app/main.py
  • Option 2: Web API Server

    uv run src/app_api_handler.py

    Then visit: http://localhost:8000/docs for API documentation

  • Option 3: Docker Container

    # Build and run with Docker
    docker build -t personal-rag-assistant .
    docker run -p 8000:8000 -e GOOGLE_API_KEY=<your_key> personal-rag-assistant
    
    # Or use docker-compose
    docker-compose up
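Once the server is running (Option 2 or 3), you can query the API from Python. Note that the `/chat` endpoint path and the `question`/`session_id`/`answer` field names below are illustrative assumptions, not a documented contract — check http://localhost:8000/docs for the actual schema.

```python
import json
import urllib.request

# NOTE: endpoint path and field names are assumptions for illustration;
# verify the real request/response schema at http://localhost:8000/docs.
API_URL = "http://localhost:8000/chat"

def ask(question: str, session_id: str = "demo") -> str:
    """POST a question to the running assistant and return the answer text."""
    payload = json.dumps({"question": question, "session_id": session_id}).encode()
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["answer"]

# Usage (with the server running):
#   answer = ask("What is machine learning?")
```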

๐Ÿ“ Project Structure

personal-rag-assistant/
├── 📁 src/                   # Main source code
│   ├── rag_app/              # Core RAG application
│   │   ├── main.py           # CLI application entry point
│   │   ├── agents.py         # All agent functions (safety, assistant, ranker, PR)
│   │   └── retriever_manager.py  # Vector database and document processing
│   ├── app_api_handler.py    # FastAPI web server and API endpoints
│   └── data/                 # Data storage (git-ignored): vector database, documents
│       ├── chroma.sqlite3    # Chroma vector database
│       ├── vector_indices/   # Vector index files
│       └── harrypotter.pdf   # Sample PDF document
├── chat_interface.html       # Web chat interface
├── config.json               # Application configuration
├── Dockerfile                # Container configuration
├── docker-compose.yml        # Multi-container setup
├── pyproject.toml            # Project dependencies (uv)
├── uv.lock                   # Dependency lock file
├── .env                      # Environment variables
└── README.md                 # Project documentation

🎮 Usage

Interactive Session Example

✅ Configuration loaded from config.json
🚀 Initializing RAG System Components...

✅ LLM initialized
✅ Retriever Manager initialized
✅ Retriever initialized
✅ Retriever tool created

🧩 Building Enhanced RAG Agent Graph with Early Safety...
✅ Enhanced RAG Agent with Early Safety compiled successfully

============================================================
💬 RAG Conversation Started
Type 'exit', 'quit', or 'stop' to end the conversation
============================================================

❓ What is your question: What is machine learning?
🔄 Processing question: 'What is machine learning?'
🔒 Safety Agent: Evaluating question safety
🔒 Safety agent checking question: 'What is machine learning?'
...
🤖 Final Answer: Machine learning is a subset of artificial intelligence...

Agent Workflow

  1. Safety Agent: Validates question safety before processing
  2. Assistant Agent: Generates search queries for information retrieval
  3. Retriever Agent: Executes searches in the knowledge base
  4. Ranker Agent: Evaluates quality of retrieved content
  5. PR Agent: Processes final answer with proper formatting and context
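The five-step flow above can be sketched in plain Python. The real application orchestrates these agents with LangGraph; the agent bodies below are stand-in stubs for illustration, not the project's actual logic:

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # cap on re-retrieval, mirroring the ranker's retry limit

@dataclass
class State:
    question: str
    query: str = ""
    docs: list = field(default_factory=list)
    answer: str = ""

# Stub agents -- in the real app each wraps an LLM call.
def safety_agent(s):    return "bomb" not in s.question.lower()
def assistant_agent(s): s.query = s.question
def retriever_agent(s): s.docs = [f"doc for: {s.query}"]
def ranker_agent(s):    return len(s.docs) > 0   # "good enough?" verdict
def pr_agent(s):        s.answer = f"Answer based on {len(s.docs)} document(s)."

def run(question: str) -> str:
    s = State(question)
    if not safety_agent(s):           # 1. early safety gate
        return "I can't help with that."
    assistant_agent(s)                # 2. build the search query
    for _ in range(MAX_ATTEMPTS):     # 3-4. retrieve + rank loop, bounded
        retriever_agent(s)
        if ranker_agent(s):
            break
    pr_agent(s)                       # 5. polish the final answer
    return s.answer
```

The bounded loop is the key structural point: the ranker can send the flow back to retrieval, but only up to `MAX_ATTEMPTS` times, so the graph always terminates.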

🔧 Configuration

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| GOOGLE_API_KEY | API key for Google Gemini models | ✅ |

Configuration Options

  • LLM Settings: Model selection, temperature control
  • Vector Store: Persistence directory, collection names, embedding models
  • Document Sources: PDF and Obsidian vault paths
  • Retriever Settings: Search type, result count (k)
  • Text Processing: Chunk size and overlap for document splitting
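The "multiple fallback paths" behavior mentioned above can be illustrated with a small loader. The candidate paths and default values here are examples, not necessarily the ones the app actually searches:

```python
import json
from pathlib import Path

# Candidate locations, tried in order -- an illustrative list only.
CONFIG_CANDIDATES = [
    Path("config.json"),
    Path("src/config.json"),
    Path.home() / ".config" / "personal-rag-assistant" / "config.json",
]

# Minimal built-in fallback (example values).
DEFAULTS = {"llm": {"model": "gemini-2.5-flash", "temperature": 0}}

def load_config(candidates=CONFIG_CANDIDATES) -> dict:
    """Return the first readable config file, falling back to defaults."""
    for path in candidates:
        if path.is_file():
            with path.open() as f:
                return json.load(f)
    return DEFAULTS  # nothing found: run with built-in defaults
```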

🔄 Agent Workflow Details

Safety Validation

  • Early question safety evaluation
  • Prevents processing of harmful or inappropriate content
  • Configurable safety criteria

Content Retrieval

  • Automatic vector database initialization
  • Support for multiple document formats (PDF, Markdown)
  • Configurable search parameters
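Before indexing, documents are split into overlapping chunks governed by the `chunk_size` and `chunk_overlap` settings in config.json. The app uses a LangChain text splitter for this; a minimal pure-Python equivalent shows what the two parameters mean:

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share
    `chunk_overlap` characters so context isn't lost at chunk boundaries."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

# With chunk_size=4, chunk_overlap=2, "abcdefghij" splits into
# "abcd", "cdef", "efgh", "ghij" -- each chunk repeats the last
# two characters of the previous one.
```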

Quality Assessment

  • Automatic ranking of retrieved content
  • Multi-attempt retrieval for better results
  • Maximum attempt limits to prevent infinite loops

Response Generation

  • Final answer polishing and formatting
  • Context-aware response generation
  • Professional tone and accessibility

Future Enhancements

  • GUI Interface: Develop a web-based interface
  • Advanced Caching: Implement response caching for frequently asked questions
  • Web Search Integration: Add real-time web search capabilities
  • Advanced Analytics: Performance monitoring and usage analytics
