
🤖 Contextual RAG Chatbot - Intelligent Document Assistant

A state-of-the-art Retrieval-Augmented Generation (RAG) chatbot built with Google Gemini AI, featuring advanced document processing, contextual conversation management, comprehensive analytics, and a modern, responsive UI. This enterprise-grade solution provides intelligent document-based question answering with thinking and reflection capabilities.

[Screenshots: RAG Chatbot banner and document upload]

✨ Features

Core Capabilities

  • 🧠 Advanced AI Integration: Powered by Google Gemini 1.5 with adaptive thinking and reflection
  • πŸ“š Multi-Format Document Processing: Accurate PDF processing with OCR fallback support
  • πŸ” Hybrid Search: Combines semantic (dense) and keyword (sparse) retrieval methods
  • πŸ’¬ Contextual Conversations: Adjustable context window (1-20 messages) for maintaining conversation flow
  • πŸ“Š Comprehensive Analytics: Real-time metrics, interactive visualizations, and Excel export
  • 🎨 Modern UI/UX: Responsive design with multiple themes and accessibility features
  • πŸ”’ Enterprise Security: Privacy mode, API key rotation, and input validation
  • ⚑ High Performance: Async processing, intelligent caching, and circuit breakers

Key Features by Category

Document Processing

  • Accurate page counting from PDF metadata
  • Multiple extraction methods (PyPDF2, pdfplumber, OCR)
  • Semantic chunking with overlap
  • Table extraction and formatting
  • Header preservation for context
  • Batch processing support

Conversation Management

  • Session isolation and persistence
  • Adjustable context window
  • Conversation history export (JSON)
  • Search within conversations
  • Privacy mode for sensitive data
  • Multi-session support

Analytics & Reporting

  • Real-time performance metrics
  • Interactive Plotly dashboards
  • AI-generated insights and recommendations
  • Excel export with multiple sheets
  • HTML report generation
  • Predictive analytics

User Interface

  • 5-page application structure
  • 4 theme options (Modern, Dark, Light, Classic)
  • Responsive design for all devices
  • Accessibility features (WCAG compliant)
  • Smooth animations and transitions
  • Keyboard navigation support

πŸ—οΈ Architecture

System Overview

┌──────────────────────────────────────────────────────────┐
│                  Streamlit UI (app.py)                   │
│              [Chat | Documents | Analytics]              │
├──────────────────────────────────────────────────────────┤
│              Core RAG Engine (rag_core.py)               │
│           [Planning → Retrieval → Generation]            │
├────────────────────────────┬─────────────────────────────┤
│   Vector Store             │       PDF Processor         │
│  (vector_store.py)         │    (pdf_processor.py)       │
│  • In-Memory/FAISS         │   • Text Extraction         │
│  • Hybrid Search           │   • Semantic Chunking       │
├────────────────────────────┴─────────────────────────────┤
│        Infrastructure Layer (utils.py, config.py)        │
│       [Caching | Sessions | Security | Analytics]        │
└──────────────────────────────────────────────────────────┘

Component Interaction Flow

graph TD
    A[User Query] --> B[Session Manager]
    B --> C[RAG Engine]
    C --> D[Query Planning]
    D --> E[Vector Store Search]
    E --> F[Hybrid Retrieval]
    F --> G[Reranking]
    G --> H[Response Generation]
    H --> I[Reflection & Improvement]
    I --> J[User Response]
    J --> K[Analytics Tracking]

📋 Prerequisites

System Requirements

  • Operating System: Windows 10/11, macOS 10.15+, Linux (Ubuntu 20.04+)
  • Python: 3.12 or higher
  • RAM: Minimum 8GB (16GB recommended)
  • Storage: 5GB free space
  • Internet: Required for API calls and model downloads

🚀 Installation

Step 1: Clone the Repository

# Clone the repository
git clone https://github.com/Anupam0202/Contextual-RAG-Chatbot.git
cd Contextual-RAG-Chatbot

Step 2: Create Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# macOS/Linux
python3 -m venv venv
source venv/bin/activate

Step 3: Install Dependencies

# Upgrade pip
pip install --upgrade pip

# Install all dependencies
pip install -r requirements.txt

Step 4: Verify Installation

# Check Python version
python --version

# Verify key packages
python -c "import streamlit; print(f'Streamlit: {streamlit.__version__}')"
python -c "import google.generativeai; print('Google GenAI: OK')"

βš™οΈ Configuration

Environment Variables

  1. Create a .env file from the template:
cp .env.example .env
  2. Edit the .env file with your settings:
# REQUIRED: Google Gemini API Key
GOOGLE_API_KEY=your_gemini_api_key_here

# Model Configuration
RAG_MODEL_NAME=gemini-1.5-flash
RAG_TEMPERATURE=0.7
RAG_MAX_OUTPUT_TOKENS=2048

# Conversation Settings
RAG_CONTEXT_WINDOW=5
RAG_MAX_CONTEXT_LENGTH=4000

# Retrieval Configuration
RAG_RETRIEVAL_TOP_K=5
RAG_SIMILARITY_THRESHOLD=0.7
RAG_HYBRID_SEARCH_ALPHA=0.5

# Chunking Settings
RAG_CHUNK_SIZE=1000
RAG_CHUNK_OVERLAP=200
RAG_SEMANTIC_CHUNKING=true

# Performance Settings
RAG_ENABLE_CACHING=true
RAG_CACHE_TTL=3600
RAG_MAX_WORKERS=4

# Security Settings
RAG_SESSION_TIMEOUT=7200
RAG_MAX_FILE_SIZE=52428800
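
These variables are read once at startup. A minimal sketch of how they can be consumed, assuming the python-dotenv package is installed (this loader is illustrative, not the app's actual config.py):

# Illustrative loader, not the app's actual config.py
import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the working directory

# Optional settings fall back to defaults when unset
MODEL_NAME = os.getenv("RAG_MODEL_NAME", "gemini-1.5-flash")
TEMPERATURE = float(os.getenv("RAG_TEMPERATURE", "0.7"))
CONTEXT_WINDOW = int(os.getenv("RAG_CONTEXT_WINDOW", "5"))

# The API key is required, so fail loudly if it is missing
API_KEY = os.environ["GOOGLE_API_KEY"]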

Getting a Gemini API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the key and add it to your .env file
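
Once the key is in place, a quick smoke test with the google-generativeai package confirms it works (the model name is the project's default; any non-error response means the key is valid):

import os
import google.generativeai as genai

# Configure the SDK with the key from your environment
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Issue a trivial request; an exception here usually means a bad key
model = genai.GenerativeModel("gemini-1.5-flash")
print(model.generate_content("Say hello in one word.").text)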

Configuration Files

The application uses several configuration files:

| File | Purpose | Location |
| ---- | ------- | -------- |
| .env | Environment variables | Root directory |
| config.json | Runtime configuration | data/config.json (auto-created) |
| settings.json | User preferences | data/settings.json (auto-created) |
| sessions.json | Session persistence | data/sessions.json (auto-created) |

📖 Usage

Starting the Application

# Activate virtual environment (if not already active)
source venv/bin/activate  # macOS/Linux
# or
venv\Scripts\activate  # Windows

# Run the application
streamlit run app.py

The application will open in your default browser at http://localhost:8501

Basic Workflow

1. Upload Documents

1. Navigate to the 📚 Documents page
2. Click "Browse files" or drag & drop PDF files
3. Wait for processing to complete
4. View document statistics and metadata

2. Start Chatting

1. Go to the 💬 Chat page
2. Type your question in the chat input
3. Press Enter or click Send
4. View AI responses with source citations

Example queries:

1. "What is the main topic of the document?"
2. "Summarize the key findings"
3. "Compare section 2 and section 5"
4. "Explain the methodology used"

3. View Analytics

1. Visit the 📊 Analytics page
2. Review performance metrics
3. Explore interactive charts
4. Export reports (Excel/HTML)

4. Configure Settings

1. Open the ⚙️ Settings page
2. Adjust model parameters
3. Configure context window
4. Set UI preferences
5. Click "Save All Settings"

Advanced Usage

Conversation Context Management

# Adjust context window (1-20 messages)
Settings → Conversation → Number of Historical Messages

# Enable privacy mode
Settings → Conversation → Privacy Mode ✓

# Preserve context between sessions
Settings → Conversation → Preserve Context Between Sessions ✓
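
Conceptually, the context window simply caps how much history accompanies each query. A minimal sketch of the idea (trim_history is a hypothetical helper, not part of the app's API):

def trim_history(history: list[dict], context_window: int) -> list[dict]:
    # Keep only the most recent `context_window` messages for the prompt
    return history[-context_window:] if context_window > 0 else []

history = [
    {"role": "user", "content": "What does section 2 cover?"},
    {"role": "assistant", "content": "Section 2 covers methodology."},
    {"role": "user", "content": "And section 3?"},
]
print(trim_history(history, context_window=2))  # last 2 messages only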

Document Processing Options

# Enable semantic chunking
Settings → Retrieval → Enable Semantic Chunking ✓

# Adjust chunk size
Settings → Retrieval → Chunk Size: 1000

# Set chunk overlap
Settings → Retrieval → Chunk Overlap: 200
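
To make chunk size and overlap concrete, here is a simplified fixed-size chunker; the app's semantic chunker additionally respects sentence and section boundaries, so treat this only as an illustration of the two parameters:

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    # Consecutive chunks share `overlap` characters so context
    # is not cut off mid-thought at chunk boundaries.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("A" * 2500)
print([len(c) for c in chunks])  # [1000, 1000, 900]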

Search Configuration

# Configure hybrid search
Settings → Retrieval → Hybrid Search Alpha: 0.5
# 0.0 = Pure keyword search
# 1.0 = Pure semantic search
# 0.5 = Balanced hybrid

# Enable reranking
Settings → Retrieval → Enable Reranking ✓
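
The alpha value linearly blends the two score sets per chunk. A minimal sketch of the fusion step, assuming both score sets are already normalized to [0, 1] (not the app's exact code):

def fuse_scores(dense: dict[str, float], sparse: dict[str, float], alpha: float = 0.5) -> dict[str, float]:
    # alpha=1.0 -> pure semantic, alpha=0.0 -> pure keyword
    ids = set(dense) | set(sparse)
    return {i: alpha * dense.get(i, 0.0) + (1 - alpha) * sparse.get(i, 0.0) for i in ids}

fused = fuse_scores({"c1": 0.9, "c2": 0.4}, {"c2": 0.8, "c3": 0.6}, alpha=0.5)
print(sorted(fused.items(), key=lambda kv: -kv[1]))  # c2 ranks first at 0.6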

🔌 API Documentation

Core Modules

PDFProcessor

from pdf_processor import createPDFProcessor

# Initialize processor
processor = createPDFProcessor()

# Process single PDF
processed_doc = processor.processPDF(
    file_path="document.pdf",
    file_content=bytes_content  # Optional
)

# Access results
print(f"Pages: {processed_doc.page_count}")
print(f"Chunks: {len(processed_doc.chunks)}")
print(f"Method: {processed_doc.extraction_method}")

# Batch processing
results = processor.processBatch(["doc1.pdf", "doc2.pdf"])

RAG Engine

import asyncio
from rag_core import getRAGEngine

# Get RAG engine instance
rag = getRAGEngine()

# processQuery streams response chunks, so it must run inside a coroutine
async def main():
    async for response_chunk in rag.processQuery(
        query="What is the summary?",
        conversation_history=[
            {"role": "user", "content": "Previous question"},
            {"role": "assistant", "content": "Previous answer"}
        ]
    ):
        print(response_chunk, end="")

asyncio.run(main())

Vector Store

from vector_store import getGlobalVectorStore

# Get vector store instance
vector_store = getGlobalVectorStore()

# Add documents
success = vector_store.addDocuments(chunks)

# Search
results = vector_store.search(
    query="search term",
    top_k=5
)

# Delete document
vector_store.delete(document_id)

Analytics

from datetime import datetime
from analytics_advanced import AdvancedAnalytics, QueryMetrics

# Initialize analytics
analytics = AdvancedAnalytics()

# Add query metric
metric = QueryMetrics(
    timestamp=datetime.now(),
    session_id="session_123",
    query="user question",
    response_time=1.5,
    chunks_retrieved=5,
    confidence=0.85
)
analytics.addQueryMetric(metric)

# Generate report
report = analytics.generateInteractiveReport()

# Export to Excel
excel_data = analytics.exportToExcel(query_history)

🚀 Advanced Features

1. Thinking & Reflection Pattern

The RAG engine implements a multi-step reasoning process, sketched in code after the list below:

Query → Planning → Retrieval → Generation → Reflection → Improvement

  • Planning: Classifies intent and decomposes complex queries
  • Retrieval: Hybrid search with semantic and keyword matching
  • Generation: Context-aware response with streaming
  • Reflection: Self-evaluation and improvement for complex queries

2. Circuit Breaker Pattern

Prevents cascade failures with automatic recovery:

@pdf_circuit_breaker  # Automatically handles failures
def processPDF(self, file_path):
    # Processing logic
    pass

States: CLOSED → OPEN (on failure) → HALF_OPEN (testing) → CLOSED (recovered)
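
A minimal sketch of how such a breaker can be implemented as a decorator (the project's pdf_circuit_breaker may differ in its details):

import time

class CircuitBreaker:
    # Opens after `max_failures` consecutive errors; allows one trial
    # call (HALF_OPEN) once `reset_after` seconds have passed.
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def __call__(self, func):
        def wrapper(*args, **kwargs):
            if self.opened_at is not None:
                if time.monotonic() - self.opened_at < self.reset_after:
                    raise RuntimeError("Circuit open: call rejected")
                self.opened_at = None  # HALF_OPEN: let one call through
            try:
                result = func(*args, **kwargs)
            except Exception:
                self.failures += 1
                if self.failures >= self.max_failures:
                    self.opened_at = time.monotonic()  # trip to OPEN
                raise
            self.failures = 0  # success closes the circuit again
            return result
        return wrapper

pdf_circuit_breaker = CircuitBreaker()  # usable as in the snippet above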

3. Session Isolation

Each session maintains its own copy of the following (a minimal sketch follows the list):

  • Conversation history
  • User preferences
  • Document context
  • Analytics data

4. Caching Strategy

Multi-level caching for performance, illustrated by the sketch after this list:

  • Query cache (TTL: 1 hour)
  • Embedding cache (TTL: 2 hours)
  • Analytics cache (TTL: 5 minutes)

5. Privacy Features

  • PII sanitization in conversations
  • Sensitive data redaction
  • API key rotation
  • Session timeout management

🔧 Troubleshooting

Common Issues and Solutions

1. Application Won't Start

Error: ModuleNotFoundError: No module named 'streamlit'

Solution:

# Ensure virtual environment is activated
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate  # Windows

# Reinstall dependencies
pip install -r requirements.txt

2. Gemini API Key Error

Error: GOOGLE_API_KEY environment variable not set

Solution:

  1. Check .env file exists
  2. Verify API key is correct
  3. Restart the application
  4. Test API key:
python -c "import os; print('API Key:', os.getenv('GOOGLE_API_KEY')[:10] + '...')"

3. PDF Processing Fails

Error: No text could be extracted from PDF

Solutions:

  • Ensure PDF is not corrupted
  • Check file size (< 50MB default)
  • Install OCR dependencies for scanned PDFs:
# Install Tesseract
sudo apt-get install tesseract-ocr  # Linux
brew install tesseract  # macOS

# Install Python packages (pdf2image also needs poppler:
# apt-get install poppler-utils on Linux, brew install poppler on macOS)
pip install pytesseract pdf2image

4. Memory Issues

Error: MemoryError or slow performance

Solutions:

# Reduce chunk size
Settings → Retrieval → Chunk Size: 500

# Limit context window
Settings → Conversation → Context Window: 3

# Clear cache
Settings → Clear Cache (button)

# Restart application

5. Connection Timeouts

Error: TimeoutError during API calls

Solutions:

  • Check internet connection
  • Verify firewall settings
  • Increase timeout in config:
# In config.py
request_timeout = 30  # seconds

Debug Mode

Enable debug logging:

# Set logging level
export LOG_LEVEL=DEBUG

# Run with debug
streamlit run app.py --logger.level=debug

❓ Frequently Asked Questions

General Questions

Q: What file formats are supported? A: Currently, the application supports PDF files (.pdf). Text files (.txt) support is planned for future releases.

Q: What is the maximum file size limit? A: Default limit is 50MB per file. This can be adjusted in the configuration.

Q: How many documents can I upload? A: There's no hard limit on the number of documents, but performance may degrade with very large document sets (>100 documents).

Q: Is my data secure? A: Yes, the application includes:

  • Local processing (documents aren't sent to external servers except for API calls)
  • Privacy mode for sensitive data
  • Session isolation
  • Secure API key management

Technical Questions

Q: Which AI models are supported? A: Currently supports Google Gemini models:

  • gemini-1.5-flash (default)
  • gemini-1.5-pro
  • gemini-1.0-pro

Q: Can I use my own embedding model? A: Yes, modify the embedding_model in configuration:

RAG_EMBEDDING_MODEL=sentence-transformers/your-model
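
Given the setting's format, embeddings are presumably produced via the sentence-transformers library. A quick check that a custom model loads and encodes (assuming sentence-transformers is installed):

from sentence_transformers import SentenceTransformer

# Use the same model name you put in RAG_EMBEDDING_MODEL
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vectors = model.encode(["hello world", "a second sentence"])
print(vectors.shape)  # (2, embedding_dimension), e.g. (2, 384)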

Q: How does hybrid search work? A: Combines two search methods:

  • Dense (Semantic): Uses embeddings for meaning-based search
  • Sparse (Keyword): Uses BM25 for exact keyword matching
  • Results are fused using configurable weighting (alpha parameter)

Q: What is the context window? A: The number of previous messages included when generating responses (1-20 messages).

Performance Questions

Q: How can I improve response time? A:

  • Enable caching in settings
  • Reduce chunk size
  • Decrease retrieval top-k
  • Use faster model (gemini-1.5-flash)

Q: How much RAM is needed? A:

  • Minimum: 8GB
  • Recommended: 16GB
  • For large documents (>100MB): 32GB

Customization Questions

Q: Can I add custom themes? A: Yes, add theme CSS in app.py:

theme_css = {
    'Custom': """
        <style>
        :root {
            --primary-color: #your_color;
        }
        </style>
    """
}

Development Setup

  1. Fork the repository
  2. Create a feature branch:
git checkout -b feature/your-feature-name

Code Style

  • Follow PEP 8 guidelines
  • Use type hints
  • Add docstrings to functions
  • Maximum line length: 100 characters


πŸ™ Acknowledgments

This project wouldn't be possible without:

Contributors

  • AI Tools
  • Open Source Community

Special Thanks

  • Google AI team for Gemini API

  • Streamlit team for the amazing framework

  • All contributors and users of this project
