A state-of-the-art Retrieval-Augmented Generation (RAG) chatbot built with Google Gemini AI, featuring advanced document processing, contextual conversation management, comprehensive analytics, and a modern, responsive UI. This enterprise-grade solution provides intelligent document-based question answering with thinking and reflection capabilities.
- **Advanced AI Integration**: Powered by Google Gemini 1.5 with adaptive thinking and reflection
- **Multi-Format Document Processing**: Accurate PDF processing with OCR fallback support
- **Hybrid Search**: Combines semantic (dense) and keyword (sparse) retrieval methods
- **Contextual Conversations**: Adjustable context window (1-20 messages) for maintaining conversation flow
- **Comprehensive Analytics**: Real-time metrics, interactive visualizations, and Excel export
- **Modern UI/UX**: Responsive design with multiple themes and accessibility features
- **Enterprise Security**: Privacy mode, API key rotation, and input validation
- **High Performance**: Async processing, intelligent caching, and circuit breakers
#### Document Processing
- Accurate page counting from PDF metadata
- Multiple extraction methods (PyPDF2, pdfplumber, OCR)
- Semantic chunking with overlap
- Table extraction and formatting
- Header preservation for context
- Batch processing support
#### Conversation Management
- Session isolation and persistence
- Adjustable context window
- Conversation history export (JSON)
- Search within conversations
- Privacy mode for sensitive data
- Multi-session support
#### Analytics & Reporting
- Real-time performance metrics
- Interactive Plotly dashboards
- AI-generated insights and recommendations
- Excel export with multiple sheets
- HTML report generation
- Predictive analytics
#### User Interface
- 5-page application structure
- 4 theme options (Modern, Dark, Light, Classic)
- Responsive design for all devices
- Accessibility features (WCAG compliant)
- Smooth animations and transitions
- Keyboard navigation support
```
┌──────────────────────────────────────────────────────┐
│                Streamlit UI (app.py)                 │
│            [Chat | Documents | Analytics]            │
├──────────────────────────────────────────────────────┤
│             Core RAG Engine (rag_core.py)            │
│         [Planning → Retrieval → Generation]          │
├──────────────────────────────────────────────────────┤
│ Vector Store             │ PDF Processor             │
│ (vector_store.py)        │ (pdf_processor.py)        │
│ • In-Memory/FAISS        │ • Text Extraction         │
│ • Hybrid Search          │ • Semantic Chunking       │
├──────────────────────────────────────────────────────┤
│   Infrastructure Layer (utils.py, config.py)         │
│   [Caching | Sessions | Security | Analytics]        │
└──────────────────────────────────────────────────────┘
```
```mermaid
graph TD
    A[User Query] --> B[Session Manager]
    B --> C[RAG Engine]
    C --> D[Query Planning]
    D --> E[Vector Store Search]
    E --> F[Hybrid Retrieval]
    F --> G[Reranking]
    G --> H[Response Generation]
    H --> I[Reflection & Improvement]
    I --> J[User Response]
    J --> K[Analytics Tracking]
```
- Operating System: Windows 10/11, macOS 10.15+, Linux (Ubuntu 20.04+)
- Python: 3.12 or higher
- RAM: Minimum 8GB (16GB recommended)
- Storage: 5GB free space
- Internet: Required for API calls and model downloads
```bash
# Clone the repository
git clone https://github.com/Anupam0202/Contextual-RAG-Chatbot.git
cd Contextual-RAG-Chatbot

# Windows
python -m venv venv
venv\Scripts\activate

# macOS/Linux
python3 -m venv venv
source venv/bin/activate

# Upgrade pip
pip install --upgrade pip

# Install all dependencies
pip install -r requirements.txt
```
```bash
# Check Python version
python --version

# Verify key packages
python -c "import streamlit; print(f'Streamlit: {streamlit.__version__}')"
python -c "import google.generativeai; print('Google GenAI: OK')"
```
1. Create a `.env` file from the template:

```bash
cp .env.example .env
```

2. Edit the `.env` file with your settings:
```env
# REQUIRED: Google Gemini API Key
GOOGLE_API_KEY=your_gemini_api_key_here

# Model Configuration
RAG_MODEL_NAME=gemini-1.5-flash
RAG_TEMPERATURE=0.7
RAG_MAX_OUTPUT_TOKENS=2048

# Conversation Settings
RAG_CONTEXT_WINDOW=5
RAG_MAX_CONTEXT_LENGTH=4000

# Retrieval Configuration
RAG_RETRIEVAL_TOP_K=5
RAG_SIMILARITY_THRESHOLD=0.7
RAG_HYBRID_SEARCH_ALPHA=0.5

# Chunking Settings
RAG_CHUNK_SIZE=1000
RAG_CHUNK_OVERLAP=200
RAG_SEMANTIC_CHUNKING=true

# Performance Settings
RAG_ENABLE_CACHING=true
RAG_CACHE_TTL=3600
RAG_MAX_WORKERS=4

# Security Settings
RAG_SESSION_TIMEOUT=7200
RAG_MAX_FILE_SIZE=52428800
```
1. Visit [Google AI Studio](https://aistudio.google.com/)
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy the key and add it to your `.env` file
The application uses several configuration files:
| File | Purpose | Location |
|------|---------|----------|
| `.env` | Environment variables | Root directory |
| `config.json` | Runtime configuration | `data/config.json` (auto-created) |
| `settings.json` | User preferences | `data/settings.json` (auto-created) |
| `sessions.json` | Session persistence | `data/sessions.json` (auto-created) |
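For reference, here is a minimal sketch of how the `RAG_`-prefixed variables from `.env` can be read in Python. It assumes the python-dotenv package; the project's `config.py` may load settings differently.

```python
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment

chunk_size = int(os.getenv("RAG_CHUNK_SIZE", "1000"))
temperature = float(os.getenv("RAG_TEMPERATURE", "0.7"))
semantic_chunking = os.getenv("RAG_SEMANTIC_CHUNKING", "true").lower() == "true"
```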
```bash
# Activate virtual environment (if not already active)
source venv/bin/activate  # macOS/Linux
# or
venv\Scripts\activate     # Windows

# Run the application
streamlit run app.py
```
The application will open in your default browser at `http://localhost:8501`.
#### 1. Upload Documents
1. Navigate to the **Documents** page
2. Click "Browse files" or drag & drop PDF files
3. Wait for processing to complete
4. View document statistics and metadata
#### 2. Start Chatting
1. Go to the **Chat** page
2. Type your question in the chat input
3. Press Enter or click Send
4. View AI responses with source citations
Example queries:
1. "What is the main topic of the document?"
2. "Summarize the key findings"
3. "Compare section 2 and section 5"
4. "Explain the methodology used"
#### 3. View Analytics
1. Visit the **Analytics** page
2. Review performance metrics
3. Explore interactive charts
4. Export reports (Excel/HTML)
#### 4. Configure Settings
1. Open the **Settings** page
2. Adjust model parameters
3. Configure context window
4. Set UI preferences
5. Click "Save All Settings"
```
# Adjust context window (1-20 messages)
Settings → Conversation → Number of Historical Messages

# Enable privacy mode
Settings → Conversation → Privacy Mode ✓

# Preserve context between sessions
Settings → Conversation → Preserve Context Between Sessions ✓
```
```
# Enable semantic chunking
Settings → Retrieval → Enable Semantic Chunking ✓

# Adjust chunk size
Settings → Retrieval → Chunk Size: 1000

# Set chunk overlap
Settings → Retrieval → Chunk Overlap: 200

# Configure hybrid search
Settings → Retrieval → Hybrid Search Alpha: 0.5
# 0.0 = Pure keyword search
# 1.0 = Pure semantic search
# 0.5 = Balanced hybrid

# Enable reranking
Settings → Retrieval → Enable Reranking ✓
```
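To illustrate what the chunk size and overlap settings control, here is a minimal fixed-size chunking sketch. The project's semantic chunker in `pdf_processor.py` is more sophisticated; this is not its actual implementation.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping fixed-size chunks so that context
    spanning a chunk boundary appears in both neighboring chunks."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

With the defaults above, each 1000-character chunk shares its last 200 characters with the next chunk, so sentences that straddle a boundary are still retrievable as a whole.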
```python
from pdf_processor import createPDFProcessor

# Initialize processor
processor = createPDFProcessor()

# Process a single PDF
processed_doc = processor.processPDF(
    file_path="document.pdf",
    file_content=bytes_content  # Optional
)

# Access results
print(f"Pages: {processed_doc.page_count}")
print(f"Chunks: {len(processed_doc.chunks)}")
print(f"Method: {processed_doc.extraction_method}")

# Batch processing
results = processor.processBatch(["doc1.pdf", "doc2.pdf"])
```
```python
import asyncio

from rag_core import getRAGEngine

async def main():
    # Get RAG engine instance
    rag = getRAGEngine()

    # Process a query with conversation context (streamed response)
    async for response_chunk in rag.processQuery(
        query="What is the summary?",
        conversation_history=[
            {"role": "user", "content": "Previous question"},
            {"role": "assistant", "content": "Previous answer"}
        ]
    ):
        print(response_chunk, end="")

asyncio.run(main())
```
```python
from vector_store import getGlobalVectorStore

# Get vector store instance
vector_store = getGlobalVectorStore()

# Add documents
success = vector_store.addDocuments(chunks)

# Search
results = vector_store.search(
    query="search term",
    top_k=5
)

# Delete document
vector_store.delete(document_id)
```
```python
from datetime import datetime

from analytics_advanced import AdvancedAnalytics, QueryMetrics

# Initialize analytics
analytics = AdvancedAnalytics()

# Add a query metric
metric = QueryMetrics(
    timestamp=datetime.now(),
    session_id="session_123",
    query="user question",
    response_time=1.5,
    chunks_retrieved=5,
    confidence=0.85
)
analytics.addQueryMetric(metric)

# Generate report
report = analytics.generateInteractiveReport()

# Export to Excel
excel_data = analytics.exportToExcel(query_history)
```
The RAG engine implements a multi-step reasoning process (a simplified sketch follows the step list below):
```
Query → Planning → Retrieval → Generation → Reflection → Improvement
```
- Planning: Classifies intent and decomposes complex queries
- Retrieval: Hybrid search with semantic and keyword matching
- Generation: Context-aware response with streaming
- Reflection: Self-evaluation and improvement for complex queries
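The following is a simplified, self-contained sketch of that control flow. All helpers here are trivial stand-ins, not the actual `rag_core.py` functions; the real engine calls Gemini and streams its output.

```python
from dataclasses import dataclass

@dataclass
class Critique:
    needs_improvement: bool
    feedback: str = ""

# Trivial stand-ins for illustration only.
def plan_query(query: str) -> str:
    return query.strip()  # classify intent / decompose complex queries

def retrieve(plan: str, top_k: int = 5) -> list[str]:
    return [f"chunk relevant to: {plan}"][:top_k]  # hybrid search stub

def generate(query: str, chunks: list[str], feedback: str = "") -> str:
    return f"Answer to '{query}' from {len(chunks)} chunk(s). {feedback}".strip()

def reflect(query: str, draft: str) -> Critique:
    return Critique(needs_improvement=False)  # self-evaluation stub

def answer(query: str) -> str:
    plan = plan_query(query)            # Planning
    chunks = retrieve(plan)             # Retrieval
    draft = generate(query, chunks)     # Generation
    critique = reflect(query, draft)    # Reflection
    if critique.needs_improvement:      # Improvement (complex queries only)
        draft = generate(query, chunks, feedback=critique.feedback)
    return draft

print(answer("What is the summary?"))
```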
Prevents cascade failures with automatic recovery:
```python
@pdf_circuit_breaker  # Automatically handles failures
def processPDF(self, file_path):
    # Processing logic
    pass
```
States: `CLOSED` → `OPEN` (on failure) → `HALF_OPEN` (testing) → `CLOSED` (recovered)
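For intuition, here is a minimal circuit-breaker decorator. It is a sketch of the pattern, not the project's actual implementation.

```python
import functools
import time

def circuit_breaker(max_failures: int = 3, reset_timeout: float = 30.0):
    """After max_failures consecutive errors the circuit OPENs and calls
    fail fast; once reset_timeout has elapsed, one trial call is let
    through (HALF_OPEN), and success CLOSEs the circuit again."""
    def decorator(func):
        state = {"failures": 0, "opened_at": 0.0}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if state["failures"] >= max_failures:
                if time.time() - state["opened_at"] < reset_timeout:
                    raise RuntimeError(f"{func.__name__}: circuit OPEN")
                # HALF_OPEN: fall through and allow one trial call
            try:
                result = func(*args, **kwargs)
            except Exception:
                state["failures"] += 1
                state["opened_at"] = time.time()
                raise
            state["failures"] = 0  # success: CLOSE the circuit
            return result

        return wrapper
    return decorator
```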
Each session maintains separate:
- Conversation history
- User preferences
- Document context
- Analytics data
Multi-level caching for performance (a minimal TTL-cache sketch follows the list):
- Query cache (TTL: 1 hour)
- Embedding cache (TTL: 2 hours)
- Analytics cache (TTL: 5 minutes)
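A minimal sketch of the idea behind these caches, illustrative only; the project's cache implementation may differ.

```python
import time

class TTLCache:
    """Minimal time-to-live cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict = {}

    def set(self, key, value):
        self._store[key] = (value, time.time() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return default
        return value

# Example: a query cache with a 1-hour TTL, matching the figure above
query_cache = TTLCache(ttl_seconds=3600)
```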
- PII sanitization in conversations (sketched below)
- Sensitive data redaction
- API key rotation
- Session timeout management
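As an illustration of what PII sanitization can look like, here is a regex-based redaction sketch; the patterns and behavior of the project's privacy mode may differ.

```python
import re

# Illustrative patterns only; real PII detection needs broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def sanitize(text: str) -> str:
    """Replace common PII patterns with typed placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(sanitize("Reach me at jane@example.com or 555-123-4567."))
# Reach me at [EMAIL] or [PHONE].
```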
Error: `ModuleNotFoundError: No module named 'streamlit'`

Solution:

```bash
# Ensure virtual environment is activated
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate     # Windows

# Reinstall dependencies
pip install -r requirements.txt
```
Error: `GOOGLE_API_KEY environment variable not set`

Solution:
- Check that the `.env` file exists
- Verify the API key is correct
- Restart the application
- Test the API key:

```bash
python -c "import os; print('API Key:', (os.getenv('GOOGLE_API_KEY') or 'NOT SET')[:10] + '...')"
```
Error: `No text could be extracted from PDF`

Solutions:
- Ensure the PDF is not corrupted
- Check the file size (< 50MB by default)
- Install OCR dependencies for scanned PDFs:

```bash
# Install Tesseract
sudo apt-get install tesseract-ocr  # Linux
brew install tesseract              # macOS

# Install Python packages
pip install pytesseract pdf2image
```
Error: `MemoryError` or slow performance

Solutions:

```
# Reduce chunk size
Settings → Retrieval → Chunk Size: 500

# Limit context window
Settings → Conversation → Context Window: 3

# Clear cache
Settings → Clear Cache (button)

# Restart application
```
Error: `TimeoutError` during API calls

Solutions:
- Check internet connection
- Verify firewall settings
- Increase the timeout in config:

```python
# In config.py
request_timeout = 30  # seconds
```
Enable debug logging:

```bash
# Set logging level
export LOG_LEVEL=DEBUG

# Run with debug
streamlit run app.py --logger.level=debug
```
Q: What file formats are supported? A: Currently, the application supports PDF files (.pdf). Text files (.txt) support is planned for future releases.
Q: What is the maximum file size limit? A: Default limit is 50MB per file. This can be adjusted in the configuration.
Q: How many documents can I upload? A: There's no hard limit on the number of documents, but performance may degrade with very large document sets (>100 documents).
Q: Is my data secure? A: Yes, the application includes:
- Local processing (documents aren't sent to external servers except for API calls)
- Privacy mode for sensitive data
- Session isolation
- Secure API key management
Q: Which AI models are supported? A: Currently supports Google Gemini models:
- gemini-1.5-flash (default)
- gemini-1.5-pro
- gemini-1.0-pro
Q: Can I use my own embedding model? A: Yes, set `embedding_model` in the configuration:

```env
RAG_EMBEDDING_MODEL=sentence-transformers/your-model
```
Q: How does hybrid search work? A: Combines two search methods:
- Dense (Semantic): Uses embeddings for meaning-based search
- Sparse (Keyword): Uses BM25 for exact keyword matching
- Results are fused using configurable weighting (the alpha parameter), as sketched below
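A minimal sketch of the fusion step, assuming both searches return normalized per-chunk scores (illustrative only, not `vector_store.py`'s actual code):

```python
def fuse_scores(semantic: dict[str, float], keyword: dict[str, float],
                alpha: float = 0.5) -> dict[str, float]:
    """Weighted fusion of dense and sparse scores:
    alpha=1.0 is pure semantic, alpha=0.0 is pure keyword."""
    chunk_ids = semantic.keys() | keyword.keys()
    return {
        cid: alpha * semantic.get(cid, 0.0) + (1 - alpha) * keyword.get(cid, 0.0)
        for cid in chunk_ids
    }
```

With alpha = 0.5, both signals contribute equally, matching the default `RAG_HYBRID_SEARCH_ALPHA` above.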
Q: What is the context window? A: The number of previous messages included when generating responses (1-20 messages).
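Mechanically, this can be as simple as the following sketch, assuming history is a list of role/content dicts as in the RAG engine example above (illustrative only):

```python
def build_context(history: list[dict], context_window: int = 5) -> list[dict]:
    # Send only the most recent `context_window` messages to the model.
    return history[-context_window:]
```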
Q: How can I improve response time? A:
- Enable caching in settings
- Reduce chunk size
- Decrease retrieval top-k
- Use a faster model (`gemini-1.5-flash`)
Q: How much RAM is needed? A:
- Minimum: 8GB
- Recommended: 16GB
- For large documents (>100MB): 32GB
Q: Can I add custom themes? A: Yes, add theme CSS in `app.py`:

```python
theme_css = {
    'Custom': """
    <style>
    :root {
        --primary-color: #your_color;
    }
    </style>
    """
}
```
- Fork the repository
- Create a feature branch:

```bash
git checkout -b feature/your-feature-name
```
- Follow PEP 8 guidelines
- Use type hints
- Add docstrings to functions
- Maximum line length: 100 characters
- Documentation: Full Documentation
- Issues: GitHub Issues
- Wiki: Project Wiki
This project wouldn't be possible without:
- Google Gemini AI - AI model powering the chatbot
- Streamlit - Web application framework
- LangChain - RAG patterns and concepts
- Sentence Transformers - Embedding models
- FAISS - Vector similarity search
- PyPDF2 - PDF processing
- AI Tools
- Open Source Community
- Google AI team for the Gemini API
- Streamlit team for the amazing framework
- All contributors and users of this project