A powerful Retrieval-Augmented Generation (RAG) system that transforms any developer documentation into an intelligent Q&A assistant. Built with GPU acceleration, hybrid retrieval, and modern NLP techniques.
Transform documentation websites into intelligent chatbots that can answer questions about the content. Simply provide a documentation URL (like React docs, Python docs, etc.), and the system will:
- Crawl the documentation website
- Process content into searchable chunks
- Generate vector embeddings with GPU acceleration
- Store everything in a vector database
- Provide a chat interface for Q&A
- 🚀 GPU Acceleration - 3-5x faster processing with automatic CUDA detection
- 🧠 Hybrid Retrieval - Combines dense vectors (Pinecone) + sparse search (BM25) + re-ranking
- 📚 Universal Support - Works with any documentation website
- ⚡ Smart Caching - Process once, query forever
- 🎨 Clean Interface - Modern Gradio web UI
- 🔧 Flexible Usage - Web interface, Python API, or Jupyter notebook
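
The hybrid retrieval idea above can be sketched in a few lines: blend a dense (vector-similarity) score with a sparse (keyword) score and rank by the combination. This is a minimal illustration, not the project's actual implementation — the function names, the toy keyword score, and the `alpha` weight are all assumptions; the real system uses Pinecone for dense search, BM25 for sparse search, and a re-ranker on top.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    """Toy sparse score: fraction of query terms found in the chunk.
    (A real system would use BM25 here.)"""
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in text.lower())
    return hits / len(terms) if terms else 0.0

def hybrid_rank(query, query_vec, chunks, alpha=0.7):
    """Blend dense and sparse scores; alpha weights the dense side."""
    scored = []
    for chunk in chunks:
        dense = cosine(query_vec, chunk["vector"])
        sparse = keyword_score(query, chunk["text"])
        scored.append((alpha * dense + (1 - alpha) * sparse, chunk["text"]))
    return [text for score, text in sorted(scored, reverse=True)]

chunks = [
    {"text": "React hooks let you use state in function components",
     "vector": [1.0, 0.0]},
    {"text": "CSS grid layout basics", "vector": [0.0, 1.0]},
]
ranked = hybrid_rank("react hooks", [0.9, 0.1], chunks)
```

Blending the two signals lets keyword matches rescue queries (exact API names, error strings) that pure embedding similarity can miss.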
```bash
git clone https://github.com/yourusername/dev-docs-rag.git
cd dev-docs-rag
pip install -r requirements.txt
```

Create a `.env` file with your API keys:
```env
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX=your_pinecone_index_name
OPENROUTER_API_KEY=your_openrouter_api_key
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet
APPWRITE_ENDPOINT=your_appwrite_endpoint
APPWRITE_PROJECT_ID=your_project_id
APPWRITE_API_KEY=your_api_key
APPWRITE_DATABASE_ID=your_database_id
APPWRITE_COLLECTION_ID=your_collection_id
APPWRITE_BUCKET_ID=your_bucket_id
```

There are two ways to run the pipeline: one is easier to run locally, the other is easier to run on a server.
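
Whichever option you choose, the pipeline needs all of the environment variables above. A minimal stdlib-only sketch (not part of the repository) for failing fast when any are missing:

```python
import os

# Variable names taken from the .env template above.
REQUIRED_KEYS = [
    "PINECONE_API_KEY", "PINECONE_INDEX",
    "OPENROUTER_API_KEY", "OPENROUTER_MODEL",
    "APPWRITE_ENDPOINT", "APPWRITE_PROJECT_ID", "APPWRITE_API_KEY",
    "APPWRITE_DATABASE_ID", "APPWRITE_COLLECTION_ID", "APPWRITE_BUCKET_ID",
]

def missing_keys(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# Simulate a partially configured environment:
demo_env = {"PINECONE_API_KEY": "abc"}
missing = missing_keys(demo_env)  # everything except PINECONE_API_KEY
```

In real use you would call `missing_keys()` with no argument (so it reads `os.environ`) and abort with a clear message if the list is non-empty.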
Option A: Complete Pipeline (Recommended)

```bash
python run_pipeline.py
# Follow the interactive prompts
```

Option B: Jupyter Notebook

```bash
jupyter notebook documentation_pipeline.ipynb
```

Once processing is complete, launch the web interface to ask questions:

```bash
python app.py
# Open http://localhost:7860
```

Documentation URL → Crawl → Chunk → Embed → Store → Chat Interface
- Crawl: Extracts content from documentation websites
- Chunk: Splits content into manageable pieces with overlap
- Embed: Generates vector embeddings (GPU-accelerated)
- Store: Saves to Pinecone vector database + Appwrite storage
- Retrieve: Hybrid search (vector + keyword + re-ranking)
- Generate: LLM creates answers with retrieved context
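
The Chunk step splits each crawled page into overlapping windows so that context is not lost at chunk boundaries. A minimal word-based sketch — the sizes, the word-level splitting, and the function name are illustrative assumptions, not the actual `chunk_docs.py` logic:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size words, where consecutive
    chunks share `overlap` words at their boundary."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap  # advance by this many words per chunk
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final window already reaches the end of the text
    return chunks

doc = " ".join(f"word{i}" for i in range(500))
chunks = chunk_text(doc, chunk_size=200, overlap=50)
# The last 50 words of each chunk reappear as the first 50 of the next.
```

The overlap is what lets a sentence that straddles a boundary still be retrieved as a whole from at least one chunk.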
- Pinecone - Vector database (free tier available)
- OpenRouter - LLM API access (pay-per-use)
- Appwrite - Storage and database (free tier available)
- GPU: NVIDIA GPU recommended (falls back to CPU)
- RAM: 8GB+ for large documentation sets
- Storage: Minimal (data stored in cloud)
- Batch Size: 200 (default), increase for better GPUs
- Auto-detection: Automatically uses GPU if available
- Memory Management: Automatic cache clearing
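
The auto-detection behavior amounts to: use CUDA when PyTorch can see a GPU, otherwise fall back to CPU. A small sketch of that pattern (hedged so it also works on machines where `torch` is not installed):

```python
def pick_device():
    """Return "cuda" when PyTorch reports an available GPU, else "cpu".
    Degrades gracefully if torch itself is not installed."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"

device = pick_device()
# On the GPU path, periodically calling torch.cuda.empty_cache()
# releases cached memory between large embedding batches.
```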
- Force Reprocess: Reprocess existing documentation
- URL Filtering: Filter results by documentation source
- Status Tracking: Database-backed completion tracking
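
URL filtering can be pictured as a post-filter on retrieval results: keep only matches whose metadata records the selected documentation source. This sketch is illustrative — the `source_url` metadata field name is an assumption, and in practice the same effect is typically achieved server-side with a metadata filter on the vector-database query:

```python
def filter_by_source(matches, selected_url):
    """Keep only retrieval results that came from the chosen docs site."""
    return [m for m in matches
            if m["metadata"].get("source_url") == selected_url]

matches = [
    {"text": "useState returns a stateful value and an updater",
     "metadata": {"source_url": "https://react.dev/learn"}},
    {"text": "len() returns the length of an object",
     "metadata": {"source_url": "https://docs.python.org/3/"}},
]
react_only = filter_by_source(matches, "https://react.dev/learn")
```

Filtering at query time means one shared index can serve many documentation sets without answers bleeding across sources.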
```bash
python crawl_docs.py https://react.dev/learn
python chunk_docs.py https://react.dev/learn
python embed_upload.py https://react.dev/learn 200 true
```

```python
from embed_upload import embed_and_upload_chunks
from rag_pipeline import process_question_with_relevance_check

# Process documentation
embed_and_upload_chunks("https://react.dev/learn", batch_size=200, use_gpu=True)

# Ask questions
answer = process_question_with_relevance_check(
    "How do React hooks work?",
    selected_url="https://react.dev/learn"
)
```

```
dev-docs-rag/
├── app.py                        # Web chat interface
├── run_pipeline.py               # Complete pipeline
├── documentation_pipeline.ipynb  # Jupyter notebook
├── crawl_docs.py                 # Web scraper
├── chunk_docs.py                 # Document chunking
├── embed_upload.py               # GPU-accelerated embedding
├── rag_pipeline.py               # RAG with hybrid retrieval
├── appwrite_service.py           # Database integration
├── manual_process.py             # Step-by-step processing
└── requirements.txt              # Dependencies
```
GPU Not Working?

```bash
python -c "import torch; print(torch.cuda.is_available())"
```

Memory Issues?
- Reduce the batch size for smaller GPUs
- Check GPU memory usage

API Errors?
- Verify all API keys in `.env`
- Check service quotas and limits

Processing Failures?
- Ensure the documentation URL is accessible
- Try again with `force_reprocess=True`
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Test with different documentation sets
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Built with Gradio for the web interface
- Powered by Pinecone for vector storage
- Uses OpenRouter for LLM access
- Storage provided by Appwrite
Note - This system also works with other (crawlable) websites, but it is optimised specifically for developer documentation.
⚡ Ready to get started? Run `python run_pipeline.py` and follow the prompts!