A medical question-answering system that leverages Retrieval-Augmented Generation (RAG) to provide accurate and context-aware medical information. This application combines the power of large language models with vector search to deliver precise answers to medical queries.
- Advanced RAG Pipeline: Implements a robust Retrieval-Augmented Generation system for accurate medical information retrieval
- Multi-document Support: Processes and indexes multiple PDF documents from the `data/` directory
- Semantic Search: Utilizes Pinecone's vector database for efficient similarity search across medical documents
- State-of-the-Art LLM: Powered by Google's Gemini model through LangChain for high-quality response generation
- Web Interface: User-friendly Flask-based web interface for seamless interaction
- Scalable Architecture: Designed for easy extension and integration with additional data sources
- Customizable Prompts: Easily adjustable system prompts to tailor responses to medical domain requirements
- Efficient Chunking: Smart text splitting to maintain context while processing large documents
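The overlap-based chunking mentioned above can be sketched in plain Python. This is an illustrative stand-in (the project itself most likely uses a LangChain text splitter); `chunk_text`, `size`, and `overlap` are hypothetical names chosen for the example:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows so neighbouring chunks share context."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    chunks = []
    step = size - overlap  # advance less than the window size to create overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
    return chunks

pages = "A" * 1200  # stand-in for text extracted from a PDF
chunks = chunk_text(pages, size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # → 3 [500, 500, 300]
```

The last 50 characters of each chunk reappear at the start of the next one, so a sentence cut at a boundary is still seen whole by at least one chunk.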
The application follows a simple, modular architecture with the following components:
- Frontend: Lightweight HTML/JS interface with responsive design
- Backend: Flask web server handling API requests
- Vector Database: Pinecone for efficient vector similarity search
- Embedding Model: `all-MiniLM-L6-v2` for creating document embeddings
- LLM Integration: Google's Gemini model for generating human-like responses
- Document Processing: Automated pipeline for PDF ingestion and text extraction
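At retrieval time, the vector database ranks stored chunks by cosine similarity between their embeddings and the query embedding. A toy sketch of that ranking with hand-made 3-dimensional vectors (the real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and Pinecone performs this search server-side):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dim "embeddings" standing in for real 384-dim vectors.
docs = {
    "aspirin dosage": [0.9, 0.1, 0.0],
    "flu symptoms":   [0.1, 0.9, 0.2],
}
query = [0.85, 0.15, 0.05]

best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # → aspirin dosage
```

The chunk whose embedding points in nearly the same direction as the query wins, regardless of vector magnitude — which is why cosine (not raw dot product) is the usual metric for sentence embeddings.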
- `app.py`: Flask app and RAG pipeline
- `store_index.py`: Builds Pinecone index from PDFs in `data/`
- `src/helper.py`: Load PDF(s), split text, and create embeddings
- `src/prompt.py`: System prompt for the assistant
- `templates/chat.html`: Frontend chat page
- `static/style.css`: Simple styles
- Python 3.10+
- A Pinecone account and API key
- A Google AI Studio API key (for Gemini)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r requirements.txt

Create a `.env` file in the project root:
PINECONE_API_KEY="your_pinecone_api_key"
GOOGLE_API_KEY="your_google_api_key"

Notes:
- The code uses the Pinecone index name `medical-catboot` and expects embeddings of dimension 384 (`all-MiniLM-L6-v2`).
- Default serverless spec: cloud `aws`, region `us-east-1`.
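Since both API keys are required before anything else runs, a fail-fast startup check is a useful pattern. This is a sketch, not the repo's actual code; `require_env` is a hypothetical helper (the project presumably loads `.env` via python-dotenv):

```python
import os

def require_env(*names: str) -> dict:
    """Fail fast with a clear message if any required key is missing."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {n: os.environ[n] for n in names}

# For illustration only — real values come from the .env file:
os.environ.setdefault("PINECONE_API_KEY", "dummy")
os.environ.setdefault("GOOGLE_API_KEY", "dummy")
keys = require_env("PINECONE_API_KEY", "GOOGLE_API_KEY")
print(sorted(keys))  # → ['GOOGLE_API_KEY', 'PINECONE_API_KEY']
```

Raising at import time with the exact missing names is far easier to debug than a 401 deep inside a Pinecone or Gemini call.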
Place your PDFs in the data/ folder (the repo includes data/Medical_book.pdf). Rebuild the index after you change files.
.\.venv\Scripts\python store_index.py

This will:
- Read PDFs from `data/`
- Split into chunks
- Create sentence-transformer embeddings (384 dims)
- Create or reuse Pinecone index `medical-catboot`
- Upsert embeddings
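The "create or reuse" step above amounts to checking for the index name before creating it. The sketch below shows the logic against a duck-typed client so it runs without credentials; with the real `pinecone.Pinecone` client the corresponding calls would be `list_indexes()` and `create_index(...)` (the `FakeClient` is purely illustrative):

```python
def ensure_index(client, name: str, dimension: int = 384):
    """Create the index only if it does not already exist, then return it."""
    if name not in client.list_indexes():
        client.create_index(name=name, dimension=dimension)
    return client.Index(name)

# Tiny in-memory stand-in for the real Pinecone client, for illustration:
class FakeClient:
    def __init__(self):
        self._indexes = {}

    def list_indexes(self):
        return list(self._indexes)

    def create_index(self, name, dimension):
        self._indexes[name] = {"dimension": dimension, "vectors": {}}

    def Index(self, name):
        return self._indexes[name]

pc = FakeClient()
idx = ensure_index(pc, "medical-catboot", dimension=384)
idx2 = ensure_index(pc, "medical-catboot")  # second call reuses, not recreates
print(idx is idx2, idx["dimension"])  # → True 384
```

Because `store_index.py` is rerun whenever the PDFs change, this idempotent create-or-reuse check is what keeps repeated runs from failing on an existing index.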
.\.venv\Scripts\python app.py

Open http://localhost:8080 in your browser.
- Ensure `.env` contains valid `PINECONE_API_KEY` and `GOOGLE_API_KEY`.
- If you change PDFs, rerun `store_index.py` to refresh embeddings.
- If the index doesn't exist, the script will create it (serverless `us-east-1`).
- On corporate networks, set proxy env vars for `pip`/downloads if needed.
- Python, Flask
- LangChain (Retrieval chain)
- Sentence Transformers (`all-MiniLM-L6-v2`)
- Pinecone (Vector DB)
- Google Gemini (via `langchain-google-genai`)