A Retrieval-Augmented Generation (RAG) chatbot built with LangChain, Chroma, and Gradio. It lets users ask questions about Insurellm's company information, employees, products, and contracts using semantic search.
- Automatic Vector Store Building: Automatically builds the vector database on first launch
- Semantic Search: Uses Chroma vector database for intelligent document retrieval
- Conversational Memory: Maintains conversation context across queries
- Beautiful UI: Modern Gradio interface with examples and controls
- Production Ready: Includes error handling and logging
- Python 3.8 or higher
- OpenAI API key
- Conda (recommended) or pip
- Clone the repository:

  ```bash
  git clone git@github.com:deXterbed/Insurellm-RAG-Assistant.git
  cd Insurellm-RAG-Assistant
  ```
- Create a conda environment (recommended):

  ```bash
  conda create -n llms python=3.11
  conda activate llms
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables: create a `.env` file in the project root:

  ```bash
  echo "OPENAI_API_KEY=your-api-key-here" > .env
  ```

  Or manually create `.env` and add:

  ```
  OPENAI_API_KEY=sk-your-actual-api-key
  ```
- Run the application:

  ```bash
  python app.py
  ```
- Access the interface:
  - The app will automatically open in your browser
  - Or visit http://localhost:7860
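The auto-open behavior and port typically come from the Gradio launch call; a minimal sketch (the `chat` callback here is a placeholder, not the app's actual function):

```python
import gradio as gr

def chat(message, history):
    # placeholder: app.py would invoke the RAG chain here
    return "..."

demo = gr.ChatInterface(fn=chat)
demo.launch(inbrowser=True, server_port=7860)  # opens http://localhost:7860
```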
- First Launch:
  - On first launch, the app automatically builds the vector database from the `knowledge-base/` directory; this may take a few minutes depending on the number of documents
  - Subsequent launches reuse the existing vector database (see the sketch after this list)
- Ask Questions:
  - Type your question in the chat interface
  - Examples:
    - "Who is Avery Lancaster?"
    - "What is Carllm?"
    - "Tell me about Insurellm"
    - "What contracts does Insurellm have?"
```
week5/
├── app.py              # Main application file
├── requirements.txt    # Python dependencies
├── .gitignore          # Git ignore file
├── README.md           # This file
├── .env                # Environment variables (not in git)
├── knowledge-base/     # Source documents
│   ├── company/        # Company information
│   ├── employees/      # Employee profiles
│   ├── products/       # Product descriptions
│   └── contracts/      # Contract documents
└── vector_db/          # Vector database (auto-generated)
```
- Document Loading: Loads markdown files from `knowledge-base/`
- Text Splitting: Splits documents into chunks (1000 characters, 200 overlap)
- Embedding: Uses OpenAI embeddings to create vector representations
- Vector Store: Stores the embeddings in a Chroma database
- Retrieval: Semantic search finds the most relevant document chunks
- Generation: GPT-4o-mini generates answers based on the retrieved context
- Memory: Maintains conversation history for context
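A sketch of the indexing steps using the libraries listed above (a plausible reconstruction, not verbatim from app.py):

```python
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Load and split the markdown documents
loader = DirectoryLoader("knowledge-base/", glob="**/*.md", loader_cls=TextLoader)
documents = loader.load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(documents)

# Embed the chunks and persist them in Chroma
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    persist_directory="vector_db",  # reused on subsequent launches
)
```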
```
User Question
      ↓
Embedding (Query Vector)
      ↓
Vector Similarity Search (Chroma)
      ↓
Retrieve Top-K Relevant Chunks
      ↓
Combine with Chat History
      ↓
LLM (GPT-4o-mini) Generation
      ↓
Response + Update Memory
```
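This flow maps onto LangChain's `ConversationalRetrievalChain`, which the app uses (see the acknowledgments). A sketch, continuing from the `vectorstore` built in the previous snippet:

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})  # top-4 chunks

chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=retriever, memory=memory)
result = chain.invoke({"question": "Who is Avery Lancaster?"})
print(result["answer"])  # chat history is updated in memory automatically
```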
Edit `app.py` to change:

- `MODEL`: the OpenAI model (default: `"gpt-4o-mini"`)
- `temperature`: response creativity (default: `0.7`)
- `search_kwargs`: number of retrieved chunks (default: `k=4`)

Edit the `RecursiveCharacterTextSplitter` parameters:

- `chunk_size`: size of text chunks (default: `1000`)
- `chunk_overlap`: overlap between chunks (default: `200`)
- Employee Information: "Who is Avery Lancaster?", "Tell me about Alex Chen"
- Product Information: "What is Carllm?", "Describe Rellm"
- Company Information: "What is Insurellm?", "Tell me about the company"
- Contracts: "What contracts does Insurellm have?", "Who is TechDrive Insurance?"
- Ensure the `knowledge-base/` directory exists and contains markdown files
- Check that you have write permissions in the project directory
- Review logs for specific error messages
- Verify that the `.env` file exists and contains `OPENAI_API_KEY`
- Check that your API key is valid and has credits
- Ensure there are no extra spaces or quotes in the `.env` file
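A quick sanity check you can run from the project root (assumes python-dotenv is installed, which the app already requires to read `.env`):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory
key = os.getenv("OPENAI_API_KEY", "")
assert key.startswith("sk-") and key == key.strip(), \
    "OPENAI_API_KEY missing, malformed, or padded with whitespace"
print("OPENAI_API_KEY looks OK")
```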
- Use "Reset Memory" button to clear conversation history
- Restart the app if memory becomes too large
- LangChain: RAG pipeline and chain orchestration
- Chroma: Vector database for embeddings
- OpenAI: Embeddings and LLM (GPT-4o-mini)
- Gradio: Web interface
- Python: Core language
Feel free to submit issues, fork the repository, and create pull requests!
This project is for educational purposes.
- Built as part of LLM tutorials
- Uses LangChain's ConversationalRetrievalChain
- Inspired by RAG best practices
Note: Make sure to keep your `.env` file private and never commit it to version control!
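If you cloned the repository, its `.gitignore` should already cover this; the relevant entries would look like:

```
.env
vector_db/
```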