Welcome to the RAG-Enhanced Chatbot Application, a powerful and scalable chatbot solution that leverages Retrieval-Augmented Generation (RAG) techniques to provide intelligent and context-aware responses. Built with Streamlit, Python, and advanced language models from OpenAI, this application is designed to enhance user interactions by integrating document and web-based knowledge sources.
Whether you're developing an e-commerce platform, a real estate service, or any application that requires dynamic and informed conversational agents, our chatbot offers the flexibility and robustness you need.
The goal of this project is to build a highly responsive and intelligent chatbot using Retrieval-Augmented Generation (RAG). The chatbot integrates Large Language Models (LLMs), such as OpenAI's GPT, with a document retriever mechanism powered by ChromaDB. This approach enhances the chatbot’s ability to provide precise, context-aware answers by referring to uploaded documents and web resources. The entire solution is designed to be efficient, scalable, and easily deployable via Docker.
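The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only, not the project's code: it swaps ChromaDB's embedding search for a plain bag-of-words cosine similarity so it runs with the standard library alone, and the function names (`retrieve`, `build_prompt`) and sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (stand-in for a vector store lookup)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user question with retrieved context before it is sent to an LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"


# Hypothetical document chunks, as if extracted from uploaded files.
chunks = [
    "Refunds are processed within five business days.",
    "Our office is open Monday through Friday.",
    "Shipping is free for orders over fifty dollars.",
]
top = retrieve("How long do refunds take?", chunks, k=1)
prompt = build_prompt("How long do refunds take?", top)
```

In a real deployment the `retrieve` step would query the vector store and `prompt` would be passed to the selected GPT model; the structure (retrieve, augment, generate) stays the same.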
- Real-Time Chat Interface: Seamless AI-driven conversation interface.
- Document Uploads: Upload various file formats (PDF, DOCX, TXT, MD) for data retrieval.
- OpenAI GPT-4 Integration: Utilizes OpenAI's GPT-4 models for advanced language generation.
- RAG Integration: Enhance chatbot responses by uploading documents or providing URLs, enabling the chatbot to retrieve and utilize external knowledge.
- User-Friendly Interface: Intuitive Streamlit-based UI with sidebar controls for API key management, model selection, and RAG source uploads.
- Comprehensive Logging: Detailed logging of user interactions, model selections, and system events, stored in a dedicated logs/ folder with log rotation.
Check out a short demo of the application in action:
Click the image above to watch the demo video.
- Streamlit: For building the interactive web application.
- Python: Core programming language.
- LangChain: For integrating language models.
- OpenAI GPT-4: For advanced language generation.
- SQLite: For lightweight database management.
- Logging: Python's built-in logging module for comprehensive logging.
- Docker: For containerized deployments (if applicable).
- Other Libraries: dotenv, uuid, etc.
Follow these steps to set up the RAG-Enhanced Chatbot Application on your local machine.
- Python 3.11+
- pip (Python package installer)
- Git (for cloning the repository)
- Docker (for containerized deployments)
- Clone the Repository
    git clone [email protected]:abdurrahimcs50/RAG_Chatbot_Project.git
    cd RAG_Chatbot_Project
- Create a Virtual Environment
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install Dependencies
    cd src
    pip install -r requirements.txt
- Set Up Environment Variables
  Create a .env file in the root directory and add your API key:
    OPENAI_API_KEY=your_openai_api_key
  Replace the placeholder value with your actual API key.
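As a rough sketch of how the key from the .env file might be read at startup: the helper name `get_openai_api_key` is hypothetical, and whether the application loads the key exactly this way is an assumption. `load_dotenv` comes from the python-dotenv package; if it is not installed, the sketch falls back to variables already present in the environment.

```python
import os


def get_openai_api_key(env_file: str = ".env") -> str:
    """Return OPENAI_API_KEY, loading env_file first when python-dotenv is available."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package

        # By default this does not override variables that are already set.
        load_dotenv(dotenv_path=env_file)
    except ImportError:
        pass  # rely on whatever is already in the process environment

    key = os.getenv("OPENAI_API_KEY", "").strip()
    if not key or key == "your_openai_api_key":
        raise RuntimeError("OPENAI_API_KEY is missing or still set to the placeholder value.")
    return key
```

Failing fast on the placeholder value surfaces a misconfigured .env file at startup rather than as an authentication error on the first chat message.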
- Run the Application
    streamlit run app.py
- Access the App
  Open your web browser and navigate to http://localhost:8501 to interact with the chatbot.
- API Key Management
  - Navigate to the sidebar to enter your OpenAI API key.
  - The key is required to authenticate with and use the OpenAI GPT-4 models.
- Model Selection
  - Choose your preferred language model from the dropdown menu in the sidebar.
  - Options include openai/gpt-4o and openai/gpt-4o-mini, depending on your API keys.
- RAG Source Uploads
  - Upload Documents: Click the "Upload Documents for RAG Processing" button to upload PDF, TXT, DOCX, or MD files.
  - Add URLs: Enter a URL to integrate web-based content into the chatbot's knowledge base.
- Chat Interface
  - Type your message in the chat input field and press Enter.
  - The assistant will respond based on your input and the integrated RAG sources.
- Logging
  - All interactions and system events are logged in the logs/ directory.
  - Logs are rotated to prevent excessive file sizes, ensuring efficient storage management.
- Clear Chat
  - Use the "Clear Chat" button in the sidebar to reset the conversation history.
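The RAG source rules listed above (accepted file types and URL input) could be checked before any processing starts. The following is a minimal sketch under assumptions: the helper names and the restriction to http/https URLs are illustrative and not taken from the project's code.

```python
from pathlib import Path
from urllib.parse import urlparse

# File types the README lists as supported for RAG uploads.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}


def is_supported_document(filename: str) -> bool:
    """Check the file extension case-insensitively against the supported set."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def is_valid_source_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host (an assumed policy)."""
    parsed = urlparse(url.strip())
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)
```

For example, `is_supported_document("Report.PDF")` is accepted while `is_supported_document("image.png")` is rejected, and `is_valid_source_url("ftp://example.com")` is rejected because of its scheme.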
The application employs Python's built-in logging module to capture and store logs systematically.
- Log Directory: All logs are stored in the logs/ folder.
- Log Rotation: Logs are rotated after reaching 5 MB, with up to 5 backup logs maintained to prevent storage issues.
- Log Contents:
  - Session initializations
  - API key inputs (without exposing the keys)
  - Model selections
  - RAG source uploads
  - User messages and assistant responses
  - Error and warning messages
Example log entries:
2024-10-16 14:30:45,123 - INFO - Initialized new session with ID: 123e4567-e89b-12d3-a456-426614174000
2024-10-16 14:31:10,456 - INFO - OpenAI API Key provided by user.
2024-10-16 14:31:15,789 - INFO - Selected model: openai/gpt-4o
2024-10-16 14:32:00,012 - INFO - 2 document(s) uploaded for RAG processing.
2024-10-16 14:32:30,345 - INFO - User input: How can I integrate RAG into my project?
2024-10-16 14:32:45,678 - INFO - Assistant response: To integrate RAG into your project...
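A logger matching the rotation policy and entry format shown above could be configured roughly as follows. This is a sketch, not the project's actual setup: the function name `build_logger`, the logger name `rag_chatbot`, and the file name `app.log` are assumptions, while the 5 MB limit, 5 backups, and the `timestamp - LEVEL - message` layout come from this section.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def build_logger(log_dir: str = "logs", name: str = "rag_chatbot") -> logging.Logger:
    """Create (or reuse) a logger that writes rotating log files into log_dir."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Streamlit reruns the script on every interaction, so guard against
    # attaching a second handler to the same named logger.
    if not logger.handlers:
        handler = RotatingFileHandler(
            Path(log_dir) / "app.log",  # file name is an assumption
            maxBytes=5 * 1024 * 1024,   # rotate after 5 MB
            backupCount=5,              # keep up to 5 rotated files
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `build_logger().info("Selected model: %s", "openai/gpt-4o")` would then append a line in the same shape as the entries above. To keep keys out of the logs, record only the event (for example "OpenAI API Key provided by user.") and never the key value itself.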
We welcome contributions from the community! Whether it's bug fixes, feature enhancements, or documentation improvements, your input is valuable.
- Fork the Repository
  Click the "Fork" button at the top right of the repository page.
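- Clone Your Fork
    git clone [email protected]:<your-username>/RAG_Chatbot_Project.git
    cd RAG_Chatbot_Project
  Replace <your-username> with the account that owns your fork, so that you clone the fork rather than the original repository.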
- Create a New Branch
    git checkout -b feature/YourFeatureName
- Make Your Changes
  Implement your feature or bug fix.
- Commit Your Changes
    git commit -m "Add your descriptive commit message"
- Push to Your Fork
    git push origin feature/YourFeatureName
- Open a Pull Request
  Navigate to the original repository and click "Compare & pull request."
- Code Quality: Ensure your code follows best practices and is well-documented.
- Testing: If applicable, include tests to verify your changes.
- Documentation: Update the README or other documentation if your changes affect usage.
This project is licensed under the MIT License.
For any questions, suggestions, or feedback, feel free to reach out:
- MD Abdur Rahim
- Email: [email protected]
- Website: www.rahim.com.bd
- LinkedIn: https://www.linkedin.com/in/abdurrahimcs50/
© 2021 - 2024 RahimTech. All rights reserved. Developed by MD Abdur Rahim.