An intelligent chatbot powered by Google Gemini that helps users understand and navigate Reserve Bank of India (RBI) circulars using natural language queries.
- PDF Processing: Automatically processes RBI circular PDFs
- Chunking: Intelligent text chunking for better context understanding
- Embedding Generation: Creates vector embeddings for semantic search
- Vector Database: Uses Qdrant for efficient vector storage and retrieval
- Semantic Search: Finds relevant circulars based on meaning, not just keywords
- Source Tracking: Shows which circulars were used to answer queries
- Natural Language: Ask questions in plain English
- Context-Aware: Maintains conversation context for better responses
- Source Display: Shows relevant circular sections used in answers
- Visual Graph: Interactive visualization of chat interactions and sources
- Neo4j Integration: Stores and visualizes chat interactions
- Interactive Graph: Shows relationships between queries and sources
- Real-time Updates: Graph updates as you chat
- Python 3.8+
- Docker and Docker Compose
- Google Cloud account (for Gemini API)
- Clone the repository:
git clone <repository-url>
cd rbi-circulars-chatbot
- Create a .env file with your credentials:
GOOGLE_API_KEY=your_gemini_api_key
NEO4J_URI=your_neo4j_uri
NEO4J_USER=your_neo4j_user
NEO4J_PASSWORD=your_neo4j_password
- Install dependencies:
pip install -r requirements.txt
- Start the services using Docker Compose:
docker-compose up -d
- Start the Streamlit app:
streamlit run app.py
- Open your browser and navigate to http://localhost:8501
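Before launching the app, it can help to confirm that the four credentials from the .env step are actually set. The following is a minimal sketch using only the standard library; the helper names `parse_env_lines` and `missing_vars` are hypothetical and are not part of this repository, and the project itself may load the file differently.

```python
from typing import Dict, Iterable, List, Mapping

# Variable names taken from the .env step of this README.
REQUIRED_VARS = ["GOOGLE_API_KEY", "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD"]


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    try:
        with open(".env", encoding="utf-8") as handle:
            env = parse_env_lines(handle)
    except FileNotFoundError:
        env = {}
    print("Missing variables:", missing_vars(env) or "none")
```

Running the script from the project root prints any variable that still needs a value.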
- Click the "🚀 Create Embeddings" button in the Dashboard
- Wait for the process to complete
- Monitor progress in the expandable progress section
- Click "💬 Start Chatting" after embeddings are created
- Type your question about RBI circulars
- View the response with relevant sources
- Explore the visualization graph
- Central Node: RBI Circulars
- Chat Nodes: Your queries and responses
- Source Nodes: Relevant circular sections
- Interactive Features:
- Hover for details
- Click to expand
- Drag to rearrange
- Zoom in/out
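The node layout described above (one central node, chat nodes, source nodes) could be written to Neo4j with a single parameterized Cypher statement. This is only a sketch: the labels `Corpus`, `Chat`, `Source`, the relationship types, and the function `build_interaction_query` are assumptions made for illustration, not the schema used in src/utils/neo4j_utils.py.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema: (Corpus)-[:HAS_CHAT]->(Chat)-[:USED_SOURCE]->(Source)
STORE_INTERACTION = """
MERGE (root:Corpus {name: 'RBI Circulars'})
CREATE (chat:Chat {question: $question, answer: $answer})
MERGE (root)-[:HAS_CHAT]->(chat)
WITH chat
UNWIND $sources AS src
MERGE (s:Source {id: src.id})
SET s.text = src.text
MERGE (chat)-[:USED_SOURCE]->(s)
"""


def build_interaction_query(
    question: str, answer: str, sources: List[Dict[str, str]]
) -> Tuple[str, Dict[str, Any]]:
    """Return the Cypher text and its parameters for one chat turn."""
    params: Dict[str, Any] = {
        "question": question,
        "answer": answer,
        # Keep only the fields the statement reads.
        "sources": [{"id": s["id"], "text": s["text"]} for s in sources],
    }
    return STORE_INTERACTION, params


if __name__ == "__main__":
    cypher, params = build_interaction_query(
        "What does the circular say about KYC?",
        "It requires periodic updates.",
        [{"id": "circular-1#chunk-3", "text": "KYC records must be updated..."}],
    )
    print(cypher)
    print(params)
```

The returned pair is meant to be handed to a Neo4j driver session, for example as `session.run(cypher, params)` with the official Python driver. Passing values as parameters rather than formatting them into the string keeps user text out of the query itself.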
- Frontend: Streamlit
- LLM: Google Gemini
- Vector DB: Qdrant
- Graph DB: Neo4j
- Document Processing: PyPDF2, LangChain
- PDF Processing → Text Extraction
- Text Chunking → Embedding Generation
- Vector Storage → Qdrant
- Query Processing → Gemini
- Response Generation → Chat Interface
- Interaction Storage → Neo4j
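The retrieval part of this flow (embed chunks, store vectors, find the closest ones for a query) can be illustrated without any external service. The sketch below is a toy: `toy_embed` is a hashed bag-of-words stand-in for Gemini embeddings, and `InMemoryIndex` is a stand-in for Qdrant. Both names are hypothetical and neither reflects how this project actually calls those services.

```python
import math
from typing import List, Tuple


def toy_embed(text: str, dim: int = 16) -> List[float]:
    """Stand-in for an embedding model: count words into hashed buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Stand-in for a vector database: stores (source, chunk, vector)."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, str, List[float]]] = []

    def add(self, source: str, chunk: str) -> None:
        self._items.append((source, chunk, toy_embed(chunk)))

    def search(self, query: str, top_k: int = 3) -> List[Tuple[float, str, str]]:
        """Return up to top_k (score, source, chunk) tuples, best first."""
        q = toy_embed(query)
        scored = [(cosine(q, vec), src, chunk) for src, chunk, vec in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    index = InMemoryIndex()
    index.add("circular-A", "repo rate policy")
    index.add("circular-B", "kyc norms banks")
    for score, source, chunk in index.search("repo rate policy", top_k=1):
        print(f"{source}: {score:.2f} ({chunk})")
```

In the real pipeline the top-ranked chunks would be passed to Gemini as context, and their sources shown alongside the answer.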
- Chunk Size: 1000 characters
- Chunk Overlap: 200 characters
- Embedding Model: Google Gemini
- Vector Dimension: 768
- Collection Name: rbi_circulars
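To make the chunk size and overlap settings concrete, here is a minimal sliding-window splitter using the values listed above (1000 characters, 200 overlap). It is a simplified illustration; the function name `chunk_text` is hypothetical, and the project may instead rely on a LangChain text splitter, which can also break on separators such as paragraphs.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into character windows where neighbours share `overlap` chars."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2500))
    pieces = chunk_text(sample)
    print([len(p) for p in pieces])
```

With these defaults each window starts 800 characters after the previous one, so a sentence that falls on a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.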
- Adjust chunk size in src/utils/config.py
- Modify visualization settings in src/utils/neo4j_utils.py
- Update UI layout in app.py
- Response Time: < 2 seconds
- Accuracy: > 90% for relevant queries
- Scalability: Handles 1000+ circulars
- Memory Usage: Optimized for production
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Gemini API
- Qdrant Vector Database
- Neo4j Graph Database
- Streamlit Framework
- LangChain Framework
For support, please open an issue in the repository or contact the maintainers.
Made with ❤️ for RBI Circulars