Intelligent Document Interaction Platform
DocuMind is an AI-driven document management and interaction platform that lets users upload, manage, and chat with their documents using Retrieval-Augmented Generation (RAG). Built with React and Flask, it provides secure, user-isolated document processing with real-time AI-powered conversations.
- Secure user authentication with cookie-based sessions
- Complete user isolation - users can only access their own documents
- Real-time document synchronization across components
- PDF - Extract text from PDF documents
- DOCX - Process Microsoft Word documents
- TXT - Handle plain text files
- CSV - Parse comma-separated values
- Chat with your documents using natural language
- Context-aware responses based on document content
- Powered by Groq's fast inference engine
- Real-time conversation interface
- Document Processing: Upload → Text Extraction → Chunking → Embeddings → Vector Storage
- Smart Chunking: Intelligent text segmentation for optimal retrieval
- Vector Search: Fast similarity search using Pinecone vector database
- Embedding Models: Support for multiple embedding providers
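The chunking stage of the pipeline above can be illustrated with a minimal sketch. It assumes fixed-size character windows with a small overlap; the function name `chunk_text` and the default sizes are illustrative assumptions, not the logic in DocuMind's `chunking.py`.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (illustrative only)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that contain only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "DocuMind splits documents into chunks before embedding them. " * 5
    for index, chunk in enumerate(chunk_text(sample, chunk_size=80, overlap=20)):
        print(index, repr(chunk))
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated storage.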
- Clean, responsive design with Tailwind CSS
- Real-time file upload with progress indicators
- Intuitive chat sidebar with document management
- Delete functionality with confirmation dialogs
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ React Client    │    │ Flask Server    │    │ External APIs   │
│                 │    │                 │    │                 │
│ • Upload UI     │◄──►│ • File Routes   │◄──►│ • Pinecone DB   │
│ • Chat Interface│    │ • Query Routes  │    │ • Groq LLM      │
│ • Document List │    │ • RAG Pipeline  │    │ • Appwrite      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
- Docker and Docker Compose (Recommended)
- Node.js (v16 or higher) - if running manually
- Python (v3.8 or higher) - if running manually
- Git
The easiest way to get DocuMind running is with Docker:
# 1. Clone the repository
git clone <repository-url>
cd DocuMind
# 2. Setup environment variables
cp .env.example .env
# Edit .env with your API keys (see Environment Variables section below)
# 3. Start everything with Docker
docker-compose up --build
# Or run in background
docker-compose up -d --build
That's it! Your application will be running at:
- Frontend: http://localhost:3000
- Backend API: http://localhost:5000
# To stop services
docker-compose down
# To view logs
docker-compose logs -f
# To restart services
docker-compose restart
If you prefer to run without Docker:
# Navigate to server directory
cd docuMind_server
# Create virtual environment
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
# Install dependencies
pip install -r requirements.txt
# Setup environment variables
cp .env.example .env
# Edit .env with your API keys
# Run the server
python app/main.py
# Navigate to client directory (in new terminal)
cd docuMind_client
# Install dependencies
npm install
# Start development server
npm run dev
Access the application:
- Frontend: http://localhost:5173 (Vite dev server)
- Backend: http://localhost:5000
Create a .env file in the docuMind_server directory with the following variables:
# Pinecone Configuration (Required)
PINECONE_API_KEY=your_pinecone_api_key_here
# Groq Configuration (Required)
GROQ_API_KEY=your_groq_api_key_here
# Appwrite Configuration (Required)
APPWRITE_ENDPOINT=your_appwrite_endpoint_here
APPWRITE_PROJECT_ID=your_appwrite_project_id_here
APPWRITE_API_KEY=your_appwrite_api_key_here
APPWRITE_BUCKET_ID=your_appwrite_bucket_id_here
- Pinecone (Vector Database)
  - Sign up at pinecone.io
  - Create a new project and get your API key
  - Create an index with 768 dimensions (cosine similarity)
- Groq (LLM Inference)
  - Sign up at groq.com
  - Get your API key from the dashboard
- Appwrite (File Storage)
  - Sign up at appwrite.io
  - Create a new project
  - Set up a storage bucket
  - Generate API keys
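Before starting the server, it can help to confirm that every required variable listed above is actually set. The following is a minimal sketch, not part of DocuMind: the helper name `missing_env_vars` is an assumption, and it inspects the process environment only, so it assumes the `.env` file has already been loaded (for example by Docker Compose or your shell).

```python
import os
from typing import List, Mapping

# Variable names taken from the configuration template above.
REQUIRED_VARS = [
    "PINECONE_API_KEY",
    "GROQ_API_KEY",
    "APPWRITE_ENDPOINT",
    "APPWRITE_PROJECT_ID",
    "APPWRITE_API_KEY",
    "APPWRITE_BUCKET_ID",
]


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank (hypothetical helper)."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```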
DocuMind/
├── docker-compose.yml              # Docker configuration
├── .env.example                    # Environment variables template
├── README.md                       # This file
├── docuMind_client/                # React Frontend
│   ├── Dockerfile                  # Frontend container setup
│   ├── src/
│   │   ├── components/             # UI Components
│   │   │   ├── ui/                 # Reusable UI components
│   │   │   ├── ChatSidebar.tsx
│   │   │   ├── FileList.tsx
│   │   │   └── FileUploadCard.tsx
│   │   ├── pages/                  # Page components
│   │   │   ├── ChatPage.tsx
│   │   │   ├── HomePage.tsx
│   │   │   └── UploadPage.tsx
│   │   └── hooks/                  # Custom React hooks
│   ├── package.json
│   └── vite.config.ts
│
└── docuMind_server/                # Flask Backend
    ├── Dockerfile                  # Backend container setup
    ├── app/
    │   ├── routes/                 # API Routes
    │   │   ├── upload.py           # File upload endpoints
    │   │   ├── query.py            # Chat query endpoints
    │   │   └── files.py            # File management endpoints
    │   ├── services/               # Business Logic
    │   │   ├── file_processor.py   # Document processing
    │   │   ├── file_parser.py      # Text extraction
    │   │   ├── chunking.py         # Text chunking
    │   │   ├── embeddings.py       # Vector embeddings
    │   │   ├── vectorstore.py      # Pinecone operations
    │   │   ├── llm.py              # Groq integration
    │   │   └── storage.py          # Appwrite integration
    │   └── main.py                 # Flask application
    ├── requirements.txt
    └── .env.example
DocuMind uses Docker for easy development setup. The configuration includes:
- Backend Container: Python 3.11 with Flask development server
- Frontend Container: Node.js 18 with Vite development server
- Hot Reload: Both containers support live code changes
- Volume Mounts: Source code is mounted for instant updates
# Start all services
docker-compose up
# Build and start (after code changes)
docker-compose up --build
# Run in background
docker-compose up -d
# View logs
docker-compose logs -f
# View specific service logs
docker-compose logs -f backend
docker-compose logs -f frontend
# Stop all services
docker-compose down
# Restart services
docker-compose restart
# Access container shell
docker-compose exec backend bash
docker-compose exec frontend sh
# Rebuild containers from scratch
docker-compose down
docker-compose build --no-cache
docker-compose up
# Clean up Docker resources
docker system prune -f
# Check container status
docker-compose ps
- GET /api/v1/files - List user's documents
- POST /api/v1/upload - Upload a new document
- DELETE /api/v1/files/<file_id> - Delete a document
- POST /api/v1/query - Send chat query to AI
- GET / - API health status
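As a rough illustration of how a script might address the routes above, the sketch below builds (but does not send) requests with the standard library. The base URL comes from the setup steps; the helper `build_request`, the example file id, and the JSON body shape `{"query": ...}` are assumptions, since the source does not document request payloads. A real call would also need the session cookie that identifies the user.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:5000"  # backend address from the setup steps


def build_request(method: str, path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a request for one of the documented routes (hypothetical helper)."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Routes taken from the list above; the id and body below are placeholders.
health = build_request("GET", "/")
list_files = build_request("GET", "/api/v1/files")
delete_file = build_request("DELETE", "/api/v1/files/example-id")  # hypothetical id
ask = build_request("POST", "/api/v1/query", {"query": "Summarize my document"})  # assumed body shape

if __name__ == "__main__":
    for request in (health, list_files, delete_file, ask):
        print(request.get_method(), request.full_url)
```

To actually send one of these, pass it to `urllib.request.urlopen` while the backend is running.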
Frontend:
- React 18 with TypeScript
- Vite for build tooling
- Tailwind CSS for styling
- Axios for HTTP requests
- React Router for navigation
- Shadcn/ui component library
Backend:
- Flask (Python web framework)
- Flask-CORS for cross-origin requests
- Sentence Transformers for embeddings
- PyPDF2 for PDF processing
- python-docx for Word documents
External Services:
- Pinecone (Vector Database)
- Groq (LLM Inference)
- Appwrite (File Storage)
- User Isolation: All operations are filtered by user_id from cookies
- File Processing Pipeline: Upload → Parse → Chunk → Embed → Store
- Real-time Updates: Custom events for cross-component synchronization
- Error Handling: Comprehensive error handling with user-friendly messages
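The user isolation idea above can be sketched as a similarity search that only ever considers records owned by the requesting user. This is a minimal in-memory illustration, assuming each stored chunk carries a `user_id`, a `vector`, and its `text`; the list stands in for the vector database, and the function names are hypothetical rather than DocuMind's own.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    records: List[Dict[str, Any]],
    query_vector: Sequence[float],
    user_id: str,
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Rank only the caller's own chunks by similarity to the query."""
    scored = [
        (cosine_similarity(query_vector, record["vector"]), record["text"])
        for record in records
        if record["user_id"] == user_id  # the isolation filter
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    store = [
        {"user_id": "alice", "vector": [1.0, 0.0], "text": "alice doc A"},
        {"user_id": "alice", "vector": [0.0, 1.0], "text": "alice doc B"},
        {"user_id": "bob", "vector": [1.0, 0.0], "text": "bob doc"},
    ]
    print(search(store, [1.0, 0.0], user_id="alice"))
```

Applying the filter before ranking, rather than after, means another user's chunks can never appear in the results even when they score higher.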
- Update file_parser.py with new extraction logic
- Add MIME type support in FileUploadCard.tsx
- Update file validation in both frontend and backend
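One way to picture the backend half of these steps is a registry that maps file extensions to extraction functions, so that supporting a new type means registering one more function. The sketch below is an assumption about how such logic could be organized, not the contents of `file_parser.py`; the `.md` parser is a hypothetical new format, and only stdlib-friendly formats are shown.

```python
import csv
import io
from pathlib import Path
from typing import Callable, Dict

Parser = Callable[[bytes], str]
PARSERS: Dict[str, Parser] = {}  # extension -> extraction function


def register(extension: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser for a file extension."""
    def decorator(func: Parser) -> Parser:
        PARSERS[extension.lower()] = func
        return func
    return decorator


@register(".txt")
def parse_txt(data: bytes) -> str:
    return data.decode("utf-8", errors="replace")


@register(".csv")
def parse_csv(data: bytes) -> str:
    reader = csv.reader(io.StringIO(data.decode("utf-8", errors="replace")))
    return "\n".join(" | ".join(row) for row in reader)


@register(".md")  # hypothetical new format added for illustration
def parse_markdown(data: bytes) -> str:
    return data.decode("utf-8", errors="replace")


def extract_text(filename: str, data: bytes) -> str:
    """Dispatch to the parser registered for the file's extension."""
    extension = Path(filename).suffix.lower()
    if extension not in PARSERS:
        raise ValueError(f"Unsupported file type: {extension or '(none)'}")
    return PARSERS[extension](data)


if __name__ == "__main__":
    print(extract_text("table.csv", b"name,score\nada,10\n"))
```

The same extension list would still need to be mirrored in the frontend validation, as the steps above note.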
- Port Conflicts
  # If ports 3000 or 5000 are in use, edit docker-compose.yml:
  ports:
    - "3001:3000" # Frontend
    - "5001:5000" # Backend
- Container Build Failures
  # Clean rebuild
  docker-compose down
  docker-compose build --no-cache
  docker-compose up
- Volume Mount Issues
  # On Windows, ensure drive sharing is enabled in Docker Desktop
  # On Linux, check file permissions
  sudo chown -R $USER:$USER .
- Import Errors (Manual Setup)
  # Make sure you're in the virtual environment
  pip install -r requirements.txt
- CORS Issues
  - Check that Flask-CORS is installed
  - Verify frontend is running on correct port
- File Upload Errors
  - Check file type is supported
  - Verify file size limits
  - Check Appwrite bucket permissions
- Chat Not Working
  - Verify Groq API key is valid
  - Check Pinecone connection
  - Ensure documents are properly indexed
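For the port conflict case in the list above, a quick way to see which of the default ports are already taken is to try connecting to them locally. This is a small stdlib sketch; the helper name `port_in_use` is an assumption, and the port numbers are the ones mentioned in this README.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (3000, 5000, 5173):
        state = "in use" if port_in_use(port) else "free"
        print(f"Port {port}: {state}")
```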
With Docker:
# View detailed logs
docker-compose logs -f
# Access container for debugging
docker-compose exec backend bash
Manual Setup:
export FLASK_DEBUG=1
export FLASK_ENV=development
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Built with modern React and Flask best practices
- Powered by advanced AI and vector search technologies
- Designed for scalability and user privacy
Happy document chatting!
For support or questions, please open an issue in the repository.
Note: The hosted site includes only the frontend (UI). The backend isn't hosted because it uses the Nomic open-source local embedding model. However, you can clone this repo and follow the Docker setup steps to run both frontend and backend locally.