A comprehensive Retrieval-Augmented Generation (RAG) application built with Next.js, featuring document processing, website scraping, and AI-powered chat functionality.
⚡ Important Note:
- Your OpenAI API key and Qdrant credentials are never stored on our servers.
- They are stored only on your local machine (for example, in environment variables in .env.local).
- This means your credentials always remain under your control and are not shared with any third-party servers.
👉 You can use your API key safely: everything stays on your local machine.
- Multiple Data Sources: Support for text input, file uploads (PDF, CSV, TXT), and website scraping
- Vector Database: Qdrant integration for efficient document storage and retrieval
- AI-Powered Chat: OpenAI GPT integration for intelligent responses based on your data
- Modern UI: Dark mode interface with light orange accent colors
- Form Validation: React Hook Form with Zod validation for better user experience
- Real-time Processing: Live feedback and progress indicators
- Flexible Deployment: Support for both local Docker and Qdrant Cloud
- Frontend: Next.js 14, React, TypeScript
- UI Components: shadcn/ui, Tailwind CSS
- Forms: React Hook Form, Zod validation
- Vector Database: Qdrant (Local Docker or Cloud)
- AI Integration: OpenAI API, LangChain
- File Processing: PDF parsing, CSV processing, web scraping
- Node.js 18+
- OpenAI API key
- One of the following for the vector database:
  - Docker (for a local Qdrant instance)
  - Qdrant Cloud account (managed service)
- Clone the repository
git clone https://github.com/BCAPATHSHALA/RAGApplication.git
cd RAGApplication
- Install dependencies
pnpm install
- Choose your vector database setup:
For Local Docker:
3a. Start Qdrant vector database
docker-compose up -d
This will start Qdrant on http://localhost:6333
3b. Set up environment variables
Create a .env.local file in the root directory:
QDRANT_URL=http://localhost:6333
OPENAI_API_KEY=your_openai_api_key_here
For Qdrant Cloud:
3a. Create a Qdrant Cloud account
- Visit https://cloud.qdrant.io/
- Sign up for a free account
- Create a new cluster
- Get your cluster URL and API key
3b. Set up environment variables
Create a .env.local file in the root directory:
QDRANT_URL=https://your-cluster-url.qdrant.io
QDRANT_API_KEY=your_qdrant_api_key
OPENAI_API_KEY=your_openai_api_key_here
- Run the development server
pnpm run dev
- Open your browser
Navigate to http://localhost:3000
The application provides an intuitive interface for configuration:
- OpenAI API Key: Enter your OpenAI API key in the API Key section
- Qdrant Configuration:
  - For local Docker: Use http://localhost:6333 (no API key needed)
  - For Qdrant Cloud: Enter your cluster URL and API key from the dashboard
You can also configure via environment variables:
- QDRANT_URL: Qdrant database connection URL
- QDRANT_API_KEY: Qdrant API key (required for cloud, optional for local)
- OPENAI_API_KEY: OpenAI API key for embeddings and chat
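The rules above (API key required for cloud, optional for local) can be sketched as a small loader. This is a minimal illustration, not the application's actual code: the `RagConfig` type, the `loadRagConfig` name, the fallback to http://localhost:6333, and the "localhost means local" check are all assumptions made for the example.

```typescript
// Hypothetical shape for the three settings listed above.
interface RagConfig {
  qdrantUrl: string;
  qdrantApiKey: string | undefined; // optional for a local instance
  openAiApiKey: string;
}

// Reads settings from an env-like object (for example process.env on the server).
function loadRagConfig(env: Record<string, string | undefined>): RagConfig {
  // Assumption: fall back to the local Docker URL when QDRANT_URL is unset.
  const qdrantUrl = env["QDRANT_URL"] || "http://localhost:6333";
  const qdrantApiKey = env["QDRANT_API_KEY"] || undefined;
  const openAiApiKey = env["OPENAI_API_KEY"];

  if (!openAiApiKey) {
    throw new Error("OPENAI_API_KEY is not set");
  }

  // Assumption: only localhost / 127.0.0.1 count as a local instance.
  const isLocal = /^https?:\/\/(localhost|127\.0\.0\.1)(:\d+)?(\/.*)?$/.test(qdrantUrl);
  if (!isLocal && !qdrantApiKey) {
    throw new Error("QDRANT_API_KEY is required for a non-local Qdrant URL");
  }

  return { qdrantUrl, qdrantApiKey, openAiApiKey };
}
```

Failing early with a clear message tends to be easier to debug than a connection error surfacing later during indexing.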
- OpenAI API Key: Enter your API key (must start with 'sk-')
- Qdrant Setup: Configure either local Docker or Qdrant Cloud connection
Text Input:
- Paste text directly into the textarea
- Minimum 10 characters required
- Text will be chunked and indexed automatically
Website Scraping:
- Enter a valid URL (must include http:// or https://)
- The system will scrape and index the website content
File Upload:
- Upload PDF, CSV, or TXT files (max 10MB)
- Files are processed and chunked automatically
- Progress feedback provided during processing
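The input rules for the three data sources can be expressed as plain validation functions. The application itself uses React Hook Form with Zod, so the dependency-free sketch below is only an illustration of the same constraints; the function names, the error messages, trimming whitespace before counting, and reading "10MB" as 10 × 1024 × 1024 bytes are assumptions.

```typescript
// Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 10 * 1024 * 1024;
const ALLOWED_EXTENSIONS = ["pdf", "csv", "txt"];

// Each validator returns null when the input is acceptable, or an error message.
function validateText(text: string): string | null {
  // Assumption: surrounding whitespace does not count toward the minimum.
  return text.trim().length >= 10 ? null : "Text must be at least 10 characters";
}

function validateUrl(url: string): string | null {
  return /^https?:\/\/\S+$/i.test(url.trim())
    ? null
    : "URL must start with http:// or https://";
}

function validateFile(fileName: string, sizeBytes: number): string | null {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    return "Only PDF, CSV, and TXT files are supported";
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return "File exceeds the 10MB limit";
  }
  return null;
}
```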
- View real-time statistics of indexed documents
- See total documents and chunks
- Review recent data sources
- Use the chat interface to ask questions about your indexed data
- The AI will provide responses based on the most relevant document chunks
- Conversation history is maintained during the session
- Create multiple chat sessions for different topics
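To make "most relevant document chunks" concrete, the sketch below ranks chunks by cosine similarity between a query embedding and each chunk embedding. In the application this search is delegated to Qdrant, so the code is only a conceptual illustration; the `Chunk` type, the function names, and the choice of cosine similarity as the metric are assumptions.

```typescript
// Hypothetical in-memory representation of an indexed chunk.
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks whose embeddings are most similar to the query.
function topKChunks(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

The selected chunks would then be passed to the chat model as context alongside the user's question.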
- POST /api/index-text - Index text content
- POST /api/index-website - Scrape and index website
- POST /api/index-file - Process and index uploaded files
- POST /api/chat - Chat with indexed data
- GET /api/rag-store - Get indexed document statistics
- DELETE /api/delete-index - Delete all indexed data
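For client code, the routes listed above can be kept in one typed table so methods and paths are not repeated as string literals. The methods and paths come from the list; the `ENDPOINTS` constant, its key names, and the `endpointUrl` helper are hypothetical. Request and response bodies are not documented here, so the sketch stops at the URL and method.

```typescript
// Methods and paths as listed above; the key names are illustrative.
const ENDPOINTS = {
  indexText: { method: "POST", path: "/api/index-text" },
  indexWebsite: { method: "POST", path: "/api/index-website" },
  indexFile: { method: "POST", path: "/api/index-file" },
  chat: { method: "POST", path: "/api/chat" },
  ragStore: { method: "GET", path: "/api/rag-store" },
  deleteIndex: { method: "DELETE", path: "/api/delete-index" },
} as const;

type EndpointName = keyof typeof ENDPOINTS;

// Joins a base URL (for example http://localhost:3000) with an endpoint path,
// dropping any trailing slashes on the base to avoid "//" in the result.
function endpointUrl(baseUrl: string, name: EndpointName): string {
  return baseUrl.replace(/\/+$/, "") + ENDPOINTS[name].path;
}
```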
- Qdrant: High-performance vector database running in Docker
- Port: 6333 (default)
- Storage: Persistent volume for data retention
- Configuration: No API key required
- Managed Service: Fully managed Qdrant instance
- Scalability: Auto-scaling based on usage
- Security: Built-in authentication and encryption
- Global: Multiple regions available
- Configuration: Requires cluster URL and API key
- Text Chunking: Recursive character text splitter (1000 chars, 200 overlap)
- PDF Processing: LangChain PDF loader for text extraction
- CSV Processing: Structured data handling with metadata
- Web Scraping: Cheerio for clean HTML content extraction
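The chunk size and overlap figures can be illustrated with a simplified fixed-window splitter. Note that this is not the recursive character text splitter the application uses (which also tries to break on separators such as paragraphs and sentences); it only shows how a 1000-character window with a 200-character overlap advances through the text. The `chunkText` name is hypothetical.

```typescript
// Simplified sliding-window chunking: each chunk starts (chunkSize - overlap)
// characters after the previous one, so neighbours share `overlap` characters.
function chunkText(text: string, chunkSize: number = 1000, overlap: number = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

With these defaults, a 2500-character input yields three chunks starting at offsets 0, 800, and 1600. The overlap keeps sentences that straddle a boundary retrievable from at least one chunk.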
- OpenAI GPT-4: For generating contextual responses
- LangChain: Document processing and RAG pipeline orchestration
- Embeddings: text-embedding-3-large for high-quality vector representations
pnpm run dev
- Build the application:
pnpm run build
- Start with Docker Compose:
docker-compose up -d
- Deploy to Vercel:
vercel deploy
- Set environment variables in Vercel dashboard:
  - QDRANT_URL (your Qdrant Cloud cluster URL)
  - QDRANT_API_KEY (your Qdrant Cloud API key)
  - OPENAI_API_KEY (your OpenAI API key)
For Qdrant Cloud:
- Use environment variables for sensitive configuration
- Enable API key authentication
- Monitor usage and costs in Qdrant Cloud dashboard
- Consider data residency requirements
For Local Docker:
- Ensure persistent storage for production data
- Configure proper backup strategies
- Monitor resource usage and scaling needs
- Secure network access to Qdrant instance
"Collection not found" errors:
- The collection is created automatically when you first index data
- Ensure Qdrant is running and accessible
- Check your Qdrant URL and API key configuration
File upload failures:
- Check file size limits (10MB max)
- Ensure supported file formats (PDF, CSV, TXT)
- Verify OpenAI API key is valid
Chat not working:
- Ensure you have indexed some data first
- Check OpenAI API key configuration
- Verify Qdrant connection is working
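One way to verify the Qdrant connection independently of the application is to list collections through Qdrant's REST API (`GET /collections`, authenticated with the `api-key` header when a key is set). The helper below only builds that request so it can be handed to any HTTP client; the `QdrantCheck` type and `buildCollectionsCheck` name are hypothetical.

```typescript
// Hypothetical description of the request used for the connectivity check.
interface QdrantCheck {
  url: string;
  headers: Record<string, string>;
}

// Builds a GET /collections request for the given Qdrant URL.
// The api-key header is added only when a key is supplied (Qdrant Cloud).
function buildCollectionsCheck(qdrantUrl: string, apiKey?: string): QdrantCheck {
  const headers: Record<string, string> = {};
  if (apiKey) {
    headers["api-key"] = apiKey;
  }
  return { url: qdrantUrl.replace(/\/+$/, "") + "/collections", headers };
}
```

On Node.js 18+, the result can be passed to the built-in `fetch`, for example `fetch(check.url, { headers: check.headers })`. A successful response suggests the URL and key are correct; a connection error or an authentication failure points at the Qdrant configuration rather than at the chat or indexing code.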
- Use Qdrant Cloud for better performance and reliability
- Index documents in smaller batches for large datasets
- Monitor OpenAI API usage and costs
- Consider chunking strategies for different document types
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT License - see LICENSE file for details
For issues and questions:
- Create an issue on GitHub
- Check the troubleshooting section
- Review the API documentation
- Visit Qdrant Cloud Documentation