IOLIB is an AI-based digital library that uses Retrieval-Augmented Generation (RAG) and similarity search to let users upload documents (PDFs and websites) and query them through AI-generated insights. The application generates a summary and embeddings for each document, making the library searchable and interactive.
- Upload and embed PDFs and websites.
- Automatic summary generation and storage.
- Search for similar content using embeddings.
- AI-powered responses and insights from documents.
- User authentication for secure access.
- Cloud-based storage for documents and embeddings.
- Planned: Rate limiting for enhanced performance.
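To illustrate the idea behind the similarity search feature listed above, here is a minimal sketch of ranking documents by cosine similarity between embeddings. It is not the project's implementation: a managed vector store such as Upstash would normally perform this ranking itself, and the `EmbeddedDoc` shape and the `cosineSimilarity` and `topK` helpers are hypothetical names used only for this example.

```typescript
// Hypothetical shape of a stored document; the real schema is not shown in this README.
interface EmbeddedDoc {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0 when either vector has zero length, to avoid dividing by zero.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions do not match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query embedding, sort by descending score,
// and keep the best k matches.
function topK(
  query: number[],
  docs: EmbeddedDoc[],
  k: number
): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is only practical for small collections; it is shown here to make the ranking step concrete, not as a replacement for the vector store.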
- Frontend: React (Client)
- Backend: Node.js, Express (Server)
- Database: MongoDB (Atlas)
- Vector Store: Upstash
- File Storage: Cloudinary
- Embeddings: Gecko
- AI and RAG: LangChain, Gemini API
- Monorepo: Custom monorepo setup
vinitngr-iolib/
├── client/ # Frontend (React, Vite)
├── server/ # Backend (Node.js, Express, TypeScript)
├── practice-js/ # JS practice and RAG experiments
├── README.md
├── package.json
└── structure.txt
- Clone the repository.
git clone https://github.com/vinitngr/IOLIB.git
- Install dependencies for both client and server:
npm install && cd client && npm install && cd ../server && npm install
- Set up environment variables:
- MongoDB URI (Atlas)
- Upstash API Key
- Gemini API Key
- Cloudinary Credentials
- Run the application:
npm run dev
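The environment variables listed in the setup steps could be checked once at startup so that a missing key fails fast instead of surfacing later as a confusing runtime error. The sketch below assumes variable names (`MONGODB_URI`, `UPSTASH_API_KEY`, `GEMINI_API_KEY`, `CLOUDINARY_URL`) that do not appear in this README; check the server code for the names it actually reads. `loadConfig` is likewise a hypothetical helper.

```typescript
// Hypothetical variable names; the README does not state the exact keys.
const REQUIRED_VARS = [
  "MONGODB_URI",
  "UPSTASH_API_KEY",
  "GEMINI_API_KEY",
  "CLOUDINARY_URL",
] as const;

type ConfigKey = (typeof REQUIRED_VARS)[number];
type AppConfig = Record<ConfigKey, string>;

// Collects every missing (or empty) variable and reports them in a single error,
// so all gaps in the environment can be fixed in one pass.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}

// Typical use in a Node.js server entry point (not executed here):
//   const config = loadConfig(process.env);
```

Passing the environment in as a parameter, rather than reading `process.env` inside the function, keeps the check easy to test.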
- Access the web app at http://localhost:3000.
- Upload a PDF or website link to generate embeddings and summaries.
- Interact with the documents via AI-driven responses and similarity search.
- Consider implementing rate limiting to manage high query volumes.
- Improve error handling and user feedback during file upload and processing.
- Evaluate the introduction of caching mechanisms to optimize frequent queries.
- Integrate Optical Character Recognition (OCR) to enable text extraction from scanned documents and images.
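As one possible starting point for the rate limiting item above, here is a minimal in-memory fixed-window limiter. It is a sketch under assumptions, not a planned design: the `FixedWindowRateLimiter` class is hypothetical, it keeps state in a single process (so it would not coordinate across multiple server instances), and it assumes `limit` is at least 1.

```typescript
// Fixed-window limiter: each key (for example a user ID or IP address) may make
// at most `limit` requests per window of `windowMs` milliseconds.
class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit; // assumed to be >= 1
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and records it.
  // `now` is injectable so the behaviour can be tested without real waiting.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    // No window yet, or the previous window has expired: start a new one.
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count < this.limit) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

In an Express server, a check like `limiter.allow(clientKey)` could run in middleware and respond with HTTP 429 when it returns `false`. A fixed window allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of a little more bookkeeping.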
Feel free to open issues and contribute by submitting pull requests.