
DOCUMIND-AI

Intelligent Offline Document Q&A Assistant

DOCUMIND-AI is an AI-powered offline document question-answering system that allows users to upload documents and ask natural language questions about their content. The system uses Retrieval-Augmented Generation (RAG) with local LLMs to ensure privacy, low latency, and zero dependency on cloud APIs.

This project focuses on practical AI engineering: document ingestion, semantic search, OCR, vector indexing, and end-to-end system integration.


🚀 Key Features

  • 📄 Upload and process PDF, DOCX, and TXT documents
  • 🔍 Semantic search using FAISS vector database
  • 🧠 Context-aware Q&A using local LLMs (Ollama)
  • 🖼️ OCR support for scanned PDFs using Tesseract (see the sketch after this list)
  • 🔒 Fully offline & privacy-first architecture
  • 📱 Mobile-friendly frontend built with React Native (Expo)
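
A minimal sketch of the OCR path, assuming the pdf2image and pytesseract packages (plus the Tesseract and Poppler system binaries) are installed; the helper name and file path are hypothetical:

```python
from pdf2image import convert_from_path  # renders PDF pages to images (needs Poppler)
import pytesseract                       # Python wrapper around the Tesseract binary

def ocr_pdf(path: str) -> str:
    """Render each page of a scanned PDF to an image, then OCR it."""
    pages = convert_from_path(path)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)

text = ocr_pdf("document/scanned.pdf")  # hypothetical input file
```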

🧠 System Architecture

  1. Document Ingestion: Uploaded documents are parsed and split into chunks.

  2. OCR Processing: Scanned PDFs are run through Tesseract OCR to extract text.

  3. Embedding & Indexing: Text chunks are converted into embeddings and stored in a FAISS index.

  4. Query Processing: User queries are embedded and matched against the FAISS index.

  5. LLM Response Generation: Relevant document context is passed to a local LLM via LangChain + Ollama (see the sketch after this list).
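
Steps 1 and 3-5 can be illustrated end to end. This is a minimal sketch assuming the langchain-community package and a running Ollama server, with illustrative model names (nomic-embed-text, llama3) rather than the repository's actual choices:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama

# Stand-in for text produced by the parsing / OCR steps.
raw_text = "DOCUMIND-AI answers questions about uploaded documents, fully offline."

# Step 1: split the document into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_text(raw_text)

# Step 3: embed the chunks and build an in-memory FAISS index.
embeddings = OllamaEmbeddings(model="nomic-embed-text")  # assumed embedding model
index = FAISS.from_texts(chunks, embeddings)

# Step 4: embed the query and retrieve the most similar chunks.
question = "What does DOCUMIND-AI do?"
docs = index.similarity_search(question, k=4)
context = "\n\n".join(d.page_content for d in docs)

# Step 5: hand the retrieved context to a local LLM via Ollama.
llm = Ollama(model="llama3")  # assumed chat model
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer)
```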


🧩 Tech Stack

Backend

  • Python
  • Flask
  • LangChain
  • FAISS
  • Ollama (local LLM runtime)
  • Tesseract OCR

Frontend

  • React Native
  • Expo

📂 Project Structure

DOCUMIND-AI/
│
├── app.py                 # Flask backend entry point
├── ollama_llm.py          # Local LLM wrapper (Ollama + LangChain)
├── setup_models.py        # Script to set up required local models
├── utils.py               # Helper functions (OCR, embeddings, file handling)
│
├── document/              # Uploaded documents
├── index_store/           # FAISS vector indexes
│
├── frontend/              # React Native mobile application
│
├── requirements.txt       # Python dependencies
├── README.md              # Project documentation
└── LICENSE                # MIT License
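
The Flask entry point might take roughly this shape; the route names and payloads below are assumptions for illustration, not the actual API in app.py:

```python
import os
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files["file"]            # PDF/DOCX/TXT sent by the mobile app
    os.makedirs("document", exist_ok=True)
    f.save(f"document/{f.filename}")     # mirrors the document/ folder above
    # ...parse, OCR if needed, chunk, embed, and write a FAISS index to index_store/
    return jsonify({"status": "indexed", "file": f.filename})

@app.route("/ask", methods=["POST"])
def ask():
    question = request.get_json()["question"]
    # ...embed the question, search the FAISS index, call the local LLM via Ollama
    return jsonify({"answer": "placeholder"})

if __name__ == "__main__":
    app.run(port=5000)
```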

📊 Design Decisions

  • FAISS was chosen for fast, in-memory vector similarity search (see the persistence sketch after this list).
  • Local LLMs (Ollama) ensure data privacy and offline usability.
  • LangChain simplifies RAG pipeline orchestration.
  • OCR integration enables handling of real-world scanned documents.
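
Because FAISS runs in memory, indexes are persisted to disk between runs (the index_store/ folder above). A sketch using the langchain-community FAISS wrapper, with an assumed embedding model:

```python
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OllamaEmbeddings(model="nomic-embed-text")  # assumed model
index = FAISS.from_texts(["example chunk"], embeddings)

# Write the index and its metadata under index_store/.
index.save_local("index_store/example_doc")

# Reload it later; newer versions require opting in to pickle deserialization.
restored = FAISS.load_local(
    "index_store/example_doc",
    embeddings,
    allow_dangerous_deserialization=True,
)
```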

📈 Performance Notes

  • Average query latency depends on model size and hardware
  • Optimized for single-user, local inference
  • Suitable for personal research, study, and document analysis

🎯 Use Cases

  • Academic research paper analysis
  • Resume and document review
  • Legal or policy document exploration
  • Personal knowledge base creation

🔮 Future Improvements

  • Add unit and integration tests
  • Improve chunking and retrieval accuracy
  • Support multi-document conversation memory
  • Add admin dashboard for document management
  • Deploy backend as a containerized service

🧑‍💻 Author

Karan Shelar
GitHub: https://github.com/Edge-Explorer


📜 License

This project is licensed under the MIT License.
