Document Responder AI is a cutting-edge multi-agent AI system that revolutionizes how users interact with documents. This privacy-focused solution leverages local AI processing to provide intelligent document analysis, question answering, and knowledge extraction capabilities.
document_responder_ai_venv_chat_ui/
├── agents/
│   ├── document_processor_agent.py   # Document ingestion & preprocessing
│   ├── retriever_agent.py            # Document retrieval & search
│   └── answerer_agent.py             # Question answering with local LLaMA
├── ui/
│   └── main_app.py                   # Main Streamlit application
├── chroma_db/                        # Persistent vector database
├── docs/                             # Uploaded documents storage
└── requirements.txt                  # Python dependencies
- Python 3.8+ - Primary programming language
- Streamlit - Web framework for interactive UI
- LangChain - AI orchestration and document processing
- Ollama - Local LLaMA 3-8B model interface
- ChromaDB - Vector database for embeddings storage
- PyMuPDF - PDF text extraction
langchain # AI orchestration framework
langchain-community # Community integrations
sentence-transformers # Embedding models
chromadb # Vector database
faiss-cpu # Similarity search
unstructured[pdf] # Document parsing
pymupdf # PDF processing
streamlit # Web UI framework
torch # PyTorch for embeddings
Purpose: Document ingestion and preprocessing
- Features:
  - Extract text from PDF, DOCX, TXT files
  - Split documents into 500-character chunks with 50-char overlap
  - Generate embeddings using HuggingFace all-MiniLM-L6-v2
  - Store in persistent Chroma vector database
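The fixed-size chunking with overlap can be sketched in plain Python. This is only an illustration of the 500/50 splitting behaviour; the agent itself would delegate to a LangChain text splitter, and the function name here is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap`
    characters with its neighbour so context is not cut mid-sentence."""
    step = chunk_size - overlap  # advance 450 chars per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
    return chunks
```

Each chunk's last 50 characters reappear as the next chunk's first 50, which is what lets the retriever match queries that span a chunk boundary.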
Purpose: Document retrieval and similarity search
- Features:
  - Load Chroma vector store from disk
  - Create retriever interface for similarity search
  - Retrieve relevant document chunks for queries
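Under the hood, similarity search ranks stored chunks by the cosine similarity of their embeddings to the query embedding. A toy pure-Python version of that ranking (the real agent delegates this to Chroma's retriever interface, and these function names are illustrative only):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In the actual app the vectors are 384-dimensional (all-MiniLM-L6-v2) rather than the 2-D toys above, but the ranking logic is the same.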
Purpose: Question answering with local LLaMA
- Features:
  - Interface with local LLaMA 3-8B via Ollama
  - Run RetrievalQA chain for RAG
  - Generate contextual answers from retrieved documents
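Conceptually, a RetrievalQA-style chain just composes the retriever and the model: fetch context, stuff it into a prompt, and ask the LLM to answer from that context. A minimal sketch with pluggable callables so it runs without Ollama (in the app, `llm` would be the `llama3:8b` wrapper; the function and prompt wording here are assumptions, not the chain's exact internals):

```python
from typing import Callable

def answer_question(question: str,
                    retriever: Callable[[str], list[str]],
                    llm: Callable[[str], str]) -> str:
    """Retrieve relevant chunks, build an augmented prompt, and query the LLM."""
    context_chunks = retriever(question)
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n---\n".join(context_chunks) +
        f"\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)
```

Keeping the retriever and LLM as plain callables makes the RAG step easy to test with stubs before wiring in Chroma and Ollama.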
RAG Workflow Implementation
- Upload → User uploads documents via web interface
- Store → Files saved to `docs/` folder
- Extract → Text extracted based on file type:
  - PDF: PyMuPDF extraction
  - DOCX: Processed with python-docx
  - TXT: Direct text processing
- Chunk → Split into 500-character chunks with 50-char overlap
- Embed → Generate embeddings using HuggingFace models
- Store → Save to Chroma vector database
- Receive Query → User asks question via chat
- Retrieve Context → Similarity search in vector store
- Augment Query → Combine with relevant chunks
- Generate Answer → LLaMA 3-8B creates response
- Display Result → Show answer in chat interface
- Interface: Ollama Python client via `langchain_community.llms.Ollama`
- Model: `llama3:8b` (8-billion-parameter version)
- Access: Local HTTP API (default: http://localhost:11434)
- Configuration:

  from langchain_community.llms import Ollama
  llm = Ollama(model="llama3:8b", temperature=0.2)
- Install Ollama: Download from https://ollama.ai
- Download Model: `ollama pull llama3:8b`
- Start Service: `ollama serve`
- PDF: Extracted using PyMuPDF (fitz)
- DOCX: Processed with python-docx
- TXT: Direct text processing
- Chunk Size: 500 characters
- Chunk Overlap: 50 characters
- Embedding Model: HuggingFace all-MiniLM-L6-v2 (384 dimensions)
- Vector Database: ChromaDB with cosine similarity
- Storage: Persistent vector database in `chroma_db/`
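Putting the configuration above together, the embedding model and persistent store might be wired up as follows. This is a hedged sketch of a typical `langchain_community` setup, not the project's exact code:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# all-MiniLM-L6-v2 produces 384-dimensional embeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Re-open the persistent vector database from chroma_db/
db = Chroma(persist_directory="chroma_db", embedding_function=embeddings)
retriever = db.as_retriever()
```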
- Model Size: 8 billion parameters (LLaMA 3-8B)
- Embedding Dimensions: 384
- Response Time: typically 2-5 seconds per query (hardware-dependent)
- Supported Languages: English (primary)
- Storage: Persistent vector database
- Python: 3.8 or higher
- Memory: 8GB+ RAM recommended
- Storage: 2GB+ for model and data
- OS: Windows 10/11, macOS, or Linux
- Research Document Analysis: Extract insights from academic papers
- Legal Document Review: Analyze contracts and legal documents
- Academic Paper Q&A: Ask questions about research papers
- Technical Documentation: Query technical manuals and guides
- Report Summarization: Extract key points from reports
- Knowledge Base Creation: Build searchable document repositories
- Researchers and academics
- Legal professionals
- Students and educators
- Technical writers
- Business analysts
- Anyone needing efficient document analysis
python -m venv venv
venv\Scripts\activate   # Windows; on macOS/Linux use: source venv/bin/activate
# 1. Install dependencies
pip install -r requirements.txt
# 2. Download LLaMA 3-8B model
ollama pull llama3:8b
ollama serve
ollama run llama3:8b   # optional: test the model interactively
# 3. Run the application
streamlit run ui/main_app.py
- Upload Documents: Use the web interface to upload PDF, DOCX, or TXT files
- Ask Questions: Use the chat interface to ask questions about your documents
- Get Answers: Receive contextual answers generated by the local AI model
- Local Processing: All processing happens on your machine
- No External APIs: No data sent to external services
- Local Storage: Vector database and uploaded documents stay on your disk
- Privacy-First: Designed for sensitive document handling
Document Responder AI - Your intelligent, privacy-focused document analysis companion.



