doc-bot is a fully offline Retrieval-Augmented Generation (RAG) app for iOS and iPadOS. It lets you chat with your own PDF documents using AI, with all processing done locally on downloaded models—no internet connection or cloud APIs required. The interface is built with SwiftUI for a modern, native experience. Import a PDF, ask questions, and get answers powered by local language models and embeddings.
- Fully Offline RAG Chat: Chat with your imported PDF documents using AI; all retrieval, embedding, and LLM inference run on-device using only downloaded models. No cloud or online API calls.
- Multiple Conversations: Create and manage multiple conversation threads for each document, with automatic subject generation.
- Conversation Management: Switch between conversations with a side drawer interface, view conversation history, and manage chat sessions.
- PDF Import: Uses Apple PDFKit to extract text from PDF files with progress tracking and error handling.
- Chunking, Embedding & Similarity Search: Utilizes Apple's NaturalLanguage framework to split text into chunks, generate embeddings, and perform similarity search—all on-device, without Faiss or external libraries.
- Local Embedding Storage: Embeddings are saved as JSON files in the app support directory using FileManager for fast, private retrieval.
- CoreData Persistence: Documents, conversations, and messages are stored using CoreData for reliability and offline access with cascade deletion support.
- Local LLM Inference: Answers are generated using Qwen2.5-0.5B Instruct (default) or other edge-optimized GGUF models via llama.cpp integration.
- Modern SwiftUI UI: Clean, native interface with improved animations, typing indicators, and message bubbles.
- Internationalization: Multi-language support with English, Spanish, and Portuguese (Brazil) localizations.
- Customizable Themes: Light, dark, and system theme options for personalized user experience.
- Settings & Configuration: Dedicated settings view for language selection, theme customization, and app preferences.
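The similarity-search feature above can be sketched with Apple's built-in sentence embeddings. This is a simplified illustration, not the app's actual retrieval code; the function name and the choice of cosine similarity are assumptions:

```swift
import NaturalLanguage

// Sketch: rank stored chunks by cosine similarity to a query, using
// Apple's built-in sentence embeddings (a hypothetical simplification
// of the app's retrieval step).
func topChunks(for query: String, chunks: [String], k: Int = 3) -> [String] {
    guard let embedding = NLEmbedding.sentenceEmbedding(for: .english),
          let queryVector = embedding.vector(for: query) else { return [] }

    func cosine(_ a: [Double], _ b: [Double]) -> Double {
        let dot = zip(a, b).map(*).reduce(0, +)
        let normA = a.map { $0 * $0 }.reduce(0, +).squareRoot()
        let normB = b.map { $0 * $0 }.reduce(0, +).squareRoot()
        return dot / (normA * normB)
    }

    return chunks
        .compactMap { chunk in
            embedding.vector(for: chunk).map { (chunk, cosine(queryVector, $0)) }
        }
        .sorted { $0.1 > $1.1 }
        .prefix(k)
        .map { $0.0 }
}
```

`NLEmbedding` vectors are already available offline, which is what makes the no-Faiss approach practical on-device.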
- Import PDF: Select a PDF to import. The app extracts its text using PDFKit with real-time progress tracking.
- Chunking: The text is split into manageable chunks using Apple's NaturalLanguage framework, targeting optimal size for embeddings.
- Embedding Generation: Each chunk is embedded using a local embedding model (e.g., nomic-embed-text-v1.5 or bge-small-en-v1.5, in GGUF format).
- Vector Storage & Search: Embeddings are stored as JSON files in the app support directory using FileManager, and similarity search is performed using Apple's NaturalLanguage framework to find relevant chunks—no Faiss required.
- Persistence: All documents, conversations, and messages are saved using CoreData for offline access and reliability.
- Multiple Conversations: Create multiple conversation threads for each document, with automatic subject generation based on the first user message.
- Chat Interface: When you ask a question, the app finds the most relevant chunks using Apple's NaturalLanguage similarity search and uses a local LLM (Qwen2.5-0.5B Instruct or other edge AI models) via llama.cpp to generate an answer.
- Conversation Management: Switch between conversations using the side drawer, view conversation history, and manage your chat sessions.
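The import and chunking steps above can be sketched with PDFKit and NLTokenizer. The chunk-size target and function name here are hypothetical; the app's actual chunking parameters may differ:

```swift
import PDFKit
import NaturalLanguage

// Sketch: extract text from a PDF with PDFKit, then pack sentences
// into chunks with NLTokenizer.
func extractChunks(from url: URL, maxChunkLength: Int = 800) -> [String] {
    guard let document = PDFDocument(url: url) else { return [] }

    // 1. Extract text page by page.
    var fullText = ""
    for index in 0..<document.pageCount {
        fullText += document.page(at: index)?.string ?? ""
        fullText += "\n"
    }

    // 2. Split into sentences, then group sentences into chunks.
    let tokenizer = NLTokenizer(unit: .sentence)
    tokenizer.string = fullText

    var chunks: [String] = []
    var current = ""
    tokenizer.enumerateTokens(in: fullText.startIndex..<fullText.endIndex) { range, _ in
        let sentence = String(fullText[range])
        if current.count + sentence.count > maxChunkLength, !current.isEmpty {
            chunks.append(current)
            current = ""
        }
        current += sentence
        return true
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```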
The app includes a curated selection of state-of-the-art edge AI models optimized for on-device inference:
Ultra-Lightweight Models (< 0.5 GiB) - Best for resource-constrained devices:
- Qwen2.5-0.5B Instruct (Q4_K_M, 0.32 GiB) ⭐ - Default model, latest Qwen version with improved instruction following
- Qwen3-0.6B Instruct (Q8_0, 0.48 GiB) - Next-gen Qwen model with enhanced performance
- SmolLM-360M Instruct (Q4_K_M, 0.23 GiB) - HuggingFace's ultra-efficient model for maximum battery life
Lightweight Models (0.5-1 GiB) - Optimal balance of performance and efficiency:
- Qwen2.5-1.5B Instruct (Q4_K_M, 0.94 GiB) 🔥 - Recommended for best quality-to-size ratio
- Gemma-2-2B Instruct (Q4_K_M, 1.38 GiB) - Google's efficient instruction-tuned model
- StableLM-2-1.6B (Q4_K_M, 0.98 GiB) - Stability AI's mobile-optimized model
Medium Models (> 1 GiB) - Higher quality, still mobile-friendly:
- Phi-3.5-Mini Instruct (Q4_K_M, 2.2 GiB) - Microsoft's latest Phi model with improved capabilities
All models are in GGUF format and run via llama.cpp integration. While not as powerful as cloud models like Claude Sonnet 4 or GPT-4, these models provide excellent results for on-device document Q&A and work completely offline, ensuring privacy and zero latency.
- nomic-embed-text-v1.5 - High-quality text embeddings optimized for semantic search
- bge-small-en-v1.5 - Lightweight embedding model for efficient document chunking
All embedding models support GGUF format for on-device inference.
- Tabbed Interface: Easy navigation between Documents, Models, and Settings
- Document Management: Import, view, and organize your PDF documents with enhanced UI
- Chat Interface: Modern message bubbles with user/assistant differentiation and typing indicators
- Conversation Drawer: Side panel for switching between multiple conversations per document
- Progress Tracking: Real-time progress indicators for document import and model downloads
- Responsive Design: Optimized for both iPhone and iPad with adaptive layouts
- Multi-language Support: Full localization for English, Spanish, and Portuguese (Brazil)
- Theme Options: Light, dark, and system-adaptive themes
- Accessibility: Proper accessibility labels and VoiceOver support
- Localized Strings: All user-facing text is properly localized for international users
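The theme options above can be sketched with SwiftUI's `preferredColorScheme` and `@AppStorage` persistence. `AppTheme` and `RootView` are hypothetical names, not the app's actual types:

```swift
import SwiftUI

// Sketch: light/dark/system theme selection, persisted with @AppStorage.
enum AppTheme: String, CaseIterable {
    case system, light, dark

    var colorScheme: ColorScheme? {
        switch self {
        case .system: return nil          // follow the device setting
        case .light:  return .light
        case .dark:   return .dark
        }
    }
}

struct RootView: View {
    @AppStorage("appTheme") private var theme: AppTheme = .system

    var body: some View {
        Picker("Theme", selection: $theme) {
            ForEach(AppTheme.allCases, id: \.self) {
                Text($0.rawValue.capitalized)
            }
        }
        // nil means "use the system appearance"
        .preferredColorScheme(theme.colorScheme)
    }
}
```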
- All processing (PDF parsing, chunking, embedding, LLM inference) is done on-device.
- No data is sent to external servers.
- High Battery Consumption: Local processing for chunking, embedding, and LLM inference can significantly increase battery usage, especially on mobile devices.
- Device Heating: Intensive computations may cause some devices to heat up during prolonged use.
- Large Model Sizes: Even "small" models can be 1 GB or more, requiring substantial storage space on your device.
- iOS or iPadOS device with Apple Silicon recommended for best performance.
- Xcode for building and running the app.
- Clone the repository
- Open `doc-bot.xcodeproj` in Xcode
- Build and run on your device or simulator
- Download a model from the Models tab (Qwen2.5-0.5B Instruct or Qwen2.5-1.5B Instruct recommended for best results)
- Import a PDF from the Documents tab
- Start chatting with your document!
- Create multiple conversations using the conversation drawer for different topics
- Customize your experience in the Settings tab with themes and language preferences
- SwiftUI for UI with component-based architecture and reusable views
- PDFKit for PDF text extraction with progress tracking
- NaturalLanguage for chunking, embedding, and similarity search
- CoreData for persistence of documents, conversations, and messages with proper relationship management
- llama.cpp (via Swift bindings) for LLM and embedding inference with edge-optimized GGUF models (Qwen2.5-0.5B Instruct by default for optimal mobile performance)
- JSON (in App Support via FileManager) for vector storage
- Combine/Factory for dependency injection and state management
- Modular Design: Separated view components, repositories, and infrastructure layers for maintainability
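The JSON vector-storage layer above can be sketched with `FileManager` and `Codable`. `StoredEmbedding` and the file-naming scheme are hypothetical; the app's actual on-disk schema may differ:

```swift
import Foundation

// Sketch: persist chunk embeddings as JSON in Application Support.
struct StoredEmbedding: Codable {
    let chunk: String
    let vector: [Double]
}

func embeddingsURL(for documentID: String) throws -> URL {
    let support = try FileManager.default.url(
        for: .applicationSupportDirectory,
        in: .userDomainMask,
        appropriateFor: nil,
        create: true
    )
    return support.appendingPathComponent("\(documentID).json")
}

func save(_ embeddings: [StoredEmbedding], documentID: String) throws {
    let data = try JSONEncoder().encode(embeddings)
    try data.write(to: embeddingsURL(for: documentID), options: .atomic)
}

func load(documentID: String) throws -> [StoredEmbedding] {
    let data = try Data(contentsOf: embeddingsURL(for: documentID))
    return try JSONDecoder().decode([StoredEmbedding].self, from: data)
}
```

Keeping vectors in Application Support (rather than Documents) keeps them out of user-visible file listings while remaining fully local.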
- ✨ Multiple Conversations: Create and manage multiple conversation threads per document
- 🌍 Internationalization: Support for English, Spanish, and Portuguese (Brazil)
- 🎨 Theme Support: Light, dark, and system-adaptive themes
- 🗂️ Better Organization: Restructured UI components for better maintainability
- 📱 Enhanced UX: Improved animations, loading states, and user feedback
- 🔄 Conversation Switching: Side drawer for easy conversation navigation
- 📊 Progress Tracking: Real-time progress for imports and downloads
- 🏗️ Repository Pattern: Better data management with repository abstraction
- 🧪 Expanded Testing: Comprehensive test coverage including integration and performance tests
- Add New Models: Update the `Models` list to include additional GGUF models for LLM and embedding
- Custom Themes: Extend the theme system with additional color schemes
- New Languages: Add more localizations by creating new `.lproj` folders
- UI Components: Leverage the modular component architecture to add new features
- Repository Extensions: Implement additional repositories for new data types
- Embedding Models: Swap out embedding or LLM models as needed for different use cases
- Chunking Strategies: Extend chunking or retrieval logic for specific document types
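Adding a new model might look like the sketch below. `ModelDescriptor`, the catalog, and the download URLs are all hypothetical placeholders, not the app's actual types or endpoints:

```swift
import Foundation

// Sketch: one way to describe a downloadable GGUF model entry.
struct ModelDescriptor {
    let name: String
    let quantization: String
    let sizeGiB: Double
    let downloadURL: URL
}

let catalog: [ModelDescriptor] = [
    ModelDescriptor(
        name: "Qwen2.5-0.5B Instruct",
        quantization: "Q4_K_M",
        sizeGiB: 0.32,
        downloadURL: URL(string: "https://example.com/qwen2.5-0.5b-instruct-q4_k_m.gguf")!
    ),
    // Add new GGUF models here:
    ModelDescriptor(
        name: "My-New-Model",
        quantization: "Q4_K_M",
        sizeGiB: 0.5,
        downloadURL: URL(string: "https://example.com/my-new-model-q4_k_m.gguf")!
    )
]
```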
MIT License. See LICENSE file for details.
- llama.cpp
- Apple PDFKit
- Apple NaturalLanguage
- HuggingFace for model hosting
doc-bot: Your offline, private PDF AI chat companion.




