CookbookLM is an open-source alternative to Google's NotebookLM. It is an intelligent document processing and note-taking application that transforms your PDFs into interactive, searchable knowledge bases. Built with modern web technologies and powered by local LLMs, it offers a secure, privacy-first approach to document analysis without depending on cloud-based tools.
- PDF Processing: Advanced OCR and table extraction
- AI-Powered Analysis: Local LLM integration with Ollama & Groq
- Smart Organization: Notebook-based document management
- Interactive Chat: Query your documents naturally
- Note Taking: Rich text editor with AI-assisted writing
- Memory Management: Persistent context and conversation history
- Mindmap Generation: Visual knowledge maps from documents
- Multi-Model Support: Switch between Ollama and Groq models
- Privacy-First: All processing happens locally (Ollama) or securely (Groq)
- Real-time Collaboration: Powered by Supabase
Frontend:
- Next.js 15 - React framework with App Router
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- Radix UI - Accessible component primitives
- Framer Motion - Smooth animations

Backend:
- Supabase - Backend-as-a-Service (Auth, Database, Storage, Realtime)
- MongoDB - Document database for flexible data storage
- Flask - Python web framework for PDF processing
- Ollama - Local LLM inference server
- LangChain - LLM orchestration framework

AI/ML:
- Ollama - Local LLM inference (Llama, Qwen, DeepSeek, etc.)
- Groq - High-speed cloud LLM inference
- PDFPlumber - PDF text and table extraction
- Tesseract OCR - Optical character recognition
- Vector Embeddings - Semantic search capabilities
- Mindmap Engine - Dynamic knowledge graph generation

Infrastructure:
- Docker - Containerization
- Kong - API Gateway
- PostgreSQL - Relational database (via Supabase)
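The vector-embedding layer above depends on splitting documents into overlapping chunks before embedding them for semantic search. A minimal sketch of such a splitter (the 500-character window and 50-character overlap are illustrative assumptions, not the project's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    Overlap keeps sentences that straddle a boundary retrievable
    from both neighboring chunks.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk would then be embedded and stored alongside its source position so search hits can link back to the original page.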
- Node.js 18.0 or higher
- Python 3.11 or higher
- Docker & Docker Compose (for containerized setup)
- Git for version control
- NVIDIA GPU with CUDA support (optional, for GPU-accelerated inference)
- NVIDIA Container Toolkit (optional, for Docker GPU access)
- Code Editor (VS Code recommended)
- Terminal/Command Line access
```bash
# Clone the repository
git clone https://github.com/krishmakhijani/cookbookLM.git
cd cookbookLM

# Start all services
docker-compose up -d

# Access the application
open http://localhost:3000
```
- Web App: http://localhost:3000
- Supabase Studio: http://localhost:54323
- API Gateway: http://localhost:54321
- PDF Parser: http://localhost:5000
- MongoDB: localhost:27017
- Ollama: http://localhost:11434
```bash
# Download popular models
docker-compose exec ollama ollama pull llama3.2
docker-compose exec ollama ollama pull qwen2.5:7b
docker-compose exec ollama ollama pull deepseek-coder
```
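Once a model is pulled, Ollama's REST API can be queried directly on port 11434 via its documented `/api/generate` endpoint. A minimal non-streaming client sketch using only the Python standard library (the model name is an example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama port from the service list above

def build_request(model: str, prompt: str) -> bytes:
    # Non-streaming JSON body for Ollama's /api/generate endpoint.
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def generate(model: str, prompt: str) -> str:
    # Send the prompt and return the model's full completion text.
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
# print(generate("llama3.2", "Summarize this notebook in one sentence."))
```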
```bash
git clone https://github.com/krishmakhijani/cookbookLM.git
cd cookbookLM
```
```bash
# Install and start Supabase CLI
npm install -g @supabase/cli
cd infrastructure/supabase
supabase start
```
```bash
# Install and start MongoDB
brew install mongodb/brew/mongodb-community
brew services start mongodb/brew/mongodb-community
```
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama service
ollama serve

# Download models (in another terminal)
ollama pull llama3.2
ollama pull qwen2.5:7b
```
```bash
cd pdfParserService

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install system dependencies (macOS)
brew install tesseract poppler

# Start the service
python app.py
```
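Once PDFPlumber has extracted a table as a list of cell rows, the parser service needs to serialize it into something an LLM can read. A hedged sketch of one way to do that (the function name and the markdown output format are illustrative, not the service's actual code):

```python
def table_to_markdown(rows: list[list]) -> str:
    """Render pdfplumber-style table rows (lists of cells) as a markdown table.

    None cells (merged or blank cells in the PDF) become empty strings;
    the first row is treated as the header.
    """
    if not rows:
        return ""
    clean = [[("" if c is None else str(c).strip()) for c in row] for row in rows]
    header, body = clean[0], clean[1:]
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

Feeding tables to the model as markdown (rather than raw coordinates) keeps row/column structure intact in the chat context.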
```bash
cd web-app

# Install dependencies
npm install
# or
bun install

# Set up environment variables
cp .env.example .env.local
# Edit .env.local with your configuration

# Start development server
npm run dev
# or
bun dev
```
Create `.env.local` in the `web-app` directory:
```env
# Supabase Configuration
NEXT_PUBLIC_SUPABASE_URL=""
SUPABASE_GRAPHQL_URL=""
SUPABASE_S3_STORAGE_URL=""
SUPABASE_DB_URL=""
SUPABASE_INBUCKET_URL=""
SUPABASE_JWT_SECRET=""
NEXT_PUBLIC_SUPABASE_ANON_KEY=""
SUPABASE_SERVICE_ROLE_KEY=""
SUPABASE_S3_ACCESS_KEY=""
SUPABASE_S3_SECRET_KEY=""
SUPABASE_S3_REGION="local"
MONGODB_URL=""
DEVELOPMENT_URL=""
OLLAMA_BASE_URL=""
GROQ_API_KEY=""
```
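With this many required variables, a fail-fast check at startup saves debugging time. A small sketch that reports unset or empty keys (the REQUIRED list here is an illustrative subset, not the app's authoritative list):

```python
import os

# Illustrative subset of the variables from .env.local above.
REQUIRED = [
    "NEXT_PUBLIC_SUPABASE_URL",
    "NEXT_PUBLIC_SUPABASE_ANON_KEY",
    "MONGODB_URL",
    "OLLAMA_BASE_URL",
]

def missing_vars(env=os.environ) -> list[str]:
    # Return required keys that are unset or set to an empty string.
    return [k for k in REQUIRED if not env.get(k)]

if __name__ == "__main__":
    for key in missing_vars():
        print(f"warning: {key} is not configured")
```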
| Feature | CookbookLM | NotebookLM |
|---|---|---|
| Privacy & Data Control | ✅ Complete local processing | ❌ Cloud-based processing |
| Offline Functionality | ✅ Works entirely offline (with Ollama) | ❌ Requires internet connection |
| Custom LLM Models | ✅ Support for any Groq or Ollama model | ❌ Limited to Google's models |
| Note Taking System | ✅ Rich text editor with AI assistance | ❌ Basic note creation |
| Memory Management | ✅ Persistent conversation history | ❌ Limited session memory |
| Mindmap Generation | ✅ Dynamic visual knowledge maps | ❌ No mindmap feature |
| Multi-Provider AI | ✅ Ollama + Groq integration | ❌ Google models only |
| Open Source | ✅ Fully open source | ❌ Proprietary |
| Self-Hosting | ✅ Deploy anywhere | ❌ Google Cloud only |
| PDF OCR Processing | ✅ Advanced OCR with Tesseract | ✅ Good PDF processing |
| Table Extraction | ✅ Structured table parsing | ✅ Structured table parsing |
| Real-time Collaboration | ✅ Supabase real-time features | ✅ Google Workspace integration |
| Document Chat | ✅ AI conversation | ✅ AI conversation |
| Note Organization | ✅ Flexible notebook system | ✅ Notebook organization |
| Vector Search | ✅ Semantic search capabilities | ✅ AI-powered search |
| Multi-language Support | ✅ Configurable via models | ✅ Google's language support |
| API Access | ✅ Full REST API | ❌ Limited API access |
| Customization | ✅ Highly customizable | ❌ Limited customization |
| Cost | ✅ Free (compute costs only) | ❌ Usage-based pricing |
- Privacy First: Your documents never leave your infrastructure (Ollama) or use secure APIs (Groq)
- Cost Effective: No subscription fees; pay-per-use with Groq or free with Ollama
- Customizable: Modify and extend according to your needs
- Offline Ready: Work without internet connectivity using Ollama models
- Model Freedom: Use any Ollama model locally or Groq's optimized models
- Data Ownership: Complete control over your data and processing
- Smart Memory: Persistent context across sessions for better conversations
- Visual Learning: Generate mindmaps to understand document relationships
- Custom Models - Load any GGUF model
- OpenAI OSS 120B - Latest OpenAI open-weight models with high speed
- DeepSeek R1 - Mixture-of-experts architecture
- Qwen 2.5 - Alibaba's open models
- High-Speed Inference - Optimized for real-time responses
- Rich Text Editor with markdown support
- AI-Assisted Writing: suggestions and completions
- Smart Formatting: automatic structure detection
- Cross-References: link notes to document sections
- Template System for consistent note organization
- Conversation History: persistent across sessions
- Context Awareness: remembers document relationships
- User Preferences: adapts to your writing style
- Smart Summarization of long conversations
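Persistent conversation history usually amounts to storing every turn but sending only a recent window to the model, summarizing or dropping the rest. A minimal sketch of that pattern (the class name and the 10-turn default are illustrative assumptions, not the project's actual implementation):

```python
class ConversationMemory:
    """Store the full chat history; expose only recent turns as model context."""

    def __init__(self, max_turns: int = 10):
        self.max_turns = max_turns
        self.history: list[dict] = []

    def add(self, role: str, content: str) -> None:
        # Every turn is kept for persistence across sessions.
        self.history.append({"role": role, "content": content})

    def context(self) -> list[dict]:
        # Only the most recent turns are sent to the LLM,
        # keeping the prompt inside the model's context window.
        return self.history[-self.max_turns:]
```

Older turns that fall outside the window are where summarization would kick in, compressing them into a single synthetic turn.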
- Knowledge Graph: builds connections between concepts
- Document Mapping: visual representation of content structure
- Concept Extraction: automatic identification of key topics
- Relationship Visualization: shows connections between ideas
- Interactive Navigation: click to jump to document sections
- Export Options: save as image or interactive formats
- Collaborative Editing: real-time mindmap sharing
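A knowledge graph like the one behind these mindmaps can start from simple concept co-occurrence: two concepts mentioned in the same document section get an edge, weighted by how often they appear together. A sketch of that idea (not the project's actual mindmap engine):

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(sections: list[list[str]]) -> dict[tuple[str, str], int]:
    """Build a weighted edge list from per-section concept lists.

    sections: one list of extracted concepts per document section.
    Returns {(concept_a, concept_b): co-occurrence count}, with each
    pair stored in sorted order so (a, b) and (b, a) are the same edge.
    """
    edges: dict[tuple[str, str], int] = defaultdict(int)
    for concepts in sections:
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return dict(edges)
```

The heaviest edges are good candidates for the mindmap's primary branches; lighter ones become secondary links.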
- Upload Documents: Drag and drop PDF files to create new notebooks
- AI Processing: Documents are automatically processed and indexed with OCR and table extraction
- Interactive Chat: Ask questions about your documents using Ollama or Groq models
- Smart Notes: Create and organize notes with AI-assisted writing and suggestions
- Memory Management: Build persistent context that remembers your preferences and conversation history
- Mindmap Visualization: Generate interactive mindmaps to visualize document relationships and concepts
- Model Selection: Switch between local Ollama models and high-speed Groq models based on your needs
- Collaboration: Share notebooks with team members in real-time using Supabase
We welcome contributions! Please see our Contributing Guide for details.
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ for privacy-conscious knowledge workers