
📚 CookbookLM


Next.js TypeScript React TailwindCSS Bun

Python Flask LangChain

Supabase PostgreSQL MongoDB

Ollama Groq OpenAI

Llama DeepSeek Qwen

Docker Kong Tesseract


🌟 Cover Page

CookbookLM is an open-source alternative to Google's NotebookLM: an intelligent document processing and note-taking application that transforms your PDFs into interactive, searchable knowledge bases. Built with modern web technologies and powered by local LLMs, it provides a secure, privacy-first alternative to cloud-based document analysis tools.

✨ Key Features

  • 📄 PDF Processing: Advanced OCR and table extraction
  • 🤖 AI-Powered Analysis: Local LLM integration with Ollama & Groq
  • 🗂️ Smart Organization: Notebook-based document management
  • 💬 Interactive Chat: Query your documents naturally
  • 📝 Note Taking: Rich text editor with AI-assisted writing
  • 🧠 Memory Management: Persistent context and conversation history
  • 🗺️ Mindmap Generation: Visual knowledge maps from documents
  • Multi-Model Support: Switch between Ollama and Groq models
  • 🔒 Privacy-First: All processing happens locally (Ollama) or securely (Groq)
  • 🌐 Real-time Collaboration: Powered by Supabase

πŸ› οΈ Tech Stack

Frontend

  • Next.js 15 - React framework with App Router
  • TypeScript - Type-safe development
  • Tailwind CSS - Utility-first styling
  • Radix UI - Accessible component primitives
  • Framer Motion - Smooth animations

Backend Services

  • Supabase - Backend-as-a-Service (Auth, Database, Storage, Realtime)
  • MongoDB - Document database for flexible data storage
  • Flask - Python web framework for PDF processing
  • Ollama - Local LLM inference server

AI & Processing

  • LangChain - LLM orchestration framework
  • Ollama - Local LLM inference (Llama, Qwen, DeepSeek, etc.)
  • Groq - High-speed cloud LLM inference
  • PDFPlumber - PDF text and table extraction
  • Tesseract OCR - Optical character recognition
  • Vector Embeddings - Semantic search capabilities
  • Mindmap Engine - Dynamic knowledge graph generation
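
These pieces combine into a fairly standard retrieval pipeline: extracted PDF text is chunked, embedded, stored for similarity search, and passed to an LLM as context. Below is a minimal sketch of that flow with LangChain and Ollama; the chunk size, model names, and in-memory vector store are illustrative assumptions rather than CookbookLM's actual configuration.

# Minimal retrieval sketch (assumptions: model names, chunk size, in-memory store).
# Requires langchain-ollama and langchain-text-splitters, plus a running Ollama
# server with `ollama pull llama3.2` and `ollama pull nomic-embed-text`.
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_ollama import ChatOllama, OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

def answer_question(pdf_text: str, question: str) -> str:
    # Split the extracted text into overlapping chunks for embedding.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    store = InMemoryVectorStore(OllamaEmbeddings(model="nomic-embed-text"))
    store.add_texts(splitter.split_text(pdf_text))

    # Retrieve the most relevant chunks and answer with a local chat model.
    context = "\n\n".join(doc.page_content for doc in store.similarity_search(question, k=4))
    llm = ChatOllama(model="llama3.2")
    reply = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
    return reply.content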

Infrastructure

  • Docker - Containerization
  • Kong - API Gateway
  • PostgreSQL - Relational database (via Supabase)

📋 Prerequisites

System Requirements

  • Node.js 18.0 or higher
  • Python 3.11 or higher
  • Docker & Docker Compose (for containerized setup)
  • Git for version control

For GPU Acceleration (Optional)

  • NVIDIA GPU with CUDA support
  • NVIDIA Container Toolkit (for Docker GPU access)

Development Tools

  • Code Editor (VS Code recommended)
  • Terminal/Command Line access

🚀 Installation

Option 1: Docker Compose (Recommended)

Quick Start

# Clone the repository
git clone https://github.com/krishmakhijani/cookbookLM.git
cd cookbookLM

# Start all services
docker-compose up -d

# Access the application
open http://localhost:3000

Service URLs

The web application is served at http://localhost:3000; the ports for the other services (Supabase, Ollama, the PDF parser service) are defined in docker-compose.yml.

Download LLM Models

# Download popular models
docker-compose exec ollama ollama pull llama3.2
docker-compose exec ollama ollama pull qwen2.5:7b
docker-compose exec ollama ollama pull deepseek-coder
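
After pulling models, you can confirm the Ollama service sees them by querying its HTTP API (the default port is 11434). The snippet below is just a convenience check, not part of the repository:

# Convenience check that Ollama is up and the pulled models are available.
# Assumes the default Ollama port 11434; adjust the URL if yours differs.
import requests

response = requests.get("http://localhost:11434/api/tags", timeout=5)
response.raise_for_status()
print("Available models:", [model["name"] for model in response.json().get("models", [])])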

Option 2: Manual Setup

1. Clone & Setup

git clone https://github.com/krishmakhijani/cookbookLM.git
cd cookbookLM

2. Database Setup

# Install and start Supabase CLI
npm install -g @supabase/cli
cd infrastructure/supabase
supabase start

# Install and start MongoDB
brew install mongodb/brew/mongodb-community
brew services start mongodb/brew/mongodb-community
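
Before moving on, it can help to confirm MongoDB is reachable. A quick ping from Python (using pymongo, installed separately if needed; the connection string assumes the default local port):

# Quick connectivity check for the local MongoDB instance (default port assumed).
# Requires: pip install pymongo
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017", serverSelectionTimeoutMS=3000)
client.admin.command("ping")  # raises if MongoDB is not reachable
print("MongoDB is up:", client.server_info()["version"])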

3. Ollama Setup

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama service
ollama serve

# Download models (in another terminal)
ollama pull llama3.2
ollama pull qwen2.5:7b

4. PDF Parser Service

cd pdfParserService

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install system dependencies (macOS)
brew install tesseract poppler

# Start the service
python app.py
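
Under the hood, the parser combines PDFPlumber's text and table extraction with a Tesseract OCR pass for scanned pages (which is why tesseract and poppler are installed above). The sketch below illustrates that pattern with the same libraries; it is not the service's actual code, and the page-by-page OCR fallback is an assumption.

# Illustrative extraction pattern: PDFPlumber first, Tesseract OCR as a fallback
# for pages with no extractable text (e.g. scans). Not the service's actual code.
# Requires: pip install pdfplumber pytesseract pdf2image (plus tesseract & poppler).
import pdfplumber
import pytesseract
from pdf2image import convert_from_path

def extract_pages(path: str) -> list[dict]:
    pages = []
    with pdfplumber.open(path) as pdf:
        for index, page in enumerate(pdf.pages):
            text = page.extract_text() or ""
            tables = page.extract_tables()
            if not text.strip():
                # Rasterize just this page and run OCR on the image.
                image = convert_from_path(path, first_page=index + 1, last_page=index + 1)[0]
                text = pytesseract.image_to_string(image)
            pages.append({"page": index + 1, "text": text, "tables": tables})
    return pages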

5. Web Application

cd web-app

# Install dependencies
npm install
# or
bun install

# Set up environment variables
cp .env.example .env.local
# Edit .env.local with your configuration

# Start development server
npm run dev
# or
bun dev

Environment Variables

Create .env.local in the web-app directory:

# Supabase Configuration
NEXT_PUBLIC_SUPABASE_URL=""
SUPABASE_GRAPHQL_URL=""
SUPABASE_S3_STORAGE_URL=""
SUPABASE_DB_URL=""
SUPABASE_INBUCKET_URL=""
SUPABASE_JWT_SECRET=""
NEXT_PUBLIC_SUPABASE_ANON_KEY=""
SUPABASE_SERVICE_ROLE_KEY=""
SUPABASE_S3_ACCESS_KEY=""
SUPABASE_S3_SECRET_KEY=""
SUPABASE_S3_REGION="local"
MONGODB_URL=""
DEVELOPMENT_URL=""
OLLAMA_BASE_URL=""
GROQ_API_KEY=""

βš–οΈ CookbookLM vs NotebookLM

| Feature | CookbookLM | NotebookLM |
|---|---|---|
| Privacy & Data Control | ✅ Complete local processing | ❌ Cloud-based processing |
| Offline Functionality | ✅ Works entirely offline (with Ollama) | ❌ Requires internet connection |
| Custom LLM Models | ✅ Support for any Groq or Ollama model | ❌ Limited to Google's models |
| Note Taking System | ✅ Rich text editor with AI assistance | ✅ Basic note creation |
| Memory Management | ✅ Persistent conversation history | ❌ Limited session memory |
| Mindmap Generation | ✅ Dynamic visual knowledge maps | ❌ No mindmap feature |
| Multi-Provider AI | ✅ Ollama + Groq integration | ❌ Google models only |
| Open Source | ✅ Fully open source | ❌ Proprietary |
| Self-Hosting | ✅ Deploy anywhere | ❌ Google Cloud only |
| PDF OCR Processing | ✅ Advanced OCR with Tesseract | ✅ Good PDF processing |
| Table Extraction | ✅ Structured table parsing | ✅ Structured table parsing |
| Real-time Collaboration | ✅ Supabase real-time features | ✅ Google Workspace integration |
| Document Chat | ✅ AI conversation | ✅ AI conversation |
| Note Organization | ✅ Flexible notebook system | ✅ Notebook organization |
| Vector Search | ✅ Semantic search capabilities | ✅ AI-powered search |
| Multi-language Support | ✅ Configurable via models | ✅ Google's language support |
| API Access | ✅ Full REST API | ❌ Limited API access |
| Customization | ✅ Highly customizable | ❌ Limited customization |
| Cost | ✅ Free (compute costs only) | ❌ Usage-based pricing |

🎯 Why Choose CookbookLM?

  • 🔒 Privacy First: Your documents never leave your infrastructure (Ollama) or use secure APIs (Groq)
  • 💰 Cost Effective: No subscription fees, pay-per-use with Groq or free with Ollama
  • 🛠️ Customizable: Modify and extend according to your needs
  • 🌍 Offline Ready: Work without internet connectivity using Ollama models
  • 🤖 Model Freedom: Use any Ollama model locally or Groq's optimized models
  • 📊 Data Ownership: Complete control over your data and processing
  • 🧠 Smart Memory: Persistent context across sessions for better conversations
  • 🗺️ Visual Learning: Generate mindmaps to understand document relationships

🤖 AI Models & Features

Supported AI Providers

Ollama (Local Processing)

  • Custom Models - Load any GGUF model

Groq (Cloud Processing)

  • OpenAI OSS 120B - Latest OpenAI OSS models with high speed
  • Deepseek R1 - Mixture of experts architecture
  • Qwen 2.5 - Alibaba's open models
  • High-Speed Inference - Optimized for real-time responses
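
Because both providers are exposed through LangChain chat models, switching between them is mostly a configuration choice. A hedged sketch (the helper and model names are illustrative, not the app's actual code):

# Illustrative provider switch between local Ollama and cloud Groq inference.
# Requires langchain-ollama and langchain-groq; Groq reads GROQ_API_KEY from the env.
from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama

def get_chat_model(provider: str):
    if provider == "groq":
        return ChatGroq(model="llama-3.1-8b-instant")  # model name is an example
    return ChatOllama(model="llama3.2")  # local inference via the Ollama server

llm = get_chat_model("ollama")
print(llm.invoke("Summarize this notebook in one sentence.").content)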

Core Features

πŸ“ Advanced Note Taking

  • Rich Text Editor with markdown support
  • AI-Assisted Writing suggestions and completions
  • Smart Formatting automatic structure detection
  • Cross-References link notes to document sections
  • Template System for consistent note organization

🧠 Memory Management

  • Conversation History: persistent across sessions
  • Context Awareness: remembers document relationships
  • User Preferences: adapts to your writing style
  • Smart Summarization of long conversations
  • Knowledge Graph: builds connections between concepts
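
One straightforward way to get this kind of persistence with the stack above is to keep chat messages in MongoDB keyed by notebook. The sketch below is purely illustrative; the collection name and document shape are assumptions, not CookbookLM's actual schema.

# Purely illustrative: persist conversation history in MongoDB per notebook.
# The database/collection names and document shape are assumptions.
from datetime import datetime, timezone
from pymongo import MongoClient

messages = MongoClient("mongodb://localhost:27017")["cookbooklm"]["messages"]

def save_message(notebook_id: str, role: str, content: str) -> None:
    messages.insert_one({
        "notebook_id": notebook_id,
        "role": role,  # "user" or "assistant"
        "content": content,
        "created_at": datetime.now(timezone.utc),
    })

def load_history(notebook_id: str, limit: int = 20) -> list[dict]:
    # Fetch the most recent messages, then return them in chronological order.
    recent = messages.find({"notebook_id": notebook_id}).sort("created_at", -1).limit(limit)
    return list(recent)[::-1]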

πŸ—ΊοΈ Mindmap Generation

  • Document Mapping visual representation of content structure
  • Concept Extraction automatic identification of key topics
  • Relationship Visualization shows connections between ideas
  • Interactive Navigation click to jump to document sections
  • Export Options save as image or interactive formats
  • Collaborative Editing real-time mindmap sharing
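
One plausible way to drive a feature like this is to ask the model for a node/edge structure and render it on the client. The prompt and JSON shape below are purely illustrative and are not taken from CookbookLM's mindmap engine.

# Purely illustrative: prompt a local model for a node/edge graph of a document.
# The prompt wording and JSON shape are assumptions, not the app's actual engine.
import json
from langchain_ollama import ChatOllama

def build_mindmap(document_text: str) -> dict:
    prompt = (
        "Extract the key concepts from the text below and return JSON with a "
        '"nodes" list (id, label) and an "edges" list (source, target, relation).\n\n'
        + document_text[:4000]  # keep the prompt within a modest context window
    )
    llm = ChatOllama(model="llama3.2", format="json")  # request JSON-formatted output
    reply = llm.invoke(prompt)
    return json.loads(reply.content)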

📖 Usage

  1. Upload Documents: Drag and drop PDF files to create new notebooks
  2. AI Processing: Documents are automatically processed and indexed with OCR and table extraction
  3. Interactive Chat: Ask questions about your documents using Ollama or Groq models
  4. Smart Notes: Create and organize notes with AI-assisted writing and suggestions
  5. Memory Management: Build persistent context that remembers your preferences and conversation history
  6. Mindmap Visualization: Generate interactive mindmaps to visualize document relationships and concepts
  7. Model Selection: Switch between local Ollama models and high-speed Groq models based on your needs
  8. Collaboration: Share notebooks with team members in real-time using Supabase

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❤️ for privacy-conscious knowledge workers

⭐ Star us on GitHub
