Chat with sacred texts using advanced AI - query the Bhagavad Gita, Bible, Quran, Buddhist sutras, Tao Te Ching, and hundreds more spiritual texts with semantic search and intelligent responses.
Texts courtesy of sacred-texts.com - "the largest freely available archive of online books about religion, mythology, folklore and the esoteric on the Internet"
```bash
# 1. Clone and install
git clone https://github.com/artvandelay/sacred-text-llm
cd sacred-text-llm
pip install -r requirements.txt

# 2. Get the data (choose one):
python data/download_sacred_texts.py   # Download from sacred-texts.com (4GB, ~15 min)
# OR contact the authors for a pre-built vector database

# 3. Set up the environment (important!)
cp .env.example .env   # Copy the environment template
# Edit .env with your settings:
# - OPENROUTER_API_KEY=your-key-here (for better AI models)
# - LLM_PROVIDER=openrouter (or ollama for local-only)

# 4. Install AI models
brew install ollama    # macOS
ollama serve &
ollama pull nomic-embed-text
ollama pull qwen3:30b-a3b   # or your preferred model

# 5. Create the vector database (if you downloaded the texts)
# Processes 33M+ words into ChromaDB with semantic embeddings,
# yielding 210K+ searchable chunks for AI retrieval.
# Takes 1-2 hours, but enables semantic search.
python data/ingest.py --sources sacred_texts_archive/extracted

# 6. Start chatting!
python agent_chat.py   # Multi-mode CLI interface
./deploy/deploy.sh     # Or deploy the web interface
# Visit http://localhost:8001 or your ngrok URL
```

⏱️ Setup time: 15 minutes download + 1-2 hours processing | 💾 Space needed: ~10GB
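The ingestion step splits the raw texts into overlapping, roughly fixed-size chunks before embedding them. The actual chunker lives in `data/ingest.py`; the sketch below only illustrates the idea, and the chunk size and overlap values are illustrative assumptions, not the project's real settings.

```python
def chunk_words(text: str, chunk_size: int = 160, overlap: int = 20) -> list[str]:
    """Split text into overlapping word-window chunks for embedding."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covers the tail
    return chunks

# A 400-word text at size 160 / overlap 20 yields 3 chunks.
text = " ".join(f"w{i}" for i in range(400))
print(len(chunk_words(text)))  # → 3
```

At roughly 33.3M words over 210K+ chunks, the archive averages about 158 words per chunk, so window sizes in this ballpark are plausible.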
After setup, you can tune the AI behavior by editing your .env file:
```env
# AI Provider & Models
LLM_PROVIDER=openrouter                 # ollama (local) or openrouter (cloud)
OLLAMA_CHAT_MODEL=qwen3:30b-a3b         # Local model choice
OPENROUTER_CHAT_MODEL=anthropic/claude-3.5-sonnet   # Cloud model choice

# Deep Research Behavior
MAX_ITERATIONS_PER_QUERY=4    # Number of research cycles (1-8)
CONFIDENCE_THRESHOLD=0.75     # When to stop researching (0.1-1.0)
MAX_PARALLEL_QUERIES=10       # Search breadth per iteration (1-20)

# Search Settings
DEFAULT_SEARCH_K=5            # Results per search (1-20)
MAX_TOTAL_EVIDENCE_CHUNKS=15  # Total passages to analyze (5-50)

# UI Options
SHOW_AGENT_PROGRESS=true      # Show the agent's thinking process
SHOW_DETAILED_PROGRESS=true   # Verbose progress updates
SHOW_CONFIDENCE_SCORES=true   # Display confidence levels
```

Quick tweaks:
- Faster responses: lower `MAX_ITERATIONS_PER_QUERY` to 2
- Deeper research: increase `MAX_PARALLEL_QUERIES` to 15-20
- Better quality: switch to `LLM_PROVIDER=openrouter` with an API key
- Local only: keep `LLM_PROVIDER=ollama` (no API key needed)
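Under the hood these settings are just `KEY=VALUE` lines read into the environment. The project's real loader may differ; as a hedged sketch, a minimal `.env` reader with defaults might look like this (the `DEFAULTS` keys mirror the settings above, the parsing logic is an assumption):

```python
import os

DEFAULTS = {
    "LLM_PROVIDER": "ollama",
    "MAX_ITERATIONS_PER_QUERY": "4",
    "CONFIDENCE_THRESHOLD": "0.75",
}

def load_env(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments.
    Real environment variables override file values."""
    values = dict(DEFAULTS)
    try:
        with open(path) as fh:
            for line in fh:
                line = line.split("#", 1)[0].strip()  # drop inline comments
                if "=" in line:
                    key, _, val = line.partition("=")
                    values[key.strip()] = val.strip()
    except FileNotFoundError:
        pass  # no .env file: fall back to defaults
    for key in values:
        if key in os.environ:  # environment wins over the file
            values[key] = os.environ[key]
    return values
```

In practice many Python projects delegate this to a library such as python-dotenv rather than hand-rolling a parser.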
The v3.0 architecture introduces modes - isolated experimental features you can switch between or extend with your own ideas.
🔍 Deep Research Mode (`deep_research`)
- Iterative AI agent that plans, searches, and synthesizes comprehensive responses
- Performs 1-4 research cycles with parallel queries for thorough investigation
- Best for: Complex questions, cross-tradition comparisons, scholarly research
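The plan → search → synthesize loop described above can be sketched as follows. The function names (`plan_queries`, `search`, `synthesize`) are illustrative stand-ins, not the project's actual API; the stopping logic mirrors the `MAX_ITERATIONS_PER_QUERY` and `CONFIDENCE_THRESHOLD` settings.

```python
def deep_research(question, plan_queries, search, synthesize,
                  max_iterations=4, confidence_threshold=0.75):
    """Iteratively plan sub-queries, gather evidence, and re-synthesize
    until the answer is confident enough or the iteration budget runs out."""
    evidence = []
    answer, confidence = "", 0.0
    for _ in range(max_iterations):
        for query in plan_queries(question, evidence):  # run in parallel in the real agent
            evidence.extend(search(query))
        answer, confidence = synthesize(question, evidence)
        if confidence >= confidence_threshold:
            break  # good enough: stop researching early
    return answer, confidence

# Stub components: confidence grows as evidence accumulates.
plan = lambda q, ev: [f"{q} / angle {len(ev)}"]
search = lambda q: [f"passage for {q}"]
synthesize = lambda q, ev: (f"synthesis of {len(ev)} passages", 0.3 * len(ev))

answer, conf = deep_research("What is suffering?", plan, search, synthesize)
print(answer, conf)
```

With these stubs the loop stops after the third cycle, once confidence crosses the 0.75 threshold.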
```bash
python agent_chat.py --mode deep_research --query "How do different traditions view suffering?"
```

🧘 Contemplative Mode (`contemplative`)
- Returns a single relevant passage with a thoughtful reflection question
- Focused on personal spiritual practice and meditation
- Best for: Daily reflection, spiritual guidance, mindfulness practice
```bash
python agent_chat.py --mode contemplative --query "What is inner peace?"
```

Simpler interfaces are also available:

```bash
python scripts/chat.py                          # Simple chat interface
python scripts/query.py "What is compassion?"   # Single questions
```

Want to experiment with a new approach? The architecture makes it easy:
```bash
# 1. Generate a new mode template
python scripts/new_mode.py your_mode_name

# 2. Edit the generated file with your logic
# (app/modes/your_mode.py is created with the basic structure)

# 3. Register it in the system
# (add it to app/modes/registry.py)

# 4. Test your mode
python agent_chat.py --mode your_mode --query "test question"

# 5. Deploy instantly
./deploy/deploy.sh restart
```

Mode ideas: a koan generator, verse finder, tradition comparison, debate facilitator, meditation guide, ritual explainer, historical context, or anything you can imagine!
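What `scripts/new_mode.py` actually generates is project-specific (check `app/modes/` for the real base class and registry). As a hedged sketch of the general pattern, a mode is roughly a named class with a `run` method that a registry maps to the `--mode` flag; everything below (`MODE_REGISTRY`, `register_mode`, `KoanMode`) is an illustrative assumption, not the project's code:

```python
# Hypothetical shape of app/modes/your_mode.py and the registry.
MODE_REGISTRY: dict[str, type] = {}

def register_mode(name: str):
    """Class decorator: map a --mode name to its implementation."""
    def decorator(cls):
        MODE_REGISTRY[name] = cls
        return cls
    return decorator

@register_mode("koan")
class KoanMode:
    description = "Answer every question with a koan-style prompt."

    def run(self, query: str) -> str:
        return f"Sit with this: what asks '{query}'?"

# Dispatch the way `agent_chat.py --mode koan` might:
mode = MODE_REGISTRY["koan"]()
print(mode.run("What is compassion?"))
```

Because modes are isolated behind the registry, a broken experiment cannot affect the other modes.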
- 📖 v3.0 Architecture Guide - Complete technical overview
- 🛠️ Creating New Modes - Step-by-step mode development
- 📚 33,298,287 words (33.3 million) across 362 sacred texts
- 📜 2,200,367 lines of spiritual wisdom (~73,345 pages at 30 lines/page)
- 🌍 40+ spiritual traditions spanning all major world religions
- 🔍 210K+ semantic chunks optimized for AI retrieval
🌿 Hindu Classics: Bhagavad Gita, Upanishads, Mahabharata, Vedic literature
☸️ Buddhist Wisdom: Jataka Tales, Dhammapada, Tibetan and Zen texts
☪️ Islamic Heritage: Quran, Sufi poetry, classical Islamic philosophy
✝️ Christian Tradition: Church Fathers (Augustine, Aquinas), mystical texts
🏛️ Other Traditions: Tao Te Ching, Jewish mystical texts, Indigenous wisdom, Hermetic texts
- 🧠 Semantic Search: Find wisdom by meaning, not just keywords
- 🧪 Experimental Modes: Deep research agent + contemplative reflection modes
- 🤖 Modular AI: Switch between local models (Ollama) and cloud APIs (GPT-4, Claude)
- 🔒 Privacy-First: Keep sacred texts local, choose your AI provider, zero telemetry
- 📱 Multiple Interfaces: Chat UI, command line, or integrate via API
- ⚡ Fast & Accurate: Optimized chunking and retrieval for spiritual content
- 🏗️ Clean Architecture: Easy to add new experimental modes and features
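"Find by meaning, not just keywords" boils down to comparing embedding vectors instead of matching words. A toy illustration with hand-made 3-dimensional vectors (real embeddings from a model like `nomic-embed-text` have hundreds of dimensions, and the vectors below are fabricated for the example):

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction, 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy vectors: "dukkha" shares no words with the query "suffering",
# yet a real embedding model places the two close together.
passages = {
    "dukkha and the end of craving": [0.9, 0.1, 0.2],
    "laws of ritual purity":         [0.1, 0.8, 0.3],
}
query_vec = [0.85, 0.15, 0.25]  # pretend embedding of "what causes suffering?"

best = max(passages, key=lambda text: cosine(passages[text], query_vec))
print(best)  # → dukkha and the end of craving
```

A keyword search would find nothing here; the vector comparison still ranks the Buddhist passage first.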
🚀 v3.0.0: Clean modes architecture, unified configuration, privacy-first design, experimental modes system
- Python 3.10+ with pip
- ~10GB free space (texts + vector database)
- 2-3 hours for initial setup (mostly processing time)
```bash
git clone <your-repo-url>
cd sacred-text-llm
pip install -r requirements.txt
```

```bash
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Start the Ollama service
ollama serve &

# Install the embedding model
ollama pull nomic-embed-text

# Install the chat model
ollama pull qwen3:30b-a3b
```

```bash
# Option A: Download sacred texts (~4GB, 15 minutes)
python data/download_sacred_texts.py

# Option B: Contact the authors for a pre-built vector database

# If using Option A, create the vector database (~2GB, 1-2 hours)
python data/ingest.py --sources sacred_texts_archive/extracted --mode fast

# Verify completion
python data/check_progress.py
```

For better chat quality, set up the OpenRouter API:
```bash
export OPENROUTER_API_KEY="your-key-here"
export LLM_PROVIDER="openrouter"   # Switch from local to cloud
```

```bash
# Deploy the web interface
./deploy/deploy.sh

# Test the CLI interfaces
python agent_chat.py --list-modes
python agent_chat.py --mode contemplative --query "What is wisdom?"
```

Edit environment variables or the .env file:
```env
# AI Provider Settings
LLM_PROVIDER=ollama                    # or "openrouter"
OLLAMA_CHAT_MODEL=qwen3:30b-a3b        # Local model
OPENROUTER_CHAT_MODEL=anthropic/claude-3.5-sonnet   # Cloud model

# Database Settings
VECTOR_STORE_DIR=vector_store/chroma   # Database location
COLLECTION_NAME=sacred_texts           # Collection name

# Agent Behavior
MAX_ITERATIONS_PER_QUERY=4    # Research depth
CONFIDENCE_THRESHOLD=0.75     # Quality threshold
MAX_PARALLEL_QUERIES=10       # Search breadth
```

Vector store empty?
```bash
python data/check_progress.py
# Should show ~200K documents. If it shows 0, re-run ingestion.
```

Ollama connection issues?

```bash
ollama list      # Check installed models
ollama serve &   # Restart the service
curl http://localhost:11434/api/tags   # Test the API
```

Deployment issues?

```bash
./deploy/setup.sh check       # Validate the environment
python deploy/test_web.py     # Test the web interface
```

- Archive Size: 249 MB total (183 MB of extracted texts)
- RAG Storage: ~2.6 GB (with embeddings/indexes when complete)
- File Count: 362 texts (.txt format)
- Download Time: 10-15 minutes
- Structure: Maintains sacred-texts.com hierarchy by tradition
- Primary: sacred-texts.com/download.htm
- Alternative: Contact authors for pre-built vector database
Phase 1: Data Collection ✅ COMPLETE
- ✅ Text processing and chunking (adaptive semantic/verse/paragraph)
- ✅ Embedding generation and vector database setup
- ✅ Basic search interface development

Phase 2: AI Interface ✅ COMPLETE
- ✅ Multi-mode interface (agent_chat.py) with deep research and contemplative modes
- ✅ Simple chat interface (scripts/chat.py)
- ✅ Command-line query tool (scripts/query.py)
- ✅ Rich formatting and source attribution
- ✅ Multi-tradition knowledge synthesis

Phase 3: Architecture ✅ COMPLETE (v3.0.0)
- ✅ Clean modes system for experimental features
- ✅ Unified configuration and privacy-first design
- ✅ Streamlined codebase and deployment system
The v3.0 architecture makes it easy to contribute new modes or features:
- Create a mode: `python scripts/new_mode.py your_idea`
- Test locally: `python agent_chat.py --mode your_idea`
- Deploy instantly: `./deploy/deploy.sh restart`
See docs/creating-new-modes.md for detailed guidance.
Sacred Texts LLM v3.0.0 - Built for spiritual wisdom seekers, researchers, and AI experimenters
