Transform any documentation site into a searchable code database - CodeDox crawls documentation websites, intelligently extracts code snippets with context, and provides lightning-fast search via PostgreSQL full-text search and MCP (Model Context Protocol) integration for AI assistants.
For full documentation, installation guides, API reference, and more, visit:
# Clone the repository
git clone https://github.com/chriswritescode-dev/codedox.git
cd codedox
# Configure environment
cp .env.example .env
# Edit .env to add your CODE_LLM_API_KEY (optional for AI-enhanced extraction)
# Run the automated setup
./docker-setup.sh
# Access the web UI at http://localhost:5173
See the full installation guide for detailed instructions.
- Intelligent Web Crawling: Depth-controlled crawling with URL pattern filtering and domain restrictions
- Smart Code Extraction: Dual-mode extraction (Automatic Title / Description or LLM Generated Titles and Descriptions)
- Lightning-Fast Search: PostgreSQL full-text search with fuzzy matching
- GitHub Repository Processing: Clone and extract documentation from GitHub repositories with full path support (e.g.,
/tree/main/docs
) - MCP Integration: Native Model Context Protocol support for AI assistants
- Modern Web Dashboard: React + TypeScript UI for visual management
- Version Support: Track multiple versions of documentation
- Real-time Monitoring: Live crawl progress and health monitoring
- Upload Support: Upload documentation directly or from GitHub repositories (useful for repos with doc sites)
See our Contributing Guide for details on how to contribute to CodeDox.
Chris Scott - chriswritescode.dev
This project is licensed under the MIT License - see the LICENSE file for details.