Home
Welcome to the comprehensive documentation for Lobster AI - the AI-powered multi-omics bioinformatics analysis platform. This documentation provides everything you need to use, develop, and extend Lobster AI.
Start here if you're new to Lobster AI
- 01 - Getting Started - Quick 5-minute setup guide
- 02 - Installation - Comprehensive installation instructions
- 03 - Configuration - API keys, environment setup, and model profiles
Learn how to use Lobster AI for your research
- 04 - User Guide Overview - Understanding how Lobster AI works
- 05 - CLI Commands - Complete command reference with examples
- 06 - Data Analysis Workflows - Step-by-step analysis guides
- 07 - Data Formats - Supported input/output formats
Extend and contribute to Lobster AI
- 08 - Developer Overview - Architecture and development setup
- 09 - Creating Agents - Build new specialized AI agents
- 10 - Creating Services - Implement analysis services
- 11 - Creating Adapters - Add support for new data formats
- 12 - Testing Guide - Writing and running tests
- 49 - Custom Feature Agent 🆕 - AI-powered automated feature generation with Claude Code SDK ✨
Complete API documentation
- 13 - API Overview - API organization and conventions
- 14 - Core API - DataManagerV2 and client interfaces
- 15 - Agents API - Agent tools and capabilities
- 16 - Services API - Analysis service interfaces
- 17 - Interfaces API - Abstract interfaces and contracts
Deep dive into system design
- 18 - Architecture Overview - System design and components
- 19 - Agent System - Multi-agent coordination architecture
- 20 - Data Management - DataManagerV2 and modality system
- 21 - Cloud/Local Architecture - Hybrid deployment design
- 22 - Performance Optimization - Memory and speed optimizations
Deep dives into specialized capabilities and system internals (v0.2+)
Agent Enhancements:
- 31 - Data Expert Agent Enhancements - Workspace restoration and session continuity
- 32 - Agent-Guided Formula Construction - Interactive formula design for DE analysis
- 36 - Supervisor Configuration - Dynamic agent registry and auto-discovery
- 45 - Agent Customization Advanced - Advanced agent development patterns
Content & Publication Intelligence:
- 37 - Publication Intelligence Deep Dive 🆕 - Docling integration & PDF parsing ✨
- 38 - Workspace Content Service - Type-safe caching for research content
Infrastructure & Performance:
- 35 - Download Queue System 🆕 - Robust multi-step data acquisition with JSONL persistence ✨
- 39 - Two-Tier Caching Architecture - 30-50x speedup on repeat content access
- 43 - Docker Deployment Guide - Production containerization strategies
- 47 - Fix #7: HTTPS GEO Download 🆕 - 20x reliability improvement (91% → <5% corruption) ✨
Specialized Features:
- 40 - Protein Structure Visualization 🆕 - PyMOL integration for 3D protein analysis ✨
- 43 - S3 Backend Guide - Cloud storage integration
- 46 - Multi-Omics Integration - Cross-platform analysis workflows
Migration & Maintenance:
- 41 - Migration Guides - Upgrade paths and breaking changes
- 44 - Maintaining Documentation - Documentation workflows and standards
Learn by doing with practical tutorials
- 23 - Single-Cell RNA-seq Tutorial - Complete workflow with real data
- 24 - Bulk RNA-seq Tutorial - Differential expression analysis
- 26 - Custom Agent Tutorial - Create your own agent
- 27 - Examples Cookbook - Code recipes and patterns
Help and additional resources
- 28 - Troubleshooting - Common issues and solutions
- 29 - FAQ - Frequently asked questions
- 30 - Glossary - Bioinformatics and technical terms
- Analyze single-cell RNA-seq data
- Perform bulk RNA-seq differential expression
- Download and analyze GEO datasets
- Understand the two-tier caching system
- Implement custom download workflows
- Optimize publication content access
- Visualize protein structures with PyMOL
- Deploy with Docker in production
- Natural language interface for complex bioinformatics
- 8+ specialized AI agents for different analysis domains
- Intelligent workflow coordination and parameter optimization
- Single-Cell RNA-seq: QC, clustering, annotation, trajectory analysis
- Bulk RNA-seq: pyDESeq2 differential expression with complex designs
- Multi-Omics: Integrated cross-platform analysis
- Local Mode: Full privacy with data on your machine
- Cloud Mode: Scalable computing with managed infrastructure
- Hybrid: Automatic switching between modes
- Publication-ready visualizations
- W3C-PROV compliant provenance tracking
- Comprehensive quality control metrics
- Batch effect detection and correction
Current Release: v0.2 is the first public release of Lobster AI. See the comprehensive documentation for features and upgrade information.
Content Intelligence & Publications:
- 🧬 Protein Structure Visualization - PyMOL integration for 3D protein visualization and analysis (Details)
- 🔌 ContentAccessService - Unified publication/dataset access with 5 specialized providers (Details)
- 📄 Docling PDF Parsing - Structure-aware Methods section extraction with >90% hit rate (Details)
- 📊 Table Extraction - Parameter tables from scientific publications
- 🧮 Formula Preservation - Mathematical formulas in LaTeX format
Data Management:
- 📥 Download Queue System - Robust multi-step data acquisition with JSONL persistence (Details)
- ⚡ Enhanced Two-Tier Caching - 30-50x speedup on repeat content access (0.2-0.5s cached)
- 🔄 Workspace Restoration - Seamless session continuity (Details)
- 📂 Pattern-based Dataset Loading - Smart memory management
- 💾 Session Persistence - Automatic state tracking
- 💾 WorkspaceContentService - Type-safe caching for research content (Details)
Analysis & Workflows:
- 🧪 Formula-Based Differential Expression - Complex experimental designs with R-style formulas (Details)
- 🤖 Enhanced Data Expert Agent - New restoration tools and workflows
Infrastructure:
- 🏗️ Provider Infrastructure - Modular, extensible architecture for content retrieval
- 🏗️ Agent Registry Auto-Discovery - Dynamic agent configuration (Details)
- ⌨️ Enhanced CLI - Arrow navigation and command history
- 🎨 Rich Interface - Professional orange branding
- ⚡ Performance - Optimized startup and processing
Quick reference for feature availability across deployment modes.
| Feature | Local | Cloud |
|---|---|---|
| Content Intelligence | ||
| Docling structure-aware parsing | ✅ | ✅ |
| Two-tier publication access | ✅ | ✅ |
| ContentAccessService | ✅ | ✅ |
| Provider infrastructure (5 providers) | ✅ | ✅ |
| Analysis Capabilities | ||
| Simple DE (two-group) | ✅ | ✅ |
| Formula-based DE | ✅ | ✅ |
| Agent-guided formulas | ✅ | ✅ |
| Protein visualization (batch) | ✅ | ✅ |
| Protein visualization (interactive) | ✅ | |
| Data Management | ||
| Basic workspace | ✅ | ✅ |
| WorkspaceContentService | ✅ | ✅ |
| Download queue (JSONL) | ✅ | ✅ |
| Two-tier caching | ✅ | ✅ |
| Infrastructure | ||
| Auto agent discovery | ✅ | ✅ |
| FTP retry logic | ✅ | ✅ |
Legend:
- ✅ Full support
- ⚠️ Partial support (see notes below)
Note: Interactive PyMOL visualization requires local GUI support. Cloud mode supports batch image generation only.
For detailed feature documentation, see the Migration Guides.
- GitHub Repository: github.com/the-omics-os/lobster
- Issue Tracker: Report bugs or request features
- Discord Community: Join our community
- Enterprise Support: info@omics-os.com
This documentation follows these principles:
- Progressive Disclosure: Start simple, dive deeper as needed
- Task-Oriented: Organized by what you want to accomplish
- Example-Rich: Real datasets and practical code examples
- Cross-Referenced: Links between related topics
- Maintained: Regular updates with each release
Found an issue or want to improve the documentation?
- Check our developer overview
- Submit a pull request to the docs/wiki directory
- Follow our code style guidelines
Documentation for Lobster AI v0.2+ | Last updated: 2025
Made with ❤️ by Omics-OS