Distributed AI Architecture for Offensive Security and Developer Tooling
Inspired by the octopus's distributed nervous system, OctoLLM applies principles of biological intelligence to AI architecture, aiming for better cost efficiency, security, and flexibility than monolithic LLM systems.
OctoLLM is a distributed AI system inspired by octopus neurobiology, where:
- Central Brain (Orchestrator): Strategic planning and coordination using frontier LLMs (GPT-4, Claude Opus)
- Autonomous Arms (Specialists): Domain-specific execution with local decision-making
- Reflex Layer: Fast preprocessing for common patterns without LLM involvement
- Distributed Memory: Global semantic memory + local episodic stores per arm
Biological octopuses have:
- 40M neurons in brain → Strategic planning (Orchestrator)
- 350M neurons in arms → Local intelligence (6 specialized arms)
- Direct arm-to-arm communication → Inter-component messaging without orchestrator bottleneck
- Fast reflexes → Preprocessing layer for common patterns (<10ms latency)
OctoLLM applies these principles to build a distributed AI system that is more efficient, robust, and specialized than monolithic LLM architectures.
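
The brain-and-arms split described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not OctoLLM's actual code: `ReflexLayer`, `Orchestrator`, `handle`, and the keyword-based routing rule are hypothetical names invented for this example.

```python
from typing import Callable, Dict, Optional

Arm = Callable[[str], str]  # an arm takes a task description and returns a result


class ReflexLayer:
    """Answers repeated queries from a cache so no LLM call is needed."""

    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}

    def lookup(self, query: str) -> Optional[str]:
        return self._cache.get(query)

    def remember(self, query: str, answer: str) -> None:
        self._cache[query] = answer


class Orchestrator:
    """Chooses an arm for each task; the arm does the actual work."""

    def __init__(self, arms: Dict[str, Arm]) -> None:
        self.arms = arms

    def route(self, query: str) -> str:
        # Hypothetical routing rule. A real orchestrator would plan with an LLM.
        name = "coder" if "code" in query.lower() else "retriever"
        return self.arms[name](query)


def handle(query: str, reflex: ReflexLayer, orchestrator: Orchestrator) -> str:
    """Reflex first, orchestrator only on a cache miss."""
    cached = reflex.lookup(query)
    if cached is not None:
        return cached
    answer = orchestrator.route(query)
    reflex.remember(query, answer)
    return answer
```

The point of the sketch is the ordering: the cheap reflex path runs first, and the orchestrator is consulted only when the reflex layer has no answer.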
graph TB
subgraph "Ingress Layer"
API[API Gateway]
REF[Reflex Layer<br/><10ms]
end
subgraph "Orchestration Layer"
ORCH[Orchestrator Brain<br/>GPT-4/Claude]
GMEM[(Global Memory<br/>PostgreSQL)]
end
subgraph "Execution Layer - Arms"
PLAN[Planner Arm<br/>Task Decomposition]
EXEC[Executor Arm<br/>Sandboxed Actions]
CODE[Coder Arm<br/>Code Generation]
JUDG[Judge Arm<br/>Validation]
SAFE[Guardian Arm<br/>PII Protection]
RETR[Retriever Arm<br/>Knowledge Search]
end
API --> REF
REF --> ORCH
ORCH --> PLAN & EXEC & CODE & JUDG & SAFE & RETR
ORCH <--> GMEM
- Swarm Decision-Making: Multiple arms work in parallel for high-stakes decisions
- Autonomous Execution: Arms make local decisions without constant orchestrator involvement
- Conflict Resolution: Judge arm arbitrates disagreements between proposals
- Reflex Cache: 60%+ cache hit rate for common queries (<10ms latency)
- Hierarchical Processing: Simple tasks use cheap models, complex tasks use GPT-4
- Cost Efficiency: 50% cost reduction vs monolithic LLM systems
- Capability Isolation: Time-limited JWT tokens, sandboxed execution (gVisor)
- PII Protection: Multi-layer redaction (regex, NLP, embedding-based)
- Prompt Injection Defense: Pattern matching at reflex layer
- Compliance Ready: SOC 2 Type II, ISO 27001, GDPR/CCPA
- Kubernetes Deployment: Horizontal Pod Autoscaling, multi-zone HA
- Comprehensive Monitoring: Prometheus + Grafana + Loki + Jaeger
- Disaster Recovery: 15-minute RPO, 1-hour RTO with automated backups
- Cost Optimization: Spot instances, model selection, aggressive caching
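
To make the "time-limited JWT tokens" item under Capability Isolation concrete, here is a minimal sketch of an HS256-signed JWT with an expiry, written with the Python standard library only. It is not OctoLLM's implementation: the function names, the `cap` claim, and the capability string `shell.run` are assumptions made for illustration, and a production system would more likely use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(secret: bytes, arm: str, capabilities: List[str],
                ttl_seconds: int, now: Optional[float] = None) -> str:
    """Issue a JWT that names an arm, its capabilities, and an expiry time."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    # "cap" is a hypothetical private claim; "sub" and "exp" are standard.
    payload = {"sub": arm, "cap": capabilities, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url_encode(_sign(secret, signing_input))


def verify_token(secret: bytes, token: str, capability: str,
                 now: Optional[float] = None) -> bool:
    """Accept only a correctly signed, unexpired token granting the capability."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(secret, header_b64 + "." + payload_b64)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return False
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return False
    current = time.time() if now is None else now
    return current < payload["exp"] and capability in payload["cap"]
```

A short expiry limits how long a leaked token is useful, and checking the capability list on every call keeps an arm from acting outside what it was granted.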
Current Sprint: Sprint 1.2 ✅ COMPLETE (2025-11-15)
Sprint Status: Orchestrator Core complete - production-ready
Next Sprint: Phase 1 Sprint 1.3 (Planner Arm Integration) - Ready to start
Overall Progress: Phase 0: 100% ✅ | Phase 1: ~22% (2/9 sprints complete: Reflex Layer + Orchestrator)
Version: 1.2.0 | Sprint 1.2 Completion: November 15, 2025 | Sprint 1.3 Start: TBD
Documentation Reorganization (2025-11-16):
- ✅ Comprehensive mdBook Site: 134 pages across 12 major sections deployed to GitHub Pages
- Live Site: https://doublegate.github.io/OctoLLM/
- Full-text search with 17.1MB search index
- Hierarchical navigation with breadcrumbs
- Complete OpenAPI specifications for all 8 components
- ✅ Phase 0 Archive: Historical record preserved at docs/phases/PHASE-0-README-ARCHIVE.md
- ✅ README Streamlined: Reduced by 291 lines (26.5%) by archiving Phase 0 content
- ✅ 100% Documentation Coverage: All project content integrated into searchable mdBook format
Codecov Integration (2025-11-16):
- ✅ Coverage Analytics: 80% project target, 85% patch target with branch coverage
- ✅ Test Analytics: JUnit XML reporting for pytest with automated result uploads
- ✅ Bundle Analysis: Configuration prepared for TypeScript SDK optimization
- ✅ Component Flags: 10 granular coverage flags (orchestrator, reflex-layer, 6 arms, 2 SDKs)
- ✅ CI/CD Integration: Automated coverage and test result uploads in GitHub Actions
- ✅ Advanced Features: Carryforward flags for stable components, after_n_builds for accuracy
Sprint 1.2 Completion (2025-11-15):
- ✅ Orchestrator Core: 1,776 lines Python, 87 tests passing, 85%+ coverage
- ✅ Reflex Layer Integration: Circuit breaker pattern, retry logic, health checks
- ✅ Database Layer: Async SQLAlchemy 2.0 + PostgreSQL with connection pooling
- ✅ Complete OpenAPI specification and comprehensive documentation
| Component | Status | Details |
|---|---|---|
| CI/CD Pipeline | ✅ Complete | 4 workflows operational (lint, test, security, build) |
| Security Scanning | ✅ Complete | Multi-layer: SAST (Bandit), dependencies (Snyk, cargo-audit), secrets (gitleaks) |
| Test Framework | ✅ Complete | pytest + cargo test + Codecov integration |
| Development Environment | ✅ Complete | Docker Compose with 13 services, hot-reload support |
| Pre-commit Hooks | ✅ Complete | 15+ quality checks (Black, Ruff, mypy, rustfmt, clippy) |
| Container Builds | ⏸️ Disabled | Multi-arch builds configured, will enable in Phase 1 |
| Documentation | ✅ Complete | 170+ files, ~243,210 lines (includes Sprint 0.10 validation & Phase 1 planning) |
| API Documentation & SDKs | ✅ Complete | TypeScript SDK, Postman/Insomnia collections, 8 service docs, 6 schema docs, 6 diagrams |
| Monitoring & Observability | ✅ Complete | Grafana (6 dashboards), Prometheus (50+ alerts), Loki, Jaeger (Sprint 0.9) |
Phase 0 established the foundational infrastructure for OctoLLM development. For complete details on all 10 sprints, achievements, and metrics, see:
Phase 0: Foundation - Complete Archive - 100% Complete (November 10-13, 2025)
Key Highlights:
- 10/10 sprints completed (Repository setup, Dev environment, CI/CD, API specs, Cloud IaC, Local deployment, Monitoring)
- 170+ documentation files (~243,210 lines)
- 4 GitHub Actions workflows operational
- 3 deployment options (Docker Compose, GCP/Kubernetes, Unraid)
- $15,252/year cloud cost savings (vs AWS)
- 96/100 security score, 0 critical vulnerabilities
| Component | Phase 0 Status | Phase 1 Target |
|---|---|---|
| Orchestrator | ✅ Docker + config | Full implementation (FastAPI + LLM integration) |
| Reflex Layer | ✅ Docker + minimal Rust | Full implementation (PII detection, caching, rate limiting) |
| Planner Arm | ✅ Docker + config | Full implementation (task decomposition) |
| Executor Arm | ✅ Docker + minimal Rust | Full implementation (sandboxed execution) |
| Coder Arm | ✅ Docker + config | Phase 2 implementation |
| Judge Arm | ✅ Docker + config | Phase 2 implementation |
| Retriever Arm | ✅ Docker + config | Phase 2 implementation |
| Safety Guardian | ✅ Docker + config | Phase 2 implementation |
| PostgreSQL | ✅ Operational | Schema implementation in Phase 1 |
| Redis | ✅ Operational | Full caching logic in Phase 1 |
| Qdrant | ✅ Operational | Vector embeddings in Phase 2 |
Note: Phase 0 focuses on infrastructure setup. Full service implementation begins in Phase 1 (December 2025).
- Architecture: 3 files, 5,550 lines (system overview, data flow, swarm decision-making)
- Components: 11 files (orchestrator, reflex layer, 6 arms + 2 infrastructure)
- Implementation: 7 files (getting started, dev environment, custom arms, memory systems)
- Security: 6 files, 22,394 lines (threat model, PII protection, compliance)
- Operations: 11 files (deployment, monitoring, DR, scaling, troubleshooting, monitoring runbook, alert procedures)
- Engineering: 5 files, 3,360 lines (coding standards, error handling, logging)
- API: 23 files (API-OVERVIEW, 8 services, 6 schemas, collections, OpenAPI specs)
- Architecture Diagrams: 6 files, 1,544 lines (Mermaid diagrams for visualization)
- ADR: 7 files (architecture decisions - includes ADR-007 Unraid)
- Sprint Reports: 9 files (completion reports for Sprints 0.3-0.9)
- TODOs: 15 files (MASTER-TODO + phase-specific + Sprint 0.6 tracking)
OctoLLM uses GitHub Actions for continuous integration and deployment with 4 comprehensive workflows:
Status: ✅ PASSING | Duration: ~1m 9s | Triggers: Push/PR to main/develop
- Python: Ruff (linting + import sorting), Black (formatting), mypy (type checking)
- Rust: rustfmt (formatting), clippy (linting with `-D warnings`)
- Features: Dependency caching, concurrency control, automatic failure detection
Status: ✅ PASSING | Duration: ~2m 30s | Triggers: Push/PR to main/develop
- Python Unit Tests: pytest on Python 3.11 and 3.12 with matrix strategy
- Rust Unit Tests: cargo test for reflex-layer and executor services
- Integration Tests: PostgreSQL 15 + Redis 7 services with health checks
- Coverage: Codecov integration with XML reports
- Phase 0 Note: Placeholder tests validate project structure (real tests in Phase 1)
Status: ✅ PASSING | Duration: ~3m 0s | Triggers: Push/PR, daily at midnight UTC
- SAST: Bandit for Python code vulnerabilities
- Dependencies: Snyk (Python packages), cargo-audit (Rust crates)
- Secrets: gitleaks for credential detection (full git history scan)
- Container: Trivy (disabled in Phase 0, will enable in Phase 1)
- Integration: SARIF results to GitHub Security tab, 30-day artifact retention
Status: ⏸️ DISABLED (Phase 0) | Duration: ~7s (summary only) | Triggers: Push/PR, version tags
- Multi-arch: linux/amd64, linux/arm64
- Registry: GitHub Container Registry (GHCR)
- Services: 8 total (orchestrator, reflex-layer, 6 arms)
- Features: BuildKit caching, automatic tagging, post-build vulnerability scanning
- Phase 1 Activation: Change `if: false` to `if: true` in the workflow file
Sprint 0.3 Security Fixes (2025-11-11):
- ✅ Fixed 7 CVEs: 4 HIGH + 3 MEDIUM severity
- ✅ python-multipart: DoS and ReDoS vulnerabilities patched
- ✅ starlette: DoS vulnerabilities patched
- ✅ langchain: SQL injection and DoS vulnerabilities patched
- ✅ Daily automated security scans operational
Current Security Posture:
- Multi-layer scanning (SAST, dependency, secrets)
- Zero HIGH/CRITICAL vulnerabilities in production dependencies
- Automated daily scans with GitHub Security tab integration
.github/workflows/
├── lint.yml # Code quality checks
├── test.yml # Unit + integration tests
├── security.yml # Security scanning
└── build.yml # Docker image builds
Required:
- Docker 24.0+ and Docker Compose 2.20+
- Git
- Text editor (VS Code recommended)
Optional (for local development without Docker):
- Python 3.11+ (for Python services)
- Rust 1.82.0+ (for Rust services)
- OpenAI API key (for LLM functionality in Phase 1+)
# Clone repository
git clone https://github.com/doublegate/OctoLLM.git
cd OctoLLM
# Install pre-commit hooks (recommended)
python -m pip install pre-commit
pre-commit install
# Verify setup - run linting and formatting checks
make lint # or: ruff check . && black --check . && cargo fmt --check
# Run placeholder tests (validates project structure)
make test # or: pytest tests/ -v
# View all available commands
make help
# Start development environment (placeholder services + databases)
cd infrastructure/docker-compose
docker-compose -f docker-compose.dev.yml up -d
# Check service health
docker-compose -f docker-compose.dev.yml ps
# View logs for specific service
docker-compose -f docker-compose.dev.yml logs -f orchestrator
# Stop environment
docker-compose -f docker-compose.dev.yml down
OctoLLM Services (placeholder in Phase 0):
- Orchestrator: http://localhost:8000
- Reflex Layer: http://localhost:8080 (minimal Rust service)
- Planner Arm: http://localhost:8001
- Coder Arm: http://localhost:8003
- Judge Arm: http://localhost:8004
- Retriever Arm: http://localhost:8002
- Safety Guardian: http://localhost:8005
- Executor Arm: http://localhost:8006 (minimal Rust service)
Infrastructure Services (fully operational):
- PostgreSQL: localhost:15432 (user: octollm, db: octollm)
- Redis: localhost:6379
- Qdrant: localhost:6333 (REST API), localhost:6334 (gRPC)
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
Note: Phase 0 services are minimal "hello world" placeholders. Full implementation begins in Phase 1 (December 2025).
# Make code changes
# ...
# Run pre-commit hooks manually (before commit)
pre-commit run --all-files
# Run tests
pytest tests/unit/ -v
cargo test --manifest-path services/reflex-layer/Cargo.toml
# Format code
black .
cargo fmt --manifest-path services/reflex-layer/Cargo.toml
# Check linting
ruff check .
cargo clippy --manifest-path services/reflex-layer/Cargo.toml -- -D warnings
# Commit changes (pre-commit hooks run automatically)
git add .
git commit -m "feat: your feature description"
See Local Development Guide for detailed instructions.
- Vulnerability Assessment: Swarm of arms analyzes code from multiple perspectives (OWASP, STRIDE, pentesting)
- Exploit Development: Coder arm generates exploits, Judge arm validates, Guardian ensures ethical boundaries
- Reconnaissance: Retriever arm aggregates OSINT from multiple sources
- Code Review: 4-arm swarm checks style, performance, security, test coverage
- Documentation Generation: Coder arm writes docs, Judge validates accuracy
- Debugging Assistance: Retriever finds similar issues, Planner suggests fix strategies
- Literature Review: Retriever arm queries arXiv, Google Scholar, GitHub
- Comparative Analysis: Multiple arms research alternatives, Judge synthesizes findings
- Technical Writing: Coder drafts content, Judge ensures accuracy
- Phase 0: Foundation - 100% Complete (November 10-13, 2025)
- Infrastructure setup, CI/CD, API specifications, cloud provisioning, monitoring
- 10/10 sprints completed in <1 week (>50% faster than 2-week estimate)
Timeline: December 2025 (estimated 2 weeks) | Prerequisites: Phase 0 complete
Goal: Minimal end-to-end task execution with 4 core components
Deliverables:
- Reflex Layer implementation (Rust) - <10ms latency, PII detection, caching
- Orchestrator core implementation (Python) - Task routing, LLM integration
- Planner Arm implementation (Python) - Task decomposition with GPT-3.5-turbo
- Executor Arm implementation (Rust) - Sandboxed command execution
- Basic end-to-end workflow testing
- E2E tests with >90% success rate
Success Criteria:
- All 4 components deployed and healthy
- E2E tests passing (>90% success rate)
- Latency targets met (P99 <30s)
- Security tests passing (no sandbox escapes)
Timeline: Estimated 8-10 weeks | Prerequisites: Phase 1 complete
Deliverables:
- 4 additional arms (Coder, Judge, Retriever, Safety Guardian)
- Distributed memory system (PostgreSQL + Qdrant + Redis)
- Kubernetes production deployment
- Swarm decision-making
- Load testing (1,000 concurrent tasks)
See MASTER-TODO.md for complete 7-phase roadmap (420+ tasks).
Phase 3: Operations & Deployment (4-6 weeks) - Monitoring, alerting, disaster recovery
Phase 4: Engineering & Standards (3-4 weeks) - Code quality, testing, documentation
Phase 5: Security Hardening (8-10 weeks) - Capability isolation, PII protection, penetration testing
Phase 6: Production Readiness (8-10 weeks) - Autoscaling, cost optimization, compliance certification
Timeline: 12 months estimated (36-48 weeks)
Budget: ~$177,900 total (37 sprints, 420+ tasks)
Team Size: 5-8 engineers (mixed skills) | Phase 0: Single developer | Phase 1: 3-4 engineers
Overall Progress: 100% Phase 0 complete (10/10 sprints) | Phase 0 COMPLETE ✅
Documentation: 170+ files, ~243,210 lines
Next Milestone: Phase 1 Sprint 1.1 (Reflex Layer Implementation) - 8.5 weeks, 340 hours
Phase 0 Completion: November 13, 2025 ✅
Phase 1 Start: TBD (team onboarding in progress)
Production Launch: Estimated Q3 2026
| Language | Version | Usage | Phase 0 Status | Phase 1+ Status |
|---|---|---|---|---|
| Python | 3.11+ | Orchestrator, 5 Arms | ✅ Configured (pyproject.toml, linting) | Full implementation |
| Rust | 1.82.0 | Reflex Layer, Executor | ✅ Configured (Cargo.toml, linting) | Full implementation |
| Framework | Version | Purpose | Phase 0 Status | Phase 1+ Status |
|---|---|---|---|---|
| FastAPI | Latest | Python web services | ✅ Dependencies configured | API implementation |
| Axum | 0.7+ | Rust web services | 📋 Planned | Implementation |
| Database | Version | Purpose | Phase 0 Status | Phase 1+ Status |
|---|---|---|---|---|
| PostgreSQL | 15 | Global memory | ✅ Operational (Docker, port 15432) | Schema + queries |
| Redis | 7 | Caching | ✅ Operational (Docker, port 6379) | Cache logic |
| Qdrant | 1.7 | Vector store | ✅ Operational (Docker, ports 6333-6334) | Embeddings |
| Tool | Purpose | Phase 0 Status | Details |
|---|---|---|---|
| GitHub Actions | CI/CD automation | ✅ Operational | 4 workflows (lint, test, security, build) |
| Codecov | Coverage reporting | ✅ Integrated | Coverage uploads for Python 3.11 |
| Pre-commit | Quality gates | ✅ Operational | 15+ hooks (Black, Ruff, mypy, rustfmt, clippy) |
| Tool | Purpose | Phase 0 Status | Details |
|---|---|---|---|
| Bandit | SAST (Python) | ✅ Operational | Scanning services/orchestrator, services/arms |
| Snyk | Dependency scanning (Python) | ✅ Operational | HIGH+ severity threshold, SARIF output |
| cargo-audit | Dependency scanning (Rust) | ✅ Operational | Auditing reflex-layer + executor |
| gitleaks | Secret detection | ✅ Operational | Full git history scanning |
| Trivy | Container scanning | ⏸️ Disabled (Phase 0) | Will enable in Phase 1 |
| Tool | Version | Purpose | Phase 0 Status | Phase 1+ Status |
|---|---|---|---|---|
| Docker | 24+ | Containerization | ✅ Complete | Production builds |
| Docker Compose | 2.20+ | Dev orchestration | ✅ Complete (13 services) | Continued use |
| Kubernetes | 1.28+ | Production orchestration | 📋 Planned | Phase 2 deployment |
| Terraform | 1.6+ | Infrastructure as Code | 📋 Planned | Phase 2 provisioning |
| Tool | Purpose | Phase 0 Status | Details |
|---|---|---|---|
| Poetry | Python dependency management | ✅ Configured | pyproject.toml with 80+ lines |
| Cargo | Rust build system | ✅ Configured | Workspace with 2 members |
| Black | Python formatting | ✅ Operational | Line length: 100 |
| Ruff | Python linting | ✅ Operational | Import sorting + 50+ rules |
| mypy | Python type checking | ✅ Operational | Strict mode with plugins |
| rustfmt | Rust formatting | ✅ Operational | Edition 2021 |
| clippy | Rust linting | ✅ Operational | -D warnings (treat as errors) |
| VS Code Devcontainer | Unified dev environment | ✅ Complete | 14 extensions, 13 ports |
| Makefile | Task automation | ✅ Complete | 20+ commands |
| Tool | Purpose | Phase 0 Status | Phase 3+ Status |
|---|---|---|---|
| Prometheus | Metrics collection | ✅ Complete (Sprint 0.9) | Production deployment + 50+ alerts |
| Grafana | Visualization | ✅ Complete (Sprint 0.9) | 6 dashboards (cluster, namespaces, services, logs) |
| Loki | Log aggregation | ✅ Complete (Sprint 0.9) | GCS backend, tiered retention policies |
| Jaeger | Distributed tracing | ✅ Complete (Sprint 0.9) | OTLP endpoints, OpenTelemetry instrumentation |
| Provider | Models | Purpose | Status |
|---|---|---|---|
| OpenAI | GPT-4, GPT-4 Turbo, GPT-3.5-turbo | Orchestrator, Planner, Coder | 📋 Phase 1 |
| Anthropic | Claude 3 Opus, Sonnet | Alternative provider | 📋 Phase 1 |
| Local | vLLM, Ollama | Cost optimization | 📋 Phase 3 |
Legend: ✅ Complete/Operational | ⏸️ Disabled (temporary) | 📋 Planned | 🚧 In Progress
| Metric | Target | Measurement |
|---|---|---|
| Reflex Cache Hit Rate | > 60% | octollm_cache_hits_total / octollm_tasks_total |
| P50 Latency | < 2s | histogram_quantile(0.5, octollm_task_duration_seconds) |
| P95 Latency | < 10s | histogram_quantile(0.95, octollm_task_duration_seconds) |
| P99 Latency | < 30s | histogram_quantile(0.99, octollm_task_duration_seconds) |
| Task Success Rate | > 95% | octollm_tasks_total{status="success"} / octollm_tasks_total |
| Cost per Task | < 50% baseline | Token usage metrics vs monolithic LLM |
| PII Leakage Rate | < 0.1% | Manual evaluation + automated scanning |
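
The ratios and latency thresholds in this table can be checked against raw numbers with a short script. The sketch below is an assumption-laden illustration, not part of the project: `ratio`, `percentile`, and `meets_targets` are hypothetical helpers, and the percentile is computed by nearest rank over raw samples, whereas Prometheus's `histogram_quantile` estimates quantiles from histogram buckets.

```python
import math
from typing import Dict, Sequence


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when there is no traffic yet."""
    return numerator / denominator if denominator else 0.0


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for 0 < q <= 1."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def meets_targets(cache_hits: int, successes: int, total_tasks: int,
                  durations_seconds: Sequence[float]) -> Dict[str, bool]:
    """Compare observed values with the targets listed in the table."""
    return {
        "cache_hit_rate": ratio(cache_hits, total_tasks) > 0.60,
        "p50_latency": percentile(durations_seconds, 0.50) < 2.0,
        "p95_latency": percentile(durations_seconds, 0.95) < 10.0,
        "p99_latency": percentile(durations_seconds, 0.99) < 30.0,
        "task_success_rate": ratio(successes, total_tasks) > 0.95,
    }
```

Calling `meets_targets(70, 98, 100, durations)` with a list of task durations returns one boolean per metric, which makes it easy to see which target is being missed.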
- Getting Started - 15-minute quick-start
- Development Environment - Docker Compose setup
- Custom Arms Guide - Build your own specialized arms
- Testing Guide - Unit, integration, E2E testing
- Debugging Guide - Troubleshooting playbooks
- System Overview - High-level architecture
- Data Flow - Request processing pipeline
- Swarm Decision-Making - Multi-arm consensus
- ADRs - Architecture Decision Records
- Deployment Guide - Docker Compose + Kubernetes
- Monitoring & Alerting - Prometheus + Grafana
- Disaster Recovery - Backup and restore
- Performance Tuning - Optimization strategies
- Scaling Guide - HPA, VPA, cluster autoscaling
- Security Overview - Defense in depth
- Threat Model - STRIDE analysis (5,106 lines)
- PII Protection - GDPR/CCPA compliance
- Security Testing - SAST, DAST, penetration testing
- Compliance - SOC 2, ISO 27001, HIPAA
- Master TODO - Complete 7-phase roadmap (420+ tasks)
- Phase 0 TODO - Project setup (45 tasks, 2 weeks)
- Status & Progress - Sprint reports and tracking
- Pre-Phase 0 Readiness Report - Documentation audit
We welcome contributions! Please see:
- CONTRIBUTING.md - Contribution guidelines
- CODE_OF_CONDUCT.md - Community standards
- SECURITY.md - Vulnerability disclosure policy
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes (follow coding standards)
- Write tests (target: 85%+ coverage)
- Run linters and tests locally
- Commit with conventional commits (`feat:`, `fix:`, `docs:`, `test:`)
- Push to your fork
- Open a Pull Request
DO NOT open public issues for security vulnerabilities.
Please report security issues to: [email protected]
We will acknowledge your email within 24 hours and provide a fix timeline within 7 days. We practice coordinated disclosure with a 90-day embargo.
See SECURITY.md for full policy.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- Permissive open-source license
- Patent protection for contributors and users
- Compatible with commercial use
- Requires preservation of copyright and license notices
- Octopus vulgaris research on distributed nervous systems
- Neuroscience studies on autonomous arm control
- Swarm intelligence and consensus algorithms
- LangChain / LlamaIndex for LLM orchestration patterns
- Ray for distributed Python execution
- Kubernetes for container orchestration
- Prometheus ecosystem for observability
- OpenAI, Anthropic for LLM APIs
- PostgreSQL, Redis, Qdrant for data persistence
- FastAPI, Axum for web frameworks
- Docker, Kubernetes for deployment
- Project Lead: TBD
- Security: [email protected]
- General Inquiries: [email protected]
- GitHub Issues: github.com/doublegate/OctoLLM/issues
OctoLLM is currently in Phase 0 (Project Setup & Infrastructure) and not yet accepting external contributions. We're focused on establishing the foundational architecture and CI/CD infrastructure before opening to the community.
- Team: Single developer (Phase 0)
- Status: Pre-implementation (infrastructure and documentation)
- Phase 0 Completion: Late November 2025
- Phase 1 Start: December 2025
Once we enter Phase 1, we will welcome contributions! The planned process:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Install pre-commit hooks (`pre-commit install`)
- Make your changes following our coding standards
- Test thoroughly (target: 85% Python coverage, 80% Rust coverage)
- Commit using Conventional Commits format (`feat:`, `fix:`, `docs:`, `test:`)
- Push to your fork
- Open a Pull Request
Code Quality (enforced by CI/CD):
- Python: Black (formatting), Ruff (linting), mypy (type checking)
- Rust: rustfmt (formatting), clippy (linting with `-D warnings`)
- Commits: Conventional Commits format
- Coverage: 85% Python, 80% Rust minimum (Phase 1+)
CI/CD Requirements (must pass):
- ✅ Lint workflow (Python + Rust)
- ✅ Test workflow (unit + integration)
- ✅ Security workflow (SAST + dependency scanning)
- ✅ Pre-commit hooks (15+ checks)
# Clone repository
git clone https://github.com/doublegate/OctoLLM.git
cd OctoLLM
# Install pre-commit hooks
python -m pip install pre-commit
pre-commit install
# Run pre-commit checks manually
pre-commit run --all-files
# Run tests (Phase 1+)
pytest tests/ -v --cov=services
cargo test --manifest-path services/reflex-layer/Cargo.toml
# Start development environment
cd infrastructure/docker-compose
docker-compose -f docker-compose.dev.yml up -d
Be respectful, inclusive, and collaborative. A detailed code of conduct will be published when external contributions open in Phase 1.
- Watch this repository for updates
- Star to show your interest
- Follow development progress in CHANGELOG.md
- Read sprint completion reports in to-dos/status/
See CONTRIBUTING.md for comprehensive guidelines (will be expanded for Phase 1).
- Octopus vulgaris research on distributed nervous systems and autonomous control
- Neuroscience studies on swarm intelligence and consensus algorithms
- Nature's solution to distributed computing challenges
- LangChain and LlamaIndex for LLM orchestration patterns
- OpenAI and Anthropic for frontier LLM capabilities
- PostgreSQL, Redis, and Qdrant for data persistence
- FastAPI and Axum for high-performance web frameworks
- Docker and Kubernetes for containerization and orchestration
- Claude Code (Anthropic) for AI-assisted development and documentation
- GitHub Actions for CI/CD automation
- Codecov for test coverage analysis
- Snyk for security vulnerability scanning
Last Updated: 2025-12-16
Document Version: 7.1 (Python Code Quality and Codecov Enhancements)
Sprint Status: Sprint 1.2 complete (Orchestrator Core) | Phase 1 SPRINT 1.3 NEXT ✅
Next Review: After Phase 1 Sprint 1.3 completion (Planner Arm Integration)
Repository: https://github.com/doublegate/OctoLLM
Documentation Site: https://doublegate.github.io/OctoLLM/
Built with ❤️ / 🐙 Inspired by the Intelligence of Cephalopods
Production-Ready Preprocessing Service (~8,650 lines Rust, 218 tests passing):
- ✅ PII Detection: 18 patterns (SSN, credit cards, emails, API keys, etc.)
  - Performance: 1.2-460µs (10-5,435x faster than 5ms target)
  - Luhn validation for credit cards, RFC-compliant email validation
  - 62/62 unit tests passing
- ✅ Injection Detection: 14 OWASP-aligned patterns with context analysis
  - Performance: 1.8-6.7µs (1,493-5,435x faster than 10ms target)
  - Context-aware severity adjustment (quoted, academic, testing)
  - 63/63 unit tests passing (all edge cases fixed)
- ✅ Redis Caching: SHA-256 keyed with differential TTL
  - Performance: <0.5ms P95 for cache hits (2x better than target)
  - Deterministic cache key generation
  - 64/64 unit tests passing (7 require Redis)
- ✅ Rate Limiting: Token bucket algorithm (distributed via Redis)
  - Performance: <3ms P95 for checks (1.67x better than target)
  - Multi-dimensional: User (1K/h), IP (100/h), Endpoint, Global
  - 64/64 unit tests passing (7 require Redis)
- ✅ HTTP API: Full integration with middleware and metrics
  - POST /process - Main processing pipeline
  - GET /health, /ready - Kubernetes probes
  - GET /metrics - 13 Prometheus metrics
  - 37/37 API tests passing
- ✅ Comprehensive Testing: 218/218 tests passing (100%)
  - 188 unit tests + 30 integration tests
  - ~85% code coverage
  - Criterion benchmarks for all components
- ✅ Complete Documentation:
  - Component documentation with troubleshooting
  - OpenAPI 3.0 specification
  - Sprint 1.1 Completion Report
  - Sprint 1.2 Handoff Document
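
The PII Detection item mentions Luhn validation for credit cards. The Reflex Layer itself is written in Rust; the sketch below shows the same idea in Python for readability. The regular expression, the `[REDACTED-CARD]` placeholder, and the function names are assumptions for illustration and are not taken from the service.

```python
import re

# Hypothetical candidate pattern: 13-19 digits, optionally separated by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Replace card-like numbers that pass the Luhn check; leave others alone."""
    def replace(match: "re.Match[str]") -> str:
        return "[REDACTED-CARD]" if luhn_valid(match.group()) else match.group()

    return CARD_CANDIDATE.sub(replace, text)
```

Running the checksum after the pattern match cuts false positives: a 16-digit order number that fails Luhn is left untouched, while a well-formed card number is redacted.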
Technology: Rust 1.82.0 | Axum 0.8 | Tokio 1.43 | Redis 7+
Status: Ready for Sprint 1.2 (Orchestrator integration)
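
The token bucket algorithm named under Rate Limiting can be sketched as follows. This is a single-process, in-memory illustration in Python; the service described above is written in Rust and shares bucket state through Redis, which this sketch does not attempt. The class and parameter names are hypothetical.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second` tokens."""

    def __init__(self, capacity: int, refill_per_second: float,
                 now: Optional[float] = None) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, cost: float = 1.0, now: Optional[float] = None) -> bool:
        """Spend `cost` tokens if available; otherwise reject the request."""
        current = time.monotonic() if now is None else now
        elapsed = max(0.0, current - self.updated)
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = current
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Under these assumptions, the per-user limit of 1K requests per hour would correspond to `TokenBucket(capacity=1000, refill_per_second=1000 / 3600)`, with one bucket per dimension (user, IP, endpoint, global).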
Production-Ready Orchestrator Core (1,776 lines Python, 87 tests passing, 85%+ coverage):
- ✅ FastAPI Application: 6 REST endpoints operational
  - POST /submit - Task submission with Reflex Layer validation
  - GET /tasks/{id} - Task status retrieval
  - GET /health - Kubernetes liveness probe
  - GET /ready - Readiness check (database + Reflex Layer)
  - GET /metrics - Prometheus metrics (prepared)
  - GET / - Service information
- ✅ Reflex Layer Integration: Production-ready circuit breaker (504 lines)
  - Circuit breaker pattern (failure threshold: 5, reset: 60s)
  - Retry logic with exponential backoff (1-5 seconds)
  - Health check and readiness probes
  - Performance: All endpoints 2-5x faster than targets
- ✅ Database Layer: Async SQLAlchemy 2.0 + PostgreSQL (383 lines)
  - Connection pooling (pool_size=10, max_overflow=20)
  - CRUD operations for tasks and results
  - Async operations with asyncpg driver
- ✅ Data Models: Pydantic + SQLAlchemy ORM (255 lines)
  - TaskRequest, TaskResponse, ResourceBudget, TaskContract
  - Task, TaskResult (database models)
  - TaskStatus, Priority enums
- ✅ Configuration Management: Environment-based settings (148 lines)
  - Pydantic BaseSettings with ORCHESTRATOR_ prefix
  - Environment variable validation
  - PostgreSQL-only database support
- ✅ Comprehensive Testing: 87/87 tests passing (100% pass rate)
  - test_reflex_client.py: 39 tests, 97% coverage
  - test_models.py: 34 tests, 92% coverage
  - test_config.py: 26 tests, 88% coverage
  - test_database.py: 27 tests, 85% coverage
  - Total: 2,776 lines test code
- ✅ Complete Documentation: 4,769 lines
  - services/orchestrator/README.md (641 lines)
  - docs/components/orchestrator.md (1,039 lines)
  - docs/api/openapi/orchestrator.yaml (957 lines)
  - Sprint 1.2 completion report (956 lines)
  - Sprint 1.3 handoff document (1,176 lines)
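
The circuit breaker listed under Reflex Layer Integration (failure threshold 5, reset after 60 seconds) can be illustrated with a small synchronous class. This is a sketch under stated assumptions rather than the orchestrator's actual client: `CircuitBreaker`, `CircuitOpenError`, and the injectable `clock` are hypothetical, and the real code is presumably asynchronous.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Reset timeout elapsed: half-open, so let one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, calls fail immediately instead of waiting on a struggling dependency, which is what keeps a Reflex Layer outage from tying up orchestrator requests.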
Technology: Python 3.11+ | FastAPI 0.115+ | SQLAlchemy 2.0 | PostgreSQL 15+
Status: Core complete, pipeline deferred to Sprint 1.3
Next: Sprint 1.3 - Planner Arm Integration
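
The "exponential backoff (1-5 seconds)" retry behavior mentioned for the Reflex Layer client could look roughly like the sketch below. The doubling schedule, the attempt count, and the function names are assumptions; only the 1-second floor and 5-second ceiling come from the text above.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(count: int, base: float = 1.0, cap: float = 5.0) -> List[float]:
    """Delays that double from `base` and are clamped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(count)]


def retry(func: Callable[[], T], attempts: int = 4, base: float = 1.0,
          cap: float = 5.0, sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `func` up to `attempts` times, sleeping between failed tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in a service, retries like this would typically be wrapped by the circuit breaker so that repeated failures stop generating new attempts.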
