OctoLLM

Distributed AI Architecture for Offensive Security and Developer Tooling

Inspired by the octopus's distributed nervous system, OctoLLM reimagines AI architecture through biological intelligence principles, achieving superior cost efficiency, security, and flexibility compared to monolithic LLM systems.

What is OctoLLM?

OctoLLM is a distributed AI system inspired by octopus neurobiology, where:

Central Brain (Orchestrator): Strategic planning and coordination using frontier LLMs (GPT-4, Claude Opus)
Autonomous Arms (Specialists): Domain-specific execution with local decision-making
Reflex Layer: Fast preprocessing for common patterns without LLM involvement
Distributed Memory: Global semantic memory + local episodic stores per arm

The Octopus Advantage

Biological octopuses have:

40M neurons in brain → Strategic planning (Orchestrator)
350M neurons in arms → Local intelligence (6 specialized arms)
Direct arm-to-arm communication → Inter-component messaging without orchestrator bottleneck
Fast reflexes → Preprocessing layer for common patterns (<10ms latency)

OctoLLM applies these principles to build a distributed AI system that is more efficient, robust, and specialized than monolithic LLM architectures.

Architecture Overview

graph TB
    subgraph "Ingress Layer"
        API[API Gateway]
        REF[Reflex Layer<br/><10ms]
    end

    subgraph "Orchestration Layer"
        ORCH[Orchestrator Brain<br/>GPT-4/Claude]
        GMEM[(Global Memory<br/>PostgreSQL)]
    end

    subgraph "Execution Layer - Arms"
        PLAN[Planner Arm<br/>Task Decomposition]
        EXEC[Executor Arm<br/>Sandboxed Actions]
        CODE[Coder Arm<br/>Code Generation]
        JUDG[Judge Arm<br/>Validation]
        SAFE[Guardian Arm<br/>PII Protection]
        RETR[Retriever Arm<br/>Knowledge Search]
    end

    API --> REF
    REF --> ORCH
    ORCH --> PLAN & EXEC & CODE & JUDG & SAFE & RETR
    ORCH <--> GMEM

Key Features

1. Distributed Intelligence

Swarm Decision-Making: Multiple arms work in parallel for high-stakes decisions
Autonomous Execution: Arms make local decisions without constant orchestrator involvement
Conflict Resolution: Judge arm arbitrates disagreements between proposals

2. Performance Optimization

Reflex Cache: 60%+ cache hit rate for common queries (<10ms latency)
Hierarchical Processing: Simple tasks use cheap models, complex tasks use GPT-4
Cost Efficiency: 50% cost reduction vs monolithic LLM systems

3. Security First

Capability Isolation: Time-limited JWT tokens, sandboxed execution (gVisor)
PII Protection: Multi-layer redaction (regex, NLP, embedding-based)
Prompt Injection Defense: Pattern matching at reflex layer
Compliance Ready: SOC 2 Type II, ISO 27001, GDPR/CCPA

4. Production Ready

Kubernetes Deployment: Horizontal Pod Autoscaling, multi-zone HA
Comprehensive Monitoring: Prometheus + Grafana + Loki + Jaeger
Disaster Recovery: 15-minute RPO, 1-hour RTO with automated backups
Cost Optimization: Spot instances, model selection, aggressive caching

Current Status

Phase 1 Progress: Proof of Concept

Current Sprint: Sprint 1.2 ✅ COMPLETE (2025-11-15) Sprint Status: Orchestrator Core complete - production-ready Next Sprint: Phase 1 Sprint 1.3 (Planner Arm Integration) - Ready to start Overall Progress: Phase 0: 100% ✅ | Phase 1: ~22% (2/9 sprints complete: Reflex Layer + Orchestrator) Version: 1.2.0 | Sprint 1.2 Completion: November 15, 2025 | Sprint 1.3 Start: TBD

Recent Achievements

Documentation Reorganization (2025-11-16):

✅ Comprehensive mdBook Site: 134 pages across 12 major sections deployed to GitHub Pages
- Live Site: https://doublegate.github.io/OctoLLM/
- Full-text search with 17.1MB search index
- Hierarchical navigation with breadcrumbs
- Complete OpenAPI specifications for all 8 components
✅ Phase 0 Archive: Historical record preserved at docs/phases/PHASE-0-README-ARCHIVE.md
✅ README Streamlined: Reduced by 291 lines (26.5%) by archiving Phase 0 content
✅ 100% Documentation Coverage: All project content integrated into searchable mdBook format

Codecov Integration (2025-11-16):

✅ Coverage Analytics: 80% project target, 85% patch target with branch coverage
✅ Test Analytics: JUnit XML reporting for pytest with automated result uploads
✅ Bundle Analysis: Configuration prepared for TypeScript SDK optimization
✅ Component Flags: 10 granular coverage flags (orchestrator, reflex-layer, 6 arms, 2 SDKs)
✅ CI/CD Integration: Automated coverage and test result uploads in GitHub Actions
✅ Advanced Features: Carryforward flags for stable components, after_n_builds for accuracy

Sprint 1.2 Completion (2025-11-15):

✅ Orchestrator Core: 1,776 lines Python, 87 tests passing, 85%+ coverage
✅ Reflex Layer Integration: Circuit breaker pattern, retry logic, health checks
✅ Database Layer: Async SQLAlchemy 2.0 + PostgreSQL with connection pooling
✅ Complete OpenAPI specification and comprehensive documentation

Operational Infrastructure

Component	Status	Details
CI/CD Pipeline	✅ Complete	4 workflows operational (lint, test, security, build)
Security Scanning	✅ Complete	Multi-layer: SAST (Bandit), dependencies (Snyk, cargo-audit), secrets (gitleaks)
Test Framework	✅ Complete	pytest + cargo test + Codecov integration
Development Environment	✅ Complete	Docker Compose with 13 services, hot-reload support
Pre-commit Hooks	✅ Complete	15+ quality checks (Black, Ruff, mypy, rustfmt, clippy)
Container Builds	⏸️ Disabled	Multi-arch builds configured, will enable in Phase 1
Documentation	✅ Complete	170+ files, ~243,210 lines (includes Sprint 0.10 validation & Phase 1 planning)
API Documentation & SDKs	✅ Complete	TypeScript SDK, Postman/Insomnia collections, 8 service docs, 6 schema docs, 6 diagrams
Monitoring & Observability	✅ Complete	Grafana (6 dashboards), Prometheus (50+ alerts), Loki, Jaeger (Sprint 0.9)

Phase 0 Foundation (Complete)

Phase 0 established the foundational infrastructure for OctoLLM development. For complete details on all 10 sprints, achievements, and metrics, see:

Phase 0: Foundation - Complete Archive - 100% Complete (November 10-13, 2025)

Key Highlights:

10/10 sprints completed (Repository setup, Dev environment, CI/CD, API specs, Cloud IaC, Local deployment, Monitoring)
170+ documentation files (~243,210 lines)
4 GitHub Actions workflows operational
3 deployment options (Docker Compose, GCP/Kubernetes, Unraid)
$15,252/year cloud cost savings (vs AWS)
96/100 security score, 0 critical vulnerabilities

Component Status

Component	Phase 0 Status	Phase 1 Target
Orchestrator	✅ Docker + config	Full implementation (FastAPI + LLM integration)
Reflex Layer	✅ Docker + minimal Rust	Full implementation (PII detection, caching, rate limiting)
Planner Arm	✅ Docker + config	Full implementation (task decomposition)
Executor Arm	✅ Docker + minimal Rust	Full implementation (sandboxed execution)
Coder Arm	✅ Docker + config	Phase 2 implementation
Judge Arm	✅ Docker + config	Phase 2 implementation
Retriever Arm	✅ Docker + config	Phase 2 implementation
Safety Guardian	✅ Docker + config	Phase 2 implementation
PostgreSQL	✅ Operational	Schema implementation in Phase 1
Redis	✅ Operational	Full caching logic in Phase 1
Qdrant	✅ Operational	Vector embeddings in Phase 2

Note: Phase 0 focuses on infrastructure setup. Full service implementation begins in Phase 1 (December 2025).

Documentation Inventory

Architecture: 3 files, 5,550 lines (system overview, data flow, swarm decision-making)
Components: 11 files (orchestrator, reflex layer, 6 arms + 2 infrastructure)
Implementation: 7 files (getting started, dev environment, custom arms, memory systems)
Security: 6 files, 22,394 lines (threat model, PII protection, compliance)
Operations: 11 files (deployment, monitoring, DR, scaling, troubleshooting, monitoring runbook, alert procedures)
Engineering: 5 files, 3,360 lines (coding standards, error handling, logging)
API: 23 files (API-OVERVIEW, 8 services, 6 schemas, collections, OpenAPI specs)
Architecture Diagrams: 6 files, 1,544 lines (Mermaid diagrams for visualization)
ADR: 7 files (architecture decisions - includes ADR-007 Unraid)
Sprint Reports: 9 files (completion reports for Sprints 0.3-0.9)
TODOs: 15 files (MASTER-TODO + phase-specific + Sprint 0.6 tracking)

CI/CD Pipeline

OctoLLM uses GitHub Actions for continuous integration and deployment with 4 comprehensive workflows:

Lint Workflow

Status: ✅ PASSING | Duration: ~1m 9s | Triggers: Push/PR to main/develop

Python: Ruff (linting + import sorting), Black (formatting), mypy (type checking)
Rust: rustfmt (formatting), clippy (linting with -D warnings)
Features: Dependency caching, concurrency control, automatic failure detection

Test Workflow

Status: ✅ PASSING | Duration: ~2m 30s | Triggers: Push/PR to main/develop

Python Unit Tests: pytest on Python 3.11 and 3.12 with matrix strategy
Rust Unit Tests: cargo test for reflex-layer and executor services
Integration Tests: PostgreSQL 15 + Redis 7 services with health checks
Coverage: Codecov integration with XML reports
Phase 0 Note: Placeholder tests validate project structure (real tests in Phase 1)

Security Workflow

Status: ✅ PASSING | Duration: ~3m 0s | Triggers: Push/PR, daily at midnight UTC

SAST: Bandit for Python code vulnerabilities
Dependencies: Snyk (Python packages), cargo-audit (Rust crates)
Secrets: gitleaks for credential detection (full git history scan)
Container: Trivy (disabled in Phase 0, will enable in Phase 1)
Integration: SARIF results to GitHub Security tab, 30-day artifact retention

Build Workflow

Status: ⏸️ DISABLED (Phase 0) | Duration: ~7s (summary only) | Triggers: Push/PR, version tags

Multi-arch: linux/amd64, linux/arm64
Registry: GitHub Container Registry (GHCR)
Services: 8 total (orchestrator, reflex-layer, 6 arms)
Features: BuildKit caching, automatic tagging, post-build vulnerability scanning
Phase 1 Activation: Change if: false to if: true in workflow file

Security Achievements

Sprint 0.3 Security Fixes (2025-11-11):

✅ Fixed 7 CVEs: 4 HIGH + 3 MEDIUM severity
✅ python-multipart: DoS and ReDoS vulnerabilities patched
✅ starlette: DoS vulnerabilities patched
✅ langchain: SQL injection and DoS vulnerabilities patched
✅ Daily automated security scans operational

Current Security Posture:

Multi-layer scanning (SAST, dependency, secrets)
Zero HIGH/CRITICAL vulnerabilities in production dependencies
Automated daily scans with GitHub Security tab integration

Workflow Files

.github/workflows/
├── lint.yml          # Code quality checks
├── test.yml          # Unit + integration tests
├── security.yml      # Security scanning
└── build.yml         # Docker image builds

Quick Start (Development)

Prerequisites

Required:

Docker 24.0+ and Docker Compose 2.20+
Git
Text editor (VS Code recommended)

Optional (for local development without Docker):

Python 3.11+ (for Python services)
Rust 1.82.0+ (for Rust services)
OpenAI API key (for LLM functionality in Phase 1+)

Phase 0 Setup (Current)

# Clone repository
git clone https://github.com/doublegate/OctoLLM.git
cd OctoLLM

# Install pre-commit hooks (recommended)
python -m pip install pre-commit
pre-commit install

# Verify setup - run linting and formatting checks
make lint  # or: ruff check . && black --check . && cargo fmt --check

# Run placeholder tests (validates project structure)
make test  # or: pytest tests/ -v

# View all available commands
make help

Development Environment (Phase 0)

# Start development environment (placeholder services + databases)
cd infrastructure/docker-compose
docker-compose -f docker-compose.dev.yml up -d

# Check service health
docker-compose -f docker-compose.dev.yml ps

# View logs for specific service
docker-compose -f docker-compose.dev.yml logs -f orchestrator

# Stop environment
docker-compose -f docker-compose.dev.yml down

Service Access (Phase 0)

OctoLLM Services (placeholder in Phase 0):

Orchestrator: http://localhost:8000
Reflex Layer: http://localhost:8080 (minimal Rust service)
Planner Arm: http://localhost:8001
Coder Arm: http://localhost:8003
Judge Arm: http://localhost:8004
Retriever Arm: http://localhost:8002
Safety Guardian: http://localhost:8005
Executor Arm: http://localhost:8006 (minimal Rust service)

Infrastructure Services (fully operational):

PostgreSQL: localhost:15432 (user: octollm, db: octollm)
Redis: localhost:6379
Qdrant: localhost:6333 (REST API), localhost:6334 (gRPC)
Prometheus: http://localhost:9090
Grafana: http://localhost:3000

Note: Phase 0 services are minimal "hello world" placeholders. Full implementation begins in Phase 1 (December 2025).

Development Workflow

# Make code changes
# ...

# Run pre-commit hooks manually (before commit)
pre-commit run --all-files

# Run tests
pytest tests/unit/ -v
cargo test --manifest-path services/reflex-layer/Cargo.toml

# Format code
black .
cargo fmt --manifest-path services/reflex-layer/Cargo.toml

# Check linting
ruff check .
cargo clippy --manifest-path services/reflex-layer/Cargo.toml -- -D warnings

# Commit changes (pre-commit hooks run automatically)
git add .
git commit -m "feat: your feature description"

See Local Development Guide for detailed instructions.

Use Cases

1. Offensive Security Operations

Vulnerability Assessment: Swarm of arms analyzes code from multiple perspectives (OWASP, STRIDE, pentesting)
Exploit Development: Coder arm generates exploits, Judge arm validates, Guardian ensures ethical boundaries
Reconnaissance: Retriever arm aggregates OSINT from multiple sources

2. Developer Tooling

Code Review: 4-arm swarm checks style, performance, security, test coverage
Documentation Generation: Coder arm writes docs, Judge validates accuracy
Debugging Assistance: Retriever finds similar issues, Planner suggests fix strategies

3. Research & Analysis

Literature Review: Retriever arm queries arXiv, Google Scholar, GitHub
Comparative Analysis: Multiple arms research alternatives, Judge synthesizes findings
Technical Writing: Coder drafts content, Judge ensures accuracy

Roadmap

Historical Phases

Phase 0: Foundation - 100% Complete (November 10-13, 2025)
- Infrastructure setup, CI/CD, API specifications, cloud provisioning, monitoring
- 10/10 sprints completed in <1 week (>50% faster than 2-week estimate)

Phase 1: Proof of Concept (Planned)

Timeline: December 2025 (estimated 2 weeks) | Prerequisites: Phase 0 complete

Goal: Minimal end-to-end task execution with 4 core components

Deliverables:

Reflex Layer implementation (Rust) - <10ms latency, PII detection, caching
Orchestrator core implementation (Python) - Task routing, LLM integration
Planner Arm implementation (Python) - Task decomposition with GPT-3.5-turbo
Executor Arm implementation (Rust) - Sandboxed command execution
Basic end-to-end workflow testing
E2E tests with >90% success rate

Success Criteria:

All 4 components deployed and healthy
E2E tests passing (>90% success rate)
Latency targets met (P99 <30s)
Security tests passing (no sandbox escapes)

Phase 2: Core Capabilities (Planned)

Timeline: Estimated 8-10 weeks | Prerequisites: Phase 1 complete

Deliverables:

4 additional arms (Coder, Judge, Retriever, Safety Guardian)
Distributed memory system (PostgreSQL + Qdrant + Redis)
Kubernetes production deployment
Swarm decision-making
Load testing (1,000 concurrent tasks)

Phase 3-6: Advanced Features (Planned)

See MASTER-TODO.md for complete 7-phase roadmap (420+ tasks).

Phase 3: Operations & Deployment (4-6 weeks) - Monitoring, alerting, disaster recovery Phase 4: Engineering & Standards (3-4 weeks) - Code quality, testing, documentation Phase 5: Security Hardening (8-10 weeks) - Capability isolation, PII protection, penetration testing Phase 6: Production Readiness (8-10 weeks) - Autoscaling, cost optimization, compliance certification

Overall Project Status

Timeline: 12 months estimated (36-48 weeks) Budget: ~$177,900 total (37 sprints, 420+ tasks) Team Size: 5-8 engineers (mixed skills) | Phase 0: Single developer | Phase 1: 3-4 engineers Overall Progress: 100% Phase 0 complete (10/10 sprints) | Phase 0 COMPLETE ✅ Documentation: 170+ files, ~243,210 lines Next Milestone: Phase 1 Sprint 1.1 (Reflex Layer Implementation) - 8.5 weeks, 340 hours Phase 0 Completion: November 13, 2025 ✅ Phase 1 Start: TBD (team onboarding in progress) Production Launch: Estimated Q3 2026

Technology Stack

Core Languages

Language	Version	Usage	Phase 0 Status	Phase 1+ Status
Python	3.11+	Orchestrator, 5 Arms	✅ Configured (pyproject.toml, linting)	Full implementation
Rust	1.82.0	Reflex Layer, Executor	✅ Configured (Cargo.toml, linting)	Full implementation

Web Frameworks

Framework	Version	Purpose	Phase 0 Status	Phase 1+ Status
FastAPI	Latest	Python web services	✅ Dependencies configured	API implementation
Axum	0.7+	Rust web services	📋 Planned	Implementation

Databases

Database	Version	Purpose	Phase 0 Status	Phase 1+ Status
PostgreSQL	15	Global memory	✅ Operational (Docker, port 15432)	Schema + queries
Redis	7	Caching	✅ Operational (Docker, port 6379)	Cache logic
Qdrant	1.7	Vector store	✅ Operational (Docker, ports 6333-6334)	Embeddings

CI/CD

Tool	Purpose	Phase 0 Status	Details
GitHub Actions	CI/CD automation	✅ Operational	4 workflows (lint, test, security, build)
Codecov	Coverage reporting	✅ Integrated	Coverage uploads for Python 3.11
Pre-commit	Quality gates	✅ Operational	15+ hooks (Black, Ruff, mypy, rustfmt, clippy)

Security

Tool	Purpose	Phase 0 Status	Details
Bandit	SAST (Python)	✅ Operational	Scanning services/orchestrator, services/arms
Snyk	Dependency scanning (Python)	✅ Operational	HIGH+ severity threshold, SARIF output
cargo-audit	Dependency scanning (Rust)	✅ Operational	Auditing reflex-layer + executor
gitleaks	Secret detection	✅ Operational	Full git history scanning
Trivy	Container scanning	⏸️ Disabled (Phase 0)	Will enable in Phase 1

Infrastructure

Tool	Version	Purpose	Phase 0 Status	Phase 1+ Status
Docker	24+	Containerization	✅ Complete	Production builds
Docker Compose	2.20+	Dev orchestration	✅ Complete (13 services)	Continued use
Kubernetes	1.28+	Production orchestration	📋 Planned	Phase 2 deployment
Terraform	1.6+	Infrastructure as Code	📋 Planned	Phase 2 provisioning

Development Tools

Tool	Purpose	Phase 0 Status	Details
Poetry	Python dependency management	✅ Configured	pyproject.toml with 80+ lines
Cargo	Rust build system	✅ Configured	Workspace with 2 members
Black	Python formatting	✅ Operational	Line length: 100
Ruff	Python linting	✅ Operational	Import sorting + 50+ rules
mypy	Python type checking	✅ Operational	Strict mode with plugins
rustfmt	Rust formatting	✅ Operational	Edition 2021
clippy	Rust linting	✅ Operational	`-D warnings` (treat as errors)
VS Code Devcontainer	Unified dev environment	✅ Complete	14 extensions, 13 ports
Makefile	Task automation	✅ Complete	20+ commands

Monitoring

Tool	Purpose	Phase 0 Status	Phase 3+ Status
Prometheus	Metrics collection	✅ Complete (Sprint 0.9)	Production deployment + 50+ alerts
Grafana	Visualization	✅ Complete (Sprint 0.9)	6 dashboards (cluster, namespaces, services, logs)
Loki	Log aggregation	✅ Complete (Sprint 0.9)	GCS backend, tiered retention policies
Jaeger	Distributed tracing	✅ Complete (Sprint 0.9)	OTLP endpoints, OpenTelemetry instrumentation

LLM Providers (Phase 1+)

Provider	Models	Purpose	Status
OpenAI	GPT-4, GPT-4 Turbo, GPT-3.5-turbo	Orchestrator, Planner, Coder	📋 Phase 1
Anthropic	Claude 3 Opus, Sonnet	Alternative provider	📋 Phase 1
Local	vLLM, Ollama	Cost optimization	📋 Phase 3

Legend: ✅ Complete/Operational | ⏸️ Disabled (temporary) | 📋 Planned | 🚧 In Progress

Performance Targets

Metric	Target	Measurement
Reflex Cache Hit Rate	> 60%	`octollm_cache_hits_total / octollm_tasks_total`
P50 Latency	< 2s	`histogram_quantile(0.5, octollm_task_duration_seconds)`
P95 Latency	< 10s	`histogram_quantile(0.95, octollm_task_duration_seconds)`
P99 Latency	< 30s	`histogram_quantile(0.99, octollm_task_duration_seconds)`
Task Success Rate	> 95%	`octollm_tasks_total{status="success"} / octollm_tasks_total`
Cost per Task	< 50% baseline	Token usage metrics vs monolithic LLM
PII Leakage Rate	< 0.1%	Manual evaluation + automated scanning

Documentation

For Developers

Getting Started - 15-minute quick-start
Development Environment - Docker Compose setup
Custom Arms Guide - Build your own specialized arms
Testing Guide - Unit, integration, E2E testing
Debugging Guide - Troubleshooting playbooks

For Architects

System Overview - High-level architecture
Data Flow - Request processing pipeline
Swarm Decision-Making - Multi-arm consensus
ADRs - Architecture Decision Records

For Operators

Deployment Guide - Docker Compose + Kubernetes
Monitoring & Alerting - Prometheus + Grafana
Disaster Recovery - Backup and restore
Performance Tuning - Optimization strategies
Scaling Guide - HPA, VPA, cluster autoscaling

For Security Teams

Security Overview - Defense in depth
Threat Model - STRIDE analysis (5,106 lines)
PII Protection - GDPR/CCPA compliance
Security Testing - SAST, DAST, penetration testing
Compliance - SOC 2, ISO 27001, HIPAA

For Project Managers

Master TODO - Complete 7-phase roadmap (420+ tasks)
Phase 0 TODO - Project setup (45 tasks, 2 weeks)
Status & Progress - Sprint reports and tracking
Pre-Phase 0 Readiness Report - Documentation audit

Contributing

We welcome contributions! Please see:

CONTRIBUTING.md - Contribution guidelines
CODE_OF_CONDUCT.md - Community standards
SECURITY.md - Vulnerability disclosure policy

Development Workflow

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes (follow coding standards)
Write tests (target: 85%+ coverage)
Run linters and tests locally
Commit with conventional commits (feat:, fix:, docs:, test:)
Push to your fork
Open a Pull Request

Security

Reporting Vulnerabilities

DO NOT open public issues for security vulnerabilities.

Please report security issues to: [email protected]

We will acknowledge your email within 24 hours and provide a fix timeline within 7 days. We practice coordinated disclosure with a 90-day embargo.

See SECURITY.md for full policy.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Why Apache 2.0?

Permissive open-source license
Patent protection for contributors and users
Compatible with commercial use
Requires preservation of copyright and license notices

Acknowledgments

Biological Inspiration

Octopus vulgaris research on distributed nervous systems
Neuroscience studies on autonomous arm control
Swarm intelligence and consensus algorithms

Technology Inspiration

LangChain / LlamaIndex for LLM orchestration patterns
Ray for distributed Python execution
Kubernetes for container orchestration
Prometheus ecosystem for observability

Open Source Projects

OpenAI, Anthropic for LLM APIs
PostgreSQL, Redis, Qdrant for data persistence
FastAPI, Axum for web frameworks
Docker, Kubernetes for deployment

Contact

Project Lead: TBD
Security: [email protected]
General Inquiries: [email protected]
GitHub Issues: github.com/doublegate/OctoLLM/issues

Contributing

OctoLLM is currently in Phase 0 (Project Setup & Infrastructure) and not yet accepting external contributions. We're focused on establishing the foundational architecture and CI/CD infrastructure before opening to the community.

Current Development Status

Team: Single developer (Phase 0)
Status: Pre-implementation (infrastructure and documentation)
Phase 0 Completion: Late November 2025
Phase 1 Start: December 2025

Future Contribution Process (Phase 1+)

Once we enter Phase 1, we will welcome contributions! The planned process:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Install pre-commit hooks (pre-commit install)
Make your changes following our coding standards
Test thoroughly (target: 85% Python coverage, 80% Rust coverage)
Commit using Conventional Commits format (feat:, fix:, docs:, test:)
Push to your fork
Open a Pull Request

Quality Standards

Code Quality (enforced by CI/CD):

Python: Black (formatting), Ruff (linting), mypy (type checking)
Rust: rustfmt (formatting), clippy (linting with -D warnings)
Commits: Conventional Commits format
Coverage: 85% Python, 80% Rust minimum (Phase 1+)

CI/CD Requirements (must pass):

✅ Lint workflow (Python + Rust)
✅ Test workflow (unit + integration)
✅ Security workflow (SAST + dependency scanning)
✅ Pre-commit hooks (15+ checks)

Development Setup (For Future Contributors)

# Clone repository
git clone https://github.com/doublegate/OctoLLM.git
cd OctoLLM

# Install pre-commit hooks
python -m pip install pre-commit
pre-commit install

# Run pre-commit checks manually
pre-commit run --all-files

# Run tests (Phase 1+)
pytest tests/ -v --cov=services
cargo test --manifest-path services/reflex-layer/Cargo.toml

# Start development environment
cd infrastructure/docker-compose
docker-compose -f docker-compose.dev.yml up -d

Code of Conduct

Be respectful, inclusive, and collaborative. Detailed code of conduct will be published when external contributions open in Phase 1.

How to Stay Updated

Watch this repository for updates
Star to show your interest
Follow development progress in CHANGELOG.md
Read sprint completion reports in to-dos/status/

See CONTRIBUTING.md for comprehensive guidelines (will be expanded for Phase 1).

Acknowledgments

Biological Inspiration

Octopus vulgaris research on distributed nervous systems and autonomous control
Neuroscience studies on swarm intelligence and consensus algorithms
Nature's solution to distributed computing challenges

Technology Stack

LangChain and LlamaIndex for LLM orchestration patterns
OpenAI and Anthropic for frontier LLM capabilities
PostgreSQL, Redis, and Qdrant for data persistence
FastAPI and Axum for high-performance web frameworks
Docker and Kubernetes for containerization and orchestration

Development Tools

Claude Code (Anthropic) for AI-assisted development and documentation
GitHub Actions for CI/CD automation
Codecov for test coverage analysis
Snyk for security vulnerability scanning

Last Updated: 2025-12-16 Document Version: 7.1 (Python Code Quality and Codecov Enhancements) Sprint Status: Sprint 1.2 complete (Orchestrator Core) | Phase 1 SPRINT 1.3 NEXT ✅ Next Review: After Phase 1 Sprint 1.3 completion (Planner Arm Integration) Repository: https://github.com/doublegate/OctoLLM Documentation Site: https://doublegate.github.io/OctoLLM/

Built with ❤️ / 🐙 Inspired by the Intelligence of Cephalopods

Sprint 1.1: Reflex Layer Implementation ✅ COMPLETE (2025-11-14)

Production-Ready Preprocessing Service (~8,650 lines Rust, 218 tests passing):

✅ PII Detection: 18 patterns (SSN, credit cards, emails, API keys, etc.)
- Performance: 1.2-460µs (10-5,435x faster than 5ms target)
- Luhn validation for credit cards, RFC-compliant email validation
- 62/62 unit tests passing
✅ Injection Detection: 14 OWASP-aligned patterns with context analysis
- Performance: 1.8-6.7µs (1,493-5,435x faster than 10ms target)
- Context-aware severity adjustment (quoted, academic, testing)
- 63/63 unit tests passing (all edge cases fixed)
✅ Redis Caching: SHA-256 keyed with differential TTL
- Performance: <0.5ms P95 for cache hits (2x better than target)
- Deterministic cache key generation
- 64/64 unit tests passing (7 require Redis)
✅ Rate Limiting: Token bucket algorithm (distributed via Redis)
- Performance: <3ms P95 for checks (1.67x better than target)
- Multi-dimensional: User (1K/h), IP (100/h), Endpoint, Global
- 64/64 unit tests passing (7 require Redis)
✅ HTTP API: Full integration with middleware and metrics
- POST /process - Main processing pipeline
- GET /health, /ready - Kubernetes probes
- GET /metrics - 13 Prometheus metrics
- 37/37 API tests passing
✅ Comprehensive Testing: 218/218 tests passing (100%)
- 188 unit tests + 30 integration tests
- ~85% code coverage
- Criterion benchmarks for all components
✅ Complete Documentation:
- Component documentation with troubleshooting
- OpenAPI 3.0 specification
- Sprint 1.1 Completion Report
- Sprint 1.2 Handoff Document

Technology: Rust 1.82.0 | Axum 0.8 | Tokio 1.43 | Redis 7+ Status: Ready for Sprint 1.2 (Orchestrator integration)

Sprint 1.2: Orchestrator Integration ✅ PHASE 2 COMPLETE (2025-11-15)

Production-Ready Orchestrator Core (1,776 lines Python, 87 tests passing, 85%+ coverage):

✅ FastAPI Application: 6 REST endpoints operational
- POST /submit - Task submission with Reflex Layer validation
- GET /tasks/{id} - Task status retrieval
- GET /health - Kubernetes liveness probe
- GET /ready - Readiness check (database + Reflex Layer)
- GET /metrics - Prometheus metrics (prepared)
- GET / - Service information
✅ Reflex Layer Integration: Production-ready circuit breaker (504 lines)
- Circuit breaker pattern (failure threshold: 5, reset: 60s)
- Retry logic with exponential backoff (1-5 seconds)
- Health check and readiness probes
- Performance: All endpoints 2-5x faster than targets
✅ Database Layer: Async SQLAlchemy 2.0 + PostgreSQL (383 lines)
- Connection pooling (pool_size=10, max_overflow=20)
- CRUD operations for tasks and results
- Async operations with asyncpg driver
✅ Data Models: Pydantic + SQLAlchemy ORM (255 lines)
- TaskRequest, TaskResponse, ResourceBudget, TaskContract
- Task, TaskResult (database models)
- TaskStatus, Priority enums
✅ Configuration Management: Environment-based settings (148 lines)
- Pydantic BaseSettings with ORCHESTRATOR_ prefix
- Environment variable validation
- PostgreSQL-only database support
✅ Comprehensive Testing: 87/87 tests passing (100% pass rate)
- test_reflex_client.py: 39 tests, 97% coverage
- test_models.py: 34 tests, 92% coverage
- test_config.py: 26 tests, 88% coverage
- test_database.py: 27 tests, 85% coverage
- Total: 2,776 lines test code
✅ Complete Documentation: 4,769 lines
- services/orchestrator/README.md (641 lines)
- docs/components/orchestrator.md (1,039 lines)
- docs/api/openapi/orchestrator.yaml (957 lines)
- Sprint 1.2 completion report (956 lines)
- Sprint 1.3 handoff document (1,176 lines)

Technology: Python 3.11+ | FastAPI 0.115+ | SQLAlchemy 2.0 | PostgreSQL 15+ Status: Core complete, pipeline deferred to Sprint 1.3 Next: Sprint 1.3 - Planner Arm Integration

Name		Name	Last commit message	Last commit date
Latest commit History 81 Commits
.devcontainer		.devcontainer
.github		.github
docs		docs
images		images
infra		infra
infrastructure		infrastructure
ref-docs		ref-docs
scripts		scripts
sdks		sdks
services		services
shared		shared
tests		tests
to-dos		to-dos
tools		tools
.gitignore		.gitignore
.gitleaks.toml		.gitleaks.toml
.markdownlint.json		.markdownlint.json
.nojekyll		.nojekyll
.pre-commit-config.yaml		.pre-commit-config.yaml
.yamllint.yaml		.yamllint.yaml
ARCHITECTURE.md		ARCHITECTURE.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODEOWNERS		CODEOWNERS
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
CONTRIBUTORS.md		CONTRIBUTORS.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
SPRINT-1.1-PROGRESS-SUMMARY.md		SPRINT-1.1-PROGRESS-SUMMARY.md
codecov.yml		codecov.yml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

License

doublegate/OctoLLM

Folders and files

Latest commit

History

Repository files navigation

OctoLLM

What is OctoLLM?

The Octopus Advantage

Architecture Overview

Key Features

1. Distributed Intelligence

2. Performance Optimization

3. Security First

4. Production Ready

Current Status

Phase 1 Progress: Proof of Concept

Recent Achievements

Operational Infrastructure

Phase 0 Foundation (Complete)

Component Status

Documentation Inventory

CI/CD Pipeline

Lint Workflow

Test Workflow

Security Workflow

Build Workflow

Security Achievements

Workflow Files

Quick Start (Development)

Prerequisites

Phase 0 Setup (Current)

Development Environment (Phase 0)

Service Access (Phase 0)

Development Workflow

Use Cases

1. Offensive Security Operations

2. Developer Tooling

3. Research & Analysis

Roadmap

Historical Phases

Phase 1: Proof of Concept (Planned)

Phase 2: Core Capabilities (Planned)

Phase 3-6: Advanced Features (Planned)

Overall Project Status

Technology Stack

Core Languages

Web Frameworks

Databases

CI/CD

Security

Infrastructure

Development Tools

Monitoring

LLM Providers (Phase 1+)

Performance Targets

Documentation

For Developers

For Architects

For Operators

For Security Teams

For Project Managers

Contributing

Development Workflow

Security

Reporting Vulnerabilities

License

Why Apache 2.0?

Acknowledgments

Biological Inspiration

Technology Inspiration

Open Source Projects

Contact

Contributing

Current Development Status

Future Contribution Process (Phase 1+)

Quality Standards

Development Setup (For Future Contributors)

Code of Conduct

How to Stay Updated

Packages