The core runtime and orchestrator for Project AEGIS - a secure, serverless runtime environment for autonomous AI agents.
The AEGIS Orchestrator is the control plane that manages agent lifecycle, enforces security policies, and provides runtime isolation through Docker containers (development) and Firecracker micro-VMs (production).
```
┌─────────────────────────────────────────────┐
│         AEGIS Orchestrator (Rust)           │
│  • Scheduling  • Security  • State Mgmt     │
└─────────────────────────────────────────────┘
                      │
      ┌───────────────┴───────────────┐
      ▼                               ▼
┌─────────┐                   ┌───────────┐
│ Docker  │                   │Firecracker│
│ Runtime │                   │  Runtime  │
└─────────┘                   └───────────┘
```

**Core** (`core/`): Pure domain logic implementing:
- Agent lifecycle management
- Runtime trait abstraction
- Security policy engine
- Swarm coordination
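The runtime trait abstraction can be pictured with a minimal sketch. All names here are hypothetical; the actual trait in `core/` may differ in shape and signatures:

```rust
// Hypothetical sketch of the runtime abstraction; the real trait in
// `core/` may differ in names and signatures.
#[derive(Debug, Clone, PartialEq)]
pub enum RuntimeKind {
    Docker,      // development: container isolation
    Firecracker, // production: micro-VM isolation
}

pub trait Runtime {
    fn kind(&self) -> RuntimeKind;
    /// Start an agent workload; returns an opaque instance id.
    fn start(&self, agent_name: &str) -> Result<String, String>;
    /// Stop a previously started instance.
    fn stop(&self, instance_id: &str) -> Result<(), String>;
}

/// Toy in-memory runtime used only to exercise the trait.
pub struct MockRuntime;

impl Runtime for MockRuntime {
    fn kind(&self) -> RuntimeKind {
        RuntimeKind::Docker
    }
    fn start(&self, agent_name: &str) -> Result<String, String> {
        Ok(format!("instance-{agent_name}"))
    }
    fn stop(&self, _instance_id: &str) -> Result<(), String> {
        Ok(())
    }
}
```

Because the core depends only on the trait, the Docker and Firecracker adapters can be swapped without touching the domain logic.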
**API** (`api/`): HTTP/gRPC server built with Axum for:
- Agent deployment
- Task execution
- Status monitoring
- Management operations
- **Docker** (`runtime-docker/`): Development runtime using containers
- **Firecracker** (`runtime-firecracker/`): Production runtime with micro-VMs
**CLI** (`cli/`): Command-line tool for local development and agent management:
```bash
# Daemon management
aegis daemon start               # Start daemon
aegis daemon stop                # Stop daemon
aegis daemon status              # Check status

# Agent management
aegis agent deploy agent.yaml    # Deploy agent
aegis agent list                 # List agents
aegis agent logs <agent-name>    # Stream agent logs
aegis agent remove <agent-id>    # Remove agent

# Task execution
aegis task execute <agent-name>  # Execute task
aegis task list                  # List executions
aegis task logs <execution-id>   # View execution logs
aegis task cancel <execution-id> # Cancel execution
```

See the CLI Reference for complete documentation.
**Edge Node** (`edge-node/`): Lightweight binary for hybrid cloud/on-prem deployments.
- Rust 1.75+
- Docker 24.0+
- Ollama (for local LLM) or OpenAI API key
- (Production) Linux with KVM support
```bash
# Build the CLI and orchestrator
cargo build -p aegis-orchestrator

# Or build in release mode
cargo build --release -p aegis-orchestrator
```

Create or edit `aegis-config.yaml`:
```yaml
apiVersion: 100monkeys.ai/v1
kind: NodeConfig
metadata:
  name: "my-aegis-node"
spec:
  node:
    id: "my-node-001"
    type: "edge"
  llm_providers:
    - name: "local"
      type: "ollama"
      endpoint: "http://localhost:11434"
      enabled: true
      models:
        - alias: "default"
          model: "phi3:mini"
          capabilities: ["code", "reasoning"]
          context_window: 4096
          cost_per_1k_tokens: 0.0
  llm_selection:
    strategy: "prefer-local"
    default_provider: "local"
  observability:
    logging:
      level: "info"
```

See the Node Config Reference and `aegis-config.yaml` for a complete example.
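To add a cloud provider as a fallback alongside Ollama, an additional entry can be appended under `spec.llm_providers`. The field names below mirror the Ollama entry; the exact OpenAI-specific fields and values are assumptions to verify against the Node Config Reference:

```yaml
# Hypothetical second provider entry under spec.llm_providers;
# verify field names against the Node Config Reference.
- name: "openai"
  type: "openai"
  endpoint: "https://api.openai.com/v1"
  enabled: true
  models:
    - alias: "cloud"
      model: "gpt-4o-mini"
      capabilities: ["code", "reasoning"]
      context_window: 128000
      cost_per_1k_tokens: 0.15  # illustrative value
```

With `strategy: "prefer-local"`, the local Ollama provider is still selected first whenever it is available.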
The orchestrator uses structured logging via the tracing crate. Log levels: trace, debug, info, warn, error.
**Set Log Level:**

```bash
# Via environment variable (recommended for development)
export RUST_LOG=debug
cargo run -p aegis-orchestrator -- daemon start

# Via CLI flag
cargo run -p aegis-orchestrator -- daemon start --log-level debug
```

```yaml
# Via config file (aegis-config.yaml)
spec:
  observability:
    logging:
      level: "debug"  # trace, debug, info, warn, error
```

**Bootstrap.py Debugging:**
When running at debug level, the orchestrator automatically:
- Logs all stdout from `bootstrap.py` (the Python script inside agent containers)
- Logs all stderr from `bootstrap.py` as warnings
- Enables verbose mode in `bootstrap.py` (via the `AEGIS_BOOTSTRAP_DEBUG=true` environment variable)
This is useful for tracing LLM connectivity issues, prompt delivery, or agent execution failures.
**Example Debug Output:**

```bash
# Start with debug logging
RUST_LOG=debug cargo run -p aegis-orchestrator -- daemon start

# In another terminal, execute an agent
cargo run -p aegis-orchestrator -- task execute my-agent --input "test"

# You'll see in the orchestrator logs:
# DEBUG aegis_orchestrator_core::infrastructure::runtime: Starting bootstrap.py execution container_id="abc123"
# DEBUG aegis_orchestrator_core::infrastructure::runtime: Bootstrap output: "Attempting to connect to Orchestrator at http://host.docker.internal:8088..."
# DEBUG aegis_orchestrator_core::infrastructure::runtime: Bootstrap output: "[BOOTSTRAP DEBUG] Bootstrap starting - execution_id=xxx, iteration=1"
# DEBUG aegis_orchestrator_core::infrastructure::runtime: Bootstrap output: "[BOOTSTRAP DEBUG] Received prompt (1234 chars)"
```

**Troubleshooting Bootstrap Issues:**
If agents fail to execute or you see connection errors:
- Enable debug logging: `RUST_LOG=debug`
- Check `bootstrap.py` output in the orchestrator logs
- Verify `AEGIS_ORCHESTRATOR_URL` is reachable from inside containers
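As a quick host-side sanity check, a short standard-library-only Rust snippet can probe whether the orchestrator port accepts TCP connections. The address and timeout below are illustrative, and note that from inside a container the orchestrator is typically reached via `host.docker.internal` rather than `localhost`:

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` (e.g. "127.0.0.1:8088")
/// succeeds within `timeout`. This only checks reachability from the
/// host, not from inside an agent container.
fn reachable(addr: &str, timeout: Duration) -> bool {
    match addr.to_socket_addrs() {
        Ok(mut addrs) => addrs
            .next()
            .map(|a| TcpStream::connect_timeout(&a, timeout).is_ok())
            .unwrap_or(false),
        Err(_) => false, // DNS resolution failed
    }
}
```

If the host-side check passes but containers still fail to connect, the issue is usually container networking (e.g. `host.docker.internal` resolution) rather than the orchestrator itself.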
```bash
# Start the daemon
target/debug/aegis daemon start

# Check daemon status
target/debug/aegis daemon status

# Deploy demo agents
cd ../aegis-examples && aegis agent deploy ./agents/echo/agent.yaml
cd ../aegis-examples && aegis agent deploy ./agents/greeter/agent.yaml

# List deployed agents
target/debug/aegis agent list

# Execute a task
target/debug/aegis task execute echo --input "Hello Daemon"

# View agent logs
target/debug/aegis agent logs echo

# Stop the daemon
target/debug/aegis daemon stop
```

For detailed instructions, see the Getting Started Guide.
```
aegis-orchestrator/
├── core/                  # Domain logic (DDD)
├── api/                   # HTTP/gRPC server
├── runtime-docker/        # Docker adapter
├── runtime-firecracker/   # Firecracker adapter
├── security/              # Policy enforcement
├── cli/                   # CLI tool
├── edge-node/             # Edge node binary
└── tests/                 # Integration tests
```

- Domain-Driven Design: Clear bounded contexts
- Hexagonal Architecture: Pure domain core with infrastructure adapters
- Type Safety: Leverage Rust's type system
- Security First: Default-deny policies
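The default-deny posture can be illustrated with a small sketch. The types and methods below are hypothetical, not the actual policy engine in `security/`:

```rust
// Hypothetical sketch of a default-deny network policy; the real
// engine in `security/` will differ.
use std::collections::HashSet;

pub struct NetworkPolicy {
    allowed_hosts: HashSet<String>, // explicit allow-list
}

impl NetworkPolicy {
    /// An empty policy: with no allow-list entries, everything is denied.
    pub fn new() -> Self {
        NetworkPolicy {
            allowed_hosts: HashSet::new(),
        }
    }

    pub fn allow(&mut self, host: &str) {
        self.allowed_hosts.insert(host.to_string());
    }

    /// Default deny: a destination is permitted only if explicitly listed.
    pub fn permits(&self, host: &str) -> bool {
        self.allowed_hosts.contains(host)
    }
}
```

The key property is that the safe state is the zero-configuration state: an agent gains network access only through explicit allow-list entries, never by omission.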
```bash
# Unit tests
cargo test --lib

# Integration tests
cargo test --test '*'

# Specific component
cargo test -p aegis-core
```

See `examples/` for sample configurations.
The orchestrator enforces:
- Isolation: Kernel-level (Firecracker) or namespace-based (Docker)
- Network Control: DNS/IP allow-listing
- Resource Limits: CPU, memory, execution time
- Audit Trail: Immutable logging
For details, see Security Model.
- Cold Start: <125ms (Firecracker)
- Throughput: 1,000+ agents/second (target)
- Memory: ~128MB per Firecracker VM
Full documentation is available at docs.100monkeys.ai.
| Section | Description |
|---|---|
| Getting Started | Install, configure, and run your first agent |
| Core Concepts | Agents, executions, workflows, swarms, security model |
| Writing Agents | Author and structure agent code |
| Deploying Agents | Deploy agents with the CLI or API |
| LLM Providers | Configure Ollama, OpenAI, and other LLM backends |
| Building Workflows | Chain agents into multi-step workflows |
| Building Swarms | Coordinate parallel agent swarms |
| Configuring Storage | Persistent storage backends for agents |
| Local Testing | Test agents locally before deploying |
| Architecture | Execution engine, SMCP, storage gateway, event bus |
| Security Model | Isolation, network control, secrets, audit trail |
| Deployment — Docker | Run the orchestrator with Docker |
| Deployment — Firecracker | Production micro-VM setup |
| Secrets Management | OpenBao integration via secret-store ACL (Keymaster Pattern) |
| IAM | Keycloak identity and access management |
| Configuration Reference | NodeConfig YAML reference (aegis-config.yaml) |
| Agent Manifest Reference | AgentManifest YAML field reference |
| Workflow Manifest Reference | WorkflowManifest YAML field reference |
| CLI Reference | Complete aegis CLI command reference |
| gRPC API | aegis.runtime.v1 service methods and message types |
- Enable CUDA for containers
- https://learn.microsoft.com/en-us/windows/ai/directml/gpu-cuda-in-wsl
- https://docs.nvidia.com/cuda/wsl-user-guide/index.html#getting-started-with-cuda-on-wsl-2
- https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
```bash
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
AGPL-3.0. See LICENSE for details.
- aegis-sdk-python - Python SDK
- aegis-sdk-typescript - TypeScript SDK
- aegis-control-plane - Web dashboard
- aegis-examples - Example agents
Built with Rust for security, performance, and reliability.