AI agent governance platform providing safety, cost control, and reliability for enterprise AI agents
Architecture
Term
Definition
Control Plane
Management layer: API Gateway, Dashboard, Scheduler. Cloud-based by default (management only, NOT in request path)
Data Plane
Processing layer: Safety, Cost, Reliability, Observability services. By default (95%), runs LOCALLY on developer machine. Only in Server Execution Mode (5%, future) runs in cloud
Agent Runtime
Execution layer: Agent execution environment. By default (95%), agent runs as local process on developer machine. Only in Server Execution Mode (5%, future) runs in cloud (Kubernetes pods)
Local Execution
Agent runs on developer's machine (primary). Control Panel manages budget via IC Token protocol. See architecture/001
Server Execution
Agent runs on Iron Cage servers (future, post-pilot). Control Panel manages budget identically to local execution
Control Panel
ALWAYS present standalone admin service. Admin allocates budgets, manages developers, stores IP Tokens, monitors spending. See architecture/003
Gateway
Central orchestrator that routes requests through processing layers. By default (95%), runs LOCALLY on developer machine (localhost:8084). Only in Server Execution Mode (5%, future) runs in cloud. See architecture/005
Layer Model
Six processing layers: Safety, Cost, Reliability, Provider, Output Safety, Observability
Service Boundaries
Separation between Control Plane, Data Plane, and Agent Runtime
Data Flow
End-to-end request journey from user input to LLM response
Execution Models
Where agents execute: Local (primary) vs Server (future, post-pilot)
Library Mode
Default SDK deployment where the runtime is embedded in-process via PyO3. Developer code: from iron_cage import LlmRouter. Overhead: ~0.5ms (FFI). Single process, no separate runtime. See architecture/008
Router Mode
Optional deployment where the runtime runs as a separate process exposing an HTTP API. Two use cases: (1) non-SDK frameworks (LangChain, CrewAI) point to localhost:8080; (2) the iron_cage SDK is optionally configured to use HTTP. Same developer code as Library Mode for SDK users. Overhead: ~5ms (HTTP). See architecture/008
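The Gateway and Layer Model entries above can be pictured as a simple pipeline. The sketch below assumes the six layers run in the order they are listed; the `Request` type, `make_layer`, and `route` are hypothetical names used only for illustration and are not part of iron_cage.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Layer names come from the "Layer Model" entry. Running them in listed
# order is an assumption of this sketch.
LAYER_ORDER = [
    "Safety", "Cost", "Reliability", "Provider", "Output Safety", "Observability",
]


@dataclass
class Request:
    prompt: str
    response: str = ""
    trace: List[str] = field(default_factory=list)


Layer = Callable[[Request], Request]


def make_layer(name: str) -> Layer:
    """Build a placeholder layer that only records that it ran."""
    def layer(req: Request) -> Request:
        req.trace.append(name)
        if name == "Provider":
            # Stand-in for the real LLM provider call.
            req.response = f"echo: {req.prompt}"
        return req
    return layer


def route(req: Request, layers: List[Layer]) -> Request:
    """Pass the request through each processing layer in order."""
    for layer in layers:
        req = layer(req)
    return req


pipeline = [make_layer(name) for name in LAYER_ORDER]
result = route(Request(prompt="hello"), pipeline)
```

After the call, `result.trace` lists the layers in the order they ran, which makes the request journey easy to inspect.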
Entities
Term
Definition
Agent
AI agent executing on developer's machine. Has exactly one IC Token (1:1), exactly one Agent Budget (1:1, restrictive), can use multiple Inference Providers. Belongs to one Project
Project
Collection of agents, Inference Provider assignments, and entities. Has exactly one Project Budget (1:1, informative). Owned by admin or team
Master Project
Special project containing ALL resources (all agents, all Inference Providers, all budgets). Admin-only. Has Master Budget (informative). MUST be in Pilot
Orphaned Agents Project
Special project containing agents from deleted user accounts. ID: proj-orphaned, Name: "Orphaned Agents", Owner: System admin. Agents automatically moved here when user deleted. Admin can reassign to new users or delete. Agents remain operational (budgets active, IC Tokens valid). See Protocol 008
IP
Inference Provider entity (OpenAI, Anthropic, etc.). Has IP Budget (informative), has IP Token(s). Can be assigned to multiple agents. Deletion cascades: removing a provider automatically unassigns it from all agents (ON DELETE CASCADE). See Protocol 011
Agent Budget
Restrictive budget (ONLY budget that blocks requests). 1:1 with agent. Hard limit enforcement
Project Budget
Informative budget (statistics only, no blocking). 1:1 with project. Shows aggregate agent spending
IP Budget
Informative budget (statistics only, no blocking). Per Inference Provider. Shows provider spending
Master Budget
Informative budget (statistics only, no blocking). Part of master project. Shows all spending across all projects
Budget Control
Agents are the ONLY way to control budget. Agent budget blocks requests (restrictive). All other budgets (project, Inference Provider, master) are informative only (show spending, can't block)
Budget Change Request
Developer-initiated request for admin to modify agent budget. Workflow: developer creates request with justification → admin approves/rejects → budget automatically updated on approval. State machine: pending → approved/rejected/cancelled. See Protocol 017
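The Budget Change Request workflow above is a small state machine. A minimal sketch follows, assuming that the three outcomes are terminal and that a non-empty justification is mandatory; the class and method names are hypothetical, and Protocol 017 remains the authority on the real rules.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    CANCELLED = "cancelled"


# Only a pending request can move. Treating approved/rejected/cancelled
# as terminal is an assumption of this sketch.
TRANSITIONS = {
    State.PENDING: {State.APPROVED, State.REJECTED, State.CANCELLED},
}


class BudgetChangeRequest:
    def __init__(self, agent_budget: float, requested_budget: float, justification: str):
        if not justification.strip():
            raise ValueError("justification is required")
        self.agent_budget = agent_budget
        self.requested_budget = requested_budget
        self.justification = justification
        self.state = State.PENDING

    def _move(self, target: State) -> None:
        if target not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"cannot go from {self.state.value} to {target.value}")
        self.state = target

    def approve(self) -> None:
        self._move(State.APPROVED)
        # The agent budget is updated automatically on approval.
        self.agent_budget = self.requested_budget

    def reject(self) -> None:
        self._move(State.REJECTED)

    def cancel(self) -> None:
        self._move(State.CANCELLED)
```

Keeping the allowed transitions in one table means a second decision on an already-decided request fails loudly instead of silently changing the budget again.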
Resources
Term
Definition
Resource
REST API endpoint or endpoint group exposed by Control Panel. Maps to domain entities or operations. See architecture/009
Entity Resource
REST resource with 1:1 or 1:N mapping to domain entity, supporting CRUD operations. Plural names (/api/tokens, /api/projects). Example: /api/tokens → IC Token entity
Operation Resource
REST resource exposing operations/actions not mapping directly to single entity CRUD. Action-oriented, often POST-only. Example: /api/auth → login/logout operations
Analytics Resource
Read-only REST resource providing aggregated/derived metrics from multiple entities. GET-only, statistical nature. Example: /api/analytics → usage, spending, and performance metrics
Exhaustive inventory of all REST API resources with entity mapping, authentication patterns, and certainty classification. See architecture/009
User-Facing Resource
REST resource accessible via CLI and Control Panel dashboard. Requires User Token authentication. Has CLI-API parity
Agent-Facing Resource
REST resource used by iron_runtime for agent operations. Requires IC Token authentication. No CLI mapping (e.g., /api/budget/*)
Roles
Term
Definition
Admin
Full Control Panel access. Allocates budgets, creates developer accounts, monitors all spending, manages IP Tokens
Super User
Developer + read-only Control Panel dashboard access (own budgets only). Cannot allocate budgets or see other developers
Developer
Regular user managed by admin. Runs agents with IC Token, views usage via CLI + Dashboard (read-only own usage). Can select model and Inference Provider among allowed
Tokens
Term
Definition
IC Token
Internal Control Token - Developer-visible JWT for agent authentication. 1:1 with agent (one agent = one IC token, can't share). Developer can regenerate their own IC Token (replaces existing). Admin can regenerate any IC Token. Lifetime: Until agent deleted (long-lived, no auto-expiration). See protocol/005
User Token
Control Panel CLI/Dashboard authentication token. Different from IC Token (agent auth). Users can have multiple active User Tokens. Developer can regenerate own, admin can regenerate any. Lifetime: 30 days default
IP Token
Inference Provider Token - LLM provider API key (sk-proj-, sk-ant-). Stored in Control Panel vault, NEVER exposed to developer. Runtime receives encrypted copy from Control Panel. Session-only lifetime. See protocol/005
Token Translation
Process where Runtime replaces IC Token with IP Token in LLM requests. Latency: <1ms. IP Token decrypted, used, then zeroed. See protocol/005
API Token
Persistent authentication token for Control Panel dashboard and automation scripts. Format: apitok_ prefix (e.g., apitok_xyz789abc123...). SAME-AS-USER scope (inherits user permissions). Primary use: Dashboard access. Secondary use: Admin automation. Token value shown only once at creation (GitHub pattern). Different from IC Token (agent auth) and User Token (session auth). See protocol/014
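The Token Translation entry above describes a swap-use-wipe sequence. The sketch below shows only that order of steps. The `decrypt` and `send` callables are hypothetical placeholders, the string comparison stands in for real JWT verification, and Python cannot guarantee that every copy of a secret is erased from memory.

```python
import hmac
from typing import Callable, Dict


def translate_token(
    headers: Dict[str, str],
    expected_ic_token: str,
    encrypted_ip_token: bytes,
    decrypt: Callable[[bytes], bytearray],
    send: Callable[[Dict[str, str]], str],
) -> str:
    """Swap the agent's IC Token for the provider's IP Token, send, then wipe."""
    presented = headers.get("Authorization", "")
    # Stand-in for JWT verification: constant-time comparison to a known value.
    if not hmac.compare_digest(presented, f"Bearer {expected_ic_token}"):
        raise PermissionError("invalid IC Token")

    key = decrypt(encrypted_ip_token)  # mutable buffer so it can be overwritten
    try:
        outbound = dict(headers)
        outbound["Authorization"] = "Bearer " + key.decode("ascii")
        return send(outbound)
    finally:
        # Overwrite the decrypted key. The str built above may still linger
        # in memory, so this shows intent rather than a guarantee.
        for i in range(len(key)):
            key[i] = 0
```

The developer-facing request never carries the provider key; only the outbound copy does, and the decrypted buffer is cleared even if `send` raises.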
Budget Management
Term
Definition
Budget Allocation
Total budget admin assigns to agent (e.g., $100) in Control Panel. Tracked centrally in database.
Budget Portion
Incremental amount Runtime borrows from total (e.g., $10). Enables real-time control without upfront full budget transfer. Default: $10 per borrow.
Budget Borrowing
Protocol where Runtime requests budget portions from Control Panel. Borrows $10 chunks from total allocation. Trigger: remaining < $1. See protocol/005
Lease ID
Unique identifier for a budget portion allocation. Tracks which $10 portion the Runtime is currently using. Changes with each borrow (lease-001, lease-002, etc.).
Budget Threshold
Remaining budget level triggering borrow request. Default: $1.00. When remaining < threshold, Runtime requests more.
Incremental Budget
Strategy of allocating budget in portions ($10) rather than full amount ($100) upfront. Benefits: Real-time enforcement, admin can stop mid-session, limits exposure if IC Token stolen.
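The borrowing terms above fit together as follows. This is a minimal sketch using the stated defaults ($10 portion, $1.00 threshold). The `ControlPanel` and `Runtime` classes are hypothetical, and two behaviours are assumptions rather than documented rules: unspent balance carries over to the next lease, and the Runtime also borrows when the balance cannot cover the next request.

```python
from decimal import Decimal


class ControlPanel:
    """Holds the total allocation and hands out portions (hypothetical API)."""

    def __init__(self, allocation: Decimal):
        self.unleased = allocation
        self.lease_count = 0

    def borrow(self, portion: Decimal):
        granted = min(portion, self.unleased)
        if granted <= 0:
            return None, Decimal("0")  # allocation exhausted
        self.unleased -= granted
        self.lease_count += 1
        return f"lease-{self.lease_count:03d}", granted


class Runtime:
    def __init__(self, panel, portion=Decimal("10"), threshold=Decimal("1.00")):
        self.panel = panel
        self.portion = portion
        self.threshold = threshold
        self.remaining = Decimal("0")
        self.lease_id = None

    def spend(self, cost: Decimal) -> bool:
        """Return True if the request may proceed, False if the budget blocks it."""
        # Borrow while below the threshold, or (assumption) while the
        # balance cannot cover this request.
        while self.remaining < self.threshold or self.remaining < cost:
            lease_id, granted = self.panel.borrow(self.portion)
            if lease_id is None:
                break
            self.lease_id = lease_id
            self.remaining += granted
        if self.remaining < cost:
            return False  # hard limit: the agent budget blocks the request
        self.remaining -= cost
        return True
```

With a $25 allocation and $4 requests, this sketch issues three leases ($10, $10, $5), allows six requests, and blocks the seventh with $1 left unspent.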
Pagination
Technique for splitting large result sets into pages using limit and offset query parameters. Standard across all list endpoints (e.g., GET /agents?limit=50&offset=100)
Rate Limiting
Request frequency control to prevent abuse. Configured per API token with max_requests_per_minute and max_requests_per_day limits. Enforced at API Gateway layer
API Gateway
Production service (gateway.ironcage.ai/v1) that routes runtime agent requests to appropriate providers, enforces budgets, and applies processing layers. Distinct from the general Gateway architectural concept
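The limit/offset scheme described above reduces to a slice over the full result set. The sketch below assumes a response envelope with `items`, `limit`, `offset`, and `total` keys; that shape and the `paginate` helper are illustrative and not taken from the actual API.

```python
from typing import Any, Dict, Sequence


def paginate(items: Sequence[Any], limit: int = 50, offset: int = 0) -> Dict[str, Any]:
    """Return one page of `items` plus the metadata a client needs to continue."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    page = list(items[offset:offset + limit])
    return {"items": page, "limit": limit, "offset": offset, "total": len(items)}


# Equivalent of GET /agents?limit=50&offset=100 over 120 hypothetical agents.
agents = [f"agent-{i}" for i in range(120)]
third_page = paginate(agents, limit=50, offset=100)
```

Returning `total` alongside the page lets a client tell when `offset + limit` has passed the end of the list without issuing an extra request.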