ADR-036: RuVector AGI Cognitive Container with Claude Code Orchestration

Status: Partially Implemented
Date: 2026-02-15 (updated 2026-02-17)
Decision owners: RuVector platform team, Claude Flow orchestration team, RVF runtime team
Depends on: ADR-029 (RVF Canonical Format), ADR-030 (Cognitive Container), ADR-033 (Progressive Indexing Hardening), ADR-034 (QR Cognitive Seed), ADR-035 (Capability Report), ADR-039 (RVF Solver WASM AGI Integration)
Affects: rvf-types/src/agi_container.rs, rvf-runtime, npm/packages/rvf-solver/, npm/packages/rvf/

Context

A state change into general intelligence can emerge when two conditions hold:

  1. Existential facilities -- a substrate that can persist identity, memory, constraints, health signals, and self-maintenance.
  2. Architectural organization -- a framework that can package the system, control execution, and enforce repeatability while enabling incremental self-reinforced feedback loops.

RuVector is the existential substrate. RVF is the organizational and packaging framework. Claude Code is the runtime orchestrator for planning and execution, using agent teams and tool connectivity via MCP.

The deliverable is a portable intelligence package that other teams can run and obtain the same graded outcomes, with replayable witness logs, policy controls, and deterministic environment capture.

Problem Statement

We need an architecture that can do all of the following in one system:

  1. Learn continuously from real-world event streams
  2. Maintain its own structural health and recover from corruption or drift
  3. Act through tools with governed authority
  4. Produce repeatable outcomes across machines and teams
  5. Package the full intelligence state so it can be shipped, audited, and replayed

Most LLM-centered architectures measure success by static accuracy. This thesis instead requires longitudinal coherence under mutation: the system must stay consistent while its memory and world model are continuously updated. This ADR defines that system boundary explicitly.

Decision Drivers

  1. Repeatable outcomes, not just plausible responses
  2. Long-horizon coherence under continuous updates
  3. Governance by default, including proof trails for actions
  4. Minimal reliance on hidden model internals for learning
  5. Portability across environments, including edge and offline modes
  6. Strong separation of control plane and data plane
  7. Tool-use reliability, batching, and reduced context pollution

Claude Code is chosen as orchestrator because it is designed to read codebases, edit files, run commands, manage workflows, and integrate with external systems via MCP, including multi-agent teams coordinated by a lead.

Programmatic tool calling is used as the preferred high-reliability tool orchestration strategy because it makes control flow explicit in code and reduces repeated model round-trips and context bloat.

Definitions

| Term | Definition |
|------|------------|
| RuVector substrate | Persistent world model combining vectors, graphs, constraints, and signals. Supports graph querying via Cypher. Includes self-learning and graph neural embedding updates, with dynamic minimum-cut as a coherence signal. |
| RVF framework | Cognitive container format that packages data, indexes, models, policies, and runtime into a single artifact. A single file that stores vectors and models, boots as a Linux microservice, accelerates queries using eBPF, branches at cluster granularity, and provides cryptographic witness chains. |
| Claude Code orchestrator | Agentic coding and task execution environment that runs in terminal, IDE, desktop, and web. Connects external tools via MCP. Coordinates agent teams. |
| Claude Flow | Multi-agent orchestration layer that turns Claude Code into a swarm-style coordinator with router, agents, shared memory, and learning loop. |
| Structural health | Measurable invariants indicating world model integrity: coherence gates, contradiction tracking, memory integrity, policy compliance, rollback readiness. |
| Witness chain | Cryptographic attestation trail linking each state change to inputs, decisions, and tool outputs. See ADR-035. |
| Same results | Identical graded outcomes and artifacts for a benchmark run, not necessarily identical intermediate tokens. Enforced through replay mode and verification mode. |

Considered Options

| # | Option | Verdict |
|---|--------|---------|
| 1 | LLM-only agent with prompt history and ad-hoc logs | Rejected: no structural health, no reversibility, no packaging |
| 2 | LLM + vector retrieval memory only | Rejected: no coherence gating, no witness chains, no portable replay |
| 3 | LLM + RuVector world model + RVF cognitive container, orchestrated by Claude Code and Claude Flow | Selected |

Rationale: Options 1 and 2 cannot meet the thesis because they lack explicit structural health machinery, reversible state transitions, and portable replayable intelligence packaging.

Decision

Build the AGI system as a closed-loop cognitive container:

  1. Claude Code is the control plane orchestrator. It spawns an agent team and coordinates planning and execution.
  2. Claude Flow provides the swarm orchestration model, routing tasks to specialized agents and managing shared memory and learning loop semantics.
  3. RuVector is the existential substrate, storing world model state, typed memory, constraints, and coherence signals, queryable via graph queries and vector search.
  4. RVF is the portable intelligence package format. It encapsulates the agent runtime, RuVector state snapshot and deltas, policies, indexes, tool adapters, and the evaluation harness so others can reproduce the same graded results.
  5. Learning occurs primarily by structured memory mutation and skill promotion governed by coherence and evaluation gates, not by continuous weight updates.

Architecture Overview

System Boundary

Inside the boundary:

  1. Claude Code lead session
  2. Claude Flow router and swarm manager
  3. Tool adapters and execution sandbox
  4. RuVector database cluster (or embedded instance)
  5. RVF container runtime and witness chain engine (ADR-035)
  6. Evaluation harness and graders

Outside the boundary:

  1. External data sources (repos, ticketing, logs, sensors)
  2. External model provider infrastructure
  3. Human approvals (if policy requires)

High-Level Data Flow

Event Ingestion ──> World Model Update Proposal
       │                      │
       │               Structural Health Gate
       │                      │
       │              ┌───────┴───────┐
       │              │  Gate PASS?   │
       │              │  yes    no    │
       │              │   │     │     │
       │              │  Commit Reject│
       │              │   │     │     │
       │              └───┴─────┘     │
       │                  │           │
       ▼                  ▼           ▼
 Plan & Act Loop    Reflection    Rollback &
 (Claude Code +     & Compress    Quarantine
  Claude Flow)          │
       │                │
       ▼                ▼
 Commit & Witness (RVF ADR-035)

  1. Event ingestion: Real-world events arrive and are normalized into a canonical event schema.
  2. World model update proposal: The system proposes graph mutations and memory writes in RuVector.
  3. Structural health gating: Coherence checks, contradiction checks, and policy checks determine if the proposal can be committed.
  4. Plan and act loop: Claude Code and Claude Flow coordinate tool calls to act in the environment, using programmatic tool calling patterns.
  5. Reflection and compression: Results are summarized into stable facts, procedures, and counterexamples.
  6. Commit and witness: Deltas are committed into RuVector and sealed into the RVF witness chain (ADR-035).

Control Plane / Data Plane Separation

| Aspect | Control Plane | Data Plane |
|--------|---------------|------------|
| Who | Claude Code + Claude Flow | RuVector + RVF |
| Does | Decides what to do; generates proposed deltas and tool actions | Executes storage, retrieval, graph ops, embeddings, coherence |
| Varies | Internal reasoning may vary between runs | Only gated commits become reality |
| Enforces | Plans and policies | Packaging, execution boundaries, attestations |

This separation is the core of repeatability.

Components and Responsibilities

Component A: Claude Code Lead Agent

| Inputs | Outputs |
|--------|---------|
| Task description | Plans |
| Current RVF container identity and policy | Tool calls |
| RuVector retrieval results | Proposed memory mutations |
| Tool outputs and environment observations | Commit requests |

Key capabilities: agent teams for parallel decomposition, MCP tool connectivity, project instruction loading for consistent behavior across runs.

Component B: Claude Flow Swarm Manager

| Inputs | Outputs |
|--------|---------|
| Lead agent goal graph | Sub-agent tasks |
| System policy limits | Consensus proposals |
| RuVector shared memory state | Aggregated plan; learning loop updates |

Architecture: router-to-swarm-to-agents with learning loop and shared memory.

Component C: RuVector Substrate

| Inputs | Outputs |
|--------|---------|
| Events, text, code, images, structured records | Retrieved memories and facts |
| Embeddings, graph mutation deltas | Graph query results (Cypher) |
| Health telemetry updates | Embedding/ranking updates (self-learning) |
| | Coherence signals (dynamic minimum-cut) |

Component D: RVF Cognitive Container Runtime

| Inputs | Outputs |
|--------|---------|
| Container manifest | Bootable runtime environment |
| Segmented data blobs | Reproducible execution environment |
| Policy and permissions | Signed witness records (ADR-035) |
| Cryptographic keys | Branchable snapshots |

Component E: Tool Execution Sandbox

| Inputs | Outputs |
|--------|---------|
| Tool call plans from Claude Code | Tool results as structured objects |
| Programmatic tool calling scripts | Tool receipts with hashes |
| Policy rules | Failure modes and retry classifications |

RuVector World Model Schema

Node Types

| # | Type | Purpose |
|---|------|---------|
| 1 | AgentIdentity | Stable identity, keys, role, authority limits |
| 2 | Event | Normalized external observation (timestamp, source, payload hash) |
| 3 | Claim | Statement that may be true or false, linked to evidence |
| 4 | Evidence | Pointer to tool output, document excerpt, test output, sensor observation |
| 5 | Plan | Goal tree, constraints, success criteria, expected cost |
| 6 | Action | Tool invocation request with preconditions and expected effect |
| 7 | Outcome | Observed effects, pass/fail, test results, diffs, side effects |
| 8 | Skill | Reusable procedure with applicability conditions, constraints, and tests |
| 9 | Policy | Rules for permissions and safety boundaries |
| 10 | HealthSignal | Coherence metrics, drift, contradiction density, memory integrity |

Edge Types

| Edge | Semantics |
|------|-----------|
| CAUSED | Event CAUSED Claim or Outcome |
| SUPPORTS | Evidence SUPPORTS Claim |
| CONTRADICTS | Claim CONTRADICTS Claim |
| DEPENDS_ON | Plan DEPENDS_ON Skill or Evidence |
| EXECUTES | Action EXECUTES Tool |
| PRODUCED | Action PRODUCED Outcome |
| PROMOTED_FROM | Skill PROMOTED_FROM repeated successful Plans |
| BLOCKED_BY | Action BLOCKED_BY Policy |
| HEALTH_OF | HealthSignal HEALTH_OF subsystem or memory region |

Invariants

| # | Invariant | Rule |
|---|-----------|------|
| 1 | Evidence binding | Any externally testable claim must have at least one Evidence edge; otherwise tagged unverified and cannot justify irreversible actions |
| 2 | Contradiction locality | A contradiction edge must reference the minimal conflicting claims, not a broad document blob |
| 3 | Action gating | Any action that changes external state must reference the policy decision node that allowed it |
| 4 | Replay completeness | Every tool output referenced by evidence must be hashable and stored or re-derivable from deterministic inputs |
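Invariants 1 and 3 lend themselves to a mechanical check before an action is committed. The following is a minimal sketch, not the RuVector schema: the `Claim`, `Action`, and `Violation` types, their field names, and the `u64` node id are all hypothetical and exist only to illustrate the rules.

```rust
/// Hypothetical, simplified Claim node (not the RuVector schema).
#[derive(Debug, Clone)]
pub struct Claim {
    pub externally_testable: bool,
    /// Number of SUPPORTS edges arriving from Evidence nodes.
    pub evidence_edges: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClaimStatus {
    EvidenceBound,
    Unverified,
    NotTestable,
}

/// Invariant 1: a testable claim without evidence is tagged unverified.
pub fn claim_status(claim: &Claim) -> ClaimStatus {
    if !claim.externally_testable {
        ClaimStatus::NotTestable
    } else if claim.evidence_edges >= 1 {
        ClaimStatus::EvidenceBound
    } else {
        ClaimStatus::Unverified
    }
}

/// Hypothetical, simplified Action node.
#[derive(Debug, Clone)]
pub struct Action {
    pub changes_external_state: bool,
    pub irreversible: bool,
    /// Id of the policy decision node that allowed the action, if any.
    pub policy_decision_node: Option<u64>,
    pub justified_by: Vec<Claim>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Violation {
    MissingPolicyDecision,
    UnverifiedJustification,
}

/// Checks invariant 3 (action gating) first, then invariant 1
/// (unverified claims cannot justify irreversible actions).
pub fn check_action(action: &Action) -> Result<(), Violation> {
    if action.changes_external_state && action.policy_decision_node.is_none() {
        return Err(Violation::MissingPolicyDecision);
    }
    let has_unverified = action
        .justified_by
        .iter()
        .any(|c| claim_status(c) == ClaimStatus::Unverified);
    if action.irreversible && has_unverified {
        return Err(Violation::UnverifiedJustification);
    }
    Ok(())
}
```

In this sketch an unverified claim may still support reversible actions; only the irreversible ones are refused, which matches the wording of invariant 1.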

Structural Health and Coherence Gate Design

This is the mechanism that operationalizes the state-change thesis. It turns continuous learning into safe incremental commits.

Health Signals

| # | Signal | Computation |
|---|--------|-------------|
| 1 | Coherence score | Dynamic minimum-cut on active working set subgraph. Measures separability between consistent clusters and contradiction boundaries. |
| 2 | Contradiction pressure | Rate of new contradiction edges per unit time, weighted by claim criticality |
| 3 | Memory integrity | Schema validation success, witness chain continuity, segment hash integrity |
| 4 | Tool reliability | Error rates, retries, timeouts, drift in tool schemas |
| 5 | Cost stability | Cost-per-solved-task trend, abnormal spikes |

Coherence Gate Rules

| Rule | Trigger | Action |
|------|---------|--------|
| 1. Block unsafe commits | Coherence score drops below threshold after proposed delta | Reject and open repair plan |
| 2. Require counterexample storage | An outcome fails | Counterexample must be created and linked before any new skill promotion |
| 3. Limit graph churn | Contradiction pressure exceeds threshold | Freeze new skill promotion; focus on repair and consolidation |
| 4. Quarantine volatile memories | New claims arrive | Enter volatile pool until reinforced by independent evidence or repeated success |
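Rules 2 and 4 can be illustrated with two small predicates. This is a sketch under stated assumptions: the ADR does not say how much independent evidence or how many repeated successes count as "reinforced", so both limits are parameters here, and every type and field name is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pool {
    /// Newly arrived claim, not yet trusted.
    Volatile,
    /// Reinforced claim.
    Stable,
}

/// Hypothetical bookkeeping for one claim.
#[derive(Debug, Clone, Copy)]
pub struct ClaimRecord {
    pub independent_evidence: u32,
    pub repeated_successes: u32,
}

/// Assumed reinforcement limits; the ADR does not fix these numbers.
#[derive(Debug, Clone, Copy)]
pub struct ReinforcementPolicy {
    pub min_independent_evidence: u32,
    pub min_repeated_successes: u32,
}

/// Rule 4: a claim stays volatile until either limit is reached.
pub fn pool_for(claim: &ClaimRecord, policy: &ReinforcementPolicy) -> Pool {
    let reinforced = claim.independent_evidence >= policy.min_independent_evidence
        || claim.repeated_successes >= policy.min_repeated_successes;
    if reinforced { Pool::Stable } else { Pool::Volatile }
}

/// Hypothetical record of a failed outcome.
#[derive(Debug, Clone, Copy)]
pub struct FailedOutcome {
    pub counterexample_linked: bool,
}

/// Rule 2: skill promotion is allowed only once every failed outcome
/// has a counterexample created and linked.
pub fn promotion_unblocked(failures: &[FailedOutcome]) -> bool {
    failures.iter().all(|f| f.counterexample_linked)
}
```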

Learning Loop Design

Learning Primitives

  1. Episodic capture: Store event, plan, action, outcome chain as an episode
  2. Reflection: Extract stable claims and failure causes, bind evidence
  3. Consolidation: Merge redundant claims, compress long traces into summaries plus pointers, maintain witness chain
  4. Skill promotion: Promote procedure into Skill node only when criteria met

Skill Promotion Criteria

A candidate becomes a skill when all of the following are true:

  1. It has succeeded K times on non-identical inputs
  2. It has at least one negative example recorded and bounded
  3. It has objective graders that validate outputs
  4. It does not increase policy violations or coherence degradation
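The four criteria above combine into a single conjunction. A minimal sketch follows, assuming a hypothetical `SkillCandidate` summary record; K is passed in as a parameter because this ADR does not fix its value.

```rust
/// Hypothetical summary of a candidate procedure's track record.
#[derive(Debug, Clone, Copy)]
pub struct SkillCandidate {
    /// Successes counted only on non-identical inputs.
    pub distinct_input_successes: u32,
    /// Negative examples that are recorded and bounded.
    pub bounded_negative_examples: u32,
    pub has_objective_graders: bool,
    /// Change in policy violations attributed to the candidate.
    pub policy_violation_delta: i32,
    /// Change in coherence score attributed to the candidate.
    pub coherence_delta: f32,
}

/// Returns true only when all four promotion criteria hold.
pub fn is_promotable(c: &SkillCandidate, k: u32) -> bool {
    c.distinct_input_successes >= k
        && c.bounded_negative_examples >= 1
        && c.has_objective_graders
        && c.policy_violation_delta <= 0
        && c.coherence_delta >= 0.0
}
```

Criterion 4 is modeled here as two deltas (no added policy violations, no drop in coherence); how those deltas are attributed to a single candidate is left open.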

Self-Reinforced Feedback Loops

A loop is self-reinforced when successful actions increase the system's future probability of selecting high-value plans, while structural health remains within bounds.

Mechanism:

  • Success produces evidence and updated skill priors
  • RuVector retrieval makes these skills easier to select
  • Coherence gates prevent runaway self-confirmation

Repeatability and Portable Intelligence Packaging

RVF Packaging Decision

One RVF artifact contains:

| Segment | Contents |
|---------|----------|
| Manifest and identity | Container ID, build ID, model routing config, policy version, tool adapter registry |
| Runtime | Claude Flow orchestrator config, agent role prompts, tool schemas, sandbox config |
| RuVector snapshot | Base world model graph, indexes, embeddings, skill library, policy nodes |
| Delta journal | Append-only commits with witness chain records (ADR-035) |
| Evaluation harness | Task suite, graders, scoring rules, replay scripts |

Two Execution Modes

| Mode | Goal | Method | Pass Condition |
|------|------|--------|----------------|
| Replay | Bit-identical artifact reproduction | No external tool calls; use stored receipts and outputs | All graders match exactly; witness chain matches |
| Verify | Same graded outcomes under live tools | Tools called live; outputs stored and hashed | Outputs pass same tests; costs within expected bounds |

This is how you claim "same results" without over-promising identical token sequences across different infrastructure.
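The difference between the two pass conditions can be sketched as two comparison functions. The `RunRecord` type and its fields are hypothetical; in particular, summarizing the witness chain as one 32-byte digest and the cost bound as one ceiling are simplifying assumptions.

```rust
/// Hypothetical summary of one benchmark run.
#[derive(Debug, Clone, PartialEq)]
pub struct RunRecord {
    /// Raw grader output per task.
    pub grader_outputs: Vec<String>,
    /// Digest standing in for the whole witness chain.
    pub witness_digest: [u8; 32],
    /// Pass/fail result per task.
    pub tests_passed: Vec<bool>,
    pub cost_usd: f64,
}

/// Replay: grader outputs and witness chain must match exactly.
pub fn replay_passes(reference: &RunRecord, replay: &RunRecord) -> bool {
    reference.grader_outputs == replay.grader_outputs
        && reference.witness_digest == replay.witness_digest
}

/// Verify: the same tests must pass and cost must stay within the bound.
/// Intermediate text and the witness digest are allowed to differ.
pub fn verify_passes(reference: &RunRecord, live: &RunRecord, max_cost_usd: f64) -> bool {
    reference.tests_passed == live.tests_passed && live.cost_usd <= max_cost_usd
}
```

A live run that words its grader output differently therefore fails replay but can still pass verify, which is the distinction the table draws.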

Determinism Controls

  1. Pin model ID to a specific version in the container manifest
  2. Set sampling parameters for maximum determinism in production runs (for example, the lowest temperature the provider allows)
  3. Store prompt and instruction hashes for each run
  4. Virtualize time for tasks that depend on timestamps
  5. Freeze external dependencies by snapshotting repos and data sources
  6. Record all tool outputs with hashes and schema versions

AGI npm Package Distribution

The AGI capabilities of the cognitive container are distributed as npm packages, enabling JavaScript/TypeScript consumers to access the self-learning engine, witness chains, and HNSW index operations without a Rust toolchain.

Package Ecosystem

| Package | Version | AGI Capabilities |
|---------|---------|------------------|
| @ruvector/rvf-solver | 0.1.0 | Thompson Sampling PolicyKernel, KnowledgeCompiler, three-loop adaptive solver, SHAKE-256 witness chains, 18 context buckets, speculative dual-path execution |
| @ruvector/rvf-node | 0.1.6 | HNSW index statistics, witness chain verification, store freeze (snapshot), distance metric introspection |
| @ruvector/rvf-wasm | 0.1.5 | Witness chain verification (rvf_witness_verify), WASM microkernel for browser/edge |
| @ruvector/rvf | 0.1.8 | Unified SDK re-exporting all of the above; single npm install for full AGI access |

Self-Learning Solver API

import { RvfSolver } from '@ruvector/rvf';

const solver = await RvfSolver.create();

// Three-loop training: fast (solve) / medium (policy) / slow (compiler)
const result = solver.train({ count: 1000, minDifficulty: 1, maxDifficulty: 10 });

// Full acceptance test with A/B/C ablation modes
const manifest = solver.acceptance({ cycles: 5, holdoutSize: 100 });

// Inspect learned policy state
const policy = solver.policy();

// Export tamper-evident witness chain (73 bytes per entry)
const chain = solver.witnessChain();

solver.destroy();

AGI NAPI Methods

The native Node.js bindings expose AGI-relevant operations:

| Method | Returns | AGI Purpose |
|--------|---------|-------------|
| indexStats() | RvfIndexStats | Introspect HNSW graph structure (layers, M, ef_construction) for coherence monitoring |
| verifyWitness() | RvfWitnessResult | Validate witness chain integrity for replay/verify modes |
| freeze() | void | Snapshot-freeze state for deterministic branching |
| metric() | string | Distance metric introspection for coherence signal computation |

Integration with Cognitive Container

The npm packages map to cognitive container components:

| Container Component | npm Package | Segment |
|---------------------|-------------|---------|
| Self-learning engine | @ruvector/rvf-solver | SOLVER_SEG (computed in WASM) |
| Witness chain attestation | @ruvector/rvf-solver + @ruvector/rvf-wasm | WITNESS_SEG (0x0A) |
| Vector storage & retrieval | @ruvector/rvf-node | VEC_SEG, INDEX_SEG |
| HNSW index inspection | @ruvector/rvf-node | INDEX_SEG |
| Browser-side verification | @ruvector/rvf-wasm | WITNESS_SEG verification |

MCP Tools

Core MCP tools to implement:

| Tool | Purpose |
|------|---------|
| ruvector_query | Vector search and filtered retrieval |
| ruvector_cypher | Graph query and traversal for claims, evidence, contradictions |
| ruvector_commit_delta | Propose and commit world model deltas behind coherence gates |
| rvf_snapshot | Create a branchable snapshot for experiments |
| rvf_witness_export | Export witness chain proofs for audit (ADR-035) |
| rvf_solver_train | Run self-learning solver training via @ruvector/rvf-solver |
| rvf_solver_acceptance | Execute full A/B/C ablation acceptance test |
| eval_run | Run the container's benchmark suite and return graded results |

Security Model

Threat Model

  1. Prompt injection via untrusted content
  2. Tool abuse and unintended side effects
  3. Data exfiltration via tool channels
  4. Memory poisoning causing long-horizon drift
  5. Supply chain drift causing irreproducible results

Controls

| # | Control | Mechanism |
|---|---------|-----------|
| 1 | Capability-based permissions | Each tool call requires explicit capability grants; high-risk actions require approvals |
| 2 | Policy as data | Policies live in RuVector and are embedded in RVF manifest; policy cannot silently change between runs |
| 3 | Witnessed commits | Every commit is attested with inputs, policy decision, and tool receipts (ADR-035) |
| 4 | Quarantine zone | Untrusted inputs enter quarantine; cannot directly affect skill promotion |
| 5 | Sandboxed execution | Tool scripts run in restricted environments; programmatic tool calling makes control flow explicit |

Observability and Benchmarking

Required Metrics

  1. Success rate on task suite
  2. Policy violations count
  3. External side effects count
  4. Contradiction rate
  5. Coherence score trend
  6. Rollback frequency and success
  7. Dollars per solved task
  8. p50 and p95 latency per task
  9. Tool error rate

Benchmark Tiers

| Tier | Name | Purpose |
|------|------|---------|
| 1 | Deterministic replay suite | Verifies packaging and witness integrity |
| 2 | Tool and memory suite | Measures long-horizon stability and coherence gating |
| 3 | Production domain suite | Measures real outcomes (repo issue fixes, compliance, deployments) |

Proof Artifact per Run

Each run exports:

  1. Run manifest
  2. Task inputs and snapshots
  3. All tool receipts and hashes
  4. All committed deltas
  5. Witness chain export (ADR-035)
  6. Grader outputs and final scorecard

Consequences

Positive

  1. Clear system boundary for intelligence measurement -- the composite system is evaluated, not the model in isolation
  2. Repeatability as a product feature -- RVF container + witness chain + replay mode enables credible external validation
  3. Safety is structural -- policies and coherence gates are part of the substrate, not an afterthought
  4. Multi-agent scalability -- Claude Code agent teams + Claude Flow swarm routing supports parallel work and specialization

Negative / Risks

  1. Complexity risk -- system of systems; requires investment in harnesses and invariants early
  2. Non-determinism risk from model providers -- replay mode mitigates by recording outputs
  3. Memory poisoning risk -- powerful memory can amplify wrong beliefs if coherence gates are weak; bias toward evidence binding and counterexample capture
  4. Benchmark gaming risk -- weak graders will be exploited; build robust graders first

Implementation Plan

Phase 1: Foundation

Deliverables:

  1. RuVector schema and APIs for events, claims, evidence, contradictions
  2. RVF container manifest format for model, policy, tool registry, snapshots
  3. MCP server exposing RuVector and RVF operations to Claude Code
  4. Basic witness log and delta commit pipeline (ADR-035 -- done)

Exit criteria: Replay mode works on a small deterministic suite.

Phase 2: Coherence Gating

Deliverables:

  1. Structural health signals and thresholds
  2. Dynamic minimum-cut coherence metric integration
  3. Rollback and quarantine semantics
  4. Contradiction detection routines

Exit criteria: No irreversible external tool calls allowed when coherence is below threshold.

Phase 3: Learning and Skill Promotion

Deliverables:

  1. Skill nodes, promotion criteria, and tests
  2. Consolidation and compaction routines
  3. Counterexample-driven repair

Exit criteria: Skills improve success rate over time without increasing contradictions.

Phase 4: Portable Intelligence Distribution

Deliverables:

  1. One-RVF-file distribution pipeline
  2. Public evaluation harness packaged inside RVF
  3. Verification mode that produces same graded outcomes across machines

Exit criteria: Two independent teams run the same RVF artifact and achieve the same benchmark scorecard.

Resolved Design Questions

Q1: First domain for proving the state-change thesis

Decision: Repo automation (software engineering lifecycle).

Rationale: This domain provides the strongest combination of (a) verifiable outcomes (tests pass, code compiles, PR merges), (b) tool-rich environment (git, CI, code editors via Claude Code), (c) naturally occurring event streams (issues, commits, reviews), and (d) existing infrastructure in Claude Code + Claude Flow. The evaluation harness measures: issues solved, test success rate, regression introduction rate, cost per solved issue, and witness chain completeness.

Subsequent domains (incident triage, governance workflows, edge autonomy) are pursued after the repo automation scorecard achieves >= 60/100 solved with zero policy violations.

Q2: Authority levels

Decision: Four-level authority model, default ReadOnly.

#[repr(u8)]
pub enum AuthorityLevel {
    /// Read-only: query vectors, graphs, memories. No mutations.
    ReadOnly = 0,
    /// Write to internal memory: commit world model deltas behind
    /// coherence gates. No external tool calls.
    WriteMemory = 1,
    /// Execute tools: run sandboxed tools (file read/write, tests,
    /// code generation). External side effects are gated by policy.
    ExecuteTools = 2,
    /// Write external: push code, create PRs, send messages, modify
    /// infrastructure. Requires explicit policy grant per action class.
    WriteExternal = 3,
}

Each action in the world model must reference a policy decision node (invariant #3, "Action gating") that grants at least the required authority level. The container manifest declares the maximum authority level permitted for a given execution. Higher levels require explicit policy override.

Default for Replay mode: ReadOnly. Default for Verify mode: ExecuteTools. Default for Live mode: WriteMemory (escalation to higher levels requires policy grant per action class).
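The ordering and the per-mode defaults can be sketched in a few lines. This is an illustration of the semantics described here, not a copy of rvf-types/src/agi_container.rs, whose exact signatures may differ. The discriminants and the mode-to-level mapping come from this ADR; using `u8` as the `TryFrom` error type is an assumption.

```rust
use std::convert::TryFrom;

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AuthorityLevel {
    ReadOnly = 0,
    WriteMemory = 1,
    ExecuteTools = 2,
    WriteExternal = 3,
}

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Replay = 0,
    Verify = 1,
    Live = 2,
}

impl AuthorityLevel {
    /// A level permits itself and every lower level.
    pub fn permits(self, required: AuthorityLevel) -> bool {
        self >= required
    }

    /// Per-mode defaults as stated in this ADR.
    pub fn default_for_mode(mode: ExecutionMode) -> AuthorityLevel {
        match mode {
            ExecutionMode::Replay => AuthorityLevel::ReadOnly,
            ExecutionMode::Verify => AuthorityLevel::ExecuteTools,
            ExecutionMode::Live => AuthorityLevel::WriteMemory,
        }
    }
}

impl TryFrom<u8> for AuthorityLevel {
    /// Assumption: the rejected byte is returned as the error.
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(AuthorityLevel::ReadOnly),
            1 => Ok(AuthorityLevel::WriteMemory),
            2 => Ok(AuthorityLevel::ExecuteTools),
            3 => Ok(AuthorityLevel::WriteExternal),
            other => Err(other),
        }
    }
}
```

Deriving `Ord` on a fieldless enum orders variants by declaration, so `permits` reduces to a single comparison.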

Q3: Resource budgets

Decision: Per-task resource budgets with hard caps.

Every task execution is bounded by:

| Resource | Default Cap | Override |
|----------|-------------|----------|
| Wall-clock time per task | 300 seconds | Policy override, max 3600s |
| Total model tokens per task | 200,000 | Policy override, max 1,000,000 |
| Total cost per task | $1.00 | Policy override, max $10.00 |
| Tool calls per task | 50 | Policy override, max 500 |
| External write actions per task | 0 (ReadOnly) | Requires WriteExternal authority |

Budget exhaustion triggers graceful degradation: the task ends with a Skipped outcome, and a BudgetExhausted postmortem is recorded in the witness bundle.
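A sketch of how the default caps and hard ceilings in the table might be represented and clamped. The field names, the choice of microdollars for cost, and the `Usage` type are assumptions made for this example; the table states no ceiling for external writes, so that field is left unclamped. The real `ResourceBudget` in rvf-types may be laid out differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ResourceBudget {
    pub max_time_secs: u32,
    pub max_tokens: u32,
    /// Cost in millionths of a dollar (assumed unit).
    pub max_cost_microdollars: u32,
    pub max_tool_calls: u16,
    pub max_external_writes: u16,
}

impl ResourceBudget {
    /// Default caps from the table above.
    pub const DEFAULT: Self = Self {
        max_time_secs: 300,
        max_tokens: 200_000,
        max_cost_microdollars: 1_000_000,
        max_tool_calls: 50,
        max_external_writes: 0,
    };

    /// Hard ceilings from the Override column. No ceiling is stated for
    /// external writes, so u16::MAX is a placeholder.
    pub const MAX: Self = Self {
        max_time_secs: 3600,
        max_tokens: 1_000_000,
        max_cost_microdollars: 10_000_000,
        max_tool_calls: 500,
        max_external_writes: u16::MAX,
    };

    /// Returns a copy with every field limited to the hard ceiling.
    pub fn clamped(self) -> Self {
        Self {
            max_time_secs: self.max_time_secs.min(Self::MAX.max_time_secs),
            max_tokens: self.max_tokens.min(Self::MAX.max_tokens),
            max_cost_microdollars: self.max_cost_microdollars.min(Self::MAX.max_cost_microdollars),
            max_tool_calls: self.max_tool_calls.min(Self::MAX.max_tool_calls),
            max_external_writes: self.max_external_writes.min(Self::MAX.max_external_writes),
        }
    }

    /// Name of the first resource whose usage exceeds its cap, if any.
    pub fn first_exhausted(&self, used: &Usage) -> Option<&'static str> {
        if used.time_secs > self.max_time_secs {
            Some("time")
        } else if used.tokens > self.max_tokens {
            Some("tokens")
        } else if used.cost_microdollars > self.max_cost_microdollars {
            Some("cost")
        } else if used.tool_calls > self.max_tool_calls {
            Some("tool_calls")
        } else {
            None
        }
    }
}

/// Hypothetical running totals for one task.
#[derive(Debug, Clone, Copy, Default)]
pub struct Usage {
    pub time_secs: u32,
    pub tokens: u32,
    pub cost_microdollars: u32,
    pub tool_calls: u16,
}
```

A caller would typically clamp a policy-supplied budget once, then call `first_exhausted` after each step and end the task as Skipped when it returns a value.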

Q4: Coherence thresholds

Decision: Three configurable thresholds stored in the container header.

| Threshold | Default | Effect when breached |
|-----------|---------|----------------------|
| min_coherence_score | 0.70 | Block all commits; enter repair mode |
| max_contradiction_rate | 5.0 per 100 events | Freeze skill promotion |
| max_rollback_ratio | 0.20 | Halt Live execution; require human review |

These map to ADR-033's quality framework: the coherence score is analogous to ResponseQuality -- it signals whether the system's internal state is trustworthy enough to act on.
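The thresholds and their effects can be sketched as a small struct with a breach check. The default values come from the table above; the valid ranges used in `validate()` (scores and ratios in [0, 1], non-negative rate) are assumptions, as are the `HealthSnapshot` and `GateEffect` types.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CoherenceThresholds {
    pub min_coherence_score: f32,
    /// Contradictions per 100 events.
    pub max_contradiction_rate: f32,
    pub max_rollback_ratio: f32,
}

impl CoherenceThresholds {
    pub const DEFAULT: Self = Self {
        min_coherence_score: 0.70,
        max_contradiction_rate: 5.0,
        max_rollback_ratio: 0.20,
    };

    /// Rejects out-of-range values. The ranges are assumed; NaN fails
    /// every check because all comparisons with NaN are false.
    pub fn validate(&self) -> Result<(), &'static str> {
        if !(0.0f32..=1.0f32).contains(&self.min_coherence_score) {
            return Err("min_coherence_score must be in [0, 1]");
        }
        if !(self.max_contradiction_rate >= 0.0) {
            return Err("max_contradiction_rate must be non-negative");
        }
        if !(0.0f32..=1.0f32).contains(&self.max_rollback_ratio) {
            return Err("max_rollback_ratio must be in [0, 1]");
        }
        Ok(())
    }
}

/// Hypothetical measured health values for the current window.
#[derive(Debug, Clone, Copy)]
pub struct HealthSnapshot {
    pub coherence_score: f32,
    pub contradictions_per_100_events: f32,
    pub rollback_ratio: f32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateEffect {
    BlockCommits,
    FreezeSkillPromotion,
    HaltLive,
}

/// Lists the effects triggered by the snapshot, in table order.
pub fn breached(t: &CoherenceThresholds, h: &HealthSnapshot) -> Vec<GateEffect> {
    let mut effects = Vec::new();
    if h.coherence_score < t.min_coherence_score {
        effects.push(GateEffect::BlockCommits);
    }
    if h.contradictions_per_100_events > t.max_contradiction_rate {
        effects.push(GateEffect::FreezeSkillPromotion);
    }
    if h.rollback_ratio > t.max_rollback_ratio {
        effects.push(GateEffect::HaltLive);
    }
    effects
}
```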

Wire Format

AgiContainerHeader (64 bytes, repr(C))

The AGI container is stored as a Meta segment (SegmentType::Meta = 0x07) in the RVF file, alongside the KERNEL_SEG, WASM_SEG, VEC_SEG, INDEX_SEG, WITNESS_SEG, and CRYPTO_SEG that hold the actual payload data.

Offset  Type        Field               Description
------  ----        -----               -----------
0x00    u32         magic               0x52564147 ("RVAG")
0x04    u16         version             Header format version (currently 1)
0x06    u16         flags               Bitfield (see below)
0x08    [u8; 16]    container_id        Unique container UUID
0x18    [u8; 16]    build_id            Build UUID (changes on repackaging)
0x28    u64         created_ns          Creation timestamp (nanos since epoch)
0x30    [u8; 8]     model_id_hash       SHA-256 of pinned model ID, truncated
0x38    [u8; 8]     policy_hash         SHA-256 of governance policy, truncated
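The layout above maps onto a fixed 64-byte encoding. The sketch below writes each field at its listed offset. Byte order is not stated in this section, so little-endian is an assumption here (under it, the magic appears on disk as the bytes 47 41 56 52). The real `to_bytes()`/`from_bytes()` in rvf-types may differ in signature and error type.

```rust
pub const AGI_MAGIC: u32 = 0x5256_4147; // "RVAG"
pub const AGI_HEADER_SIZE: usize = 64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AgiContainerHeader {
    pub magic: u32,
    pub version: u16,
    pub flags: u16,
    pub container_id: [u8; 16],
    pub build_id: [u8; 16],
    pub created_ns: u64,
    pub model_id_hash: [u8; 8],
    pub policy_hash: [u8; 8],
}

impl AgiContainerHeader {
    /// Serializes the header field by field (assumed little-endian).
    pub fn to_bytes(&self) -> [u8; AGI_HEADER_SIZE] {
        let mut out = [0u8; AGI_HEADER_SIZE];
        out[0x00..0x04].copy_from_slice(&self.magic.to_le_bytes());
        out[0x04..0x06].copy_from_slice(&self.version.to_le_bytes());
        out[0x06..0x08].copy_from_slice(&self.flags.to_le_bytes());
        out[0x08..0x18].copy_from_slice(&self.container_id);
        out[0x18..0x28].copy_from_slice(&self.build_id);
        out[0x28..0x30].copy_from_slice(&self.created_ns.to_le_bytes());
        out[0x30..0x38].copy_from_slice(&self.model_id_hash);
        out[0x38..0x40].copy_from_slice(&self.policy_hash);
        out
    }

    /// Parses a header; returns None on short input or wrong magic.
    pub fn from_bytes(b: &[u8]) -> Option<Self> {
        if b.len() < AGI_HEADER_SIZE {
            return None;
        }
        let magic = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        if magic != AGI_MAGIC {
            return None;
        }
        let mut container_id = [0u8; 16];
        container_id.copy_from_slice(&b[0x08..0x18]);
        let mut build_id = [0u8; 16];
        build_id.copy_from_slice(&b[0x18..0x28]);
        let mut created = [0u8; 8];
        created.copy_from_slice(&b[0x28..0x30]);
        let mut model_id_hash = [0u8; 8];
        model_id_hash.copy_from_slice(&b[0x30..0x38]);
        let mut policy_hash = [0u8; 8];
        policy_hash.copy_from_slice(&b[0x38..0x40]);
        Some(Self {
            magic,
            version: u16::from_le_bytes([b[4], b[5]]),
            flags: u16::from_le_bytes([b[6], b[7]]),
            container_id,
            build_id,
            created_ns: u64::from_le_bytes(created),
            model_id_hash,
            policy_hash,
        })
    }
}
```

Writing fields explicitly rather than transmuting the `repr(C)` struct keeps the encoding independent of host endianness and padding.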

Flags (u16 bitfield)

Bit   Name                    Description
---   ----                    -----------
0     AGI_HAS_KERNEL          KERNEL_SEG with micro Linux kernel present
1     AGI_HAS_WASM            WASM_SEG modules present
2     AGI_HAS_ORCHESTRATOR    Claude Code + Claude Flow config present
3     AGI_HAS_WORLD_MODEL     VEC_SEG + INDEX_SEG world model data present
4     AGI_HAS_EVAL            Evaluation harness (tasks + graders) present
5     AGI_HAS_SKILLS          Promoted skill library present
6     AGI_HAS_WITNESS         ADR-035 witness chain present
7     AGI_SIGNED              Container is cryptographically signed
8     AGI_REPLAY_CAPABLE      All tool outputs stored; supports replay mode
9     AGI_OFFLINE_CAPABLE     Container can run without network access
10    AGI_HAS_TOOLS           MCP tool adapter registry present
11    AGI_HAS_COHERENCE_GATES Coherence gate configuration present
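The bit positions above translate directly into mask constants. In the sketch below the constant names and bit numbers are taken from the table, while `SegmentPresence` is a hypothetical, reduced stand-in for the real `ContainerSegments` tracker and covers only five of the flags.

```rust
pub const AGI_HAS_KERNEL: u16 = 1 << 0;
pub const AGI_HAS_WASM: u16 = 1 << 1;
pub const AGI_HAS_ORCHESTRATOR: u16 = 1 << 2;
pub const AGI_HAS_WORLD_MODEL: u16 = 1 << 3;
pub const AGI_HAS_EVAL: u16 = 1 << 4;
pub const AGI_HAS_SKILLS: u16 = 1 << 5;
pub const AGI_HAS_WITNESS: u16 = 1 << 6;
pub const AGI_SIGNED: u16 = 1 << 7;
pub const AGI_REPLAY_CAPABLE: u16 = 1 << 8;
pub const AGI_OFFLINE_CAPABLE: u16 = 1 << 9;
pub const AGI_HAS_TOOLS: u16 = 1 << 10;
pub const AGI_HAS_COHERENCE_GATES: u16 = 1 << 11;

/// Hypothetical presence tracker covering a subset of the flags.
#[derive(Debug, Clone, Copy, Default)]
pub struct SegmentPresence {
    pub kernel: bool,
    pub wasm: bool,
    pub world_model: bool,
    pub eval: bool,
    pub witness: bool,
}

impl SegmentPresence {
    /// Computes the header bitfield from the segments that are present.
    pub fn to_flags(&self) -> u16 {
        let mut flags = 0u16;
        if self.kernel {
            flags |= AGI_HAS_KERNEL;
        }
        if self.wasm {
            flags |= AGI_HAS_WASM;
        }
        if self.world_model {
            flags |= AGI_HAS_WORLD_MODEL;
        }
        if self.eval {
            flags |= AGI_HAS_EVAL;
        }
        if self.witness {
            flags |= AGI_HAS_WITNESS;
        }
        flags
    }
}

/// Tests whether a single flag bit is set.
pub fn has_flag(flags: u16, bit: u16) -> bool {
    (flags & bit) != 0
}
```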

TLV Manifest Tags

Following the header, a TLV (tag-length-value) manifest contains the container's configuration sections:

| Tag | Name | Content |
|-----|------|---------|
| 0x0100 | CONTAINER_ID | Container UUID |
| 0x0101 | BUILD_ID | Build UUID |
| 0x0102 | MODEL_ID | Pinned model identifier (UTF-8) |
| 0x0103 | POLICY | Serialized governance policy |
| 0x0104 | ORCHESTRATOR | Claude Code + Claude Flow config |
| 0x0105 | TOOL_REGISTRY | MCP tool adapter registry |
| 0x0106 | AGENT_PROMPTS | Agent role prompts |
| 0x0107 | EVAL_TASKS | Evaluation task suite |
| 0x0108 | EVAL_GRADERS | Grading rules |
| 0x0109 | SKILL_LIBRARY | Promoted skill library |
| 0x010A | REPLAY_SCRIPT | Replay automation script |
| 0x010B | KERNEL_CONFIG | Kernel boot parameters |
| 0x010C | NETWORK_CONFIG | Network configuration |
| 0x010D | COHERENCE_CONFIG | Coherence gate thresholds and rules |
| 0x010E | PROJECT_INSTRUCTIONS | Claude.md project instructions |
| 0x010F | DEPENDENCY_SNAPSHOT | Dependency snapshot hashes |
| 0x0110 | AUTHORITY_CONFIG | Authority level and resource budgets |
| 0x0111 | DOMAIN_PROFILE | Target domain profile (RVText, etc.) |

Unknown tags are ignored (forward-compatible).
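A reader that skips unknown tags might look like the sketch below. The tag values come from the table; the record framing is not specified in this section, so a 2-byte tag followed by a 4-byte length, both little-endian, is an assumption, and `ManifestView`, `encode_tlv`, `parse_tlv`, and `read_manifest` are hypothetical names.

```rust
pub const AGI_TAG_MODEL_ID: u16 = 0x0102;
pub const AGI_TAG_POLICY: u16 = 0x0103;

/// Encodes one record as tag (u16 LE), length (u32 LE), value.
/// The framing widths are assumed, not taken from the RVF spec.
pub fn encode_tlv(tag: u16, value: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(6 + value.len());
    out.extend_from_slice(&tag.to_le_bytes());
    out.extend_from_slice(&(value.len() as u32).to_le_bytes());
    out.extend_from_slice(value);
    out
}

/// Splits a buffer into (tag, value) records without interpreting them.
pub fn parse_tlv(mut buf: &[u8]) -> Result<Vec<(u16, &[u8])>, &'static str> {
    let mut records = Vec::new();
    while !buf.is_empty() {
        if buf.len() < 6 {
            return Err("truncated TLV header");
        }
        let tag = u16::from_le_bytes([buf[0], buf[1]]);
        let len = u32::from_le_bytes([buf[2], buf[3], buf[4], buf[5]]) as usize;
        let rest = &buf[6..];
        if rest.len() < len {
            return Err("truncated TLV value");
        }
        records.push((tag, &rest[..len]));
        buf = &rest[len..];
    }
    Ok(records)
}

/// Hypothetical view over two of the manifest sections.
#[derive(Debug, Default, PartialEq)]
pub struct ManifestView {
    pub model_id: Option<String>,
    pub policy: Option<Vec<u8>>,
}

/// Reads known tags and silently skips unknown ones.
pub fn read_manifest(buf: &[u8]) -> Result<ManifestView, &'static str> {
    let mut view = ManifestView::default();
    for (tag, value) in parse_tlv(buf)? {
        match tag {
            AGI_TAG_MODEL_ID => {
                let id = String::from_utf8(value.to_vec())
                    .map_err(|_| "MODEL_ID is not valid UTF-8")?;
                view.model_id = Some(id);
            }
            AGI_TAG_POLICY => view.policy = Some(value.to_vec()),
            _ => {} // unknown tag: ignored for forward compatibility
        }
    }
    Ok(view)
}
```

Because each record carries its own length, a reader can step over a tag it does not recognize without knowing anything about its payload, which is what makes the manifest forward-compatible.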

Implementation

Types are fully implemented in rvf-types/src/agi_container.rs (972 lines, 24 tests).

Implemented types:

| Type | Size / Kind | Description | Tests |
|------|-------------|-------------|-------|
| AgiContainerHeader | 64 bytes (repr(C)) | Wire-format header with magic "RVAG" (0x52564147), to_bytes()/from_bytes() serialization, compile-time size assertion | 4 |
| ExecutionMode | u8 enum | Replay (0), Verify (1), Live (2) with TryFrom<u8> | 1 |
| AuthorityLevel | u8 enum | ReadOnly (0), WriteMemory (1), ExecuteTools (2), WriteExternal (3) with TryFrom<u8>, PartialOrd/Ord, permits(), default_for_mode() | 4 |
| ResourceBudget | struct | Per-task resource caps with DEFAULT, EXTENDED, MAX presets and clamped() method | 3 |
| CoherenceThresholds | struct | Three configurable thresholds (min_coherence_score, max_contradiction_rate, max_rollback_ratio) with DEFAULT, STRICT presets and validate() method | 5 |
| ContainerSegments | struct | Segment presence tracker with validate(mode) and to_flags() | 7 |
| ContainerError | enum | 6 variants: MissingSegment, TooLarge, InvalidConfig, SignatureInvalid, InsufficientAuthority, BudgetExhausted with Display | 1 |

Constants defined:

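  • 13 flag constants (AGI_HAS_KERNEL through AGI_HAS_DOMAIN_EXPANSION, bits 0-12): the 12 flags tabulated under "Flags (u16 bitfield)" plus AGI_HAS_DOMAIN_EXPANSION at bit 12
  • 22 TLV manifest tag constants (AGI_TAG_CONTAINER_ID 0x0100 through AGI_TAG_COUNTEREXAMPLES 0x0115): the 18 tags tabulated under "TLV Manifest Tags" plus 4 domain expansion tags
  • The 4 domain expansion tags are AGI_TAG_TRANSFER_PRIOR (0x0112), AGI_TAG_POLICY_KERNEL (0x0113), AGI_TAG_COST_CURVE (0x0114), and AGI_TAG_COUNTEREXAMPLES (0x0115)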

Key design properties:

  • AuthorityLevel::permits() enables level comparison: WriteExternal permits all lower levels
  • AuthorityLevel::default_for_mode() maps Replay->ReadOnly, Verify->ExecuteTools, Live->WriteMemory
  • ResourceBudget::clamped() enforces hard ceilings (MAX preset) that cannot be overridden
  • CoherenceThresholds::validate() rejects out-of-range values
  • ContainerSegments::validate(mode) enforces mode-specific segment requirements
  • ContainerSegments::to_flags() computes the bitfield from present segments
  • All types are no_std compatible and exported from rvf-types/src/lib.rs

Acceptance Test

Run the same RVF artifact on two separate machines owned by two separate teams.

Suite: 100 tasks (30 requiring tool use, 70 internal reasoning/memory)

Pass criteria:

  1. Replay mode produces identical grader outputs for all 100 tasks
  2. Verify mode produces at least 95/100 passing on both machines
  3. Zero policy violations
  4. Every externally checkable claim has evidence pointers
  5. Witness chain verifies end-to-end

References

  • ADR-029: RVF Canonical Format (segment model, wire format, manifest)
  • ADR-030: Cognitive Container (KERNEL_SEG, EBPF_SEG, three-tier execution)
  • ADR-031: RVCOW Branching (COW branching, KernelBinding)
  • ADR-033: Progressive Indexing Hardening (quality framework, coherence gates, safety budgets)
  • ADR-034: QR Cognitive Seed (portable bootstrap, zero-dep crypto)
  • ADR-035: Capability Report (witness bundles, scorecards, governance)
  • RVF format specification (rvf-types, rvf-runtime, rvf-manifest)
  • RFC 8032: Ed25519
  • FIPS 180-4: SHA-256
  • Dynamic minimum-cut (arXiv preprint referenced in RuVector mincut crate)

Revision History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-02-15 | ruv.io | Initial proposal |
| 1.1 | 2026-02-15 | architecture review | Resolved open questions (domain, authority, resource budgets, coherence thresholds). Added wire format section. Added cross-references to ADR-029/030/031/033. Added AuthorityLevel enum and resource budget types. Tightened ContainerSegments validation. |
| 1.2 | 2026-02-16 | implementation review | Status updated to Partially Implemented. Documented full wire-format implementation in rvf-types/src/agi_container.rs (972 lines, 24 tests). All header types, enums, constants, and validators are implemented and exported. Domain expansion TLV tags (0x0112-0x0115) integrated. |