Sentinel: CodingAgent + Demo Pipeline E2E (Claude Code → LoRA training) #277

Merged
joelteply merged 4 commits into main from feature/sentinel-claude-code on Mar 1, 2026

@joelteply joelteply commented Feb 28, 2026

Summary

Establishes sentinel-driven AI development: external coding agents (Claude Code) build real software inside pipelines, and the captured interactions are used to LoRA-train local AI personas. Proven end-to-end on a 3-milestone Task Tracker project.

What This PR Adds

1. CodingAgent Step Type (new sentinel capability)

  • CodingAgentProvider interface with dynamic registry
  • ClaudeCodeProvider wrapping @anthropic-ai/claude-agent-sdk
  • Rust CodingAgent step type routing to TypeScript via execute_ts_json()
  • sentinel/coding-agent command with training data capture (user→assistant pairs → LoRA pipeline)
  • OAuth auth by default (strips ANTHROPIC_API_KEY, uses claude login Max subscription)
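To illustrate the "dynamic registry" idea from the list above, here is a minimal TypeScript sketch. The type names match those mentioned in this PR, but every field, method signature, and the `echo` provider are assumptions made for illustration; the PR text does not show the real definitions.

```typescript
// Hypothetical shapes: the PR names these types but does not show their fields.
interface CodingAgentConfig {
  prompt: string;
  workingDir?: string;
}

interface CodingAgentResult {
  success: boolean;
  output: string;
}

interface CodingAgentProvider {
  readonly name: string;
  execute(config: CodingAgentConfig): Promise<CodingAgentResult>;
}

// Map-backed registry: adding a provider needs no switch statement or enum change.
class CodingAgentRegistry {
  private static readonly providers = new Map<string, CodingAgentProvider>();

  static register(provider: CodingAgentProvider): void {
    this.providers.set(provider.name, provider);
  }

  static resolve(name: string): CodingAgentProvider {
    const provider = this.providers.get(name);
    if (!provider) {
      const known = Array.from(this.providers.keys()).join(', ') || '(none)';
      throw new Error(`Unknown coding agent provider "${name}". Registered: ${known}`);
    }
    return provider;
  }
}

// A stand-in provider that echoes the prompt; a real one would call an agent SDK.
const echoProvider: CodingAgentProvider = {
  name: 'echo',
  async execute(config) {
    return { success: true, output: `echo: ${config.prompt}` };
  },
};

// Self-registration: a provider module would run this line when it is imported.
CodingAgentRegistry.register(echoProvider);
```

With this shape, lookup only succeeds if the module containing the `register` call has actually been loaded before `resolve` runs.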

2. Demo Pipeline System (E2E proven)

  • DemoPipeline builder: project spec → sentinel pipeline (setup → milestone loop → training → emit)
  • genome/training-export command: bridges in-memory TrainingDataAccumulator → JSONL on disk → genome/train
  • genome/demo-run command: entry point (./jtag genome/demo-run --project=task-tracker --personaId=<uuid>)
  • Task Tracker project: 3 milestones (server+schema, CRUD, filter+paginate), 14+ deterministic tests
  • Shell allowFailure flag: test runners don't kill loops, exit codes flow to conditions
  • Interpolation fix: {{steps.N}} searches by step_index not array position (loop sub-steps shift shared results)
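The interpolation fix in the last bullet can be sketched as follows. The real engine is written in Rust and is not shown here, so this TypeScript version is only an illustration: the `StepResult` shape, the `interpolateSteps` helper, and the way loop iterations are tagged with the loop's own index are all assumptions.

```typescript
// Hypothetical result shape; the Rust struct is not shown in this PR.
interface StepResult {
  stepIndex: number; // index of the originating step in the pipeline definition
  output: string;
}

// Replace each {{steps.N}} with the output of the result whose stepIndex is N.
// Looking up by stepIndex (not results[N]) matters because loop sub-steps
// append extra entries to the shared results array and shift positions.
function interpolateSteps(template: string, results: StepResult[]): string {
  return template.replace(/\{\{steps\.(\d+)\}\}/g, (placeholder: string, n: string) => {
    const wanted = Number(n);
    const match = results.find((r) => r.stepIndex === wanted);
    return match ? match.output : placeholder; // leave unresolved placeholders intact
  });
}

// Assumed scenario: step 1 is a loop that produced two entries, so the
// result of step 2 sits at array position 3 rather than 2.
const sharedResults: StepResult[] = [
  { stepIndex: 0, output: 'setup ok' },
  { stepIndex: 1, output: 'loop iteration 1' },
  { stepIndex: 1, output: 'loop iteration 2' },
  { stepIndex: 2, output: 'exported' },
];

const byStepIndex = interpolateSteps('export: {{steps.2}}', sharedResults);
const byPosition = sharedResults[2].output; // what a positional lookup would return
```

In this scenario the positional lookup returns a loop iteration's output, while the `stepIndex` search returns the output of step 2.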

3. Startup Self-Healing (crash recovery)

  • Phase 2.5: Postgres health check — detects missing continuum DB, auto-starts Postgres, auto-creates DB
  • Phase 5.5: Auto-seed — detects empty rooms table, runs npm run data:seed
  • Handles full OOM crash recovery (Postgres dies → DB lost → startup restores everything)

4. Preflight System (developer prerequisites)

  • scripts/shared/preflight.sh — sourceable bash function library
  • scripts/shared/Preflight.ts — TypeScript equivalent
  • 7 scripts refactored to use shared preflight

5. Documentation Overhaul (8 docs updated)

  • Canonical SENTINEL-ARCHITECTURE.md rewritten: all 10 step types, real Rust structs, actual commands
  • SENTINEL-GAP-ANALYSIS.md: compared against 10 competing tools, 6 unique strengths, 7 gaps, 5-phase roadmap

E2E Pipeline Run (proven)

Step 0: Shell setup (20s) — temp dir, scaffold, npm install
Step 1: Loop × 3 milestones (386s) — CodingAgent builds + Shell tests + Condition retry
Step 2: genome/training-export (4ms) — accumulator → JSONL (1 example)
Step 3: genome/train (152s) — PEFT LoRA, loss=2.49, adapter saved
Step 4: Emit demo:complete — metrics + layerId

All steps succeeded. LoRA adapter produced. Total cost < $10.
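Step 2 of the run above turns in-memory training examples into a JSONL file for the training step. A minimal sketch of that bridge, assuming a simple user/assistant pair per example, might look like this. The `TrainingExample` shape, the function names, and the `training.jsonl` file name are hypothetical; the format that genome/train actually expects is not shown in this PR.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical example shape: one captured user→assistant pair.
interface TrainingExample {
  user: string;
  assistant: string;
}

// Serialize examples as JSONL: one JSON object per line, newline-terminated.
function toJsonl(examples: TrainingExample[]): string {
  const body = examples.map((ex) => JSON.stringify(ex)).join('\n');
  return examples.length > 0 ? body + '\n' : '';
}

// Write the JSONL file into the given directory and return its path.
function exportTrainingData(examples: TrainingExample[], dir: string): string {
  const path = join(dir, 'training.jsonl');
  writeFileSync(path, toJsonl(examples), 'utf8');
  return path;
}

// Usage: export a single example to a fresh temp directory and read it back.
const exportDir = mkdtempSync(join(tmpdir(), 'training-export-'));
const exportedPath = exportTrainingData(
  [{ user: 'Add a GET /tasks route', assistant: 'Added the route and a test.' }],
  exportDir,
);
const exportedLines = readFileSync(exportedPath, 'utf8').trim().split('\n');
```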

Strategic Direction

"The field builds better hammers. We're building the blacksmith."

Use external agents as teachers → capture interactions → LoRA-train local personas → evaluate → iterate. Sentinels orchestrate the entire lifecycle.

Files Changed (62 files, +4575/-414)

| Area | Files | Description |
| --- | --- | --- |
| Demo pipeline | 13 | DemoPipeline builder, types, demo-run command, training-export command, task-tracker project |
| Coding agents | 7 | Provider interface, ClaudeCode impl, registry, command, Rust step |
| Rust engine | 8 | allowFailure, interpolation fix, step_index search, CodingAgent dispatch |
| Startup | 1 | Postgres health check + auto-seed in parallel-start.sh |
| Preflight | 9 | Shared bash/TS libraries + script refactoring |
| Docs | 9 | Architecture overhaul + gap analysis |
| Generated | 5 | Command schemas, browser/server registration |

Test plan

  • TypeScript compiles clean (npm run build:ts)
  • Rust builds clean (cargo build --release)
  • Full pipeline E2E run succeeds (handle 43061122)
  • All 3 milestones pass with test verification
  • Training data exported and genome/train produces adapter
  • Interpolation fix: 36 existing tests pass + step_index resolution verified
  • System recovers from OOM crash via startup health check
  • Verify startup self-healing on cold start with missing DB

 Rust (Pipeline Engine)

  ┌────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────┐
  │                    File                    │                               Action                               │
  ├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
  │ workers/.../sentinel/types.rs              │ Added CodingAgent variant with 12 fields to PipelineStep enum      │
  ├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
  │ workers/.../sentinel/steps/coding_agent.rs │ New — delegates to TS via execute_ts_json("sentinel/coding-agent") │
  ├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
  │ workers/.../sentinel/steps/mod.rs          │ Added module + dispatch arm                                        │
  ├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
  │ shared/generated/sentinel/PipelineStep.ts  │ Auto-regenerated by ts-rs                                          │
  └────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────┘

  TypeScript (Provider Architecture)

  ┌──────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────────────┐
  │                         File                         │                                   Action                                    │
  ├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
  │ system/sentinel/coding-agents/CodingAgentProvider.ts │ Interface: CodingAgentProvider, CodingAgentConfig, CodingAgentResult        │
  ├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
  │ system/sentinel/coding-agents/ClaudeCodeProvider.ts  │ SDK wrapper — spawns child process, streams messages, captures interactions │
  ├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
  │ system/sentinel/coding-agents/CodingAgentRegistry.ts │ Dynamic registry — no switch, no enum, providers self-register              │
  ├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
  │ system/sentinel/coding-agents/index.ts               │ Barrel + auto-registration of built-in providers                            │
  └──────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┘

  Command (sentinel/coding-agent)

  ┌─────────────────────────────────────────────────────────────────────────────┬───────────────────────────────────────────────────────────────────┐
  │                                    File                                     │                              Action                               │
  ├─────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
  │ commands/sentinel/coding-agent/shared/SentinelCodingAgentTypes.ts           │ Params, Result, static executor                                   │
  ├─────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
  │ commands/sentinel/coding-agent/server/SentinelCodingAgentServerCommand.ts   │ Resolves provider, executes, emits events, captures training data │
  ├─────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
  │ commands/sentinel/coding-agent/browser/SentinelCodingAgentBrowserCommand.ts │ Delegates to server                                               │
  └─────────────────────────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────┘

  Bindings

  ┌──────────────────────────────────────────┬─────────────────────────────────────────┐
  │                   File                   │                 Action                  │
  ├──────────────────────────────────────────┼─────────────────────────────────────────┤
  │ workers/.../bindings/modules/sentinel.ts │ Added codingagent to PipelineStep union │
  └──────────────────────────────────────────┴─────────────────────────────────────────┘

  Verification

  - 109 Rust tests pass (0 failures)
  - TypeScript compiles clean (strict mode)
  - SDK installed: @anthropic-ai/claude-agent-sdk added to package.json
Canonical SENTINEL-ARCHITECTURE.md rewritten to match actual Rust/TS
implementation: all 10 step types documented (was 6), variable
interpolation syntax corrected from $variable to {{variable}},
runtime/safety/commands sections updated with real structs and
commands, CodingAgent step type added throughout.

7 supporting docs updated with status headers pointing to canonical
doc. Superseded docs marked. Academy step-type count fixed (6/10).

New SENTINEL-GAP-ANALYSIS.md compares our sentinel system against
Claude Code, Codex, Aider, OpenCode, GSD, SWE-agent, OpenHands,
Cline, Cursor, and Sweep. Identifies 7 gaps (codebase understanding,
context management, multi-agent isolation, quality scoring, multi-
provider support, developer UX, persona integration depth) and 6
unique strengths (pipeline composition, LoRA training, Academy
dual-sentinel, training capture, persona ownership, event-based
inter-agent communication). Includes 5-phase prioritized roadmap
and research references for distillation pipeline hardening.
Copilot AI review requested due to automatic review settings February 28, 2026 17:43

Copilot AI left a comment

Pull request overview

Establishes a new Sentinel pipeline capability to run external coding agents (starting with Claude Code), adds shared preflight tooling for shell/TypeScript scripts, and updates Sentinel documentation to reflect the current Rust+TS implementation (including a new competitive gap analysis).

Changes:

  • Added a new codingagent pipeline step in Rust that delegates execution to a TypeScript sentinel/coding-agent command.
  • Introduced a TypeScript coding-agent provider architecture (registry + Claude Code provider) and registered the new command in generated registries/constants.
  • Added shared preflight libraries (preflight.sh, Preflight.ts) and refactored multiple scripts to use them; updated/annotated Sentinel documentation and added SENTINEL-GAP-ANALYSIS.md.

Reviewed changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/workers/continuum-core/src/modules/sentinel/types.rs | Adds codingagent step variant and updates step type naming. |
| src/workers/continuum-core/src/modules/sentinel/steps/mod.rs | Wires CodingAgent step dispatch into the executor. |
| src/workers/continuum-core/src/modules/sentinel/steps/coding_agent.rs | Implements Rust-side delegation to TS sentinel/coding-agent. |
| src/workers/continuum-core/bindings/modules/sentinel.ts | Extends TS pipeline step union to include codingagent. |
| src/system/sentinel/coding-agents/index.ts | Barrel exports and side-effect self-registration for built-in providers. |
| src/system/sentinel/coding-agents/CodingAgentRegistry.ts | Implements provider registry. |
| src/system/sentinel/coding-agents/CodingAgentProvider.ts | Defines provider/config/result/progress interfaces. |
| src/system/sentinel/coding-agents/ClaudeCodeProvider.ts | Adds Claude Code provider wrapper around the Agent SDK. |
| src/commands/sentinel/coding-agent/shared/SentinelCodingAgentTypes.ts | Adds typed params/results and a typed executor wrapper for the new command. |
| src/commands/sentinel/coding-agent/server/SentinelCodingAgentServerCommand.ts | Implements server-side command: provider resolve/execute, progress events, training capture. |
| src/commands/sentinel/coding-agent/browser/SentinelCodingAgentBrowserCommand.ts | Adds browser-side command delegating to server execution. |
| src/shared/generated-command-constants.ts | Registers SENTINEL_CODING_AGENT constant. |
| src/server/generated.ts | Registers server command for sentinel/coding-agent. |
| src/browser/generated.ts | Registers browser command for sentinel/coding-agent. |
| src/package.json | Bumps package version and adds @anthropic-ai/claude-agent-sdk dependency. |
| src/package-lock.json | Locks the new dependency. |
| src/shared/version.ts | Updates generated version constant. |
| src/scripts/shared/preflight.sh | Adds shared bash preflight functions and platform checks. |
| src/scripts/shared/Preflight.ts | Adds shared TS preflight utilities for TS-based tools. |
| src/scripts/parallel-start.sh | Refactors build flow to use preflight + improved cargo failure messaging. |
| src/scripts/setup-rust.sh | Refactors setup to use preflight utilities. |
| src/scripts/system-stop.sh | Refactors to use shared preflight colors/utilities. |
| src/scripts/install-livekit.sh | Refactors to use shared preflight colors/utilities. |
| src/scripts/download-voice-models.sh | Refactors to use shared preflight colors/utilities. |
| src/scripts/download-avatar-models.sh | Refactors to use shared preflight colors/utilities. |
| src/generator/generate-rust-bindings.ts | Switches to spawnSync + adds Preflight-based cargo failure detection. |
| src/generated-command-schemas.json | Adds schema entry for sentinel/coding-agent. |
| src/docs/personas/ACADEMY_GENOMIC_DESIGN.md | Adds status/superseded context pointing to current Academy implementation. |
| src/docs/personas/ACADEMY_ARCHITECTURE.md | Adds status/superseded context pointing to current Academy implementation. |
| src/docs/personas/ACADEMY-DOJO-ARCHITECTURE.md | Updates step-type counts to 10 (includes CodingAgent). |
| src/docs/SENTINEL-PIPELINE-ARCHITECTURE.md | Marks historical doc superseded; updates phrasing to match implemented system. |
| src/docs/SENTINEL-LOGGING-PLAN.md | Updates current-state section to reflect Rust logging implementation. |
| src/docs/SENTINEL-GAP-ANALYSIS.md | Adds comprehensive gap analysis vs competing agentic coding tools. |
| src/docs/SENTINEL-ARCHITECTURE.md | Updates canonical architecture doc for 10 step types, agentMode, CodingAgent, and lifecycle integration. |
| docs/SENTINEL-WORKERS.md | Marks historical doc superseded and points to canonical docs. |
| docs/SENTINEL-ARCHITECTURE.md | Marks historical doc superseded and points to canonical docs. |
Files not reviewed (1)
  • src/package-lock.json: Language not supported

Comment on lines 351 to 356
  "engines": {
    "node": ">=16.0.0"
  },
  "dependencies": {
    "@anthropic-ai/claude-agent-sdk": "^0.2.62",
    "@anthropic-ai/sdk": "^0.71.2",

Copilot AI Feb 28, 2026

@anthropic-ai/claude-agent-sdk@0.2.62 declares engines.node >=18, but this repo’s package.json still advertises node >=16. That mismatch can break installs/CI in Node 16 environments (especially with engine-strict). Either bump the project engine requirement to >=18 or make the SDK truly optional (e.g., move to optionalDependencies or a separate integration package) so Node 16 users aren’t blocked.

import { Events } from '../../../../system/core/shared/Events';
import type { SentinelCodingAgentParams, SentinelCodingAgentResult } from '../shared/SentinelCodingAgentTypes';
import { CodingAgentRegistry } from '../../../../system/sentinel/coding-agents/CodingAgentRegistry';
import type { CodingAgentConfig, CodingAgentProgressEvent } from '../../../../system/sentinel/coding-agents/CodingAgentProvider';

Copilot AI Feb 28, 2026

Built-in provider self-registration happens in system/sentinel/coding-agents/index.ts, but this command imports CodingAgentRegistry directly from CodingAgentRegistry.ts. Unless something elsewhere imports the barrel module for side effects, the registry will be empty and even the default claude-code provider will resolve as “unknown”. Consider importing the barrel once here (or moving registration into the registry module / adding an explicit init call) so providers are guaranteed to be registered before lookup.

Suggested change
- import type { CodingAgentConfig, CodingAgentProgressEvent } from '../../../../system/sentinel/coding-agents/CodingAgentProvider';
+ import type { CodingAgentConfig, CodingAgentProgressEvent } from '../../../../system/sentinel/coding-agents/CodingAgentProvider';
+ import '../../../../system/sentinel/coding-agents';

Comment on lines +164 to +174
const lastTool = toolCalls[toolCalls.length - 1];
const resultStr = typeof userMsg.tool_use_result === 'string'
  ? userMsg.tool_use_result
  : JSON.stringify(userMsg.tool_use_result);
lastTool.output = resultStr;
lastTool.durationMs = Date.now() - (interactions[interactions.length - 1]?.timestamp || Date.now());

onProgress?.({
  type: 'tool_end',
  toolName: lastTool.toolName,
  message: `${lastTool.toolName} completed`,

Copilot AI Feb 28, 2026

Tool timing/association looks incorrect here: toolStart is computed but never used, and durationMs is derived from the timestamp of the last recorded interaction (which is not the tool start time and may be an assistant message unrelated to the tool). Also, completing tool calls by mutating toolCalls[toolCalls.length - 1] can misattribute results if multiple tool calls are in-flight/queued. Track a per-tool identifier/start timestamp (from the SDK message, if available) and compute duration from that, rather than from interactions.

Suggested change
- const lastTool = toolCalls[toolCalls.length - 1];
- const resultStr = typeof userMsg.tool_use_result === 'string'
-   ? userMsg.tool_use_result
-   : JSON.stringify(userMsg.tool_use_result);
- lastTool.output = resultStr;
- lastTool.durationMs = Date.now() - (interactions[interactions.length - 1]?.timestamp || Date.now());
- onProgress?.({
-   type: 'tool_end',
-   toolName: lastTool.toolName,
-   message: `${lastTool.toolName} completed`,
+ // Try to associate this result with the correct tool call using an ID if available.
+ const toolUseId = userMsg.tool_use_id;
+ let toolCall: CodingAgentToolCall | undefined;
+ if (toolUseId !== undefined && toolUseId !== null) {
+   toolCall = toolCalls.find(tc =>
+     (tc as any).toolUseId === toolUseId ||
+     (tc as any).id === toolUseId
+   );
+ }
+ // Fallback to the most recent tool call to preserve existing behavior.
+ if (!toolCall) {
+   toolCall = toolCalls[toolCalls.length - 1];
+ }
+ const resultStr = typeof userMsg.tool_use_result === 'string'
+   ? userMsg.tool_use_result
+   : JSON.stringify(userMsg.tool_use_result);
+ toolCall.output = resultStr;
+ // Prefer a start timestamp stored on the tool call or provided by the SDK,
+ // and use the message timestamp or now as the end time.
+ const toolStartMs =
+   (toolCall as any).startTimestampMs ??
+   userMsg.tool_use_start_timestamp_ms ??
+   userMsg.tool_use_start_time_ms ??
+   null;
+ const toolEndMs = userMsg.timestamp ?? Date.now();
+ toolCall.durationMs =
+   toolStartMs !== null && toolStartMs !== undefined
+     ? toolEndMs - toolStartMs
+     : 0;
+ onProgress?.({
+   type: 'tool_end',
+   toolName: toolCall.toolName,
+   message: `${toolCall.toolName} completed`,
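
The per-tool tracking that this review comment recommends could be sketched, independently of the SDK, roughly as follows. This is a minimal illustration under stated assumptions: `ToolCallRecord`, `ToolCallTracker`, and their fields are hypothetical names, not the PR's CodingAgentToolCall type, and it assumes each tool call and its result carry a shared identifier.

```typescript
// Hypothetical record shape; field names are illustrative only.
interface ToolCallRecord {
  id: string;
  toolName: string;
  startedAtMs: number;
  output?: string;
  durationMs?: number;
}

class ToolCallTracker {
  private readonly pending = new Map<string, ToolCallRecord>();
  readonly completed: ToolCallRecord[] = [];

  // Record the start time when the tool call is first seen.
  start(id: string, toolName: string, nowMs: number = Date.now()): void {
    this.pending.set(id, { id, toolName, startedAtMs: nowMs });
  }

  // Match the result to its call by id, so results that arrive out of order
  // are still attributed to the right call.
  finish(id: string, output: unknown, nowMs: number = Date.now()): ToolCallRecord | undefined {
    const call = this.pending.get(id);
    if (!call) return undefined; // unknown id: the caller decides how to handle it
    this.pending.delete(id);
    call.output = typeof output === 'string' ? output : JSON.stringify(output);
    call.durationMs = nowMs - call.startedAtMs;
    this.completed.push(call);
    return call;
  }
}

// Usage with explicit timestamps: two calls in flight, the second finishes first.
const tracker = new ToolCallTracker();
tracker.start('a', 'Read', 1000);
tracker.start('b', 'Bash', 1200);
const bashCall = tracker.finish('b', { exitCode: 0 }, 1500);
const readCall = tracker.finish('a', 'file contents', 2000);
```

Keying on the identifier means the duration of each call is measured from its own start time rather than from whichever interaction was recorded last.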

*/

- export const VERSION = '1.0.8384';
+ export const VERSION = '1.0.8394';

Copilot AI Feb 28, 2026

shared/version.ts says it’s auto-generated from package.json, but VERSION is 1.0.8394 while package.json/package-lock.json are 1.0.8396. This will cause inconsistent version reporting in UI/ping and can confuse build/version detection. Regenerate shared/version.ts after the final version bump (or ensure the generator runs as part of the versioning step) so all version sources match.

Suggested change
- export const VERSION = '1.0.8394';
+ export const VERSION = '1.0.8396';

Comment on lines +13 to +45
/// Execute a coding agent step via TypeScript provider architecture.
///
/// All string fields are interpolated before sending to TypeScript.
/// TypeScript resolves the provider from CodingAgentRegistry and calls provider.execute().
#[allow(clippy::too_many_arguments)]
pub async fn execute(
    prompt: &str,
    provider: Option<&str>,
    working_dir: Option<&str>,
    system_prompt: Option<&str>,
    model: Option<&str>,
    allowed_tools: Option<&Vec<String>>,
    max_turns: Option<u32>,
    max_budget_usd: Option<f64>,
    permission_mode: Option<&str>,
    resume_session_id: Option<&str>,
    capture_training: Option<bool>,
    persona_id: Option<&str>,
    index: usize,
    ctx: &mut ExecutionContext,
    pipeline_ctx: &PipelineContext<'_>,
) -> Result<StepResult, String> {
    use crate::runtime;
    let log = runtime::logger("sentinel");
    let start = Instant::now();

    let interpolated_prompt = interpolation::interpolate(prompt, ctx);
    let interpolated_system = system_prompt.map(|s| interpolation::interpolate(s, ctx));
    let interpolated_working_dir = working_dir.map(|s| interpolation::interpolate(s, ctx));
    let interpolated_persona_id = persona_id.map(|s| interpolation::interpolate(s, ctx));

    let provider_name = provider.unwrap_or("claude-code");

Copilot AI Feb 28, 2026

The doc comment says “All string fields are interpolated before sending to TypeScript”, but only prompt, system_prompt, working_dir, and persona_id are interpolated. provider, model, permission_mode, resume_session_id, and entries in allowed_tools are passed through unchanged. Either interpolate those too (including each tool name) or update the comment to match the actual behavior.

Comment on lines +80 to +81
BUILD_OUTPUT=$(cargo build --release --quiet 2>&1)
RESULT=$?

Copilot AI Feb 28, 2026

Because this script runs with set -e, a failing cargo build inside BUILD_OUTPUT=$(...) makes the assignment return non-zero and the script exits immediately (errexit triggers on the failed command substitution), so the later RESULT=$? / filtered output / preflight_check_cargo_xcode branch won’t run. Capture the status in the same command list with || RESULT=$? (or temporarily disable set -e) so failures are captured and your actionable preflight messaging actually executes. An if ! ...; then block is not suitable for recording the code, because $? inside the then-branch is the negated status, which is 0.

Suggested change
- BUILD_OUTPUT=$(cargo build --release --quiet 2>&1)
- RESULT=$?
+ BUILD_OUTPUT=""
+ RESULT=0
+ BUILD_OUTPUT=$(cargo build --release --quiet 2>&1) || RESULT=$?

Sentinel demo system proven end-to-end: setup → milestones (CodingAgent
loop with test feedback) → training-export → genome/train. Pipeline ran
successfully on task-tracker project (3 milestones, all passed, adapter
trained with loss=2.49).

Key additions:
- DemoPipeline builder + DemoTypes (pipeline orchestration)
- genome/training-export command (accumulator buffer → JSONL → disk)
- genome/demo-run command (entry point for demo pipelines)
- task-tracker project (3 milestones, 14+ deterministic tests)
- Shell allowFailure flag (test runners don't kill loops)
- Interpolation fix: {{steps.N}} searches by step_index not array
  position (loop sub-steps shift positions in shared results array)
- ClaudeCodeProvider: strip ANTHROPIC_API_KEY for OAuth auth
- Startup self-healing: Postgres DB health check + auto-create + auto-seed
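
The allowFailure behaviour listed above (a failing test runner reports its exit code instead of aborting the loop) can be illustrated with a small TypeScript sketch. The actual implementation lives in the Rust engine and is not shown here; `runShellStep` and `ShellStepResult` are hypothetical names, and the failing "test runner" is simulated with a Node one-liner.

```typescript
import { spawnSync } from 'node:child_process';

interface ShellStepResult {
  success: boolean;
  exitCode: number;
  output: string;
}

// Run a command. With allowFailure, a non-zero exit is reported rather than
// thrown, so a later condition step can branch on exitCode (e.g. to retry).
function runShellStep(command: string, args: string[], allowFailure = false): ShellStepResult {
  const proc = spawnSync(command, args, { encoding: 'utf8' });
  if (proc.error) throw proc.error; // the process could not be spawned at all
  const exitCode = proc.status ?? 1; // a null status means it was killed by a signal
  const output = `${proc.stdout}${proc.stderr}`;
  if (exitCode !== 0 && !allowFailure) {
    throw new Error(`Step failed with exit code ${exitCode}: ${output}`);
  }
  return { success: exitCode === 0, exitCode, output };
}

// Simulate a failing test runner: a Node process that exits with code 3.
const failingRun = runShellStep(process.execPath, ['-e', 'process.exit(3)'], true);
const shouldRetry = failingRun.exitCode !== 0; // what a condition step would inspect
```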
@joelteply joelteply changed the title from "Sentinel: CodingAgent step type, preflight system, docs overhaul + gap analysis" to "Sentinel: CodingAgent + Demo Pipeline E2E (Claude Code → LoRA training)" on Mar 1, 2026
@joelteply joelteply merged commit bed366e into main Mar 1, 2026
2 of 5 checks passed
@joelteply joelteply deleted the feature/sentinel-claude-code branch March 1, 2026 03:01