Sentinel: CodingAgent + Demo Pipeline E2E (Claude Code → LoRA training) #277
Rust (Pipeline Engine)
┌────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────┐
│ File │ Action │
├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
│ workers/.../sentinel/types.rs │ Added CodingAgent variant with 12 fields to PipelineStep enum │
├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
│ workers/.../sentinel/steps/coding_agent.rs │ New — delegates to TS via execute_ts_json("sentinel/coding-agent") │
├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
│ workers/.../sentinel/steps/mod.rs │ Added module + dispatch arm │
├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
│ shared/generated/sentinel/PipelineStep.ts │ Auto-regenerated by ts-rs │
└────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────┘
TypeScript (Provider Architecture)
┌──────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────────────┐
│ File │ Action │
├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ system/sentinel/coding-agents/CodingAgentProvider.ts │ Interface: CodingAgentProvider, CodingAgentConfig, CodingAgentResult │
├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ system/sentinel/coding-agents/ClaudeCodeProvider.ts │ SDK wrapper — spawns child process, streams messages, captures interactions │
├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ system/sentinel/coding-agents/CodingAgentRegistry.ts │ Dynamic registry — no switch, no enum, providers self-register │
├──────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ system/sentinel/coding-agents/index.ts │ Barrel + auto-registration of built-in providers │
└──────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┘
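The "no switch, no enum, providers self-register" design in the table above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the field lists on `CodingAgentConfig` and `CodingAgentResult`, the `register`/`resolve` method names, and the `echo` provider are all assumptions made for the example.

```typescript
// Hypothetical shapes: the PR names these types, but their fields are not shown here.
interface CodingAgentConfig {
  prompt: string;
  workingDir?: string;
}

interface CodingAgentResult {
  success: boolean;
  output: string;
}

interface CodingAgentProvider {
  readonly name: string;
  execute(config: CodingAgentConfig): Promise<CodingAgentResult>;
}

// Map-backed registry: adding a provider never touches a switch statement or an enum.
class CodingAgentRegistry {
  private static readonly providers = new Map<string, CodingAgentProvider>();

  static register(provider: CodingAgentProvider): void {
    CodingAgentRegistry.providers.set(provider.name, provider);
  }

  static resolve(name: string): CodingAgentProvider {
    const provider = CodingAgentRegistry.providers.get(name);
    if (!provider) {
      const known = Array.from(CodingAgentRegistry.providers.keys()).join(', ') || '(none)';
      throw new Error(`Unknown coding agent provider "${name}"; registered: ${known}`);
    }
    return provider;
  }
}

// Stand-in provider for this sketch only; it echoes the prompt instead of calling an SDK.
const echoProvider: CodingAgentProvider = {
  name: 'echo',
  async execute(config: CodingAgentConfig): Promise<CodingAgentResult> {
    return { success: true, output: `echo: ${config.prompt}` };
  },
};

// Self-registration: in a real module this call would run as an import side effect.
CodingAgentRegistry.register(echoProvider);
```

One consequence of this pattern is that lookup only succeeds if the module performing the registration has actually been loaded before `resolve` is called.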
Command (sentinel/coding-agent)
┌─────────────────────────────────────────────────────────────────────────────┬───────────────────────────────────────────────────────────────────┐
│ File │ Action │
├─────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ commands/sentinel/coding-agent/shared/SentinelCodingAgentTypes.ts │ Params, Result, static executor │
├─────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ commands/sentinel/coding-agent/server/SentinelCodingAgentServerCommand.ts │ Resolves provider, executes, emits events, captures training data │
├─────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ commands/sentinel/coding-agent/browser/SentinelCodingAgentBrowserCommand.ts │ Delegates to server │
└─────────────────────────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────┘
Bindings
┌──────────────────────────────────────────┬─────────────────────────────────────────┐
│ File │ Action │
├──────────────────────────────────────────┼─────────────────────────────────────────┤
│ workers/.../bindings/modules/sentinel.ts │ Added codingagent to PipelineStep union │
└──────────────────────────────────────────┴─────────────────────────────────────────┘
Verification
- 109 Rust tests pass (0 failures)
- TypeScript compiles clean (strict mode)
- SDK installed: @anthropic-ai/claude-agent-sdk added to package.json
Canonical SENTINEL-ARCHITECTURE.md rewritten to match actual Rust/TS
implementation: all 10 step types documented (was 6), variable
interpolation syntax corrected from $variable to {{variable}},
runtime/safety/commands sections updated with real structs and
commands, CodingAgent step type added throughout.
7 supporting docs updated with status headers pointing to canonical
doc. Superseded docs marked. Academy step-type count fixed (6 → 10).
New SENTINEL-GAP-ANALYSIS.md compares our sentinel system against
Claude Code, Codex, Aider, OpenCode, GSD, SWE-agent, OpenHands,
Cline, Cursor, and Sweep. Identifies 7 gaps (codebase understanding,
context management, multi-agent isolation, quality scoring, multi-
provider support, developer UX, persona integration depth) and 6
unique strengths (pipeline composition, LoRA training, Academy
dual-sentinel, training capture, persona ownership, event-based
inter-agent communication). Includes 5-phase prioritized roadmap
and research references for distillation pipeline hardening.
Pull request overview
Establishes a new Sentinel pipeline capability to run external coding agents (starting with Claude Code), adds shared preflight tooling for shell/TypeScript scripts, and updates Sentinel documentation to reflect the current Rust+TS implementation (including a new competitive gap analysis).
Changes:
- Added a new `codingagent` pipeline step in Rust that delegates execution to a TypeScript `sentinel/coding-agent` command.
- Introduced a TypeScript coding-agent provider architecture (registry + Claude Code provider) and registered the new command in generated registries/constants.
- Added shared preflight libraries (`preflight.sh`, `Preflight.ts`) and refactored multiple scripts to use them; updated/annotated Sentinel documentation and added `SENTINEL-GAP-ANALYSIS.md`.
Reviewed changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/workers/continuum-core/src/modules/sentinel/types.rs | Adds codingagent step variant and updates step type naming. |
| src/workers/continuum-core/src/modules/sentinel/steps/mod.rs | Wires CodingAgent step dispatch into the executor. |
| src/workers/continuum-core/src/modules/sentinel/steps/coding_agent.rs | Implements Rust-side delegation to TS sentinel/coding-agent. |
| src/workers/continuum-core/bindings/modules/sentinel.ts | Extends TS pipeline step union to include codingagent. |
| src/system/sentinel/coding-agents/index.ts | Barrel exports and side-effect self-registration for built-in providers. |
| src/system/sentinel/coding-agents/CodingAgentRegistry.ts | Implements provider registry. |
| src/system/sentinel/coding-agents/CodingAgentProvider.ts | Defines provider/config/result/progress interfaces. |
| src/system/sentinel/coding-agents/ClaudeCodeProvider.ts | Adds Claude Code provider wrapper around the Agent SDK. |
| src/commands/sentinel/coding-agent/shared/SentinelCodingAgentTypes.ts | Adds typed params/results and a typed executor wrapper for the new command. |
| src/commands/sentinel/coding-agent/server/SentinelCodingAgentServerCommand.ts | Implements server-side command: provider resolve/execute, progress events, training capture. |
| src/commands/sentinel/coding-agent/browser/SentinelCodingAgentBrowserCommand.ts | Adds browser-side command delegating to server execution. |
| src/shared/generated-command-constants.ts | Registers SENTINEL_CODING_AGENT constant. |
| src/server/generated.ts | Registers server command for sentinel/coding-agent. |
| src/browser/generated.ts | Registers browser command for sentinel/coding-agent. |
| src/package.json | Bumps package version and adds @anthropic-ai/claude-agent-sdk dependency. |
| src/package-lock.json | Locks the new dependency. |
| src/shared/version.ts | Updates generated version constant. |
| src/scripts/shared/preflight.sh | Adds shared bash preflight functions and platform checks. |
| src/scripts/shared/Preflight.ts | Adds shared TS preflight utilities for TS-based tools. |
| src/scripts/parallel-start.sh | Refactors build flow to use preflight + improved cargo failure messaging. |
| src/scripts/setup-rust.sh | Refactors setup to use preflight utilities. |
| src/scripts/system-stop.sh | Refactors to use shared preflight colors/utilities. |
| src/scripts/install-livekit.sh | Refactors to use shared preflight colors/utilities. |
| src/scripts/download-voice-models.sh | Refactors to use shared preflight colors/utilities. |
| src/scripts/download-avatar-models.sh | Refactors to use shared preflight colors/utilities. |
| src/generator/generate-rust-bindings.ts | Switches to spawnSync + adds Preflight-based cargo failure detection. |
| src/generated-command-schemas.json | Adds schema entry for sentinel/coding-agent. |
| src/docs/personas/ACADEMY_GENOMIC_DESIGN.md | Adds status/superseded context pointing to current Academy implementation. |
| src/docs/personas/ACADEMY_ARCHITECTURE.md | Adds status/superseded context pointing to current Academy implementation. |
| src/docs/personas/ACADEMY-DOJO-ARCHITECTURE.md | Updates step-type counts to 10 (includes CodingAgent). |
| src/docs/SENTINEL-PIPELINE-ARCHITECTURE.md | Marks historical doc superseded; updates phrasing to match implemented system. |
| src/docs/SENTINEL-LOGGING-PLAN.md | Updates current-state section to reflect Rust logging implementation. |
| src/docs/SENTINEL-GAP-ANALYSIS.md | Adds comprehensive gap analysis vs competing agentic coding tools. |
| src/docs/SENTINEL-ARCHITECTURE.md | Updates canonical architecture doc for 10 step types, agentMode, CodingAgent, and lifecycle integration. |
| docs/SENTINEL-WORKERS.md | Marks historical doc superseded and points to canonical docs. |
| docs/SENTINEL-ARCHITECTURE.md | Marks historical doc superseded and points to canonical docs. |
Files not reviewed (1)
- src/package-lock.json: Language not supported
      "engines": {
        "node": ">=16.0.0"
      },
      "dependencies": {
        "@anthropic-ai/claude-agent-sdk": "^0.2.62",
        "@anthropic-ai/sdk": "^0.71.2",
@anthropic-ai/claude-agent-sdk@0.2.62 declares engines.node >=18, but this repo’s package.json still advertises node >=16. That mismatch can break installs/CI in Node 16 environments (especially with engine-strict). Either bump the project engine requirement to >=18 or make the SDK truly optional (e.g., move to optionalDependencies or a separate integration package) so Node 16 users aren’t blocked.
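The mismatch described above can be caught mechanically. The following is a minimal sketch, assuming both `engines.node` values are simple `>=N` ranges; `majorOf` and `projectFloorCovers` are hypothetical helper names, and real-world ranges would need a proper semver parser.

```typescript
// Pull the leading major version out of strings such as ">=18" or ">=16.0.0".
function majorOf(version: string): number {
  const match = /(\d+)/.exec(version);
  if (!match) throw new Error(`No version number in "${version}"`);
  return Number(match[1]);
}

// True when the project's advertised minimum Node major is at least the dependency's.
// Only meaningful for plain ">=N" ranges (an assumption of this sketch).
function projectFloorCovers(projectEngine: string, dependencyEngine: string): boolean {
  return majorOf(projectEngine) >= majorOf(dependencyEngine);
}

// The situation in this PR: the project allows Node 16 while the SDK requires 18.
const beforeBump = projectFloorCovers('>=16.0.0', '>=18'); // false
// After raising the project floor to 18 the two declarations agree.
const afterBump = projectFloorCovers('>=18.0.0', '>=18'); // true
```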
    import { Events } from '../../../../system/core/shared/Events';
    import type { SentinelCodingAgentParams, SentinelCodingAgentResult } from '../shared/SentinelCodingAgentTypes';
    import { CodingAgentRegistry } from '../../../../system/sentinel/coding-agents/CodingAgentRegistry';
    import type { CodingAgentConfig, CodingAgentProgressEvent } from '../../../../system/sentinel/coding-agents/CodingAgentProvider';
Built-in provider self-registration happens in system/sentinel/coding-agents/index.ts, but this command imports CodingAgentRegistry directly from CodingAgentRegistry.ts. Unless something elsewhere imports the barrel module for side effects, the registry will be empty and even the default claude-code provider will resolve as “unknown”. Consider importing the barrel once here (or moving registration into the registry module / adding an explicit init call) so providers are guaranteed to be registered before lookup.
    - import type { CodingAgentConfig, CodingAgentProgressEvent } from '../../../../system/sentinel/coding-agents/CodingAgentProvider';
    + import type { CodingAgentConfig, CodingAgentProgressEvent } from '../../../../system/sentinel/coding-agents/CodingAgentProvider';
    + import '../../../../system/sentinel/coding-agents';
    const lastTool = toolCalls[toolCalls.length - 1];
    const resultStr = typeof userMsg.tool_use_result === 'string'
      ? userMsg.tool_use_result
      : JSON.stringify(userMsg.tool_use_result);
    lastTool.output = resultStr;
    lastTool.durationMs = Date.now() - (interactions[interactions.length - 1]?.timestamp || Date.now());

    onProgress?.({
      type: 'tool_end',
      toolName: lastTool.toolName,
      message: `${lastTool.toolName} completed`,
Tool timing/association looks incorrect here: toolStart is computed but never used, and durationMs is derived from the timestamp of the last recorded interaction (which is not the tool start time and may be an assistant message unrelated to the tool). Also, completing tool calls by mutating toolCalls[toolCalls.length - 1] can misattribute results if multiple tool calls are in-flight/queued. Track a per-tool identifier/start timestamp (from the SDK message, if available) and compute duration from that, rather than from interactions.
    - const lastTool = toolCalls[toolCalls.length - 1];
    - const resultStr = typeof userMsg.tool_use_result === 'string'
    -   ? userMsg.tool_use_result
    -   : JSON.stringify(userMsg.tool_use_result);
    - lastTool.output = resultStr;
    - lastTool.durationMs = Date.now() - (interactions[interactions.length - 1]?.timestamp || Date.now());
    - onProgress?.({
    -   type: 'tool_end',
    -   toolName: lastTool.toolName,
    -   message: `${lastTool.toolName} completed`,
    + // Try to associate this result with the correct tool call using an ID if available.
    + const toolUseId = userMsg.tool_use_id;
    + let toolCall: CodingAgentToolCall | undefined;
    + if (toolUseId !== undefined && toolUseId !== null) {
    +   toolCall = toolCalls.find(tc =>
    +     (tc as any).toolUseId === toolUseId ||
    +     (tc as any).id === toolUseId
    +   );
    + }
    + // Fallback to the most recent tool call to preserve existing behavior.
    + if (!toolCall) {
    +   toolCall = toolCalls[toolCalls.length - 1];
    + }
    + const resultStr = typeof userMsg.tool_use_result === 'string'
    +   ? userMsg.tool_use_result
    +   : JSON.stringify(userMsg.tool_use_result);
    + toolCall.output = resultStr;
    + // Prefer a start timestamp stored on the tool call or provided by the SDK,
    + // and use the message timestamp or now as the end time.
    + const toolStartMs =
    +   (toolCall as any).startTimestampMs ??
    +   userMsg.tool_use_start_timestamp_ms ??
    +   userMsg.tool_use_start_time_ms ??
    +   null;
    + const toolEndMs = userMsg.timestamp ?? Date.now();
    + toolCall.durationMs =
    +   toolStartMs !== null && toolStartMs !== undefined
    +     ? toolEndMs - toolStartMs
    +     : 0;
    + onProgress?.({
    +   type: 'tool_end',
    +   toolName: toolCall.toolName,
    +   message: `${toolCall.toolName} completed`,
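An alternative to probing for optional fields is to record each tool call under its own ID at start time and close it out by that same ID. The sketch below shows the idea in isolation; `ToolCallRecord`, `ToolCallTracker`, and the explicit `nowMs` parameters are hypothetical, and whether the SDK messages expose a usable tool-use ID is an assumption carried over from the review comment.

```typescript
// Hypothetical record shape for this sketch; not the PR's CodingAgentToolCall.
interface ToolCallRecord {
  toolUseId: string;
  toolName: string;
  startedAtMs: number;
  output?: string;
  durationMs?: number;
}

class ToolCallTracker {
  private readonly inFlight = new Map<string, ToolCallRecord>();
  readonly completed: ToolCallRecord[] = [];

  // Record the start time when the tool call is first observed.
  start(toolUseId: string, toolName: string, nowMs: number): void {
    this.inFlight.set(toolUseId, { toolUseId, toolName, startedAtMs: nowMs });
  }

  // Attach the result to the call with the matching ID; unknown IDs return undefined.
  finish(toolUseId: string, output: unknown, nowMs: number): ToolCallRecord | undefined {
    const call = this.inFlight.get(toolUseId);
    if (!call) return undefined;
    this.inFlight.delete(toolUseId);
    call.output = typeof output === 'string' ? output : JSON.stringify(output);
    call.durationMs = nowMs - call.startedAtMs;
    this.completed.push(call);
    return call;
  }
}

// Two overlapping calls that finish out of order still get their own durations.
const tracker = new ToolCallTracker();
tracker.start('a', 'Read', 1000);
tracker.start('b', 'Bash', 1200);
const bashCall = tracker.finish('b', { exitCode: 0 }, 1500); // durationMs: 300
const readCall = tracker.finish('a', 'file contents', 2000); // durationMs: 1000
```

Passing the clock value in explicitly keeps the tracker deterministic in tests; a caller would typically supply `Date.now()`.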
src/shared/version.ts
     */

    - export const VERSION = '1.0.8384';
    + export const VERSION = '1.0.8394';
shared/version.ts says it’s auto-generated from package.json, but VERSION is 1.0.8394 while package.json/package-lock.json are 1.0.8396. This will cause inconsistent version reporting in UI/ping and can confuse build/version detection. Regenerate shared/version.ts after the final version bump (or ensure the generator runs as part of the versioning step) so all version sources match.
    - export const VERSION = '1.0.8394';
    + export const VERSION = '1.0.8396';
    /// Execute a coding agent step via TypeScript provider architecture.
    ///
    /// All string fields are interpolated before sending to TypeScript.
    /// TypeScript resolves the provider from CodingAgentRegistry and calls provider.execute().
    #[allow(clippy::too_many_arguments)]
    pub async fn execute(
        prompt: &str,
        provider: Option<&str>,
        working_dir: Option<&str>,
        system_prompt: Option<&str>,
        model: Option<&str>,
        allowed_tools: Option<&Vec<String>>,
        max_turns: Option<u32>,
        max_budget_usd: Option<f64>,
        permission_mode: Option<&str>,
        resume_session_id: Option<&str>,
        capture_training: Option<bool>,
        persona_id: Option<&str>,
        index: usize,
        ctx: &mut ExecutionContext,
        pipeline_ctx: &PipelineContext<'_>,
    ) -> Result<StepResult, String> {
        use crate::runtime;
        let log = runtime::logger("sentinel");
        let start = Instant::now();

        let interpolated_prompt = interpolation::interpolate(prompt, ctx);
        let interpolated_system = system_prompt.map(|s| interpolation::interpolate(s, ctx));
        let interpolated_working_dir = working_dir.map(|s| interpolation::interpolate(s, ctx));
        let interpolated_persona_id = persona_id.map(|s| interpolation::interpolate(s, ctx));

        let provider_name = provider.unwrap_or("claude-code");
The doc comment says “All string fields are interpolated before sending to TypeScript”, but only prompt, system_prompt, working_dir, and persona_id are interpolated. provider, model, permission_mode, resume_session_id, and entries in allowed_tools are passed through unchanged. Either interpolate those too (including each tool name) or update the comment to match the actual behavior.
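The first option (interpolating every string field, including each tool name) is sketched below in TypeScript rather than Rust, purely to show the shape of the fix. The `{{variable}}` syntax comes from the PR; the `interpolate` helper, the `CodingAgentStepFields` subset, and the choice to leave unknown placeholders untouched are assumptions of this example, not the behaviour of the Rust `interpolation` module.

```typescript
type Vars = Record<string, string>;

// Replace each {{name}} with its value; unknown names are left as-is (sketch assumption).
function interpolate(template: string, vars: Vars): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (whole: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
    return value !== undefined ? value : whole;
  });
}

// Hypothetical subset of the step's fields, enough to show the pattern.
interface CodingAgentStepFields {
  prompt: string;
  model?: string;
  permissionMode?: string;
  allowedTools?: string[];
}

// Apply interpolation uniformly so the "all string fields" comment holds.
function interpolateStep(step: CodingAgentStepFields, vars: Vars): CodingAgentStepFields {
  const opt = (s?: string): string | undefined =>
    s === undefined ? undefined : interpolate(s, vars);
  return {
    prompt: interpolate(step.prompt, vars),
    model: opt(step.model),
    permissionMode: opt(step.permissionMode),
    allowedTools: step.allowedTools
      ? step.allowedTools.map((tool) => interpolate(tool, vars))
      : undefined,
  };
}

const resolvedStep = interpolateStep(
  { prompt: 'Implement {{milestone}}', model: '{{model}}', allowedTools: ['Read', '{{extraTool}}'] },
  { milestone: 'milestone 1', model: 'example-model', extraTool: 'Bash' },
);
```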
    BUILD_OUTPUT=$(cargo build --release --quiet 2>&1)
    RESULT=$?
Because this script runs with set -e, a failing cargo build inside BUILD_OUTPUT=$(...) makes the assignment itself return non-zero, so errexit terminates the script immediately and the later RESULT=$? / filtered output / preflight_check_cargo_xcode branch never runs. Wrap the build in an if block (or temporarily disable set -e) so failures are captured and the actionable preflight messaging actually executes.
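    - BUILD_OUTPUT=$(cargo build --release --quiet 2>&1)
    - RESULT=$?
    + BUILD_OUTPUT=""
    + # Test the build directly: inside `if ! cmd; then`, $? is 0 (the negated status),
    + # so cargo's real exit code is only available in the else branch of a plain `if`.
    + if BUILD_OUTPUT=$(cargo build --release --quiet 2>&1); then
    +   RESULT=0
    + else
    +   RESULT=$?
    + fi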
Sentinel demo system proven end-to-end: setup → milestones (CodingAgent
loop with test feedback) → training-export → genome/train. Pipeline ran
successfully on task-tracker project (3 milestones, all passed, adapter
trained with loss=2.49).
Key additions:
- DemoPipeline builder + DemoTypes (pipeline orchestration)
- genome/training-export command (accumulator buffer → JSONL → disk)
- genome/demo-run command (entry point for demo pipelines)
- task-tracker project (3 milestones, 14+ deterministic tests)
- Shell allowFailure flag (test runners don't kill loops)
- Interpolation fix: {{steps.N}} searches by step_index not array
position (loop sub-steps shift positions in shared results array)
- ClaudeCodeProvider: strip ANTHROPIC_API_KEY for OAuth auth
- Startup self-healing: Postgres DB health check + auto-create + auto-seed
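The interpolation fix listed above can be illustrated with a small lookup sketch. It assumes each result records the index of the step that produced it and that loop iterations append extra entries to the shared array; the `StepResult` shape, the `findStepResult` name, and the "latest match wins" rule are hypothetical choices for this example.

```typescript
// Hypothetical result record: stepIndex is the step's position in the pipeline definition.
interface StepResult {
  stepIndex: number;
  output: string;
}

// Resolve {{steps.N}} by recorded step index rather than by array position.
function findStepResult(results: StepResult[], n: number): StepResult | undefined {
  let found: StepResult | undefined;
  for (const result of results) {
    if (result.stepIndex === n) found = result; // keep the latest match
  }
  return found;
}

// Step 1 is a loop that ran twice, so step 2's result sits at array position 3, not 2.
const sharedResults: StepResult[] = [
  { stepIndex: 0, output: 'setup' },
  { stepIndex: 1, output: 'milestone loop, iteration 1' },
  { stepIndex: 1, output: 'milestone loop, iteration 2' },
  { stepIndex: 2, output: 'training-export' },
];

const byPosition = sharedResults[2]; // a loop iteration, not step 2
const byStepIndex = findStepResult(sharedResults, 2); // the training-export result
```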
Summary
Establishes sentinel-driven AI development: external coding agents (Claude Code) build real software inside pipelines, and the captured interactions are used to LoRA-train local AI personas. Proven end-to-end on a 3-milestone Task Tracker project.
What This PR Adds
1. CodingAgent Step Type (new sentinel capability)
- `CodingAgentProvider` interface with dynamic registry
- `ClaudeCodeProvider` wrapping `@anthropic-ai/claude-agent-sdk`
- `CodingAgent` step type routing to TypeScript via `execute_ts_json()`
- `sentinel/coding-agent` command with training data capture (user→assistant pairs → LoRA pipeline)
- … `claude login` Max subscription)
2. Demo Pipeline System (E2E proven)
- `DemoPipeline` builder: project spec → sentinel pipeline (setup → milestone loop → training → emit)
- `genome/training-export` command: bridges in-memory TrainingDataAccumulator → JSONL on disk → genome/train
- `genome/demo-run` command: entry point (`./jtag genome/demo-run --project=task-tracker --personaId=<uuid>`)
- `allowFailure` flag: test runners don't kill loops, exit codes flow to conditions
- `{{steps.N}}` searches by `step_index` not array position (loop sub-steps shift shared results)
3. Startup Self-Healing (crash recovery)
- … `continuum` DB, auto-starts Postgres, auto-creates DB
- … `npm run data:seed`
4. Preflight System (developer prerequisites)
- `scripts/shared/preflight.sh`: sourceable bash function library
- `scripts/shared/Preflight.ts`: TypeScript equivalent
5. Documentation Overhaul (8 docs updated)
- `SENTINEL-ARCHITECTURE.md` rewritten: all 10 step types, real Rust structs, actual commands
- `SENTINEL-GAP-ANALYSIS.md`: compared against 10 competing tools, 6 unique strengths, 7 gaps, 5-phase roadmap
E2E Pipeline Run (proven)
All steps succeeded. LoRA adapter produced. Total cost < $10.
Strategic Direction
Use external agents as teachers → capture interactions → LoRA-train local personas → evaluate → iterate. Sentinels orchestrate the entire lifecycle.
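The "capture interactions → LoRA-train" hand-off depends on turning captured user→assistant pairs into JSONL. A minimal sketch of that conversion follows; the `CapturedInteraction` shape and the chat-style `messages` layout are assumptions for illustration, since the actual format written by `genome/training-export` is not shown in this PR page.

```typescript
// Hypothetical captured pair; the PR describes user→assistant pairs but not their schema.
interface CapturedInteraction {
  user: string;
  assistant: string;
}

// Emit one JSON object per line, skipping pairs with an empty side.
// The "messages" layout is an assumed training format, not the PR's confirmed one.
function toJsonl(interactions: CapturedInteraction[]): string {
  return interactions
    .filter((pair) => pair.user.trim() !== '' && pair.assistant.trim() !== '')
    .map((pair) =>
      JSON.stringify({
        messages: [
          { role: 'user', content: pair.user },
          { role: 'assistant', content: pair.assistant },
        ],
      }),
    )
    .join('\n');
}

const trainingJsonl = toJsonl([
  { user: 'Add a delete endpoint', assistant: 'Added DELETE /tasks/:id with tests.' },
  { user: '', assistant: 'orphaned reply' }, // dropped: no prompt to pair with
]);
```

Because `JSON.stringify` escapes embedded newlines, each record stays on a single line even when the captured text spans several lines.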
Files Changed (62 files, +4575/-414)
Test plan
- … (`npm run build:ts`)
- … (`cargo build --release`)