fix(status,prompts): finish the #2432 residuals left by the merged #2580 family by rysweet · Pull Request #2595 · rysweet/Simard

rysweet · 2026-07-04T19:16:51Z

Context: the core #2432 work was superseded by the merged #2580 family

This branch began as a full parallel implementation of #2432 (eliminate silent
deterministic fallbacks + fix "zero active engineers"). While it was in flight,
main merged a comprehensive solution to the same problem:

- fix(ooda-brain): eliminate deterministic-default reasoner outcomes on parse-failure (#2580) #2588: eliminated deterministic-default reasoner outcomes on parse-failure
  (explicit Err + brain_parse_error at the caller; parsers now consume a
  {"decision": …} / {"adjusted_urgency": …} envelope).
- fix(dashboard): report the TRUE live-engineer set so "Active Engineers" isn't stuck at zero (#2580) #2591: reported the TRUE live-engineer set for the dashboard "Active
  Engineers" panel via a new operator_commands_dashboard::live_engineers module.

Those supersede this branch's parallel reasoner/dashboard implementation, so the
merge in this PR adopts main's version wholesale for all overlapping
production code and drops this branch's now-redundant zero_fallback_tests.rs
(main ships its own zero_fallback_2580_tests). No duplicate/competing
implementation is shipped.

What this PR now contains (the residual gaps #2580 left)

Main touched neither of these files, so both merge cleanly and are purely
additive:

1. `status/provider.rs` — finish design G4/G5 for the `simard status` surface

#2591 fixed the dashboard engineer count but left the status snapshot
(resources.live_engineers) on the buggy pgrep -f 'simard-engineer' (hyphen)
pattern — which never matches the real simard engineer run single-process …
(space) argv, so it undercounts. Read-only live-daemon evidence at reconcile time:

| Signal | Observed |
|---|---|
| real `…/simard engineer run single-process …` subprocesses | present |
| live worktree dispatch claims (`.simard-engineer-claim` + live PID) | 3 live |
| old status pattern `pgrep 'simard-engineer'` (hyphen) | never matches the space argv |
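
The mismatch in the last row can be reproduced without a live daemon. The sketch below is only an illustration: it approximates `pgrep -f` with a plain substring test over a list of command lines (real `pgrep` matches a regular expression against `/proc` entries), and `count_matches` is a hypothetical helper, not a function from the Simard tree.

```rust
/// Counts command lines that contain `pattern` anywhere in the full argv
/// string. This stands in for `pgrep -f`; it is a substring match, not a regex.
fn count_matches(cmdlines: &[&str], pattern: &str) -> usize {
    cmdlines.iter().filter(|c| c.contains(pattern)).count()
}
```

Against an argv of `simard engineer run single-process`, the hyphenated pattern yields zero matches while the spaced form yields one, which is the divergence the table records.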

resources.live_engineers now derives from the authoritative
count_live_engineer_claims (the single source of truth, design G4) — matching
the dashboard surface — and the fragile pgrep pattern is retired (design G5).
Covered by a new test (live_engineers_derives_from_live_worktree_claims).
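
As a rough picture of what a claim-based count involves, here is a minimal sketch. It is not the real `count_live_engineer_claims`: the directory layout (one worktree per subdirectory), the claim file holding a decimal PID, and the injected `is_alive` probe are all assumptions made for illustration. Only the `.simard-engineer-claim` file name comes from the evidence table.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Claim marker file name, as quoted in the evidence table.
const CLAIM_FILE: &str = ".simard-engineer-claim";

/// Counts subdirectories of `root` whose claim file names a live PID.
/// Assumption: the claim file contains the owning PID as decimal text.
/// `is_alive` is injected so the liveness probe can be replaced in tests.
fn count_live_claims(root: &Path, is_alive: impl Fn(u32) -> bool) -> io::Result<usize> {
    let mut live = 0;
    for entry in fs::read_dir(root)? {
        let claim = entry?.path().join(CLAIM_FILE);
        // A worktree with no claim file, or an unparseable one, is not counted.
        let Ok(text) = fs::read_to_string(&claim) else { continue };
        let Ok(pid) = text.trim().parse::<u32>() else { continue };
        if is_alive(pid) {
            live += 1;
        }
    }
    Ok(live)
}
```

An unreadable root is returned as an error rather than reported as zero, so a broken state directory cannot masquerade as "no engineers".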

2. The four `.md` embedded-fallback prompts — consistency with #2588's parsers

#2588 updated the recipe YAMLs to emit structured output but left the .md
embedded fallbacks (used when the on-disk prompt is absent; also the RustyClawd
path) still mandating the DECISION: marker / "Do NOT output JSON" — which
now contradicts #2588's parsers that consume a {"decision": …} envelope.
Refreshed ooda_decide.md, ooda_brain.md, ooda_orient.md, and
merge_readiness_judge.md to mandate the fenced JSON envelope with a required
decision field, preserving all pinned sentences (STATUS: ACHIEVED gate, six
lifecycle variants, DECISION marker, churn/stuck-loop).
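
To make the prompt/parser contract concrete, the sketch below extracts a fenced JSON envelope and reads its `decision` field. It is a deliberately naive, std-only illustration: the real parsers are not shown in this PR, `ParseError`, `fenced_body`, and `parse_decision` are hypothetical names, and the hand-rolled key scan ignores escapes and nesting where production code would presumably use a JSON library.

```rust
/// Three backticks, written as escapes so this listing can sit inside a fence.
const FENCE: &str = "\x60\x60\x60";

#[derive(Debug, PartialEq)]
enum ParseError {
    NoFencedBlock,
    MissingDecision,
}

/// Returns the trimmed body of the first fenced block (optional `json` tag).
fn fenced_body(output: &str) -> Option<&str> {
    let start = output.find(FENCE)? + FENCE.len();
    let rest = &output[start..];
    let rest = rest.strip_prefix("json").unwrap_or(rest);
    let end = rest.find(FENCE)?;
    Some(rest[..end].trim())
}

/// Naive lookup of a string-valued "decision" key inside the fenced body.
/// Every failure is an explicit error; nothing falls back to a default.
fn parse_decision(output: &str) -> Result<String, ParseError> {
    let body = fenced_body(output).ok_or(ParseError::NoFencedBlock)?;
    let (_, after_key) = body
        .split_once("\"decision\"")
        .ok_or(ParseError::MissingDecision)?;
    let value = after_key
        .trim_start()
        .strip_prefix(':')
        .map(str::trim_start)
        .and_then(|v| v.strip_prefix('"'))
        .and_then(|v| v.split_once('"'))
        .map(|(value, _)| value)
        .ok_or(ParseError::MissingDecision)?;
    if value.is_empty() {
        return Err(ParseError::MissingDecision);
    }
    Ok(value.to_string())
}
```

Under these assumptions, legacy output such as `DECISION: continue_skipping` has no fenced block and is rejected outright, which is why prompts that still mandate the marker cannot coexist with envelope-only parsers.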

3. `parse_failure.rs` — structured tracing

Converted a residual eprintln! (the gh-issue-filed success branch) to
tracing::info!, per the structured-tracing-only directive.

Validation

Full cargo test --lib on the merged tree: 7031 passed, 0 failed (7 ignored
are main's; the base_type_copilot integration tests skip when copilot is off
PATH, as in CI).
cargo fmt --check, pre-commit clippy --release -D warnings, and pre-push
clippy --all-targets --all-features --locked -D warnings + the
memory_consolidation race-subset all pass — no --no-verify, no --admin.
git grep confirms no println!/eprintln! and no Bridge naming introduced.

Deploy

Does not touch the live daemon or ~/.simard (read-only inspection only). The
operator redeploys after merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

#2432 family) Step 5e: design consolidation.

Fold the four investigation threads (parse-fail flow, extract.rs chokepoint coverage, distillation parser, active-engineers telemetry) into one grounded target architecture for eliminating silent deterministic fallbacks across the reasoning + telemetry paths.

Grounded findings (file:line anchored in the doc):
- strip_recipe_noise sanitizer chokepoint is universally adopted (thread closed).
- The #2432 confidence-gated escalation ladder (run_brain_ladder, bounded by EscalationConfig: default 2 / hard cap 3) is wired for decide, orient, engineer-lifecycle, and merge-judge.
- Residual gaps: progress-checker is off the ladder (G1); distillation runs a parallel failure-class retry (G2); confidence.rs (verbalized confidence / self-consistency / ECE) is built but unwired (G3).
- Telemetry: three divergent live-engineer counts; count_live_engineers() greps "simard-engineer" (hyphen) but the real subprocess is "simard engineer" (space): live 17 real vs 1 matched (G4/G5). Design promotes count_live_engineer_claims() to the single source of truth.

Adds the doc to docs/index.md and the mkdocs nav; mkdocs --strict and scripts/verify-docs.sh both pass (15/0, 0 orphaned pages).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
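
The bounded ladder mentioned in this commit can be pictured with a small sketch. Only the names `EscalationConfig` and the "default 2 / hard cap 3" bounds come from the commit message; the field `max_attempts`, the `LadderExhausted` error, and the `run_ladder` signature are hypothetical, and the sketch assumes the bounds count attempts.

```rust
/// Hard ceiling on attempts, per the "default 2 / hard cap 3" bound.
const HARD_CAP: usize = 3;

/// Hypothetical shape; only the type name and the two bounds are from the text.
struct EscalationConfig {
    max_attempts: usize,
}

impl Default for EscalationConfig {
    fn default() -> Self {
        Self { max_attempts: 2 }
    }
}

impl EscalationConfig {
    /// Effective budget: at least one attempt, never above the hard cap.
    fn budget(&self) -> usize {
        self.max_attempts.clamp(1, HARD_CAP)
    }
}

#[derive(Debug, PartialEq)]
struct LadderExhausted {
    attempts: usize,
}

/// Calls `attempt(rung)` for rung 0, 1, ... until one yields a decision or
/// the budget is spent. Exhaustion is an explicit Err, never a default value.
fn run_ladder<T>(
    cfg: &EscalationConfig,
    mut attempt: impl FnMut(usize) -> Option<T>,
) -> Result<T, LadderExhausted> {
    let budget = cfg.budget();
    for rung in 0..budget {
        if let Some(decision) = attempt(rung) {
            return Ok(decision);
        }
    }
    Err(LadderExhausted { attempts: budget })
}
```

Clamping in one place means a misconfigured budget cannot turn the ladder into an unbounded retry loop.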

…2432) Write failing tests FIRST that pin the target behaviour for eliminating deterministic fallbacks in the Brain reasoners and fixing the dashboard "zero active engineers" reading. Implementation lands in later steps; RED contracts are #[ignore]d and un-ignored as each fix arrives.

GREEN locks (guard current good behaviour):
- retry recovers a real decision on schema-repair; retry budget bounded; exhaustion stays classified as a parse-failure (not a clean success)
- shared recipe_output chokepoint strips copilot banner+ANSI+logs and recovers the JSON payload; extractor consumes a {"decision":...} envelope
- every reasoner capture path routes through recipe_output:: (no bypass)
- active engineers = live (un-ended) subagent sessions; live worktree dispatch claims are counted
- distillation parses a banner/ANSI-polluted facts object
- ladder core introduces no legacy phase-adapter naming

RED (#[ignore], TDD-red until fix lands):
- ladder exhaustion must emit a dashboard-visible brain_parse_error metric, never a silent deterministic default
- a genuine take-no-action must be observably distinct from a parse-failure
- reasoner prompts must mandate a fenced JSON envelope with a `decision` field
- changed reasoner code must use structured tracing only (no stderr/stdout print macros)
- workboard active-engineers gauge must union live dispatch claims with the subagent registry (roots the "zero active engineers" defect)
- distillation must survive observed ~78%-failing capture shapes

Test sources avoid the literal print-macro / legacy-adapter tokens (built via concat!) so the operator's git-grep contract cannot trip on the tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
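
The concat! technique this commit relies on is easy to show in isolation. In the sketch below the forbidden token is assembled at compile time, so a grep for the literal macro name finds nothing in the test source itself; `FORBIDDEN_PRINT_MACRO` and `lines_containing` are hypothetical names, not identifiers from the Simard tests.

```rust
/// Token assembled at compile time; the joined literal never appears in
/// this source file, so a repository-wide grep for it skips this file.
const FORBIDDEN_PRINT_MACRO: &str = concat!("eprint", "ln!");

/// Returns the 1-based line numbers of `source` that contain `token`.
fn lines_containing(source: &str, token: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains(token))
        .map(|(idx, _)| idx + 1)
        .collect()
}
```

A contract test could then feed production source text to `lines_containing` and assert the result is empty.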

…s telemetry (#2432) Operator directive is ABSOLUTE: no silent deterministic fallback. This makes every unrecoverable reasoner parse-miss an EXPLICIT, dashboard-visible error and fixes the "zero active engineers" gauge. Implements the design in docs/design/eliminate-deterministic-fallbacks.md and un-ignores the TDD contracts in ooda_brain/zero_fallback_tests.rs.

No silent default on ladder exhaustion (Contracts 1 + 7):
- run_brain_ladder now emits a loud `brain_parse_error` self-metric + structured tracing::error! when the bounded escalation ladder is EXHAUSTED with no parseable decision. The deterministic floor still returns (it is the correct safety net when no LLM is configured; see design §6), but it can no longer be silent or mistaken for a real decision. A genuine take-no-action (parsed continue_skipping) emits NO error metric, so the two paths are provably distinct. Threaded a `phase` label so the metric attributes to its reasoner.
- The metric write is hermetic under cfg!(test): suppressed unless HOME is a build target/ dir, so `cargo test` never writes to the live ~/.simard.

Structured JSON-envelope decision contract (Contract 3):
- ROOT CAUSE: ooda_decide.md/ooda_brain.md mandated a `DECISION:` marker the RecipeBrain parsers reject (first word "DECISION:" → DefaultMalformed), a prompt/parser mismatch guaranteeing parse-fail→default.
- parse_action_outcome / parse_lifecycle_outcome now consume a fenced JSON envelope `{"decision": "<variant>", ...}` via the shared recipe_output chokepoint FIRST, falling back to the legacy first-word parse (backward compatible: the ladder GREEN locks still pass). The extractor reads the STRUCTURED decision field, not free-prose keyword-sniffing.
- The four reasoner prompts now mandate the fenced JSON envelope with a required `decision` field, preserving pinned sentences (STATUS: ACHIEVED gate, six lifecycle variants, DECISION marker, churn/stuck-loop).

Structured tracing only (Contract 8): removed all eprintln! from the reasoner production paths (recipe_brain.rs, distillation.rs, parse_failure.rs); the paired tracing calls already carry the same fields.

Zero active engineers: TELEMETRY defect, not a real stall (Contract 5):
- Read-only live-daemon evidence: 3 live worktree dispatch claims + real `simard engineer run single-process` subprocesses exist, yet status used a pgrep pattern `simard-engineer` (hyphen) that never matches the real `simard engineer` (space) argv (undercount), and the workboard gauge read only the subagent registry (which diverges from / is empty on cold-start).
- Promoted count_live_engineer_claims to the single source of truth (design G4): added live_engineer_claims(state_root); the workboard "Active Engineers" gauge now unions live dispatch claims with the subagent registry (dedup by PID) so an empty registry with a live engineer can never render 0; status resources.live_engineers now derives from the claim count and the buggy pgrep pattern is retired (design G5).

Tests: un-ignored all five zero_fallback_tests red contracts + the workboard union red test; all pass. Full lib suite green except two pre-existing base_type_copilot integration tests that require `copilot` off PATH (they skip in CI). fmt + clippy --release -D warnings + memory_consolidation race-subset pass.

Operator redeploys after merge (this change does not touch the live daemon or ~/.simard).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
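
The gauge union described in this commit (live dispatch claims merged with the subagent registry, deduplicated by PID) reduces to a set union. The sketch below assumes both sources can be flattened to plain PID lists; `active_engineer_pids` is a hypothetical helper, not the workboard code.

```rust
use std::collections::BTreeSet;

/// Union of the two PID sources, deduplicated by PID. An engineer that
/// appears in both the claims and the registry is counted once.
fn active_engineer_pids(claim_pids: &[u32], registry_pids: &[u32]) -> BTreeSet<u32> {
    claim_pids.iter().chain(registry_pids).copied().collect()
}
```

With three live claims and an empty registry the set has three members, so the cold-start case the commit describes can no longer render as zero.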

…#2591) While this branch was in flight, main merged a comprehensive #2432 solution:
- #2588 eliminated deterministic-default reasoner outcomes on parse-failure (explicit Err + brain_parse_error at the caller; parsers consume {"decision":...}/{"adjusted_urgency":...}).
- #2591 reported the TRUE live-engineer set for the dashboard "Active Engineers" via a new operator_commands_dashboard::live_engineers module.

Those supersede this branch's parallel reasoner/dashboard implementation, so this merge adopts main's version wholesale for the overlapping production code (recipe_brain.rs, workboard.rs, distillation.rs, context.rs, ooda_brain/mod.rs, recipe_merge_judge.rs) and drops this branch's now-redundant zero_fallback_tests.rs (main ships its own zero_fallback_2580_tests).

This branch is reduced to the two genuine residual gaps the #2580 family left, which merge cleanly (main touched neither file):
- status/provider.rs: `resources.live_engineers` still used the buggy `pgrep 'simard-engineer'` (hyphen) pattern that never matches the real `simard engineer` (space) argv (design G5). Now derives from the authoritative count_live_engineer_claims (design G4), matching the dashboard surface. Covered by a new test.
- The four .md embedded-fallback prompts still mandated the `DECISION:` marker / "Do NOT output JSON", contradicting #2588's parsers that now consume a {"decision":...} envelope. Refreshed to the structured JSON-envelope contract (Contract 3), preserving pinned sentences.
- parse_failure.rs: convert a residual eprintln! (gh-issue-filed branch) to structured tracing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-07-04T19:56:54Z

📊 Coverage Summary

Generated by cargo llvm-cov --workspace --summary-only (nightly, excluding test files)

| Module | Lines | Covered | Coverage |
|---|---|---|---|
| Total | 130505 | 108080 | 82.8% |

Coverage data from CI run. Test files matching tests?/ are excluded from line counts.

The zero-fallback-reasoners narrative previously read as if every reasoner already satisfied the no-deterministic-fallback contract. Verified against the current tree, that is only partly true, so annotate each section with an honest status marker (✅ enforced today / ⏳ required end state):
- Fix 1 (single sanitizing chokepoint, src/recipe_output/extract.rs): ✅
- Merge-judge verdict (src/stewardship/recipe_merge_judge.rs): parses {"verdict": …} JSON-first and fails closed to Verdict::Unclear, emitting brain_verdict_parsed_total{phase="merge_judge"}: ✅ (still a deterministic terminal outcome, not the propagated hard error the contract ultimately wants).
- OODA Decide / engineer-lifecycle (src/ooda_brain/recipe_brain.rs parse_action_outcome / parse_lifecycle_outcome): still first-word prose extraction that returns default_advance_goal() / default_continue_skipping() on DefaultEmpty / DefaultMalformed: ⏳
- DeterministicLifecycleBrain on-Err floor (src/ooda_brain/fallback.rs, selected by build_act_brain in daemon/brains.rs, logged "DEGRADED mode"): ⏳
- simard status count_live_engineers() (src/status/provider.rs) shells out to pgrep and is not registry-pinned, unlike the test-pinned Workboard gauge: ⏳

Every claim is cited to the file that backs it. Reframes the page header as the spec to build to, not a description of shipped behaviour, matching the zero-BS stance. The residual code fixes are tracked separately in PR #2595.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

rysweet and others added 4 commits July 4, 2026 17:18

rysweet changed the title ~~fix(brain): make deterministic floors LOUD + fix zero-active-engineers telemetry (#2432)~~ fix(status,prompts): finish the #2432 residuals left by the merged #2580 family Jul 4, 2026

rysweet merged commit f8ae3af into main Jul 4, 2026
17 checks passed

rysweet deleted the fix/brain-eliminate-deterministic-fallbacks-1783182659 branch July 4, 2026 20:19

