diff --git a/codev/plans/929-support-codex-and-gemini-clis-.md b/codev/plans/929-support-codex-and-gemini-clis-.md new file mode 100644 index 000000000..6c28bf8b9 --- /dev/null +++ b/codev/plans/929-support-codex-and-gemini-clis-.md @@ -0,0 +1,197 @@ +# PIR Plan: Support `codex` as an architect (codex-only revision) + +> **Rescoped 2026-06-19 — CODEX-ONLY.** `agy` is dropped from #929 and split into follow-up **#1063** +> (agy's only role-injection channel is its first *user* turn via `--prompt-interactive` — visible, +> weaker, and less durable than claude's `--append-system-prompt` / codex's `-c model_instructions_file=` +> out-of-band injection; not worth shipping a degraded architect). The retiring `gemini` CLI (#778) is +> **removed from architect support** but **kept as a builder harness**. PR #1059 (on this branch, not +> merged) already landed the harness-agnostic resume crash-loop fix and `codex` architect parity; this +> revision delivers the **codex-only delta**: strip the gemini-*architect* additions #1059 made, leaving +> claude + codex as the only supported architects. + +## Understanding + +PR #1059 (on `builder/pir-929`, not yet merged) delivered three things. Two are **kept exactly as-is**; +the third is **reverted**: + +**KEEP — engine-neutral, not gemini-specific:** +- **Harness-agnostic resume crash-loop fix** — the `HarnessProvider.buildResume` seam + (`harness.ts`). Claude returns the resume invocation; every other harness returns `null` → fresh, + role-injected launch. Applied at both the architect site (`tower-instances.ts` `launchInstance`) and + the builder `--resume` site (`spawn.ts` `discoverResumeSession` + `spawn-worktree.ts`). This fixes the + latent crash-loop where a non-Claude harness + a stale Claude `.jsonl` built an invalid + ` --resume ` and shellper restart-looped to death. **Untouched by this revision** — + it is the core deliverable and is independent of which non-Claude harnesses are supported. +- **`codex` architect parity** — codex is unaffected by the gemini retirement. Its role injection via + `-c model_instructions_file=<.architect-role.md>` (`CODEX_HARNESS.buildRoleInjection`), project + context read natively from `AGENTS.md`, and fresh-launch / reconnect paths all stay. + +**REVERT — the gemini-*architect* additions #1059 made (the codex-only delta):** +#1059 added gemini as a supported *architect*, which introduced an architect context-file seam +(`getArchitectFiles` + `writeArchitectContextFiles`) whose **only implementer is `GEMINI_HARNESS`** and +whose **only consumer is `buildArchitectArgs`**. With gemini removed as an architect, that entire seam +becomes dead code. Verified by grep — no other harness implements `getArchitectFiles`; claude and codex +do not (codex reads `AGENTS.md` natively). So the seam is deleted, not just gemini's implementation. + +**Gemini stays a builder harness.** `GEMINI_HARNESS.buildRoleInjection` / +`buildScriptRoleInjection` (via `GEMINI_SYSTEM_MD`) and all gemini *builder* tests are untouched. Only +gemini's *architect* surface is removed. + +### Why no `getArchitectFiles` seam survives + +Grep on the current branch: +- `getArchitectFiles` interface method (`harness.ts:74`) — declared optional. +- `GEMINI_HARNESS.getArchitectFiles` (`harness.ts:135`) — the **only** implementer. +- `writeArchitectContextFiles` (`tower-utils.ts:181`) — the **only** consumer of `getArchitectFiles`. +- `writeArchitectContextFiles` is itself called from exactly one place: `buildArchitectArgs` + (`tower-utils.ts:205`). + +After removing gemini's architect surface, nothing implements `getArchitectFiles`, so +`writeArchitectContextFiles` is a no-op for every harness. Per the brief ("remove the seam itself IF no +harness implements it anymore"), the whole chain is deleted: interface method, gemini impl, +`writeArchitectContextFiles`, and its `buildArchitectArgs` call site. `buildArchitectArgs` keeps its +`const harness = getArchitectHarness(...)` line — `harness` is still used immediately after for +`harness.buildRoleInjection(...)`. + +## Proposed Change + +### 1. Remove the gemini-architect context-file seam (`harness.ts`, `tower-utils.ts`) + +- **`harness.ts`**: delete the `getArchitectFiles?(workspacePath)` method from the `HarnessProvider` + interface (and its doc comment, ~lines 71–80), and delete `GEMINI_HARNESS.getArchitectFiles` + (~lines 132–138, including the "Gemini reads project context from .gemini/settings.json" comment). + Leave `GEMINI_HARNESS.buildRoleInjection` / `buildScriptRoleInjection` (the `GEMINI_SYSTEM_MD` builder + surface) intact. +- **`tower-utils.ts`**: delete `writeArchitectContextFiles` (lines 171–190, including its doc comment) + and its call in `buildArchitectArgs` (line 205). Update the `buildArchitectArgs` doc comment (line 196) + to drop the "Also writes any harness-specific context files (e.g. Gemini's .gemini/settings.json)" + sentence. Confirm `harness` (line 204) is still referenced by `buildRoleInjection` below (it is) — keep + the `getArchitectHarness` call. + +### 2. `doctor`: affirm codex, bar gemini as architect (`commands/doctor.ts:699–705`) + +Split the current `resolvedHarness === 'codex' || resolvedHarness === 'gemini'` affirmation branch: +- **`codex`**: keep the green "✓ codex is configured as architect shell — supported" affirmation, with + the "Conversation resume is Claude-main-only" + "select via `.codev/config.json`" gray notes. +- **`gemini`**: move into the warn/bar branch alongside `opencode` — print a yellow warning that gemini + is **unsupported as an architect** (the Gemini CLI is retiring, #778) but **supported for builders**, + recommending `claude` or `codex` for the architect. Add a matching `warningDetails` entry. + +### 3. Remove the `.gemini/settings.json` gitignore entry + +Nothing writes that file anymore (the seam is gone): +- **Root `.gitignore:11`** — remove the `.gemini/settings.json` line. +- **`lib/gitignore.ts:24`** (`CODEV_GITIGNORE_ENTRIES`) — remove the line. (`codev update`'s gitignore + backfill only *appends*, so an adopter's existing line is left in place; no churn, no removal logic + needed. No `codev-skeleton/` mirror exists — grep confirms.) + +### 4. Tests + +Drop gemini-*architect* cases; keep gemini *builder* cases. +- **`agent-farm/__tests__/tower-utils.test.ts`**: remove the `writeArchitectContextFiles (#929)` describe + block (lines 207–239) and the `writeArchitectContextFiles` import (line 21). (The block's claude case + "writes nothing" is moot once the function is deleted.) +- **`agent-farm/__tests__/tower-instances.test.ts:712,728`**: remove the two gemini + `.gemini/settings.json` architect tests (write-if-missing + no-clobber). **Keep** the codex architect + resume-skip regression guard (stale Claude `.jsonl` + codex → fresh `buildArchitectArgs`, no `--resume`) + and the claude-still-resumes case — these guard the kept crash-loop fix. +- **`agent-farm/__tests__/config.test.ts:131`**: remove the `expect(harness.getArchitectFiles).toBeDefined()` + assertion (the seam no longer exists). Keep the rest of that test's gemini *builder*-harness assertions. +- **`__tests__/gitignore.test.ts:140,211`** and **`__tests__/update.test.ts:530,550`**: drop + `.gemini/settings.json` from the expected `added` / `gitignoreAdded` arrays. +- **`af-architect.test.ts`**: keep codex arg-construction cases; remove any gemini-*architect* case if + present (grep at implement time). This file is the no-Tower command path; it does not guard the resume + regression (that's `tower-instances.test.ts`). +- Sweep for any remaining gemini-*architect* test reference; **keep** every gemini *builder* test + (`GEMINI_SYSTEM_MD` role injection, `getBuilderHarness` gemini cases). + +### 5. Docs (`codev/resources/arch.md` #929 subsection, lines 272–281) + +Rewrite to claude + codex architects only: +- Para 1 (line 274): "claude, codex, and gemini are all supported as architects" → "claude and codex are + supported as architects; **gemini is builder-only** (Gemini CLI retiring, #778)". Drop the trailing + "Gemini additionally gets a `.gemini/settings.json` … via `getArchitectFiles`" sentence entirely. + Adjust the override-aware-resolution examples to use codex (the architect path); the gemini override + example can be reframed as a *builder* example or dropped. +- Caveat para (line 276, #1062): keep — still valid. Reword "recognized codex/gemini cases" → "recognized + codex case" (architect scope); the mechanism is unchanged. +- Conversation-resume para (line 278): keep. Reword "Codex/gemini architects" → "Codex architects"; keep + "(and resumed codex/gemini **builders**, via `spawn.ts` `discoverResumeSession`)" — gemini builders + still exercise the `buildResume → null` path. +- `getArchitectFiles` para (line 280): **delete entirely** — the seam is gone. +- **`CLAUDE.md` / `AGENTS.md`**: check the architect-harness guidance for any gemini-architect claim; if + none needs changing, leave both untouched (they must stay byte-identical). The always-on context block + references agy only for *consult*, not architect — no change expected there. + +### 6. Stray comments referencing the removed seam + +Grep-and-fix the prose comments that mention "Gemini's .gemini/settings.json" in non-test code so no dead +reference survives: +- `agent-farm/commands/architect.ts:29` +- `agent-farm/servers/tower-instances.ts:533` +- (the `tower-utils.ts` comments are removed with the function in §1) + +## Files to Change + +- `packages/codev/src/agent-farm/utils/harness.ts` — remove `getArchitectFiles?` from the interface and + `GEMINI_HARNESS.getArchitectFiles`. (Keep `buildResume`, `CODEX_HARNESS`, gemini builder surface.) +- `packages/codev/src/agent-farm/servers/tower-utils.ts` — remove `writeArchitectContextFiles` + its + `buildArchitectArgs` call; update the doc comment. +- `packages/codev/src/commands/doctor.ts` — affirm codex; bar gemini as architect (builder-only warning). +- `.gitignore` (root) and `packages/codev/src/lib/gitignore.ts` — remove `.gemini/settings.json`. +- `packages/codev/src/agent-farm/commands/architect.ts`, + `packages/codev/src/agent-farm/servers/tower-instances.ts` — fix stray `.gemini/settings.json` comments. +- Tests: `tower-utils.test.ts`, `tower-instances.test.ts`, `config.test.ts`, `gitignore.test.ts`, + `update.test.ts`, and `af-architect.test.ts` if it carries a gemini-architect case. +- Docs: `codev/resources/arch.md` (#929 subsection); `CLAUDE.md` / `AGENTS.md` only if a gemini-architect + claim is present. + +**Explicitly NOT changing** (dropped from the prior agy plan): no `AGY_HARNESS`, no `lib/agy-bin.ts` +extraction, no `resolveBinary` seam, no agy tests/docs, no `message-write.ts` submit strategy. + +## Risks & Alternatives Considered + +- **Risk — deleting the seam breaks codex architect context.** It does not: codex reads `AGENTS.md` + natively (no pointer file), and the seam only ever served gemini. Regression covered by the kept codex + resume-skip + arg-construction tests, and verified live at `dev-approval`. +- **Risk — leftover gemini-architect references compile/pass tests but mislead.** Mitigated by the + grep-and-fix sweep (§4, §6) across both code and docs before signalling complete. +- **Risk — `codev update` adopters retain a stale `.gemini/settings.json` gitignore line.** Acceptable: + the backfill only appends, so existing adopter lines are inert and harmless; no removal migration is in + scope. New projects simply won't get the line. +- **Risk — `TOWER_ARCHITECT_CMD`/`--architect-cmd` override without matching `.codev/config.json`.** + Pre-existing #1059/#1062 caveat (unrecognized override commands default to the claude harness). Out of + scope; documented in arch.md's caveat para. Recommended selection remains `.codev/config.json`. +- **Alternative — keep gemini as an architect (status quo of #1059).** Rejected by the architect: the + Gemini CLI retires 2026-06-18 (#778); shipping it as an architect would ship a dying dependency. agy, + its intended successor, is deferred to #1063 because its role-injection channel is too weak. +- **Alternative — keep the `getArchitectFiles` seam "for future harnesses."** Rejected: no current + harness uses it (claude/codex read `AGENTS.md` natively), and the brief explicitly says delete dead + code. A future harness that needs a pointer file can reintroduce a purpose-built seam. + +## Test Plan + +**Unit (this worktree, gate-verifiable from the diff):** +- `pnpm --filter @cluesmith/codev build` clean (seam removal is purely subtractive in the interface). +- `pnpm --filter @cluesmith/codev test` — the kept codex architect resume-skip + claude-resume guards + green; gemini *builder* tests green; gitignore/update tests reflect the dropped `.gemini/settings.json` + entry; no test references the deleted `getArchitectFiles` / `writeArchitectContextFiles`. +- Grep-clean: no remaining `getArchitectFiles`, `writeArchitectContextFiles`, or + `.gemini/settings.json` reference anywhere except (a) gemini *builder* code/tests, which must remain. + +**Manual / empirical (reviewer at the `dev-approval` gate; needs `codex` installed + signed in). Set +`shell.architect: "codex …"` (and/or `shell.architectHarness: "codex"`) in `.codev/config.json`:** +- `afx architect` (no-Tower) launches codex with the role injected via `-c model_instructions_file=`. +- `afx workspace start` main architect launches on a **clean** cwd. +- `afx workspace start` main architect launches with a **stale Claude `.jsonl` present** in + `~/.claude/projects//` — must **not** crash-loop; confirm **no `--resume`** in the + launched command (the primary regression target, the kept crash-loop fix). +- `afx workspace add-architect` sibling codex launches with role injected. +- `afx send` delivers + submits: single-line, multi-line (>3 lines, no swallowed Enter), `--interrupt`, + and while codex's TUI is streaming. (Flakiness → triggers the deferred submit-strategy follow-up.) +- Tower stop→start reconnect and shellper auto-restart both relaunch the codex architect. +- A builder spawned by a codex architect preserves `CODEV_ARCHITECT_NAME` affinity. +- `afx spawn --resume` on a non-Claude builder → fresh launch + resume notice (not + `--resume `). +- `codev doctor` with `shell.architect: "codex …"` → affirms codex; with `gemini` → warns + builder-only-not-architect. diff --git a/codev/projects/929-support-codex-and-gemini-clis-/status.yaml b/codev/projects/929-support-codex-and-gemini-clis-/status.yaml new file mode 100644 index 000000000..cd0c9bb92 --- /dev/null +++ b/codev/projects/929-support-codex-and-gemini-clis-/status.yaml @@ -0,0 +1,38 @@ +id: '929' +title: support-codex-and-gemini-clis- +protocol: pir +phase: verified +plan_phases: [] +current_plan_phase: null +gates: + plan-approval: + status: approved + requested_at: '2026-06-17T02:45:09.085Z' + approved_at: '2026-06-19T19:46:12.660Z' + dev-approval: + status: approved + requested_at: '2026-06-19T19:53:44.136Z' + approved_at: '2026-06-19T20:25:50.053Z' + pr: + status: approved + requested_at: '2026-06-19T20:39:26.845Z' + approved_at: '2026-06-19T20:45:31.893Z' +iteration: 1 +build_complete: false +history: [] +started_at: '2026-05-31T19:04:30.567Z' +updated_at: '2026-06-19T20:46:12.163Z' +pr_history: + - phase: review + pr_number: 1 + branch: builder/pir-929 + created_at: '2026-06-16T19:10:45.900Z' + - phase: review + pr_number: 1059 + branch: builder/pir-929 + created_at: '2026-06-16T19:49:59.539Z' + - phase: review + pr_number: 1059 + branch: builder/pir-929 + created_at: '2026-06-19T20:31:25.040Z' +pr_ready_for_human: false diff --git a/codev/resources/arch.md b/codev/resources/arch.md index b6e866519..4bfe41821 100644 --- a/codev/resources/arch.md +++ b/codev/resources/arch.md @@ -253,7 +253,7 @@ All architect sessions (at all 3 creation points) receive a role prompt injected 1. Loads the architect role from `codev/roles/architect.md` (local) or `skeleton/roles/architect.md` (bundled fallback) via `loadRolePrompt()` 2. Writes the role content to `.architect-role.md` in the project directory -3. Appends `--append-system-prompt ` to the architect command args +3. Delegates the CLI-specific injection to the configured `HarnessProvider` (`agent-farm/utils/harness.ts`, Spec 591): claude `--append-system-prompt`, codex `-c model_instructions_file=`, gemini `GEMINI_SYSTEM_MD` env var **Three architect creation points** where role injection is applied: - `tower-instances.ts` → `launchInstance()` (new project activation) @@ -269,6 +269,16 @@ Framework files (protocol docs, templates) live in the package skeleton (resolve A `codev doctor` audit (`lib/framework-ref-audit.ts`) flags shell-fetch of framework files in a project's local `codev/` overrides; the shipped skeleton is guarded by a CI unit test. +#### Supported Architect Harnesses & Conversation Resume (#929) + +**Supported architect harnesses** (Issue #929): claude and codex are supported as architects, selected via `.codev/config.json` (`shell.architect` / `shell.architectHarness`) — the same config-driven mechanism builders use, and the *recommended* one. **Gemini is builder-only** — the Gemini CLI is retiring (#778), so it is not offered or affirmed as an architect (its `GEMINI_SYSTEM_MD` builder surface stays); `doctor` warns if `gemini` is configured as an architect. (agy, the gemini successor, is deferred as an architect to #1063 — its only role-injection channel is a visible first user turn.) Harness auto-detection is **override-aware**: `getArchitectHarness` / `getBuilderHarness` resolve the harness from the override-aware command (`getResolvedCommands` → `cliOverrides` / `TOWER_ARCHITECT_CMD` / config), so a `--architect-cmd codex` / `TOWER_ARCHITECT_CMD=codex` / `--builder-cmd gemini` with no matching harness config still resolves the *non-claude* harness, not claude. (Before #929 it auto-detected from the raw config value only — an override launched the non-claude CLI but resolved the claude harness, re-arming the resume crash-loop below.) An explicit `shell.architectHarness` / `shell.builderHarness` still wins over auto-detection. OpenCode remains builder-only (file-based injection needs an ephemeral worktree). Codex reads project context (`AGENTS.md`) natively, so no architect context-file seam is needed; the `getArchitectFiles` seam #1059 added for gemini was removed with gemini's architect support. + +> **Caveat — unrecognized override commands still default to the claude harness (tracked in cluesmith/codev#1062).** `#929`'s override-awareness only covers *recognized* harness commands (claude/codex/gemini/opencode, matched by `detectHarnessFromCommand`). An override command the detector does **not** recognize — e.g. `TOWER_ARCHITECT_CMD=bash`, a wrapper script, or any custom launcher — with **no** explicit `shell.architectHarness` / `shell.builderHarness` falls through `resolveHarness` to the **claude** harness (`harness.ts`, the final `return CLAUDE_HARNESS`). With a stale Claude `.jsonl` present, that can still build ` --resume ` for the unrecognized command. This is **pre-existing and narrow** (not a #929 regression — #929 strictly *improved* the recognized codex case) and separable. Mitigation today: set an explicit `shell.architectHarness` / `shell.builderHarness` when using an unrecognized launcher command. + +**Conversation resume is Claude-main-only.** `launchInstance` resumes a prior session only when the configured harness implements `HarnessProvider.buildResume` — currently just claude (its sessions live at `~/.claude/projects//*.jsonl`). Codex architects (and resumed codex/gemini *builders*, via `spawn.ts` `discoverResumeSession`) return `null` from `buildResume` and relaunch fresh with role injection. This gating fixes a latent crash-loop where a non-Claude harness + a stale Claude `.jsonl` built an invalid ` --resume ` invocation and shellper restart-looped to death. + +**Architect role injection is centralized in `buildArchitectArgs`** (`tower-utils.ts`), the shared helper every architect-launch path routes through — `launchInstance` (fresh), `add-architect` (sibling), shellper reconnect (×2), and the no-Tower `afx architect` (refactored in #929 to call `buildArchitectArgs` instead of duplicating injection). So the architect role is injected on **every** launch path, not just first-activation. (No architect context-file seam exists: claude/codex read project context natively; the gemini-only `getArchitectFiles` seam #1059 introduced was removed when gemini's architect support was dropped.) + #### Multi-Architect Support (Spec 755 / Spec 786) A workspace can host more than one architect terminal. Each architect has a stable name (`main` for the workspace's default; siblings via `afx workspace add-architect`). The primary use case is letting a sibling architect drive a focused workflow without monopolising `main`. diff --git a/codev/resources/lessons-learned.md b/codev/resources/lessons-learned.md index 90762a4af..b9af69264 100644 --- a/codev/resources/lessons-learned.md +++ b/codev/resources/lessons-learned.md @@ -66,6 +66,7 @@ Generalizable wisdom extracted from review documents, ordered by impact. Updated ## Architecture +- [From #929] When abstracting per-CLI (or per-provider) behavior behind a provider interface, audit **every** call site that builds a CLI invocation — including resume/restart paths, not just the obvious fresh-launch path. Spec 591's harness abstraction routed fresh-launch role injection through the provider but left the `--resume` seam reading Claude's session store unconditionally; a codex/gemini architect (or resumed builder) with any stale Claude `.jsonl` then built ` --resume ` and shellper restart-looped to death. The bug only fires on the resume branch (fresh launches were already correct), which is exactly why "builders already prove the path" didn't cover it — resume is a *different* call site. Fix: an optional capability (`buildResume`) that returns the resume invocation in both Node-argv and shell-escaped-fragment forms (mirroring `buildRoleInjection`/`buildScriptRoleInjection`), so no CLI-specific string survives at the call site and only the harness that has a cwd-keyed session store opts in (others return null → fresh launch). - [From 1052] Shared *shape* is not shared *substance* — resist extracting on resemblance. The VSCode replay buffer-and-flush and the web client's `flushInitialBuffer` do the same conceptual thing (hold replay → paint once at the settled size), but their triggers and bodies diverge: web uses a fixed 500ms delay entangled with `FitAddon`/`ScrollController`; VSCode uses a `setDimensions` debounce + PTY-resize + paced `writeChunked`. The only common part is a ~10-line "accumulate then emit once" skeleton; the valuable logic (trigger policy, flush body) differs. Extracting would couple two independently-evolving strategies behind one leaky name — a net negative at two call sites. Contrast the primitives that *are* genuinely identical and correctly centralized in core: `reconnect-policy` and `escape-buffer` (the dashboard's `escapeBuffer.ts` is a 5-line re-export of core's, not a copy). Corollary, painfully relearned here: a claim of "duplication" from an *import path* (`../lib/escapeBuffer.js`) is not evidence — read the file before filing a dedup; it was a re-export shim all along. - [From #907] A workspace package consumed as *source* by one toolchain and as *built output* by another resolves through different `exports` conditions: tsc and vite read `exports.types` (`./src/index.ts`), while esbuild (the VS Code extension bundler) reads the runtime `exports.default` (`./dist/index.js`). `@cluesmith/codev-types` had no `dist` in a fresh worktree, so `tsc --noEmit`, the codev build, and the full test suite all passed while only the extension's esbuild bundle failed with `Could not resolve "@cluesmith/codev-types"`. A green type-check/test run does **not** prove every bundler can resolve a package — if a workspace package ships a `default → ./dist` condition, it must be built before any esbuild consumer. Root `pnpm build` now builds `types` first (types → core → codev). - [From #907] Derived projection fields need a fallback when their source is transiently unreadable. `OverviewBuilder.area` is re-resolved every refresh from the open-only `gh issue list`; once an issue closes (merge), is torn down mid-cleanup, or the fetch fails, the lookup misses and the field snapped back to its `Uncategorized` default while the builder was still listed — a visible mis-render once #818 made `area` control tree placement. Fix: cache the last *resolved* value (`ResolvedEnrichmentCache`), gated on **source reachability, not value emptiness**, so a reachable-but-unlabeled issue still records a genuine `Uncategorized` and a stale entry can't mask a live label change. The value is a stable per-item fact, so this is a cache bridging a read gap, not history of a changing value — which is why it lives on the process-lifetime cache singleton, not the per-refresh builder DTO. diff --git a/codev/reviews/929-support-codex-and-gemini-clis-.md b/codev/reviews/929-support-codex-and-gemini-clis-.md new file mode 100644 index 000000000..67a395492 --- /dev/null +++ b/codev/reviews/929-support-codex-and-gemini-clis-.md @@ -0,0 +1,104 @@ +# PIR Review: Support `codex` as an architect (codex-only) + +Fixes #929 + +## Summary + +Brings the OpenAI `codex` CLI to parity with `claude` as a Codev **architect**, selectable via `.codev/config.json` (`shell.architect` / `shell.architectHarness`). The core fix routes session-discovery + `--resume` argument construction behind a new optional `HarnessProvider.buildResume` capability, eliminating a latent crash-loop where a non-Claude architect (or resumed builder) with any stale Claude `.jsonl` built an invalid ` --resume ` invocation and shellper restart-looped to death. **Gemini is builder-only** — the Gemini CLI is retiring (#778), so the originally-scoped gemini-*architect* support was removed; gemini's `GEMINI_SYSTEM_MD` *builder* surface is untouched. (`agy`, the gemini successor, is deferred as an architect to follow-up #1063 — its only role-injection channel is a visible first user turn.) + +## Scope note (codex-only rescope, 2026-06-19) + +This branch was originally scoped as codex **+** gemini architects (the earlier #1059 implementation). It was rescoped mid-review to **codex-only**: the gemini-architect additions (the `getArchitectFiles` context-file seam, `writeArchitectContextFiles`, the doctor affirmation, the `.gemini/settings.json` gitignore entry, and gemini-architect tests) were reverted in a surgical, subtractive pass. The engine-neutral `buildResume` crash-loop fix and codex architect parity — the durable deliverables — are fully intact. The net diff vs `main` therefore reflects codex architect parity + the crash-loop fix, with no gemini-architect surface. + +## Files Changed + +Net vs `main` (`git diff --stat `): + +- `packages/codev/src/agent-farm/utils/harness.ts` (+27) — `buildResume?` on the `HarnessProvider` interface; `CLAUDE_HARNESS.buildResume` (delegates to `findLatestSessionId`, returns bundled `{ sessionId, args, scriptFragment }`); codex/gemini/opencode leave it undefined → fresh launch +- `packages/codev/src/agent-farm/utils/config.ts` (+16) — `getArchitectHarness`/`getBuilderHarness` resolve from the *override-aware* command (`getResolvedCommands`); `getResolvedCommands.architect` honors `TOWER_ARCHITECT_CMD` +- `packages/codev/src/agent-farm/servers/tower-instances.ts` (+35/-…) — architect resume gated on `getArchitectHarness(...).buildResume?.()`, composed with the pre-existing `safeToResume` sibling-collision guard; WARN gated on resume support +- `packages/codev/src/agent-farm/servers/tower-utils.ts` (+3/-1) — `buildArchitectArgs` injects the architect role via the shared helper (the gemini-only `writeArchitectContextFiles` was added then removed in the rescope) +- `packages/codev/src/agent-farm/commands/spawn.ts` (+46/-…) — `discoverResumeSession` takes the builder harness, returns the bundled resume object; both call sites pass `getBuilderHarness(...)`; distinct "harness does not support resume" log +- `packages/codev/src/agent-farm/commands/spawn-worktree.ts` (+25/-…) — `startBuilderSession`'s `resumeSessionId?: string` → `resume?: { sessionId, scriptFragment }`; script emits the pre-escaped fragment +- `packages/codev/src/agent-farm/commands/architect.ts` (+…/-…) — no-Tower path calls the shared `buildArchitectArgs` (was duplicating role injection) +- `packages/codev/src/commands/doctor.ts` (+24) — affirm codex architect support; warn that gemini is builder-only (not an architect) +- `.gitignore`, `packages/codev/src/lib/gitignore.ts` — net unchanged (the `.gemini/settings.json` entry was added then removed in the rescope) + +Tests: +- `packages/codev/src/agent-farm/__tests__/tower-instances.test.ts` (+102) — architect resume-skip regression guard (codex + stale Claude jsonl → fresh, no `--resume`; claude still resumes) +- `packages/codev/src/agent-farm/__tests__/spawn-worktree.test.ts` (+75) — builder resume script uses escaped `scriptFragment`; codex/gemini builders → fresh script, no `--resume` +- `packages/codev/src/agent-farm/__tests__/discover-resume-session.test.ts` (+45) — harness-arg threading; codex/gemini null-return + claude bundled-object cases +- `packages/codev/src/agent-farm/__tests__/config.test.ts` (+53) — override-aware harness resolution (`TOWER_ARCHITECT_CMD` / `--architect-cmd` / `--builder-cmd` → non-claude harness, no `buildResume`) +- `packages/codev/src/agent-farm/__tests__/af-architect.test.ts` (+17) — `buildArchitectArgs` delegation mocks +- `packages/codev/src/agent-farm/__tests__/tower-utils.test.ts` (+1) — minor + +Docs / artifacts: +- `codev/resources/arch.md` — "Supported Architect Harnesses & Conversation Resume (#929)" subsection: claude + codex architects, gemini builder-only, Claude-only resume, override-aware resolution, #1062 caveat +- `codev/resources/lessons-learned.md` — one Architecture lesson (audit *all* invocation seams when extending a provider abstraction) +- `codev/plans/...`, `codev/reviews/...`, `codev/state/pir-929_thread.md`, `codev/projects/929-*/status.yaml` + +## Commits + +Net vs `main` (excluding porch chores + the main-merge commit): + +- `53c9f037` [PIR #929] Thread: implement phase complete (codex-only) +- `be84f75a` [PIR #929] Tests + arch.md: claude+codex architects only (drop gemini-architect cases) +- `0d920884` [PIR #929] doctor: bar gemini as architect (builder-only); drop .gemini/settings.json gitignore +- `a6b42daa` [PIR #929] Remove gemini-architect context-file seam (codex-only) +- `db88349c` [PIR #929] Plan revised: codex-only (agy dropped → #1063, gemini builder-only) +- `1f4bfd30` [PIR #929] Doc: caveat — unrecognized override commands default to claude harness (#1062) +- `9b615cdf` [PIR #929] Address architect integration review: override-aware harness + centralized context files +- `69cf20de` [PIR 929][Phase: implement] feat: harness-gated session resume for codex/gemini architects + +## Test Results + +- `pnpm build`: ✓ pass (clean TS types) +- `pnpm test` (full suite): ✓ **3332 passed, 48 skipped, 0 failures** +- Manual verification: empirical codex architect lifecycle validation (clean + stale-jsonl launch, add-architect, `afx send`, reconnect, affinity, builder `--resume`) was exercised by the human at the `dev-approval` gate against the running worktree — the reason PIR was chosen over AIR/BUGFIX. + +## Architecture Updates + +**COLD (`codev/resources/arch.md`)** — updated. The "Supported Architect Harnesses & Conversation Resume (#929)" subsection documents: (1) claude + codex are supported architects selected via `.codev/config.json`; gemini is **builder-only** (Gemini CLI retiring, #778); (2) conversation resume is Claude-main-only via `HarnessProvider.buildResume`, and the crash-loop it fixes; (3) override-aware harness resolution and the #1062 unrecognized-command caveat; (4) no architect context-file seam exists — claude/codex read `AGENTS.md` natively, and the gemini-only `getArchitectFiles` seam was removed with gemini's architect support. + +No **HOT** (`arch-critical.md`) change: this PR extends an existing abstraction (Spec 591) rather than introducing a new always-on invariant, so a cold-tier reference detail is the correct routing. + +## Lessons Learned Updates + +**COLD (`codev/resources/lessons-learned.md`, Architecture section)** — one lesson retained: when abstracting per-CLI behavior behind a provider, every call site that builds a CLI invocation must route through the provider — including resume/restart paths, not just the obvious fresh-launch path. The resume seam was the one path Spec 591 left harness-blind, and it only crash-loops on the `--resume` branch (fresh launches were already correct), which is why "builders already prove the path" didn't cover it. + +No **HOT** (`lessons-critical.md`) change: this is a spec-narrow recipe better suited to the cold archive. + +## Things to Look At During PR Review + +- **The `buildResume` bundling decision** (`harness.ts`): one method returns both the Node-argv `args` (for the `spawn()` architect site) and a shell-escaped `scriptFragment` (for the builder bash generator), mirroring `buildRoleInjection`/`buildScriptRoleInjection`. Avoids a second independently-optional method (which would force a `!` non-null assertion) and avoids `.join(' ')`-ing a raw argv into bash (word-split/quoting bug). +- **The `safeToResume` interaction** (`tower-instances.ts`): the new harness gate composes with the pre-existing sibling-collision guard (`safeToResume`, #832) — resume happens only when *both* the harness implements `buildResume` *and* no persisted siblings exist. +- **Override-aware harness resolution precedence (BLOCKER B fix)**: `getResolvedCommands.architect` is `cliOverrides.architect || TOWER_ARCHITECT_CMD || config`. Within the Tower process `cliOverrides` is empty (set in the spawning `afx` process, not the long-lived server), so this matches the launch site's `TOWER_ARCHITECT_CMD || config`. An explicit `shell.architectHarness` / `shell.builderHarness` still wins over auto-detection by design. +- **Codex-only seam removal**: the gemini-architect `getArchitectFiles` / `writeArchitectContextFiles` seam was deleted (gemini was its only implementer; `buildArchitectArgs` was its only consumer). Confirm `buildArchitectArgs` still resolves `getArchitectHarness(...)` (used by `buildRoleInjection`) and that codex needs no context file (reads `AGENTS.md` natively). Grep-verified: zero residual `getArchitectFiles` / `writeArchitectContextFiles` / `.gemini/settings.json` references. +- **Known caveat — unrecognized override commands still default to the claude harness (follow-up: cluesmith/codev#1062).** The override-awareness covers *recognized* commands (claude/codex/gemini/opencode); `resolveHarness` still falls through to `CLAUDE_HARNESS` for an **unrecognized** override (e.g. `TOWER_ARCHITECT_CMD=bash`) with no explicit `shell.architectHarness`. Pre-existing, narrow, separable (not a #929 regression). Mitigation: set an explicit harness. Documented in `arch.md`; tracked in #1062. + +## Consultation Findings & Dispositions + +A **full 3-way advisory CMAP** (gemini, codex, claude) ran on this codex-only PR as a single pass, instructed to read every changed file **in full** plus callers/dependencies and assess the whole PR (not the unified diff). Verdicts: + +- **codex: APPROVE** — no blocking issues. Confirmed the crash-loop fix is correct and the Claude-only resume gating is properly centralized behind `buildResume`. Three non-blocking coverage nits (below). +- **claude: APPROVE** — no blocking issues. Traced every changed file + callers; verified a codex/gemini harness can never reach a `--resume ` at either the architect or builder site, that the gemini-architect seam removal is complete (zero dangling refs), and that the crash-loop regression is pinned from **four independent angles** (`discover-resume-session`, `tower-instances`, `config`, `spawn-worktree` tests). Four non-blocking observations (below). +- **gemini (agy): no usable verdict** — the Antigravity CLI returned a generic greeting in 8.1s rather than performing the review (the structural agy limitation — no durable task/system-prompt channel — that motivated deferring agy as an architect to #1063). Not a review of the code; recorded as unavailable, not as an APPROVE or REQUEST_CHANGES. + +**Net: 2/2 substantive reviewers APPROVE, zero REQUEST_CHANGES, zero blocking defects → no code change required.** + +**Non-blocking nits (consensus, accepted as a low-priority follow-up):** both codex and claude noted that the `doctor` architect-shell branch (the codex-affirm / gemini-builder-only-warning logic) and the no-Tower `afx architect` codex path lack *direct* unit tests, and that the "explicit `shell.architectHarness` wins" rule isn't pinned. Disposition: **deferred.** The branch logic is trivial and both reviewers independently verified it reads correctly; there is currently **no** test harness for the `doctor` `shell.architect` block at all (the pre-existing opencode-architect branch is likewise untested), so adding the first test for it is net-new infrastructure beyond this PR's subtractive scope — tracked as a follow-up rather than expanded here. claude also re-flagged the known `doctor`-reads-config-not-overrides cosmetic discrepancy (same class as the #1062 caveat) and the `safeToResume` stale-removed-sibling jsonl gap (documented "acceptable until #832"). No action; both pre-existing and separable. + +Full reviewer outputs: `/tmp/cmap-929-{codex,claude,gemini}.md`. + +## How to Test Locally + +For reviewers pulling the branch: + +- **View diff**: VSCode sidebar → right-click builder `pir-929` → **Review Diff** +- **Run dev server**: `afx dev pir-929` +- **What to verify** (needs codex installed; set `shell.architect: "codex"`): + - `afx workspace start` main architect launches with a **stale Claude `.jsonl` present** in `~/.claude/projects//` — must NOT crash-loop, and no `--resume` in the launched command (primary regression target) + - `afx architect` (no-Tower) + `afx workspace add-architect` launch with role injected + - `afx send` single-line / multi-line (>3 lines) / `--interrupt` / while streaming + - `afx spawn --resume` on a non-Claude builder → fresh launch + resume notice, inspect `.builder-start.sh` for no `--resume ` + - `codev doctor` with `shell.architect: "codex"` → affirms codex; with `gemini` → warns builder-only-not-architect diff --git a/codev/state/pir-929_thread.md b/codev/state/pir-929_thread.md new file mode 100644 index 000000000..2f8f6b8a8 --- /dev/null +++ b/codev/state/pir-929_thread.md @@ -0,0 +1,208 @@ +# PIR #929 — Support codex & gemini CLIs as architects + +## Builder thread + +### 2026-06-17 — RESCOPE: agy replaces gemini for architects (porch rewound to PLAN) + +Architect rewound porch to the plan phase. Issue #929 rescoped (#778): the Gemini CLI retires +2026-06-18; `agy` (Antigravity CLI) replaces it as the non-Claude **architect**. PR #1059 stays +open/unmerged; we extend it. Keep from #1059: the harness-agnostic `buildResume` seam, codex parity, +the `getArchitectFiles` seam. Delta: add `AGY_HARNESS`, swap gemini out of architect support. + +**Key investigation finding (flagged to architect):** agy's role/context mechanism is NOT gemini's — +it's codex-like. +- agy reads `AGENTS.md` **natively** (binary strings: "append to AGENTS.md in the Workspace + Customizations Root"; global root = `~/.gemini`). So agy needs **no** `getArchitectFiles` pointer + file (the retired gemini CLI needed `.gemini/settings.json` → context.fileName; agy doesn't). +- agy has **no** `--append-system-prompt` flag / no `GEMINI_SYSTEM_MD` env. Role injection rides + `-i`/`--prompt-interactive ""` (agy --help; consult's "hermes precedent"), folding the role + into the interactive launch prompt. So `AGY_HARNESS.buildRoleInjection` → `['--prompt-interactive', + roleContent]`. +- Binary resolution reuses consult's `resolveAgyBin`/`isRealAgyCli`/`agyRespondsToVersion` (bare `agy` + on PATH may be the IDE launcher symlink). Plan: extract those to `lib/agy-bin.ts`, add optional + `HarnessProvider.resolveBinary?`, apply at the ~6 architect executable-determination sites. + +Swap details: doctor affirms codex/`agy` (bars gemini as architect like opencode); remove +`GEMINI_HARNESS.getArchitectFiles` (architect-only dead code) + the `.gemini/settings.json` gitignore +entry; tests/docs cover agy instead of gemini; GEMINI_HARNESS stays for builders. `AGY_HARNESS.buildResume` +undefined → fresh launch (resume deferred). Plan rewritten; committing and sitting at plan-approval. + +### 2026-05-31 — Plan phase + +Investigated the resume crash-loop bug. Root cause confirmed across two sites, one mechanism: + +- **Architect**: `tower-instances.ts:500` calls `findLatestSessionId(workspacePath)` (reads only `~/.claude/projects//*.jsonl`) without gating on the configured harness, then builds `[...cmdParts.slice(1), '--resume', resumeSessionId]`. A codex/gemini architect + any stale Claude jsonl ⇒ `codex --resume ` ⇒ shellper restart-loop to death. +- **Builder**: `spawn.ts:83 discoverResumeSession()` (called at 459 + 838) has the same Claude-only assumption; `spawn-worktree.ts:746` then bakes `${baseCmd} --resume "${id}"` into the launch script. `workspace-recover.ts` re-enters via `afx spawn --resume`, inheriting it. + +**Fix direction**: add an optional `discoverResumeSession?` capability to `HarnessProvider`. Only `CLAUDE_HARNESS` implements it (delegates to `findLatestSessionId`); codex/gemini/opencode leave it undefined ⇒ null ⇒ fresh launch via `buildArchitectArgs` / role-injection script path. Gate both call sites on the harness. The downstream `--resume` arg/script construction is reached only when claude returned a non-null id, so it stays claude-shaped safely. + +**Scoping decision**: MVP fixes the **config-driven** selection path (`.codev/config.json` `shell.architect`/`architectHarness`), which is what every acceptance criterion is phrased around. The `TOWER_ARCHITECT_CMD`/`--architect-cmd` env/CLI-override-without-matching-config edge stays a documented nice-to-have (issue lists it as such). + +**Empirical validation** (launch on stale-jsonl cwd, add-architect, `afx send` multiline/interrupt/streaming, reconnect, affinity, dashboard scrollback) needs codex/gemini actually installed — that happens at the `dev-approval` gate when the human runs the worktree. That's precisely why PIR was chosen. + +Architect PR-target note: no push access to cluesmith/codev — push branch to fork `mohidmakhdoomi/codev`, open PR with `gh pr create --repo mohidmakhdoomi/codev`. + +Plan written, awaiting `plan-approval`. + +### 2026-05-31 — Plan revision (architect feedback, 5 verified issues) + +Revised after a 2-way consult + architect verification: +1+2. **Seam redesign**: collapsed the two paired methods (`discoverResumeSession` + `buildResumeInvocation!`) into a single bundled `buildResume?(cwd, opts) → { sessionId, args, scriptFragment } | null`, mirroring `buildRoleInjection`/`buildScriptRoleInjection`. Node-argv `args` for the architect `spawn()` site; **shell-escaped** `scriptFragment` for the builder bash generator. Kills the `!` non-null assertion and the raw-argv `.join(' ')` word-split/quoting bug. +3. **Test layering fix**: regression guards moved to where the bugs live — `tower-instances.test.ts` (architect launch, no `--resume` for codex/gemini + stale jsonl) and `spawn-worktree.test.ts` (builder script uses escaped fragment). `af-architect.test.ts` (no-Tower command only) noted as NOT guarding the real regression. +4. **Gemini context promoted to MVP**: add `getArchitectFiles?` hook; `GEMINI_HARNESS` writes `.gemini/settings.json` `context.fileName → AGENTS.md` (write-if-missing). Codex already reads AGENTS.md natively; gemini shipped context-blind otherwise. +5. **Builder override caveat documented**: `--builder-cmd`/env vs config-only `getBuilderHarness` is the exact analog of the architect `TOWER_ARCHITECT_CMD` trap — added to Risks alongside it. + +Re-signalled, awaiting architect re-review. + +### 2026-05-31 — Implement phase + +Plan approved. Implemented all six tasks: + +1. **harness.ts**: added `buildResume?` (bundled discovery + Node-argv + escaped script fragment) and `getArchitectFiles?` to `HarnessProvider`. `CLAUDE_HARNESS.buildResume` delegates to `findLatestSessionId`; `GEMINI_HARNESS.getArchitectFiles` writes `.gemini/settings.json` → AGENTS.md. codex/gemini/opencode leave `buildResume` undefined. +2. **tower-instances.ts**: architect resume now gated on `getArchitectHarness(...).buildResume?.()` (preserving the `safeToResume` sibling guard); writes `getArchitectFiles?()` if-missing on launch. +3. **spawn.ts / spawn-worktree.ts**: `discoverResumeSession` takes the builder harness and returns the bundled object; `startBuilderSession`'s `resumeSessionId?: string` → `resume?: { sessionId, scriptFragment }`, script emits `${baseCmd} ${resume.scriptFragment}`. +4. **doctor.ts / arch.md**: codex/gemini architect affirmation + "resume is Claude-main-only" docs. +5. **Tests**: `discover-resume-session.test.ts` (harness arg + codex/gemini null-return guards), `tower-instances.test.ts` (architect resume-skip for codex/gemini + stale jsonl; claude still resumes; gemini `.gemini/settings.json` write-if-missing + no-clobber), `spawn-worktree.test.ts` (resume script uses escaped fragment, no prompt/role injection; fresh path has no `--resume`). + +**Test-infra notes**: worktree had no `node_modules` — ran `pnpm install` + built `@cluesmith/codev-core` (vitest loads source which imports the core package's subpath exports). Two mock gaps surfaced and were fixed in the test files: `tower-instances.test.ts` db mock needed `getDb`/`closeDb` (state.getArchitects uses `getDb()`, not the already-mocked `getGlobalDb` — without it the `safeToResume` guard threw and silently skipped the claude resume); `spawn-worktree.test.ts` needed a `../lib/tower-client.js` mock (getTowerClient stub) so `startBuilderSession`→`createPtySession` runs without a live Tower. + +The three affected files: 143 passed. Full suite running. Build green. + +### 2026-06-16 — Review phase + +dev-approval gate approved (human validated the running worktree). Re-confirmed build green + 146 tests pass across the 3 affected files. Wrote `codev/reviews/929-support-codex-and-gemini-clis-.md` (Summary / Files / Architecture Updates / Lessons Learned / Things to Look At). arch.md was already updated in the implement commit; routed one cold-tier lesson to `lessons-learned.md` (Architecture): audit **all** invocation seams — including resume/restart — when extending a provider abstraction; the resume branch is a separate call site Spec 591 left harness-blind. + +PR target: branch lives on fork `mohidmakhdoomi/codev` (origin push → fork; `builder/pir-929` confirmed on fork, absent on cluesmith). Opening PR within the fork (base `main`). Recording with porch, then `porch done` runs the single-pass CMAP-2 verify. + +CMAP-2 verdicts (iter1): gemini=APPROVE, codex=APPROVE, no issues. (`pr_exists` check first failed because `gh` had no default repo → defaulted to cluesmith where the branch doesn't live; fixed with `gh repo set-default mohidmakhdoomi/codev`.) Gate fired pending; architect notified. + +### 2026-06-16 — Architect integration review: REQUEST CHANGES (2 blockers + 3 nits) + +Architect reviewed the running integration and returned a punch list. All addressed: + +- **BLOCKER B (override re-arms crash-loop)**: `getArchitectHarness`/`getBuilderHarness` auto-detected the harness from the *raw config* command, ignoring `TOWER_ARCHITECT_CMD` / `--architect-cmd` / `--builder-cmd`. So `TOWER_ARCHITECT_CMD=gemini` launched gemini but resolved claude → buildResume injects `--resume ` → the exact crash-loop. Fix: both resolvers now source the command from `getResolvedCommands` (override-aware), and `getResolvedCommands.architect` now includes `TOWER_ARCHITECT_CMD`. Regression tests in `config.test.ts`. +- **BLOCKER A (gemini context path-dependent)**: `getArchitectFiles` was written inline only in `launchInstance`, so sibling/no-Tower/reconnect gemini launches were context-blind. Fix: centralized into `buildArchitectArgs` (the shared helper all 4 Tower paths route through) as exported `writeArchitectContextFiles`; refactored the no-Tower `architect.ts` to call `buildArchitectArgs` instead of duplicating injection. Now every launch path writes it. Regression tests in `tower-utils.test.ts`. +- **NIT 1**: `safeToResume` WARN now gated on `architectHarness.buildResume` (no noise for codex/gemini, which never resume). +- **NIT 2**: `discoverResumeSession` logs "This harness does not support conversation resume" when `buildResume` is undefined, distinct from "No prior conversation found". +- **NIT 3**: `.gemini/settings.json` added to both root `.gitignore` and the managed adopter `CODEV_GITIGNORE_ENTRIES` (gitignore.ts); gitignore/update test expectations updated. + +Full suite: 3338 passed, 48 skipped, 0 failures. Build green. Re-running `porch done` for a fresh single CMAP-2 pass, then re-requesting the pr gate. + +Porch quirk: re-running `porch done` landed straight back at `gate_pending` (iter1) without re-issuing consult tasks — because the architect's REQUEST_CHANGES arrived via `afx send`, not a porch gate rejection, so the iteration never advanced. Ran a fresh advisory 2-way consult manually on the updated diff (iter2 outputs) since porch wouldn't re-trigger it. + +iter2 verdicts: gemini=APPROVE (HIGH), codex=REQUEST_CHANGES (MEDIUM, false positive: "plan lacks approval frontmatter"). Rebutted — the frontmatter convention is for architect-pre-approved artifacts; this plan was builder-authored and approved via porch's `plan-approval` gate (status.yaml), and no merged plan in codev/plans (0001–0009) carries that frontmatter. Documented the rebuttal in the review's "Consultation Findings & Dispositions" section; escalating to the human at the pr gate. Gate remains pending. + +### 2026-06-16 — Architect re-review: APPROVE + PR retargeted upstream + +Architect verified all 5 integration-review items at source (affected tests 193/193 green) and confirmed codex's frontmatter finding is a false positive (rebuttal sound). **APPROVE.** + +PR retargeting: this is an **upstream contribution to cluesmith/codev**. Architect opened the correct cross-fork PR → **cluesmith/codev#1059** (head `mohidmakhdoomi:builder/pir-929` → base `cluesmith/codev:main`). Fork PR #1 CLOSED as superseded (wrong base = fork main). + +**MERGE POLICY: pull-only on cluesmith — do NOT self-merge. A cluesmith maintainer merges #1059.** The porch pr gate is the human's to approve as bookkeeping only (no `gh pr merge` from me). Bookkeeping done: reset gh default repo → cluesmith, recorded PR #1059 in porch pr_history. Nothing further to push unless review feedback lands on #1059. Waiting at the pr gate; on gate approval I will NOT merge — I'll record/close out per the no-self-merge policy. + +### 2026-06-17 — 3-way re-review: APPROVE-with-caveat (doc-only addendum) + +Architect ran a 3-way re-review: gemini APPROVE, claude APPROVE, codex REQUEST_CHANGES (2/3). Verdict: **APPROVE with one documented caveat**. + +Codex's lone (verified) finding: `resolveHarness` (harness.ts) defaults an *unrecognized* override command (e.g. `TOWER_ARCHITECT_CMD=bash`, a wrapper script) to `CLAUDE_HARNESS` when no explicit `shell.architectHarness`/`builderHarness` is set → can still build ` --resume ` against a stale claude jsonl. **Pre-existing, narrow, separable — not a #929 regression** (#929 strictly improved the recognized codex/gemini cases). Disposition: document + follow-up (cluesmith/codev#1062). + +Doc-only addendum (no code change, per architect instruction): added the caveat to arch.md's #929 subsection and the review's caveat section, noting the unrecognized-command default + mitigation (set explicit harness) + the #1062 follow-up. Commit + push flows into PR #1059; no gate re-request (doc addendum to an already-approved PR). Still no self-merge — maintainer merges #1059. + +### 2026-06-16/17 — pr gate APPROVED (human-instructed); protocol complete; NOT self-merged + +Human instructed `git pull` + `porch approve 929 pr --a-human-explicitly-approved-this`. `git pull` failed to fast-forward the local branch (origin fetch=cluesmith lacks builder/pir-929, which lives on the fork) but did fetch updates. `porch approve` passed all checks and committed the gate-approved state locally, but its `git push -u origin HEAD` was **rejected**: the architect had merged `main` into the fork's builder/pir-929 (commit 0fd70b72) to keep #1059 current, so the fork was ahead. Resolved by rebasing the single unpushed porch chore commit onto `fork/builder/pir-929` (only touches 929 status.yaml → no conflict) and pushing — fork now at `f337b61f`. Gate status: **approved**; `porch next 929` → status: complete, phase: verified. Per the standing pull-only instruction I did **NOT** run `gh pr merge` / pr-merge.sh — a cluesmith maintainer merges #1059. Architect notified. Project complete pending the upstream merge + architect-driven cleanup. + +### 2026-06-19 — RE-RESCOPE: CODEX-ONLY (agy dropped → #1063); plan revised at plan-approval gate + +Resumed session. Architect re-scoped 929 to **codex-only**: agy is dropped entirely (split to +follow-up #1063 — agy's only role-injection channel is its first user turn `--prompt-interactive`, +weaker/visible vs claude's `--append-system-prompt` / codex's `-c model_instructions_file=`). The plan +file on disk was the STALE agy version (commit 3082c186 "agy replaces gemini"); rewrote it codex-only. + +Verified against `gh issue view 929` (codex-only banner) + architect brief. Confirmed actual branch +state via grep before planning: +- `getArchitectFiles` seam: only implementer = `GEMINI_HARNESS` (harness.ts:135); only consumer = + `writeArchitectContextFiles` (tower-utils.ts:181), itself called only from `buildArchitectArgs` + (:205). ⇒ removing gemini-architect makes the whole seam dead ⇒ delete it (interface method + + gemini impl + writeArchitectContextFiles + its call). `buildArchitectArgs` keeps `getArchitectHarness` + (still used by `buildRoleInjection` at :213). +- `.gemini/settings.json` gitignore: root `.gitignore:11` + `lib/gitignore.ts:24` + test expectations + in gitignore.test.ts (140,211) & update.test.ts (530,550). No codev-skeleton mirror. +- doctor.ts:699 affirms `codex || gemini` → split: affirm codex, bar gemini as architect (builder-only). + +**KEEP intact** (engine-neutral, the core deliverable): `buildResume` crash-loop seam + codex architect +parity. **REVERT**: #1059's gemini-architect additions. **KEEP**: gemini *builder* surface +(`GEMINI_SYSTEM_MD`) + gemini builder tests. + +Plan rewritten, committing + pushing to `builder/pir-929` (PR #1059 auto-updates). Parked at +plan-approval — NOT implementing until architect approves. + +### 2026-06-19 — Implement phase (codex-only delta) + +Implemented the subtractive gemini-architect removal. Three commits: +1. `a6b42daa` — removed the `getArchitectFiles` HarnessProvider method (gemini = only impl) + + `writeArchitectContextFiles` (only consumer) + their `buildArchitectArgs` wiring; dropped the now- + unused `HarnessProvider` import in tower-utils.ts; tidied codex/gemini→codex comments at the + architect resume site. buildResume seam + codex parity untouched. +2. `0d920884` — doctor: split codex||gemini affirmation → affirm codex / warn gemini-builder-only-not- + architect (#778). Removed `.gemini/settings.json` from root `.gitignore` + `CODEV_GITIGNORE_ENTRIES`. +3. `be84f75a` — tests + arch.md: removed writeArchitectContextFiles + gemini settings.json architect + tests; retargeted the `TOWER_ARCHITECT_CMD` env-override resolution test to codex (kept the + override-aware-resolution coverage, dropped the dead getArchitectFiles assertion); dropped gemini + resume-skip architect test (codex case already guards the regression); gitignore-backfill + expectations now `['.architect-role.md']`. arch.md #929 subsection → claude+codex architects, + gemini builder-only, seam removed. CLAUDE.md/AGENTS.md untouched (no gemini-architect claim there). + +Grep-clean: 0 refs to `getArchitectFiles` / `writeArchitectContextFiles` / `.gemini/settings.json` +anywhere. Diff is purely subtractive (38 ins / 166 del). Build green. Full suite: 3332 passed, 48 +skipped, 0 failures. Pushing to fork; signalling dev-approval. + +### 2026-06-19 — Review phase: codex-only review file + full 3-way CMAP + +Refreshed `codev/reviews/929-*.md` to the net codex-only delivered state (PR body), pushed, updated +PR #1059 title/body (cluesmith/codev#1059), recorded PR in porch pr_history. + +Per architect instruction, running a FULL 3-way CMAP (not the protocol-default 2-way) over the whole +PR — whole-file review, not diff hunks. Used `--prompt-file /tmp/cmap-929-prompt.md` (since `--type pr` +is diff-centric and `--prompt`/`--type` are mutually exclusive) instructing each reviewer to read every +changed file in full + callers/deps and assess the full PR as it stands. All three (gemini/codex/claude) +launched in parallel + background → /tmp/cmap-929-{gemini,codex,claude}.md. Advisory single pass. +Awaiting verdicts; will report all three to architect leading with any REQUEST_CHANGES + disposition. +Will NOT self-merge — stop at pr gate. + +### 2026-06-19 — Full 3-way CMAP verdicts (advisory, single pass) + +- **codex: APPROVE** — no blockers; crash-loop fix correct, resume gating centralized behind buildResume. + 3 non-blocking coverage nits. +- **claude: APPROVE** — no blockers; whole-file trace confirms codex/gemini can never reach + `--resume ` at either site, seam removal clean (0 dangling refs), regression pinned 4 ways. + 4 non-blocking obs. +- **gemini (agy): no usable verdict** — returned a greeting in 8.1s, didn't review (agy structural + limitation, the #1063 reason). Recorded as unavailable. + +Net: 2/2 substantive reviewers APPROVE, zero REQUEST_CHANGES, zero blocking defects → no code change. +Consensus non-blocking nit (doctor architect-shell branch + codex no-Tower path lack direct tests): +deferred — trivial branch logic verified-correct by both reviewers, and there's no existing test harness +for the doctor shell.architect block (even the long-standing opencode branch is untested), so it's +net-new infra beyond this subtractive PR. Recorded dispositions in review's Consultation section, +committing + pushing. Then advancing porch to the pr gate; NOT self-merging. + +### 2026-06-19 — pr gate PENDING (parked, not self-merging) + +`porch done 929` → structural checks green (pr_exists, review_has_arch_updates, review_has_lessons_updates); +`porch next 929` → gate_pending on **pr**. Porch's recorded verify verdicts: gemini=APPROVE, codex=APPROVE +(porch's own block / cached). My authoritative LIVE manual 3-way (reported to architect): codex=APPROVE, +claude=APPROVE, gemini(agy)=unavailable. Zero REQUEST_CHANGES, zero blocking defects either way. + +PR body synced to PR #1059. Parked at the pr gate. Per the standing upstream policy (cluesmith/codev#1059 +is a cross-fork contribution: head mohidmakhdoomi:builder/pir-929 → base cluesmith:main) and this session's +architect instruction ("the pr gate is mine/the maintainer's; do NOT self-merge"), I will NOT run +`gh pr merge`. Waiting for human pr-gate approval + maintainer merge. Architect notified with all verdicts. + +### 2026-06-19 — pr gate APPROVED → protocol COMPLETE (not self-merged) + +Human approved the pr gate (bookkeeping) and posted the architect integration review on #1059 → APPROVE +for merge. The gate approval advanced porch to `status: complete, phase: verified`. Ran `porch done 929` +per architect instruction → "Project 929 already verified — nothing to do." NO `--merged` recorded and +NO `gh pr merge` run: #1059 is a cross-fork PR (mohidmakhdoomi:builder/pir-929 → cluesmith:main); a +cluesmith maintainer merges it. Protocol complete. Awaiting upstream merge + architect-driven cleanup. diff --git a/packages/codev/src/agent-farm/__tests__/af-architect.test.ts b/packages/codev/src/agent-farm/__tests__/af-architect.test.ts index 8941b15cc..e55f52320 100644 --- a/packages/codev/src/agent-farm/__tests__/af-architect.test.ts +++ b/packages/codev/src/agent-farm/__tests__/af-architect.test.ts @@ -14,10 +14,12 @@ vi.mock('node:child_process', () => ({ spawn: (...args: unknown[]) => mockSpawn(...args), })); -// Mock fs.writeFileSync -vi.mock('node:fs', () => ({ - writeFileSync: vi.fn(), -})); +// Mock fs — buildArchitectArgs (tower-utils) uses a default `import fs from 'node:fs'`, +// architect.ts historically used the named export, so provide both. +vi.mock('node:fs', () => { + const fns = { writeFileSync: vi.fn(), existsSync: vi.fn(() => false), mkdirSync: vi.fn() }; + return { ...fns, default: fns }; +}); // Mock config — include getArchitectHarness vi.mock('../utils/index.js', () => ({ @@ -34,6 +36,13 @@ vi.mock('../utils/index.js', () => ({ getArchitectHarness: () => CLAUDE_HARNESS, })); +// architect() now delegates role injection to buildArchitectArgs (tower-utils), +// which resolves the harness via config.js directly (not the index.js barrel +// mocked above). Mock that seam too so the unit test stays filesystem-free. +vi.mock('../utils/config.js', () => ({ + getArchitectHarness: () => CLAUDE_HARNESS, +})); + // Mock role loading vi.mock('../utils/roles.js', () => ({ loadRolePrompt: vi.fn(() => ({ diff --git a/packages/codev/src/agent-farm/__tests__/config.test.ts b/packages/codev/src/agent-farm/__tests__/config.test.ts index 98e471b8e..295091e9a 100644 --- a/packages/codev/src/agent-farm/__tests__/config.test.ts +++ b/packages/codev/src/agent-farm/__tests__/config.test.ts @@ -3,7 +3,13 @@ */ import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; -import { getConfig, ensureDirectories } from '../utils/config.js'; +import { + getConfig, + ensureDirectories, + getArchitectHarness, + getBuilderHarness, + setCliOverrides, +} from '../utils/config.js'; import { existsSync } from 'node:fs'; import { rm, mkdir } from 'node:fs/promises'; import { resolve } from 'node:path'; @@ -89,3 +95,48 @@ describe('ensureDirectories', () => { await expect(ensureDirectories(testConfig)).resolves.not.toThrow(); }); }); + +// Issue #929 — harness resolution must be override-aware. The mocked config +// above resolves both shells to `claude` with NO explicit *Harness, so the +// harness is auto-detected from the resolved command. A command override +// (TOWER_ARCHITECT_CMD / --architect-cmd / --builder-cmd) without a matching +// harness config previously still resolved the CLAUDE harness, whose buildResume +// would inject `--resume ` into the non-claude command and +// crash-loop. `buildResume` being undefined is the precise property that makes +// codex/gemini relaunch fresh — so it's the regression assertion. +describe('getArchitectHarness / getBuilderHarness override-awareness (#929)', () => { + const savedArchitectCmd = process.env.TOWER_ARCHITECT_CMD; + + afterEach(() => { + setCliOverrides({}); + if (savedArchitectCmd === undefined) { + delete process.env.TOWER_ARCHITECT_CMD; + } else { + process.env.TOWER_ARCHITECT_CMD = savedArchitectCmd; + } + }); + + it('resolves the claude harness (buildResume defined) with no overrides', () => { + delete process.env.TOWER_ARCHITECT_CMD; + setCliOverrides({}); + expect(getArchitectHarness().buildResume).toBeDefined(); + expect(getBuilderHarness().buildResume).toBeDefined(); + }); + + it('TOWER_ARCHITECT_CMD=codex → codex architect harness (no claude resume)', () => { + process.env.TOWER_ARCHITECT_CMD = 'codex'; + const harness = getArchitectHarness(); + expect(harness.buildResume).toBeUndefined(); + }); + + it('--architect-cmd codex → codex architect harness (no claude resume)', () => { + delete process.env.TOWER_ARCHITECT_CMD; + setCliOverrides({ architect: 'codex' }); + expect(getArchitectHarness().buildResume).toBeUndefined(); + }); + + it('--builder-cmd gemini → gemini builder harness (no claude resume)', () => { + setCliOverrides({ builder: 'gemini' }); + expect(getBuilderHarness().buildResume).toBeUndefined(); + }); +}); diff --git a/packages/codev/src/agent-farm/__tests__/discover-resume-session.test.ts b/packages/codev/src/agent-farm/__tests__/discover-resume-session.test.ts index c7c57bcaa..e44c47e72 100644 --- a/packages/codev/src/agent-farm/__tests__/discover-resume-session.test.ts +++ b/packages/codev/src/agent-farm/__tests__/discover-resume-session.test.ts @@ -1,7 +1,12 @@ /** * Tests for the discoverResumeSession helper — the spawn-CLI wrapper that - * gates findLatestSessionId on the --resume flag and surfaces a user-facing - * log line. (Issues #829 / #831.) + * gates the harness's buildResume on the --resume flag and surfaces a + * user-facing log line. (Issues #829 / #831 / #929.) + * + * Issue #929: resume is now gated on the builder harness, not the Claude + * session store directly. Only the Claude harness implements buildResume; + * codex/gemini return undefined even when a stale Claude jsonl exists (the + * regression guard against `codex --resume ` crash-loops). */ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; @@ -11,6 +16,7 @@ import { join } from 'node:path'; import { discoverResumeSession } from '../commands/spawn.js'; import { encodeClaudeProjectDir } from '../utils/claude-session-discovery.js'; +import { CLAUDE_HARNESS, CODEX_HARNESS, GEMINI_HARNESS } from '../utils/harness.js'; // discoverResumeSession reads from $HOME via os.homedir() through // findLatestSessionId. Override the env var for the duration of the test so @@ -54,7 +60,7 @@ describe('discoverResumeSession', () => { const worktree = '/Users/x/repo/.builders/spir-1'; writeSession(projectsRoot, worktree, 'should-not-pick', 1_700_000_000_000); pinHome(fakeHome, () => { - expect(discoverResumeSession(worktree, false)).toBeUndefined(); + expect(discoverResumeSession(worktree, false, CLAUDE_HARNESS)).toBeUndefined(); }); }); @@ -62,23 +68,46 @@ describe('discoverResumeSession', () => { const worktree = '/Users/x/repo/.builders/spir-2'; writeSession(projectsRoot, worktree, 'should-not-pick', 1_700_000_000_000); pinHome(fakeHome, () => { - expect(discoverResumeSession(worktree, undefined)).toBeUndefined(); + expect(discoverResumeSession(worktree, undefined, CLAUDE_HARNESS)).toBeUndefined(); }); }); it('returns undefined when isResume is true but no jsonl exists', () => { const worktree = '/Users/x/repo/.builders/spir-3-no-jsonl'; pinHome(fakeHome, () => { - expect(discoverResumeSession(worktree, true)).toBeUndefined(); + expect(discoverResumeSession(worktree, true, CLAUDE_HARNESS)).toBeUndefined(); }); }); - it('returns the newest jsonl UUID when isResume is true and jsonls exist', () => { + it('returns the newest jsonl resume object (claude) when isResume is true and jsonls exist', () => { const worktree = '/Users/x/repo/.builders/pir-1661'; writeSession(projectsRoot, worktree, 'older-uuid', 1_500_000_000_000); writeSession(projectsRoot, worktree, 'newest-uuid', 1_700_000_000_000); pinHome(fakeHome, () => { - expect(discoverResumeSession(worktree, true)).toBe('newest-uuid'); + const resume = discoverResumeSession(worktree, true, CLAUDE_HARNESS); + expect(resume).toEqual({ + sessionId: 'newest-uuid', + args: ['--resume', 'newest-uuid'], + scriptFragment: "--resume 'newest-uuid'", + }); + }); + }); + + it('returns undefined for codex even when a stale Claude jsonl exists (regression guard)', () => { + // The crash-loop bug: a codex builder must NOT pick up a Claude session id + // and build `codex --resume `. CODEX_HARNESS has no buildResume. + const worktree = '/Users/x/repo/.builders/pir-codex'; + writeSession(projectsRoot, worktree, 'stale-claude-uuid', 1_700_000_000_000); + pinHome(fakeHome, () => { + expect(discoverResumeSession(worktree, true, CODEX_HARNESS)).toBeUndefined(); + }); + }); + + it('returns undefined for gemini even when a stale Claude jsonl exists (regression guard)', () => { + const worktree = '/Users/x/repo/.builders/pir-gemini'; + writeSession(projectsRoot, worktree, 'stale-claude-uuid', 1_700_000_000_000); + pinHome(fakeHome, () => { + expect(discoverResumeSession(worktree, true, GEMINI_HARNESS)).toBeUndefined(); }); }); @@ -86,7 +115,7 @@ describe('discoverResumeSession', () => { // Negative case: isResume=false short-circuits before any filesystem // access happens. Tests pass even if HOME points at /nonexistent. pinHome('/nonexistent-home-path', () => { - expect(discoverResumeSession('/some/worktree', false)).toBeUndefined(); + expect(discoverResumeSession('/some/worktree', false, CLAUDE_HARNESS)).toBeUndefined(); }); }); }); diff --git a/packages/codev/src/agent-farm/__tests__/spawn-worktree.test.ts b/packages/codev/src/agent-farm/__tests__/spawn-worktree.test.ts index d042ddc59..8d3809dec 100644 --- a/packages/codev/src/agent-farm/__tests__/spawn-worktree.test.ts +++ b/packages/codev/src/agent-farm/__tests__/spawn-worktree.test.ts @@ -7,7 +7,7 @@ import { describe, it, expect, vi, beforeEach } from 'vitest'; import { - slugify, buildWorktreeLaunchScript, + slugify, buildWorktreeLaunchScript, startBuilderSession, checkDependencies, createWorktree, createWorktreeFromBranch, validateBranchName, validateRemoteName, detectForkRemote, symlinkConfigFiles, @@ -17,6 +17,9 @@ import { validateResumeWorktree, initPorchInWorktree, type GitHubIssue, } from '../commands/spawn-worktree.js'; import { DEFAULT_TOWER_PORT } from '../lib/tower-client.js'; +// node:fs is mocked below; import writeFileSync so resume-script assertions can +// read its captured calls without a per-test dynamic import. +import { writeFileSync } from 'node:fs'; // Mock dependencies vi.mock('node:fs', async (importOriginal) => { @@ -78,6 +81,18 @@ vi.mock('glob', () => ({ globSync: (...args: unknown[]) => globSyncMock(...args), })); +// Mock the Tower REST client so startBuilderSession can run without a live +// Tower (createPtySession → getTowerClient().createTerminal). DEFAULT_TOWER_PORT +// is preserved from the real module (asserted elsewhere in this file). +const createTerminalMock = vi.fn().mockResolvedValue({ id: 'term-test' }); +vi.mock('../lib/tower-client.js', async (importOriginal) => { + const actual = await importOriginal(); + return { + ...actual, + getTowerClient: () => ({ createTerminal: createTerminalMock }), + }; +}); + describe('spawn-worktree', () => { beforeEach(() => { vi.clearAllMocks(); @@ -294,6 +309,64 @@ describe('spawn-worktree', () => { }); }); + // ========================================================================= + // startBuilderSession — resume script shape (Issue #929) + // + // The crash-loop lived in the builder restart-loop script: a resumed builder + // baked ` --resume ""` into .builder-start.sh. The fix threads a + // harness-provided, pre-escaped `scriptFragment` through instead of + // re-deriving the flag (which risked a comma-joined / word-split argv), and + // codex/gemini builders never reach this branch (resume === undefined → fresh + // role-injected script, no --resume). + // ========================================================================= + + describe('startBuilderSession resume script (Issue #929)', () => { + function findScript(): string | undefined { + const writeCalls = vi.mocked(writeFileSync).mock.calls; + const scriptCall = writeCalls.find( + call => typeof call[0] === 'string' && call[0].endsWith('.builder-start.sh'), + ); + return scriptCall ? (scriptCall[1] as string) : undefined; + } + + it('resume → script uses the escaped scriptFragment, no prompt/role injection', async () => { + const resume = { sessionId: 'abc-1234-uuid', scriptFragment: "--resume 'abc-1234-uuid'" }; + await startBuilderSession( + { workspaceRoot: '/tmp/ws' } as any, + 'pir-1', '/tmp/worktree', 'claude', + 'PROMPT', 'ROLE', 'codev', resume, + ); + + const script = findScript(); + expect(script).toBeDefined(); + // Exact escaped fragment appended verbatim — not an unquoted or + // comma-joined argv form. + expect(script).toContain("claude --resume 'abc-1234-uuid'"); + expect(script).not.toContain('--resume,'); + expect(script).toContain('while true'); + // Resume path skips role injection and the prompt file. + expect(script).not.toContain('--append-system-prompt'); + const writeCalls = vi.mocked(writeFileSync).mock.calls; + const promptCall = writeCalls.find( + call => typeof call[0] === 'string' && call[0].endsWith('.builder-prompt.txt'), + ); + expect(promptCall).toBeUndefined(); + }); + + it('no resume + role → fresh role-injected script, no --resume', async () => { + await startBuilderSession( + { workspaceRoot: '/tmp/ws' } as any, + 'pir-2', '/tmp/worktree', 'claude', + 'PROMPT', 'ROLE {PORT}', 'codev', + ); + + const script = findScript(); + expect(script).toBeDefined(); + expect(script).not.toContain('--resume'); + expect(script).toContain('--append-system-prompt'); + }); + }); + // ========================================================================= // Collision Detection (unit-level) // ========================================================================= diff --git a/packages/codev/src/agent-farm/__tests__/tower-instances.test.ts b/packages/codev/src/agent-farm/__tests__/tower-instances.test.ts index 50b852a89..597a23a13 100644 --- a/packages/codev/src/agent-farm/__tests__/tower-instances.test.ts +++ b/packages/codev/src/agent-farm/__tests__/tower-instances.test.ts @@ -24,6 +24,7 @@ import { waitForTerminalExit, type InstanceDeps, } from '../servers/tower-instances.js'; +import { encodeClaudeProjectDir } from '../utils/claude-session-discovery.js'; // --------------------------------------------------------------------------- // Mocks (vi.hoisted ensures these exist before vi.mock factories run) @@ -48,6 +49,11 @@ const { vi.mock('../db/index.js', () => ({ getGlobalDb: () => ({ prepare: mockDbPrepare }), + // state.getArchitects() (used by launchInstance's safeToResume guard) calls + // getDb(), not getGlobalDb — mock it too so the guard doesn't throw and + // default to skip-resume. (Issue #929 resume regression tests depend on it.) + getDb: () => ({ prepare: mockDbPrepare }), + closeDb: () => {}, })); vi.mock('../servers/tower-utils.js', async () => { @@ -593,6 +599,102 @@ describe('tower-instances', () => { }); }); + // ========================================================================= + // Issue #929 — architect resume is gated on the configured harness + // + // Regression guard for the crash-loop: a codex/gemini architect with a stale + // Claude jsonl in ~/.claude/projects// must NOT launch + // ` --resume `. Only the Claude harness implements + // buildResume, so codex/gemini relaunch fresh (role-injected) instead. + // ========================================================================= + + describe('Issue #929 — architect resume gated on harness', () => { + function writeStaleClaudeSession(fakeHome: string, cwd: string, uuid: string): void { + const dir = path.join(fakeHome, '.claude', 'projects', encodeClaudeProjectDir(cwd)); + fs.mkdirSync(dir, { recursive: true }); + fs.writeFileSync(path.join(dir, `${uuid}.jsonl`), `{"sessionId":"${uuid}"}\n`); + } + + function makeCapturingDeps(): { deps: InstanceDeps; createSession: ReturnType } { + const createSession = vi.fn().mockResolvedValue({ id: 'test-session', pid: 1234 }); + const deps = makeDeps({ + getTerminalManager: vi.fn().mockReturnValue({ + getSession: vi.fn(), + killSession: vi.fn(), + createSession, + createSessionRaw: vi.fn(), + listSessions: vi.fn().mockReturnValue([]), + }) as any, + }); + return { deps, createSession }; + } + + // Pin HOME so the stale-jsonl lookup (os.homedir() inside + // findLatestSessionId) reads our fake store, not the real user's. + function withSetup( + configJson: Record | null, + fn: (tmpDir: string, fakeHome: string, uuid: string) => Promise, + ): () => Promise { + return async () => { + const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tower-929-')); + const fakeHome = fs.mkdtempSync(path.join(os.tmpdir(), 'tower-929-home-')); + const originalHome = process.env.HOME; + const uuid = 'stale-claude-uuid-1234'; + try { + fs.mkdirSync(path.join(tmpDir, 'codev')); + if (configJson) { + fs.mkdirSync(path.join(tmpDir, '.codev'), { recursive: true }); + fs.writeFileSync( + path.join(tmpDir, '.codev', 'config.json'), + JSON.stringify(configJson), + ); + } + writeStaleClaudeSession(fakeHome, tmpDir, uuid); + process.env.HOME = fakeHome; + await fn(tmpDir, fakeHome, uuid); + } finally { + if (originalHome === undefined) delete process.env.HOME; + else process.env.HOME = originalHome; + fs.rmSync(tmpDir, { recursive: true, force: true }); + fs.rmSync(fakeHome, { recursive: true, force: true }); + } + }; + } + + it('codex architect + stale Claude jsonl → launches fresh, no --resume', withSetup( + { shell: { architect: 'codex' } }, + async (tmpDir, _fakeHome, uuid) => { + const { deps, createSession } = makeCapturingDeps(); + initInstances(deps); + + const result = await launchInstance(tmpDir); + expect(result.success).toBe(true); + expect(createSession).toHaveBeenCalled(); + + const callStr = JSON.stringify(createSession.mock.calls[0]); + expect(callStr).not.toContain('--resume'); + expect(callStr).not.toContain(uuid); + }, + )); + + it('claude architect (default) + stale Claude jsonl → resumes with --resume ', withSetup( + null, + async (tmpDir, _fakeHome, uuid) => { + const { deps, createSession } = makeCapturingDeps(); + initInstances(deps); + + const result = await launchInstance(tmpDir); + expect(result.success).toBe(true); + expect(createSession).toHaveBeenCalled(); + + const callStr = JSON.stringify(createSession.mock.calls[0]); + expect(callStr).toContain('--resume'); + expect(callStr).toContain(uuid); + }, + )); + + }); + // ========================================================================= // killTerminalWithShellper // ========================================================================= diff --git a/packages/codev/src/agent-farm/__tests__/tower-utils.test.ts b/packages/codev/src/agent-farm/__tests__/tower-utils.test.ts index d78480000..f8e3de77f 100644 --- a/packages/codev/src/agent-farm/__tests__/tower-utils.test.ts +++ b/packages/codev/src/agent-farm/__tests__/tower-utils.test.ts @@ -197,4 +197,5 @@ describe('tower-utils', () => { } }); }); + }); diff --git a/packages/codev/src/agent-farm/commands/architect.ts b/packages/codev/src/agent-farm/commands/architect.ts index 419d11db1..c0811b364 100644 --- a/packages/codev/src/agent-farm/commands/architect.ts +++ b/packages/codev/src/agent-farm/commands/architect.ts @@ -7,10 +7,8 @@ */ import { spawn } from 'node:child_process'; -import { writeFileSync } from 'node:fs'; -import { resolve } from 'node:path'; -import { getConfig, getResolvedCommands, getArchitectHarness } from '../utils/index.js'; -import { loadRolePrompt } from '../utils/roles.js'; +import { getConfig, getResolvedCommands } from '../utils/index.js'; +import { buildArchitectArgs } from '../servers/tower-utils.js'; export interface ArchitectOptions { args?: string[]; @@ -23,25 +21,14 @@ export async function architect(options: ArchitectOptions = {}): Promise { const config = getConfig(); const commands = getResolvedCommands(); - const args = [...(options.args || [])]; - let env: Record = {}; - - // Load and inject the architect role prompt via harness - const role = loadRolePrompt(config, 'architect'); - if (role) { - const roleFilePath = resolve(config.workspaceRoot, '.architect-role.md'); - writeFileSync(roleFilePath, role.content); - - const harness = getArchitectHarness(config.workspaceRoot); - const injection = harness.buildRoleInjection(role.content, roleFilePath); - args.push(...injection.args); - env = injection.env; - } - // Split command string into executable + initial args (supports e.g. "claude --dangerously-skip-permissions") const cmdParts = commands.architect.split(/\s+/); const cmd = cmdParts[0]; - const allArgs = [...cmdParts.slice(1), ...args]; + + // Inject the architect role via the shared launch helper, so the no-Tower + // path matches every Tower launch path (Issue #929). + const baseArgs = [...cmdParts.slice(1), ...(options.args || [])]; + const { args: allArgs, env } = buildArchitectArgs(baseArgs, config.workspaceRoot); return new Promise((resolve, reject) => { const child = spawn(cmd, allArgs, { diff --git a/packages/codev/src/agent-farm/commands/spawn-worktree.ts b/packages/codev/src/agent-farm/commands/spawn-worktree.ts index a0eb5e91f..afe3a24c1 100644 --- a/packages/codev/src/agent-farm/commands/spawn-worktree.ts +++ b/packages/codev/src/agent-farm/commands/spawn-worktree.ts @@ -715,11 +715,13 @@ function writeWorktreeFiles( /** * Start a terminal session for a builder. * - * When `resumeSessionId` is provided, the launch script invokes - * `claude --resume ` instead of a fresh prompt+role invocation. The - * saved Claude conversation contains the system prompt / role context - * already, so role injection and the initial prompt are intentionally - * skipped on that path. + * When `resume` is provided, the launch script invokes the harness's resume + * form (e.g. `claude --resume `) via the pre-escaped `scriptFragment` + * instead of a fresh prompt+role invocation. The saved conversation contains + * the system prompt / role context already, so role injection and the initial + * prompt are intentionally skipped on that path. Only the Claude harness + * produces a resume object (Issue #929); codex/gemini pass `undefined` here + * and take the fresh role-injection path. */ export async function startBuilderSession( config: Config, @@ -729,21 +731,22 @@ export async function startBuilderSession( prompt: string, roleContent: string | null, roleSource: string | null, - resumeSessionId?: string, + resume?: { sessionId: string; scriptFragment: string }, ): Promise<{ terminalId: string }> { logger.info('Creating terminal session...'); const scriptPath = resolve(worktreePath, '.builder-start.sh'); let scriptContent: string; - if (resumeSessionId) { - // Resume path: load the prior Claude conversation by UUID. No prompt file, - // no role injection — both are already part of the saved conversation. - logger.info(`Resuming Claude session ${resumeSessionId.slice(0, 8)}…`); + if (resume) { + // Resume path: load the prior conversation via the harness-provided, + // shell-escaped resume fragment. No prompt file, no role injection — both + // are already part of the saved conversation. + logger.info(`Resuming session ${resume.sessionId.slice(0, 8)}…`); scriptContent = `#!/bin/bash cd "${worktreePath}" while true; do - ${baseCmd} --resume "${resumeSessionId}" + ${baseCmd} ${resume.scriptFragment} echo "" echo "Agent exited. Restarting in 2 seconds... (Ctrl+C to quit)" sleep 2 diff --git a/packages/codev/src/agent-farm/commands/spawn.ts b/packages/codev/src/agent-farm/commands/spawn.ts index 3f1f86957..217398f43 100644 --- a/packages/codev/src/agent-farm/commands/spawn.ts +++ b/packages/codev/src/agent-farm/commands/spawn.ts @@ -17,7 +17,8 @@ import { resolve, basename } from 'node:path'; import { existsSync, writeFileSync, readdirSync } from 'node:fs'; import type { SpawnOptions, BuilderType, Config } from '../types.js'; -import { getConfig, ensureDirectories, getResolvedCommands } from '../utils/index.js'; +import { getConfig, ensureDirectories, getResolvedCommands, getBuilderHarness } from '../utils/index.js'; +import type { HarnessProvider } from '../utils/harness.js'; import { logger, fatal } from '../utils/logger.js'; import { run } from '../utils/shell.js'; import { hasUncommittedTrackedChanges } from '../utils/git.js'; @@ -37,7 +38,6 @@ const SPAWNING_ARCHITECT_NAME = (process.env.CODEV_ARCHITECT_NAME && process.env.CODEV_ARCHITECT_NAME.trim()) || DEFAULT_ARCHITECT_NAME; import { loadRolePrompt } from '../utils/roles.js'; import { buildAgentName, stripLeadingZeros } from '../utils/agent-names.js'; -import { findLatestSessionId } from '../utils/claude-session-discovery.js'; import { fetchIssue as fetchIssueNonFatal } from '../../lib/github.js'; import { type TemplateContext, @@ -74,20 +74,32 @@ import { executeForgeCommand, loadForgeConfig } from '../../lib/forge.js'; // ============================================================================= /** - * On --resume, look up the prior Claude conversation jsonl for the worktree - * so the revived builder can pick up the saved conversation via - * `claude --resume ` instead of starting fresh with a resume-notice - * prompt. (Issue #831.) Returns undefined when not resuming or when no - * jsonl exists; callers fall back to the fresh-spawn path in that case. + * On --resume, ask the builder's harness for a resumable prior session for the + * worktree so the revived builder can pick up the saved conversation (e.g. + * `claude --resume `) instead of starting fresh with a resume-notice + * prompt. (Issues #831 / #929.) Returns undefined when not resuming, when the + * harness has no resumable session, or when the harness doesn't support resume + * (codex/gemini → buildResume undefined); callers fall back to the fresh-spawn + * path in that case. Returning the harness's bundled resume object (with the + * shell-escaped `scriptFragment`) keeps the resume flag-shape owned by the + * provider rather than the bash-script generator. */ -export function discoverResumeSession(worktreePath: string, isResume: boolean | undefined): string | undefined { +export function discoverResumeSession( + worktreePath: string, + isResume: boolean | undefined, + harness: HarnessProvider, +): { sessionId: string; args: string[]; scriptFragment: string } | undefined { if (!isResume) return undefined; - const found = findLatestSessionId(worktreePath); - if (found) { - logger.kv('Claude session', `${found.slice(0, 8)}… (resuming conversation)`); - return found; + if (!harness.buildResume) { + logger.info('This harness does not support conversation resume; starting a fresh session.'); + return undefined; } - logger.info('No prior Claude conversation found for this worktree; starting a fresh session.'); + const resume = harness.buildResume(worktreePath); + if (resume) { + logger.kv('Session', `${resume.sessionId.slice(0, 8)}… (resuming conversation)`); + return resume; + } + logger.info('No prior conversation found for this worktree; starting a fresh session.'); return undefined; } @@ -456,7 +468,7 @@ async function spawnSpec(options: SpawnOptions, config: Config): Promise { templateContext.existing_branch = options.branch; } - const resumeSessionId = discoverResumeSession(worktreePath, options.resume); + const resume = discoverResumeSession(worktreePath, options.resume, getBuilderHarness(config.workspaceRoot)); const initialPrompt = buildPromptFromTemplate(config, protocol, templateContext); const resumeNotice = options.resume ? `\n${buildResumeNotice(projectId)}\n` : ''; @@ -470,7 +482,7 @@ async function spawnSpec(options: SpawnOptions, config: Config): Promise { const { terminalId } = await startBuilderSession( config, builderId, worktreePath, commands.builder, builderPrompt, role?.content ?? null, role?.source ?? null, - resumeSessionId, + resume, ); upsertBuilder({ @@ -835,13 +847,13 @@ async function spawnIssueDrivenBuilder( : ''; const builderPrompt = `You are a Builder. Read codev/roles/builder.md for your full role definition.\n${resumeNotice}${branchNotice}\n${prompt}`; - const resumeSessionId = discoverResumeSession(worktreePath, options.resume); + const resume = discoverResumeSession(worktreePath, options.resume, getBuilderHarness(config.workspaceRoot)); const role = options.noRole ? null : loadRolePrompt(config, 'builder'); const commands = getResolvedCommands(); const { terminalId } = await startBuilderSession( config, builderId, worktreePath, commands.builder, builderPrompt, role?.content ?? null, role?.source ?? null, - resumeSessionId, + resume, ); upsertBuilder({ diff --git a/packages/codev/src/agent-farm/servers/tower-instances.ts b/packages/codev/src/agent-farm/servers/tower-instances.ts index b6dab2837..dfcbb799a 100644 --- a/packages/codev/src/agent-farm/servers/tower-instances.ts +++ b/packages/codev/src/agent-farm/servers/tower-instances.ts @@ -13,7 +13,7 @@ import { exec } from 'node:child_process'; import { promisify } from 'node:util'; import { homedir } from 'node:os'; import { encodeWorkspacePath } from '../lib/tower-client.js'; -import { findLatestSessionId } from '../utils/claude-session-discovery.js'; +import { getArchitectHarness } from '../utils/config.js'; import { loadConfig } from '../../lib/config.js'; const execAsync = promisify(exec); @@ -466,10 +466,18 @@ export async function launchInstance(workspacePath: string): Promise<{ success: const cmdParts = architectCmd.split(/\s+/); const cmd = cmdParts[0]; - // Issue #830 (main architect only): if a prior Claude session exists - // for this workspace cwd, resume it instead of starting fresh. Role - // injection is skipped on the resume path — the saved conversation - // already contains the role/system prompt. + // Issue #830 (main architect only): if the configured harness exposes a + // resumable prior session for this workspace cwd, resume it instead of + // starting fresh. Role injection is skipped on the resume path — the + // saved conversation already contains the role/system prompt. + // + // Issue #929: resume is gated on the harness (via buildResume), not the + // Claude session store directly. Only the Claude harness implements + // buildResume (its sessions live at ~/.claude/projects//*.jsonl); + // codex returns null → fresh, role-injected launch. Previously this + // read findLatestSessionId() unconditionally, so a codex architect with + // any stale Claude jsonl built `codex --resume ` and shellper + // restart-looped to death. // // Lookup is unconditional here (unlike builders, where spawn.ts gates // discovery behind `options.resume`). The asymmetry is intentional: @@ -505,17 +513,24 @@ export async function launchInstance(workspacePath: string): Promise<{ success: // an unrelated jsonl. safeToResume = false; } - const resumeSessionId = safeToResume ? findLatestSessionId(workspacePath) : null; - if (!safeToResume) { + const architectHarness = getArchitectHarness(workspacePath); + const resume = safeToResume ? (architectHarness.buildResume?.(workspacePath) ?? null) : null; + // Only warn about a *skipped* resume when this harness actually supports + // resume (buildResume defined → claude). For codex, resume was never on + // the table, so the sibling-collision warning is just noise. + if (!safeToResume && architectHarness.buildResume) { _deps.log('WARN', `Skipping main architect conversation resume for ${workspacePath}: persisted sibling architects detected (or state.db unreadable); cannot disambiguate jsonl by cwd. See #832.`); } + let cmdArgs: string[]; let harnessEnv: Record; - if (resumeSessionId) { - cmdArgs = [...cmdParts.slice(1), '--resume', resumeSessionId]; + if (resume) { + cmdArgs = [...cmdParts.slice(1), ...resume.args]; harnessEnv = {}; - _deps.log('INFO', `Resuming main architect Claude session ${resumeSessionId.slice(0, 8)}… for ${workspacePath}`); + _deps.log('INFO', `Resuming main architect session ${resume.sessionId.slice(0, 8)}… for ${workspacePath}`); } else { + // Fresh launch — buildArchitectArgs injects the architect role. The + // resume path above is claude-only, which needs no role injection. const built = buildArchitectArgs(cmdParts.slice(1), workspacePath); cmdArgs = built.args; harnessEnv = built.env; diff --git a/packages/codev/src/agent-farm/servers/tower-utils.ts b/packages/codev/src/agent-farm/servers/tower-utils.ts index 3cd777d54..5f9ccbf7f 100644 --- a/packages/codev/src/agent-farm/servers/tower-utils.ts +++ b/packages/codev/src/agent-farm/servers/tower-utils.ts @@ -178,13 +178,14 @@ export function buildArchitectArgs(baseArgs: string[], workspacePath: string): { const bundledRolesDir = path.resolve(import.meta.dirname, '../../../skeleton/roles'); const config: RoleConfig = { codevDir, bundledRolesDir, workspaceRoot: workspacePath }; + const harness = getArchitectHarness(workspacePath); + const role = loadRolePrompt(config, 'architect'); if (!role) return { args: baseArgs, env: {} }; const roleFile = path.join(workspacePath, '.architect-role.md'); fs.writeFileSync(roleFile, role.content); - const harness = getArchitectHarness(workspacePath); const injection = harness.buildRoleInjection(role.content, roleFile); return { diff --git a/packages/codev/src/agent-farm/utils/config.ts b/packages/codev/src/agent-farm/utils/config.ts index c5ca82d22..6fd65656e 100644 --- a/packages/codev/src/agent-farm/utils/config.ts +++ b/packages/codev/src/agent-farm/utils/config.ts @@ -237,6 +237,7 @@ export function getResolvedCommands(workspaceRoot?: string): ResolvedCommands { return { architect: cliOverrides.architect || + process.env.TOWER_ARCHITECT_CMD || resolveCommand(userConfig?.shell?.architect, DEFAULT_COMMANDS.architect), builder: cliOverrides.builder || resolveCommand(userConfig?.shell?.builder, DEFAULT_COMMANDS.builder), @@ -248,11 +249,18 @@ export function getResolvedCommands(workspaceRoot?: string): ResolvedCommands { /** * Get the resolved harness provider for the architect shell. * Resolution: explicit architectHarness → auto-detect from architect command → default claude. + * + * The command auto-detect source is the *override-aware* resolved command + * (getResolvedCommands → cliOverrides / TOWER_ARCHITECT_CMD / config), not the + * raw config value. Without this, `--architect-cmd gemini` / `TOWER_ARCHITECT_CMD=gemini` + * with no matching harness config would launch gemini but resolve the claude + * harness, re-arming the #929 resume crash-loop (claude buildResume injecting + * `--resume ` into a non-claude command). */ export function getArchitectHarness(workspaceRoot?: string): HarnessProvider { const root = workspaceRoot || findWorkspaceRoot(); const userConfig = loadUserConfig(root); - const architectCmd = resolveCommand(userConfig?.shell?.architect, DEFAULT_COMMANDS.architect); + const architectCmd = getResolvedCommands(root).architect; return resolveHarness( userConfig?.shell?.architectHarness, userConfig?.harness as Record | undefined, @@ -263,11 +271,15 @@ export function getArchitectHarness(workspaceRoot?: string): HarnessProvider { /** * Get the resolved harness provider for the builder shell. * Resolution: explicit builderHarness → auto-detect from builder command → default claude. + * + * Like getArchitectHarness, the command auto-detect source is the override-aware + * resolved command (so `--builder-cmd gemini` with no matching harness config + * resolves the gemini harness, not claude — see #929). */ export function getBuilderHarness(workspaceRoot?: string): HarnessProvider { const root = workspaceRoot || findWorkspaceRoot(); const userConfig = loadUserConfig(root); - const builderCmd = resolveCommand(userConfig?.shell?.builder, DEFAULT_COMMANDS.builder); + const builderCmd = getResolvedCommands(root).builder; return resolveHarness( userConfig?.shell?.builderHarness, userConfig?.harness as Record | undefined, diff --git a/packages/codev/src/agent-farm/utils/harness.ts b/packages/codev/src/agent-farm/utils/harness.ts index bfb7f3b67..d5f9d64bc 100644 --- a/packages/codev/src/agent-farm/utils/harness.ts +++ b/packages/codev/src/agent-farm/utils/harness.ts @@ -12,6 +12,8 @@ * @see codev/specs/591-af-workspace-failure-with-code.md */ +import { findLatestSessionId } from './claude-session-discovery.js'; + // ============================================================================= // Types // ============================================================================= @@ -45,6 +47,22 @@ export interface HarnessProvider { relativePath: string; content: string; }>; + + /** + * Optional: discover a resumable prior session for the given working dir and + * return how to resume it — in BOTH forms, mirroring buildRoleInjection / + * buildScriptRoleInjection: + * - args: Node argv for spawn() call sites (architect launch) + * - scriptFragment: shell-escaped fragment for bash script generation (builder) + * Returns null when no resumable session exists or this harness has no + * cwd-keyed session store → callers fall back to a fresh launch. Only Claude + * implements it (store: ~/.claude/projects//.jsonl). + */ + buildResume?(absolutePath: string, opts?: { homeDir?: string }): { + sessionId: string; + args: string[]; + scriptFragment: string; + } | null; } /** Custom harness definition from .codev/config.json */ @@ -68,6 +86,15 @@ export const CLAUDE_HARNESS: HarnessProvider = { fragment: `--append-system-prompt "$(cat '${shellEscapeSingleQuote(filePath)}')"`, env: {}, }), + buildResume: (absolutePath, opts) => { + const sessionId = findLatestSessionId(absolutePath, opts); + if (!sessionId) return null; + return { + sessionId, + args: ['--resume', sessionId], + scriptFragment: `--resume '${shellEscapeSingleQuote(sessionId)}'`, + }; + }, }; export const CODEX_HARNESS: HarnessProvider = { diff --git a/packages/codev/src/commands/doctor.ts b/packages/codev/src/commands/doctor.ts index 8e3c434c5..f39587419 100644 --- a/packages/codev/src/commands/doctor.ts +++ b/packages/codev/src/commands/doctor.ts @@ -682,8 +682,9 @@ export async function doctor(): Promise { const architectCmd = Array.isArray(shell?.architect) ? (shell.architect as string[]).join(' ') : (shell?.architect as string ?? ''); - const isOpencode = architectHarness === 'opencode' || - (architectCmd && detectHarnessFromCommand(architectCmd) === 'opencode'); + const resolvedHarness = architectHarness || + (architectCmd ? detectHarnessFromCommand(architectCmd) : undefined); + const isOpencode = resolvedHarness === 'opencode'; if (isOpencode) { console.log(''); console.log(chalk.yellow(' ⚠') + ' OpenCode is configured as architect shell — this is unsupported.'); @@ -695,6 +696,25 @@ export async function doctor(): Promise { issue: 'OpenCode configured as architect shell (unsupported)', recommendation: 'Set shell.architect to "claude --dangerously-skip-permissions" in .codev/config.json', }); + } else if (resolvedHarness === 'gemini') { + // Issue #929: gemini is builder-only; the Gemini CLI is retiring (#778), + // so it is no longer supported as an architect. + console.log(''); + console.log(chalk.yellow(' ⚠') + ' Gemini is configured as architect shell — this is unsupported.'); + console.log(chalk.yellow(' ') + 'The Gemini CLI is retiring (#778); gemini is supported for builders, not architects.'); + console.log(chalk.yellow(' ') + 'Use codex or claude for the architect (e.g., "codex" or "claude --dangerously-skip-permissions").'); + warnings++; + warningDetails.push({ + name: 'Shell config', + issue: 'Gemini configured as architect shell (builder-only, not architect)', + recommendation: 'Set shell.architect to "codex" or "claude --dangerously-skip-permissions" in .codev/config.json', + }); + } else if (resolvedHarness === 'codex') { + // Issue #929: codex is a supported architect (config-driven). + console.log(''); + console.log(chalk.green(' ✓') + ' codex is configured as architect shell — supported.'); + console.log(chalk.gray(' ') + 'Conversation resume is Claude-main-only; codex architects relaunch fresh with role injection.'); + console.log(chalk.gray(' ') + 'Select the architect harness via .codev/config.json (shell.architect / shell.architectHarness).'); } } } catch {