Commit 53d2246

feat: Code Tour — guided PR walkthrough as a third agent provider (#569)
* feat(server): add Code Tour agent as third review provider

  Adds a "tour" provider alongside "claude" and "codex" that generates a guided walkthrough of a changeset from a product-minded colleague's perspective. Reuses the entire agent-jobs infrastructure (process lifecycle, SSE broadcasting, live logs, kill support, capability detection) so the only new server plumbing is the prompt/schema module, the GET endpoint for the result, and checklist persistence.

  - tour-review.ts: prompt, JSON schema, Claude (stream-json) and Codex (file output) command builders, and parsers for both. Prompt frames the agent as a colleague giving a casual tour, orders stops by reading flow (not impact), chunks by logical change (not per-file), and writes QA questions a human can answer by reading code or using the product.
  - review.ts: tour buildCommand with per-engine model defaults (fixes a bug where Codex was defaulting to "sonnet", a Claude model), an onJobComplete handler that parses and stores the tour, plus GET /api/tour/:jobId and PUT /api/tour/:jobId/checklist endpoints.
  - agent-jobs.ts: tour capability detection and config threading from the POST body through to the provider's buildCommand.
  - shared/agent-jobs.ts: engine/model fields on AgentJobInfo so the UI can render tour jobs with their chosen provider + model.

  For provenance purposes, this commit was AI assisted.

* feat(review-editor): Code Tour dialog with animated walkthrough

  Adds the tour overlay surface: a full-screen dialog with three animated pages (Overview, Walkthrough, Checklist) that renders the CodeTourOutput from the server-side agent. Opens automatically when a tour job completes and can be dismissed with Escape or backdrop click.

  - components/tour/TourDialog.tsx: three-page animated dialog. Intro page uses a composition cascade (greeting, Intent/Before/After cards with one-shot color-dot pulse, key takeaways table) with a pinned Start Tour button and a bottom fade so content scrolls cleanly under it. Page slides between Overview/Walkthrough/Checklist are ease-out springs with direction auto-derived from tab index.
  - components/tour/TourStopCard.tsx: per-stop accordion using motion's AnimatePresence for height + opacity springs with staggered children for detail text and anchor blocks. Anchors lazy-mount DiffHunkPreview on first open to keep mount costs low. Callout labels (Important, Warning, Note) are deterministic typography, no emoji.
  - components/tour/QAChecklist.tsx: always-open list of verification questions with per-item spring entrance and persistent checkbox state that PUTs to the server.
  - hooks/useTourData.ts: fetch + checklist persistence with a dev-mode short-circuit when jobId === DEMO_TOUR_ID so UI iteration doesn't require a live agent run.
  - demoTour.ts: realistic demo data (multi-stop, multiple takeaways, real diff hunks) for dev iteration.
  - App.tsx: tour dialog state, auto-open on job completion, and a dev-only "Demo tour" floating button + Cmd/Ctrl+Shift+T shortcut guarded by import.meta.env.DEV.
  - index.css: all tour animations (dialog enter/exit, page slides, stop reveal cascade, intro exit) with reduced-motion fallbacks. Dark-mode contrast tuning across structural surfaces (borders, surface tints, callout backgrounds) so the design works in both themes.
  - package.json: adds motion@12.38.0 for spring-driven accordion physics and the intro composition cascade.

  For provenance purposes, this commit was AI assisted.

* feat(review-editor): wire Code Tour into agent jobs UI + polish DiffHunkPreview

  Connects the tour provider to the existing agent jobs sidebar and detail panel so the user can launch a tour the same way they launch Claude or Codex reviews. Also tightens DiffHunkPreview (used inside tour anchors) to render correctly on first mount and handle every diff hunk shape the agent might produce.

  - ui/components/AgentsTab.tsx: tour engine (Claude/Codex) + model select when launching a tour job. Resets model to blank when falling back to Codex-only so Codex uses its own default instead of inheriting "sonnet" (a Claude model).
  - ui/hooks/useAgentJobs.ts: forward engine/model config from the UI through to the POST /api/agents/jobs body so the server's tour provider can pick the right command shape.
  - dock/panels/ReviewAgentJobDetailPanel.tsx: tour-aware detail panel. Replaces the "Findings" tab with a "Status" card for tour jobs and surfaces an "Open Tour" button that opens the dialog overlay when the tab is already in the dock.
  - components/DiffHunkPreview.tsx: synchronous pierre theme init via a useState lazy initializer so the first render (inside a tooltip) is already themed; robust hunk parser that handles bare @@ hunks, file-level --- hunks, and full git diffs; "Diff not available" fallback for broken hunks.
  - components/ReviewSidebar.tsx, dock/ReviewStateContext.tsx: small wiring changes to expose openTourPanel from the review state context.

  Also removes two dead components (TourHeader, DiffAnchorChip) that were superseded by the inline title bar in TourDialog and the inline AnchorBlock in TourStopCard.

  For provenance purposes, this commit was AI assisted.

* refactor(server): extract Code Tour lifecycle into shared createTourSession factory + Pi parity

  The route-parity test suite requires the Pi server (apps/pi-extension/) to expose the same routes as the Bun server (packages/server/). After Code Tour shipped in the prior 3 commits, Pi was missing /api/tour/:jobId (GET) and /api/tour/:jobId/checklist (PUT). A naive mirror would duplicate ~100 lines of provider-branch logic (buildCommand, onJobComplete, in-memory maps) into Pi's serverReview.ts, perpetuating the existing claude/codex duplication problem.

  Instead, extract the pure runtime-agnostic tour lifecycle into a createTourSession() factory that both servers consume. Route handlers stay per-server (different HTTP primitives) but are ~5 lines each. Net effect: Pi port is ~25 lines instead of ~100. Future providers that adopt the same pattern cost ~15 lines per server.

  - tour-review.ts: new createTourSession() at the bottom of the module. Encapsulates tourResults + tourChecklists maps, buildCommand (with the Claude-vs-Codex model-default fix baked in), onJobComplete (parse/store/summarize), plus getTour/saveChecklist lookup helpers for route handlers.
  - review.ts (Bun): tour branch in buildCommand, tour branch in onJobComplete, and both route handlers collapse to one-line calls into the factory. Drops ~70 lines.
  - vendor.sh: add tour-review to the review-agent loop so Pi regenerates generated/tour-review.ts on every build:pi.
  - serverReview.ts (Pi): import createTourSession from ../generated/tour-review.js; add tour branch to buildCommand (one line), tour branch to onJobComplete (three lines), and GET/PUT route handlers using Pi's json() helper. ~25 lines added.
  - agent-jobs.ts (Pi): extend buildCommand interface to accept config and return engine/model; thread config from POST body; extend spawnJob to persist engine/model on AgentJobInfo; add tour to capability list.

  Claude and Codex branches are intentionally left in the old pattern; they can migrate to the factory approach when next touched to keep this change's blast radius contained.

  Tests: 518/518 passing (previous 3 route-parity failures resolved, plus 2 extra assertions passing since tour is now in both servers' route tables).

  For provenance purposes, this commit was AI assisted.

* refactor(tour): self-review cleanup: Pi route match parity + remove setOpen shim

  Two small cleanups surfaced by a self-review pass:

  - Pi's GET /api/tour/:jobId route used `endsWith("/checklist")` to block the checklist sub-route, while Bun uses `includes("/", ...)`. The two are not equivalent: a URL like /api/tour/abc/extra would be accepted by Pi (jobId becomes "abc/extra") but correctly rejected by Bun. Align Pi to Bun's pattern.
  - TourStopCard had a leftover `setOpen` shim from when open state was local to the card. State is now lifted to TourDialog, so the shim just aliases onToggle and ignores its argument. Replace with a direct `onClick={onToggle}` on the trigger button.

  520/0 tests still pass.

  For provenance purposes, this commit was AI assisted.

* feat(review): per-provider model + effort controls across all review agents

  Exposes model, effort/reasoning, and fast-mode (Codex) as per-job knobs for Claude, Codex, and Code Tour via the Agents tab. Threads the config from the UI through the POST body into each provider's buildCommand, which emits the corresponding CLI flags (--model/--effort for Claude; -m / -c model_reasoning_effort / -c service_tier for Codex). Bun and Pi servers stay in parity.

  UI: option catalogs (CLAUDE_MODELS, CLAUDE_EFFORT, CODEX_MODELS, CODEX_REASONING, TOUR_CLAUDE_MODELS) are defined once and mapped into every dropdown so there's a single source of truth for the choices. Tour's Claude/Codex settings are kept separate from the standalone provider's state so toggling the tour engine no longer overwrites the provider's last choice. Job badge now surfaces the model/reasoning/fast selection for both Tour and the standalone Codex provider.

  For provenance purposes, this commit was AI assisted.

* docs: add Prompts reference page

  Documents the three-layer structure every review call travels through (CLI system prompt, our user message of review-prompt + user-prompt, output schema flag), names each constant + its file, and calls out that the Claude/Codex review prompts are the upstream ones from those projects (only Code Tour's prompt is original to Plannotator). Linked from the Code Review command page.

  For provenance purposes, this commit was AI assisted.

* feat(review): persist per-agent, per-model settings in a single cookie

  Drops the hidden "Default" options from the agent dropdowns, locks in explicit sensible defaults (Opus 4.7/High for Claude review, gpt-5.3-codex/High for Codex, Sonnet/Medium for Tour Claude, gpt-5.3-codex/Medium for Tour Codex), and remembers the last-used effort/reasoning/fast-mode per (agent job × model) so switching models reveals the choices you made last time.

  Backed by a single `plannotator.agents` cookie holding the whole settings tree: one read on mount, one mirror write per change, all mutations funnel through a single React state owner to avoid stale-read or lost-write races across rapid successive updates.

  For provenance purposes, this commit was AI assisted.

* fix(tour): unblock reduced-motion nav + flush pending checklist save on close

  Two P2 issues surfaced in review:

  - Under prefers-reduced-motion, tour page animations are suppressed, so onAnimationEnd never fires and exitingPage stuck on 'intro' kept the walkthrough/checklist gated out. navigate() now swaps pages directly when reduced motion is on, mirroring the pattern already used in the wrapper.
  - Checklist toggles are debounced 500ms before the PUT, but unmount only cleared the timer, so checking an item and closing within the window dropped the save. Cleanup now flushes the pending payload with keepalive: true.

  For provenance purposes, this commit was AI assisted.

* feat(review): surface Claude model + effort in job badge, trim redundant labels

  Claude effort was never persisted on AgentJobInfo, so the job badge had nothing to render beyond "Claude". Plumbs `effort` through the shared type, both server build-command pipelines (Bun + Pi), and spawnJob, then teaches the ProviderBadge to display Claude and Tour Claude model + effort with the same shape Codex already had. Labels resolve via the dropdown catalogs so the badge shows "Opus 4.7" / "High" instead of raw ids.

  Server labels for both Claude and Codex reviews collapse to plain "Code Review" (matching tour's "Code Tour"), since the badge now carries the provider + model + settings; the title only needs to name the action.

  For provenance purposes, this commit was AI assisted.

* polish(review): prefix agent dropdown options with the action name

  The Run launcher listed "Claude Code", "Codex CLI", and "Code Tour" side by side, so the CLI's detection name was doing double duty as the action label. Prefixes the review entries with "Code Review · " so scanning three options surfaces two reviews + one tour instead of three raw provider names.

  For provenance purposes, this commit was AI assisted.

* fix(review): drop invalid codex reasoning option + fail tour on empty output

  Two P2 issues from PR review:

  - Tour jobs that exited 0 but returned null output would still be marked "done", and the auto-open watcher would greet the user with a 404 "Tour not found" dialog. onJobComplete now flips the job to failed with an error message when nothing was stored, so the card reflects the real state.
  - The codex reasoning dropdown offered "None", but codex-rs only accepts minimal/low/medium/high/xhigh. Picking None sent `-c model_reasoning_effort=none` and launched a broken job. Removed the option; added a one-shot cookie migration so users who already saved "none" don't keep shipping it.

  For provenance purposes, this commit was AI assisted.

* fix(tour): format Claude-engine logs + allow reading linked issues

  Two P2/P3 issues from PR review:

  - The stdout formatter keyed only on provider === "claude", so Code Tour jobs running on the Claude engine streamed raw JSONL to the log panel while Claude review jobs got formatted text. Widened the check to also catch spawnOptions.engine === "claude" and mirrored the fix into Pi.
  - The tour Claude allowlist permitted PR/MR commands but not issue reads, yet the prompt explicitly asks the agent to open `Fixes #123` / `Closes owner/repo#456` targets for deeper context. Claude was being denied those commands mid-tour. Added gh issue view + gh api issues + glab issue view to the allowlist.

  For provenance purposes, this commit was AI assisted.

* cleanup(review): e2e punchlist: Pi parity + dedup + tests

  - Pi now logs Claude parse failures with the same diagnostic as Bun
  - Tour route match uses a regex instead of an includes-offset trick
  - Drop `any` cast in Pi's Codex blocking-findings predicate
  - Extract `patchClaude` twin of `patchCodex` in useAgentSettings
  - Factor `resolveAgentCwd` helper in Bun review to match Pi
  - AgentsTab launch payload becomes a per-provider dispatch table
  - Wrap TourDialog in MotionConfig reducedMotion="user" so motion.* children honor prefers-reduced-motion alongside the CSS keyframes
  - Tests for parseTourStreamOutput/parseTourFileOutput and the Codex perModel sanitizer

  For provenance purposes, this commit was AI assisted.

* test(tour-review): use real TourStop shape in fixtures

  The parsers don't validate stop fields so the original made-up shape passed fine, but future readers would be misled. Use the real CodeTourOutput / TourStop / TourDiffAnchor shapes from tour-review.ts.

  For provenance purposes, this commit was AI assisted.

* cleanup(tour): hoist shared types + subfolder server module + dedup

  - Hoist tour output types to packages/shared/tour.ts so server and UI share one source of truth (prevents silent drift when schema changes).
  - Move tour-review into packages/server/tour/ subfolder alongside the existing packages/review-editor/components/tour/ convention.
  - Extend tour.buildCommand to accept tour context directly (patch, diffType, options, prMetadata), so Bun and Pi no longer duplicate the buildTourUserMessage call site. Export TOUR_EMPTY_OUTPUT_ERROR from the tour module to prevent string drift between servers.
  - Early-return the tour branch in buildCommand so the review userMessage is built once on the shared non-tour path instead of unconditionally (dead work on tour launches) or twice (codex+claude dup).
  - Regex capture group for tour checklist jobId in both servers; replaces a second url.pathname.split("/")[3] fragile fallback.
  - Drop existsSync guard in parseTourFileOutput (TOCTOU + duplicate syscall); the outer try/catch already handles ENOENT.
  - Dedupe two matchMedia reads in TourDialog behind a prefersReducedMotion helper; memoize intro-page card array so unrelated state churn doesn't recompute it.
  - Comment hygiene on tour files: drop WHAT/change-referencing comments added by this PR; keep non-obvious WHY.

  For provenance purposes, this commit was AI assisted.

* fix(tour): hoist introCards useMemo above early returns + subfolder hook

  The memoized intro-card array was placed after the loading/error early returns in TourDialogContent, so the first successful render called one more hook than the initial loading render; React threw "Rendered more hooks than during the previous render" (error #310) the moment a tour finished loading. Move the useMemo above the early returns and use optional chaining on `tour` so it's safe to run during loading.

  Also move useTourData.ts into a hooks/tour/ subfolder to start a standard for feature-scoped hooks (was flat among siblings).

  For provenance purposes, this commit was AI assisted.
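The route-match parity fix in the commit message can be illustrated with a small sketch. The two regular expressions are the ones that appear in the Pi handlers in this commit's diff; the helper names (`matchTourGet`, `matchTourChecklist`, `legacyEndsWithGuard`) are hypothetical, and the legacy guard is only a reconstruction from the commit message's description, not the original code.

```typescript
// Patterns as they appear in the Pi route handlers in this commit.
const TOUR_GET = /^\/api\/tour\/[^/]+$/;
const TOUR_CHECKLIST = /^\/api\/tour\/([^/]+)\/checklist$/;

// Hypothetical helper: jobId for GET /api/tour/:jobId, or null if the path does not match.
function matchTourGet(pathname: string): string | null {
  return TOUR_GET.test(pathname) ? pathname.slice("/api/tour/".length) : null;
}

// Hypothetical helper: jobId for PUT /api/tour/:jobId/checklist, via the capture group.
function matchTourChecklist(pathname: string): string | null {
  const m = pathname.match(TOUR_CHECKLIST);
  return m?.[1] ?? null;
}

// Assumed shape of the older endsWith-based guard described in the commit message.
function legacyEndsWithGuard(pathname: string): string | null {
  if (!pathname.startsWith("/api/tour/") || pathname.endsWith("/checklist")) return null;
  return pathname.slice("/api/tour/".length);
}

console.log(matchTourGet("/api/tour/abc/extra"));        // rejected by the regex
console.log(legacyEndsWithGuard("/api/tour/abc/extra")); // accepted by the old guard
```

Anchoring the pattern at both ends with `[^/]+` is what rules out extra path segments, which is the behaviour the commit message says Bun already had.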
1 parent 3a7f115 commit 53d2246

31 files changed, +3333 -164 lines changed

apps/marketing/src/content/docs/commands/code-review.md

Lines changed: 4 additions & 0 deletions
@@ -106,6 +106,10 @@ When multiple providers are available, set your default in **Settings → AI**.
 
 If only one provider is installed, it's used automatically with no configuration needed.
 
+## How review agents prompt the CLI
+
+The review agents (Claude, Codex, Code Tour) shell out to external CLIs. Plannotator controls the user message and output schema; the CLI's own harness owns the system prompt. See the [Prompts reference](/docs/reference/prompts/) for the full breakdown of what each provider sends, how the pieces join, and which knobs you can tune per job.
+
 ## Submitting feedback
 
 - **Send Feedback** formats your annotations and sends them to the agent
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+---
+title: "Prompts"
+description: "How Plannotator's review agents structure their prompts, what we control, what the CLI harness owns, and how the pieces fit together."
+sidebar:
+  order: 33
+section: "Reference"
+---
+
+Plannotator's review agents (Claude, Codex, and Code Tour) all shell out to an external CLI. This page maps what those CLIs receive on every invocation: which parts Plannotator controls, and which parts are owned by the CLI's own agent harness.
+
+Importantly, **we don't invent our own review prompts**. The Claude review prompt is derived from Claude Code's published open-source review prompt, and the Codex review prompt is copied verbatim from [`codex-rs/core/review_prompt.md`](https://github.com/openai/codex). You get the same review behavior those tools ship with. Code Tour is the one exception: it's a Plannotator-original workflow, so its prompt is ours.
+
+## The three layers
+
+Every review call is shaped by three layers:
+
+1. **System prompt.** Owned by the CLI (Claude Code or codex-rs). Plannotator never sets or touches this.
+2. **User message.** What Plannotator sends. Always a single concatenated string of two parts: a static **review prompt** plus a dynamic **user prompt**.
+3. **Output schema.** A JSON schema passed to the CLI as a flag, forcing the final assistant message to match a known shape.
+
+## What's in the user message
+
+The user message Plannotator sends is always:
+
+```
+<review prompt>
+
+---
+
+<user prompt>
+```
+
+**Review prompt** is a long, static review instruction that lives in the repo as a TypeScript constant. It's distinct per provider.
+
+**User prompt** is a short, dynamic line built per call from the diff type (`uncommitted`, `staged`, `last-commit`, `branch`, PR URL, and so on). The same builder is used for all providers.
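As a rough illustration of the concatenation this added page describes, the sketch below joins a review prompt and a user prompt with the `"\n\n---\n\n"` separator that the diff uses inline. The helper name `joinUserMessage` and the placeholder strings are hypothetical; the real servers concatenate the constants directly.

```typescript
// Separator between the static review prompt and the dynamic user prompt,
// matching the inline concatenation shown elsewhere in this diff.
const SEPARATOR = "\n\n---\n\n";

// Hypothetical helper name, used here only for illustration.
function joinUserMessage(reviewPrompt: string, userPrompt: string): string {
  return reviewPrompt + SEPARATOR + userPrompt;
}

// Placeholder inputs standing in for a real prompt constant and a built user prompt.
const message = joinUserMessage("<review prompt>", "<user prompt>");
console.log(message);
```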
+
+## Matrix
+
+| | Claude review | Codex review | Code Tour (Claude or Codex) |
+|---|---|---|---|
+| **System prompt** | Owned by `claude` CLI. We don't touch it. | Owned by `codex` CLI. We don't touch it. | Same as whichever engine runs. |
+| **Review prompt (static, ours)** | `CLAUDE_REVIEW_PROMPT` in `packages/server/claude-review.ts` | `CODEX_REVIEW_SYSTEM_PROMPT` in `packages/server/codex-review.ts` (misnamed; it's user content) | `TOUR_REVIEW_PROMPT` in `packages/server/tour-review.ts` |
+| **User prompt (dynamic, ours)** | `buildCodexReviewUserMessage(patch, diffType, …)` | same function | same function |
+| **Full user message** | `review prompt + "\n\n---\n\n" + user prompt` | same | same |
+| **Delivered via** | stdin | last positional argv | stdin (Claude engine) or positional argv (Codex engine) |
+| **Output schema flag** | `--json-schema <inline JSON>` | `--output-schema <file path>` | same as engine |
+| **Schema shape** | severity findings (`important`, `nit`, `pre_existing`) | priority findings (P0 through P3) | stops plus QA checklist |
+
+## Why the schema matters
+
+The schema flag is a terminal constraint, not a per-turn one. The agent reasons freely across N turns, reading files, grepping, running tests, and only the final assistant message is forced to deserialize against the schema. Everything upstream is unconstrained exploration.
+
+That's why this pattern works for review. You get agentic exploration (the whole point of using Claude Code or Codex over a raw LLM call), plus a machine-readable payload the UI can render without any scraping.
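To make the "terminal constraint" idea concrete, here is a minimal sketch under stated assumptions: the `SeveritySummary` shape is a simplified, hypothetical stand-in built from the `important` / `nit` / `pre_existing` counts that the Claude summary code in this diff reads, and `parseFinalMessage` is an invented helper, not the project's parser. Only the last turn is checked against the shape; earlier turns are ignored entirely.

```typescript
// Hypothetical, simplified shape: the three severity counts only.
interface SeveritySummary {
  important: number;
  nit: number;
  pre_existing: number;
}

// Runtime guard standing in for schema validation of the final message.
function isSeveritySummary(value: unknown): value is SeveritySummary {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["important"] === "number" &&
    typeof v["nit"] === "number" &&
    typeof v["pre_existing"] === "number"
  );
}

// Only the final assistant message must deserialize; earlier turns are free-form.
function parseFinalMessage(turns: string[]): SeveritySummary | null {
  const last = turns[turns.length - 1];
  if (last === undefined) return null;
  try {
    const parsed: unknown = JSON.parse(last);
    return isSeveritySummary(parsed) ? parsed : null;
  } catch {
    return null; // final message was not valid JSON
  }
}

const result = parseFinalMessage([
  "Reading files and grepping for call sites...",
  '{"important":1,"nit":2,"pre_existing":0}',
]);
console.log(result);
```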
+
+## What you can tune per job
+
+From the **Agents** tab in the code-review UI, each provider exposes these settings:
+
+| Setting | Claude | Codex | Tour |
+|---|---|---|---|
+| Model | yes (`--model`) | yes (`-m`) | yes (per engine) |
+| Reasoning effort | yes (`--effort`) | yes (`-c model_reasoning_effort=…`) | yes (per engine) |
+| Fast mode | no | yes (`-c service_tier=fast`) | Codex engine only |
+
+None of these change the review prompt or user prompt. They only change how the underlying CLI executes the same user message.
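A minimal sketch of how the per-job settings in the table above might map onto CLI arguments. The flag spellings come from the table and the commit message; the function names, the settings interfaces, and the exact way the `-c key=value` pairs are assembled are assumptions for illustration, not the project's `buildClaudeCommand` / `buildCodexCommand` implementations.

```typescript
// Hypothetical settings shapes mirroring the fields threaded through the POST body.
interface ClaudeSettings {
  model?: string;
  effort?: string;
}

interface CodexSettings {
  model?: string;
  reasoningEffort?: string;
  fastMode?: boolean;
}

// Emits only the flags for settings that were actually provided.
function claudeFlags(s: ClaudeSettings): string[] {
  const flags: string[] = [];
  if (s.model) flags.push("--model", s.model);
  if (s.effort) flags.push("--effort", s.effort);
  return flags;
}

function codexFlags(s: CodexSettings): string[] {
  const flags: string[] = [];
  if (s.model) flags.push("-m", s.model);
  if (s.reasoningEffort) flags.push("-c", `model_reasoning_effort=${s.reasoningEffort}`);
  if (s.fastMode) flags.push("-c", "service_tier=fast");
  return flags;
}

console.log(claudeFlags({ model: "sonnet", effort: "medium" }));
console.log(codexFlags({ model: "gpt-5.3-codex", reasoningEffort: "high", fastMode: true }));
```

Leaving a setting undefined produces no flag at all, which matches the commit's intent that each CLI falls back to its own default rather than inheriting another provider's model name.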
+
+## Relationship to code review
+
+See [Code Review](/docs/commands/code-review/) for the end-to-end flow this feeds into.

apps/pi-extension/server/agent-jobs.ts

Lines changed: 44 additions & 5 deletions
@@ -54,15 +54,25 @@ export interface AgentJobHandlerOptions {
   mode: "plan" | "review" | "annotate";
   getServerUrl: () => string;
   getCwd: () => string;
-  /** Server-side command builder for known providers (codex, claude). */
-  buildCommand?: (provider: string) => Promise<{
+  /** Server-side command builder for known providers (codex, claude, tour). */
+  buildCommand?: (provider: string, config?: Record<string, unknown>) => Promise<{
     command: string[];
     outputPath?: string;
     captureStdout?: boolean;
     stdinPrompt?: string;
     cwd?: string;
     prompt?: string;
     label?: string;
+    /** Underlying engine used (e.g., "claude" or "codex"). Stored on AgentJobInfo for UI display. */
+    engine?: string;
+    /** Model used (e.g., "sonnet", "opus"). Stored on AgentJobInfo for UI display. */
+    model?: string;
+    /** Claude --effort level. */
+    effort?: string;
+    /** Codex reasoning effort level. */
+    reasoningEffort?: string;
+    /** Whether Codex fast mode was enabled. */
+    fastMode?: boolean;
   } | null>;
   /** Called when a job completes successfully — parse results and push annotations. */
   onJobComplete?: (job: AgentJobInfo, meta: { outputPath?: string; stdout?: string; cwd?: string }) => void | Promise<void>;
@@ -81,6 +91,7 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
   const capabilities: AgentCapability[] = [
     { id: "claude", name: "Claude Code", available: whichCmd("claude") },
     { id: "codex", name: "Codex CLI", available: whichCmd("codex") },
+    { id: "tour", name: "Code Tour", available: whichCmd("claude") || whichCmd("codex") },
   ];
   const capabilitiesResponse: AgentCapabilities = {
     mode,
@@ -107,7 +118,7 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
     command: string[],
     label: string,
     outputPath?: string,
-    spawnOptions?: { captureStdout?: boolean; stdinPrompt?: string; cwd?: string; prompt?: string },
+    spawnOptions?: { captureStdout?: boolean; stdinPrompt?: string; cwd?: string; prompt?: string; engine?: string; model?: string; effort?: string; reasoningEffort?: string; fastMode?: boolean },
   ): AgentJobInfo {
     const id = crypto.randomUUID();
     const source = jobSource(id);
@@ -121,6 +132,11 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
       startedAt: Date.now(),
       command,
       cwd: getCwd(),
+      ...(spawnOptions?.engine && { engine: spawnOptions.engine }),
+      ...(spawnOptions?.model && { model: spawnOptions.model }),
+      ...(spawnOptions?.effort && { effort: spawnOptions.effort }),
+      ...(spawnOptions?.reasoningEffort && { reasoningEffort: spawnOptions.reasoningEffort }),
+      ...(spawnOptions?.fastMode && { fastMode: spawnOptions.fastMode }),
     };
 
     let proc: ChildProcess | null = null;
@@ -169,7 +185,8 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
           const lines = text.split('\n');
           for (const line of lines) {
             if (!line.trim()) continue;
-            if (provider === "claude") {
+            // Tour jobs with the Claude engine also stream Claude JSONL.
+            if (provider === "claude" || spawnOptions?.engine === "claude") {
               const formatted = formatClaudeLogEvent(line);
               if (formatted !== null) {
                 broadcast({ type: "job:log", jobId: id, delta: formatted + '\n' });
@@ -397,8 +414,20 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
           let stdinPrompt: string | undefined;
           let spawnCwd: string | undefined;
           let promptText: string | undefined;
+          let jobEngine: string | undefined;
+          let jobModel: string | undefined;
+          let jobEffort: string | undefined;
+          let jobReasoningEffort: string | undefined;
+          let jobFastMode: boolean | undefined;
           if (options.buildCommand) {
-            const built = await options.buildCommand(provider);
+            // Thread config from POST body to buildCommand
+            const config: Record<string, unknown> = {};
+            if (typeof body.engine === "string") config.engine = body.engine;
+            if (typeof body.model === "string") config.model = body.model;
+            if (typeof body.reasoningEffort === "string") config.reasoningEffort = body.reasoningEffort;
+            if (typeof body.effort === "string") config.effort = body.effort;
+            if (body.fastMode === true) config.fastMode = true;
+            const built = await options.buildCommand(provider, Object.keys(config).length > 0 ? config : undefined);
             if (built) {
               command = built.command;
               outputPath = built.outputPath;
@@ -407,6 +436,11 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
               spawnCwd = built.cwd;
               promptText = built.prompt;
               if (built.label) label = built.label;
+              jobEngine = built.engine;
+              jobModel = built.model;
+              jobEffort = built.effort;
+              jobReasoningEffort = built.reasoningEffort;
+              jobFastMode = built.fastMode;
             }
           }
 
@@ -420,6 +454,11 @@ export function createAgentJobHandler(options: AgentJobHandlerOptions) {
             stdinPrompt,
             cwd: spawnCwd,
             prompt: promptText,
+            engine: jobEngine,
+            model: jobModel,
+            effort: jobEffort,
+            reasoningEffort: jobReasoningEffort,
+            fastMode: jobFastMode,
           });
           json(res, { job }, 201);
         } catch {

apps/pi-extension/server/serverReview.ts

Lines changed: 72 additions & 14 deletions
@@ -71,6 +71,7 @@ import {
   parseClaudeStreamOutput,
   transformClaudeFindings,
 } from "../generated/claude-review.js";
+import { createTourSession, TOUR_EMPTY_OUTPUT_ERROR } from "../generated/tour-review.js";
 
 /** Detect if running inside WSL (Windows Subsystem for Linux) */
 function detectWSL(): boolean {
@@ -194,7 +195,6 @@ export async function startReviewServer(options: {
 
   // Agent jobs — background process manager (late-binds serverUrl via getter)
   let serverUrl = "";
-  // Worktree-aware cwd resolver — shared by getCwd, buildCommand, and onJobComplete
   function resolveAgentCwd(): string {
     if (options.agentCwd) return options.agentCwd;
     if (currentDiffType.startsWith("worktree:")) {
@@ -203,32 +203,47 @@ export async function startReviewServer(options: {
     }
     return options.gitContext?.cwd ?? process.cwd();
   }
+  const tour = createTourSession();
+
   const agentJobs = createAgentJobHandler({
     mode: "review",
     getServerUrl: () => serverUrl,
     getCwd: resolveAgentCwd,
 
-    async buildCommand(provider) {
+    async buildCommand(provider, config) {
       const cwd = resolveAgentCwd();
       const hasAgentLocalAccess = !!options.agentCwd || !!options.gitContext;
-      const userMessage = buildCodexReviewUserMessage(
-        currentPatch,
-        currentDiffType,
-        { defaultBranch: options.gitContext?.defaultBranch, hasLocalAccess: hasAgentLocalAccess },
-        options.prMetadata,
-      );
+      const userMessageOptions = { defaultBranch: options.gitContext?.defaultBranch, hasLocalAccess: hasAgentLocalAccess };
+
+      if (provider === "tour") {
+        return tour.buildCommand({
+          cwd,
+          patch: currentPatch,
+          diffType: currentDiffType,
+          options: userMessageOptions,
+          prMetadata: options.prMetadata,
+          config,
+        });
+      }
+
+      const userMessage = buildCodexReviewUserMessage(currentPatch, currentDiffType, userMessageOptions, options.prMetadata);
 
       if (provider === "codex") {
+        const model = typeof config?.model === "string" && config.model ? config.model : undefined;
+        const reasoningEffort = typeof config?.reasoningEffort === "string" && config.reasoningEffort ? config.reasoningEffort : undefined;
+        const fastMode = config?.fastMode === true;
         const outputPath = generateOutputPath();
         const prompt = CODEX_REVIEW_SYSTEM_PROMPT + "\n\n---\n\n" + userMessage;
-        const command = await buildCodexCommand({ cwd, outputPath, prompt });
-        return { command, outputPath, prompt, label: "Codex Review" };
+        const command = await buildCodexCommand({ cwd, outputPath, prompt, model, reasoningEffort, fastMode });
+        return { command, outputPath, prompt, label: "Code Review", model, reasoningEffort, fastMode: fastMode || undefined };
       }
 
       if (provider === "claude") {
+        const model = typeof config?.model === "string" && config.model ? config.model : undefined;
+        const effort = typeof config?.effort === "string" && config.effort ? config.effort : undefined;
         const prompt = CLAUDE_REVIEW_PROMPT + "\n\n---\n\n" + userMessage;
-        const { command, stdinPrompt } = buildClaudeCommand(prompt);
-        return { command, stdinPrompt, prompt, cwd, label: "Claude Code Review", captureStdout: true };
+        const { command, stdinPrompt } = buildClaudeCommand(prompt, model, effort);
+        return { command, stdinPrompt, prompt, cwd, label: "Code Review", captureStdout: true, model, effort };
       }
 
       return null;
@@ -243,7 +258,7 @@ export async function startReviewServer(options: {

         // Override verdict if there are blocking findings (P0/P1) — Codex's
         // freeform correctness string can say "mostly correct" with real bugs.
-        const hasBlockingFindings = output.findings.some((f: any) => f.priority !== null && f.priority <= 1);
+        const hasBlockingFindings = output.findings.some(f => f.priority !== null && f.priority <= 1);
         job.summary = {
           correctness: hasBlockingFindings ? "Issues Found" : output.overall_correctness,
           explanation: output.overall_explanation,
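
Note on the hunk above: the explicit `priority !== null` guard matters because in plain JavaScript `null <= 1` evaluates to `true`, so an unprioritized finding would otherwise count as blocking. The sketch below isolates the override. The `Finding` and `ReviewOutput` interfaces are simplified assumptions, not the repository's actual types, and `effectiveVerdict` is a hypothetical name.

```typescript
// Assumed, simplified shapes; the real Codex output type may carry more fields.
interface Finding {
  title: string;
  priority: number | null;
}

interface ReviewOutput {
  findings: Finding[];
  overall_correctness: string;
}

// Hypothetical helper wrapping the override shown in the diff.
function effectiveVerdict(output: ReviewOutput): string {
  // P0 and P1 (priority <= 1) are blocking; a null priority never is.
  const hasBlockingFindings = output.findings.some(
    f => f.priority !== null && f.priority <= 1,
  );
  return hasBlockingFindings ? "Issues Found" : output.overall_correctness;
}

const withBlocker = effectiveVerdict({
  findings: [{ title: "crash on empty diff", priority: 1 }],
  overall_correctness: "mostly correct",
});
const withoutBlocker = effectiveVerdict({
  findings: [{ title: "naming", priority: null }, { title: "style", priority: 2 }],
  overall_correctness: "mostly correct",
});
console.log(withBlocker, "|", withoutBlocker); // prints Issues Found | mostly correct
```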
@@ -260,7 +275,10 @@ export async function startReviewServer(options: {

       if (job.provider === "claude" && meta.stdout) {
         const output = parseClaudeStreamOutput(meta.stdout);
-        if (!output) return;
+        if (!output) {
+          console.error(`[claude-review] Failed to parse output (${meta.stdout.length} bytes, last 200: ${meta.stdout.slice(-200)})`);
+          return;
+        }

         const total = output.summary.important + output.summary.nit + output.summary.pre_existing;
         job.summary = {
@@ -276,6 +294,20 @@ export async function startReviewServer(options: {
         }
         return;
       }
+
+      if (job.provider === "tour") {
+        const { summary } = await tour.onJobComplete({ job, meta });
+        if (summary) {
+          job.summary = summary;
+        } else {
+          // The process exited 0 but the model returned empty or malformed output
+          // and nothing was stored. Flip status so the client doesn't auto-open
+          // a successful-looking card that 404s on /api/tour/:id.
+          job.status = "failed";
+          job.error = TOUR_EMPTY_OUTPUT_ERROR;
+        }
+        return;
+      }
     },
   });
   const sharingEnabled =
@@ -415,6 +447,32 @@ export async function startReviewServer(options: {
   const server = createServer(async (req, res) => {
     const url = requestUrl(req);

+    // API: Get tour result
+    if (url.pathname.match(/^\/api\/tour\/[^/]+$/) && req.method === "GET") {
+      const jobId = url.pathname.slice("/api/tour/".length);
+      const result = tour.getTour(jobId);
+      if (!result) {
+        json(res, { error: "Tour not found" }, 404);
+        return;
+      }
+      json(res, result);
+      return;
+    }
+
+    // API: Save tour checklist state
+    const checklistMatch = url.pathname.match(/^\/api\/tour\/([^/]+)\/checklist$/);
+    if (checklistMatch && req.method === "PUT") {
+      const jobId = checklistMatch[1];
+      try {
+        const body = await parseBody(req) as { checked: boolean[] };
+        if (Array.isArray(body.checked)) tour.saveChecklist(jobId, body.checked);
+        json(res, { ok: true });
+      } catch {
+        json(res, { error: "Invalid JSON" }, 400);
+      }
+      return;
+    }
+
     if (url.pathname === "/api/diff" && req.method === "GET") {
       json(res, {
         rawPatch: currentPatch,
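
Note on the hunk above: the two tour routes are told apart by anchored regexes, and the `[^/]+` segment keeps `/api/tour/:jobId` from also matching `/api/tour/:jobId/checklist`. The PUT handler as written replies `{ ok: true }` even when `checked` is not an array and simply skips the save. The sketch below shows the route matching in isolation plus one possible stricter body check. `matchTourRoute`, `parseChecked`, and the `TourRoute` type are hypothetical names and are not part of the commit.

```typescript
type TourRoute =
  | { kind: "result"; jobId: string }
  | { kind: "checklist"; jobId: string }
  | null;

// Hypothetical router using the same two patterns as the handlers in the diff.
function matchTourRoute(method: string, pathname: string): TourRoute {
  const checklist = pathname.match(/^\/api\/tour\/([^/]+)\/checklist$/);
  if (checklist && method === "PUT") return { kind: "checklist", jobId: checklist[1] };

  const result = pathname.match(/^\/api\/tour\/([^/]+)$/);
  if (result && method === "GET") return { kind: "result", jobId: result[1] };

  return null;
}

// Hypothetical stricter validation: returns null unless `checked` is an
// array made up only of booleans, so a caller could answer 400 instead of 200.
function parseChecked(body: unknown): boolean[] | null {
  if (typeof body !== "object" || body === null) return null;
  const checked = (body as { checked?: unknown }).checked;
  if (!Array.isArray(checked)) return null;
  if (!checked.every(v => typeof v === "boolean")) return null;
  return checked as boolean[];
}

console.log(matchTourRoute("GET", "/api/tour/job-1"));
console.log(matchTourRoute("PUT", "/api/tour/job-1/checklist"));
console.log(parseChecked({ checked: [true, false] }));
```

Whether a malformed `checked` value should be rejected or silently ignored is a design choice; the commit takes the lenient path.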

apps/pi-extension/vendor.sh

Lines changed: 12 additions & 1 deletion
@@ -6,7 +6,7 @@ cd "$(dirname "$0")"

 mkdir -p generated generated/ai/providers

-for f in feedback-templates review-core storage draft project pr-provider pr-github pr-gitlab checklist integrations-common repo reference-common favicon resolve-file config external-annotation agent-jobs worktree html-to-markdown url-to-markdown; do
+for f in feedback-templates review-core storage draft project pr-provider pr-github pr-gitlab checklist integrations-common repo reference-common favicon resolve-file config external-annotation agent-jobs worktree html-to-markdown url-to-markdown tour; do
   src="../../packages/shared/$f.ts"
   printf '// @generated — DO NOT EDIT. Source: packages/shared/%s.ts\n' "$f" | cat - "$src" > "generated/$f.ts"
 done
@@ -21,6 +21,17 @@ for f in codex-review claude-review path-utils; do
     > "generated/$f.ts"
 done

+# tour-review lives in packages/server/tour/ — parent-relative imports and the
+# shared tour types package each map to the flat generated/ layout.
+for f in tour-review; do
+  src="../../packages/server/tour/$f.ts"
+  printf '// @generated — DO NOT EDIT. Source: packages/server/tour/%s.ts\n' "$f" | cat - "$src" \
+    | sed 's|from "\.\./vcs"|from "./review-core.js"|' \
+    | sed 's|from "\.\./pr"|from "./pr-provider.js"|' \
+    | sed 's|from "@plannotator/shared/tour"|from "./tour.js"|' \
+    > "generated/$f.ts"
+done
+
 for f in index types provider session-manager endpoints context base-session; do
   src="../../packages/ai/$f.ts"
   printf '// @generated — DO NOT EDIT. Source: packages/ai/%s.ts\n' "$f" | cat - "$src" > "generated/ai/$f.ts"
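
Note on the `tour-review` loop above: the three `sed` substitutions flatten the module's import specifiers to the `generated/` layout. The sketch below restates the same rewrites in TypeScript purely to illustrate their effect; `rewriteImports` and `IMPORT_REWRITES` are hypothetical names, the imported identifiers in the sample are made up, and the actual vendoring is done by the shell script.

```typescript
// The three rewrites from vendor.sh as [pattern, replacement] pairs.
const IMPORT_REWRITES: Array<[RegExp, string]> = [
  [/from "\.\.\/vcs"/, 'from "./review-core.js"'],
  [/from "\.\.\/pr"/, 'from "./pr-provider.js"'],
  [/from "@plannotator\/shared\/tour"/, 'from "./tour.js"'],
];

// Applies each rewrite to the first match on every line, which is what
// `sed 's|...|...|'` without the `g` flag does.
function rewriteImports(source: string): string {
  return source
    .split("\n")
    .map(line =>
      IMPORT_REWRITES.reduce(
        (acc, [pattern, replacement]) => acc.replace(pattern, replacement),
        line,
      ),
    )
    .join("\n");
}

// Sample input with made-up identifiers.
const sample = [
  'import { getDiff } from "../vcs";',
  'import type { PRInfo } from "../pr";',
  'import type { CodeTourOutput } from "@plannotator/shared/tour";',
  'import { other } from "../vcs/extra";',
].join("\n");

console.log(rewriteImports(sample));
```

Because each pattern includes the closing quote, a deeper specifier such as `"../vcs/extra"` is left untouched.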
