Skip to content

Commit e2bf664

Browse files
authored
fix(next-task): enforce strict review loop exit conditions (#235) (#238)
* fix(workflow-state): add missing phases and result field mappings Add 'pre-review-gates' and 'docs-update' to the PHASES array so that completePhase() correctly advances through the full workflow sequence. Add result field mappings for pre-review-gates, delivery-validation, and docs-update so phase results are stored properly. * fix(orchestrate-review): use completePhase instead of updateFlow Replace workflowState.updateFlow({ reviewResult: ... }) with workflowState.completePhase({ ... }) so the review-loop phase properly advances to delivery-validation when done. * fix(next-task): enforce strict review loop exit conditions Add startPhase('review-loop') at the beginning of Phase 9 across all 3 platform command files. Remove the "orchestrator may override after 3+ iterations" escape hatch. The review loop now only exits on: openCount===0, stall detection (same hash 2x), or 5-iteration hard limit. This ensures all issues are actually resolved before delivery. * test(workflow-state): add tests for new phase transitions Verify that completePhase correctly advances through the new phases: implementation -> pre-review-gates -> review-loop -> delivery-validation -> docs-update -> shipping. Also test that result fields (preReviewResult, reviewResult, deliveryResult, docsResult) are stored correctly. * fix(review-loop): address review findings from iteration 1 - Guard completePhase against unknown current phase (returns null instead of silently resetting to first phase) - Fix truthy check to use !== null && !== undefined for result storage - Add completePhase() call at end of Phase 9 in all 3 platform command files - Fix no-shortcuts-policy table: "3 iterations" -> "5 iterations, stall-safe" - Add PHASES ordering assertions to test suite - Add blocked review-loop test case (stall-detected path) - Add shipping -> complete boundary test - Add completePhase-with-unknown-phase guard test * fix(review-loop): address review findings from iteration 2 - Fix MAX_STALLS=1 to match documented behavior: stall triggers after 2 consecutive identical-hash iterations (was 3 with MAX_STALLS=2) - Add [WARN] log before completePhase null return on unknown phase - Hoist RESULT_FIELD_MAP to module scope (eliminates per-call allocation) - Add iteration-limit blocked reason test - Add falsy result storage test (validates result !== null fix) * fix(review-loop): address review findings from iteration 3 - Add completePhase() call at end of Phase 8 (pre-review-gates) in all platform command files so preReviewResult is stored in flow.json - Add completePhase extraction to adapter-transforms.js so OpenCode adapter preserves the completePhase instruction (was silently dropped by transformer) - Regenerate all affected OpenCode adapters with completePhase instructions * docs: add CHANGELOG entry for review loop fix (#235) Document the Phase 9 review loop exit condition fixes: MAX_STALLS reduction (2→1), completePhase usage in orchestrate-review, and new workflow-state phases (pre-review-gates, docs-update). * chore(enhance): add version field to orchestrate-review skill Per enhance analysis: add missing version: 5.1.0 to plugins/next-task/skills/orchestrate-review/SKILL.md and regenerate adapters to propagate to adapters/opencode/skills/orchestrate-review/SKILL.md * fix(adapter-transforms): ensure completePhase triggers JS block transform Move completePhase check to outer conditional so code blocks containing only workflowState.completePhase() (without other JS keywords) are still transformed to prose instructions in OpenCode adapters. Fixes Gemini code review comment on PR #238. * fix(review-loop): address PR review comments - console.error → console.warn for [WARN] log in completePhase (Copilot #4) - Phase 8: derive passed from actual gate results; include coverageResult in completePhase payload (Copilot #5, Codex #2) - Blocked review loop: ask user before advancing to delivery-validation with AskUserQuestion override/abort prompt (Copilot #6) * fix(orchestrate-review): use correct AskUserQuestion answer key Access blocked-path user response via response.answers[question] using the full question text as key (not the header label), with fallback for both old and new AskUserQuestion response shapes.
1 parent 5504b15 commit e2bf664

File tree

38 files changed

+507
-138
lines changed

38 files changed

+507
-138
lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1111

1212
### Fixed
1313

14+
- **`/next-task` review loop exit conditions** — The Phase 9 review loop now continues iterating until all issues are resolved or a stall is detected (MAX_STALLS reduced from 2 to 1: two consecutive identical-hash iterations = stall). The `orchestrate-review` skill now uses `completePhase()` instead of `updateFlow()` to properly advance workflow state. Added `pre-review-gates` and `docs-update` to the `PHASES` array and `RESULT_FIELD_MAP` in `workflow-state.js`, ensuring these phases can be tracked and resumed correctly. Fixes issue #235.
15+
1416
- **`/debate` command inline orchestration** — The `/debate` command now manages the full debate workflow directly (parse → resolve → execute → verdict), following the `/consult` pattern. The `debate-orchestrator` agent is now the programmatic entry point for other agents/workflows that need to spawn a debate via `Task()`. Fixes issue #231.
1517

1618
## [5.1.0] - 2026-02-18

__tests__/workflow-state.test.js

Lines changed: 126 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,14 @@ describe('workflow-state', () => {
5050
expect(PHASES).toContain('complete');
5151
expect(PHASES.indexOf('policy-selection')).toBeLessThan(PHASES.indexOf('complete'));
5252
});
53+
54+
test('new phases are in correct order relative to neighbors', () => {
55+
expect(PHASES.indexOf('implementation')).toBeLessThan(PHASES.indexOf('pre-review-gates'));
56+
expect(PHASES.indexOf('pre-review-gates')).toBeLessThan(PHASES.indexOf('review-loop'));
57+
expect(PHASES.indexOf('review-loop')).toBeLessThan(PHASES.indexOf('delivery-validation'));
58+
expect(PHASES.indexOf('delivery-validation')).toBeLessThan(PHASES.indexOf('docs-update'));
59+
expect(PHASES.indexOf('docs-update')).toBeLessThan(PHASES.indexOf('shipping'));
60+
});
5361
});
5462

5563
describe('tasks.json operations', () => {
@@ -205,6 +213,124 @@ describe('workflow-state', () => {
205213
});
206214
});
207215

216+
describe('new phases: pre-review-gates and docs-update', () => {
217+
beforeEach(() => {
218+
createFlow(
219+
{ id: '1', title: 'Test', source: 'manual' },
220+
{ stoppingPoint: 'merged' },
221+
testDir
222+
);
223+
});
224+
225+
test('PHASES contains pre-review-gates and docs-update', () => {
226+
expect(PHASES).toContain('pre-review-gates');
227+
expect(PHASES).toContain('docs-update');
228+
});
229+
230+
test('isValidPhase accepts pre-review-gates and docs-update', () => {
231+
expect(isValidPhase('pre-review-gates')).toBe(true);
232+
expect(isValidPhase('docs-update')).toBe(true);
233+
});
234+
235+
test('completePhase from implementation advances to pre-review-gates', () => {
236+
setPhase('implementation', testDir);
237+
completePhase(null, testDir);
238+
239+
const flow = readFlow(testDir);
240+
expect(flow.phase).toBe('pre-review-gates');
241+
expect(flow.status).toBe('in_progress');
242+
});
243+
244+
test('completePhase from pre-review-gates advances to review-loop', () => {
245+
setPhase('pre-review-gates', testDir);
246+
completePhase({ passed: true }, testDir);
247+
248+
const flow = readFlow(testDir);
249+
expect(flow.phase).toBe('review-loop');
250+
expect(flow.status).toBe('in_progress');
251+
expect(flow.preReviewResult).toEqual({ passed: true });
252+
});
253+
254+
test('completePhase from review-loop stores reviewResult and advances to delivery-validation', () => {
255+
setPhase('review-loop', testDir);
256+
completePhase({ approved: true, iterations: 2 }, testDir);
257+
258+
const flow = readFlow(testDir);
259+
expect(flow.phase).toBe('delivery-validation');
260+
expect(flow.status).toBe('in_progress');
261+
expect(flow.reviewResult).toEqual({ approved: true, iterations: 2 });
262+
});
263+
264+
test('completePhase from review-loop stores blocked result (stall-detected)', () => {
265+
setPhase('review-loop', testDir);
266+
completePhase({ approved: false, blocked: true, reason: 'stall-detected', remaining: { critical: 1 } }, testDir);
267+
268+
const flow = readFlow(testDir);
269+
expect(flow.phase).toBe('delivery-validation');
270+
expect(flow.reviewResult.approved).toBe(false);
271+
expect(flow.reviewResult.reason).toBe('stall-detected');
272+
});
273+
274+
test('completePhase from review-loop stores blocked result (iteration-limit)', () => {
275+
setPhase('review-loop', testDir);
276+
completePhase({ approved: false, blocked: true, reason: 'iteration-limit', remaining: { critical: 0, high: 2 } }, testDir);
277+
278+
const flow = readFlow(testDir);
279+
expect(flow.phase).toBe('delivery-validation');
280+
expect(flow.reviewResult.reason).toBe('iteration-limit');
281+
expect(flow.reviewResult.remaining.high).toBe(2);
282+
});
283+
284+
test('completePhase stores falsy result (result !== null fix)', () => {
285+
setPhase('pre-review-gates', testDir);
286+
completePhase({ passed: false, reason: 'lint-failure' }, testDir);
287+
288+
const flow = readFlow(testDir);
289+
expect(flow.preReviewResult).toBeDefined();
290+
expect(flow.preReviewResult.passed).toBe(false);
291+
});
292+
293+
test('completePhase from shipping advances to complete', () => {
294+
setPhase('shipping', testDir);
295+
completePhase(null, testDir);
296+
297+
const flow = readFlow(testDir);
298+
expect(flow.phase).toBe('complete');
299+
expect(flow.status).toBe('completed');
300+
});
301+
302+
test('completePhase returns null for unknown current phase', () => {
303+
setPhase('review-loop', testDir);
304+
// Manually corrupt the phase to an unknown value
305+
const currentFlow = readFlow(testDir);
306+
currentFlow.phase = 'nonexistent-phase';
307+
writeFlow(currentFlow, testDir);
308+
309+
const result = completePhase(null, testDir);
310+
expect(result).toBeNull();
311+
});
312+
313+
test('completePhase from delivery-validation advances to docs-update', () => {
314+
setPhase('delivery-validation', testDir);
315+
completePhase({ passed: true }, testDir);
316+
317+
const flow = readFlow(testDir);
318+
expect(flow.phase).toBe('docs-update');
319+
expect(flow.status).toBe('in_progress');
320+
expect(flow.deliveryResult).toEqual({ passed: true });
321+
});
322+
323+
test('completePhase from docs-update advances to shipping', () => {
324+
setPhase('docs-update', testDir);
325+
completePhase({ docsUpdated: true }, testDir);
326+
327+
const flow = readFlow(testDir);
328+
expect(flow.phase).toBe('shipping');
329+
expect(flow.status).toBe('in_progress');
330+
expect(flow.docsResult).toEqual({ docsUpdated: true });
331+
});
332+
});
333+
208334
describe('convenience functions', () => {
209335
test('getFlowSummary returns summary object', () => {
210336
createFlow(

adapters/codex/skills/next-task/SKILL.md

Lines changed: 16 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ This workflow exists because each step serves a purpose. Taking shortcuts defeat
1717
| Step | Purpose | What Happens If Skipped |
1818
|------|---------|------------------------|
1919
| Worktree creation | Parallel task isolation | Conflicts, lost work |
20-
| Review loop (3 iterations) | Catches bugs humans miss | Bugs ship to production |
20+
| Review loop (5 iterations, stall-safe) | Catches bugs humans miss | Bugs ship to production |
2121
| 3-minute CI wait | Auto-reviewers need time | Miss critical feedback |
2222
| Address all PR comments | Quality gate | Merge blocked, trust lost |
2323

@@ -338,6 +338,9 @@ For each fix:
338338
Use Edit tool to apply. Commit message: "fix: clean up AI slop"`
339339
});
340340
}
341+
342+
const gatesPassed = (deslop.fixes?.length || 0) === 0;
343+
workflowState.completePhase({ passed: gatesPassed, deslopFixes: deslop.fixes?.length || 0, coverageResult });
341344
```
342345
</phase-8>
343346
@@ -346,6 +349,10 @@ Use Edit tool to apply. Commit message: "fix: clean up AI slop"`
346349
347350
**Blocking gate** - Must run iterations before delivery validation.
348351
352+
```javascript
353+
workflowState.startPhase('review-loop');
354+
```
355+
349356
**CRITICAL**: You MUST spawn multiple parallel reviewer agents. Do NOT use a single generic reviewer.
350357
351358
### Step 1: Get Changed Files
@@ -415,13 +422,13 @@ For each finding, use Edit tool to apply the suggested fix. Commit after each ba
415422
416423
Repeat steps 3-5 until:
417424
- `openCount === 0` (all issues resolved) -> approved
418-
- 3+ iterations with only medium/low issues -> orchestrator may override
419-
- 5 iterations reached -> blocked
425+
- Same findings hash for 2 consecutive iterations (stall detected) -> blocked
426+
- 5 iterations reached (hard limit) -> blocked
420427
421428
### Review Iteration Rules
422429
- MUST run at least 1 full iteration with ALL 4 core reviewers
423430
- Do NOT use a single generic reviewer - spawn all specialists in parallel
424-
- Orchestrator may override after 3+ iterations if only medium/low issues remain
431+
- MUST continue while `openCount > 0`. Only stop on: openCount===0, stall detection, or 5-iteration hard limit
425432
- Do not skip directly to delivery validation
426433
- Do not claim "review passed" without spawning the reviewer agents
427434
@@ -436,6 +443,11 @@ After review loop completes, output:
436443
- Findings resolved: X critical, Y high, Z medium
437444
- Status: approved | blocked
438445
```
446+
447+
Then advance the workflow state:
448+
```javascript
449+
workflowState.completePhase({ approved, iterations, remaining });
450+
```
439451
</phase-9>
440452
441453
<phase-10>

adapters/opencode/agents/ci-monitor.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,7 @@ Agent MUST exit monitoring loop when ANY of these occur:
9292
4. ci-fixer reports unfixable issue - ESCALATE
9393

9494
- Phase: ci-wait
95+
- Call `workflowState.completePhase(result)` to advance workflow state
9596

9697

9798
## Output Format

adapters/opencode/agents/delivery-validator.md

Lines changed: 1 addition & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -74,13 +74,8 @@ delivery-validator (YOU ARE HERE)
7474

7575
## State Updates
7676

77-
```javascript
78-
// On success
79-
workflowState.completePhase({ approved: true, checks });
77+
- Call `workflowState.completePhase(result)` to advance workflow state
8078

81-
// On failure
82-
workflowState.failPhase('Validation failed', { failedChecks, fixInstructions });
83-
```
8479

8580
## Output Format
8681

adapters/opencode/agents/exploration-agent.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -117,6 +117,7 @@ git diff HEAD~20 -- ${RELEVANT_DIRS} --stat
117117
## Phase 10: Update State
118118

119119
- Phase: exploration
120+
- Call `workflowState.completePhase(result)` to advance workflow state
120121

121122

122123
## Output Format

adapters/opencode/agents/implementation-agent.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -133,7 +133,8 @@ Make atomic, meaningful commits:
133133

134134
After implementation completes:
135135

136-
*(JavaScript reference - not executable in OpenCode)*
136+
- Call `workflowState.completePhase(result)` to advance workflow state
137+
137138

138139
## [CRITICAL] WORKFLOW GATES - READ CAREFULLY
139140

adapters/opencode/commands/next-task.md

Lines changed: 17 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,7 @@ This workflow exists because each step serves a purpose. Taking shortcuts defeat
5050
| Step | Purpose | What Happens If Skipped |
5151
|------|---------|------------------------|
5252
| Worktree creation | Parallel task isolation | Conflicts, lost work |
53-
| Review loop (3 iterations) | Catches bugs humans miss | Bugs ship to production |
53+
| Review loop (5 iterations, stall-safe) | Catches bugs humans miss | Bugs ship to production |
5454
| 3-minute CI wait | Auto-reviewers need time | Miss critical feedback |
5555
| Address all PR comments | Quality gate | Merge blocked, trust lost |
5656

@@ -236,11 +236,9 @@ Spawn: `worktree-manager` (haiku)
236236

237237
**Last human interaction point.** Present plan via EnterPlanMode/ExitPlanMode.
238238

239-
```javascript
240-
EnterPlanMode();
241-
// User reviews and approves via ExitPlanMode
242-
workflowState.completePhase({ planApproved: true, plan });
243-
```
239+
- Use EnterPlanMode for user approval
240+
- Call `workflowState.completePhase(result)` to advance workflow state
241+
244242
</phase-6>
245243

246244
<phase-7>
@@ -260,6 +258,7 @@ workflowState.completePhase({ planApproved: true, plan });
260258
- Invoke `@deslop-agent` agent
261259
- Invoke `@test-coverage-checker` agent
262260
- Phase: pre-review-gates
261+
- Call `workflowState.completePhase(result)` to advance workflow state
263262

264263
</phase-8>
265264

@@ -268,6 +267,10 @@ workflowState.completePhase({ planApproved: true, plan });
268267

269268
**Blocking gate** - Must run iterations before delivery validation.
270269

270+
```javascript
271+
workflowState.startPhase('review-loop');
272+
```
273+
271274
**CRITICAL**: You MUST spawn multiple parallel reviewer agents. Do NOT use a single generic reviewer.
272275

273276
### Step 1: Get Changed Files
@@ -307,13 +310,13 @@ For each finding, use Edit tool to apply the suggested fix. Commit after each ba
307310

308311
Repeat steps 3-5 until:
309312
- `openCount === 0` (all issues resolved) -> approved
310-
- 3+ iterations with only medium/low issues -> orchestrator may override
311-
- 5 iterations reached -> blocked
313+
- Same findings hash for 2 consecutive iterations (stall detected) -> blocked
314+
- 5 iterations reached (hard limit) -> blocked
312315

313316
### Review Iteration Rules
314317
- MUST run at least 1 full iteration with ALL 4 core reviewers
315318
- Do NOT use a single generic reviewer - spawn all specialists in parallel
316-
- Orchestrator may override after 3+ iterations if only medium/low issues remain
319+
- MUST continue while `openCount > 0`. Only stop on: openCount===0, stall detection, or 5-iteration hard limit
317320
- Do not skip directly to delivery validation
318321
- Do not claim "review passed" without spawning the reviewer agents
319322

@@ -328,6 +331,10 @@ After review loop completes, output:
328331
- Findings resolved: X critical, Y high, Z medium
329332
- Status: approved | blocked
330333
```
334+
335+
Then advance the workflow state:
336+
- Call `workflowState.completePhase(result)` to advance workflow state
337+
331338
</phase-9>
332339

333340
<phase-10>
@@ -349,6 +356,7 @@ Uses the unified sync-docs agent from the sync-docs plugin with `before-pr` scop
349356

350357
- Invoke `@sync-docs-agent` agent
351358
- Phase: docs-update
359+
- Call `workflowState.completePhase(result)` to advance workflow state
352360

353361
</phase-11>
354362

adapters/opencode/commands/ship.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -78,7 +78,8 @@ Parse from $ARGUMENTS:
7878

7979
## State Integration
8080

81-
*(JavaScript reference - not executable in OpenCode)*
81+
- Call `workflowState.completePhase(result)` to advance workflow state
82+
8283

8384
## Phase 1: Pre-flight Checks
8485

adapters/opencode/skills/discover-tasks/SKILL.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -66,7 +66,8 @@ done
6666

6767
### Phase 5: Update State
6868

69-
*(JavaScript reference - not executable in OpenCode)*
69+
- Call `workflowState.completePhase(result)` to advance workflow state
70+
7071

7172
### Phase 6: Post Comment (GitHub only)
7273

0 commit comments

Comments
 (0)