Merge pull request #83 from dmoliveira/my_opencode-e14-validation-docs

dmoliveira · web-flow · commit 0b99496aa676 · 2026-02-13T21:58:23.000+11:00
Complete E14-T4 validation coverage and docs
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -47,6 +47,7 @@ All notable changes to this project are documented in this file.
 - Added `scripts/start_work_command.py` with `/start-work <plan>` execution, persisted checkpoint status, and deviation reporting (`status`, `deviations`).
 - Added `/start-work`, `/start-work-status`, and `/start-work-deviations` aliases in `opencode.json`.
 - Added `/start-work-bg` and `/start-work-doctor-json` aliases for background-safe queueing and execution health diagnostics.
+- Added `instructions/plan_execution_workflows.md` with sample plans and direct/background/recovery workflows for `/start-work`.
 
 ### Changes
 - Documented extension evaluation outcomes and when each tool is the better fit.
@@ -90,6 +91,7 @@ All notable changes to this project are documented in this file.
 - Expanded browser verification coverage to assert provider reset readiness and added install smoke checks that run browser status/doctor after switching across providers.
 - Expanded install/selftest coverage for `/start-work` plan validation, execution state persistence, and deviation diagnostics.
 - Expanded `/start-work` integrations with background queue handoff, digest recap payloads, and unified `/doctor` visibility.
+- Expanded `/start-work` validation coverage for missing frontmatter, out-of-order ordinals, and recovery from invalid runtime state.
 
 ## v0.2.0 - 2026-02-12
 
diff --git a/IMPLEMENTATION_ROADMAP.md b/IMPLEMENTATION_ROADMAP.md
@@ -50,7 +50,7 @@ Use this map to avoid overlapping implementations.
 | E11 | Context-Window Resilience Toolkit | done | High | E4 | bd-2tj, bd-n9y, bd-2t0, bd-18e | Improve long-session stability and recovery |
 | E12 | Provider/Model Fallback Visibility | done | Medium | E5 | bd-1jq, bd-298, bd-194, bd-2gq | Explain why model routing decisions happen |
 | E13 | Browser Automation Profile Switching | done | Medium | E1 | bd-3rs, bd-2qy, bd-f6g, bd-393 | Toggle Playwright/agent-browser with checks |
-| E14 | Plan-to-Execution Bridge Command | in_progress | Medium | E2, E3 | bd-1z6, bd-2te, bd-3sg | Execute validated plans with progress tracking |
+| E14 | Plan-to-Execution Bridge Command | done | Medium | E2, E3 | bd-1z6, bd-2te, bd-3sg, bd-2bv | Execute validated plans with progress tracking |
 | E15 | Todo Enforcer and Plan Compliance | planned | High | E14 | TBD | Keep execution aligned with approved checklists |
 | E16 | Comment and Output Quality Checker Loop | merged | Medium | E23 | TBD | Merged into E23 (PR Review Copilot) |
 | E17 | Auto-Resume and Recovery Loop | planned | High | E11, E14 | TBD | Resume interrupted work from checkpoints safely |
@@ -563,7 +563,7 @@ Every command-oriented epic must ship all of the following:
 
 ## Epic 14 - Plan-to-Execution Bridge Command
 
-**Status:** `in_progress`
+**Status:** `done`
 **Priority:** Medium
 **Goal:** Add a command to execute from an approved plan artifact with progress tracking and deviation reporting.
 **Depends on:** Epic 2, Epic 3
@@ -583,12 +583,13 @@ Every command-oriented epic must ship all of the following:
   - [x] Subtask 14.3.2: Integrate with digest summaries for end-of-run recap
   - [x] Subtask 14.3.3: Expose execution status in doctor/debug outputs
   - [x] Notes: Added background-safe `/start-work` queueing (`--background` + `/start-work-bg`), digest `plan_execution` recap output, and `/doctor` integration via `/start-work doctor --json`.
-- [ ] Task 14.4: Validation and docs
-  - [ ] Subtask 14.4.1: Add tests for plan parsing and execution flow
-  - [ ] Subtask 14.4.2: Add recovery tests for interrupted plan runs
-  - [ ] Subtask 14.4.3: Add docs with sample plans and workflows
-- [ ] Exit criteria: approved plans can be executed and resumed with clear state
-- [ ] Exit criteria: deviations are explicitly surfaced and reviewable
+- [x] Task 14.4: Validation and docs
+  - [x] Subtask 14.4.1: Add tests for plan parsing and execution flow
+  - [x] Subtask 14.4.2: Add recovery tests for interrupted plan runs
+  - [x] Subtask 14.4.3: Add docs with sample plans and workflows
+  - [x] Notes: Expanded `scripts/selftest.py` with additional plan validation/recovery checks and added `instructions/plan_execution_workflows.md` with sample plans plus direct/background/recovery workflows.
+- [x] Exit criteria: approved plans can be executed and resumed with clear state
+- [x] Exit criteria: deviations are explicitly surfaced and reviewable
 
 ---
 
diff --git a/README.md b/README.md
@@ -584,6 +584,7 @@ Recommended workflow:
 Epic 14 Task 14.1 defines the baseline plan format and execution-state rules for the upcoming `/start-work <plan>` command:
 
 - contract spec: `instructions/plan_artifact_contract.md`
+- validation/workflow guide: `instructions/plan_execution_workflows.md`
 - backend command: `scripts/start_work_command.py`
 - format scope: markdown checklist + YAML metadata frontmatter
 - validation scope: deterministic preflight failures with line-level remediation hints
diff --git a/instructions/plan_execution_workflows.md b/instructions/plan_execution_workflows.md
@@ -0,0 +1,55 @@
+# Plan Execution Validation and Workflow Guide
+
+This guide accompanies Epic 14 Task 14.4 and documents validated workflows for `/start-work`.
+
+## Sample plan artifact
+
+Use this baseline plan format:
+
+```markdown
+---
+id: sample-plan-001
+title: Sample implementation plan
+owner: diego
+created_at: 2026-02-13T00:00:00Z
+version: 1
+---
+
+# Plan
+
+- [ ] 1. Prepare implementation environment
+- [ ] 2. Apply code changes
+- [ ] 3. Run verification and summarize results
+```
+
+## Primary execution workflow
+
+1. Run `/start-work path/to/plan.md --json`.
+2. Verify state with `/start-work status --json`.
+3. Review deviations with `/start-work deviations --json`.
+4. Run `/start-work doctor --json` before handoff.
+
+## Background-safe workflow
+
+Use queued execution when you want a reviewable handoff through the background subsystem:
+
+1. Run `/start-work-bg path/to/plan.md`.
+2. Capture the returned `job_id`.
+3. Execute queued work with `/bg run --id <job-id>`.
+4. Inspect logs via `/bg read <job-id> --json`.
+5. Confirm final state using `/start-work status --json`.
+
+## Validation failure examples
+
+- Missing frontmatter should fail with `validation_failed` and a `missing_frontmatter` violation.
+- Out-of-order checklist ordinals should fail with `validation_failed` and an `out_of_order_ordinals` violation.
+- Non-numbered checklist items should fail with `validation_failed` and a `missing_step_ordinal` violation.
+
+## Recovery workflow
+
+When runtime state is inconsistent (for example, multiple steps marked `in_progress`):
+
+1. Run `/start-work doctor --json` to confirm failure diagnostics.
+2. Re-run a valid plan with `/start-work path/to/plan.md --json` to restore deterministic state.
+3. Re-check with `/start-work doctor --json` and `/doctor run --json`.
+4. Run `/digest run --reason manual` to capture the end-of-run recap, including the `plan_execution` summary.
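
The validation failure examples listed in `instructions/plan_execution_workflows.md` can be sketched as a small preflight check. This is a minimal illustration, not the logic in `scripts/start_work_command.py`: the function name `validate_plan`, both regular expressions, the `"ok"` success code, the `line` field, and the rule that ordinals must strictly increase are all assumptions. Only the `validation_failed` code and the three violation codes come from the guide and selftest above.

```python
import re

# Hypothetical patterns; the real parser may accept other checklist shapes.
CHECKLIST_ITEM = re.compile(r"^- \[[ xX]\] (.*)$")
STEP_ORDINAL = re.compile(r"^(\d+)\.\s+\S")


def validate_plan(text: str) -> dict:
    """Return a report with a `code` and a list of `violations` (assumed shape)."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    violations = []

    # Frontmatter must open with '---' on the first line and close with a later '---'.
    if stripped[:1] == ["---"] and "---" in stripped[1:]:
        body_start = stripped.index("---", 1) + 1
    else:
        violations.append({"code": "missing_frontmatter", "line": 1})
        body_start = 0

    previous = 0
    for number, line in enumerate(lines[body_start:], start=body_start + 1):
        item = CHECKLIST_ITEM.match(line)
        if item is None:
            continue  # not a top-level checklist item
        ordinal = STEP_ORDINAL.match(item.group(1))
        if ordinal is None:
            violations.append({"code": "missing_step_ordinal", "line": number})
            continue
        value = int(ordinal.group(1))
        # Assumption: ordinals must strictly increase from top to bottom.
        if value <= previous:
            violations.append({"code": "out_of_order_ordinals", "line": number})
        previous = max(previous, value)

    # "ok" is an assumed success code; only "validation_failed" appears in the source.
    code = "validation_failed" if violations else "ok"
    return {"code": code, "violations": violations}


if __name__ == "__main__":
    plan = "# Plan\n\n- [ ] 1. Missing metadata should fail\n"
    report = validate_plan(plan)
    print(report["code"], [v["code"] for v in report["violations"]])
```

Collecting every violation before returning, rather than stopping at the first, mirrors how the selftest scans the whole `violations` list for a specific code.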
diff --git a/scripts/selftest.py b/scripts/selftest.py
@@ -1840,6 +1840,122 @@ def run_bg(*args: str) -> subprocess.CompletedProcess[str]:
             "start-work should report validation_failed for invalid plan format",
         )
 
+        malformed_frontmatter_plan = tmp / "invalid_frontmatter_plan.md"
+        malformed_frontmatter_plan.write_text(
+            """# Plan
+
+- [ ] 1. Missing metadata should fail
+""",
+            encoding="utf-8",
+        )
+        malformed_start_work = subprocess.run(
+            [
+                sys.executable,
+                str(START_WORK_SCRIPT),
+                str(malformed_frontmatter_plan),
+                "--json",
+            ],
+            capture_output=True,
+            text=True,
+            env=refactor_env,
+            check=False,
+            cwd=REPO_ROOT,
+        )
+        expect(
+            malformed_start_work.returncode == 1,
+            "start-work should fail when frontmatter is missing",
+        )
+        malformed_start_work_report = parse_json_output(malformed_start_work.stdout)
+        expect(
+            malformed_start_work_report.get("code") == "validation_failed",
+            "start-work should return validation_failed for missing frontmatter",
+        )
+
+        out_of_order_plan = tmp / "invalid_out_of_order_plan.md"
+        out_of_order_plan.write_text(
+            """---
+id: out-of-order-plan
+title: Out Of Order Plan
+owner: selftest
+created_at: 2026-02-13T00:00:00Z
+version: 1
+---
+
+# Plan
+
+- [ ] 2. Second task appears first
+- [ ] 1. First task appears second
+""",
+            encoding="utf-8",
+        )
+        out_of_order_start_work = subprocess.run(
+            [sys.executable, str(START_WORK_SCRIPT), str(out_of_order_plan), "--json"],
+            capture_output=True,
+            text=True,
+            env=refactor_env,
+            check=False,
+            cwd=REPO_ROOT,
+        )
+        expect(
+            out_of_order_start_work.returncode == 1,
+            "start-work should fail on out-of-order step ordinals",
+        )
+        out_of_order_report = parse_json_output(out_of_order_start_work.stdout)
+        expect(
+            any(
+                violation.get("code") == "out_of_order_ordinals"
+                for violation in out_of_order_report.get("violations", [])
+                if isinstance(violation, dict)
+            ),
+            "start-work should surface out_of_order_ordinals violation",
+        )
+
+        runtime_config_path = Path(str(start_work_report.get("config") or ""))
+        expect(
+            runtime_config_path.exists(),
+            "start-work should report writable config path",
+        )
+        runtime_cfg = load_json_file(runtime_config_path)
+        runtime_cfg.setdefault("plan_execution", {})["status"] = "in_progress"
+        runtime_cfg["plan_execution"]["steps"] = [
+            {"ordinal": 1, "state": "in_progress"},
+            {"ordinal": 2, "state": "in_progress"},
+        ]
+        runtime_config_path.write_text(
+            json.dumps(runtime_cfg, indent=2) + "\n", encoding="utf-8"
+        )
+
+        start_work_doctor_fail = subprocess.run(
+            [sys.executable, str(START_WORK_SCRIPT), "doctor", "--json"],
+            capture_output=True,
+            text=True,
+            env=refactor_env,
+            check=False,
+            cwd=REPO_ROOT,
+        )
+        expect(
+            start_work_doctor_fail.returncode == 1,
+            "start-work doctor should fail on invalid in-progress step recovery state",
+        )
+        start_work_doctor_fail_report = parse_json_output(start_work_doctor_fail.stdout)
+        expect(
+            start_work_doctor_fail_report.get("result") == "FAIL",
+            "start-work doctor should report FAIL for invalid recovery state",
+        )
+
+        start_work_recover = subprocess.run(
+            [sys.executable, str(START_WORK_SCRIPT), str(plan_path), "--json"],
+            capture_output=True,
+            text=True,
+            env=refactor_env,
+            check=False,
+            cwd=REPO_ROOT,
+        )
+        expect(
+            start_work_recover.returncode == 0,
+            "start-work should recover by re-running valid plan after invalid runtime state",
+        )
+
         keyword_report = resolve_prompt_modes(
             "Please safe-apply and deep-analyze this migration; ulw can wait.",
             enabled=True,