This roadmap tracks the phased delivery of advanced orchestration features, motivated by gaps identified in a comparison with oh-my-opencode.
- Task completion: use `[ ]` and `[x]`.
- Epic status values: `planned`, `in_progress`, `paused`, `merged`, `done`, `postponed`.
- Recommendation: move only one epic to `in_progress` at a time.
- `planned`: scoped and ready, not started.
- `in_progress`: actively being implemented now.
- `paused`: started but intentionally stopped; can resume any time.
- `merged`: scope absorbed into another epic to avoid duplication.
- `postponed`: intentionally deferred; not expected this cycle.
- `done`: fully implemented, documented, and validated.
- `High`: foundational or high-risk controls; implement in near-term phases.
- `Medium`: meaningful acceleration; schedule after foundations stabilize.
- `Low`: optional power-user capability; defer when capacity is constrained.
Use this map to avoid overlapping implementations.
- `/start-work` (E14): executes a prepared plan artifact step-by-step.
- `/autoflow` (E22): unified orchestration wrapper for plan/todo/recovery/report primitives.
- `/autopilot` (E28): bounded objective runner on top of `/autoflow` with strict budget control.
- `/loop` (merged into E22/E28): optional loop controls, not a standalone roadmap epic.
- `/hotfix` (E25): constrained emergency path with mandatory minimum safeguards.
| Epic | Title | Status | Priority | Depends On | br Issue | Notes |
|---|---|---|---|---|---|---|
| E1 | Config Layering + JSONC Support | done | High | - | bd-1g0, bd-208, bd-4j1 | Foundation for most later epics |
| E2 | Background Task Orchestration | done | High | E1 | bd-1ob, bd-3lf, bd-2xo, bd-mb2 | Keep minimal and stable first |
| E3 | Refactor Workflow Command | done | High | E1 | bd-zfx, bd-vc3, bd-2ps, bd-3fr | Safer rollout after config layering |
| E4 | Continuation and Safety Hooks | done | Medium | E1, E2 | bd-1h0, bd-1ex, bd-1dr, bd-3uq | Start with minimal hooks only |
| E5 | Category-Based Model Routing | done | Medium | E1 | bd-2z6, bd-m48, bd-15y, bd-222 | Can partially overlap with E2/E3 |
| E6 | Session Intelligence and Resume Tooling | done | Medium | E2 | bd-23l, bd-e5q, bd-3ju | Completed session indexing, command surface, and resume-hint workflows |
| E7 | Tmux Visual Multi-Agent Mode | postponed | Low | E2 | TBD | Optional power-user feature |
| E8 | Keyword-Triggered Execution Modes | done | High | E1, E4 | bd-302, bd-2fb, bd-2zq, bd-3dp | Fast power-mode activation from prompt text |
| E9 | Conditional Rules Injector | done | High | E1 | bd-1q8, bd-3rj, bd-fo8, bd-2ik | Enforce project conventions with scoped rules |
| E10 | Auto Slash Command Detector | done | Medium | E1, E8 | bd-wbo, bd-3nv | Shipped preview-first intent mapping for doctor/stack/nvim/devtools with audit logging |
| E11 | Context-Window Resilience Toolkit | done | High | E4 | bd-2tj, bd-n9y, bd-2t0, bd-18e | Improve long-session stability and recovery |
| E12 | Provider/Model Fallback Visibility | done | Medium | E5 | bd-1jq, bd-298, bd-194, bd-2gq | Explain why model routing decisions happen |
| E13 | Browser Automation Profile Switching | done | Medium | E1 | bd-3rs, bd-2qy, bd-f6g, bd-393 | Toggle Playwright/agent-browser with checks |
| E14 | Plan-to-Execution Bridge Command | done | Medium | E2, E3 | bd-1z6, bd-2te, bd-3sg, bd-2bv | Execute validated plans with progress tracking |
| E15 | Todo Enforcer and Plan Compliance | done | High | E14 | bd-l9c | Keep execution aligned with approved checklists |
| E16 | Comment and Output Quality Checker Loop | merged | Medium | E23 | TBD | Merged into E23 (PR Review Copilot) |
| E17 | Auto-Resume and Recovery Loop | done | High | E11, E14 | bd-1ho, bd-2vc, bd-1yz, bd-2xk | Resume interrupted work from checkpoints safely |
| E18 | LSP/AST-Assisted Safe Edit Mode | done | High | E3 | bd-3cg, bd-2ln, bd-10l, bd-lds | Prefer semantic edits over plain text replacements |
| E19 | Session Checkpoint Snapshots | done | Medium | E2, E17 | bd-553, bd-3tb, bd-3um, bd-3gm | Durable state for rollback and restart safety |
| E20 | Execution Budget Guardrails | done | High | E2, E11 | bd-63f | Bound time/tool/token usage for autonomous runs |
| E21 | Bounded Loop Mode Presets | merged | Medium | E22, E28 | TBD | Merged into E22/E28 loop controls |
| E22 | Autoflow Unified Orchestration Command | done | High | E14, E15, E17, E19, E20 | TBD | One command for plan-run-resume-report lifecycle |
| E23 | PR Review Copilot | done | High | E3 | bd-u6t | Pre-PR quality, output, and risk review automation |
| E24 | Release Train Assistant | done | High | E14, E23 | bd-nk3 | Validate, draft, and gate releases reliably |
| E25 | Incident Hotfix Mode | done | Medium | E20, E22 | bd-kow | Constrained emergency workflow with strict safety |
| E26 | Repo Health Score and Drift Monitor | done | Medium | E9, E12, E20 | TBD | Operational visibility and continuous diagnostics |
| E27 | Knowledge Capture from Completed Tasks | done | Medium | E9, E14, E23 | TBD | Convert delivered work into reusable team memory |
| E28 | Autopilot Objective Runner Command | done | High | E20, E22 | TBD | Completed bounded objective lifecycle with verification and install smoke coverage |
- Keep migration stable-first: ship low-risk foundations before advanced orchestration.
- Prefer additive changes and compatibility fallbacks over breaking behavior.
- Do not expand to unrelated feature areas during in-progress epics.
Start an epic only when all are true:
- Clear user pain is documented and measurable.
- Existing command/profile cannot solve the problem with small changes.
- Expected value is higher than maintenance cost after launch.
- Rollback path is defined and tested.
If any condition is not met, keep the epic paused or postponed.
Promotion from paused/postponed to active planning requires all of the following:
- At least two recent operator pain reports linked to the epic scope.
- A bounded prototype plan with explicit success metrics and rollback steps.
- Validation impact estimate (`make validate`, `make selftest`, `make install-test`) documented before implementation.
- A no-surprise execution mode (preview-first or opt-in) for any automation that can trigger commands.
Epic-specific trigger:
- E7 (Tmux Visual Mode): confirmed demand from at least two active workflows and a fallback UX that leaves non-tmux users unaffected.
- Prefer extending existing commands over introducing new top-level commands.
- Prefer one robust implementation path over multiple experimental variants.
- Defer optional UX layers until core reliability/diagnostics are stable.
- Dependencies must reference earlier or same-phase epics only (no forward references).
- Avoid circular dependencies; when uncertain, split shared prerequisites into a separate task.
- If an epic dependency changes, update both the epic block and dashboard row in the same PR.
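
The dependency rules above can be checked mechanically. The following is a minimal sketch, assuming the dashboard's "Depends On" column has already been parsed into a dictionary; the function names are hypothetical, and the epic number is used as a stand-in for phase order, which is a simplification (merged epics such as E16 would be flagged under it).

```python
from typing import Dict, List, Tuple


def epic_number(epic_id: str) -> int:
    """Parse 'E14' into 14. Epic number approximates phase order (an assumption)."""
    return int(epic_id[1:])


def find_forward_references(deps: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (epic, dependency) pairs where the dependency has a higher number."""
    return sorted(
        (epic, dep)
        for epic, dep_list in deps.items()
        for dep in dep_list
        if epic_number(dep) > epic_number(epic)
    )


def find_cycle(deps: Dict[str, List[str]]) -> List[str]:
    """Depth-first search; return one dependency cycle as a path, or [] if none."""
    in_progress, finished = set(), set()

    def visit(node: str, path: List[str]) -> List[str]:
        in_progress.add(node)
        for dep in deps.get(node, []):
            if dep in in_progress:
                # Back edge: slice the path from the first visit of `dep`.
                return path[path.index(dep):] + [dep]
            if dep not in finished:
                found = visit(dep, path + [dep])
                if found:
                    return found
        in_progress.discard(node)
        finished.add(node)
        return []

    for node in deps:
        if node not in finished:
            found = visit(node, [node])
            if found:
                return found
    return []


# A few rows taken from the dashboard table.
dashboard = {"E14": ["E2", "E3"], "E15": ["E14"], "E16": ["E23"]}
print(find_forward_references(dashboard))  # [('E16', 'E23')]
print(find_cycle(dashboard))               # []
```
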
For high-risk automation epics (E20, E22, E28), require:
- A prototype phase with success/failure metrics before full implementation.
- A kill-switch and rollback checklist in the first delivery PR.
- A post-release observation window with explicit go/no-go decision.
- Full rewrite of existing command scripts in a new language/runtime.
- Broad UI redesign of docs/install flows unrelated to orchestration objectives.
- Large provider/model benchmarking initiatives beyond routing correctness.
Every command-oriented epic must ship all of the following:
- README updates with command purpose and options.
- At least 3 practical examples (basic, intermediate, failure/recovery).
- One end-to-end workflow showing where the command maximizes throughput.
- Prefer one concrete verb per subtask (`define`, `implement`, `integrate`, `verify`, `document`).
- Avoid duplicate subtasks when covered by cross-cutting criteria in `Task C2`.
- Keep subtask text implementation-specific; move generic policy wording to shared sections.
- Epic moved to `in_progress` in this file and dashboard row updated.
- Matching `br` issue created and linked in dashboard.
- Worktree branch created using full workflow.
- Success metrics and risk notes reviewed before implementation starts.
- All epic tasks/subtasks and exit criteria checked `[x]`.
- Docs and tests updated and validated (`make validate`, `make selftest`, `make install-test`).
- PR merged and cleanup completed (branch/worktree removed, main synced).
- Epic status moved to `done`, dashboard row updated, and weekly log updated.
| Risk | Impact | Mitigation |
|---|---|---|
| Config precedence bugs (E1) break expected behavior | High | Add golden-file tests for precedence + compatibility fallback |
| Background jobs leak state/processes (E2) | High | Add retention cleanup, stale timeout, and cancel-safe handling |
| Refactor workflow too aggressive (E3) | Medium | Default to safe mode and require verification gates |
| Hooks generate noise/regressions (E4) | Medium | Keep hooks opt-in/disableable with deterministic ordering |
| Model routing confusion (E5) | Medium | Expose effective resolution in doctor output and docs |
| Session index growth (E6) | Low | Add retention policy and cleanup command |
| Tmux complexity support burden (E7) | Low | Keep postponed unless strong usage signal appears |
| Keyword mode false positives (E8) alter behavior unexpectedly | Medium | Require explicit keywords and add safe opt-out switch |
| Rules injector over-constrains outputs (E9) | Medium | Add precedence, conflict reporting, and per-rule disable |
| Auto slash misfires on normal prompts (E10) | Medium | Add confidence threshold and preview-before-run mode |
| Context pruning removes needed evidence (E11) | High | Protect critical tools/messages and keep reversible summaries |
| Fallback reporting leaks noisy internals (E12) | Low | Keep verbose chain behind debug/doctor views only |
| Browser profile setup drift (E13) | Medium | Add doctor checks and install verification scripts |
| Plan execution diverges from approved plan (E14) | Medium | Lock plan snapshot and require explicit deviation notes |
| Todo enforcer blocks valid edge workflows (E15) | Medium | Add bypass with explicit annotation + audit trail |
| Auto-resume repeats harmful action (E17) | High | Require idempotency checks and last-step verification |
| LSP/AST mode unavailable in some repos (E18) | Medium | Provide graceful fallback to safe text-mode edits |
| Checkpoint snapshots grow too quickly (E19) | Low | Add retention cap and compression/rotation |
| Budget guardrails too strict for complex tasks (E20) | Medium | Provide profile-based limits and controlled override |
| Autoflow hides too much control and confuses users (E22) | Medium | Keep subcommands explicit and expose dry-run plus explain mode |
| PR copilot misses critical regressions (E23) | Medium | Blend deterministic checks with configurable risk heuristics |
| Release assistant automates wrong tag/version (E24) | High | Enforce explicit version confirmation and dry-run output |
| Hotfix mode bypasses important checks (E25) | High | Keep mandatory minimum verification and post-hotfix audit |
| Health score becomes noisy and ignored (E26) | Medium | Weight high-signal checks and suppress repetitive noise |
| Knowledge capture stores low-quality patterns (E27) | Medium | Add approval workflow and confidence scoring before publish |
| Autopilot over-automation causes unintended actions (E28) | High | Keep objective scope limits, dry-run default, and hard budget caps |
Status: done
Priority: High
Goal: Add user/project layered config and JSONC parsing so behavior can be customized per repo without mutating global defaults.
Depends on: None
- Task 1.1: Define configuration precedence and file discovery
- Subtask 1.1.1: Document precedence order (`project` > `user` > bundled defaults)
- Subtask 1.1.2: Define file paths (`.opencode/my_opencode.jsonc`, `.opencode/my_opencode.json`, `~/.config/opencode/my_opencode.jsonc`, `~/.config/opencode/my_opencode.json`)
- Subtask 1.1.3: Define merge semantics (object merge, array replacement, explicit overrides)
- Task 1.2: Implement config loader module
- Subtask 1.2.1: Create parser supporting JSON and JSONC
- Subtask 1.2.2: Implement precedence-based merge and validation
- Subtask 1.2.3: Add schema validation and actionable error messages
- Task 1.3: Integrate layered config into command scripts
- Subtask 1.3.1: Wire loader into `mcp/plugin/notify/telemetry/post-session/policy/stack/nvim/devtools` flows
- Subtask 1.3.2: Keep existing env var overrides as highest-priority runtime override
- Subtask 1.3.3: Add compatibility fallback when only legacy files exist
- Task 1.4: Documentation and tests
- Subtask 1.4.1: Add docs with examples for user/project overrides
- Subtask 1.4.2: Add selftests for precedence and JSONC behavior
- Subtask 1.4.3: Add install-test coverage for layered config discovery
- Exit criteria: all command scripts resolve config through shared layered loader
- Exit criteria: precedence + JSONC behavior covered by tests and docs
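
The precedence and merge semantics from Task 1.1 (project over user over bundled defaults, objects merged, arrays replaced) can be illustrated with a short sketch. This is not the shared loader itself: the function names and the config keys are hypothetical, the comment stripper handles only `//` and `/* */` comments (not trailing commas), and the env var override layer from Subtask 1.3.2 is left out.

```python
import json
from typing import Any, Dict


def strip_jsonc_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def merge_layers(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Objects merge recursively; arrays and scalars are replaced by the override."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_layers(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(bundled: Dict[str, Any], user_text: str, project_text: str) -> Dict[str, Any]:
    """Apply layers lowest to highest precedence: bundled, then user, then project."""
    result = bundled
    for text in (user_text, project_text):
        result = merge_layers(result, json.loads(strip_jsonc_comments(text)))
    return result


# Hypothetical keys, for illustration only.
bundled = {"notify": {"enabled": True, "events": ["complete", "error"]}}
user = '{ // user layer\n "notify": {"events": ["error"]} }'
project = '{ /* project layer */ "telemetry": {"enabled": true} }'
print(resolve_config(bundled, user, project))
```
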
Status: done
Priority: High
Goal: Add lightweight background job workflows for async research and result retrieval.
Depends on: Epic 1
- Task 2.1: Design minimal background task model
- Subtask 2.1.1: Define job lifecycle (`queued`, `running`, `completed`, `failed`, `cancelled`)
- Subtask 2.1.2: Define persistent state file format and retention policy
- Subtask 2.1.3: Define maximum concurrency and stale-timeout defaults
- Notes: See `instructions/background_task_model.md` for lifecycle transitions, storage schema, and deterministic defaults.
- Task 2.2: Implement background task manager script
- Subtask 2.2.1: Add enqueue/run/read/list/cancel operations
- Subtask 2.2.2: Capture stdout/stderr and execution metadata
- Subtask 2.2.3: Add stale job detection and cleanup
- Notes: Implemented in `scripts/background_task_manager.py` with deterministic selftest coverage.
- Task 2.3: Expose OpenCode commands
- Subtask 2.3.1: Add `/bg` command family (`start|status|list|read|cancel`)
- Subtask 2.3.2: Add autocomplete shortcuts for high-frequency operations
- Subtask 2.3.3: Integrate with `/doctor` summary checks
- Notes: Added `/bg` command + shortcuts in `opencode.json` and wired `bg` diagnostics into `scripts/doctor_command.py`.
- Task 2.4: Notifications and diagnostics
- Subtask 2.4.1: Add optional completion notification via existing notify stack
- Subtask 2.4.2: Add JSON diagnostics output for background subsystem
- Subtask 2.4.3: Add docs and examples for async workflows
- Notes: `scripts/background_task_manager.py` now emits optional notify-aligned alerts and exposes richer `status --json`/`doctor --json` diagnostics.
- Exit criteria: background workflows are deterministic, inspectable, and cancel-safe
- Exit criteria: doctor + docs cover baseline troubleshooting
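
The job lifecycle in Subtask 2.1.1 lends itself to an explicit transition table. The sketch below uses the five state names from the roadmap, but the allowed transitions are an assumption made for illustration; the authoritative transitions are described in `instructions/background_task_model.md`.

```python
# Assumed transitions; the three terminal states accept no further changes.
ALLOWED_TRANSITIONS = {
    "queued": {"running", "cancelled"},
    "running": {"completed", "failed", "cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


state = transition("queued", "running")
state = transition(state, "completed")
print(state)  # completed
```

Rejecting moves out of terminal states is one way to keep cancellation safe: a late `cancel` on a finished job fails loudly instead of rewriting history.
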
Status: done
Priority: High
Goal: Add a safe, repeatable refactor workflow command using existing tools and verification gates.
Depends on: Epic 1
- Task 3.1: Define command contract
- Subtask 3.1.1: Define syntax (`/refactor-lite <target> [--scope] [--strategy]`)
- Subtask 3.1.2: Define safe defaults and guardrails (`safe` by default)
- Subtask 3.1.3: Define success/failure output shape
- Notes: See `instructions/refactor_lite_contract.md`.
- Task 3.2: Implement workflow backend
- Subtask 3.2.1: Add preflight analysis step (grep + file map)
- Subtask 3.2.2: Add structured plan preview output
- Subtask 3.2.3: Add post-change verification hooks (`make validate`, optional `make selftest`)
- Notes: Implemented in `scripts/refactor_lite_command.py` with deterministic selftest coverage.
- Task 3.3: OpenCode integration
- Subtask 3.3.1: Add `/refactor-lite` and helper commands to `opencode.json`
- Subtask 3.3.2: Add installer self-check hints
- Subtask 3.3.3: Add `/doctor` optional check when command is configured
- Notes: Added `/refactor-lite` templates, installer hints, and optional `refactor-lite` doctor check.
- Task 3.4: Tests and docs
- Subtask 3.4.1: Add selftest scenarios for argument parsing and safe-mode behavior
- Subtask 3.4.2: Add docs for safe vs aggressive strategies
- Subtask 3.4.3: Add install-test smoke checks
- Notes: Expanded `/refactor-lite` selftests for missing-target and safe-mode ambiguity handling, plus install smoke coverage.
- Exit criteria: safe mode is default and validates before completion
- Exit criteria: failure output gives actionable remediation
Status: done
Priority: Medium
Goal: Add minimal lifecycle automation hooks for continuation and resilience without introducing heavy complexity.
Depends on: Epic 1, Epic 2
- Task 4.1: Hook framework baseline
- Subtask 4.1.1: Define hook events (`PreToolUse`, `PostToolUse`, `Stop`) for our scope
- Subtask 4.1.2: Define hook config and disable list
- Subtask 4.1.3: Implement deterministic execution order
- Notes: Added `scripts/hook_framework.py` baseline planner and selftest coverage for deterministic ordering + disabled hook filtering.
- Task 4.2: Initial hooks
- Subtask 4.2.1: Add continuation reminder hook for unfinished explicit checklists
- Subtask 4.2.2: Add output truncation safety hook for large tool outputs
- Subtask 4.2.3: Add basic error recovery hint hook for common command failures
- Notes: Added `scripts/hook_actions.py` and `/hooks` command for continuation reminders, truncation safety, and common failure recovery hints.
- Task 4.3: Governance and controls
- Subtask 4.3.1: Add opt-out per hook via config
- Subtask 4.3.2: Add telemetry-safe logging for hook actions
- Subtask 4.3.3: Add docs for enabling/disabling hooks
- Notes: Added `/hooks` config controls (`enable`, `disable`, per-hook toggle), telemetry-safe hook audit logging, and governance docs.
- Task 4.4: Verification
- Subtask 4.4.1: Add selftests for hook order and disable behavior
- Subtask 4.4.2: Add install-test smoke checks
- Subtask 4.4.3: Add doctor check summary for hook health
- Notes: Added `/hooks doctor --json`, included hook health in unified `/doctor`, and expanded deterministic selftests/install smoke for hook controls.
- Exit criteria: hooks are optional, predictable, and low-noise by default
- Exit criteria: disabling individual hooks is tested and documented
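
Deterministic ordering combined with a disable list (Subtasks 4.1.2 and 4.1.3) could look roughly like the sketch below. The event names come from the roadmap; the `Hook` shape, the `priority` field, the hook ids, and the tie-break by id are assumptions and may differ from `scripts/hook_framework.py`.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Hook:
    hook_id: str          # hypothetical identifier
    event: str            # "PreToolUse", "PostToolUse", or "Stop"
    priority: int = 100   # assumed field: lower values run first


def plan_hooks(
    hooks: Iterable[Hook],
    event: str,
    disabled: FrozenSet[str] = frozenset(),
    enabled: bool = True,
) -> List[str]:
    """Return hook ids for `event` in a stable order, skipping disabled hooks."""
    if not enabled:  # global switch: hooks are optional
        return []
    active = [h for h in hooks if h.event == event and h.hook_id not in disabled]
    # Sort by (priority, id) so the plan never depends on registration order.
    return [h.hook_id for h in sorted(active, key=lambda h: (h.priority, h.hook_id))]


hooks = [
    Hook("truncate-safety", "PostToolUse", 50),
    Hook("error-hints", "PostToolUse", 50),
    Hook("continuation-reminder", "Stop"),
]
print(plan_hooks(hooks, "PostToolUse"))                            # ['error-hints', 'truncate-safety']
print(plan_hooks(hooks, "PostToolUse", frozenset({"error-hints"})))  # ['truncate-safety']
```
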
Status: done
Priority: Medium
Goal: Introduce category presets (quick/deep/visual/writing) for better cost/performance model routing.
Depends on: Epic 1
- Task 5.1: Define category schema
- Subtask 5.1.1: Define baseline categories and descriptions
- Subtask 5.1.2: Define category settings (`model`, `temperature`, `reasoning`, `verbosity`)
- Subtask 5.1.3: Define fallback behavior when model is unavailable
- Notes: Added baseline schema/validation/resolution helpers in `scripts/model_routing_schema.py` and schema docs in `instructions/model_routing_schema.md`.
- Task 5.2: Implement resolution engine
- Subtask 5.2.1: Resolve from user override -> category default -> system default
- Subtask 5.2.2: Add deterministic fallback logging for diagnostics
- Subtask 5.2.3: Add integration points with `/stack` and wizard profiles
- Notes: Added `scripts/model_routing_command.py`, deterministic resolution trace in `resolve_model_settings`, and stack/wizard integration for model profile selection.
- Task 5.3: UX and docs
- Subtask 5.3.1: Add `/model-profile` command surface
- Subtask 5.3.2: Document practical routing examples by workload
- Subtask 5.3.3: Add doctor visibility for effective routing
- Notes: Added `/model-profile` aliases over routing backend, practical workload guidance in README, and `model-routing` coverage in unified `/doctor`.
- Task 5.4: Verification
- Subtask 5.4.1: Add tests for precedence and fallback
- Subtask 5.4.2: Add tests for stack integration
- Subtask 5.4.3: Add install-test checks
- Notes: Added deterministic fallback-reason assertions in selftest and expanded install smoke routing resolve scenarios.
- Exit criteria: effective model resolution is visible and explainable
- Exit criteria: fallback behavior is deterministic and tested
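
The resolution order in Subtask 5.2.1 (user override, then category default, then system default) with a fallback trace can be sketched as follows. This is an illustration only: `resolve_model` is a hypothetical name rather than the project's `resolve_model_settings`, and the model names and result codes are invented.

```python
from typing import Any, Dict, Optional, Set


def resolve_model(
    category: str,
    user_override: Optional[str],
    category_defaults: Dict[str, str],
    system_default: str,
    available: Set[str],
) -> Dict[str, Any]:
    """Try each source in fixed order and record why each candidate was skipped."""
    candidates = [
        ("user_override", user_override),
        ("category_default", category_defaults.get(category)),
        ("system_default", system_default),
    ]
    trace = []
    for source, model in candidates:
        if model is None:
            trace.append({"source": source, "model": None, "result": "not_set"})
        elif model not in available:
            trace.append({"source": source, "model": model, "result": "unavailable"})
        else:
            trace.append({"source": source, "model": model, "result": "selected"})
            return {"model": model, "source": source, "trace": trace}
    return {"model": None, "source": None, "trace": trace}


resolved = resolve_model(
    category="deep",
    user_override="model-x",                   # hypothetical, not available
    category_defaults={"deep": "model-deep"},  # hypothetical names
    system_default="model-default",
    available={"model-deep", "model-default"},
)
print(resolved["model"], resolved["source"])  # model-deep category_default
```

Because the candidate order is fixed and every skipped candidate is recorded, the same inputs always produce the same trace, which is what makes the fallback explainable in doctor output.
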
Status: done
Priority: Medium
Goal: Add lightweight session listing/search and structured resume cues.
Depends on: Epic 2
- Task 6.1: Session metadata index
- Subtask 6.1.1: Define session metadata store format
- Subtask 6.1.2: Record key events and timestamps
- Subtask 6.1.3: Add retention and cleanup strategy
- Notes: Added `scripts/session_metadata_index.py` with deterministic index schema (`~/.config/opencode/sessions/index.json`), digest-event ingestion, and retention controls (`max_sessions`, `max_age_days`, `max_events_per_session`) sourced from layered config.
- Task 6.2: Session commands
- Subtask 6.2.1: Add `/session list`
- Subtask 6.2.2: Add `/session show <id>`
- Subtask 6.2.3: Add `/session search <query>`
- Notes: Added `scripts/session_command.py` with `list|show|search|doctor` surfaces, wired `/session*` aliases in `opencode.json`, and integrated session command checks into installer self-check and unified doctor coverage.
- Task 6.3: Resume support
- Subtask 6.3.1: Add `resume-hints` output after interrupted workflows
- Subtask 6.3.2: Add docs for common recovery playbooks
- Subtask 6.3.3: Add optional integration with digest summaries
- Notes: Added shared `resume_hints` generation in recovery engine and surfaced it in `/resume status|now` and `/start-work recover` responses; digest `plan_execution` snapshots now include resume eligibility/hints for lightweight recovery cues.
- Exit criteria: sessions are searchable and resume hints are practical
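
The retention controls named in Task 6.1 (`max_sessions`, `max_age_days`) might be applied along these lines. The record shape (`id`, `updated_at`) and the function name are assumptions for illustration; the actual index schema lives in `scripts/session_metadata_index.py`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def apply_retention(
    sessions: List[Dict[str, Any]],
    now: datetime,
    max_sessions: int,
    max_age_days: int,
) -> List[Dict[str, Any]]:
    """Drop sessions older than the age limit, then keep the newest N."""
    cutoff = now - timedelta(days=max_age_days)
    fresh = [s for s in sessions if s["updated_at"] >= cutoff]
    # Newest first; the id breaks ties so the result is deterministic.
    fresh.sort(key=lambda s: (s["updated_at"], s["id"]), reverse=True)
    return fresh[:max_sessions]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
sessions = [
    {"id": "a", "updated_at": now - timedelta(days=1)},
    {"id": "b", "updated_at": now - timedelta(days=30)},
    {"id": "c", "updated_at": now - timedelta(days=2)},
    {"id": "d", "updated_at": now - timedelta(days=11)},
]
kept = apply_retention(sessions, now, max_sessions=2, max_age_days=14)
print([s["id"] for s in kept])  # ['a', 'c']
```
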
Status: postponed
Priority: Low
Goal: Add optional tmux pane orchestration for observing background jobs in real time.
Depends on: Epic 2
Postpone reason: deliver core orchestration reliability before adding visual runtime complexity.
- Task 7.1: Design tmux mode constraints
- Subtask 7.1.1: Define supported layouts and minimum pane sizes
- Subtask 7.1.2: Define server mode and attach requirements
- Subtask 7.1.3: Define safe fallback when not inside tmux
- Task 7.2: Implement tmux integration
- Subtask 7.2.1: Spawn background jobs in dedicated panes
- Subtask 7.2.2: Stream status and auto-close completed panes
- Subtask 7.2.3: Add pane naming and collision handling
- Task 7.3: UX and docs
- Subtask 7.3.1: Add `/tmux` status/config helpers
- Subtask 7.3.2: Add shell helper snippets for macOS/Linux
- Subtask 7.3.3: Add troubleshooting for pane/orphan cleanup
- Exit criteria: feature is opt-in, non-disruptive, and gracefully degrades outside tmux
Status: done
Priority: High
Goal: Enable explicit keywords (for example, `ulw`) to activate high-value execution modes without manual command chaining.
Depends on: Epic 1, Epic 4
- Task 8.1: Define keyword dictionary and behavior mapping
- Subtask 8.1.1: Define reserved keywords (`ulw`, `deep-analyze`, `parallel-research`, `safe-apply`)
- Subtask 8.1.2: Define mode side-effects and precedence rules
- Subtask 8.1.3: Define explicit opt-out syntax and defaults
- Notes: Added `instructions/keyword_execution_modes.md` with deterministic keyword matching, precedence, conflict handling, and opt-out syntax.
- Task 8.2: Implement keyword detector engine
- Subtask 8.2.1: Parse user prompts and resolve keyword intents
- Subtask 8.2.2: Apply mode flags to runtime execution context
- Subtask 8.2.3: Add conflict handling when multiple keywords appear
- Notes: Added `scripts/keyword_mode_schema.py` + `scripts/keyword_mode_command.py` for deterministic token matching, precedence-aware conflict handling, and persisted keyword mode runtime context.
- Task 8.3: User visibility and control
- Subtask 8.3.1: Add status command for active mode stack
- Subtask 8.3.2: Add config toggles to disable selected keywords
- Subtask 8.3.3: Document examples and anti-patterns
- Notes: Extended `/keyword-mode` with global enable/disable and per-keyword toggles, surfaced effective mode stack details in status output, and documented examples/anti-patterns in README.
- Task 8.4: Verification
- Subtask 8.4.1: Add tests for matching accuracy and false positives
- Subtask 8.4.2: Add install-test smoke scenarios for keyword activation
- Subtask 8.4.3: Add doctor visibility for keyword subsystem
- Notes: Expanded selftest/install smoke for false-positive resistance and keyword toggle flows, and added `/doctor` integration via `keyword-mode` diagnostics.
- Exit criteria: keyword activation is deterministic and low-surprise
- Exit criteria: users can disable or override keyword behavior safely
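
Whole-token matching is one way to keep keyword activation low-surprise. The sketch below uses the reserved keywords from Subtask 8.1.1; the precedence order, the tokenizer, and the function name are assumptions, and the opt-out syntax documented in `instructions/keyword_execution_modes.md` is not modelled.

```python
import re
from typing import FrozenSet, List

# Order doubles as an assumed precedence, highest first.
RESERVED_KEYWORDS = ("safe-apply", "deep-analyze", "parallel-research", "ulw")

# A token is a run of letters/digits, optionally joined by single hyphens.
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def detect_keywords(prompt: str, disabled: FrozenSet[str] = frozenset()) -> List[str]:
    """Return enabled reserved keywords that appear as whole tokens in the prompt."""
    tokens = set(TOKEN_PATTERN.findall(prompt.lower()))
    return [kw for kw in RESERVED_KEYWORDS if kw in tokens and kw not in disabled]


print(detect_keywords("ulw refactor the parser, safe-apply please"))  # ['safe-apply', 'ulw']
print(detect_keywords("harden the bulwark module"))                   # []
```

Matching on tokens rather than substrings means a word such as "bulwark" does not trigger `ulw`. A keyword embedded in a file path would still match in this simplified tokenizer, so a real detector needs stricter rules.
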
Status: done
Priority: High
Goal: Load project/user rule files with optional glob conditions to enforce coding conventions contextually.
Depends on: Epic 1
- Task 9.1: Define rule file schema and precedence
- Subtask 9.1.1: Define frontmatter fields (`globs`, `alwaysApply`, `description`, `priority`)
- Subtask 9.1.2: Define project/user rule search paths
- Subtask 9.1.3: Define rule conflict resolution strategy
- Notes: Added `instructions/conditional_rules_schema.md` with deterministic discovery, matching, precedence, conflict handling, and validation requirements.
- Task 9.2: Implement rule discovery and matching engine
- Subtask 9.2.1: Discover markdown rule files recursively
- Subtask 9.2.2: Match rules by file path and operation context
- Subtask 9.2.3: Inject effective rule set into execution context
- Notes: Added `scripts/rules_engine.py` with frontmatter parsing, layered discovery, deterministic precedence sorting, duplicate-id conflict reporting, and effective rule stack resolution helpers.
- Task 9.3: Operations and diagnostics
- Subtask 9.3.1: Add `/rules status` and `/rules explain <path>` commands
- Subtask 9.3.2: Add per-rule disable list in config
- Subtask 9.3.3: Add doctor output for rule source and conflicts
- Notes: Added `scripts/rules_command.py` with status/explain/disable-id/enable-id/doctor workflows, wired `/doctor` integration for rules diagnostics, and added command aliases/install smoke coverage.
- Task 9.4: Verification and docs
- Subtask 9.4.1: Add tests for glob matching and precedence
- Subtask 9.4.2: Add docs with examples for team rule packs
- Subtask 9.4.3: Add install-test smoke checks
- Notes: Expanded rules selftest/install smoke coverage for precedence/always-apply/disable-id flows and added team rule-pack examples in `instructions/rules_team_pack_examples.md`.
- Exit criteria: applicable rules are explainable for any target file
- Exit criteria: conflicting rules are surfaced with clear remediation
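
Glob matching with `alwaysApply`, a per-rule disable list, and priority ordering can be sketched with the standard library. The `Rule` shape mirrors the frontmatter fields from Subtask 9.1.1, but the rule ids, the "higher priority first" ordering, and the use of `fnmatchcase` (whose `*` also matches `/`) are assumptions rather than the behavior of `scripts/rules_engine.py`.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import FrozenSet, List


@dataclass
class Rule:
    rule_id: str                      # hypothetical identifier
    priority: int = 0
    globs: List[str] = field(default_factory=list)
    always_apply: bool = False


def effective_rules(rules: List[Rule], path: str, disabled: FrozenSet[str] = frozenset()) -> List[str]:
    """Return ids of rules that apply to `path`, highest priority first."""
    matched = [
        r
        for r in rules
        if r.rule_id not in disabled
        and (r.always_apply or any(fnmatchcase(path, g) for g in r.globs))
    ]
    # Assumed ordering: higher priority wins, rule id breaks ties.
    return [r.rule_id for r in sorted(matched, key=lambda r: (-r.priority, r.rule_id))]


rules = [
    Rule("py-style", priority=10, globs=["*.py"]),
    Rule("docs-tone", priority=5, globs=["docs/*.md"]),
    Rule("team-base", priority=1, always_apply=True),
]
print(effective_rules(rules, "scripts/rules_engine.py"))  # ['py-style', 'team-base']
print(effective_rules(rules, "docs/guide.md"))            # ['docs-tone', 'team-base']
```
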
Status: done
Priority: Medium
Goal: Detect natural-language intent that maps to existing slash commands and optionally execute with guardrails.
Depends on: Epic 1, Epic 8
- Task 10.1: Define intent-to-command mappings
- Subtask 10.1.1: Map common intents to existing commands (`/doctor`, `/stack`, `/nvim`, `/devtools`)
- Subtask 10.1.2: Define confidence scoring and ambiguity thresholds
- Subtask 10.1.3: Define no-op behavior when confidence is low
- Notes: Added `scripts/auto_slash_schema.py` with deterministic tokenization, phrase/keyword scoring, confidence gate, ambiguity delta, and explicit no-op reason codes.
- Task 10.2: Implement detection and dispatch
- Subtask 10.2.1: Parse prompt intent candidates
- Subtask 10.2.2: Resolve best command + argument template
- Subtask 10.2.3: Execute with safe preview mode option
- Notes: Added `scripts/auto_slash_command.py` with `detect|preview|execute` workflow and preview-first `--force` execution guard.
- Task 10.3: Controls and safety
- Subtask 10.3.1: Add config toggles (global and per-command)
- Subtask 10.3.2: Add audit log for auto-executed commands
- Subtask 10.3.3: Add fast cancel/undo guidance in output
- Notes: Added layered config section `auto_slash_detector`, per-command enablement controls, and JSONL audit trail at `~/.config/opencode/my_opencode/runtime/auto_slash_audit.jsonl`.
- Task 10.4: Validation
- Subtask 10.4.1: Add tests for mapping precision and ambiguity handling
- Subtask 10.4.2: Add smoke tests for preview + execute modes
- Subtask 10.4.3: Add docs with examples and limitations
- Notes: Expanded selftest/install smoke, added `/doctor` subsystem check, and documented behavior in `README.md` + `instructions/auto_slash_detector.md`.
- Exit criteria: detector reduces manual command typing without unsafe surprises
- Exit criteria: low-confidence intents never auto-execute
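
The confidence gate and ambiguity delta from Task 10.1 reduce to a small decision function once each candidate command has a score. The sketch below assumes scores in the range 0 to 1; the threshold values, the reason codes, and the function name are hypothetical and not taken from `scripts/auto_slash_schema.py`.

```python
from typing import Any, Dict


def choose_command(
    scores: Dict[str, float],
    min_confidence: float = 0.6,    # assumed threshold
    ambiguity_delta: float = 0.15,  # assumed minimum lead over the runner-up
) -> Dict[str, Any]:
    """Pick the best-scoring command, or return a no-op with a reason code."""
    if not scores:
        return {"command": None, "reason": "no_candidates"}
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    best_command, best_score = ranked[0]
    if best_score < min_confidence:
        return {"command": None, "reason": "low_confidence"}
    if len(ranked) > 1 and best_score - ranked[1][1] < ambiguity_delta:
        return {"command": None, "reason": "ambiguous"}
    return {"command": best_command, "reason": "matched"}


print(choose_command({"/doctor": 0.9, "/stack": 0.3}))   # {'command': '/doctor', 'reason': 'matched'}
print(choose_command({"/doctor": 0.7, "/stack": 0.65}))  # {'command': None, 'reason': 'ambiguous'}
print(choose_command({"/nvim": 0.4}))                    # {'command': None, 'reason': 'low_confidence'}
```

Returning a reason code even for the no-op case keeps the second exit criterion testable: a low-confidence prompt can be asserted to produce no command.
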
Status: done
Priority: High
Goal: Improve long-session reliability with configurable truncation/pruning/recovery policies.
Depends on: Epic 4
- Task 11.1: Define resilience policy schema
- Subtask 11.1.1: Define truncation modes (`default`, `aggressive`)
- Subtask 11.1.2: Define protected tools/messages list
- Subtask 11.1.3: Define pruning and recovery notification levels
- Notes: Added `instructions/context_resilience_policy_schema.md` documenting config shape, truncation modes, protected artifact constraints, notification levels, and validation requirements.
- Task 11.2: Implement context pruning engine
- Subtask 11.2.1: Add deduplication and superseded-write pruning
- Subtask 11.2.2: Add old-error input purge with turn thresholds
- Subtask 11.2.3: Preserve critical evidence and command outcomes
- Notes: Added `scripts/context_resilience.py` with policy resolution plus deterministic pruning (dedupe, superseded writes, stale error purge, budget trim) while preserving protected artifacts and latest command outcomes.
- Task 11.3: Recovery workflows
- Subtask 11.3.1: Add automatic resume hints after successful recovery
- Subtask 11.3.2: Add safe fallback when recovery cannot proceed
- Subtask 11.3.3: Add diagnostics for pruning/recovery actions
- Notes: Added recovery-plan generation in `scripts/context_resilience.py` with resume hints, safe fallback actions, and structured pruning/recovery diagnostics.
- Task 11.4: Validation and docs
- Subtask 11.4.1: Add stress tests for long-session behavior
- Subtask 11.4.2: Add docs for tuning resilience settings
- Subtask 11.4.3: Add doctor summary for context resilience health
- Notes: Added `instructions/context_resilience_tuning.md`, `/resilience` command diagnostics, and unified `/doctor` resilience subsystem checks.
- Exit criteria: long sessions remain stable under constrained context budgets
- Exit criteria: recovery decisions are transparent and auditable
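
Two of the pruning steps from Task 11.2, deduplication and superseded-write pruning, can be shown on a toy event list. The event shape (`kind`, `tool`, `path`, `text`) and the function name are hypothetical; stale-error purging and budget trimming are not covered, and the real logic in `scripts/context_resilience.py` may differ.

```python
from typing import Any, Dict, FrozenSet, List


def prune_events(events: List[Dict[str, Any]], protected_tools: FrozenSet[str] = frozenset()) -> List[Dict[str, Any]]:
    """Drop superseded writes and exact duplicates; never drop protected tools."""
    # Index of the last write per path; earlier writes are superseded.
    latest_write = {
        ev["path"]: idx for idx, ev in enumerate(events) if ev["kind"] == "write"
    }
    seen = set()
    kept = []
    for idx, ev in enumerate(events):
        if ev.get("tool") in protected_tools:
            kept.append(ev)  # protected evidence is preserved as-is
            continue
        if ev["kind"] == "write" and latest_write[ev["path"]] != idx:
            continue  # a later write to the same path replaces this one
        key = (ev["kind"], ev.get("tool"), ev.get("path"), ev.get("text"))
        if key in seen:
            continue  # exact duplicate of an earlier kept event
        seen.add(key)
        kept.append(ev)
    return kept


events = [
    {"kind": "write", "tool": "edit", "path": "a.py", "text": "v1"},
    {"kind": "read", "tool": "grep", "path": "a.py", "text": "match"},
    {"kind": "read", "tool": "grep", "path": "a.py", "text": "match"},
    {"kind": "write", "tool": "edit", "path": "a.py", "text": "v2"},
    {"kind": "result", "tool": "bash", "path": None, "text": "tests passed"},
]
print([ev["text"] for ev in prune_events(events)])  # ['match', 'v2', 'tests passed']
```
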
Status: done
Priority: Medium
Goal: Make model routing and provider fallback decisions observable and explainable.
Depends on: Epic 5
- Task 12.1: Define explanation model
- Subtask 12.1.1: Define resolution trace format (requested -> attempted -> selected)
- Subtask 12.1.2: Define compact vs verbose output levels
- Subtask 12.1.3: Define redaction rules for sensitive provider details
- Notes: Added `instructions/model_fallback_explanation_model.md` defining fallback trace shape, output levels, redaction policy, and deterministic reason-code requirements.
- Task 12.2: Implement resolution tracing
- Subtask 12.2.1: Capture fallback chain attempts in runtime
- Subtask 12.2.2: Store latest trace per command/session
- Subtask 12.2.3: Expose trace to doctor and debug commands
- Notes: Extended `scripts/model_routing_schema.py` with requested/attempted/selected runtime trace payloads and added persisted latest-trace support plus `/model-routing trace` in `scripts/model_routing_command.py`.
- Task 12.3: User-facing command surface
- Subtask 12.3.1: Add `/routing status` and `/routing explain` commands
- Subtask 12.3.2: Add examples for category-driven routing outcomes
- Subtask 12.3.3: Add docs for troubleshooting unexpected model selection
- Notes: Added `scripts/routing_command.py`, routed aliases in `opencode.json`, and README examples/troubleshooting for compact explainability workflows.
- Task 12.4: Verification
- Subtask 12.4.1: Add tests for deterministic trace output
- Subtask 12.4.2: Add tests for fallback and no-fallback scenarios
- Subtask 12.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with deterministic resolution-trace assertions plus fallback/no-fallback routing explain scenarios and added `/routing` smoke hints in `install.sh`.
- Exit criteria: users can explain model/provider selection for every routed task
- Exit criteria: trace output remains readable in default mode
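
The requested -> attempted -> selected trace described in Task 12.1 can be pictured with a small sketch. This is an illustration only, not the contents of `scripts/model_routing_schema.py`: the `ResolutionTrace` class, the `resolve` helper, and the reason codes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ResolutionTrace:
    """Hypothetical trace record: requested -> attempted -> selected."""
    requested: str
    attempted: List[str] = field(default_factory=list)
    selected: Optional[str] = None
    reason: str = "unknown"

    def compact(self) -> str:
        # One-line form intended for default output.
        return f"{self.requested} -> {self.selected} ({self.reason})"

    def verbose(self) -> str:
        # Multi-line form that also lists every attempted candidate.
        chain = " -> ".join(self.attempted) or "(none)"
        return (
            f"requested={self.requested}\n"
            f"attempted={chain}\n"
            f"selected={self.selected}\n"
            f"reason={self.reason}"
        )


def resolve(requested: str, fallbacks: List[str], available: Set[str]) -> ResolutionTrace:
    """Walk the fallback chain in order and record each attempt."""
    trace = ResolutionTrace(requested=requested)
    for candidate in [requested, *fallbacks]:
        trace.attempted.append(candidate)
        if candidate in available:
            trace.selected = candidate
            # Reason codes below are assumptions made for this sketch.
            trace.reason = (
                "requested_available" if candidate == requested else "fallback_selected"
            )
            return trace
    trace.reason = "no_model_available"
    return trace
```

Because the chain is walked in a fixed order, the same inputs always yield the same trace, which is the property the deterministic trace tests in Task 12.4 rely on.
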
Status: done
Priority: Medium
Goal: Add first-class profile switching between browser automation engines with install/runtime checks.
Depends on: Epic 1
- Task 13.1: Define browser profile model
- Subtask 13.1.1: Define supported providers (`playwright`, `agent-browser`)
- Subtask 13.1.2: Define profile settings and defaults
- Subtask 13.1.3: Define migration behavior for existing installs
- Notes: Added `instructions/browser_profile_model.md` with provider schema, defaults, migration behavior, and validation contract.
- Task 13.2: Implement profile command backend
- Subtask 13.2.1: Add `/browser profile <provider>` command
- Subtask 13.2.2: Add status and doctor checks for selected provider
- Subtask 13.2.3: Add install helper guidance for missing dependencies
- Notes: Added `scripts/browser_command.py`, `/browser*` aliases in `opencode.json`, doctor integration, and install/selftest smoke coverage for provider switching and dependency diagnostics.
- Task 13.3: Integrate with wizard and docs
- Subtask 13.3.1: Add provider selection into install/reconfigure wizard
- Subtask 13.3.2: Document provider trade-offs and examples
- Subtask 13.3.3: Add recommended defaults for stable-first users
- Notes: Extended `scripts/install_wizard.py` with browser profile selection (`--browser-profile`) and updated README install/browser guidance with provider trade-offs plus stable-first recommendations.
- Task 13.4: Verification
- Subtask 13.4.1: Add tests for profile switching and persistence
- Subtask 13.4.2: Add smoke tests for status/doctor across providers
- Subtask 13.4.3: Add install-test checks
- Notes: Expanded `scripts/selftest.py` to validate provider reset readiness and updated install smoke checks to run `status` and `doctor` across both providers.
- Exit criteria: provider switching is one-command and reversible
- Exit criteria: missing dependency states are diagnosed with exact fixes
Status: done
Priority: Medium
Goal: Add a command to execute from an approved plan artifact with progress tracking and deviation reporting.
Depends on: Epic 2, Epic 3
- Task 14.1: Define plan artifact contract
- Subtask 14.1.1: Define accepted plan format (markdown checklist + metadata)
- Subtask 14.1.2: Define validation rules before execution starts
- Subtask 14.1.3: Define step state transitions and completion semantics
- Notes: Added `instructions/plan_artifact_contract.md` covering artifact schema, deterministic validation failures, transition rules, and deviation capture requirements for `/start-work`.
- Task 14.2: Implement execution bridge backend
- Subtask 14.2.1: Add `/start-work <plan>` command implementation
- Subtask 14.2.2: Execute steps sequentially with checkpoint updates
- Subtask 14.2.3: Capture and report deviations from original plan
- Notes: Added `scripts/start_work_command.py` with plan parsing + validation, sequential checkpoint transitions, persisted execution status, and deviation reporting; wired aliases and smoke/selftest coverage.
- Task 14.3: Integrations and observability
- Subtask 14.3.1: Integrate with background subsystem where safe
- Subtask 14.3.2: Integrate with digest summaries for end-of-run recap
- Subtask 14.3.3: Expose execution status in doctor/debug outputs
- Notes: Added background-safe `/start-work` queueing (`--background` + `/start-work-bg`), digest `plan_execution` recap output, and `/doctor` integration via `/start-work doctor --json`.
- Task 14.4: Validation and docs
- Subtask 14.4.1: Add tests for plan parsing and execution flow
- Subtask 14.4.2: Add recovery tests for interrupted plan runs
- Subtask 14.4.3: Add docs with sample plans and workflows
- Notes: Expanded `scripts/selftest.py` with additional plan validation/recovery checks and added `instructions/plan_execution_workflows.md` with sample plans plus direct/background/recovery workflows.
- Exit criteria: approved plans can be executed and resumed with clear state
- Exit criteria: deviations are explicitly surfaced and reviewable
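
As a rough illustration of the "markdown checklist" half of the plan artifact, the sketch below parses checkbox lines and finds the next step to run. It is not the parser in `scripts/start_work_command.py`; the function names and the error message are hypothetical, and the metadata part of the contract is left out because its format is not described here.

```python
import re

# Matches "- [ ] title" and "- [x] title" checklist lines.
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def parse_plan(text):
    """Return checklist steps in order; reject plans with no steps."""
    steps = []
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match:
            steps.append({
                "title": match.group(2).strip(),
                "done": match.group(1).lower() == "x",
            })
    if not steps:
        # Validation happens before any execution starts.
        raise ValueError("plan has no checklist steps")
    return steps


def next_step(steps):
    """Index of the first unfinished step, or None when the plan is complete."""
    for index, step in enumerate(steps):
        if not step["done"]:
            return index
    return None
```

Resuming an interrupted run then amounts to re-parsing the artifact and continuing from `next_step`, assuming completed steps were checked off as they finished.
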
Status: done
Priority: High
Goal: Enforce explicit checklist progress during execution so outcomes stay aligned with approved plans.
Depends on: Epic 14
- Task 15.1: Define compliance model
- Subtask 15.1.1: Define required todo states (`pending`, `in_progress`, `done`, `skipped`)
- Subtask 15.1.2: Define rules for one-active-item-at-a-time enforcement
- Subtask 15.1.3: Define acceptable bypass annotations and audit format
- Notes: Added `instructions/todo_compliance_model.md` with state model, transition constraints, bypass metadata requirements, and audit event contract.
- Task 15.2: Implement enforcement engine
- Subtask 15.2.1: Validate state transitions before major actions
- Subtask 15.2.2: Block completion when required tasks remain unchecked
- Subtask 15.2.3: Emit actionable remediation prompts on violations
- Notes: Added `scripts/todo_enforcement.py` and wired `/start-work` to enforce deterministic todo transitions, completion gating, and remediation/audit outputs in runtime state.
- Task 15.3: Integrate command workflows
- Subtask 15.3.1: Integrate with plan execution command and background runs
- Subtask 15.3.2: Add `/todo status` and `/todo enforce` diagnostics
- Subtask 15.3.3: Add docs for compliant workflow patterns
- Notes: Added `scripts/todo_command.py`, command aliases, doctor integration, and README/install workflow guidance for explicit todo compliance checks.
- Task 15.4: Verification
- Subtask 15.4.1: Add tests for transition validity and blocking behavior
- Subtask 15.4.2: Add tests for bypass annotations and logs
- Subtask 15.4.3: Add install-test smoke scenarios
- Notes: Expanded `scripts/selftest.py` and install smoke checks for transition gating, completion blocking, bypass metadata validation, and deterministic bypass audit payloads.
- Exit criteria: plan completion cannot be marked done with unchecked required items
- Exit criteria: bypass behavior is explicit, logged, and reviewable
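
The four todo states, the one-active-item rule, and completion gating can be sketched as below. The transition table, the `bypass_reason` field, and the violation messages are assumptions made for illustration; the actual rules live in `instructions/todo_compliance_model.md` and `scripts/todo_enforcement.py`.

```python
# Assumed transition table over the four states named in Subtask 15.1.1.
ALLOWED = {
    "pending": {"in_progress", "skipped"},
    "in_progress": {"done", "skipped", "pending"},
    "done": set(),
    "skipped": set(),
}


def validate_transition(todos, index, new_state):
    """Return a list of violations; an empty list means the move is allowed."""
    current = todos[index]["state"]
    violations = []
    if new_state not in ALLOWED.get(current, set()):
        violations.append(f"invalid transition: {current} -> {new_state}")
    if new_state == "in_progress":
        # One-active-item-at-a-time: no other item may already be active.
        others = [
            i for i, todo in enumerate(todos)
            if i != index and todo["state"] == "in_progress"
        ]
        if others:
            violations.append("another item is already in_progress")
    if new_state == "skipped" and not todos[index].get("bypass_reason"):
        # Hypothetical bypass annotation required for skipped items.
        violations.append("skipped requires a bypass_reason annotation")
    return violations


def can_complete(todos):
    """Completion is blocked while any item is still pending or active."""
    return all(todo["state"] in {"done", "skipped"} for todo in todos)
```

Returning violations as data rather than raising keeps the check usable both for blocking and for the remediation prompts mentioned in Subtask 15.2.3.
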
Status: merged
Priority: Medium
Goal: Scope merged into Epic 23 to keep PR quality logic in one place.
Merged into: Epic 23
- Merged note: keep quality-check rules and output clarity checks under `/pr-review` instead of separate command.
Status: done
Priority: High
Goal: Resume interrupted workflows from last valid checkpoint with explicit safety checks.
Depends on: Epic 11, Epic 14
- Task 17.1: Define resume policy
- Subtask 17.1.1: Define interruption classes (tool failure, timeout, context reset, crash)
- Subtask 17.1.2: Define resume eligibility and cool-down rules
- Subtask 17.1.3: Define max resume attempts and escalation path
- Notes: Added `instructions/resume_policy_model.md` with interruption classes, deterministic eligibility/cool-down/attempt-limit rules, reason codes, and audit event contract.
- Task 17.2: Implement recovery engine
- Subtask 17.2.1: Load last safe checkpoint and reconstruct state
- Subtask 17.2.2: Re-run only idempotent or explicitly approved steps
- Subtask 17.2.3: Persist resume trail for audit/debugging
- Notes: Added `scripts/recovery_engine.py` and `/start-work recover` backend path for checkpoint eligibility checks, approval-gated replay, and persisted resume audit trail events.
- Task 17.3: User control surfaces
- Subtask 17.3.1: Add `/resume status`, `/resume now`, `/resume disable` commands
- Subtask 17.3.2: Add clear output explaining why resume did/did not trigger
- Subtask 17.3.3: Document recommended recovery playbooks
- Notes: Added `scripts/resume_command.py` and `/resume*` aliases with eligibility/status/disable controls, added human-readable recovery reasons via `explain_resume_reason`, and documented recovery playbooks in README.
- Task 17.4: Verification
- Subtask 17.4.1: Add tests for each interruption class
- Subtask 17.4.2: Add tests for idempotency safeguards
- Subtask 17.4.3: Add install-test scenarios for interrupted flows
- Notes: Expanded `scripts/selftest.py` with interruption-class eligibility/cooldown/disable-guard assertions and added interrupted-flow smoke recovery (`resume now` approval gating) to `install.sh` self-check.
- Exit criteria: interrupted runs can be resumed safely with deterministic outcomes
- Exit criteria: recovery decisions are visible and auditable
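
A resume policy with interruption classes, a cool-down, and an attempt limit reduces to an ordered series of checks. The sketch below is an assumption-laden illustration: the default limits, the class identifiers, and the reason codes are hypothetical and are not taken from `instructions/resume_policy_model.md`.

```python
# Interruption classes from Subtask 17.1.1, spelled as identifiers for this sketch.
INTERRUPTION_CLASSES = {"tool_failure", "timeout", "context_reset", "crash"}


def resume_eligibility(
    interruption,
    attempts,
    seconds_since_last_attempt,
    *,
    max_attempts=3,        # assumed default
    cooldown_seconds=30,   # assumed default
    enabled=True,
):
    """Return (eligible, reason_code); checks run in a fixed order."""
    if not enabled:
        return False, "resume_disabled"
    if interruption not in INTERRUPTION_CLASSES:
        return False, "unknown_interruption_class"
    if attempts >= max_attempts:
        # Past the limit, the run should escalate to the operator.
        return False, "attempt_limit_reached"
    if seconds_since_last_attempt < cooldown_seconds:
        return False, "cooldown_active"
    return True, "eligible"
```

Running the checks in a fixed order means a given state always produces the same reason code, which is what makes the "why resume did/did not trigger" output reproducible.
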
Status: done
Priority: High
Goal: Prefer semantic edits via language tooling to reduce refactor regressions.
Depends on: Epic 3
- Task 18.1: Define safe-edit capability matrix
- Subtask 18.1.1: Define supported operations (`rename`, `extract`, `organize imports`, scoped replace)
- Subtask 18.1.2: Define language/tool availability checks
- Subtask 18.1.3: Define text-mode fallback when semantic tooling is missing
- Notes: Added `instructions/safe_edit_capability_matrix.md` with operation/backend matrix, deterministic availability checks, guarded text-fallback rules, and reason-code contract.
- Task 18.2: Implement semantic edit adapters
- Subtask 18.2.1: Add LSP adapter for symbol-aware operations
- Subtask 18.2.2: Add AST adapter for deterministic structural edits
- Subtask 18.2.3: Add diff validation for changed references
- Notes: Added `scripts/safe_edit_adapters.py` with deterministic operation/backend selection (`lsp`, `ast`, guarded `text`) and changed-reference validation helpers, plus selftest coverage for fallback/ambiguity/validation outcomes.
- Task 18.3: Command integration
- Subtask 18.3.1: Add `/safe-edit` or mode flag integration with `/refactor-lite`
- Subtask 18.3.2: Add status/doctor checks for available semantic tools
- Subtask 18.3.3: Document safe-edit best practices and limitations
- Notes: Added `scripts/safe_edit_command.py` and `/safe-edit*` aliases, wired `safe-edit` diagnostics into unified `/doctor`, and expanded README/install guidance for semantic planning and fallback diagnostics.
- Task 18.4: Verification
- Subtask 18.4.1: Add cross-language tests for rename/reference correctness
- Subtask 18.4.2: Add fallback tests when LSP/AST unavailable
- Subtask 18.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with cross-language safe-edit planning + reference-validation fixtures, added deterministic fallback failure assertions, and added `/safe-edit plan` smoke coverage in installer self-check.
- Exit criteria: semantic mode reduces accidental text-based regressions
- Exit criteria: fallback behavior is safe and clearly reported
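
Backend selection across `lsp`, `ast`, and guarded `text` can be sketched as a preference walk with an explicit opt-in for the text fallback. The preference table, the `allow_text_fallback` flag, and the reason codes are hypothetical; the real matrix is defined in `instructions/safe_edit_capability_matrix.md`.

```python
# Assumed per-operation backend preference, most semantic first.
PREFERENCE = {
    "rename": ["lsp", "ast", "text"],
    "extract": ["lsp", "ast"],
    "organize imports": ["lsp", "ast"],
    "scoped replace": ["ast", "text"],
}


def select_backend(operation, available, allow_text_fallback=False):
    """Return (backend, reason_code); backend is None when nothing is safe."""
    if operation not in PREFERENCE:
        return None, "unsupported_operation"
    for backend in PREFERENCE[operation]:
        if backend == "text":
            # Text mode is a guarded fallback: only used when explicitly allowed.
            if allow_text_fallback:
                return "text", "text_fallback_guarded"
            continue
        if backend in available:
            return backend, "semantic_backend_selected"
    return None, "no_backend_available"
```

Failing closed (returning `None`) when no semantic tool is present and the fallback was not requested keeps text-mode edits an explicit decision rather than a silent default.
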
Status: done
Priority: Medium
Goal: Persist periodic snapshots of execution state to improve rollback, restart, and auditability.
Depends on: Epic 2, Epic 17
- Task 19.1: Define snapshot format and lifecycle
- Subtask 19.1.1: Define snapshot schema (step state, context digest, command outcomes)
- Subtask 19.1.2: Define frequency and trigger points (step boundary, error boundary, timer)
- Subtask 19.1.3: Define retention, rotation, and optional compression
- Notes: Added `instructions/checkpoint_snapshot_lifecycle.md` defining snapshot schema, deterministic trigger boundaries, retention/rotation defaults, optional compression, and failure reason-code semantics.
- Task 19.2: Implement snapshot manager
- Subtask 19.2.1: Write atomic snapshots with corruption-safe semantics
- Subtask 19.2.2: Add list/show/prune operations
- Subtask 19.2.3: Integrate with resume/recovery engine
- Notes: Added `scripts/checkpoint_snapshot_manager.py` with atomic history/latest writes, integrity-aware load/list APIs, retention+compression pruning, and `/start-work` + `/start-work recover` persistence integration.
- Task 19.3: Visibility and tooling
- Subtask 19.3.1: Add `/checkpoint list|show|prune` commands
- Subtask 19.3.2: Add doctor diagnostics for snapshot health
- Subtask 19.3.3: Document rollback/restart examples
- Notes: Added `scripts/checkpoint_command.py` and `/checkpoint*` aliases, integrated checkpoint doctor checks into unified `/doctor`, and documented checkpoint list/show/prune usage in README.
- Task 19.4: Verification
- Subtask 19.4.1: Add tests for atomic write and corruption handling
- Subtask 19.4.2: Add retention/rotation tests
- Subtask 19.4.3: Add install-test checkpoint smoke flows
- Notes: Expanded `scripts/selftest.py` with direct atomic-write, corruption/integrity mismatch, retention bound, and compression rotation assertions; added installer smoke coverage for `/checkpoint list|show|prune|doctor`.
- Exit criteria: checkpoints support reliable restart and recovery workflows
- Exit criteria: snapshot retention stays bounded by policy
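
Atomic, corruption-safe snapshot writes are commonly built from a temporary file plus an atomic rename, with a digest stored alongside the state. The sketch below shows that general technique; it is not the code in `scripts/checkpoint_snapshot_manager.py`, and the record layout (`sha256` and `state` keys) is an assumption.

```python
import hashlib
import json
import os
import tempfile


def _digest(state):
    # Canonical JSON (sorted keys) so the digest is stable across runs.
    payload = json.dumps(state, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def write_snapshot(path, state):
    """Write to a temp file in the same directory, then atomically replace."""
    record = {"sha256": _digest(state), "state": state}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(record, handle, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_snapshot(path):
    """Load a snapshot and refuse it if the stored digest does not match."""
    with open(path, encoding="utf-8") as handle:
        record = json.load(handle)
    if _digest(record["state"]) != record.get("sha256"):
        raise ValueError("snapshot integrity mismatch")
    return record["state"]
```

A reader therefore sees either the previous complete snapshot or the new one, never a partially written file, and a tampered or truncated record is rejected at load time.
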
Status: done
Priority: High
Goal: Prevent runaway autonomous runs by enforcing configurable limits for time, tool calls, and token usage.
Depends on: Epic 2, Epic 11
- Task 20.1: Define budget model
- Subtask 20.1.1: Define limit dimensions (wall-clock, tool-call count, token estimate)
- Subtask 20.1.2: Define profiles (`conservative`, `balanced`, `extended`)
- Subtask 20.1.3: Define override and emergency-stop semantics
- Notes: Added `instructions/execution_budget_model.md` defining budget dimensions, profile defaults, threshold semantics, and auditable override/emergency-stop rules.
- Task 20.2: Implement budget enforcement runtime
- Subtask 20.2.1: Track usage counters in real time
- Subtask 20.2.2: Block/soft-stop execution at threshold boundaries
- Subtask 20.2.3: Emit summary and next-step recommendations on stop
- Notes: Added `scripts/execution_budget_runtime.py` and integrated `/start-work` + `/start-work recover` budget evaluation payloads with hard-stop status (`budget_stopped`) and recommendation output when limits are exceeded.
- Task 20.3: Commands and diagnostics
- Subtask 20.3.1: Add `/budget status|profile|override` commands
- Subtask 20.3.2: Expose budget consumption in doctor/debug outputs
- Subtask 20.3.3: Document budget tuning by workload type
- Notes: Added `scripts/budget_command.py`, wired `/budget*` aliases in `opencode.json`, integrated budget checks into unified `/doctor`, and expanded README/install/selftest flows for profile+override diagnostics.
- Task 20.4: Verification
- Subtask 20.4.1: Add tests for threshold crossings and stop behavior
- Subtask 20.4.2: Add tests for override and reset flows
- Subtask 20.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with threshold-stop, override/reset, and invalid-override checks; install smoke now exercises `/budget status|override|doctor|override --clear` flows.
- Exit criteria: runaway loops are prevented by hard and soft limits
- Exit criteria: budget stops provide actionable continuation guidance
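
Budget evaluation over the three dimensions (wall-clock, tool-call count, token estimate) can be sketched as a comparison of usage counters against limits, with a soft threshold below the hard one. Only the `budget_stopped` status comes from the notes above; the `budget_warning` status, the 80% soft ratio, the limit values, and the recommendation strings are assumptions for illustration.

```python
# Hypothetical limits for one profile; real defaults are defined elsewhere.
BALANCED_LIMITS = {
    "wall_clock_seconds": 600,
    "tool_calls": 50,
    "token_estimate": 50_000,
}


def evaluate_budget(usage, limits, soft_ratio=0.8):
    """Classify current usage as ok, soft warning, or hard stop."""
    exceeded = sorted(
        name for name, limit in limits.items() if usage.get(name, 0) >= limit
    )
    if exceeded:
        return {
            "status": "budget_stopped",
            "dimensions": exceeded,
            "recommendation": "checkpoint, then raise the limit or split the work",
        }
    near = sorted(
        name for name, limit in limits.items()
        if usage.get(name, 0) >= soft_ratio * limit
    )
    if near:
        return {
            "status": "budget_warning",
            "dimensions": near,
            "recommendation": "wrap up the current step before continuing",
        }
    return {"status": "ok", "dimensions": [], "recommendation": "continue"}
```

Reporting which dimensions tripped, rather than a bare boolean, is what lets a stop message give actionable continuation guidance.
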
Status: merged
Priority: Medium
Goal: Loop-control scope merged into Epic 22 and Epic 28.
Merged into: Epic 22, Epic 28
- Merged note: expose loop controls as `/autoflow` and `/autopilot` options, not a separate top-level epic.
Status: done
Priority: High
Goal: Provide a single command (/autoflow) that orchestrates plan execution, enforcement, recovery, and reporting with safe defaults.
Depends on: Epic 14, Epic 15, Epic 17, Epic 19, Epic 20
- Task 22.1: Define `/autoflow` command contract
- Subtask 22.1.1: Define subcommands (`start`, `status`, `resume`, `stop`, `report`, `dry-run`)
- Subtask 22.1.2: Define input plan requirements and validation errors
- Subtask 22.1.3: Define output format for concise and verbose modes
- Notes: Added `instructions/autoflow_command_contract.md` defining `/autoflow` subcommands, deterministic validation/error shape, concise-vs-JSON output schema, lifecycle status model, and safety defaults for Task 22.2 implementation.
- Task 22.2: Implement orchestration adapter layer
- Subtask 22.2.1: Compose existing plan, todo, budget, checkpoint, and loop primitives
- Subtask 22.2.2: Add deterministic state machine transitions
- Subtask 22.2.3: Add explain mode showing decisions and fallbacks
- Notes: Added `scripts/autoflow_adapter.py` to compose runtime primitives (`plan`, `todo`, `budget`, `checkpoint`, `resume`, `loop_guard`) and resolve deterministic intent transitions with explain traces and fallback reason codes.
- Task 22.3: Add safety and usability controls
- Subtask 22.3.1: Add `dry-run` to preview actions without mutating state
- Subtask 22.3.2: Add explicit kill-switch behavior for unsafe or runaway states
- Subtask 22.3.3: Add docs and migration guidance from low-level commands
- Notes: Added `scripts/autoflow_command.py` plus `/autoflow*` aliases with non-mutating `dry-run`, explicit `stop` kill-switch semantics, doctor integration, installer smoke coverage, and README migration mappings from low-level commands.
- Task 22.4: Verification
- Subtask 22.4.1: Add integration tests for full lifecycle (`start -> status -> report`)
- Subtask 22.4.2: Add recovery tests (`resume` after interruption)
- Subtask 22.4.3: Add install-test smoke checks for `/autoflow` happy path
- Notes: Expanded `scripts/selftest.py` with `/autoflow report` lifecycle assertions and approval-gated `/autoflow resume` recovery checks; install smoke validates `/autoflow dry-run|status|report|stop` flows.
- Exit criteria: `/autoflow` can run end-to-end flows with auditable outputs
- Exit criteria: users can always fall back to lower-level commands safely
Status: done
Priority: High
Goal: Add a command that reviews pending PR changes for risk, quality, and release readiness before merge.
Depends on: Epic 3
- Task 23.1: Define review rubric and risk scoring
- Subtask 23.1.1: Define risk categories (security, data loss, migration impact, test coverage)
- Subtask 23.1.2: Define confidence and severity scoring model
- Subtask 23.1.3: Define required evidence for blocking recommendations
- Notes: Added `instructions/pr_review_rubric.md` with deterministic risk category signals, severity/confidence scoring scales, conservative blocker-evidence thresholds, and recommendation mapping optimized for low-noise pre-merge triage.
- Task 23.2: Implement copilot analyzer
- Subtask 23.2.1: Parse git diff and classify changed areas
- Subtask 23.2.2: Detect missing tests/docs/changelog implications
- Subtask 23.2.3: Produce actionable findings with file-level references
- Notes: Added `scripts/pr_review_analyzer.py` with deterministic unified-diff parsing, changed-area classification, rubric-aligned finding generation, missing evidence detection (`tests`, `README`, `CHANGELOG`), and recommendation mapping (`approve|needs_review|changes_requested|block`) with file-level references and blocker evidence gates.
- Task 23.3: Command surface and workflow integration
- Subtask 23.3.1: Add `/pr-review` with concise and JSON modes
- Subtask 23.3.2: Integrate with pre-merge checklist and doctor output
- Subtask 23.3.3: Document triage flow for warnings vs blockers
- Notes: Added `scripts/pr_review_command.py` command surface with concise/JSON output and `checklist`/`doctor` subcommands, wired aliases in `opencode.json`, integrated `pr-review` into unified doctor checks, and documented blocker-vs-warning triage guidance in README.
- Task 23.4: Verification
- Subtask 23.4.1: Add tests for risk detection and false positive control
- Subtask 23.4.2: Add tests for missing-evidence behavior
- Subtask 23.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with docs-only false-positive guard assertions and tested-source-change missing-evidence checks, and installer smoke now exercises `/pr-review`, `/pr-review checklist`, and `/pr-review doctor` workflows.
- Exit criteria: copilot catches high-risk omissions before merge
- Exit criteria: outputs are actionable and low-noise in default mode
Status: done
Priority: High
Goal: Automate release preparation checks, release-note drafting, and tag gating.
Depends on: Epic 14, Epic 23
- Task 24.1: Define release policy contract
- Subtask 24.1.1: Define required preconditions (clean tree, tests passing, changelog updated)
- Subtask 24.1.2: Define semantic version rules and validation
- Subtask 24.1.3: Define rollback strategy for partial release failures
- Notes: Added `instructions/release_train_policy_contract.md` with deterministic preflight gates, semantic version mapping rules, command reason-code requirements, and partial-failure rollback strategy for `/release-train` workflows.
- Task 24.2: Implement release assistant engine
- Subtask 24.2.1: Add preflight checks and blocking diagnostics
- Subtask 24.2.2: Generate draft release notes from merged changes
- Subtask 24.2.3: Add dry-run publish flow with explicit confirmation step
- Notes: Added `scripts/release_train_engine.py` with `status|prepare|draft|publish|doctor` flows, deterministic preflight reason-code diagnostics, changelog/version gating, release-note draft generation from git history, and confirmation-gated publish simulation.
- Task 24.3: Command integration
- Subtask 24.3.1: Add `/release-train status|prepare|draft|publish`
- Subtask 24.3.2: Integrate with existing `make release-check` and changelog flow
- Subtask 24.3.3: Document release operator workflow
- Notes: Added `scripts/release_train_command.py` with status/prepare/draft/publish/doctor command surface, wired `/release-train*` aliases in `opencode.json`, integrated release-train diagnostics into unified doctor and installer smoke/hints, and connected `make release-check VERSION=x.y.z` to release-train preflight gating.
- Task 24.4: Verification
- Subtask 24.4.1: Add tests for version and changelog mismatch handling
- Subtask 24.4.2: Add tests for dry-run vs publish behavior
- Subtask 24.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with fixture-based coverage for breaking-change/version mismatch gating and publish confirmation vs dry-run behavior, while install smoke exercises `/release-train` status/prepare/draft/doctor command paths.
- Exit criteria: releases are blocked when preconditions are unmet
- Exit criteria: release-note drafts are generated consistently and reviewable
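
Version and changelog gating can be illustrated with a small check that a proposed version is at least the minimum bump implied by the change types and that it appears in the changelog. This is a sketch under assumptions: the change-type labels (`breaking`, `feature`), the reason codes, and the plain substring changelog check are hypothetical, and pre-release or build suffixes are deliberately not handled.

```python
import re

SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Parse a plain MAJOR.MINOR.PATCH string into a tuple of ints."""
    match = SEMVER.fullmatch(text)
    if not match:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


def minimum_next_version(current, change_types):
    """Smallest acceptable next version for the given kinds of change."""
    major, minor, patch = parse_version(current)
    if "breaking" in change_types:
        return (major + 1, 0, 0)
    if "feature" in change_types:
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)


def check_release(current, proposed, change_types, changelog_text):
    """Return blocking reason codes; an empty list means the gate passes."""
    reasons = []
    if parse_version(proposed) < minimum_next_version(current, change_types):
        reasons.append("version_bump_too_small")
    if proposed not in changelog_text:
        reasons.append("changelog_missing_version")
    return reasons
```

Comparing versions as integer tuples avoids the string-ordering trap where `1.10.0` would sort before `1.9.0`.
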
Status: done
Priority: Medium
Goal: Provide an emergency workflow mode that is faster but still bounded and auditable.
Depends on: Epic 20, Epic 22
- Task 25.1: Define hotfix constraints and policy
- Subtask 25.1.1: Define mandatory checks that cannot be skipped
- Subtask 25.1.2: Define reduced-scope validation profile
- Subtask 25.1.3: Define post-hotfix follow-up requirements
- Notes: Added `instructions/hotfix_mode_policy_contract.md` defining activation constraints, mandatory non-skippable checks, reduced validation profile limits, required post-incident follow-up artifacts, timeline/audit schema, and deterministic failure reason codes for `/hotfix` workflows.
- Task 25.2: Implement hotfix runtime profile
- Subtask 25.2.1: Add constrained budget and tool permission settings
- Subtask 25.2.2: Add expedited patch flow with rollback checkpoint
- Subtask 25.2.3: Add incident timeline capture for auditability
- Notes: Added `scripts/hotfix_runtime.py` runtime backend with incident activation gating, constrained budget/permission profile defaults, rollback checkpoint capture, patch and validation event tracking, closure guardrails, and append-only incident timeline persistence under `~/.config/opencode/my_opencode/runtime/hotfix_mode.json`.
- Task 25.3: Command integration and docs
- Subtask 25.3.1: Add `/hotfix start|status|close`
- Subtask 25.3.2: Add automatic reminder for post-incident hardening tasks
- Subtask 25.3.3: Document incident playbooks and escalation notes
- Notes: Added `scripts/hotfix_command.py` command surface with start/status/close passthrough plus `remind` and `doctor`, wired `/hotfix*` aliases in `opencode.json`, integrated hotfix checks into installer self-check and install smoke workflows, and documented incident playbook usage in README.
- Task 25.4: Verification
- Subtask 25.4.1: Add tests for mandatory guardrail enforcement
- Subtask 25.4.2: Add tests for rollback and closure flow
- Subtask 25.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with dirty-worktree guardrail blocking and rollback incident lifecycle assertions, and expanded install smoke to validate `/hotfix close` failure without follow-up metadata before successful closure.
- Exit criteria: hotfix mode is faster while preserving mandatory safety controls
- Exit criteria: each hotfix run produces a clear post-incident audit trail
Status: done
Priority: Medium
Goal: Aggregate repository operational signals into a health score with drift alerts.
Depends on: Epic 9, Epic 12, Epic 20
- Task 26.1: Define health model and scoring weights
- Subtask 26.1.1: Define high-signal indicators (tests, hooks, stale branches, config drift)
- Subtask 26.1.2: Define weighted scoring and status thresholds
- Subtask 26.1.3: Define suppression window for repeated alerts
- Notes: Added `instructions/health_score_policy_contract.md` defining indicator schema, default weights/penalties, deterministic health thresholds, drift suppression-window behavior, and reason-code/remediation output requirements for upcoming `/health` commands.
- Task 26.2: Implement health collector
- Subtask 26.2.1: Collect diagnostics from existing command subsystems
- Subtask 26.2.2: Detect drift from expected profile/policy baselines
- Subtask 26.2.3: Persist score history and trend snapshots
- Notes: Added `scripts/health_score_collector.py` backend to collect repo/runtime signals (validation target readiness, git hygiene, budget/hooks drift, background failures, freshness debt), compute weighted health status, apply suppression-window semantics, and persist latest+history snapshots under runtime state files.
- Task 26.3: Command and reporting integration
- Subtask 26.3.1: Add `/health status|trend|drift`
- Subtask 26.3.2: Add JSON export for dashboards/CI
- Subtask 26.3.3: Document remediation recommendations by score bucket
- Notes: Added `scripts/health_command.py` with `status|trend|drift|doctor` command flows, wired `/health*` aliases in `opencode.json`, integrated health diagnostics into unified doctor plus installer/install-smoke coverage, and exposed score-bucket remediation guidance from backend recommendations.
- Task 26.4: Verification
- Subtask 26.4.1: Add tests for score determinism and threshold behavior
- Subtask 26.4.2: Add tests for drift detection precision
- Subtask 26.4.3: Add install-test smoke checks
- Notes: Expanded `scripts/selftest.py` with repeated status determinism checks and precise policy-drift attribution assertions, and expanded install smoke to exercise drift force-refresh behavior after controlled budget-profile drift injection.
- Exit criteria: health score reflects real operational risk with actionable guidance
- Exit criteria: drift signals are precise enough to avoid alert fatigue
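
Weighted scoring, status thresholds, and an alert suppression window can be sketched in a few lines. The weights, thresholds, status names, and one-hour window below are hypothetical values chosen for the example, not the defaults in `instructions/health_score_policy_contract.md`.

```python
def health_score(indicators, weights):
    """Weighted average of indicator values in [0, 1], scaled to 0-100."""
    total = sum(weights.values())
    earned = sum(
        weight * indicators.get(name, 0.0) for name, weight in weights.items()
    )
    return round(100.0 * earned / total, 1)


def health_status(score, healthy_at=80.0, degraded_at=50.0):
    """Map a score to a status bucket using assumed thresholds."""
    if score >= healthy_at:
        return "healthy"
    if score >= degraded_at:
        return "degraded"
    return "critical"


def should_alert(reason_code, now, last_alerted, window_seconds=3600):
    """Suppress repeats of the same alert inside the window."""
    previous = last_alerted.get(reason_code)
    if previous is not None and now - previous < window_seconds:
        return False
    last_alerted[reason_code] = now
    return True
```

Missing indicators count as zero in this sketch, so an unavailable signal lowers the score instead of being silently ignored. Whether that matches the intended policy is a design choice the contract would need to settle.
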
Status: done
Priority: Medium
Goal: Turn completed work into reusable patterns, checklists, and guidance for future runs.
Depends on: Epic 9, Epic 14, Epic 23
- Task 27.1: Define capture schema and quality gates
- Subtask 27.1.1: Define entry types (pattern, pitfall, checklist, rule candidate)
- Subtask 27.1.2: Define confidence score and approval workflow
- Subtask 27.1.3: Define tagging and search metadata
- Notes: Added `instructions/knowledge_capture_policy_contract.md` defining entry taxonomy, deterministic confidence scoring, approval quality gates, and required tagging/search metadata for reusable task learnings.
- Task 27.2: Implement knowledge extraction pipeline
- Subtask 27.2.1: Extract signals from merged PRs and task digests
- Subtask 27.2.2: Generate draft entries with source links
- Subtask 27.2.3: Support review/edit/publish lifecycle
- Notes: Added `scripts/knowledge_capture_pipeline.py` with merged-PR + digest signal extraction, grouped draft generation with evidence links, and deterministic review/publish/archive quality-gate transitions.
- Task 27.3: Command and integration surface
- Subtask 27.3.1: Add `/learn capture|review|publish|search`
- Subtask 27.3.2: Integrate published patterns with rules injector and `/autoflow` workflow docs
- Subtask 27.3.3: Document maintenance process for stale entries
- Notes: Added `scripts/learn_command.py`, wired `/learn*` aliases plus doctor/install coverage, and documented knowledge-assisted `/autoflow` workflow guidance with stale-entry maintenance loops.
- Task 27.4: Verification
- Subtask 27.4.1: Add tests for extraction quality thresholds
- Subtask 27.4.2: Add tests for approval/publish permissions
- Subtask 27.4.3: Add install-test smoke checks
- Notes: Expanded learn selftest with low-confidence review rejection and high-risk double-approval publish gating, and expanded install-smoke checks to enforce the same high-risk approval behavior.
- Exit criteria: completed work reliably yields reusable, reviewed guidance
- Exit criteria: stale/low-confidence knowledge can be pruned safely
Status: done
Priority: High
Goal: Add /autopilot as a high-level objective runner that executes bounded autonomous cycles with explicit controls.
Depends on: Epic 20, Epic 22
- Task 28.1: Define command contract and safety defaults
- Subtask 28.1.1: Define subcommands (`start`, `status`, `pause`, `resume`, `stop`, `report`)
- Subtask 28.1.2: Define required objective fields (`goal`, `scope`, `done-criteria`, `max-budget`)
- Subtask 28.1.3: Define safe default behavior (`dry-run` preview before first execution)
- Notes: Added `instructions/autopilot_command_contract.md` defining command surface, required objective schema, deterministic lifecycle transitions, dry-run-first safety defaults, and output reason-code invariants.
- Task 28.2: Implement objective orchestration loop
- Subtask 28.2.1: Break objective into bounded execution cycles
- Subtask 28.2.2: Apply budget guardrails and mandatory checkpoints per cycle
- Subtask 28.2.3: Emit progress, blockers, and next-step recommendations
- Notes: Added `scripts/autopilot_runtime.py` with objective schema validation, bounded cycle execution, per-cycle budget hard-stop evaluation, mandatory checkpoint persistence, and deterministic progress/blocker recommendation payloads.
- Task 28.3: Integrate with existing control subsystems
- Subtask 28.3.1: Reuse `/autoflow` primitives for plan and state transitions
- Subtask 28.3.2: Integrate with todo enforcement and resume/checkpoint systems
- Subtask 28.3.3: Add explicit manual handoff mode when confidence drops
- Notes: Added `scripts/autopilot_integration.py` bridging autopilot run-state to autoflow transition evaluation, todo/resume/checkpoint control diagnostics, and confidence-based manual handoff safeguards.
- Task 28.4: Command UX, docs, and workflows
- Subtask 28.4.1: Add `/autopilot` examples in `README.md`
- Subtask 28.4.2: Add workflow guides (quick-fix objective, feature objective, release objective)
- Subtask 28.4.3: Add troubleshooting guide for stopped/paused runs
- Notes: Added `scripts/autopilot_command.py` command surface (`start|status|pause|resume|stop|report|doctor`), wired `/autopilot*` aliases in `opencode.json`, integrated `/autopilot` checks into install self-check + unified doctor diagnostics, and documented lifecycle workflows/troubleshooting in `README.md`.
- Task 28.5: Verification
- Subtask 28.5.1: Add tests for scope bounding and budget cap enforcement
- Subtask 28.5.2: Add tests for pause/resume/stop transitions
- Subtask 28.5.3: Add install-test smoke scenarios for objective lifecycle
- Notes: Expanded `scripts/selftest.py` with autopilot scope-violation and budget hard-stop assertions plus pause/resume/stop transition checks, and expanded install smoke to run in-scope and out-of-scope `/autopilot resume` scenarios.
- Exit criteria: `/autopilot` never exceeds declared objective scope and budget limits
- Exit criteria: users can inspect and control every run stage with clear status output
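
Bounded cycle execution with scope and budget checks before every step can be sketched as below. This is a simplified illustration rather than the loop in `scripts/autopilot_runtime.py`: the step shape (`(target, tool_call_cost)`), the `in_scope` predicate, and the reason codes are hypothetical.

```python
def _stopped(reason, completed, used):
    return {"status": "stopped", "reason": reason,
            "completed": completed, "tool_calls": used}


def run_objective(steps, in_scope, max_cycles, max_tool_calls):
    """Run steps one cycle at a time, checking every bound before acting.

    `steps` is a list of (target, tool_call_cost) pairs.
    """
    used = 0
    completed = []
    for cycle, (target, cost) in enumerate(steps, start=1):
        if cycle > max_cycles:
            return _stopped("cycle_limit_reached", completed, used)
        if not in_scope(target):
            return _stopped("scope_violation", completed, used)
        if used + cost > max_tool_calls:
            return _stopped("budget_exceeded", completed, used)
        used += cost
        completed.append(target)  # a per-cycle checkpoint would be persisted here
    return {"status": "completed", "reason": "all_steps_done",
            "completed": completed, "tool_calls": used}
```

Checking the bounds before each step, not after it, is what keeps the run from ever exceeding the declared scope or budget, at the cost of sometimes stopping one step early.
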
Status: done
- Task C1: Add release slicing plan by phase
- Subtask C1.1: Phase A (low-risk foundation): Epic 1
- Subtask C1.2: Phase B (workflow power): Epic 2 + Epic 3
- Subtask C1.3: Phase C (advanced automation): Epic 4 + Epic 5
- Subtask C1.4: Phase D (control layer): Epic 8 + Epic 9 + Epic 10
- Subtask C1.5: Phase E (resilience and observability): Epic 11 + Epic 12
- Subtask C1.6: Phase F (workflow expansion): Epic 13 + Epic 14
- Subtask C1.7: Phase G (quality and control): Epic 15 + Epic 23
- Subtask C1.8: Phase H (recovery and semantic safety): Epic 17 + Epic 18 + Epic 19
- Subtask C1.9: Phase I (bounded autonomy): Epic 20 + Epic 22
- Subtask C1.10: Phase J (unified orchestration): Epic 22
- Subtask C1.11: Phase K (delivery acceleration): Epic 23 + Epic 24 + Epic 25
- Subtask C1.12: Phase L (operational intelligence): Epic 26 + Epic 27
- Subtask C1.13: Phase M (objective autonomy): Epic 28
- Subtask C1.14: Phase N (optional power-user): Epic 6 + Epic 7
- Notes: Release slicing now uses deterministic gates per phase so each slice ships only when command contract/docs, validation suite (`make validate`, `make selftest`, `make install-test`), and rollback notes are complete.
- Phase gate baseline:
  - Phase A: Foundation config readiness (`E1`) with layered writes and fallback-safe migration.
  - Phase B: Async workflow operability (`E2-E3`) with queued execution plus safe-refactor preflight.
  - Phase C: Policy/control baseline (`E4-E5`) with hooks and routing fallback explainability.
  - Phase D: Rule-driven command control (`E8-E10`) with keyword/rule determinism and intent-mapping safety.
  - Phase E: Runtime resilience visibility (`E11-E12`) with context recovery and fallback trace outputs.
  - Phase F: Workflow expansion (`E13-E14`) with browser profile switching and plan execution bridge.
  - Phase G: Quality and review control (`E15-E23`) with todo compliance and PR risk triage.
  - Phase H: Recovery-safe execution (`E17-E19`) with resume governance, safe-edit, and checkpoint continuity.
  - Phase I: Bounded autonomy (`E20-E22`) with budgets and unified orchestration controls.
  - Phase J: Unified orchestration consolidation (`E22`) as migration gate for command surface consistency.
  - Phase K: Delivery acceleration (`E23-E25`) with review, release-train, and incident-mode readiness.
  - Phase L: Operational intelligence (`E26-E27`) with health drift telemetry and reusable knowledge capture.
  - Phase M: Objective autonomy (`E28`) with scoped/budgeted autopilot lifecycle controls.
  - Phase N: Optional power-user capabilities (`E6-E7`) behind demand-driven adoption gates.
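
The gate rule in the C1 notes (a slice ships only when contract/docs, the validation suite, and rollback notes are complete) can be expressed as a small check. This is a sketch under assumptions: the evidence keys and function names below are hypothetical labels chosen for illustration, not identifiers from the repository.

```python
# Hypothetical evidence keys, one per requirement named in the C1 notes.
REQUIRED_GATE_ITEMS = (
    "contract_docs",    # command contract/docs complete
    "make_validate",    # `make validate` passed
    "make_selftest",    # `make selftest` passed
    "make_install_test",  # `make install-test` passed
    "rollback_notes",   # rollback notes written
)


def missing_gate_items(evidence: dict[str, bool]) -> list[str]:
    """Return gate items that are absent or false, in a fixed order."""
    return [item for item in REQUIRED_GATE_ITEMS if not evidence.get(item, False)]


def phase_can_ship(evidence: dict[str, bool]) -> bool:
    """A phase ships only when no gate item is missing."""
    return not missing_gate_items(evidence)
```

Returning the missing items in a fixed order, rather than a bare boolean, keeps the result deterministic and gives the operator a concrete list of what still blocks the phase.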
- Task C2: Add acceptance criteria template per epic
- Subtask C2.1: Functional criteria
- Subtask C2.2: Reliability criteria
- Subtask C2.3: Documentation criteria
- Subtask C2.4: Validation criteria (`make validate`, `make selftest`, `make install-test`)
- Subtask C2.5: Evidence links (PR, commit, test output summary)
- Subtask C2.6: Docs quality criteria (`README` updates + command examples + end-to-end workflow guides)
- Subtask C2.7: Measurable thresholds (0 failing checks, explicit risk notes, clear rollback path)
- Notes: Added reusable acceptance template and quality gates for every epic so completion criteria are measurable and reviewable.
- Epic acceptance criteria template:
  - Functional: command/feature paths behave as defined in contract docs and reject invalid input with deterministic reason codes.
  - Reliability: runtime state writes are atomic/idempotent where required, with explicit recovery and rollback guidance.
  - Documentation: `README.md`, roadmap notes, and `CHANGELOG.md` are updated with usage examples and operator-facing caveats.
  - Validation: `make validate`, `make selftest`, and `make install-test` pass on the task branch before merge.
  - Evidence: PR URL, commit SHA, and concise test result summary are recorded in issue/PR body.
  - Docs quality: include at least one workflow example and one troubleshooting/remediation path for new command surfaces.
  - Thresholds: zero failing required checks, explicit risk notes for non-trivial behavior changes, and clear rollback path.
- Task C3: Add tracking cadence
- Subtask C3.1: Weekly status update section in this file
- Subtask C3.2: Keep one epic `in_progress`
- Subtask C3.3: Move deferred work to `postponed` explicitly
- Subtask C3.4: Revisit paused/postponed epics at least once per month
- Notes: Added explicit cadence policy so roadmap updates always record one active epic focus, a weekly status entry, and a monthly paused/postponed review checkpoint.
- Cadence policy:
  - Weekly: append one dated status bullet in `Weekly Status Updates` with completed cards and next focus.
  - Active scope: keep exactly one epic marked `in_progress` unless emergency/hotfix overlap is explicitly documented.
  - Deferred hygiene: represent non-active backlog epics as `paused` or `postponed`, not `in_progress`.
  - Monthly: review paused/postponed epics at least once every 30 days and capture decisions in `Decision Log`.
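
The active-scope and monthly-review rules of the cadence policy lend themselves to an automated check. The following is a minimal sketch, assuming epic statuses are available as a plain mapping; the function name, its parameters, and the message strings are hypothetical.

```python
from datetime import date


def cadence_violations(
    statuses: dict[str, str],
    last_review: date,
    today: date,
    overlap_documented: bool = False,
) -> list[str]:
    """Check the single-active-epic rule and the 30-day review rule.

    `overlap_documented` models the policy's emergency/hotfix exception:
    more than one active epic is tolerated only when it is set.
    """
    problems = []
    active = sorted(epic for epic, status in statuses.items() if status == "in_progress")
    overlap_allowed = len(active) > 1 and overlap_documented
    if len(active) != 1 and not overlap_allowed:
        problems.append(f"expected exactly one in_progress epic, found {len(active)}")
    if (today - last_review).days > 30:
        problems.append("paused/postponed review is more than 30 days old")
    return problems
```

The sketch reads "exactly one" literally, so a roadmap with no active epic is also reported; a project that treats "all done" as valid would relax that branch.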
- Task C4: Command UX baseline (quality-of-life required)
- Subtask C4.1: Add command autocomplete shortcuts in `opencode.json`
- Subtask C4.2: Add command help and doctor JSON outputs
- Subtask C4.3: Add code-assistant guidance snippets (inputs, expected outputs, safe defaults)
- Subtask C4.4: Add tips/troubleshooting output for common failures
- Subtask C4.5: Add hover-like inline explanation docs (what it does, when to use, limits)
- Subtask C4.6: Add at least one easy-path command alias for frequent workflows
- Notes: Baseline is now standardized across command families using alias shortcuts in `opencode.json`, `*-help` and `*-doctor-json` pathways, installer hints, and README quick-reference snippets.
- UX baseline checklist:
  - Autocomplete/shortcut aliases exist for core command families (`/doctor*`, `/config*`, `/bg*`, `/stack*`, `/autoflow*`, `/autopilot*`, `/release-train*`, `/hotfix*`, `/health*`, `/learn*`).
  - Help/doctor paths are consistently available either as explicit aliases or command-level `help`/`doctor --json` subcommands.
  - README contains command snippets with safe defaults and operator-focused examples.
  - Installer output includes fast-path hints and troubleshooting-oriented follow-up commands.
  - Frequent workflows expose easy-path shortcuts (status/report/doctor variants) to reduce argument overhead.
Run this checklist for every roadmap refinement pass:
- No duplicate command ownership across epics.
- No ambiguous command names ( or equivalent, placeholder aliases).
- Dependencies are acyclic and point to existing epics only.
- Each high-priority epic has explicit safety and rollback notes.
- Docs requirements are present (`README`, examples, workflow guide).
- Low-value or high-noise epics are paused/postponed unless a measurable gap exists.
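
The dependency item in this checklist (acyclic, pointing to existing epics only) can be verified mechanically. Below is a minimal sketch using the standard-library `graphlib` module (Python 3.9+). The `ROADMAP_DEPS` mapping copies only the E1-E6 rows of the "Depends On" column from the dashboard table; the function name and message formats are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Epic -> epics it depends on, taken from the E1-E6 dashboard rows.
ROADMAP_DEPS = {
    "E1": [],
    "E2": ["E1"],
    "E3": ["E1"],
    "E4": ["E1", "E2"],
    "E5": ["E1"],
    "E6": ["E2"],
}


def check_dependencies(deps: dict[str, list[str]]) -> list[str]:
    """Report dependencies on unknown epics and any dependency cycle."""
    problems = []
    for epic, needs in deps.items():
        for dep in needs:
            if dep not in deps:
                problems.append(f"{epic} depends on unknown epic {dep}")
    try:
        # Consuming static_order() raises CycleError if a cycle exists.
        tuple(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of exc.args holds the nodes forming the cycle.
        problems.append("dependency cycle: " + " -> ".join(exc.args[1]))
    return problems
```

An empty result means both conditions hold. When there is no cycle, the same `static_order()` call also yields a valid implementation order, which can be compared against the delivery sequence.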
Use this log to track what changed week by week.
- 2026-02-14: Completed C1-C3 cross-cutting roadmap governance tasks; next focus is Task C4 command UX baseline while keeping only Epic 6 as `in_progress`.
- 2026-02-14: Completed C4 command UX baseline governance task; next focus returns to Epic 6 scoped delivery work.
- Now: E1 -> E2 -> E3 -> E20
- Next: E14 -> E15 -> E22
- Later: E23 -> E24 -> E26 -> E27
- Deferred: E7 (postponed), E16/E21 (merged)
- 2026-02-12: Adopt stable-first sequencing; prioritize E1 before orchestration-heavy epics.
- 2026-02-12: Keep E6 paused until E1-E5 foundations stabilize.
- 2026-02-12: Keep E7 postponed pending stronger demand for tmux visual mode.
- 2026-02-12: Add E8-E14 as high-value extensions identified from comparative analysis.
- 2026-02-12: Add E15-E21 for enforcement, quality, recovery, semantic editing, checkpointing, budgets, and bounded loops.
- 2026-02-12: Add E22-E27 to unify orchestration and accelerate delivery quality and release reliability.
- 2026-02-13: Add E28 `/autopilot` as a non-duplicated high-value command on top of `/autoflow`.
- 2026-02-13: Pause/postpone lower-confidence epics (E10, E28) until measurable value is proven.
- 2026-02-13: Merge duplicate epic scopes E16 -> E23 and E21 -> E22/E28.
- 2026-02-13: Require command UX baseline (autocomplete, assistant tips, hovers/explanations, QoL aliases) for all new command features.
- 2026-02-14: Complete C1 release slicing plan with deterministic phase gates across A-N and keep deferred queue aligned with current epic statuses.
- 2026-02-14: Complete C3 tracking cadence rules (weekly update, single active epic policy, and monthly paused/postponed review requirement).
- 2026-02-14: Monthly paused/postponed review checkpoint: keep E7 `postponed`; no promotion due to current value/risk profile.
- 2026-02-14: Complete C4 command UX baseline and standardize alias/help/doctor usability expectations across command families.
- 2026-02-14: Complete E6-T1 session metadata index backend with digest-linked event capture and retention pruning defaults.
- 2026-02-14: Complete E6-T2 session command surface (`/session list|show|search`) with index diagnostics and install/selftest coverage.
- 2026-02-14: Complete E6-T3 resume support with actionable `resume_hints` outputs and digest-integrated recovery cues.
- 2026-02-14: Close E28-T5 verification task and mark Epic 28 done after validating lifecycle and install smoke coverage.
- 2026-02-14: Reconcile roadmap dashboard drift for E25-E27 and mark cross-cutting delivery status done for consistency with completed tasks.
- 2026-02-14: Reconcile Epic 18 checkbox state so Task 18.2/18.3 match completed subtasks and notes.
- 2026-02-14: Define measurable pause-exit criteria for E7/E10 so promotion decisions are evidence-based and reversible.
- 2026-02-14: Complete E10 implementation with preview-first intent routing, audit logging, and representative precision validation.
- Start with Epic 1 next (lowest risk, highest leverage).
- Prioritize E8-E9 after E1-E5 for fast workflow gains.
- Prioritize E11-E12 before E13-E14 when stability concerns are high.
- Prioritize E15 + E20 before E22 to keep autonomy controlled and auditable.
- Prioritize E22 before E23-E27 so higher-level automation builds on stable primitives.
- Keep Epic 7 postponed until clear demand and value evidence justify promotion.