This is the working PR checklist for building ripr incrementally. It is more
operational than the roadmap: each entry should become a scoped PR
with clear artifacts, tests, documentation updates, and gates.
The checklist is grouped into implementation campaigns. A Codex Goal may work through multiple work items in a campaign, but each work item should follow the scoped PR contract.
| Campaign | Objective | Work items |
|---|---|---|
| Agentic DevEx Foundation | Make the repo safe for Codex Goals and human review. | policy/architecture-guard, output/output-contract-check, docs/codex-goals-campaigns, fixtures/runner-comparison-v1, fixtures/first-two-goldens, testing/test-oracle-report, dogfood/static-self-check |
| Syntax-Backed Analyzer Foundation | Move the analyzer from lexical facts to syntax-backed facts. | analysis/file-facts-model, analysis/syntax-adapter-mvp, design/rust-syntax-substrate, analysis/ast-test-oracle-extraction, analysis/ast-probe-ownership, analysis/ast-probe-generation |
| Evidence Quality | Improve oracle strength, local flow, activation values, output evidence, and stop reasons. | output/unknown-stop-reason-invariant, analysis/oracle-strength-v2, analysis/local-delta-flow-v1, analysis/activation-value-modeling-v1, output/evidence-first-output, fixtures/negative-metamorphic-baseline |
| Test Efficiency and Vacuity Signals (4A) | Make low-discriminator, smoke-only, broad-oracle, opaque, circular, and duplicate test signals visible as advisory evidence; ship ripr and ripr+ badge artifacts. | test-efficiency/test-fact-ledger, test-efficiency/vacuous-signal-v1, test-efficiency/duplicate-discriminator-v1, test-efficiency/report-and-metrics, badge/ripr-count-v1, badge/ripr-plus-count-v1, badge/repo-scope-artifacts, badge/publish-main-endpoint |
| Repo Seam Inventory and Test Grip (4B) | Inventory behavior seams, classify test-grip per seam, and turn actionable gaps into editor diagnostics and agent-ready packets. | spec/repo-seam-inventory, analysis/repo-seam-model-v1, analysis/repo-seam-inventory-v1, analysis/test-grip-evidence-v1, analysis/repo-ripr-classification-v1, output/repo-exposure-report-v1, lsp/repo-seam-diagnostics-v1, lsp/seam-evidence-hover-v1, context/agent-seam-packets-v1, docs/agent-dispatch-workflow-v1 |
| Seam Evidence Usability and Precision (5A) | Make repo seam evidence fast, precise, and directly actionable for developers and coding agents. | Complete: #255, #310, #313, #314, #315, #316, #327, and campaign/seam-evidence-usability-closeout. |
| Operationalization (5B) | Govern analyzer behavior with repository config, integrate SARIF/CI policy modes, and remap badges onto seam-native counts. | config/ripr-config-v1, ci/sarif-ci-policy, badge/seam-native-count-mapping |
| Module SRP Refactoring (6) | Refactor internal modules under crates/ripr/src/ so each module has one product responsibility, without splitting the package. | See docs/IMPLEMENTATION_CAMPAIGNS.md Campaign 6. |
The active machine-readable campaign manifest is `.ripr/goals/active.toml`.
Campaigns 1 through 5A are complete, and the active queue is now Campaign 5B
Operationalization. `config/ripr-config-v1` established the repo-root
configuration surface, `ci/sarif-ci-policy` added SARIF rendering and an
opt-in baseline policy, and `badge/seam-native-count-mapping` is the next ready
item.
Purpose: put the plan, engineering rules, metrics, ADRs, specs, changelog, and traceability conventions in the repository before analyzer rewrites begin.
Deliverables:
- Update `docs/ROADMAP.md` with the release sequence and PR queue.
- Add an implementation checklist that future PRs can update.
- Add ADR scaffolding and initial ADRs for product-shaping decisions.
- Add spec scaffolding for behavior contracts.
- Add metrics definitions for capability and regression tracking.
- Add learnings and repo-knowledge log.
- Add spec-test-code traceability rules.
- Update the README doc index and metric summary.
- Add a root changelog.
- Add PR review checklist guidance.
- Add contributor workflow guidance.
- Add CI strategy guidance.
- Add dogfooding guidance.
- Add ADR and spec templates.
- Add changelog policy guidance.
- Add scoped evidence-heavy PR doctrine.
- Add first executable policy checks for static language and panic-family debt.
Acceptance:
- A contributor can identify the next PR from docs alone.
- A contributor can identify which spec, tests, and code modules belong together for a feature.
- The docs state that production and test code should avoid `panic`, `unwrap`, and `expect`, and that existing uses are tracked debt.
- The docs preserve the product contract and conservative static language.
- PRs are scoped by production risk rather than line count.
Purpose: verify the normal VS Code extension path without requiring users to
install ripr separately.
Deliverables:
- Manual install verification matrix for VS Marketplace and Open VSX.
- Fresh-profile check with no `ripr` on `PATH`.
- Server auto-download and checksum verification evidence.
- Output-channel log checklist for mode, base, config, server path, and download source.
- Clear-error scenarios for disabled auto-download, missing manifest, unsupported platform, and checksum mismatch.
Tests and gates:
- `cd editors/vscode && npm ci`
- `cd editors/vscode && npm run compile`
- `cd editors/vscode && npm run package`
Purpose: expand the initial policy checks into a broader local and CI quality rail.
Deliverables:
- Move static language and panic-family checks into CI.
- Add markdown local link check.
- Add doc index check for README, docs, specs, and ADRs.
- Add traceability manifest validation.
- Add capability matrix validation.
- Add PR-scope check for production delta and evidence delta.
Acceptance:
- `cargo xtask ci-fast` runs the core policy checks.
- Existing debt is allowlisted with counts, and new debt fails the check.
- Docs explain how to remove allowlist entries as debt is paid down.
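
The count-based allowlist above could behave roughly as follows. This is a minimal sketch, not the xtask implementation: the `path count` line format and the names `parse_allowlist` and `new_debt` are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Parses a hypothetical allowlist: one `path count` pair per line,
/// with blank lines and `#` comments ignored.
fn parse_allowlist(text: &str) -> BTreeMap<String, usize> {
    let mut allowed = BTreeMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let mut parts = line.split_whitespace();
        if let (Some(path), Some(count)) = (parts.next(), parts.next()) {
            if let Ok(count) = count.parse::<usize>() {
                allowed.insert(path.to_string(), count);
            }
        }
    }
    allowed
}

/// Returns `(path, allowed, observed)` for every file whose observed debt
/// count exceeds its allowlisted count. Files missing from the allowlist
/// have a budget of zero, so any new debt is reported.
fn new_debt(
    allowed: &BTreeMap<String, usize>,
    observed: &BTreeMap<String, usize>,
) -> Vec<(String, usize, usize)> {
    observed
        .iter()
        .filter_map(|(path, &seen)| {
            let budget = allowed.get(path).copied().unwrap_or(0);
            (seen > budget).then(|| (path.clone(), budget, seen))
        })
        .collect()
}
```

A check built this way fails when `new_debt` returns a non-empty list. Paying down debt then means lowering or deleting the matching allowlist entry.
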
Purpose: keep repo implementation and automation Rust-first by denying unapproved non-Rust programming files, checked-in executable scripts, and workflow shell sprawl.
Deliverables:
- Add Rust-first file policy docs.
- Add non-Rust allowlist with owner, kind, and reason.
- Add workflow shell-budget allowlist.
- Add `cargo xtask check-file-policy`.
- Add `cargo xtask check-executable-files`.
- Add `cargo xtask check-workflows`.
- Wire checks into `cargo xtask ci-fast`.
- Wire checks into CI.
Acceptance:
- Rust is documented as the default implementation and automation language.
- Existing VS Code, workflow, docs, fixture, asset, and config surfaces are explicitly allowlisted.
- New shell, Python, JavaScript, TypeScript, or other programming files outside approved surfaces fail the file policy check.
- Checked-in executable bits fail unless allowlisted.
- Long workflow run blocks fail unless allowlisted.
Future policy PRs:
- generated-file policy
- dependency-surface policy
- process-spawn policy
- network policy
- workspace-shape policy
- architecture import guard
- public API guard
Purpose: make specs and fixtures agent-readable and mechanically checkable before fixture and golden output work expands.
Deliverables:
- Add spec format reference.
- Add test taxonomy reference.
- Add fixture contract README.
- Update existing specs to the checked format.
- Add `cargo xtask check-spec-format`.
- Add `cargo xtask check-fixture-contracts`.
- Wire checks into `cargo xtask ci-fast`.
- Wire checks into CI.
Acceptance:
- Every `docs/specs/RIPR-SPEC-*.md` has required sections and a valid status.
- Spec filename IDs match title IDs.
- Future fixture directories must include `SPEC.md`, `diff.patch`, and `expected/check.json`.
- Fixture `SPEC.md` files must include Given/When/Then/Must Not sections.
Purpose: finish the first Rust-first policy family by making generated files, dependency surfaces, process spawning, and network behavior explicit.
Deliverables:
- Add generated-file allowlist and `cargo xtask check-generated`.
- Add dependency-surface allowlist and `cargo xtask check-dependencies`.
- Add process-spawn allowlist and `cargo xtask check-process-policy`.
- Add network allowlist and `cargo xtask check-network-policy`.
- Wire checks into `cargo xtask ci-fast`.
- Wire checks into CI.
- Update the file policy, CI docs, contributor docs, and PR template.
Acceptance:
- Tracked generated lockfiles and future fixture goldens require explicit allowlist entries.
- New dependency manager files fail unless they belong to approved Cargo, VS Code, or fixture surfaces.
- New process spawning fails unless allowlisted with a reason.
- New network behavior fails unless allowlisted with a reason.
Purpose: add the first mutating PR-shaping commands without changing existing policy semantics.
Deliverables:
- Add `cargo xtask shape`.
- Add `cargo xtask fix-pr`.
- Run `cargo fmt` through `shape`.
- Sort `.ripr/*.txt` and `policy/*.txt` allowlists through `shape`.
- Ensure `target/ripr/reports` exists.
- Write `target/ripr/reports/shape.md`.
- Write `target/ripr/reports/fix-pr.md`.
- Document safe mutations and repair guidance.
Acceptance:
- `cargo xtask shape` passes.
- `cargo xtask fix-pr` passes.
- `cargo xtask ci-fast` passes after shaping.
- Shaping does not add policy exceptions or bless output drift.
Purpose: generate a reviewer packet before human review without mutating source files.
Deliverables:
- Add `cargo xtask pr-summary`.
- Read changed paths from git diff and git status.
- Write `target/ripr/reports/pr-summary.md`.
- Classify production delta and evidence/support delta.
- Classify detected surfaces, public contracts, and policy exceptions.
- Suggest reviewer focus files.
- Update `cargo xtask fix-pr` to refresh the PR summary after shaping.
Acceptance:
- `cargo xtask pr-summary` passes.
- `target/ripr/reports/pr-summary.md` exists after the command.
- `cargo xtask fix-pr` refreshes shape, PR summary, and fix-pr reports.
Purpose: document the fix/check/guide operating model and the Codex Goals campaign handoff so automation and analyzer implementation work share the same review contract.
Deliverables:
- Add a PR automation operating model.
- Document deterministic shaping, non-mutating checks, and repair briefs.
- Document the scoped PR contract.
- Record the automation cutoff that made Campaign 1 safe to leave setup mode.
- Link the new docs from the roadmap, documentation map, agent workflow, contributor docs, and README.
Acceptance:
- A contributor can identify which cleanup should be automated and which changes require explicit judgment.
- A coding agent can identify the next automation PRs without confusing them with product campaign work.
- A coding agent can use a standard task template for the analyzer queue.
Purpose: add obvious local gates for cheap pre-commit checks and review readiness checks.
Deliverables:
- Add `cargo xtask precommit`.
- Add `cargo xtask check-pr`.
- Keep `precommit` cheap and non-mutating.
- Make `check-pr` run the review-ready command set that exists today.
- Update CI, contributor, and agent docs.
Acceptance:
- `cargo xtask precommit` passes on main.
- `cargo xtask check-pr` passes on main.
- `check-pr` does not run release packaging unless the repo later adds a path-aware release lane.
Purpose: make existing policy checks emit repair briefs instead of only command failure text.
Deliverables:
- Add a shared report model or helper for Markdown check reports.
- Upgrade static-language, panic-family, file-policy, executable-file, workflow, spec-format, fixture-contract, generated, dependency, process, and network checks to write reports under `target/ripr/reports`.
- Classify failures as auto-fixable, author decision, reviewer decision, or policy exception.
- Include exact rerun commands and exception templates where useful.
Acceptance:
- Each upgraded check writes a useful report on failure.
- Successful checks either write a pass report or are summarized by `pr-summary`.
- Report generation does not hide the non-zero exit status of failed checks.
Purpose: make CI upload review artifacts even when a check fails.
Deliverables:
- Run `cargo xtask pr-summary` where possible in CI.
- Defer metrics report generation until `cargo xtask metrics` exists.
- Upload `target/ripr/reports` with an always step.
- Document report artifact names and expected contents.
Acceptance:
- CI artifacts include the PR summary and any check reports that were generated before failure.
- CI remains non-mutating.
Purpose: add the command surface for fixture execution and golden comparison before analyzer internals change.
Deliverables:
- Add `cargo xtask fixtures`.
- Add `cargo xtask fixtures <name>`.
- Add `cargo xtask goldens check`.
- Add `cargo xtask goldens bless <name> --reason "..."`.
- Document the fixture and golden directory conventions.
Acceptance:
- Fixture commands pass with a clear "no fixtures found" message if no executable fixtures exist yet.
- Existing fixture contract checks still pass.
- Golden blessing requires an explicit reason.
Purpose: make spec IDs and behavior manifest entries checkable.
Deliverables:
- Harden `.ripr/traceability.toml`.
- Add `cargo xtask check-spec-ids`.
- Add `cargo xtask check-behavior-manifest`.
- Add warning-only drift checks for analysis, output, docs, fixture, and metric changes.
Acceptance:
- Accepted specs point to real docs and at least one test or fixture unless explicitly planned.
- Fixture specs reference valid spec IDs.
- Missing expected evidence appears in the PR summary.
Purpose: make capability progress and automation debt visible.
Deliverables:
- Add or harden a machine-readable capability source.
- Add `cargo xtask metrics`.
- Add `cargo xtask check-capabilities`.
- Write `target/ripr/reports/metrics.md` or `metrics.json`.
- Keep the README capability snapshot aligned with the capability source.
Acceptance:
- Capability statuses have valid values and required fields.
- Stable or calibrated statuses require the evidence defined by policy.
- Metrics reports are generated without changing product behavior.
Purpose: protect internal seams while keeping one published package.
Deliverables:
- Add `cargo xtask check-workspace-shape`.
- Add `cargo xtask check-architecture`.
- Add `cargo xtask check-public-api` or document why it is deferred.
- Add policy metadata for allowed workspace packages and module-boundary rules.
Acceptance:
- New workspace packages require an explicit approved policy entry.
- Domain and analysis layers cannot accidentally depend on adapters.
- CLI, LSP, and output layers do not own exposure classification.
Purpose: make README state and Markdown links part of the checked trust packet.
Deliverables:
- Add `cargo xtask check-readme-state`.
- Add `cargo xtask markdown-links`.
- Check README front-door sections and headline capability snapshot shape.
- Check README/capability matrix checkpoint drift against `metrics/capabilities.toml`.
- Check repo-local Markdown links in tracked `.md` files.
- Wire the checks into `precommit` and `ci-fast`.
- Update CI and PR automation docs.
Acceptance:
- Deleted or renamed docs fail before review when still linked.
- README remains linked to active campaign, metrics, capability, and automation docs.
- `cargo xtask check-readme-state` and `cargo xtask markdown-links` pass on main.
Purpose: make the active Codex Goals campaign queue mechanically checkable and reportable.
Deliverables:
- Add `cargo xtask check-campaign`.
- Add `cargo xtask check-goals` as an alias.
- Add `cargo xtask goals status`.
- Add `cargo xtask goals next`.
- Validate `.ripr/goals/active.toml` against `docs/IMPLEMENTATION_CAMPAIGNS.md`.
- Validate work item IDs, statuses, branch fields, acceptance claims, stackability, merge boundaries, blocked dependencies, and command names.
- Wire the manifest check into `precommit` and `ci-fast`.
Acceptance:
- `cargo xtask check-campaign` passes on main.
- `cargo xtask goals status` writes `target/ripr/reports/goals.md`.
- `cargo xtask goals next` writes `target/ripr/reports/goals-next.md`.
Purpose: make fixture and golden commands execute the current product and compare actual output against checked-in expected output.
Deliverables:
- `cargo xtask fixtures` runs all fixtures when fixture directories exist.
- `cargo xtask fixtures <name>` runs one fixture.
- Actual JSON and human outputs are written under `target/ripr/fixtures/<name>/`.
- `cargo xtask goldens check` compares actual `check.json` and optional `human.txt` outputs against `fixtures/<name>/expected/`.
- `cargo xtask goldens bless <name> --reason "..."` requires a reason, updates `expected/check.json` and `expected/human.txt`, and appends to the fixture changelog.
Acceptance:
- Fixture commands still pass with a clear report when no fixture directories exist.
- Golden checks fail on drift without mutating expected outputs.
- Golden blessing remains explicit and does not run from `shape` or `fix-pr`.
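
The drift and blessing rules above can be pictured with a small comparison helper. This is a sketch under assumptions, not the actual `goldens` code: `GoldenStatus`, `compare_golden`, and `bless_reason` are hypothetical names, and the comparison is line-based, so a trailing newline difference is not treated as drift.

```rust
/// Outcome of comparing one actual output against its checked-in golden.
#[derive(Debug, PartialEq)]
enum GoldenStatus {
    Match,
    /// 1-based line number of the first difference.
    Drift { line: usize },
}

/// Compares line by line and never writes anything: a failing check
/// reports drift and leaves the expected output untouched.
fn compare_golden(expected: &str, actual: &str) -> GoldenStatus {
    let mut exp = expected.lines();
    let mut act = actual.lines();
    let mut line = 0;
    loop {
        line += 1;
        match (exp.next(), act.next()) {
            (None, None) => return GoldenStatus::Match,
            (e, a) if e == a => continue,
            _ => return GoldenStatus::Drift { line },
        }
    }
}

/// Blessing is refused unless the caller supplies a non-empty reason.
fn bless_reason(reason: Option<&str>) -> Result<String, String> {
    match reason.map(str::trim) {
        Some(r) if !r.is_empty() => Ok(r.to_string()),
        _ => Err("golden blessing requires --reason \"...\"".to_string()),
    }
}
```

Keeping comparison and blessing as separate functions mirrors the rule that blessing is an explicit action and never a side effect of a check.
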
Purpose: build the regression control bench before changing analyzer internals.
Deliverables:
- `fixtures/boundary_gap`
- `fixtures/weak_error_oracle`
- `fixtures/field_not_asserted`
- `fixtures/side_effect_unobserved`
- `fixtures/smoke_assertion_only`
- `fixtures/no_static_path`
- `fixtures/opaque_fixture`
- `fixtures/workspace_cross_crate`
- `fixtures/duplicate_symbols`
- `fixtures/stacked_test_attrs`
- `fixtures/nested_src_tests_layout`
- `fixtures/macro_unknown`
- `fixtures/snapshot_oracle`
- `fixtures/mock_effect`
Each fixture should include:
- source and tests
- `diff.patch`
- expected JSON output
- expected human output
- expected context packet
- expected LSP diagnostic shape when relevant
Invariants:
- Static output never says `killed` or `survived`.
- Unknowns include stop reasons.
- Weak or smoke oracle evidence does not silently become strong.
- Finding order is deterministic.
- Context packets are parseable.
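
The first invariant lends itself to a mechanical check. A minimal sketch, assuming a whole-word, case-insensitive scan over rendered output; the function name `forbidden_terms` is hypothetical, and a real check may cover more vocabulary than the two terms named above.

```rust
/// Runtime mutation vocabulary that static output must not use.
static FORBIDDEN: [&str; 2] = ["killed", "survived"];

/// Returns `(1-based line number, term)` for each forbidden whole word.
/// Words are split on anything that is not alphanumeric or `_`, so an
/// identifier such as `static_unknown` is treated as a single word.
fn forbidden_terms(output: &str) -> Vec<(usize, &'static str)> {
    let mut hits = Vec::new();
    for (idx, line) in output.lines().enumerate() {
        let lower = line.to_lowercase();
        for word in lower.split(|c: char| !c.is_alphanumeric() && c != '_') {
            if let Some(term) = FORBIDDEN.iter().find(|t| **t == word) {
                hits.push((idx + 1, *term));
            }
        }
    }
    hits
}
```

A fixture invariant test could call this on every golden human and JSON output and require an empty result.
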
Purpose: measure ripr's own test oracle strength as analyzer work expands.
Deliverables:
- `cargo xtask test-oracle-report` writes `target/ripr/reports/test-oracles.md`.
- `cargo xtask test-oracle-report` writes `target/ripr/reports/test-oracles.json`.
- `cargo xtask check-test-oracles` aliases the same advisory report.
- The report classifies detected Rust tests as strong, medium, weak, or smoke.
- Existing weak or smoke debt is advisory and non-blocking.
Acceptance:
- `cargo xtask test-oracle-report`
- `cargo xtask check-test-oracles`
- `cargo xtask metrics`
- `cargo xtask check-pr`
Purpose: add a focused non-blocking ripr-on-ripr report.
Deliverables:
- `cargo xtask dogfood` runs stable fixture diffs through `ripr check --mode fast`.
- Actual dogfood JSON and human outputs are written under `target/ripr/dogfood/<fixture>/`.
- `target/ripr/reports/dogfood.md` summarizes findings, exposure classes, runtime, and errors.
- `target/ripr/reports/dogfood.json` provides the same advisory summary for future machine readers.
- Dogfood is advisory and non-blocking.
Acceptance:
- `cargo xtask dogfood`
- `cargo xtask check-pr`
Purpose: introduce an internal fact model while preserving current scanner behavior.
Deliverables:
- `FileFacts`
- `FunctionFact`
- `TestFact`
- `OracleFact`
- `CallFact`
- `ReturnFact`
- `StructConstructionFact`
- `EnumConstructionFact`
- `LiteralFact`
- `BuilderChainFact`
- `EffectFact`
Acceptance:
- Existing sample findings are unchanged.
- Analysis consumes facts rather than ad hoc scanner structures.
- Scanner behavior remains available as the fallback.
Purpose: create the parser boundary before relying on parser-specific details.
Deliverables:
- `RustSyntaxAdapter` trait or equivalent boundary.
- Lexical adapter `summarize_file` implementation.
- Changed range to syntax-node mapping.
- No public API commitment to a parser crate.
- Parser substrate decision recorded in ADR 0006.
- Parser-backed `summarize_file` implementation.
Acceptance:
- Existing outputs remain stable or intentionally updated with fixture evidence.
- Parser errors produce `static_unknown` or structured diagnostics, not panics.
Purpose: extract tests and oracles from syntax nodes instead of line substrings.
Deliverables:
- `#[test]` function extraction.
- Stacked attribute preservation.
- Multi-line assertion macro extraction.
- `assert!`, `assert_eq!`, `assert_ne!`, `assert_matches!`, and `matches!` handling.
- `unwrap` and `expect` smoke-oracle handling.
Acceptance:
- Fixture output remains deterministic.
- Line scanning is fallback only.
Purpose: attach probes to stable owner symbols.
Deliverables:
- Diff hunk to changed text range.
- Changed range to syntax-backed owner node.
- Syntax node to enclosing function, method, or module.
- Stable `SymbolId`.
Acceptance:
- Duplicate function names across modules or crates do not cross-link tests.
- Probe IDs remain stable enough for `explain` and `context`.
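
To illustrate why a path-qualified identifier avoids cross-linking, here is a minimal sketch. The `crate::module::name` encoding and the `symbol_id` helper are assumptions; the source does not specify how `SymbolId` is represented.

```rust
/// Hypothetical representation: a fully qualified path string.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct SymbolId(String);

/// Builds an identifier from crate name, module path, and item name,
/// so two functions named `total` in different modules stay distinct.
fn symbol_id(krate: &str, module_path: &[&str], name: &str) -> SymbolId {
    let mut parts = vec![krate];
    parts.extend_from_slice(module_path);
    parts.push(name);
    SymbolId(parts.join("::"))
}
```

Because the identifier depends only on names and not on line numbers, edits elsewhere in a file would not change it.
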
Purpose: generate probes from syntax kind and ownership facts.
Deliverables:
- Predicate boundary probes.
- Return value probes.
- Error path probes.
- Field construction probes.
- Side-effect or call-change probes.
- `static_unknown` fallback with reason.
Acceptance:
- Multi-line predicate changes produce one useful probe.
- Tail-expression return changes produce return probes.
- `Err(Error::X)` changes produce error-path probes.
Purpose: make oracle kind and strength explicit and probe-relative.
Deliverables:
- Exact value oracle.
- Exact error variant oracle.
- Broad error oracle.
- Whole-object equality oracle.
- Snapshot oracle.
- Mock expectation oracle.
- Relational check oracle.
- Shape-only oracle.
- Smoke-only oracle.
- Unknown oracle kind.
Acceptance:
- `is_err()` differs from exact error variant assertions.
- `unwrap()` differs from exact return assertions.
- JSON and human output keep the stable schema while rendering probe-relative oracle strength.
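
The distinction between oracle kind and probe-relative strength might look like the sketch below. Everything here is an assumption made for illustration: only a subset of the listed kinds is modeled, the substring matching stands in for syntax-backed extraction, and the strength mapping is invented to show the idea rather than taken from the analyzer.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum OracleKind {
    ExactValue,
    ExactErrorVariant,
    BroadError,
    SmokeOnly,
    Unknown,
}

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum OracleStrength {
    Smoke,
    Weak,
    Medium,
    Strong,
}

#[derive(Debug, Clone, Copy)]
enum ProbeKind {
    ErrorPath,
    ReturnValue,
}

/// Classifies one statement of test code by its textual shape.
fn classify(assertion: &str) -> OracleKind {
    let text: String = assertion.chars().filter(|c| !c.is_whitespace()).collect();
    if text.starts_with("assert!(") && text.contains(".is_err()") {
        OracleKind::BroadError
    } else if text.starts_with("assert_eq!(") && text.contains("Err(") {
        OracleKind::ExactErrorVariant
    } else if text.starts_with("assert_eq!(") {
        OracleKind::ExactValue
    } else if text.contains(".unwrap()") || text.contains(".expect(") {
        OracleKind::SmokeOnly
    } else {
        OracleKind::Unknown
    }
}

/// Strength depends on the probe being judged; `None` means unknown.
fn strength(kind: OracleKind, probe: ProbeKind) -> Option<OracleStrength> {
    match (kind, probe) {
        (OracleKind::ExactErrorVariant, ProbeKind::ErrorPath) => Some(OracleStrength::Strong),
        (OracleKind::ExactValue, ProbeKind::ReturnValue) => Some(OracleStrength::Strong),
        (OracleKind::ExactErrorVariant, ProbeKind::ReturnValue) => Some(OracleStrength::Medium),
        (OracleKind::ExactValue, ProbeKind::ErrorPath) => Some(OracleStrength::Medium),
        (OracleKind::BroadError, _) => Some(OracleStrength::Weak),
        (OracleKind::SmokeOnly, _) => Some(OracleStrength::Smoke),
        (OracleKind::Unknown, _) => None,
    }
}
```

Under this mapping the same `assert_eq!` on an exact error variant is strong for an error-path probe and only medium for a return-value probe, which is what "probe-relative" is meant to convey.
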
Purpose: explain what changed behavior appears to flow to.
Deliverables:
- Changed expression to `let` binding flow.
- Binding to return flow.
- Binding to struct field flow.
- Changed expression to `Ok` or `Err` flow.
- Predicate branch to return or field construction flow.
- Changed call to effect boundary candidate.
Acceptance:
- Findings can name at least one sink when locally visible.
- `propagation_unknown` includes a concrete stop reason.
Purpose: detect whether tests appear to activate the changed behavior.
Deliverables:
- Numeric and string literal value facts.
- Function argument value facts.
- Builder-chain value facts.
- Table-row value facts.
- Enum variant value facts.
- Boundary equality discriminator facts.
Acceptance:
- Boundary findings include detected values.
- Boundary findings include missing equality value.
- Opaque fixtures produce `infection_unknown`, not false confidence.
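
The three acceptance points can be sketched as one small decision over detected literal values. This is an illustrative model for integer boundaries only; the `Activation` type and `boundary_activation` function are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Activation {
    /// Some detected test input equals the boundary value.
    BoundaryCovered,
    /// Inputs were detected, but none equals the boundary.
    MissingEquality { detected: Vec<i64>, missing: i64 },
    /// No literal inputs were visible, for example behind an opaque fixture.
    InfectionUnknown,
}

/// Decides activation for a changed boundary such as `amount >= 100`,
/// given the literal values seen in related tests.
fn boundary_activation(boundary: i64, detected: &[i64]) -> Activation {
    if detected.is_empty() {
        Activation::InfectionUnknown
    } else if detected.contains(&boundary) {
        Activation::BoundaryCovered
    } else {
        // Sort and deduplicate so reported values are deterministic.
        let mut values = detected.to_vec();
        values.sort_unstable();
        values.dedup();
        Activation::MissingEquality { detected: values, missing: boundary }
    }
}
```

For a boundary of 100 and tests that only use 50 and 150, the result names both the detected values and the missing equality value, which is the evidence a finding would surface.
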
Purpose: make CLI output the reference explanation.
Deliverables:
- Changed behavior section.
- RIPR stage evidence section.
- Related tests section.
- Oracle evidence section.
- Missing discriminator section.
- Next step section.
- Stop reason section for unknowns.
Acceptance:
- Golden human and JSON output cover current Campaign 3 fixtures.
- Static language remains conservative.
- Negative and metamorphic fixtures cover noise-only and syntax-variant cases.
Purpose: make editor diagnostics specific and actionable.
Deliverables:
- Diagnostic data with finding and probe IDs.
- Stable diagnostic codes.
- Hover evidence for exact finding.
- Copy context packet code action.
- Open related tests code action.
- Run deep check command.
- Output-channel lifecycle logs.
Acceptance:
- `didChange` refreshes diagnostics after debounce.
- Code action copies the context for the selected finding.
Purpose: turn ripr context into a test-writing brief.
Deliverables:
- Recommended test location.
- Related existing tests.
- Fixture or builder hints.
- Missing input values.
- Missing oracle shape.
- Suggested assertion shapes.
- Confidence and stop reasons.
Acceptance:
- Context packet is golden-tested.
- CLI and LSP use the same packet shape.
Purpose: let repositories teach ripr topology and oracle conventions.
Deliverables:
- Workspace-root config discovery.
- Missing config accepted.
- Useful invalid-config errors.
- Test topology override.
- Custom oracle macro config.
- Snapshot, mock, and external-boundary config.
Acceptance:
- Config changes oracle classification only through explicit rules.
Purpose: support honest noise control without hiding the model.
Deliverables:
- Inline suppression comment form.
- Config suppression form.
- Required reason.
- Optional expiry.
- `--show-suppressed`.
Acceptance:
- Suppressed findings remain visible when requested.
- Suppression rate can be measured.
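
A suppression record with a required reason and optional expiry could be modeled as below. This is a sketch under assumptions: the field names, the ISO `YYYY-MM-DD` expiry format, and the helper functions are hypothetical, and the comment or config syntax is deliberately left out because the source does not define it.

```rust
/// Hypothetical suppression record shared by inline and config forms.
#[derive(Debug, PartialEq)]
struct Suppression {
    reason: String,
    /// Optional expiry as an ISO `YYYY-MM-DD` date (assumed format).
    expires: Option<String>,
}

/// A suppression without a meaningful reason is rejected.
fn validate(s: &Suppression) -> Result<(), String> {
    if s.reason.trim().is_empty() {
        Err("suppression requires a reason".to_string())
    } else {
        Ok(())
    }
}

/// ISO dates sort lexicographically, so string comparison is enough.
/// A suppression is active through its expiry date, inclusive.
fn is_active(s: &Suppression, today: &str) -> bool {
    match &s.expires {
        Some(date) => today <= date.as_str(),
        None => true,
    }
}

/// Share of findings that were suppressed; `None` when there are none.
fn suppression_rate(total_findings: usize, suppressed: usize) -> Option<f64> {
    if total_findings == 0 {
        None
    } else {
        Some(suppressed as f64 / total_findings as f64)
    }
}
```

Expired suppressions simply stop applying, so the finding reappears without anyone having to remember to remove the entry.
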
Purpose: support PR workflows without making default CI noisy.
Deliverables:
- SARIF output.
- Markdown summary.
- JSON artifact guidance.
- Advisory mode.
- Opt-in failure modes.
- Baseline-aware mode.
Acceptance:
- SARIF validates.
- SARIF results point to static evidence locations.
- Blocking policy is opt-in.
Purpose: compare static predictions with real mutation results.
Deliverables:
- Import cargo-mutants output through `cargo xtask mutation-calibration`.
- Match static seam evidence to runtime records by `seam_id` first and unambiguous normalized file/line second; report ambiguous file/line candidates separately.
- Emit advisory static class vs runtime outcome reports at `target/ripr/reports/mutation-calibration.{json,md}`.
- Keep mutation-runtime language out of static findings; runtime vocabulary is confined to calibration/runtime reports.
Acceptance:
- Runtime mutation vocabulary appears only in explicit calibration data and static-language checks remain clean.
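
The matching order in the deliverables can be sketched as a two-stage lookup. The record shapes, the `normalize` rules, and the `SeamMatch` type are assumptions for illustration; the real cargo-mutants import format is not modeled here.

```rust
#[derive(Debug, Clone, PartialEq)]
struct StaticSeam {
    seam_id: String,
    file: String,
    line: u32,
}

#[derive(Debug, Clone, PartialEq)]
struct RuntimeRecord {
    seam_id: Option<String>,
    file: String,
    line: u32,
}

/// Result of matching one runtime record; indices point into `seams`.
#[derive(Debug, PartialEq)]
enum SeamMatch {
    BySeamId(usize),
    ByLocation(usize),
    Ambiguous(Vec<usize>),
    Unmatched,
}

/// Assumed normalization: forward slashes, no leading `./`.
fn normalize(path: &str) -> String {
    path.replace('\\', "/").trim_start_matches("./").to_string()
}

fn match_record(record: &RuntimeRecord, seams: &[StaticSeam]) -> SeamMatch {
    // Stage 1: an explicit seam id wins when it resolves.
    if let Some(id) = &record.seam_id {
        if let Some(i) = seams.iter().position(|s| &s.seam_id == id) {
            return SeamMatch::BySeamId(i);
        }
    }
    // Stage 2: fall back to normalized file and line.
    let file = normalize(&record.file);
    let candidates: Vec<usize> = seams
        .iter()
        .enumerate()
        .filter(|(_, s)| normalize(&s.file) == file && s.line == record.line)
        .map(|(i, _)| i)
        .collect();
    match candidates.len() {
        0 => SeamMatch::Unmatched,
        1 => SeamMatch::ByLocation(candidates[0]),
        // Several seams on one line: report separately, never guess.
        _ => SeamMatch::Ambiguous(candidates),
    }
}
```

Returning the ambiguous candidates instead of picking one keeps the calibration report honest about which comparisons are actually one-to-one.
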
Purpose: cache stable facts after the fact model is worth caching.
Deliverables:
- File-hash invalidation.
- Warm `FileFacts` reuse.
- LSP reuse of test and oracle facts.
- Graceful stale-cache recovery.
Acceptance:
- Warm run avoids reparsing unchanged files.
Rust PRs must run:
cargo fmt --check
cargo check --workspace --all-targets
cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings
cargo doc --workspace --no-deps
cargo package -p ripr --list
cargo publish -p ripr --dry-run

Extension PRs must run:
cd editors/vscode
npm ci
npm run compile
npm run package