This is the campaign-level plan for Codex Goals and long-context contributor work. Campaigns are larger than one PR. Each campaign has an objective, an end state, and work items that should each follow the scoped PR contract.
The operational checklist remains in Implementation plan.
The machine-readable active campaign is .ripr/goals/active.toml.
Campaign ID: agentic-devex-foundation
Status: complete
Objective:
Make the repo safe for autonomous Codex Goals work and human review.
Why it matters:
ripr is being built for long-context, agent-assisted implementation. The repo
must reject ambiguous PRs before review and produce enough receipts for humans
to evaluate trusted change instead of chat transcripts.
End state:
- architecture guard exists
- output-contract checks exist
- first behavior fixtures exist
- docs-as-tests baseline exists
- test-oracle report exists
- dogfood report exists
- Codex Goals campaign docs exist
Work items:
| Work item | Status | Notes |
|---|---|---|
| policy/architecture-guard | done | Workspace, architecture, and public API guardrails exist. |
| output/output-contract-check | done | Output contract registry checks exist. |
| docs/docs-index-checks | done | Docs index checks exist. |
| docs/codex-goals-campaigns | done | Clarify Codex Goals as multi-PR campaigns. |
| docs/readme-state-and-link-checks | done | README state and repo-local Markdown links are checked. |
| goals/manifest-check | done | Active campaign manifest is validated and reportable. |
| fixtures/runner-comparison-v1 | done | Fixture and golden commands run ripr and compare actual outputs. |
| fixtures/first-two-goldens | done | boundary_gap and weak_error_oracle fixtures exist with JSON and human goldens. |
| testing/test-oracle-report | done | Advisory report measures ripr's own strong, medium, weak, and smoke test oracles. |
| dogfood/static-self-check | done | Advisory ripr-on-ripr report runs stable fixture diffs and records current output. |
| campaign/agentic-devex-closeout | done | Campaign 1 is complete and Campaign 2 is active. |
Dependencies:
- Do not start analyzer rewrites until fixture and golden scaffolding can record behavior.
- Do not treat test-oracle reports as blocking until baseline debt is measured.
Commands:
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-pr
cargo xtask pr-summary
cargo xtask fixtures
cargo xtask goldens check
cargo xtask test-oracle-report
cargo xtask dogfood
cargo xtask metrics
Blocking conditions:
- policy exception required
- architecture exception required
- output schema change required
- golden blessing needed without explicit review scope
- campaign item depends on an unmerged non-stackable PR
Review policy:
Work items should usually produce one scoped PR. Independent docs or reporting items may be stackable when the campaign manifest marks them that way.
Campaign ID: syntax-backed-analyzer-foundation
Status: complete
Objective:
Move the analyzer from lexical facts to syntax-backed facts.
Why it matters:
Current analyzer behavior still has line-oriented surfaces. ripr needs a
stable fact model and parser adapter boundary before replacing lexical checks.
End state:
- FileFacts model exists
- syntax adapter boundary exists
- Rust parser substrate is recorded in an ADR
- tests and oracles are extracted from syntax-backed facts
- probes attach to stable owner symbols
- current probe families are generated from syntax facts
Work items:
| Work item | Status | Notes |
|---|---|---|
| analysis/file-facts-model | done | FileFacts DTOs exist and the lexical scanner fills them without output drift. |
| analysis/syntax-adapter-mvp | done | RustSyntaxAdapter boundary exists with lexical adapter compatibility. |
| design/rust-syntax-substrate | done | ADR 0006 selects ra_ap_syntax behind the adapter and keeps parser types internal. |
| analysis/ast-test-oracle-extraction | done | Parser-backed facts identify test functions, assertion macros, and unwrap/expect smoke oracles. |
| analysis/ast-probe-ownership | done | Changed lines map to module- and impl-qualified owner symbols without cross-linking duplicate names. |
| analysis/ast-probe-generation | done | Current probe families are generated from parser-backed probe shape facts with lexical fallback. |
Dependencies:
- analysis/file-facts-model should merge before syntax adapter work.
- Parser-backed extraction should use the substrate decision in ADR 0006.
- Analyzer work items are non-stackable unless the manifest explicitly says otherwise.
Commands:
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-pr
cargo xtask fixtures
cargo xtask goldens check
cargo xtask pr-summary
Blocking conditions:
- output drift without golden evidence
- parser-specific types leaking outside the syntax adapter
- architecture exception required
- missing stop reason for new unknowns
Review policy:
Each analyzer work item should include spec, fixture or test, output contract evidence when user-visible output changes, metrics movement when capability status changes, and a clear non-goal list.
Campaign ID: evidence-quality
Status: complete
Objective:
Make findings explain changed behavior, oracle strength, propagation, activation,
and unknown stop reasons with enough precision to guide test work.
End state:
- oracle kind and strength are probe-relative
- local delta flow can name visible sinks
- activation modeling can name observed and missing discriminator values
- output is evidence-first
- unknown findings include stop reasons across surfaces
- negative and metamorphic fixtures protect evidence-first output
Work items:
| Work item | Status | Notes |
|---|---|---|
| output/unknown-stop-reason-invariant | done | Unknown classifications carry stop reasons across domain, JSON, context, GitHub annotations, and human output. |
| analysis/oracle-strength-v2 | done | Oracle kind and strength distinguish exact error variants, exact values, broad errors, smoke-only checks, snapshots, relational checks, and mock expectations. |
| analysis/local-delta-flow-v1 | done | Findings carry typed local flow sinks for visible return, error, field, match-arm, and effect boundaries. |
| analysis/activation-value-modeling-v1 | done | Findings carry observed value facts and missing discriminator facts tied to local flow evidence. |
| output/evidence-first-output | done | Human and JSON output render changed behavior, evidence path, weakness, stop reasons, and next action as first-class finding evidence. |
| fixtures/negative-metamorphic-baseline | done | Negative and metamorphic fixtures cover whitespace/comment/import noise, unrelated token mentions, strong boundary/error oracles, and syntax variants. |
| campaign/evidence-quality-closeout | done | Campaign 3 closed with evidence-first output and negative/metamorphic fixture guardrails. |
Dependencies:
- output/unknown-stop-reason-invariant should land before deeper unknown evidence grows so silent unknowns do not become accepted output.
- analysis/local-delta-flow-v1 landed before activation/value modeling.
- analysis/activation-value-modeling-v1 landed before evidence-first output.
- output/evidence-first-output landed before negative/metamorphic fixture expansion.
- fixtures/negative-metamorphic-baseline should land before Campaign 3 closeout so the evidence-first output has negative and metamorphic guardrails.
Commands:
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-pr
cargo xtask fixtures
cargo xtask goldens check
cargo xtask check-output-contracts
cargo xtask pr-summary
Blocking conditions:
- unknown classification without a stop reason
- output drift without golden evidence
- schema change required outside the scoped PR
- fixture expansion before evidence fields are stable
Review policy:
Campaign 3 work should improve evidence precision without claiming real mutation outcomes. Unknown is acceptable, but it must be explicit and actionable.
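One way to keep "unknown classification without a stop reason" from ever reaching output is to make it unrepresentable in the type. The sketch below assumes a hypothetical Classification enum and render_human function; the variant names other than Unknown are placeholders, not ripr's actual classes.

```rust
/// Hypothetical classification type. `Unknown` cannot be constructed
/// without a stop reason, so the invariant is enforced by the compiler.
#[derive(Debug, Clone, PartialEq)]
pub enum Classification {
    Exposed,
    WeaklyExposed,
    Unknown { stop_reason: String },
}

/// Renders one finding line; every surface that matches on the enum
/// is forced to handle the stop reason for unknowns.
pub fn render_human(finding_id: &str, class: &Classification) -> String {
    match class {
        Classification::Exposed => format!("{finding_id}: exposed"),
        Classification::WeaklyExposed => format!("{finding_id}: weakly exposed"),
        Classification::Unknown { stop_reason } => {
            format!("{finding_id}: unknown (stopped: {stop_reason})")
        }
    }
}
```

With this shape, JSON, context, annotation, and human renderers can share the same exhaustive match, which is what keeps the stop reason consistent across surfaces.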
Status: complete
Objective:
Make low-discriminator tests visible from the same evidence facts used for
static exposure findings.
End state:
- per-test ledgers name reachable owners, oracle kind and strength, observed values, and static limitations
- likely-vacuous, smoke-only, broad-oracle, opaque, circular, and duplicative signals are advisory
- reports explain evidence and suggested next steps without calling tests bad
- test-efficiency metrics are available for trend tracking
- agent and editor surfaces can avoid imitating low-discriminator tests
- ripr and ripr+ badge artifacts publish unresolved-finding counts as inbox-zero signals, with intent and suppressions as durable exception files
Work items:
| Work item | Status | Notes |
|---|---|---|
| test-efficiency/test-fact-ledger | done | cargo xtask test-efficiency-report writes advisory per-test ledgers with reached owners, oracle kind/strength, observed values, and static limitations. |
| test-efficiency/vacuous-signal-v1 | done | The advisory report now records smoke-only, broad-oracle, disconnected, opaque, circular, and likely-vacuous reasons. |
| test-efficiency/duplicate-discriminator-v1 | done | Advisory groups expose tests sharing an owner set, role-aware activation signature, and oracle shape; members are reclassified duplicative with reason duplicate_activation_and_oracle_shape and a per-test duplicate_group_id linked to the top-level duplicate_groups array. Already-flagged classes (opaque, likely_vacuous, possibly_circular) are preserved. |
| test-efficiency/report-and-metrics | done | Top-level metrics object in target/ripr/reports/test-efficiency.json exposes tests_scanned, class_counts (all seven classes), reason_counts (all emitted reasons), and duplicate_discriminator_group_count = duplicate_groups.length. The duplicative test count and the group count are intentionally distinct fields. Capability metadata in metrics/capabilities.toml references the new metrics surface. |
| docs/badge-policy | done | Badge policy locks the badge counting rule, native JSON shape, Shields projection, and exact emitted vocabulary. |
| badge/summary-renderer-v1 | done | Private BadgeSummary, BadgeCounts, BadgePolicy, BadgeKind, BadgeStatus live in pub(crate) mod output::badge. ripr_badge_summary derives counts from CheckOutput; render_native_json and render_shields_json produce the wire shapes. 14 unit tests. Public API and policy/public_api.txt unchanged. |
| badge/ripr-count-v1 | done | ripr check --format badge-json and --format badge-shields dispatch through output::badge::ripr_badge_summary plus the native and Shields renderers from #189. The temporary #![allow(dead_code)] in output/badge.rs and its .ripr/allow-attributes.txt entry are removed. CLI smoke tests cover both formats and confirm badge-plus-* formats remain rejected until badge/ripr-plus-count-v1. |
| test-intent/v1 | done | .ripr/test_intent.toml loader attaches declared_intent metadata (intent, owner, reason, source) to matching test-efficiency entries. The original class is preserved; intent is additive metadata, never a replacement. Unmatched and ambiguous (name-only) selectors fail the report; declared tests remain visible in both the JSON ledger and the Markdown ## Declared Test Intent section. |
| badge/ripr-plus-count-v1 | done | ripr check --format badge-plus-json and --format badge-plus-shields read target/ripr/reports/test-efficiency.json (relative to --root), sum unsuppressed exposure gaps and unsuppressed actionable test-efficiency findings, exclude entries with declared_intent metadata, and report opaque entries as unknowns_test_efficiency. Missing report fails clearly with a regenerator hint. |
| suppressions/v1 | done | .ripr/suppressions.toml loader with closed-set kinds (exposure_gap, test_efficiency); owner + reason required, expires optional in YYYY-MM-DD. Expired entries do not apply and surface as warnings, so silent green-forever debt is impossible. Suppressed findings stay visible in detailed reports; the badge counts move them from unsuppressed_* to suppressed_*. Native badge JSON gains a warnings array; Shields stays exactly four fields. |
| ci/badge-artifacts | done | cargo xtask badge-artifacts writes ripr-badge.json, ripr-badge-shields.json, ripr-plus-badge.json, ripr-plus-badge-shields.json, and ripr-badges.md to target/ripr/reports/. The CI workflow runs cargo xtask test-efficiency-report then cargo xtask badge-artifacts (both advisory, both \|\| true); the existing Upload ripr reports step picks up the new files; the badges Markdown is appended to $GITHUB_STEP_SUMMARY. The badge-artifacts task captures git diff origin/main...HEAD to target/ripr/badge-input.diff and runs each format against --root . so exposure and test-efficiency analyze the same codebase. New ReceiptSpec covers all five files. Advisory by default: no --fail-on-nonzero. |
| badge/repo-scope-artifacts | done | cargo xtask repo-badge-artifacts analyzes the full repo baseline through run_repo_analysis (every currently-probeable production syntax shape, not a diff) and writes repo-ripr-badge.json, repo-ripr-badge-shields.json, repo-ripr-plus-badge.json, repo-ripr-plus-badge-shields.json, and repo-ripr-badges.md. Native badge JSON now carries a scope field ("diff" or "repo") on schema 0.2; Shields projection stays exactly four fields. New OutputFormat::RepoBadge* variants route through app::check_workspace_repo; existing diff-scoped cargo xtask badge-artifacts and the BadgeJson/BadgeShields/BadgePlus* formats are unchanged. The v1 baseline is the currently-probeable repo surface, not full seam inventory and not mutation adequacy proof; the deeper seam / test-grip model is tracked as later work. |
| badge/publish-main-endpoint | done | The two repo-scoped Shields JSON files (badges/ripr.json, badges/ripr-plus.json) are committed to main and served via raw.githubusercontent.com/EffortlessMetrics/ripr/main/badges/.... Root README.md renders them via img.shields.io/endpoint. Refresh: cargo xtask update-badge-endpoints (regenerates from repo-badge-artifacts and copies into badges/). Verify (advisory, not yet a hard CI gate): cargo xtask check-badge-endpoints. Pages deployment was prototyped and rejected as over-engineered for v1 dogfood: it would have required Pages enablement, a deploy workflow, and would have implied downstream users must also enable Pages. The ripr product contract is "ripr emits Shields-compatible JSON"; hosting is replaceable. See deferred/hosted-badge-service in docs/DEFERRED.md. |
| campaign/test-efficiency-closeout | done | Campaign 4A marked complete here and in .ripr/goals/active.toml. Final architecture: per-test ledger + class/reason metrics from cargo xtask test-efficiency-report; .ripr/test_intent.toml declarations and .ripr/suppressions.toml exceptions wired into the ripr+ count; diff-scoped PR badge artifacts via cargo xtask badge-artifacts (#195); repo-scoped baseline via cargo xtask repo-badge-artifacts (#204) on schema 0.2 with scope: "repo"; checked-in badges/ripr.json and badges/ripr-plus.json rendered through img.shields.io/endpoint?url=https://raw.githubusercontent.com/EffortlessMetrics/ripr/main/badges/... (#209). Final dogfood snapshot at this campaign close: ripr 163, ripr+ 163 (main = 6b4b2b0); snapshot, not a fixture expectation. PR chain: #195, #198, #199, #200, #204, #205 (DEFERRED.md), #206 (friction-log graduation), #208 (stale-317-headline correction), #209. Issue #207 was the endpoint design-plan. Pages was rejected for v1 dogfood; hosted badge service is deferred/hosted-badge-service. The seam-inventory + test-grip product reframe is next-campaign work (deferred/seam-inventory-test-grip), not unfinished 4A work. |
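The expiry rule recorded for suppressions/v1 (expired entries do not apply and surface as warnings) can be sketched as follows. The struct layout, the Decision enum, and the evaluate function are hypothetical; the sketch also assumes an entry still applies on its expiry date, which the notes above do not specify, and it takes today's date as a string to avoid a clock dependency.

```rust
/// Closed set of suppression kinds named in the work item.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SuppressionKind {
    ExposureGap,
    TestEfficiency,
}

/// Hypothetical in-memory form of one suppression entry.
#[derive(Debug, Clone)]
pub struct Suppression {
    pub kind: SuppressionKind,
    pub owner: String,
    pub reason: String,
    /// Optional expiry in YYYY-MM-DD.
    pub expires: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Applies,
    Expired { warning: String },
}

/// Decides whether a suppression applies on `today` (YYYY-MM-DD).
/// Zero-padded ISO dates sort lexicographically in chronological order,
/// so a plain string comparison is enough here.
/// Assumption: the entry still applies on the expiry date itself.
pub fn evaluate(s: &Suppression, today: &str) -> Decision {
    match &s.expires {
        Some(date) if date.as_str() < today => Decision::Expired {
            warning: format!(
                "suppression owned by {} expired on {}: {}",
                s.owner, date, s.reason
            ),
        },
        _ => Decision::Applies,
    }
}
```

An expired entry returns a warning instead of silently applying, so the finding moves back into the unsuppressed count while the stale entry stays visible.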
Dependencies:
- Campaign 3 evidence fields should remain the source of truth; test-efficiency work should not invent a separate classifier for changed behavior.
- The first report should be advisory and should not fail CI.
- Badge counting must use the exact emitted strings audited in Badge policy; aspirational class names that the reporter does not produce must not appear in the badge schema.
- test-intent/v1 ships before suppressions/v1 so intentional smoke and duplicate tests are positive declarations, not exception entries.
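The distinction between the duplicative test count and the duplicate group count, noted in the work items above, is easier to see in a small sketch. TestFact and duplicate_groups are hypothetical names; the grouping key follows the description (owner set, activation signature, oracle shape), but the real signature is role-aware and richer than the plain strings used here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-test facts used as the grouping key.
#[derive(Debug, Clone)]
pub struct TestFact {
    pub name: String,
    pub owners: BTreeSet<String>,
    pub activation_signature: String,
    pub oracle_shape: String,
}

/// Groups tests that share owner set, activation signature, and oracle
/// shape, keeping only groups with more than one member.
pub fn duplicate_groups(tests: &[TestFact]) -> Vec<Vec<String>> {
    let mut by_key: BTreeMap<(BTreeSet<String>, String, String), Vec<String>> =
        BTreeMap::new();
    for t in tests {
        let key = (
            t.owners.clone(),
            t.activation_signature.clone(),
            t.oracle_shape.clone(),
        );
        by_key.entry(key).or_default().push(t.name.clone());
    }
    by_key.into_values().filter(|group| group.len() > 1).collect()
}
```

With three tests of which two share a key, this yields one group (the group count) containing two members (the duplicative test count), which is why the two numbers are reported separately.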
Commands:
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-pr
cargo xtask pr-summary
cargo xtask reports index
cargo xtask receipts check
cargo xtask test-oracle-report
Blocking conditions:
- output says a test is bad instead of reporting evidence and risk shape
- static analysis suggests deleting tests
- report language becomes blocking policy before calibration/configuration
- new automation bypasses Rust-first xtask policy
Objective:
Turn findings into editor and agent actions that help produce targeted tests.
End state:
- LSP diagnostics carry finding and probe IDs
- hovers show evidence for the selected finding
- code actions can copy context packets or open related tests
- context packets include missing values and assertion shapes
The original Campaign 4 plan was a direct extension of Campaign 3's
Finding/StageEvidence model. Campaign 4A (Test Efficiency) made
clear that the editor/agent surface needs a richer substrate —
behavior seams classified by test-grip evidence rather than ad-hoc
finding metadata. The continuation lives under Campaign 4B; the work
items below are subsumed there with seam-aware shapes:
| Work item | Status | Notes |
|---|---|---|
| lsp/evidence-hover-actions | superseded | Folded into Campaign 4B as lsp/seam-evidence-hover-v1 (preceded by lsp/repo-seam-diagnostics-v1). |
| context/agent-context-v2 | superseded | Folded into Campaign 4B as context/agent-seam-packets-v1, scoped around RepoSeam and SeamGripClass. |
| docs/how-to-use-agent-context | superseded | Folded into Campaign 4B as docs/agent-dispatch-workflow-v1. |
Campaign ID: repo-seam-inventory-test-grip
Status: complete
Objective:
Inventory behavior seams across the repo, classify how strongly current tests
grip each seam through RIPR evidence, and turn actionable gaps into editor
diagnostics and agent-ready test packets.
The Voice A baseline shipped in Campaign 4A
(badge/repo-scope-artifacts, #204) becomes a special case of seam
classification rather than the analyzer's only repo mode. The seam
evidence loop is the editor/agent loop with the right substrate:
first-class RepoSeam and SeamGripClass underneath, evidence-first
hover and agent packets on top.
End state:
- RepoSeam, SeamKind, RequiredDiscriminator, and SeamGripClass exist as a first-class data model
- seam IDs are stable across runs and across input file walk reorderings
- test-grip evidence per seam covers reach, activate/infect, propagate, observe, discriminate
- a separate SeamGripClass / TestGripClass is used for grip classification; mapping to existing ExposureClass and to badge counts is explicit, not implicit through type extension
- a repo exposure report enumerates seams with their grip class and missing-discriminator hypothesis
- LSP diagnostics surface ungripped or under-gripped seams
- hover renders the RIPR evidence path for the classification with cited related tests
- agent context packets carry the load-bearing fields a coding agent needs to write the missing test
- public repo badge counts can be derived from seam classification without breaking the existing schema
- static-language constraints hold: no killed / survived / proven / adequate in static output
- static seam evidence does not pretend to prove mutation adequacy
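Seam IDs that survive file-walk reordering have to be derived from the seam's own content rather than from discovery order. The work-item notes for this campaign record a deterministic 16-char SeamId via FNV-1a 64-bit; the sketch below shows that hash with its standard constants. Which fields feed the hash is an assumption here (file, owner, kind, expression), and seam_id is a hypothetical helper name.

```rust
/// Standard FNV-1a 64-bit parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical seam identity: hash of content-derived fields only,
/// so the ID does not depend on the order files were walked in.
pub fn seam_id(file: &str, owner: &str, kind: &str, expression: &str) -> String {
    let mut buf = Vec::new();
    for part in [file, owner, kind, expression] {
        buf.extend_from_slice(part.as_bytes());
        // NUL separator so ("ab", "c") and ("a", "bc") hash differently.
        buf.push(0);
    }
    // 64 bits rendered as 16 lowercase hex characters.
    format!("{:016x}", fnv1a_64(&buf))
}
```

Because the ID is a pure function of its inputs, two runs over the same source produce the same IDs regardless of directory traversal order.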
Pre-4B LSP groundwork. Before the seam model was ready, three PRs
built editor/agent surfaces on the current Finding / AnalysisSnapshot
model. They protect the LSP loop and provide fallback behavior while
Campaign 4B types are being designed:
- PR #211: evidence-rich hover over current Finding / AnalysisSnapshot, replacing generic "evidence found" text with real StageEvidence.summary, related-test oracle text, and weakness rendering.
- PR #218: LSP executeCommand ripr.collectContext with server-side context packet lookup and VS Code LSP-first / CLI-fallback copyContext path.
- PR #219: VS Code extension e2e smoke tests for activation, command registration, copyContext, and restartServer; wired CI xvfb-run step.
Campaign 4B LSP work (lsp/repo-seam-diagnostics-v1,
lsp/seam-evidence-hover-v1, context/agent-seam-packets-v1) will
extend or revise these surfaces for RepoSeam / SeamGripClass.
Work items:
| Work item | Status | Notes |
|---|---|---|
| spec/repo-seam-inventory | done | Landed in #223 as docs/specs/RIPR-SPEC-0005-repo-seam-inventory.md; defines RepoSeam, SeamKind, RequiredDiscriminator, TestGripEvidence, SeamGripClass, stable seam ID rules, the relationship to ProbeShapeFact, headline-vs-visible mapping, static-language boundaries, and the Voice A vs Voice B contract. |
| analysis/repo-seam-model-v1 | done | Landed in #229 as crates/ripr/src/analysis/seams.rs; introduces RepoSeam, SeamId, SeamKind, ExpectedSink, RequiredDiscriminator, SeamGripClass as crate-private types per RIPR-SPEC-0005. Deterministic 16-char SeamId via FNV-1a 64-bit; no public Rust API change; no LSP; no badge change. |
| analysis/repo-seam-inventory-v1 | done | Walks production Rust files and emits Vec<RepoSeam>; writes target/ripr/reports/repo-seams.{json,md} via cargo xtask repo-seam-inventory. Initial seam kinds: predicate_boundary, error_variant, return_value, field_construction, side_effect, match_arm, call_presence (validation_branch deferred to a follow-up detection PR). |
| analysis/test-grip-evidence-v1 | done | Crate-private TestGripEvidence + RelatedTestGrip attaching reach/activate/propagate/observe/discriminate evidence per inventoried seam. No classification, no public report. Built from existing RustIndex / OracleFact / ValueFact facts. |
| analysis/repo-ripr-classification-v1 | done | Crate-private SeamGripClass (re-introduced) + classify_seam(seam, evidence) mapping TestGripEvidence to one of 11 spec classes. Headline-vs-visible table on is_headline_eligible. Replaces the stage-zero discard hook from #236 with a real classifier consumer. |
| output/repo-exposure-report-v1 | done | cargo xtask repo-exposure-report writes target/ripr/reports/repo-exposure.{json,md} from the classified seam inventory; repo-exposure-json / repo-exposure-md formats live in crates/ripr/src/output/repo_exposure.rs. Schema 0.1 documented in docs/OUTPUT_SCHEMA.md § "Repo Exposure Report". Replaces the stage-zero classification discard from #237 with the real renderer consumer. |
| lsp/repo-seam-diagnostics-v1 | done | LSP publishes seam diagnostics with stable ripr-seam-{class} codes when seamDiagnostics: true is set in initialization options. WARNING for weakly_gripped/ungripped/reachable_unrevealed; INFORMATION for the four *_unknown classes and opaque. strongly_gripped/intentional/suppressed produce no diagnostic. Off by default until cache/repo-seam-facts-v1 lands. Diagnostic data carries seam_id for hover lookup. |
| lsp/seam-evidence-hover-v1 | done | LSP hover for seam diagnostics: looks up ClassifiedSeam via data.seam_id and renders the seam evidence path (grip class, all five RIPR stages with summary, observed values, missing discriminator, related tests with oracle kind/strength, per-kind next step). Pre-4B Finding hover still works for diff-scoped diagnostics; the backend prefers seam hover when seam_id is present, otherwise falls through to Finding hover. Code-action work deferred. |
| context/agent-seam-packets-v1 | done | cargo xtask agent-seam-packets writes target/ripr/reports/agent-seam-packets.json. Schema 0.2 in crates/ripr/src/output/agent_seam_packets.rs. Each headline-eligible classified seam emits one write_targeted_test packet with seam_id, owner, kind, expression, current_grip, RIPR evidence, observed values, missing input values, missing oracle shape, related tests, and assertion templates. Strongly-gripped/opaque/intentional/suppressed seams emit no packet. |
| docs/agent-dispatch-workflow-v1 | done | docs/AGENT_DISPATCH_WORKFLOW.md documents the practical loop: run ripr → inspect report/diagnostic → read seam evidence hover → copy seam packet → hand to agent → agent writes targeted test → rerun ripr → optional cargo-mutants confirmation. Includes per-kind examples (predicate boundary, error variant, return value, field construction, side effect, opaque, intentional, suppressed) and explicit pushback against "add more tests" / "coverage is fine" / "this is proven". Linked from docs/DOCUMENTATION.md. |
| cache/repo-seam-facts-v1 | rolled-forward | Carried forward into Campaign 5 (Adoption and Calibration). Optional fact-layer cache (file-facts, owner-index, seam-facts; never final outputs). Gated on real performance signal. Landed in Campaign 5A as #255. |
| calibration/cargo-mutants-v1 | rolled-forward | Carried forward into Campaign 5. Optional scaffold for comparing static SeamGripClass against cargo-mutants outcomes. Advisory only; static output adopts no mutation-runtime language. |
| campaign/seam-inventory-test-grip-closeout | done | Campaign 4B marked complete here and in .ripr/goals/active.toml. Repo seam evidence is now first-class: RepoSeam model, repo seam inventory, TestGripEvidence, SeamGripClass classification, repo exposure report, agent seam packets, LSP seam diagnostics, seam evidence hover, and agent dispatch workflow docs. Static output remains evidence-first; runtime mutation testing remains a separate confirmation step (calibration/cargo-mutants-v1 in Campaign 5). PR chain: #229, #235, #236, #237, #239, #240, #241, #242, #248. The active manifest now points at Campaign 5; cache/repo-seam-facts-v1 and calibration/cargo-mutants-v1 carry forward as ready items there. |
Dependencies:
- spec/repo-seam-inventory landed in #223, analysis/repo-seam-model-v1 in #229, analysis/repo-seam-inventory-v1 in #235, analysis/test-grip-evidence-v1 in #236, analysis/repo-ripr-classification-v1 in #237, and output/repo-exposure-report-v1 follows. Recommended next core steps: context/agent-seam-packets-v1 (agent work-order packets) or lsp/repo-seam-diagnostics-v1 (editor surface). cache/repo-seam-facts-v1 and calibration/cargo-mutants-v1 remain unblocked but optional.
- lsp/seam-evidence-hover-v1 extends or revises PR #211, which is already merged as pre-4B evidence-rich hover over the current Finding / AnalysisSnapshot model. The seam-native hover will supersede the Finding-backed hover once RepoSeam and SeamGripClass are stable.
- PR #218 (LSP executeCommand ripr.collectContext) and PR #219 (VS Code extension smoke tests) are also pre-4B groundwork merged before Campaign 4B seam work began. Campaign 4B agent and editor surfaces will build on or replace these current-model implementations.
- cache/repo-seam-facts-v1 and calibration/cargo-mutants-v1 subsume their broader analogs from Campaign 5; Campaign 5 retains its config and CI policy work.
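The severity mapping recorded for lsp/repo-seam-diagnostics-v1 can be sketched as a single function from grip class to diagnostic code and severity. The function name and Severity enum are hypothetical; only activation_unknown is named in this document among the four *_unknown classes, so the sketch matches them by suffix, and it assumes the class string is substituted verbatim into the ripr-seam-{class} code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Warning,
    Information,
}

/// Maps a grip class to (diagnostic code, severity).
/// Returns None for classes that publish no diagnostic.
pub fn seam_diagnostic(class: &str) -> Option<(String, Severity)> {
    let severity = match class {
        "weakly_gripped" | "ungripped" | "reachable_unrevealed" => Severity::Warning,
        "opaque" => Severity::Information,
        // The four *_unknown classes, matched by suffix in this sketch.
        c if c.ends_with("_unknown") => Severity::Information,
        // strongly_gripped, intentional, suppressed: no diagnostic.
        // Unrecognized strings are also dropped here (sketch-only choice).
        _ => return None,
    };
    Some((format!("ripr-seam-{class}"), severity))
}
```

Keeping the mapping in one place makes it straightforward to check that hover lookup and code actions see the same codes the diagnostics publish.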
Commands:
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-pr
cargo xtask check-spec-format
cargo xtask check-spec-ids
cargo xtask check-output-contracts
cargo xtask check-static-language
cargo xtask markdown-links
cargo xtask check-doc-index
Blocking conditions:
- analyzer code committed before the spec lands
- SeamGripClass extended without explicit mapping to badge counts
- runtime-mutation language (killed, survived, etc.) leaking into static seam reports
- public Rust API surface change without a policy/public_api.txt update
- LSP / agent surfaces shipped before the seam model and report are settled
Review policy:
This campaign sits inside the operating contract codified in
docs/reference/AGENT_HANDOFF_PROTOCOL.md.
Spec/model work pings the owner; mechanical sub-step work proceeds
inline once authorized.
Campaign ID: seam-evidence-usability-and-precision
Status: done
Objective:
Make repo seam evidence fast, precise, and directly actionable for
developers and coding agents, without adopting mutation-runtime
language in static output.
Why it matters:
Campaign 4B made repo seam evidence first-class (RepoSeam,
TestGripEvidence, SeamGripClass, repo exposure report, agent seam
packets, LSP diagnostics, hover, agent dispatch docs). The signal is
visible but not yet useful every day: full-repo seam classification
adds multi-second editor latency (so seamDiagnostics ships off by
default), related-test fanout is broad, many seams classify as
activation_unknown because value extraction does not yet cover
common Rust test data patterns, oracle-shape detection misses
real-world assertion shapes (field assertions, whole-object equality,
mock expectations), and packets explain the gap without telling an
agent where and how to close it. This campaign closes that gap along
four product axes: fast (cache), precise (related-test, value,
oracle-shape), actionable (agent packets v2, LSP code actions), and
calibrated (cargo-mutants).
Operationalization items (config/ripr-config-v1,
ci/sarif-ci-policy) move to Campaign 5B because their defaults and
severity model depend on cache performance and oracle-shape
stability.
End state:
- seam fact layers cache cleanly so the cold path still works and the warm path avoids full repo seam walk when inputs are unchanged
- cache invalidates on source/config/intent/suppression changes; repo exposure report and LSP diagnostics consume the same cached fact source
- no rendered outputs are cached; cache serialization stays behind a codec boundary; binary serialization, when introduced, uses postcard (never bincode)
- related-test fanout is reduced and ranked; related tests carry relation_reason and relation_confidence; high-fanout files show fewer irrelevant top related tests
- activation/value evidence detects common Rust test data patterns (let bindings, constants, builder methods, table-driven cases, rstest cases, enum variants, Option/Result constructors, fixture factories); activation_unknown count falls without new false positives
- oracle-shape evidence recognizes assert_matches exact variants, field assertions, whole-object equality, snapshot calls with visible field names, mock expectations, and event/state/persistence assertions
- agent seam packet v2 carries recommended test name, recommended test file, nearest strong test to imitate, candidate input values, assertion shape with example, patterns to imitate, patterns to avoid, and confidence, which is enough to write the targeted test directly
- LSP code actions surface "Copy seam packet", "Copy suggested assertion", "Open related test", and "Refresh ripr analysis" for diagnostics that carry seam_id; no automatic edits
- calibration scaffold compares static SeamGripClass against cargo-mutants outcomes; runtime mutation vocabulary stays inside calibration/runtime reports; static reports keep the audit vocabulary
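The cold/warm behavior in the end state (warm path reads cached facts, cold path recomputes on a miss or a corrupt entry, store failures never fail analysis) can be sketched with the standard library alone. The path layout follows the cache location recorded for this campaign; cache_path, load_or_compute, and the line-based encode/decode pair are hypothetical, and the toy codec merely stands in for the real codec module.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Builds target/ripr/cache/repo-seam-facts/{schema_version}/{key_hash}.json.
pub fn cache_path(root: &Path, schema_version: &str, key_hash: &str) -> PathBuf {
    root.join("target/ripr/cache/repo-seam-facts")
        .join(schema_version)
        .join(format!("{key_hash}.json"))
}

/// Toy codec (sketch only): a header line followed by one fact per line.
fn encode(facts: &[String]) -> String {
    let mut out = String::from("seam-facts\n");
    for fact in facts {
        out.push_str(fact);
        out.push('\n');
    }
    out
}

/// Returns None when the payload is not in the expected shape.
fn decode(text: &str) -> Option<Vec<String>> {
    let mut lines = text.lines();
    if lines.next()? != "seam-facts" {
        return None;
    }
    Some(lines.map(str::to_string).collect())
}

/// Warm path: return decoded facts. Cold path: compute, then try to store.
pub fn load_or_compute(path: &Path, compute: impl FnOnce() -> Vec<String>) -> Vec<String> {
    if let Some(facts) = fs::read_to_string(path).ok().and_then(|t| decode(&t)) {
        return facts;
    }
    // Miss or corrupt entry: fall back to full computation.
    let facts = compute();
    if let Some(parent) = path.parent() {
        let _ = fs::create_dir_all(parent);
    }
    // A failed store is ignored so caching can never fail analysis.
    let _ = fs::write(path, encode(&facts));
    facts
}
```

Only facts pass through the cache in this sketch; rendering to JSON, Markdown, diagnostics, hover, or packets would happen after load_or_compute returns, on both paths.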
Work items:
| Work item | Status | Notes |
|---|---|---|
| cache/repo-seam-facts-v1 | done | Landed in #255. Workspace-level Vec<ClassifiedSeam> fact cache at target/ripr/cache/repo-seam-facts/{schema_version}/{key_hash}.json. serde_json behind a codec module boundary; never bincode. Cache key hashes the same Rust file set fed to build_index (production seam sources + test evidence sources), workspace root, cfg/features, config, test intent, suppressions, analyzer version, and schema version, so test-only edits invalidate. Cold path on miss / corrupt; store failures never fail analysis. Renders (JSON, Markdown, diagnostics, hover, packets) stay outside the cache. |
| analysis/related-test-precision-v1 | done | Landed in #310. Adds relation_reason and relation_confidence to related tests; ranks related tests in repo exposure report, agent packets, and LSP hover. Reduces noisy fanout without removing related_tests_total. Schema bumps: cache 0.1→0.2, agent_seam_packets 0.2→0.3, repo_exposure 0.1→0.2. Comment/string-stripping defense added for import_path_affinity. |
| analysis/value-extraction-v2 | done | Adds syntactic value resolution for let bindings, same-file constants/statics, builder and fixture-override methods, table-driven loops, rstest cases, enum variants, and one-level Option/Result constructors. Keeps string/comment shadows, cross-file constants, and unrelated builder tokens from inflating observed values. |
| analysis/oracle-shape-v2 | done | Expands oracle-shape detection for field assertions, whole-object equality over visible struct literals, event/state/persistence observers, mock expectations, and simple custom assertion helpers. Keeps is_err broad and exact assert_matches!(..., Err(...)) strong without learned priors or helper-body analysis. |
| context/agent-seam-packets-v2 | done | Schema 0.3 packets now carry recommended_test, nearest_strong_test_to_imitate, candidate_values, assertion_shape (kind + example), patterns_to_imitate, patterns_to_avoid, and recommendation confidence. Uses ranked related tests from analysis/related-test-precision-v1 when available; no automatic edits or generated test skeletons. |
| lsp/seam-code-actions-v1 | done | Seam diagnostics now surface code actions for copying the selected seam packet, copying a concrete suggested assertion when the agent packet assertion shape is available, opening the nearest related test when a related-test location is present, and refreshing ripr analysis. Finding diagnostic context-copy actions still work. No automatic edits, generated tests, CodeLens, or in-memory overlays. |
| calibration/cargo-mutants-v1 | done | Adds advisory cargo xtask mutation-calibration report generation. Imported cargo-mutants JSON/output is joined to static SeamGripClass evidence by seam_id first and unambiguous normalized file/line second; span-based locations are imported, ambiguous file/line candidates stay unassigned, and unmatched runtime mutants remain visible; runtime mutation vocabulary stays inside target/ripr/reports/mutation-calibration.{json,md}. |
| campaign/seam-evidence-usability-closeout | done | Final Campaign 5A state transition. Closed the campaign after #255, #310, #313, #314, #315, #316, and #327 landed; operationalization items moved to Campaign 5B. |
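
The cache/repo-seam-facts-v1 notes list what feeds the cache key and where entries live. The sketch below illustrates why a test-only edit changes the key: test evidence sources sit in the same hashed file set as production sources. The struct, its field names, and the use of `DefaultHasher` are assumptions. `DefaultHasher` is not guaranteed stable across Rust releases, so a real on-disk cache would likely need a stable hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::PathBuf;

/// Inputs to the cache key. Hypothetical shape; each `*_hash` field is a
/// content hash computed by the caller.
#[derive(Debug, Clone)]
pub struct CacheKeyInputs {
    pub workspace_root: String,
    /// (path, content hash) for production seam sources AND test evidence sources.
    pub rust_files: Vec<(String, u64)>,
    pub cfg_features: Vec<String>,
    pub config_hash: u64,
    pub test_intent_hash: u64,
    pub suppressions_hash: u64,
    pub analyzer_version: String,
    pub schema_version: String,
}

/// Hash every input into one hex key.
pub fn key_hash(inputs: &CacheKeyInputs) -> String {
    // Sort files so discovery order does not change the key.
    let mut files = inputs.rust_files.clone();
    files.sort();

    let mut hasher = DefaultHasher::new();
    inputs.workspace_root.hash(&mut hasher);
    files.hash(&mut hasher);
    inputs.cfg_features.hash(&mut hasher);
    inputs.config_hash.hash(&mut hasher);
    inputs.test_intent_hash.hash(&mut hasher);
    inputs.suppressions_hash.hash(&mut hasher);
    inputs.analyzer_version.hash(&mut hasher);
    inputs.schema_version.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Build target/ripr/cache/repo-seam-facts/{schema_version}/{key_hash}.json.
pub fn cache_path(inputs: &CacheKeyInputs) -> PathBuf {
    PathBuf::from("target/ripr/cache/repo-seam-facts")
        .join(&inputs.schema_version)
        .join(format!("{}.json", key_hash(inputs)))
}
```

Because the schema version is both a path segment and a key input, a schema bump lands in a fresh directory and never reads entries written under the old layout.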
Dependencies:
- cache/repo-seam-facts-v1 does not block the precision items technically, but landing it first lets the precision PRs benchmark warm/cold paths without rerunning full inventory.
- analysis/related-test-precision-v1 should land before context/agent-seam-packets-v2 so v2 packets can use ranked related tests as patterns_to_imitate / patterns_to_avoid.
- analysis/oracle-shape-v2 can land independently now that analysis/value-extraction-v2 has stabilized the value evidence floor.
- lsp/seam-code-actions-v1 should land after context/agent-seam-packets-v2 so the "Copy suggested assertion" action can use the v2 assertion_shape field.
- calibration/cargo-mutants-v1 is independent and can land any time.
Commands:
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-pr
cargo xtask goldens check
cargo xtask check-output-contracts
cargo xtask check-static-language
Blocking conditions:
- bincode introduced as a serialization dependency (the fact cache uses serde_json behind its codec module boundary)
- rendered outputs cached (only fact layers may be cached)
- mutation-runtime language (killed, survived, proven, adequate) leaking from calibration into static reports
- output drift without golden evidence
- seamDiagnostics flipped on by default before cache lands
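
The vocabulary condition above can be pictured as a whole-word scan over static report text. The following is a minimal sketch under that assumption. It is not the implementation of cargo xtask check-static-language, and the function and constant names are hypothetical.

```rust
/// Runtime mutation terms that should stay inside calibration reports.
pub const RUNTIME_TERMS: [&str; 4] = ["killed", "survived", "proven", "adequate"];

/// Return (1-based line number, term) for each whole-word, case-insensitive hit.
pub fn find_runtime_terms(static_report: &str) -> Vec<(usize, &'static str)> {
    let mut hits = Vec::new();
    for (idx, line) in static_report.lines().enumerate() {
        let lower = line.to_lowercase();
        // Split on anything that is not a word character so "inadequate"
        // does not count as a hit for "adequate".
        for word in lower.split(|c: char| !c.is_alphanumeric() && c != '_') {
            for term in RUNTIME_TERMS {
                if word == term {
                    hits.push((idx + 1, term));
                }
            }
        }
    }
    hits
}
```

A check built this way would fail the PR when the returned list is non-empty for any static report, and it would skip the calibration reports where the runtime vocabulary is allowed.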
Review policy:
This campaign is product work, not refactor work. Each work item should preserve the spec/test/code/output trail. PRs that mix implementation with refactoring should be split.
Closeout:
Campaign 5A is complete. Landed PR chain:
- #255 cache/repo-seam-facts-v1
- #310 analysis/related-test-precision-v1
- #313 analysis/value-extraction-v2
- #314 analysis/oracle-shape-v2
- #315 context/agent-seam-packets-v2
- #316 lsp/seam-code-actions-v1
- #327 calibration/cargo-mutants-v1
The active campaign now moves to Campaign 5B. Config, SARIF, and badge count remapping are operationalization work, not unfinished 5A precision work.
Campaign ID: operationalization
Status: active
Objective:
Make ripr deployable: repository config governs analyzer behavior,
SARIF and CI policy modes integrate with PR workflows, and the badge
schema can be remapped onto seam-native counts.
Work items:
| Work item | Status | Notes |
|---|---|---|
| config/ripr-config-v1 | done | Repo-root ripr.toml governs analysis mode, oracle policy, severity mapping, suppressions path, report caps, and LSP seam-diagnostic defaults while explicit CLI/LSP options still win. |
| ci/sarif-ci-policy | done | SARIF and policy modes consume configured severity and suppression policy; RIPR-SPEC-0008 pins the rule IDs, severity mapping, suppression visibility, advisory default, renderer, and opt-in baseline policy. |
| badge/seam-native-count-mapping | ready | Map ripr and ripr+ badge counts onto seam-native counts after SARIF consumes configured severity and suppression semantics. |
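
The config/ripr-config-v1 notes say that ripr.toml governs defaults while explicit CLI/LSP options still win. One way to picture that precedence is a layered lookup: explicit option first, then the config file, then a built-in default. The sketch below assumes hypothetical names (`Overrides`, `Effective`, `DEFAULT_REPORT_CAP`) and hypothetical default values; it does not reflect ripr's real config types or keys.

```rust
/// Hypothetical built-in default for the report cap.
pub const DEFAULT_REPORT_CAP: usize = 50;

/// Optional settings from one source (CLI/LSP options or ripr.toml).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Overrides {
    pub report_cap: Option<usize>,
    pub seam_diagnostics: Option<bool>,
}

/// Fully resolved settings used by the analyzer.
#[derive(Debug, PartialEq)]
pub struct Effective {
    pub report_cap: usize,
    pub seam_diagnostics: bool,
}

/// Explicit options win over the config file, which wins over defaults.
pub fn resolve(explicit: &Overrides, config_file: &Overrides) -> Effective {
    Effective {
        report_cap: explicit
            .report_cap
            .or(config_file.report_cap)
            .unwrap_or(DEFAULT_REPORT_CAP),
        // Assumed default: seam diagnostics stay off unless some layer enables them.
        seam_diagnostics: explicit
            .seam_diagnostics
            .or(config_file.seam_diagnostics)
            .unwrap_or(false),
    }
}
```

Keeping every field optional per layer means a CLI flag that is not passed never masks a value set in the config file.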
Review policy:
5B started with config/ripr-config-v1, then landed SARIF rendering and the
opt-in baseline policy. Badge count remapping is now the next ready item and
should reuse the configured severity and suppression semantics SARIF consumes.
Campaign ID: modularize-ripr-submodules
Status: planned (ready to start after Campaign 4B stabilizes)
Objective:
Refactor internal modules under crates/ripr/src/ so each module has one
product responsibility, improving maintainability, testability, and reasoning
without splitting the package.
Why it matters:
Current modules mix responsibilities (e.g., analysis/mod.rs orchestrates pipeline
and counts summaries; analysis/rust_index.rs parses, indexes, and extracts facts).
This makes behavior changes ripple across boundaries, testing harder, and future
modularization (async, parallelism, caching) more complex. Module boundaries should
align with RIPR stages and clear responsibilities.
End state:
crates/ripr/src/
  domain/ — stable data model
  app/ — use-case orchestration
  analysis/
    diff/ — diff parsing
    workspace/ — file discovery and scope
    facts/ — fact model and index
    syntax/ — syntax adapter
    extract/ — fact extraction
    probes/ — probe generation
    classify/ — classification pipeline
  output/ — rendering
  cli/ — argv parsing and execution
  lsp/ — LSP server
xtask/ — repo automation
The ripr package stays one crate with one published library and binary. Do not
split into ripr-core, ripr-cli, ripr-lsp, or schema crates.
Hard constraints:
- Do not split the crate
- No JSON schema changes
- No static output language changes
- No new probe families or classification behavior changes
- Preserve all public behavior and CLI surface
- Re-bless goldens only if the PR intentionally changes output
Work items:
| Work item | Status | Notes |
|---|---|---|
| modularization/infrastructure-and-planning | in-progress | This PR: lay down documentation, infrastructure, and establish first-PR pattern |
| analysis/summary-extraction | pending | PR 1: Extract duplicated summary and sort logic from analysis/mod.rs |
| analysis/pipeline-extraction | pending | PR 2: Make analysis/mod.rs a façade over pipeline.rs |
| diff/module-split | pending | PR 3: Split analysis/diff.rs into diff/{mod,model,load,parse}.rs |
| workspace/module-split | pending | PR 4: Split workspace concerns into workspace/{mod,discover,scope,production,paths}.rs |
| facts/model-extraction | pending | PR 5: Move fact DTOs into analysis/facts/model.rs |
| syntax/adapter-extraction | pending | PR 6: Move syntax adapters into analysis/syntax/adapter.rs |
| facts/builder-extraction | pending | PR 7: Move index construction into analysis/facts/build.rs |
| syntax/ra-extraction | pending | PR 8: Move parser-backed logic into analysis/syntax/ra.rs |
| syntax/lexical-extraction | pending | PR 9: Move lexical fallback into analysis/syntax/lexical.rs |
| extract/fact-extraction | pending | PR 10: Move extractors into analysis/extract/{calls,literals,oracles,probe_shapes,text}.rs |
| probes/family-extraction | pending | PR 11: Create analysis/probes/family.rs |
| probes/expectations-extraction | pending | PR 12: Create analysis/probes/expectations.rs |
| probes/id-extraction | pending | PR 13: Create analysis/probes/ids.rs |
| probes/lexical-extraction | pending | PR 14: Create analysis/probes/lexical.rs |
| probes/diff-repo-split | pending | PR 15: Split diff and repo probe seeding |
| classify/context-extraction | pending | PR 16: Create analysis/classify/context.rs with ProbeContext |
| classify/related-tests | pending | PR 17: Move related-test discovery into stage module |
| classify/reach-stage | pending | PR 18: Move reach evidence into stage module |
| classify/flow-propagation | pending | PR 19: Move flow and propagation stages |
| classify/activation-stage | pending | PR 20: Move activation stage |
| classify/remaining-stages | pending | PR 21: Move infection, reveal, decision, confidence, missing, stop reasons |
| app/usecase-split | pending | PR 22: Split app.rs into use-case modules (check, explain, context) |
| output/format-extraction | pending | PR 23: Move OutputFormat to output/format.rs |
| output/render-dispatch | pending | PR 24: Move rendering logic to output/render.rs |
| cli/command-model | pending | PR 25: Create cli/command.rs with CliCommand enum |
| cli/parse-command | pending | PR 26: Update cli/parse.rs to return CliCommand |
| cli/execute-command | pending | PR 27: Create cli/execute.rs for command execution |
| domain/context-packet-dto | pending | PR 28: Create domain/context_packet.rs with ContextPacket struct |
| output/json-context-dto | pending | PR 29: Update JSON context renderer to use ContextPacket |
| lsp/context-packet-usage | pending | PR 30: Update LSP to use ContextPacket |
| api/doc-hidden-internals | pending | PR 31: Mark internal modules #[doc(hidden)] |
| api/private-internals | pending | PR 32: Make internal modules private (breaking, optional) |
| xtask/command-dispatch | pending | PR 33: Split xtask into command and run modules |
| xtask/policy-modules | pending | PR 34: Organize policy checks into xtask/src/policy/ |
| xtask/report-modules | pending | PR 35: Organize reports into xtask/src/reports/ |
| campaign/modularization-closeout | pending | Final review and closure of Campaign 6 |
Dependencies:
- Phase 1 (summary, pipeline) establishes the extraction pattern and should merge before Phase 2
- Phases 2–5 (analysis breakdown) are lowest-risk and can proceed in parallel if CI capacity allows
- Phases 6–7 (app/CLI split) should follow analysis stabilization
- Phases 8–9 (API tightening) should follow all internal movement
- Phase 10 (xtask) is lowest-priority and can happen any time after Phase 1
Commands:
cargo fmt --check
cargo test --workspace
cargo xtask shape
cargo xtask fix-pr
cargo xtask check-architecture
cargo xtask check-public-api
cargo xtask check-pr
cargo xtask fixtures
cargo xtask goldens check
cargo xtask dogfood
Blocking conditions:
- Output or golden drift without intentional spec/test evidence
- Architecture guard or public API guard fails
- PR mixes multiple phases or responsibilities
- JSON schema change without new version docs
- Static language constraints violated
Review policy:
Each modularization PR should be a pure movement with zero behavior change. Include a production-delta summary noting which responsibilities moved to which modules. No refactoring or cleanup in the same PR. Include the standard acceptance checklist in the PR template.
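
As an illustration of what a pure movement can look like, the sketch below moves a function body into an inner pipeline module and re-exports it from the parent, so callers keep the same path and behavior. It loosely mirrors the "analysis/mod.rs as a façade over pipeline.rs" work item, but the function `summarize` and its logic are hypothetical stand-ins, not ripr code, and inline modules replace separate files to keep the example self-contained.

```rust
mod analysis {
    // After the move: the logic lives in the inner module...
    mod pipeline {
        /// Count non-empty findings (stand-in for real summary logic).
        pub fn summarize(findings: &[&str]) -> usize {
            findings.iter().filter(|f| !f.is_empty()).count()
        }
    }

    // ...and the parent module only re-exports it, acting as a façade.
    pub use self::pipeline::summarize;
}

/// Callers still use `analysis::summarize`, exactly as before the move.
pub fn count_findings(findings: &[&str]) -> usize {
    analysis::summarize(findings)
}
```

Because the call path and the function body are unchanged, goldens and fixtures should pass without re-blessing, which is the evidence a reviewer would look for in the production-delta summary.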