vitest: --testNamePattern filtering and duplicate name detection by jsk11235 · Pull Request #108 · imbue-ai/offload

jsk11235 · 2026-03-13T18:31:22Z

Summary

Filter vitest test execution using --testNamePattern with space-separated test names (matching vitest's internal format)
Reject duplicate space-separated test names at discovery time since --testNamePattern cannot distinguish them
Add vitest duplicate name check to the onboarding skill (Step 5), using offload collect as the verification loop
Add offload collect verification to the onboarding run step (Step 9) before full execution
Remove file selector args from vitest execution command (unnecessary given unique test names)
Update stale comments in junit.rs to match the 1:1 test ID model (they incorrectly described a one-to-many mapping with summed durations)

Test plan

cargo fmt --check passes
cargo clippy passes
cargo nextest run passes (128/128)
ratchets check passes

🤖 Generated with Claude Code

github-actions

Vet found 1 issue.

github-actions · 2026-03-13T18:32:11Z

src/report/junit.rs

@@ -491,19 +498,8 @@ pub type SharedJunitReport = Arc<Mutex<MasterJunitReport>>;

 /// Loads test durations from an existing JUnit XML file.
 ///


[documentation_implementation_mismatch] (severity 3/5) (confidence 0.95)

The load_test_durations doc comment says 'For test IDs with multiple testcases (one-to-many), durations are summed' but the implementation uses max, not sum. The code takes the maximum duration when a test ID appears multiple times (if duration > *existing { *existing = duration; }), it does not sum them.
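To make the max-versus-sum distinction concrete, here is a minimal sketch of the merge behavior the comment describes. It is not the code from junit.rs: `merge_durations` and its tuple input are hypothetical names, and only the `if duration > *existing` comparison is taken from the quoted snippet.

```rust
use std::collections::HashMap;

/// Keep the largest duration seen for each test ID (max, not sum).
/// The function name and the `(id, seconds)` input shape are assumptions
/// made for illustration only.
fn merge_durations(entries: &[(&str, f64)]) -> HashMap<String, f64> {
    let mut durations: HashMap<String, f64> = HashMap::new();
    for &(id, duration) in entries {
        // First sighting inserts the duration; later sightings only
        // overwrite it when they are strictly larger.
        let existing = durations.entry(id.to_string()).or_insert(duration);
        if duration > *existing {
            *existing = duration;
        }
    }
    durations
}
```

With entries `("a", 1.5)` and `("a", 0.5)`, the stored value for `a` is 1.5, whereas a summing implementation would store 2.0.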

github-actions

Vet found 0 issues.

github-actions

Vet found 0 issues.

github-actions

Vet found 1 issue.

github-actions · 2026-03-13T19:02:43Z

src/report/junit.rs

@@ -491,19 +498,8 @@ pub type SharedJunitReport = Arc<Mutex<MasterJunitReport>>;

 /// Loads test durations from an existing JUnit XML file.
 ///


[documentation_implementation_mismatch] (severity 3/5) (confidence 0.95)

The doc comment for load_test_durations was changed to say 'For test IDs with multiple testcases (one-to-many), durations are summed', but the implementation does NOT sum durations — it takes the maximum duration via if duration > *existing { *existing = duration; }. The doc comment is inaccurate.

github-actions · 2026-03-13T19:43:58Z

Vet found 1 issue.

src/provider.rs:22

[commit_message_mismatch] (severity 3/5) (confidence 0.95)

The user request is specifically about vitest changes (--testNamePattern filtering, duplicate name checking, doc comment cleanup). However, the diff includes many unrelated changes: adding CostEstimate struct, cost estimation to providers/sandboxes, --show-estimated-cost CLI flag, cpu_cores configuration fields, Modal CPU pricing, and multiple new issue tracker entries (code-103 through code-109). These are separate features not mentioned in the user request.

github-actions

Vet found 1 issue.

github-actions · 2026-03-13T19:44:47Z

src/report/junit.rs

@@ -491,19 +498,8 @@ pub type SharedJunitReport = Arc<Mutex<MasterJunitReport>>;

 /// Loads test durations from an existing JUnit XML file.
 ///


[documentation_implementation_mismatch] (severity 3/5) (confidence 0.95)

The load_test_durations doc comment was updated to say 'For test IDs with multiple testcases (one-to-many), durations are summed', but the actual implementation still uses max (not sum) for duplicate test IDs. The code takes the maximum duration when a test ID appears multiple times, it does not sum them.

DanverImbue

Code looks pretty good. One nit, but otherwise I'm on board.

DanverImbue · 2026-03-16T17:48:36Z

skills/offload-onboard/SKILL.md

+If the user agrees, rename the tests to make the space-separated names unique. Prefer wrapping in descriptive `describe()` blocks over renaming individual tests, as this preserves the existing test names while adding disambiguation.
+
+If the user declines, **do not proceed with Offload onboarding** — inform them that Offload's vitest integration cannot function with duplicate test names and they will need to resolve this before onboarding can continue.
+


DanverImbue · 2026-03-16T17:50:19Z

skills/offload-onboard/SKILL.md

No action required.

I'm kind of bummed that we keep thrashing these Step numbers every time we add or remove steps inline. Probably thousands of tokens are being wasted worldwide on renumbering skills :P Wish there was a better way.

DanverImbue · 2026-03-16T17:51:21Z

src/framework/vitest.rs

+/// Convert ` > ` separators to spaces to match vitest's internal name format.
+///
+/// `vitest list --json` returns names with ` > ` between describe/test titles,
+/// but `--testNamePattern` matches against names joined with spaces.


Oof, I see now.

DanverImbue · 2026-03-16T17:52:26Z

src/framework/vitest.rs

+/// Vitest `--testNamePattern` matches by the space-separated full name, so
+/// duplicates would cause over-selection.
+fn check_unique_test_names(tests: &[TestRecord]) -> FrameworkResult<()> {
+    use std::collections::HashMap;


Please fix: inline import

DanverImbue · 2026-03-16T17:58:29Z

src/framework/vitest.rs

+    let mut seen: HashMap<String, usize> = HashMap::new();
+    for t in tests {
+        let name =
+            t.id.split_once(" > ")
+                .map(|(_, n)| to_vitest_match_name(n))
+                .unwrap_or_else(|| t.id.clone());
+        *seen.entry(name).or_insert(0) += 1;
+    }


lol in python you can do:

counter = Counter(); counter[name] += 1
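For reference, the counting loop in the hunk above can be extended to return the offending names. This is only a sketch under assumptions: it works on plain string IDs instead of `TestRecord`, inlines the ` > ` to space conversion, and `find_duplicate_names` is a hypothetical helper, not a function from the PR.

```rust
use std::collections::HashMap;

/// Return the space-separated test names that occur more than once.
/// IDs are assumed to look like `{file} > {describe} > {test}`.
fn find_duplicate_names(ids: &[&str]) -> Vec<String> {
    let mut seen: HashMap<String, usize> = HashMap::new();
    for id in ids {
        // Drop the file prefix, then join the remaining titles with spaces.
        let name = id
            .split_once(" > ")
            .map(|(_, n)| n.replace(" > ", " "))
            .unwrap_or_else(|| id.to_string());
        *seen.entry(name).or_insert(0) += 1;
    }
    let mut duplicates: Vec<String> = seen
        .into_iter()
        .filter(|(_, count)| *count > 1)
        .map(|(name, _)| name)
        .collect();
    // Sort so the error message is stable across runs.
    duplicates.sort();
    duplicates
}
```

Note that `a.test.ts > login > works` and `b.test.ts > login works` collide here, since both reduce to `login works` once the separators become spaces.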

Vitest's `describe.each` can produce multiple distinct tests that share the same ID (file, line, and name). The JUnit report deduplicates by ID, so early stopping can never see all tests as passed — and one passing variant could mask failures in others under the same ID. Add `supports_early_stopping()` to the TestFramework trait (default true). Vitest returns false. The spawn worker checks this before triggering early stop. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
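A minimal sketch of the trait change this commit describes, assuming a default method that vitest overrides. Only `TestFramework` and `supports_early_stopping()` come from the commit message; `OtherFramework`, `Vitest` as a unit struct, and `should_stop_early` are hypothetical stand-ins for illustration.

```rust
/// Trait with a default-true capability flag, as described in the commit.
trait TestFramework {
    fn supports_early_stopping(&self) -> bool {
        true
    }
}

/// Hypothetical framework that keeps the default.
struct OtherFramework;
impl TestFramework for OtherFramework {}

/// Vitest opts out because several testcases may share one ID.
struct Vitest;
impl TestFramework for Vitest {
    fn supports_early_stopping(&self) -> bool {
        false
    }
}

/// Hypothetical worker-side check: stop early only when the framework
/// allows it and every test has already been seen passing.
fn should_stop_early(framework: &dyn TestFramework, all_passed: bool) -> bool {
    framework.supports_early_stopping() && all_passed
}
```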

A single test ID can map to multiple testcases (e.g. vitest describe.each without parameter interpolation). Previously the HashMap-based outcome tracking collapsed these, causing mismatches between discovered count and report count.

Changes:
- MasterJunitReport groups testcases by ID per-testsuite: an ID passes only if ALL its testcases in a suite passed, fails if ANY failed. Flakiness detected across suites as before.
- Orchestrator compares unique discovered IDs (not raw record count) against unique IDs in the report.
- Added testcase_count() for raw testcase totals.
- Disabled early stopping for vitest (no 1:1 ID guarantee).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
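The all-pass / any-fail rule from this commit can be sketched as follows. This illustrates the grouping for a single testsuite only, with hypothetical names (`Outcome`, `aggregate_by_id`); it is not the MasterJunitReport code, and later commits in this PR return to a 1:1 ID model.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Passed,
    Failed,
}

/// Collapse testcases that share an ID: the ID passes only if every
/// testcase passed, and fails as soon as any testcase failed.
fn aggregate_by_id(testcases: &[(&str, Outcome)]) -> HashMap<String, Outcome> {
    let mut by_id: HashMap<String, Outcome> = HashMap::new();
    for &(id, outcome) in testcases {
        let entry = by_id.entry(id.to_string()).or_insert(Outcome::Passed);
        // A failure is sticky; a later pass never clears it.
        if outcome == Outcome::Failed {
            *entry = Outcome::Failed;
        }
    }
    by_id
}
```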

Remove line-number-based test targeting from the vitest framework. Test IDs now use `{file} > {name}` format instead of `{file}:{line} > {name}`. Execution selectors are file paths only. This fixes browser mode where Vite bundling shifts line numbers, making file:line locators unreliable. The one-to-many report grouping handles describe.each duplicates (same file + name). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add --testNamePattern ^(name1|name2|...)$ to vitest run commands so only the batch's tests execute (not all tests in matched files). Names are regex-escaped via the regex crate. This fixes browser mode where file-only selectors cause all tests in a file to run. Add check_unique_test_names() during discovery — duplicate test names across files would cause --testNamePattern to over-select, so we error out early with a clear message listing the duplicates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
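A rough sketch of how such an anchored alternation could be built. The commit says escaping is done via the regex crate; to stay dependency-free, this example swaps in a simplified hand-written escaper, and both `escape_regex` and `build_name_pattern` are hypothetical names.

```rust
/// Simplified stand-in for `regex::escape`: backslash-escape common
/// regex metacharacters. The real crate covers a slightly larger set.
fn escape_regex(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if matches!(
            c,
            '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '|' | '[' | ']' | '{' | '}' | '^' | '$'
        ) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Build `^(name1|name2|...)$` so only exact full-name matches run.
fn build_name_pattern(names: &[&str]) -> String {
    let alternatives: Vec<String> = names.iter().map(|n| escape_regex(n)).collect();
    format!("^({})$", alternatives.join("|"))
}
```

The `^` and `$` anchors matter: without them, a name like `adds` would also select `adds negative numbers`.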

Add --reporter=verbose alongside --reporter=json so vitest outputs human-readable test progress to stdout (captured in batch logs) while still writing JSON results to the output file. This makes batch logs useful for debugging empty result files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Vitest --testNamePattern matches against the leaf test()/it() title, not the full describe chain. Previously we passed the full name after the file prefix (e.g., "suite > nested > leaf"), which only matched flat tests without describe blocks. Now we extract just the leaf segment (e.g., "leaf") via rsplit_once(" > "). The duplicate name check now also validates leaf name uniqueness, since --testNamePattern cannot distinguish tests with the same leaf name across different describe blocks or files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
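For illustration, the leaf extraction described in this commit amounts to roughly the following. `leaf_name` is a hypothetical name, and the subsequent commits replace this approach with space-joined full names.

```rust
/// Return the last ` > `-separated segment of a test name, or the whole
/// name when it has no describe chain.
fn leaf_name(full_name: &str) -> &str {
    full_name
        .rsplit_once(" > ")
        .map(|(_, leaf)| leaf)
        .unwrap_or(full_name)
}
```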

Vitest's --testNamePattern matches against the full test name where describe/suite names are joined with spaces, not ' > '. But vitest list --json returns names with ' > ' separators. Convert ' > ' to spaces when building the pattern so nested tests in describe blocks are matched correctly. Also check duplicate names using the space-separated form, with an error message explaining how to fix (rename tests or use unique describe blocks). Results: 124/127 tests matched (up from 53/127 before this fix). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
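The conversion described here can be sketched as below. `to_vitest_match_name` is the name used in the diff hunks in this PR, but its body is an assumption, and `match_name_from_id` is a hypothetical wrapper that mirrors the `split_once` call quoted in the review.

```rust
/// Join describe/test titles with spaces, the form that
/// `--testNamePattern` is matched against.
fn to_vitest_match_name(name: &str) -> String {
    name.replace(" > ", " ")
}

/// Strip the `{file} > ` prefix from a test ID, then convert the rest.
/// IDs without a separator are returned unchanged.
fn match_name_from_id(id: &str) -> String {
    match id.split_once(" > ") {
        Some((_file, name)) => to_vitest_match_name(name),
        None => id.to_string(),
    }
}
```

For the ID `src/math.test.ts > math > add > handles zero`, this yields `math add handles zero`.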

Vitest's --testNamePattern uses space-separated names (not ' > ') for matching test names. Restore the ' > ' to space conversion that achieved 124/127 test matches.

The 3 remaining unmatched tests are vitest browser-mode edge cases:
- cdp.test-d.ts tests (TypeScript declaration files skipped in browser)
- viewport nested describe (inconsistent vitest name joining)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Step 1 now detects vitest as a framework. A new Step 2 checks for duplicate space-separated test names before proceeding, since Offload's --testNamePattern cannot distinguish them. If duplicates are found, the agent stops and asks the user to deduplicate before continuing onboarding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Step 10 now instructs agents to run offload collect first to verify test discovery works before attempting full execution. This catches config errors (wrong paths, duplicate vitest names, missing tools) without the cost of spinning up sandboxes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move the vitest duplicate check from Step 2 (manual npx vitest list) to new Step 5 (after offload.toml creation) using offload collect. The agent now iterates on offload collect until all duplicate space-separated test names are resolved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions

Vet found 0 issues.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Test names are guaranteed unique by check_unique_test_names(), so --testNamePattern alone suffices for test selection. File selectors were a micro-optimization not worth the complexity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rename test_execution_command_builds_file_args to test_execution_command_builds_name_pattern (file selectors were removed). Fix "leaf name" comment from old approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove references to one-to-many test ID mapping from doc comments. Fix load_test_durations doc which incorrectly claimed durations were summed (code uses max). Simplify MasterJunitReport and TestStatus docs to reflect the current 1:1 ID to testcase model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions

Vet found 0 issues.

github-actions · 2026-03-16T19:05:40Z

Vet found 1 issue.

README.md:16

[commit_message_mismatch] (severity 2/5) (confidence 0.80)

The diff adds a benchmarks section to README.md, benchmark SVG bar images, and a new benchmark skill file (skills/offload-benchmark/SKILL.md). These are not mentioned in the user request, which focuses on vitest --testNamePattern filtering, duplicate name detection, onboarding skill updates, offload collect verification, removing file selector args, and updating stale comments in junit.rs.

github-actions bot reviewed Mar 13, 2026

jsk11235 force-pushed the vitest-name-pattern-filter branch from 0820f52 to 42fb886 Compare March 13, 2026 19:43

github-actions bot reviewed Mar 13, 2026

DanverImbue reviewed Mar 16, 2026

jacobkirmayer-imbue and others added 11 commits March 16, 2026 11:04

jsk11235 force-pushed the vitest-name-pattern-filter branch from e23232c to f84f8d8 Compare March 16, 2026 18:10

github-actions bot reviewed Mar 16, 2026

jacobkirmayer-imbue and others added 4 commits March 16, 2026 11:47

vitest: move HashMap import to module scope

f4d27be

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vitest: fix outdated comments in test code

4f3d725

Rename test_execution_command_builds_file_args to test_execution_command_builds_name_pattern (file selectors were removed). Fix "leaf name" comment from old approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jsk11235 changed the title ~~vitest: filter by --testNamePattern and reject duplicate names~~ vitest: --testNamePattern filtering and duplicate name detection Mar 16, 2026

github-actions bot reviewed Mar 16, 2026

jsk11235 merged commit 666da0b into main Mar 16, 2026
5 checks passed

jsk11235 deleted the vitest-name-pattern-filter branch March 16, 2026 19:08
