Skip to content

vitest: --testNamePattern filtering and duplicate name detection#108

Merged
jsk11235 merged 15 commits intomainfrom
vitest-name-pattern-filter
Mar 16, 2026
Merged

vitest: --testNamePattern filtering and duplicate name detection#108
jsk11235 merged 15 commits intomainfrom
vitest-name-pattern-filter

Conversation

@jsk11235
Copy link
Collaborator

@jsk11235 jsk11235 commented Mar 13, 2026

Summary

  • Filter vitest test execution using --testNamePattern with space-separated test names (matching vitest's internal format)
  • Reject duplicate space-separated test names at discovery time since --testNamePattern cannot distinguish them
  • Add vitest duplicate name check to the onboarding skill (Step 5), using offload collect as the verification loop
  • Add offload collect verification to the onboarding run step (Step 9) before full execution
  • Remove file selector args from vitest execution command (unnecessary given unique test names)
  • Update stale comments in junit.rs for 1:1 test ID model (was incorrectly describing one-to-many and summing)

Test plan

  • cargo fmt --check passes
  • cargo clippy passes
  • cargo nextest run passes (128/128)
  • ratchets check passes

🤖 Generated with Claude Code

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 1 issue.

@@ -491,19 +498,8 @@ pub type SharedJunitReport = Arc<Mutex<MasterJunitReport>>;

/// Loads test durations from an existing JUnit XML file.
///

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[documentation_implementation_mismatch] (severity 3/5) (confidence 0.95)

The load_test_durations doc comment says 'For test IDs with multiple testcases (one-to-many), durations are summed' but the implementation uses max, not sum. The code takes the maximum duration when a test ID appears multiple times (if duration > *existing { *existing = duration; }), it does not sum them.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 0 issues.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 0 issues.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 1 issue.

@@ -491,19 +498,8 @@ pub type SharedJunitReport = Arc<Mutex<MasterJunitReport>>;

/// Loads test durations from an existing JUnit XML file.
///

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[documentation_implementation_mismatch] (severity 3/5) (confidence 0.95)

The doc comment for load_test_durations was changed to say 'For test IDs with multiple testcases (one-to-many), durations are summed', but the implementation does NOT sum durations — it takes the maximum duration via if duration > *existing { *existing = duration; }. The doc comment is inaccurate.

@github-actions
Copy link

Vet found 1 issue.


src/provider.rs:22

[commit_message_mismatch] (severity 3/5) (confidence 0.95)

The user request is specifically about vitest changes (--testNamePattern filtering, duplicate name checking, doc comment cleanup). However, the diff includes many unrelated changes: adding CostEstimate struct, cost estimation to providers/sandboxes, --show-estimated-cost CLI flag, cpu_cores configuration fields, Modal CPU pricing, and multiple new issue tracker entries (code-103 through code-109). These are separate features not mentioned in the user request.

@jsk11235 jsk11235 force-pushed the vitest-name-pattern-filter branch from 0820f52 to 42fb886 Compare March 13, 2026 19:43
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 1 issue.

@@ -491,19 +498,8 @@ pub type SharedJunitReport = Arc<Mutex<MasterJunitReport>>;

/// Loads test durations from an existing JUnit XML file.
///

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[documentation_implementation_mismatch] (severity 3/5) (confidence 0.95)

The load_test_durations doc comment was updated to say 'For test IDs with multiple testcases (one-to-many), durations are summed', but the actual implementation still uses max (not sum) for duplicate test IDs. The code takes the maximum duration when a test ID appears multiple times, it does not sum them.

Copy link
Collaborator

@DanverImbue DanverImbue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks pretty good. One nit, but otherwise I'm on board.

If the user agrees, rename the tests to make the space-separated names unique. Prefer wrapping in descriptive `describe()` blocks over renaming individual tests, as this preserves the existing test names while adding disambiguation.

If the user declines, **do not proceed with Offload onboarding** — inform them that Offload's vitest integration cannot function with duplicate test names and they will need to resolve this before onboarding can continue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action required.

I'm kind of bummed that we keep thrashing these Step numbers every time we add or remove steps inline. Probably thousands of tokens are being wasted worldwide on renumbering skills :P Wish there was a better way.

/// Convert ` > ` separators to spaces to match vitest's internal name format.
///
/// `vitest list --json` returns names with ` > ` between describe/test titles,
/// but `--testNamePattern` matches against names joined with spaces.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, I see now.

/// Vitest `--testNamePattern` matches by the space-separated full name, so
/// duplicates would cause over-selection.
fn check_unique_test_names(tests: &[TestRecord]) -> FrameworkResult<()> {
use std::collections::HashMap;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix: inline import

Comment on lines +98 to +105
let mut seen: HashMap<String, usize> = HashMap::new();
for t in tests {
let name =
t.id.split_once(" > ")
.map(|(_, n)| to_vitest_match_name(n))
.unwrap_or_else(|| t.id.clone());
*seen.entry(name).or_insert(0) += 1;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol in python you can do:

c = Counter()
counter[name] += 1

jacobkirmayer-imbue and others added 11 commits March 16, 2026 11:04
Vitest's `describe.each` can produce multiple distinct tests that
share the same ID (file, line, and name). The JUnit report deduplicates
by ID, so early stopping can never see all tests as passed — and one
passing variant could mask failures in others under the same ID.

Add `supports_early_stopping()` to the TestFramework trait (default
true). Vitest returns false. The spawn worker checks this before
triggering early stop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A single test ID can map to multiple testcases (e.g. vitest
describe.each without parameter interpolation). Previously the
HashMap-based outcome tracking collapsed these, causing mismatches
between discovered count and report count.

Changes:
- MasterJunitReport groups testcases by ID per-testsuite: an ID
  passes only if ALL its testcases in a suite passed, fails if ANY
  failed. Flakiness detected across suites as before.
- Orchestrator compares unique discovered IDs (not raw record count)
  against unique IDs in the report.
- Added testcase_count() for raw testcase totals.
- Disabled early stopping for vitest (no 1:1 ID guarantee).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove line-number-based test targeting from the vitest framework.
Test IDs now use `{file} > {name}` format instead of
`{file}:{line} > {name}`. Execution selectors are file paths only.

This fixes browser mode where Vite bundling shifts line numbers,
making file:line locators unreliable. The one-to-many report
grouping handles describe.each duplicates (same file + name).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add --testNamePattern ^(name1|name2|...)$ to vitest run commands so
only the batch's tests execute (not all tests in matched files).
Names are regex-escaped via the regex crate. This fixes browser mode
where file-only selectors cause all tests in a file to run.

Add check_unique_test_names() during discovery — duplicate test names
across files would cause --testNamePattern to over-select, so we
error out early with a clear message listing the duplicates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add --reporter=verbose alongside --reporter=json so vitest outputs
human-readable test progress to stdout (captured in batch logs) while
still writing JSON results to the output file. This makes batch logs
useful for debugging empty result files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Vitest --testNamePattern matches against the leaf test()/it() title,
not the full describe chain. Previously we passed the full name after
the file prefix (e.g., "suite > nested > leaf"), which only matched
flat tests without describe blocks. Now we extract just the leaf
segment (e.g., "leaf") via rsplit_once(" > ").

The duplicate name check now also validates leaf name uniqueness,
since --testNamePattern cannot distinguish tests with the same leaf
name across different describe blocks or files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Vitest's --testNamePattern matches against the full test name where
describe/suite names are joined with spaces, not ' > '. But vitest
list --json returns names with ' > ' separators. Convert ' > ' to
spaces when building the pattern so nested tests in describe blocks
are matched correctly.

Also check duplicate names using the space-separated form, with an
error message explaining how to fix (rename tests or use unique
describe blocks).

Results: 124/127 tests matched (up from 53/127 before this fix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Vitest's --testNamePattern uses space-separated names (not ' > ')
for matching test names. Restore the ' > ' to space conversion that
achieved 124/127 test matches.

The 3 remaining unmatched tests are vitest browser-mode edge cases:
- cdp.test-d.ts tests (TypeScript declaration files skipped in browser)
- viewport nested describe (inconsistent vitest name joining)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 1 now detects vitest as a framework. A new Step 2 checks for
duplicate space-separated test names before proceeding, since
Offload's --testNamePattern cannot distinguish them. If duplicates
are found, the agent stops and asks the user to deduplicate before
continuing onboarding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 10 now instructs agents to run offload collect first to verify
test discovery works before attempting full execution. This catches
config errors (wrong paths, duplicate vitest names, missing tools)
without the cost of spinning up sandboxes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the vitest duplicate check from Step 2 (manual npx vitest list)
to new Step 5 (after offload.toml creation) using offload collect.
The agent now iterates on offload collect until all duplicate
space-separated test names are resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jsk11235 jsk11235 force-pushed the vitest-name-pattern-filter branch from e23232c to f84f8d8 Compare March 16, 2026 18:10
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 0 issues.

jacobkirmayer-imbue and others added 4 commits March 16, 2026 11:47
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test names are guaranteed unique by check_unique_test_names(), so
--testNamePattern alone suffices for test selection. File selectors
were a micro-optimization not worth the complexity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename test_execution_command_builds_file_args to
test_execution_command_builds_name_pattern (file selectors were
removed). Fix "leaf name" comment from old approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove references to one-to-many test ID mapping from doc comments.
Fix load_test_durations doc which incorrectly claimed durations were
summed (code uses max). Simplify MasterJunitReport and TestStatus
docs to reflect the current 1:1 ID to testcase model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jsk11235 jsk11235 changed the title vitest: filter by --testNamePattern and reject duplicate names vitest: --testNamePattern filtering and duplicate name detection Mar 16, 2026
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet found 0 issues.

@github-actions
Copy link

Vet found 1 issue.


README.md:16

[commit_message_mismatch] (severity 2/5) (confidence 0.80)

The diff adds a benchmarks section to README.md, benchmark SVG bar images, and a new benchmark skill file (skills/offload-benchmark/SKILL.md). These are not mentioned in the user request, which focuses on vitest --testNamePattern filtering, duplicate name detection, onboarding skill updates, offload collect verification, removing file selector args, and updating stale comments in junit.rs.

@jsk11235 jsk11235 merged commit 666da0b into main Mar 16, 2026
5 checks passed
@jsk11235 jsk11235 deleted the vitest-name-pattern-filter branch March 16, 2026 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants