Add matrix filtering, retry, resume, cleanup, and HTML reports to harness#2979

Merged
Myestery merged 2 commits into skills from harness-issues on Mar 20, 2026

@Myestery
Contributor

Summary

Test plan

  • cd tests/harness && npm run build compiles cleanly
  • npm test: all new matrix, retry, and HTML report tests pass
  • npx tsx src/run.ts --help shows all new flags
  • --merge-reports with sample summary.json files produces merged-summary.json and report.html
  • Scenario YAML with matrix and retry blocks loads correctly
  • --resume-from with an invalid step ID throws a descriptive error
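
To illustrate the last item in the test plan, here is a minimal sketch of how a --resume-from value could be validated against a scenario's step IDs. The `Step` shape, the `resolveResumeIndex` name, and the error wording are all assumptions made for illustration; they are not taken from the harness source.

```typescript
// Hypothetical step shape; the real harness types may differ.
interface Step {
  id: string;
}

// Returns the index of the step to resume from, or throws an error
// that names the unknown ID and lists the IDs that do exist.
function resolveResumeIndex(steps: Step[], resumeFrom: string): number {
  const index = steps.findIndex((step) => step.id === resumeFrom);
  if (index === -1) {
    const known = steps.map((step) => step.id).join(", ");
    throw new Error(
      `--resume-from: unknown step ID "${resumeFrom}". Known step IDs: ${known}`
    );
  }
  return index;
}
```

Listing the known IDs in the message is one way to make the error "descriptive"; a caller would then skip every step before the returned index.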

Closes #2911, #2916, #2915, #2913, #2912, #2903

artifactPaths: string[];
}

async function cleanupGolemState(cwd: string): Promise<void> {

This does not really clean up everything that a test creates. I think we should restart the whole server with --clean (and maybe warn the user interactively if not running on GitHub).

…ness

Use golem server clean + restart for full cleanup between scenarios,
with interactive confirmation when running on a TTY outside CI.
…ness

- Golem state cleanup uses golem server clean + restart between scenarios
- Interactive TTY confirmation when not running on CI
- --no-cleanup flag to skip cleanup
- HTML summary report generation
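
The cleanup rules in this commit message can be read as a small decision function. The sketch below is an assumption about how the three conditions combine (the --no-cleanup flag, TTY detection, CI detection); `CleanupContext`, `CleanupAction`, and `decideCleanup` are hypothetical names, and the actual harness may order or combine these checks differently.

```typescript
// What the harness should do before running the next scenario.
type CleanupAction = "skip" | "confirm" | "run";

// Hypothetical inputs; a caller might fill these from Node, for example
// isTTY from process.stdin.isTTY and isCI from process.env.CI === "true".
interface CleanupContext {
  noCleanup: boolean; // true when --no-cleanup was passed
  isTTY: boolean; // true when attached to an interactive terminal
  isCI: boolean; // true when running on CI
}

function decideCleanup(ctx: CleanupContext): CleanupAction {
  // An explicit opt-out wins over everything else.
  if (ctx.noCleanup) return "skip";
  // Interactive local runs ask before wiping server state.
  if (ctx.isTTY && !ctx.isCI) return "confirm";
  // CI and non-interactive runs clean without prompting.
  return "run";
}
```

Keeping the decision separate from the code that actually cleans and restarts the server makes the prompt logic easy to unit test without a terminal.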
@Myestery merged commit 993ff47 into skills on Mar 20, 2026
23 checks passed
@Myestery deleted the harness-issues branch on March 20, 2026 20:29
@github-actions bot locked and limited conversation to collaborators on Mar 20, 2026
