Document ensemble test generation workflow (#159) #160

Merged
that-github-user merged 2 commits into main from issue-159-ensemble-test-generation on Mar 30, 2026

@that-github-user
Owner

Summary

  • Adds "Recommended workflows" section to README documenting the two-phase ensemble pattern: generate tests first, then implement against converged tests
  • Includes an ensemble-generated test suite (test_pathfinding_generated.py) produced by running thinktank with 5 agents; 3 of the 5 succeeded, and all 3 converged on the correct maze path length assertions
  • Demonstrates the exact bug that motivated this: a single-agent test asserted path length 13 when the correct answer is 9, causing 13+ runs to show 0% pass rate on correct implementations
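
To make the failure mode in the last bullet concrete, here is a minimal sketch of how a wrong oracle drives the pass rate to zero. The `pass_rate` helper is hypothetical and not part of thinktank; the run count of 13 is an assumption standing in for the "13+ runs" mentioned above, and each run is assumed to return the correct length of 9.

```python
def pass_rate(observed_lengths, expected_length):
    """Fraction of runs whose returned path length matches the test's assertion."""
    if not observed_lengths:
        raise ValueError("no runs to score")
    passed = sum(1 for length in observed_lengths if length == expected_length)
    return passed / len(observed_lengths)


# Assumption for illustration: 13 runs, each returning the correct length 9.
correct_runs = [9] * 13

print(pass_rate(correct_runs, expected_length=13))  # wrong oracle  -> 0.0
print(pass_rate(correct_runs, expected_length=9))   # fixed oracle  -> 1.0
```

The implementations never changed between the two calls; only the asserted value did, which is why the test has to be validated before it is used to score anything.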

Evidence

We ran thinktank run "write tests for A* pathfinding" -n 5 and compared the maze path length assertions:

All 3 passing agents independently computed the same expected values. If one had disagreed, convergence analysis would have flagged it before the test became the oracle.
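
The convergence check described above could look roughly like the following. This is a sketch under stated assumptions, not thinktank's actual convergence analysis: `check_convergence`, the `Convergence` record, and the agent names are all hypothetical, and the comparison is a plain majority vote over the expected values each agent proposed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Convergence:
    value: object      # most common expected value across agents
    agreement: float   # fraction of agents that proposed that value
    dissenters: list   # agents that proposed something else


def check_convergence(proposals):
    """Majority vote over the expected value proposed by each agent."""
    if not proposals:
        raise ValueError("no proposals to compare")
    counts = Counter(proposals.values())
    value, votes = counts.most_common(1)[0]
    dissenters = sorted(name for name, v in proposals.items() if v != value)
    return Convergence(value, votes / len(proposals), dissenters)


# The situation reported in this PR: three passing agents, all asserting 9.
unanimous = check_convergence({"agent-1": 9, "agent-2": 9, "agent-3": 9})
print(unanimous.agreement, unanimous.dissenters)

# Hypothetical disagreement: one agent repeats the earlier wrong value, 13.
split = check_convergence({"agent-1": 9, "agent-2": 9, "agent-3": 13})
print(split.value, split.dissenters)
```

Any non-empty `dissenters` list is the signal to inspect the assertion by hand before the test is adopted as the oracle.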

Test plan

  • npm run build passes
  • npm test passes (250 tests)
  • Generated test file collects correctly with pytest --collect-only
  • README workflow section is clear and actionable
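
For the collection check in the test plan, a wrapper along these lines would work. It is only a sketch: the function names are hypothetical and this is not the wrapper script added in this PR. It relies on pytest's documented `--collect-only` and `-q` options and on pytest exiting with 0 when collection succeeds and 5 when no tests are collected.

```python
import subprocess
import sys


def build_collect_command(test_file):
    # Run pytest through the current interpreter so it comes from the active environment.
    return [sys.executable, "-m", "pytest", "--collect-only", "-q", test_file]


def collection_ok(returncode):
    # pytest exits 0 on success; 5 means no tests were collected.
    return returncode == 0


def collect(test_file):
    """Return True if pytest can collect the file without running any tests."""
    result = subprocess.run(
        build_collect_command(test_file), capture_output=True, text=True
    )
    if not collection_ok(result.returncode):
        # Surface pytest's own report so the failure is easy to diagnose.
        print(result.stdout + result.stderr)
    return collection_ok(result.returncode)
```

Calling `collect("test_pathfinding_generated.py")` from the directory containing the file would then report whether the generated suite at least imports and collects cleanly.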

Closes #159

🤖 Generated with Claude Code

unknown and others added 2 commits March 29, 2026 23:19
- Add "Recommended workflows" section to README with two-phase pattern:
  generate tests via ensemble first, then implement against converged tests
- Include ensemble-generated test suite (test_pathfinding_generated.py)
  produced by 3/5 agents, all converging on correct maze path length (9)
- Add test collection wrapper script for --collect-only validation

Addresses #159 (Option A: documented workflow)
Also references #31 (A* showcase evidence)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@that-github-user that-github-user merged commit ca6e68f into main Mar 30, 2026
2 of 4 checks passed
that-github-user pushed a commit that referenced this pull request Mar 31, 2026
- Fix exit-127 preflight test: accept both "Test command not found" and
  generic failure message (Linux returns 127, Windows may differ)
- Add GitHub Packages publish step to release workflow: publishes as
  @that-github-user/thinktank-ai to npm.pkg.github.com alongside
  the existing npmjs.org publish
- Add packages:write permission to release workflow

Fixes CI failure from #160. Resolves "No packages published" on GitHub.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
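
The exit-127 fix above amounts to accepting two messages in the test instead of one. A rough sketch of the idea follows; the helper names and the wording of the generic message are assumptions, and only the "Test command not found" string comes from the commit message.

```python
NOT_FOUND_EXIT = 127  # POSIX shells use this status for "command not found"


def describe_preflight_failure(returncode):
    """Map a failed test command's exit status to a user-facing message."""
    if returncode == NOT_FOUND_EXIT:
        return "Test command not found"
    # Assumed generic wording; other platforms may not report 127.
    return f"Test command failed with exit code {returncode}"


def is_acceptable_message(message):
    # A portable test accepts either the specific or the generic message.
    return message == "Test command not found" or message.startswith(
        "Test command failed"
    )


print(describe_preflight_failure(127))
print(describe_preflight_failure(1))
```

Asserting on `is_acceptable_message(...)` rather than on one exact string keeps the preflight test from depending on which exit status the host shell returns.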
that-github-user added a commit that referenced this pull request Mar 31, 2026
* Fix CI test failure and add GitHub Packages publishing

- Fix exit-127 preflight test: accept both "Test command not found" and
  generic failure message (Linux returns 127, Windows may differ)
- Add GitHub Packages publish step to release workflow: publishes as
  @that-github-user/thinktank-ai to npm.pkg.github.com alongside
  the existing npmjs.org publish
- Add packages:write permission to release workflow

Fixes CI failure from #160. Resolves "No packages published" on GitHub.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix formatting (trailing comma)

* Remove accidentally committed artifacts, update .gitignore

---------

Co-authored-by: unknown <that-github-user@github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@that-github-user that-github-user deleted the issue-159-ensemble-test-generation branch April 1, 2026 00:44
Development

Successfully merging this pull request may close these issues.

Ensemble test generation: use thinktank to validate test suites before running implementations
