Write launch blog post: 'Why we run 5 AI agents and pick the best' #141
Closed
Labels
enhancement: New feature or request
Description
Summary
For the Hacker News / dev.to / blog launch, we need a compelling narrative.
Proposed structure
- The problem: AI coding is unreliable. pass@1 is a gamble.
- The insight: pass@5 >> pass@1 (academic evidence)
- The solution: thinktank runs N agents in parallel
- The twist: we use social choice theory (Copeland) to pick the winner
- Real data: our scoring evaluation shows Copeland and Borda agree in 86% of cases
- Dogfooding: thinktank built itself (63 PRs, 237 tests)
- The future: A* showcase, arXiv paper
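To make the pass@1 vs pass@5 point concrete in the post, a minimal sketch could help. It assumes each agent succeeds independently with the same probability `p`, which is a simplification rather than a measured property of thinktank, and the function name `pass_at_k` is illustrative only.

```python
def pass_at_k(p: float, k: int) -> float:
    """Probability that at least one of k attempts succeeds.

    Assumes attempts are independent and each succeeds with probability p.
    Real agents sharing a model and prompt are likely correlated, so this
    is an optimistic estimate.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Illustrative single-attempt success rate, not a thinktank measurement.
    p = 0.3
    for k in (1, 3, 5):
        print(f"k={k}: {pass_at_k(p, k):.3f}")
```

Note that pass@k only says a correct attempt exists among the k; turning that into a better result still requires a way to pick the right one, which is where the Copeland step comes in.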
Key metrics to include
- 63 PRs merged in one session
- 237 tests, all passing
- 45 thinktank runs dogfooding itself
- Copeland-Borda 86% concordance finding
- Test pass rate improvement from 21% to 72%
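For readers unfamiliar with the two voting rules, the post could include a small sketch of how Copeland and Borda winners are computed and how a concordance rate falls out of comparing them. This is an assumption-laden illustration, not thinktank's actual scoring code: it treats each evaluation criterion as a "voter" that ranks the candidate agents from best to worst, and all function names and sample rankings are hypothetical.

```python
from itertools import combinations


def borda_scores(rankings):
    """Each ranking is a best-to-worst list; position i of n earns n-1-i points."""
    n = len(rankings[0])
    scores = {c: 0 for c in rankings[0]}
    for ranking in rankings:
        for pos, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - pos
    return scores


def copeland_scores(rankings):
    """+1 for each pairwise majority win, -1 for each loss, 0 for a tie."""
    candidates = rankings[0]
    positions = [{c: i for i, c in enumerate(r)} for r in rankings]
    scores = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_wins = sum(1 for pos in positions if pos[a] < pos[b])
        b_wins = len(rankings) - a_wins
        if a_wins > b_wins:
            scores[a] += 1
            scores[b] -= 1
        elif b_wins > a_wins:
            scores[b] += 1
            scores[a] -= 1
    return scores


def winner(scores):
    """Highest score wins; ties are broken alphabetically (an arbitrary choice)."""
    return max(sorted(scores), key=scores.get)


def concordance(runs):
    """Fraction of runs where Copeland and Borda pick the same winner."""
    agree = sum(
        1 for r in runs if winner(copeland_scores(r)) == winner(borda_scores(r))
    )
    return agree / len(runs)


# Hypothetical data: three voters prefer A > B > C, two prefer B > C > A.
split_run = [["A", "B", "C"]] * 3 + [["B", "C", "A"]] * 2
# Hypothetical data: all voters agree on A > B > C.
unanimous_run = [["A", "B", "C"]] * 3
```

In `split_run` the two rules disagree: A beats every rival head-to-head, so Copeland picks A, while B collects more positional points and wins under Borda. In `unanimous_run` both pick A, so `concordance([split_run, unanimous_run])` is 0.5.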