Write launch blog post: 'Why we run 5 AI agents and pick the best' #141

@that-github-user

Summary

For the Hacker News / dev.to / blog launch, we need a post with a compelling narrative.

Proposed structure

  1. The problem: AI coding is unreliable. pass@1 is a gamble.
  2. The insight: pass@5 >> pass@1 (academic evidence)
  3. The solution: thinktank runs N agents in parallel
  4. The twist: we use social choice theory (Copeland's method) to pick the winner
  5. Real data: our scoring evaluation shows Copeland and Borda agree in 86% of cases
  6. Dogfooding: thinktank built itself (63 PRs, 237 tests)
  7. The future: A* showcase, arXiv paper
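
Items 1 and 2 rest on the pass@k metric. As a rough sketch of why sampling more attempts helps, the snippet below implements the commonly used unbiased pass@k estimator from the code-generation evaluation literature (n sampled attempts, c of them correct). The function name `pass_at_k` and the example numbers are illustrative assumptions, not thinktank code or thinktank results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k attempts passes.

    n: total attempts sampled, c: attempts that passed, k: attempts drawn.
    Illustrative helper only; not part of thinktank.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Probability that all k drawn attempts come from the n - c failures.
    all_fail = comb(n - c, k) / comb(n, k)
    return 1.0 - all_fail


# Made-up example: 3 of 10 sampled attempts passed.
p1 = pass_at_k(10, 3, 1)
p5 = pass_at_k(10, 3, 5)
print(f"pass@1 = {p1:.2f}, pass@5 = {p5:.2f}")
```

With these made-up numbers the estimate moves from 0.30 at k = 1 to about 0.92 at k = 5, which is the kind of gap the post should back with cited evidence rather than with this toy example.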

Key metrics to include

  • 63 PRs merged in one session
  • 237 tests, all passing
  • 45 thinktank runs on its own codebase (dogfooding)
  • 86% concordance between Copeland and Borda winners
  • Test pass rate improvement from 21% to 72%
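
For readers unfamiliar with the two voting rules behind the concordance metric, here is a minimal sketch of Copeland and Borda scoring over a set of candidate attempts. It assumes each attempt is scored on several criteria where higher is better; the agent names, the criteria, the numbers, and the reading of "concordance" as "both rules pick the same winner" are all assumptions for illustration, not a description of how thinktank actually scores.

```python
from itertools import combinations

# Hypothetical scores: one list per agent attempt, one entry per criterion
# (higher is better). Names and values are illustrative only.
scores = {
    "agent-1": [0.9, 0.2, 0.5],
    "agent-2": [0.8, 0.6, 0.6],
    "agent-3": [0.3, 0.7, 0.4],
}


def copeland(scores):
    """Pairwise contests: +1 for each head-to-head win, -1 for each loss."""
    totals = {name: 0 for name in scores}
    for a, b in combinations(scores, 2):
        a_wins = sum(x > y for x, y in zip(scores[a], scores[b]))
        b_wins = sum(y > x for x, y in zip(scores[a], scores[b]))
        if a_wins > b_wins:
            totals[a] += 1
            totals[b] -= 1
        elif b_wins > a_wins:
            totals[b] += 1
            totals[a] -= 1
    return totals


def borda(scores):
    """Per criterion, each candidate earns one point per rival it outscores."""
    totals = {name: 0 for name in scores}
    n_criteria = len(next(iter(scores.values())))
    for i in range(n_criteria):
        for a in scores:
            totals[a] += sum(
                scores[a][i] > scores[b][i] for b in scores if b != a
            )
    return totals


def winner(totals):
    """Highest total wins; ties go to the first candidate listed."""
    return max(totals, key=totals.get)


def concordance(runs):
    """Fraction of runs where Copeland and Borda pick the same winner."""
    agree = sum(winner(copeland(r)) == winner(borda(r)) for r in runs)
    return agree / len(runs)


print(copeland(scores), borda(scores), concordance([scores]))
```

In this toy run both rules select `agent-2`. Copeland only counts who wins each head-to-head contest, while Borda also rewards margins in rank position, so the two can diverge on closer fields; that divergence is what a concordance percentage would capture under the assumption stated above.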

Labels: enhancement (New feature or request)
