[Resource]: sniffbench #403

@jharris1679

Resource Information

Description

Benchmark suite for evaluating coding agents. A/B test your Claude Code configuration with real evaluation cases. Register variants, run comprehension interviews or closed-issue evaluations, and compare token usage, cost, and efficiency across runs.

Automated Notification

  • This is a GitHub-hosted resource and will receive an automatic notification issue when merged

Checklist for New Resources

  • Verified link works and points to correct resource
  • Description is concise (1-2 sentences max)
