deloreyj (Collaborator) commented on May 9, 2025

This PR adds evals for Claude 3.5 Sonnet and Gemini 2.0 Flash. The Anthropic config is currently commented out in packages/eval-tools/src/test-models.ts because our rate limit is too low to complete a full eval run, but the evals can still be run locally. For CI, we'll stick with these models for now:

  1. gpt-4o
  2. gpt-4o-mini
  3. gemini-2.0-flash
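
For illustration only, here is a minimal sketch of how a model list could separate CI models from local-only ones. It is not the actual contents of packages/eval-tools/src/test-models.ts: the `EvalModel` type, the `enabledInCI` flag, the `selectModels` helper, and the `claude-3.5-sonnet` label are all hypothetical names, and it uses a flag where the PR simply comments the Anthropic entry out.

```typescript
// Hypothetical sketch; not the real test-models.ts from this PR.
type Provider = "openai" | "google" | "anthropic";

interface EvalModel {
  provider: Provider;
  modelName: string;
  // Whether the model takes part in the CI eval run.
  enabledInCI: boolean;
}

const evalModels: EvalModel[] = [
  { provider: "openai", modelName: "gpt-4o", enabledInCI: true },
  { provider: "openai", modelName: "gpt-4o-mini", enabledInCI: true },
  { provider: "google", modelName: "gemini-2.0-flash", enabledInCI: true },
  // Assumed label for Claude 3.5 Sonnet. Kept out of CI because the
  // rate limit is too low to complete a full eval run.
  { provider: "anthropic", modelName: "claude-3.5-sonnet", enabledInCI: false },
];

// Return only the CI models when isCI is true, otherwise every model.
function selectModels(models: EvalModel[], isCI: boolean): EvalModel[] {
  return isCI ? models.filter((m) => m.enabledInCI) : models;
}

const ciModelNames = selectModels(evalModels, true).map((m) => m.modelName);
const localModelNames = selectModels(evalModels, false).map((m) => m.modelName);
```

With a flag like this, a local run would pick up all four models while CI would only see the three listed above. Commenting the entry out, as this PR does, gets the same CI result with less machinery.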

deloreyj force-pushed the anthropic-evals branch from 3feeef8 to e2da93a on May 9, 2025 19:28
deloreyj force-pushed the anthropic-evals branch from e2da93a to 804495b on May 9, 2025 19:31
deloreyj changed the title from "feat: add anthropic models to evals" to "feat: add anthropic and gemini evals" on May 9, 2025
deloreyj requested review from Maximo-Guk and cmsparks on May 13, 2025 14:47
deloreyj merged commit a20708d into main on May 15, 2025
7 checks passed
2 participants