@cmsparks commented Apr 22, 2025

This fixes some issues with our evals:

  • They didn't run in CI
  • We didn't report tool-calling metadata
  • There was some inaccuracy in the factuality eval

@cmsparks force-pushed the csparks/evals-improvements branch 6 times, most recently from 96c905a to b3305b2 (April 22, 2025 16:04)
@cmsparks force-pushed the csparks/evals-improvements branch from b3305b2 to ac0c6c9 (April 22, 2025 16:09)
@cmsparks force-pushed the csparks/evals-improvements branch from ac0c6c9 to 1ec31b6 (April 22, 2025 16:11)
@cmsparks merged commit 4717f26 into main on Apr 22, 2025
5 checks passed

2 participants