chore(compass-assistant): add all eval cases, tags and CSV conversion script COMPASS-9823 COMPASS-9758 #7304
Pull Request Overview
This PR adds comprehensive evaluation cases from a product design document to the Compass Assistant test suite and introduces support for tagging eval cases with categories.
- Adds a CSV conversion script to automate generation of eval cases from a spreadsheet
- Introduces a tagging system to categorize eval cases by type (connection-error, explain-plan, end-user-input, etc.)
- Refactors existing eval case files to use named exports and consistent structure
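The tagging idea described above can be sketched as follows. This is only an illustration under stated assumptions: `TaggedEvalCase`, `EvalTag`, `filterByTag` and the sample data are hypothetical names, and the real `SimpleEvalCase` type in `assistant.eval.ts` may have different fields.

```typescript
// Hypothetical tag names, taken from the categories listed in the overview.
type EvalTag = 'connection-error' | 'explain-plan' | 'end-user-input';

// Hypothetical shape; the real SimpleEvalCase may differ.
interface TaggedEvalCase {
  input: string;
  expected: string;
  tags: EvalTag[];
}

// Return only the cases that carry the given tag.
function filterByTag(cases: TaggedEvalCase[], tag: EvalTag): TaggedEvalCase[] {
  return cases.filter((evalCase) => evalCase.tags.some((t) => t === tag));
}

// Illustrative data only, not taken from the PR.
const sampleCases: TaggedEvalCase[] = [
  {
    input: 'Why does my connection time out?',
    expected: 'Placeholder expected answer',
    tags: ['connection-error'],
  },
  {
    input: 'What does this explain plan tell me?',
    expected: 'Placeholder expected answer',
    tags: ['explain-plan'],
  },
  {
    input: 'cant connect pls help',
    expected: 'Placeholder expected answer',
    tags: ['end-user-input', 'connection-error'],
  },
];

const connectionErrorCases = filterByTag(sampleCases, 'connection-error');
console.log(connectionErrorCases.length);
```

Keeping tags as a plain array on each case means a case can belong to several categories at once, which a one-file-per-category layout cannot express.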
Reviewed Changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/compass-assistant/test/eval-cases/generated-cases.ts | Contains 346 auto-generated eval cases from CSV covering various MongoDB topics |
| packages/compass-assistant/test/eval-cases/index.ts | Updates imports to use named exports and includes new generated cases |
| packages/compass-assistant/test/assistant.eval.ts | Adds tags support to SimpleEvalCase type and evaluation framework |
| packages/compass-assistant/scripts/convert-csv-to-eval-cases.ts | New script to convert CSV data to TypeScript eval case files |
| packages/compass-assistant/test/eval-cases/model-data.ts | Refactored to use named export and adds tags |
| packages/compass-assistant/test/eval-cases/atlas-search.ts | Refactored to use named export and adds tags |
| packages/compass-assistant/test/eval-cases/aggregation-pipeline.ts | Refactored to use named export and adds tags |
| packages/compass-assistant/test/entrypoints/explain-plan.ts | Adds explain-plan tag to generated cases |
| packages/compass-assistant/package.json | Adds conversion script and fast-csv dependency |
| packages/compass-assistant/.gitignore | Excludes CSV source file from version control |
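As a rough sketch of what a CSV-to-eval-case conversion might involve: the PR's script uses the fast-csv dependency, whereas the example below swaps in a small hand-rolled parser so that it stays self-contained. The column names (`input`, `expected`, `tags`), the `;` tag separator, and the names `GeneratedEvalCase`, `parseCsvLine` and `csvToEvalCases` are all assumptions, not the spreadsheet's or the script's actual layout.

```typescript
// Hypothetical output shape; the real generated cases may carry other fields.
interface GeneratedEvalCase {
  input: string;
  expected: string;
  tags: string[];
}

// Split one CSV line into fields, honouring double-quoted fields and "" escapes.
// Limitation: quoted fields that span multiple lines are not supported.
function parseCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Convert CSV text (header row first) into eval cases.
function csvToEvalCases(csv: string): GeneratedEvalCase[] {
  const lines = csv
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) {
    return [];
  }
  const header = parseCsvLine(lines[0]);
  const inputCol = header.indexOf('input');
  const expectedCol = header.indexOf('expected');
  const tagsCol = header.indexOf('tags');
  if (inputCol < 0 || expectedCol < 0 || tagsCol < 0) {
    throw new Error('CSV is missing one of the assumed columns');
  }
  return lines.slice(1).map((line) => {
    const fields = parseCsvLine(line);
    return {
      input: fields[inputCol] ?? '',
      expected: fields[expectedCol] ?? '',
      tags: (fields[tagsCol] ?? '')
        .split(';')
        .map((tag) => tag.trim())
        .filter((tag) => tag.length > 0),
    };
  });
}
```

A generator script would then serialise the returned array into a TypeScript source file, for example with `JSON.stringify(cases, null, 2)` wrapped in an `export const` declaration.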
Force-pushed from 8d720c9 to af841c0.
     import type { SimpleEvalCase } from '../assistant.eval';

    -const evalCases: SimpleEvalCase[] = [
    +export const aggregationPipelineEvalCases: SimpleEvalCase[] = [
The cases in aggregation-pipeline.ts, atlas-search.ts and model-data.ts all came from the PD. So I think we should either split generated-cases.ts and merge its cases into separate files matching different areas (probably loosely mapping to tags?), or remove these files for now and go with just generated-cases.ts.
Splitting opens up a can of worms, such as different imports. I would prefer one file for the sake of sanity in the future (and for easy cleanup and migrations).
Agree re: deletions
OK, I think we should remove from these files the cases that are also in generated-cases.ts, and then we can put future manual ones in their own categorised files.
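One way to find the overlapping cases is sketched below, assuming that two cases count as duplicates when their prompts match after trimming and lower-casing. `EvalCaseLike`, `normalizeInput` and `withoutGeneratedDuplicates` are hypothetical helpers, not part of this PR.

```typescript
// Minimal hypothetical shape: only the prompt text is needed for comparison.
interface EvalCaseLike {
  input: string;
}

// Normalise a prompt so that whitespace and casing differences do not hide duplicates.
function normalizeInput(input: string): string {
  return input.trim().toLowerCase();
}

// Keep only the manual cases whose prompt does not already appear in the generated set.
function withoutGeneratedDuplicates<T extends EvalCaseLike>(
  manualCases: T[],
  generatedCases: EvalCaseLike[]
): T[] {
  const generatedInputs = new Set(
    generatedCases.map((evalCase) => normalizeInput(evalCase.input))
  );
  return manualCases.filter(
    (evalCase) => !generatedInputs.has(normalizeInput(evalCase.input))
  );
}
```

Comparing on the prompt alone is a deliberate simplification; if the same prompt is meant to exist with different expected answers, the key would need to include more fields.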
Adds all the eval cases from our PD doc, plus support for tags.
Example of results: https://www.braintrust.dev/app/mongodb-education-ai/p/Compass%20Assistant/experiments/gagik%2Feval-cases-1757509172