@gagik commented Sep 10, 2025

Adds all the eval cases from our PD doc, plus support for tags.

Example of results: https://www.braintrust.dev/app/mongodb-education-ai/p/Compass%20Assistant/experiments/gagik%2Feval-cases-1757509172

Copilot AI review requested due to automatic review settings, September 10, 2025 12:57
@gagik requested a review from a team as a code owner, September 10, 2025 12:57
@gagik changed the title from "chore: add all eval cases, tags and CSV conversion script COMPASS-9758" to "chore: add all eval cases, tags and CSV conversion script COMPASS-9823 COMPASS-9758", Sep 10, 2025
Copilot AI left a comment

Pull Request Overview

This PR adds comprehensive evaluation cases from a product design document to the Compass Assistant test suite and introduces support for tagging eval cases with categories.

  • Adds a CSV conversion script to automate generation of eval cases from a spreadsheet
  • Introduces a tagging system to categorize eval cases by type (connection-error, explain-plan, end-user-input, etc.)
  • Refactors existing eval case files to use named exports and consistent structure
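As a rough illustration of the tagging idea, the sketch below shows how tagged eval cases could be filtered and counted. Only the existence of a `tags` field and the three tag names listed above come from this PR; the `TaggedEvalCase` shape, its other fields, the sample data, and the helper names are assumptions made for the example, not the actual `SimpleEvalCase` definition.

```typescript
// Hypothetical shape: only `tags` and the tag names are taken from the PR text.
type EvalTag = 'connection-error' | 'explain-plan' | 'end-user-input';

interface TaggedEvalCase {
  name: string;
  input: string;
  expected: string;
  tags: EvalTag[];
}

// Made-up sample data, for illustration only.
const exampleCases: TaggedEvalCase[] = [
  {
    name: 'slow-query',
    input: 'Why is this query slow?',
    expected: 'Suggests checking the explain plan for a collection scan.',
    tags: ['explain-plan', 'end-user-input'],
  },
  {
    name: 'connection-refused',
    input: 'I get ECONNREFUSED when connecting.',
    expected: 'Suggests verifying the host, port, and that the server is running.',
    tags: ['connection-error'],
  },
];

// Return only the cases that carry the given tag.
function filterByTag(cases: TaggedEvalCase[], tag: EvalTag): TaggedEvalCase[] {
  return cases.filter((c) => c.tags.some((t) => t === tag));
}

// Count how many cases carry each tag, e.g. to sanity-check coverage per category.
function countByTag(cases: TaggedEvalCase[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const c of cases) {
    for (const t of c.tags) {
      counts[t] = (counts[t] || 0) + 1;
    }
  }
  return counts;
}
```

Keeping tags as a string union means a misspelled tag fails at compile time, which may matter when hundreds of cases are generated from a spreadsheet.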

Reviewed Changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/compass-assistant/test/eval-cases/generated-cases.ts | Contains 346 auto-generated eval cases from CSV covering various MongoDB topics |
| packages/compass-assistant/test/eval-cases/index.ts | Updates imports to use named exports and includes new generated cases |
| packages/compass-assistant/test/assistant.eval.ts | Adds tags support to SimpleEvalCase type and evaluation framework |
| packages/compass-assistant/scripts/convert-csv-to-eval-cases.ts | New script to convert CSV data to TypeScript eval case files |
| packages/compass-assistant/test/eval-cases/model-data.ts | Refactored to use named export and adds tags |
| packages/compass-assistant/test/eval-cases/atlas-search.ts | Refactored to use named export and adds tags |
| packages/compass-assistant/test/eval-cases/aggregation-pipeline.ts | Refactored to use named export and adds tags |
| packages/compass-assistant/test/entrypoints/explain-plan.ts | Adds explain-plan tag to generated cases |
| packages/compass-assistant/package.json | Adds conversion script and fast-csv dependency |
| packages/compass-assistant/.gitignore | Excludes CSV source file from version control |
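To make the conversion step more concrete, here is a minimal sketch of turning already-parsed spreadsheet rows into the source of a generated eval case file. It is not the contents of `convert-csv-to-eval-cases.ts`: the real script uses fast-csv for parsing, which is skipped here, and the `CsvRow` column names, the comma-separated tag format, and the `generatedEvalCases` export name are all assumptions for illustration. The import line mirrors the one used by the existing eval case files.

```typescript
// Hypothetical row shape; the real spreadsheet columns are not shown in this PR.
interface CsvRow {
  name: string;
  input: string;
  expected: string;
  tags: string; // assumed to be a comma-separated list, e.g. "explain-plan, end-user-input"
}

// Split a raw tag cell into trimmed, non-empty tag names.
function parseTags(raw: string): string[] {
  return raw
    .split(',')
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
}

// Build the text of a TypeScript module that exports the eval cases.
// JSON.stringify is used so quotes and newlines in cell values are escaped safely.
function rowsToEvalCaseSource(rows: CsvRow[], exportName: string): string {
  const entries = rows.map((row) => {
    const evalCase = {
      name: row.name.trim(),
      input: row.input.trim(),
      expected: row.expected.trim(),
      tags: parseTags(row.tags),
    };
    return '  ' + JSON.stringify(evalCase) + ',';
  });
  const header = [
    "import type { SimpleEvalCase } from '../assistant.eval';",
    '',
    `export const ${exportName}: SimpleEvalCase[] = [`,
  ];
  return header.concat(entries, ['];', '']).join('\n');
}
```

Emitting each case through `JSON.stringify` keeps the generator simple and avoids hand-rolled string escaping, at the cost of less readable output than a formatted object literal.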


  import type { SimpleEvalCase } from '../assistant.eval';

- const evalCases: SimpleEvalCase[] = [
+ export const aggregationPipelineEvalCases: SimpleEvalCase[] = [
@lerouxb commented Sep 10, 2025

The ones in aggregation-pipeline.ts, atlas-search.ts and model-data.ts all came from the PD. So I think we should either split generated-cases.ts and merge it into separate files matching different areas (probably loosely mapping to tags?) or just remove these files for now and go with just generated-cases.ts.

@gagik commented Sep 10, 2025

Splitting opens up a can of worms (different imports, etc.). I'd prefer one file for the sake of sanity in the future (and easy cleanup/migrations).
Agreed re: deletions.

Contributor

OK, I think we should remove the ones that are also in generated-cases.ts from these files, and then we can put future manual ones in their own categorised files.
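One way to find the overlapping cases before deleting them by hand is sketched below. It assumes each eval case has an `input` string and that two cases count as duplicates when their inputs match after whitespace and case normalization; both the field name and the matching rule are assumptions, not something this PR specifies.

```typescript
// Minimal shape needed for the comparison; `input` is an assumed field name.
interface EvalCaseLike {
  input: string;
}

// Collapse whitespace and ignore case so trivially different copies still match.
function normalize(text: string): string {
  return text.trim().replace(/\s+/g, ' ').toLowerCase();
}

// Return the manual cases that do not also appear among the generated cases.
function withoutDuplicates<T extends EvalCaseLike>(
  manual: T[],
  generated: EvalCaseLike[]
): T[] {
  const seen: Record<string, boolean> = {};
  for (const g of generated) {
    seen[normalize(g.input)] = true;
  }
  return manual.filter((m) => seen[normalize(m.input)] !== true);
}
```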

@gagik changed the title from "chore: add all eval cases, tags and CSV conversion script COMPASS-9823 COMPASS-9758" to "chore(compass-assistant): add all eval cases, tags and CSV conversion script COMPASS-9823 COMPASS-9758", Sep 10, 2025
@gagik merged commit 42c5bda into main, Sep 11, 2025
77 of 82 checks passed
@gagik deleted the gagik/eval-cases branch, September 11, 2025 09:04

2 participants