feat: Enable unit tests for dataset presets by attafosu · Pull Request #194 · mlcommons/endpoints

attafosu · 2026-03-20T05:38:52Z

What does this PR do?

Adds unit tests for dataset presets

Type of change

Bug fix
New feature
Documentation update
Refactor/cleanup

Related issues

Testing

Tests added/updated
All tests pass locally
Manual testing completed

Checklist

Code follows project style
Pre-commit hooks pass
Documentation updated (if needed)

- Add test_dataset_presets.py with 20 test cases for 6 presets across 5 datasets
- Add comprehensive testing guide and schema reference documentation

Tests verify that transforms work correctly without end-to-end runs, enabling fast regression detection when transform code changes.

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

github-actions · 2026-03-20T05:39:02Z

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist · 2026-03-20T05:39:09Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a dedicated unit testing framework for dataset presets. Its primary goal is to ensure the reliability and correctness of data transformations applied by these presets, without the overhead of full end-to-end benchmark runs. This enhancement will significantly improve the development workflow by providing immediate feedback on preset changes and reducing the risk of introducing regressions in data preparation logic.
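To make the idea concrete, here is a minimal sketch of a transform-level unit test that runs on a tiny in-memory sample instead of a full benchmark run. Every name in it (`summarization_transform`, `apply_transform`, the `article` and `highlights` columns) is a hypothetical stand-in chosen for illustration, not the API used in this repository.

```python
from typing import Callable

Row = dict[str, str]


def summarization_transform(row: Row) -> Row:
    # Hypothetical transform: builds a prompt from an "article" column
    # and carries the reference summary through unchanged.
    return {
        "prompt": f"Summarize the following article:\n{row['article']}",
        "reference": row["highlights"],
    }


def apply_transform(rows: list[Row], transform: Callable[[Row], Row]) -> list[Row]:
    # Applies the transform row by row; no model, server, or download involved.
    return [transform(row) for row in rows]


def test_summarization_transform_output_schema():
    rows = [{"article": "A short story.", "highlights": "Story."}]
    out = apply_transform(rows, summarization_transform)

    # The transform should keep the row count and emit the expected columns.
    assert len(out) == 1
    assert set(out[0]) == {"prompt", "reference"}
    assert "A short story." in out[0]["prompt"]
    assert out[0]["reference"] == "Story."
```

Because the test touches only the transform and a one-row sample, it is cheap enough to run on every change to transform code.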

Highlights

New Documentation: Added a new documentation file, DATASET_PRESET_TESTING.md, which provides a comprehensive guide on how to run and contribute unit tests for dataset presets.
Unit Test Implementation: Implemented extensive unit tests for several dataset presets, including CNNDailyMail, AIME25, GPQA, LiveCodeBench, and OpenOrca, to validate their correct instantiation, data transformation logic, and expected output formats.
Slow Test Marking: Introduced the @pytest.mark.slow decorator for tests that depend on heavier libraries like transformers, enabling selective execution and faster test runs when excluding these marked tests.
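As an illustration of the slow-test marking described above, the following sketch shows one common way to register and apply a `slow` marker in pytest. The `normalize_prompt` helper and both test names are hypothetical; only `pytest.mark`, `pytest.importorskip`, and `config.addinivalue_line` are pytest APIs. The `pytest_configure` hook would normally live in `conftest.py`.

```python
import pytest


def pytest_configure(config):
    # Registers the marker so that `pytest --strict-markers` accepts it.
    config.addinivalue_line(
        "markers", "slow: test depends on heavy libraries such as transformers"
    )


def normalize_prompt(text: str) -> str:
    # Hypothetical stand-in for a lightweight preset transform.
    return " ".join(text.split())


def test_normalize_prompt_fast():
    # Unmarked: runs in every invocation.
    assert normalize_prompt("  What is\n2 + 2? ") == "What is 2 + 2?"


@pytest.mark.slow
def test_tokenizer_backed_transform():
    # Skips cleanly when the heavy dependency is not installed.
    transformers = pytest.importorskip("transformers")
    assert hasattr(transformers, "AutoTokenizer")
```

With a setup like this, `pytest -m "not slow"` deselects the marked tests, and `pytest -m slow` runs only them.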

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

gemini-code-assist

Code Review

This pull request is a great addition, enabling unit tests for dataset presets and providing clear documentation. The tests cover several presets and verify key aspects like instantiation and transform application. My review includes a few suggestions to improve the test suite's efficiency by reducing redundant computations, ensure consistency in marking slow tests, and enhance test coverage for one of the presets. Overall, this is a valuable contribution to the project's test infrastructure.
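One common way to cut redundant computation in a pytest suite is to build shared inputs once in module-scoped fixtures. The sketch below assumes a hypothetical question-answer sample and a hypothetical `format_question` transform; it is not taken from the test file under review.

```python
import pytest


def build_sample_rows():
    # Hypothetical in-memory sample; a real suite might load a small file here.
    return [
        {"question": "What is 2 + 2?", "answer": "4"},
        {"question": "What is 3 * 3?", "answer": "9"},
    ]


def format_question(row):
    # Hypothetical transform under test.
    return {
        "prompt": f"Question: {row['question']}\nAnswer:",
        "expected": row["answer"],
    }


@pytest.fixture(scope="module")
def sample_rows():
    # Built once per test module instead of once per test.
    return build_sample_rows()


@pytest.fixture(scope="module")
def transformed_rows(sample_rows):
    # The transform also runs once; every test below reuses the result.
    return [format_question(row) for row in sample_rows]


def test_row_count_preserved(sample_rows, transformed_rows):
    assert len(transformed_rows) == len(sample_rows)


def test_prompt_contains_question(sample_rows, transformed_rows):
    for source, output in zip(sample_rows, transformed_rows):
        assert source["question"] in output["prompt"]
```

Keeping the data construction in a plain helper (`build_sample_rows`) rather than inside the fixture body also lets it be reused outside pytest, since fixture functions are not meant to be called directly.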

DATASET_PRESET_TESTING.md

tests/unit/dataset_manager/test_dataset_presets.py

attafosu added 8 commits March 19, 2026 20:17

Cleanup local directory

7f353c4

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

Sanitize documentation

aa000c3

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

Cleanup

2aa2ebb

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

Decorate slow tests

3f5119c

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

Update DATASET_SCHEMA_REFERENCE.md

0c4f2a3

Cleanup

a33d778

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

Remove redundant dataset schema

285c409

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

attafosu requested a review from a team as a code owner March 20, 2026 05:38

github-actions bot requested review from arekay-nv and nvzhihanj March 20, 2026 05:39

gemini-code-assist bot reviewed Mar 20, 2026

Add fixtures to simplify unit tests

3b81c18

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

attafosu merged commit 89ea457 into feat/attafosu/sglang-openai-api-compatibility Mar 20, 2026
1 check passed

github-actions bot locked and limited conversation to collaborators Mar 20, 2026

attafosu merged 9 commits into feat/attafosu/sglang-openai-api-compatibility from feat/attafosu/dataset-unit-tests
