Execution model & Model/Evaluation status tracking by aristizabal95 · Pull Request #631 · mlcommons/medperf

aristizabal95 · 2024-12-05T21:07:16Z

This PR converts the Results entities into Execution entities. An Execution is now created when a benchmark pipeline (model -> evaluation) starts, and it tracks the execution of the model and evaluator MLCubes. This lets benchmark owners identify pipelines that are not working as expected and see the progress of their benchmark.
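As a rough illustration of the idea described above, the sketch below models an execution record that exists from the moment the pipeline starts and is updated as each step runs. It is not MedPerf's actual code: `StepStatus`, `Execution`, `run_pipeline`, and all field names are hypothetical, and the meaning given to `finalized` here (set once results are produced) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class StepStatus(str, Enum):
    """Hypothetical status values for one pipeline step."""
    PENDING = "pending"
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass
class Execution:
    """Hypothetical record created when a pipeline starts, before any results exist."""
    benchmark: int
    model: int
    dataset: int
    model_status: StepStatus = StepStatus.PENDING
    evaluation_status: StepStatus = StepStatus.PENDING
    results: Optional[dict] = None  # optional: empty until evaluation finishes
    finalized: bool = False         # assumption: set once results are produced


def run_pipeline(
    execution: Execution,
    run_model: Callable[[], list],
    run_evaluator: Callable[[list], dict],
) -> Execution:
    """Run model -> evaluation, recording the status of each step on the execution."""
    try:
        execution.model_status = StepStatus.RUNNING
        predictions = run_model()
        execution.model_status = StepStatus.FINISHED
    except Exception:
        # The evaluator never starts, so its status stays PENDING.
        execution.model_status = StepStatus.FAILED
        return execution

    try:
        execution.evaluation_status = StepStatus.RUNNING
        execution.results = run_evaluator(predictions)
        execution.evaluation_status = StepStatus.FINISHED
        execution.finalized = True
    except Exception:
        execution.evaluation_status = StepStatus.FAILED
    return execution
```

With a record like this, a benchmark owner could tell a pipeline whose model step failed apart from one that is still waiting on evaluation, even though neither has results yet.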

aristizabal95 added 25 commits November 5, 2024 16:16

Rename results to executions

55b8478

Create report fields

0e7ecbc

Include executions endpoints

2ff26db

Add note regarding existing results endpoints

95e8dd8

Fix execution endpoint name

d006860

Implement query parameters on main entities

16cf202

revert django update

dcd46e3

Merge branch 'general-query-param-search' of https://github.com/arist…

8fb7d8c

…izabal95/medperf-2 into execution-model

WIP turn result entity into execution entity

ac4d3c6

Implement list query filtering in the CLI

3ca03a6

Add list filters to main entities

50cd30b

Merge branch 'general-query-param-search' of https://github.com/arist…

fb4cdc9

…izabal95/medperf-2 into execution-model

Move results code to executions. WIP Integrate exec reporting

3fc5317

make results field optional

941baee

Don't send reports on tests. Fix evaluation report issue

1f13977

Allow updating executions

520340d

remove owner query for /me/results

369ada4

Fix benchmark execution flow

fa8fffd

rename results tests to executions

c7cbf32

Fix rest tests

3abb082

Merge branch 'general-query-param-search' of https://github.com/arist…

f8223a5

…izabal95/medperf-2 into execution-model

Fix tests that called results

c01a0ed

Fix bugs related to result -> execution change

b3f281c

Fix existing tests

02c816f

Fix remaining existing tests

bdae51c

aristizabal95 added the labels type: enhancement (New feature or request), component: client (issues regarding the CLI), component: server (issues regarding the server), and topic: benchmark registry on Dec 5, 2024

aristizabal95 self-assigned this Dec 5, 2024

hasan7n added 6 commits June 11, 2025 18:09

update executions logic

22b6594

changes are mainly about when can a user rerun or create a new execution, etc...

make finalized True for existing instances

f63f62e

add new flags to commands, filter latest executions

c8d467e

type hints for executions util

a301402

fix bug in sending model report logic

394daef

fix bug in migrations

88f16a2

hasan7n had a problem deploying to testing-external-code June 11, 2025 17:10 — with GitHub Actions Failure

update medperf run command

574b432

hasan7n had a problem deploying to testing-external-code June 11, 2025 17:26 — with GitHub Actions Failure

fix some bugs

0a653fa

hasan7n had a problem deploying to testing-external-code June 11, 2025 17:57 — with GitHub Actions Failure

hasan7n added 2 commits June 12, 2025 00:38

preserve predictions using timestamps

6eff5a2

fix integration tests

68bcb10

hasan7n had a problem deploying to testing-external-code June 11, 2025 22:38 — with GitHub Actions Failure

hasan7n added 4 commits June 12, 2025 09:49

add a local outputs folder for metrics container

919ad4b

update cli tests

53fe596

fix server tests

a435e90

add new server tests

8eb2d92

hasan7n temporarily deployed to testing-external-code June 12, 2025 12:16 — with GitHub Actions Inactive

hasan7n added 2 commits June 12, 2025 14:26

fix postgresql dev utility

f982d6e

rename execution back to result

147d527

It turns out that renaming a db model is complicated

hasan7n had a problem deploying to testing-external-code June 12, 2025 15:08 — with GitHub Actions Failure

hasan7n added 2 commits June 12, 2025 18:03

rename remaining execution changes for consistency

8c3ba78

modify migrations to have existing results finalized

d8682a5

hasan7n had a problem deploying to testing-external-code June 12, 2025 16:08 — with GitHub Actions Failure

update unit test

df1b0be

hasan7n temporarily deployed to testing-external-code June 12, 2025 16:11 — with GitHub Actions Inactive

hasan7n approved these changes Jun 12, 2025

hasan7n merged commit cea56d4 into mlcommons:main Jun 12, 2025
9 checks passed

github-actions bot locked and limited conversation to collaborators Jun 12, 2025

hasan7n merged 58 commits into mlcommons:main from aristizabal95:execution-model

3 participants
