feat(benchmarking): add benchmarking code into MLFlow models to better monitor accuracy #189

@ThomasHepworth

Description

At the moment, we run our benchmarking code manually to assess model changes and guard against regressions. This is imperfect because:

  1. It relies on someone manually checking match rates against our ground-truth datasets.
  2. It’s time-consuming to run more than one dataset when deploying changes.
  3. We don’t have a complete history of model performance over time.

Now that we have MLflow running via GitHub, it would make sense to set up a new repo (or workflow) that automatically runs model evaluation against every ground-truth dataset whenever we publish a new GitHub release, and logs the resulting accuracy metrics to MLflow so we build up a performance history.
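As a rough sketch of what the release-time evaluation step could look like — the `match_rate` helper, the dataset structure, and the run naming are illustrative assumptions, not existing project code; it assumes an MLflow tracking URI is configured in the environment:

```python
def match_rate(predictions, ground_truth):
    """Fraction of predictions that agree with the ground-truth labels.

    Both inputs are equal-length sequences of match decisions
    (illustrative stand-in for our real benchmarking metric).
    """
    agreed = sum(p == g for p, g in zip(predictions, ground_truth))
    return agreed / len(ground_truth)


def log_benchmarks(release_tag, results):
    """Log one match-rate metric per dataset under a single MLflow run.

    `results` maps dataset name -> (predictions, ground_truth).
    Assumes MLFLOW_TRACKING_URI (or similar) is already configured.
    """
    import mlflow  # imported lazily; only needed when actually logging

    with mlflow.start_run(run_name=release_tag):
        for name, (preds, truth) in results.items():
            mlflow.log_metric(f"match_rate_{name}", match_rate(preds, truth))
```

A GitHub Actions workflow triggered on `release: published` could then call something like `log_benchmarks` once per release, tagging the run with the release version so regressions are visible as a time series in the MLflow UI.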
