
Conversation


@kdestin kdestin commented Jan 8, 2025

Description

This pull request adds initial support for using pytest to run Jupyter Notebooks to facilitate sample validation.


Specifically this pull request:

  • Adds infrastructure to enable testing of samples
    • Test resource deployment with Bicep
    • pytest as a frontend for running sample validation
      • Custom plugin that automatically discovers samples as "pytest tests" and executes them with papermill (a minimal sketch follows this list)
      • Custom plugin that only runs samples that have changed in a pull request
  • Fixes some broken evaluate notebooks
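
For context, the notebook-discovery plugin mentioned above can be approximated with pytest's collection hooks. The sketch below is illustrative only: class names like `NotebookFile`/`NotebookItem` and the `.output.ipynb` naming are assumptions rather than the actual plugin's API, and it assumes pytest >= 7 and papermill are installed.

```python
# conftest.py -- a minimal sketch, not the plugin shipped in this PR.
from pathlib import Path

import papermill as pm
import pytest


def pytest_collect_file(file_path: Path, parent):
    # Collect every notebook in the sample tree as a test file.
    if file_path.suffix == ".ipynb":
        return NotebookFile.from_parent(parent, path=file_path)
    return None


class NotebookFile(pytest.File):
    def collect(self):
        # One test item per notebook.
        yield NotebookItem.from_parent(self, name=self.path.name, notebook=self.path)


class NotebookItem(pytest.Item):
    def __init__(self, *, notebook: Path, **kwargs):
        super().__init__(**kwargs)
        self.notebook = notebook

    def runtest(self):
        # Execute the notebook with papermill; any failing cell fails the test.
        pm.execute_notebook(
            input_path=str(self.notebook),
            output_path=str(self.notebook.with_suffix(".output.ipynb")),
            cwd=str(self.notebook.parent),
        )
```

The real plugin additionally supports opting samples out and skips templates such as template.ipynb (see the commit notes below).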

Background

Azure/azureml-examples tests its samples using GitHub Actions as a test runner. Maintaining that prior art revealed some pain points:

  • Difficult to orchestrate validation runs (everything runs in parallel all at once, with limited options for control).
  • Hard to run a non-trivial number of samples locally.
  • The monitoring story isn't ideal: GitHub's UI for Actions isn't optimized for repos with hundreds of workflows.
  • Onboarding new samples into the test suite is manual and often isn't done by contributors (it can be unclear whether untested samples are skipped intentionally).

Additionally, infrastructure deployment is based on a large collection of bash scripts, which can make reasoning about the resources that get deployed difficult.


Using pytest as a test runner for azureai-samples addresses several of those pain points:

  • The testing workflow is local-first: deploy resources with Bicep, then run pytest.
  • Samples (specifically Jupyter Notebooks and Python samples with included tests) are automatically discovered and must be explicitly opted out of testing.
  • Test run orchestration can be controlled (sequential by default, configurable with plugins like pytest-xdist, pytest-retry, pytest-randomly, etc.).
  • Native support for generating test reports in a format widely understood by other tools (JUnit XML); see the example after this list.
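
For illustration, a local run of the suite might look roughly like this. It is a sketch only: the file name run_samples.py and the report path are hypothetical, and "-n auto" assumes pytest-xdist is installed.

```python
# run_samples.py -- hypothetical local entry point, not part of this PR.
import sys

import pytest

# Parallelize across CPUs with pytest-xdist ("-n auto") and emit a JUnit XML
# report that CI dashboards and other tools can consume.
sys.exit(pytest.main(["-n", "auto", "--junitxml=test-results.xml"]))
```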

Using Bicep for deployment should make the infrastructure easier to maintain and reason about (rule of least power).

Checklist

  • I have read the contribution guidelines
  • I have coordinated with the docs team (mldocs@microsoft.com) if this PR deletes files or changes any file names or file extensions.
  • This notebook or file is added to the CODEOWNERS file, pointing to the author or the author's team.

    This commit introduces a custom pytest plugin that forces
    pytest to only collect samples that have changed either:

      * in the working tree compared to HEAD
      * in HEAD compared to main

    We assume that a sample has changed if any file has changed in
    the directory we collected a test from, which is consistent with
    how the contributing guidelines say a sample should be packaged.
    This criterion for detecting change may need to be iterated on later.
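
For illustration, the changed-sample detection described above could be sketched with git and pytest's collection hooks roughly as follows. It assumes pytest is invoked from the repository root with git on PATH; the helper name `_changed_paths` is illustrative, not the plugin's actual API.

```python
# conftest.py (sketch) -- skip samples whose directory has no changed files.
import subprocess
from pathlib import Path

import pytest


def _changed_paths() -> set[Path]:
    """Files changed in the working tree vs HEAD, plus HEAD vs main."""
    changed: set[Path] = set()
    for args in (
        ["git", "diff", "--name-only", "HEAD"],         # working tree vs HEAD
        ["git", "diff", "--name-only", "main...HEAD"],  # HEAD vs merge-base with main
    ):
        out = subprocess.run(args, capture_output=True, text=True, check=True).stdout
        changed.update(Path(line).resolve() for line in out.splitlines() if line)
    return changed


def pytest_collection_modifyitems(config, items):
    changed = _changed_paths()
    skip = pytest.mark.skip(reason="no files changed in this sample's directory")
    for item in items:
        sample_dir = item.path.parent
        # A sample counts as "changed" if any changed file lives under its directory.
        if not any(sample_dir in p.parents for p in changed):
            item.add_marker(skip)
```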
    To prevent pytest from trying to run template.ipynb

    Project deployments sometimes error when they happen concurrently

    Referred to the ARM export of a resource group with a set-up AI project.
@kdestin kdestin requested a review from a team as a code owner January 8, 2025 23:24
@kdestin kdestin force-pushed the testing-infrastructure branch 6 times, most recently from 1fc2ccb to 644654f Compare January 9, 2025 21:28
    Redeploying CognitiveServices/accounts will sometimes fail with an
    error that "publicNetworkAccess" is required.
@kdestin kdestin force-pushed the testing-infrastructure branch 2 times, most recently from bba19b7 to becef6a Compare January 13, 2025 19:36
@kdestin kdestin merged commit 78704bd into Azure-Samples:main Jan 13, 2025
3 checks passed