How to Build, Test and Deploy a New Windows AMI
Windows CI runners use custom AMIs (Amazon Machine Images) that are built on
the LF account and shared across both the PyTorch and LF AWS accounts. The
deployment process involves building the AMI, testing it on canary, and then
rolling it out to production.
Workflow Diagram
┌─────────────────────────┐
│  1. Build AMI           │
│     (LF account)        │
│     AMI built once and  │
│     shared publicly     │
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│  2. Deploy to Canary    │
│  (account 391835788720) │
│  Verify AMI available   │
│  in account 308535385114│
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│  3. Test AMI            │
│     Run trunk & binaries│
│     workflows on test PR│
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│  4. Deploy to Prod      │
│     Land ci-infra and   │
│     gha-infra PRs       │
└─────────────────────────┘
Step-by-Step Instructions
Run the https://github.com/pytorch/test-infra/blob/main/.github/workflows/build-windows-ami.yml workflow.
Once the workflow completes, it displays the AMI ID and AMI Name in the
output. Note both down; you will need them for the following steps.
Log into AWS account 308535385114 and confirm the new AMI is available.
Look for the AMI with:
- Owner: 391835788720
- Name: the AMI Name reported by the build workflow, e.g. Windows 2019 GHA CI - 20260325213408
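The availability check above can also be scripted. What follows is a minimal sketch, not the project's tooling: it assumes boto3 is installed and that credentials for account 308535385114 are already configured. The helper names (`ami_query`, `newest_image`, `find_new_ami`), the name glob, and the default region are all assumptions made for illustration; `describe_images` with `Owners` and `Filters` is the standard EC2 call.

```python
OWNER_ID = "391835788720"               # AMI owner, as listed above
NAME_GLOB = "Windows 2019 GHA CI - *"   # assumed from the example AMI name


def ami_query(owner=OWNER_ID, name_glob=NAME_GLOB):
    """Build the keyword arguments for EC2 describe_images."""
    return {
        "Owners": [owner],
        "Filters": [{"Name": "name", "Values": [name_glob]}],
    }


def newest_image(images):
    """Return the most recently created image dict, or None if the list is empty."""
    # CreationDate is an ISO 8601 string, so string order matches time order.
    return max(images, key=lambda img: img["CreationDate"], default=None)


def find_new_ami(region="us-east-1"):  # the region is an assumption
    """Query EC2 and return the newest matching image visible to this account."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_images(**ami_query())
    return newest_image(response["Images"])
```

If `find_new_ami()` returns `None`, or its `ImageId` differs from the one printed by the build workflow, the new AMI is not yet visible in this account.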
Land a PR in https://github.com/pytorch/test-infra that references the new AMI and enables wincanarylf testing.
Example: https://github.com/pytorch/test-infra/commit/241f90abba732b5be91590f3
Create a PR in https://github.com/pytorch/ci-infra on a dedicated branch (e.g. atalman-win-20260325) with the new AMI configuration.
Example PR: https://github.com/pytorch/ci-infra/pull/413
Example branch: https://github.com/pytorch/ci-infra/tree/atalman-win-20260325
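The example branch name appears to combine a username with the date portion of the AMI name. Assuming that `<user>-win-<YYYYMMDD>` shape, and assuming the AMI name ends in a 14-digit `YYYYMMDDHHMMSS` build timestamp (both inferred from the examples above, not documented conventions), a branch name could be derived as sketched here. `branch_for_ami` is a hypothetical helper.

```python
import re
from datetime import datetime


def branch_for_ami(user, ami_name):
    """Derive '<user>-win-<YYYYMMDD>' from an AMI name ending in a timestamp."""
    match = re.search(r"(\d{14})$", ami_name.strip())
    if match is None:
        raise ValueError(f"no 14-digit timestamp at the end of {ami_name!r}")
    # Parsing validates that the digits form a real date and time.
    built = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return f"{user}-win-{built:%Y%m%d}"


print(branch_for_ami("atalman", "Windows 2019 GHA CI - 20260325213408"))
# prints: atalman-win-20260325
```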
Run the https://github.com/pytorch/ci-infra/actions/workflows/ali-deploy-canary.yml workflow on the ci-infra branch created for that PR. This deploys the new AMI to the canary environment for testing.
Edit the experiments list in https://github.com/pytorch/test-infra/issues/5132 and add:
@youruser,wincanarylf,lf
Replace youruser with your GitHub username.
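The entry format above is simple enough to generate. The sketch below assumes the experiments list holds one `@user,experiment,...` entry per line, which the source implies but does not spell out. `add_experiments` is a hypothetical helper that merges into an existing entry for the same user instead of adding a duplicate.

```python
def add_experiments(lines, user, experiments=("wincanarylf", "lf")):
    """Return a copy of lines in which user is opted in to the given experiments."""
    handle = "@" + user.lstrip("@")
    out, found = [], False
    for line in lines:
        parts = [p.strip() for p in line.split(",")]
        if parts[0] == handle:
            found = True
            # Keep existing experiments and append only the missing ones.
            merged = parts[1:] + [e for e in experiments if e not in parts[1:]]
            line = ",".join([handle, *merged])
        out.append(line)
    if not found:
        out.append(",".join([handle, *experiments]))
    return out


print(add_experiments([], "youruser"))
# prints: ['@youruser,wincanarylf,lf']
```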
Open a test PR in https://github.com/pytorch/pytorch (example: https://github.com/pytorch/pytorch/pull/178531) and:
- Verify the Windows runner is using the new AMI (check the runner info in the job logs).
- Assign the labels ciflow/trunk and ciflow/binaries to trigger the relevant CI workflows.
- Confirm that both trunk and binaries jobs pass on Windows.
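The final check can be approximated in code once the job results are available. This sketch assumes each job is a dict with `name` and `conclusion` keys, as in the job objects returned by the GitHub REST API, and assumes that Windows jobs can be recognized by "win" in the job name. That naming rule is a guess, so adjust it to the actual job names. `failing_windows_jobs` is a hypothetical helper.

```python
def failing_windows_jobs(jobs):
    """Return the names of Windows jobs whose conclusion is not 'success'."""
    return [
        job["name"]
        for job in jobs
        # A missing or None conclusion (job still running) also counts as not passed.
        if "win" in job["name"].lower() and job.get("conclusion") != "success"
    ]


jobs = [
    {"name": "win-vs2022-cpu-py3 / build", "conclusion": "success"},
    {"name": "win-vs2022-cpu-py3 / test", "conclusion": "failure"},
    {"name": "linux-jammy-py3 / build", "conclusion": "failure"},
]
print(failing_windows_jobs(jobs))
# prints: ['win-vs2022-cpu-py3 / test']
```

An empty result across both the trunk and binaries runs corresponds to the pass condition described above. The job names in the usage example are illustrative only.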
Once testing is successful:
- Land the ci-infra PR (example: https://github.com/pytorch/ci-infra/pull/413) and deploy it.
- Land the corresponding pytorch-gha-infra PR (example: https://github.com/meta-pytorch/pytorch-gha-infra/pull/1027) and deploy it.