Releases: pytorch/test-infra
v20250905-153412
Upgrade scale up/down lambdas to aws sdk v3 (#7061)

Upgrade to aws sdk v3. The main change is getting rid of the `promise` calls, since I think the v3 calls directly return promises instead of requests.

This changes a lot of mocks in the tests, so I'm not sure how much running just `yarn test` proves.

Testing: Mangled scaleDown to only run `listInstances` and `listSSMParameters`. In `terraform-aws-github-runner/modules/runners/lambdas/runners`:
```
yarn build; cd dist;node -e 'require("./index").scaleDown({}, {}, {});' > t.log
```
I also tried to terminate a runner and it worked. Deployed to pytorch-canary and ran some jobs, seems ok.
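
For readers unfamiliar with the change, here is a rough sketch of the call-shape difference. In aws sdk v2 an operation such as `ec2.describeInstances(params)` returns a request object that needs `.promise()`, while in v3 `client.send(new DescribeInstancesCommand(params))` returns a promise directly. The classes below are hypothetical stand-ins that only mimic those shapes so the snippet runs without the SDK or AWS credentials; they are not the code from the PR.

```typescript
// Hypothetical stand-ins: they copy the *shape* of the two SDK generations
// and make no network calls.
type Instance = { InstanceId: string };

// v2 shape: the operation returns a request object; callers add .promise().
class FakeV2Client {
  describeInstances(): { promise(): Promise<Instance[]> } {
    return { promise: async () => [{ InstanceId: "i-123" }] };
  }
}

// v3 shape: callers build a command object and send() returns a promise.
class FakeDescribeInstancesCommand {}
class FakeV3Client {
  async send(_command: FakeDescribeInstancesCommand): Promise<Instance[]> {
    return [{ InstanceId: "i-123" }];
  }
}

async function listWithV2(client: FakeV2Client): Promise<string[]> {
  // v2 style: explicit .promise() on the returned request
  const instances = await client.describeInstances().promise();
  return instances.map((i) => i.InstanceId);
}

async function listWithV3(client: FakeV3Client): Promise<string[]> {
  // v3 style: send() is already a promise, so there is no .promise() call
  const instances = await client.send(new FakeDescribeInstancesCommand());
  return instances.map((i) => i.InstanceId);
}
```

Because the return types differ, test mocks written against the v2 shape generally have to be rewritten, which matches the note above about the mocks changing.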
v20250905-153356
Upgrade webhook lambda for scale up/down to aws sdk v3 (#7077)

Similar to #7061. Mostly just getting rid of `promise()`.

Testing: just `yarn test`, but idk how helpful that is since it mocks everything. Deployed to pytorch-canary and it seems ok?
v20250905-153317
Upgrade runner-binaries-syncer to aws sdk v3 (#7078)

`yarn test` is broken on main, as you can see in https://github.com/pytorch/test-infra/blob/a32b8f647ed2df0e93a167e518cf92f5855671ce/terraform-aws-github-runner/modules/runner-binaries-syncer/lambdas/runner-binaries-syncer/Makefile#L15

Testing: Stole some environment variables from the lambda and mangled the key to upload to a dummy key, then ran `yarn build; cd dist;node -e 'require("./index").handler();' > t.log`. Saw that it uploaded a file and skipped some because they didn't need to be uploaded. The one that was uploaded was arm64, which I'm thinking was uploaded manually since it lacks a tag on s3.

I had to add an `await` since it wasn't working, which I think is a bug in the original code.
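
The note does not say which call was missing its `await`, so the following is only a sketch of the general failure mode, with a hypothetical `fakeUpload` standing in for the real upload. An async handler that starts a promise without awaiting it resolves before the work finishes, and in a lambda the pending work may then never complete.

```typescript
// Hypothetical stand-in for an upload: finishes on a later tick, like real I/O.
function fakeUpload(store: string[], key: string): Promise<void> {
  return new Promise((resolve) => {
    setTimeout(() => {
      store.push(key);
      resolve();
    }, 10);
  });
}

// Bug pattern: the promise is started but not awaited, so the handler
// returns while the upload is still pending.
async function handlerWithoutAwait(store: string[]): Promise<number> {
  void fakeUpload(store, "dummy-key");
  return store.length; // upload has not finished yet
}

// Fixed pattern: the handler waits for the upload before returning.
async function handlerWithAwait(store: string[]): Promise<number> {
  await fakeUpload(store, "dummy-key");
  return store.length; // upload has finished
}
```

With separate empty stores, the first handler reports zero completed uploads when it returns and the second reports one.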
v20250902-213100
Bump tracing-subscriber from 0.3.18 to 0.3.20 in /aws/lambda/log-clas…
v20250902-173719
[BE][EZ] Document what the enable_organizations_runner param does (#7…
v20250829-162418
[autoscalers] Only use auth to download github files if needed (#7064)

This enables autoscalers to use a scale-config that's located in an organization other than the one they're located in, as long as that scale-config is located in a public repo (which all our scale configs currently are).

Bug it fixes: The old code would create a github client to download the scale-config.yml file, but `createGitHubClientForRunnerOrg` will fail if you try to create a client for an org your app doesn't have access to. Using a full blown git client for a public file also seems unnecessary.

This version uses a normal http request to pull the raw file, so authentication doesn't matter.

(Aside: I considered keeping the old flow as a backup path for if we ever want the scale config to live in a private repo, but if and when that day comes I'd rather we add the logic afresh than leave dead, unused code around in the script.)

Testing: Verified the getRunnerTypes functionality locally to ensure it worked end-to-end without mocks.

Co-authored-by: Jean Schmidt
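
A minimal sketch of pulling a public file with a plain, unauthenticated request, assuming the `raw.githubusercontent.com` URL layout. The helper names, the injected `httpGet` parameter, and the URL choice are illustrative assumptions; the PR may fetch the file differently.

```typescript
// Subset of the Fetch API response that this sketch relies on.
type HttpResponse = { ok: boolean; status: number; text(): Promise<string> };
type HttpGet = (url: string) => Promise<HttpResponse>;

// Builds the raw-content URL for a file in a public GitHub repo.
// (Hypothetical helper, not a function from the autoscaler code.)
function rawFileUrl(org: string, repo: string, ref: string, path: string): string {
  return `https://raw.githubusercontent.com/${org}/${repo}/${ref}/${path}`;
}

// Downloads the file without any GitHub client or token. The HTTP function
// is injected so the sketch can be exercised without network access.
async function fetchPublicFile(
  httpGet: HttpGet,
  org: string,
  repo: string,
  ref: string,
  path: string,
): Promise<string> {
  const url = rawFileUrl(org, repo, ref, path);
  const response = await httpGet(url);
  if (!response.ok) {
    throw new Error(`GET ${url} failed with status ${response.status}`);
  }
  return response.text();
}
```

On Node 18 or newer the global `fetch` can be passed as `httpGet`. Since no app installation is involved, the org that hosts the file does not need to be one the app has access to, as long as the repo is public.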
v20250828-135156
Fixing the behaviour for getRunnerTypes with scaleConfigOrg (#7062)

Currently, when scaleConfigOrg points to an organization other than the one the runners are assigned to, it is not always correctly selected for the `getRunnerTypes` call, triggering errors similar to:
```
ERROR [getRunnerTypes]: HttpError: Not Found
```
This is because it is not matched correctly in every place where it is used.
v20250826-210603
[autorevert] refactoring: extract Signal, decouple pattern detection …
v20250822-013402
[autorevert] fix query sorting (#7043)

Current sorting uses the workflow dispatch time, which does not match the order of the commit sequence. The correct approach is to sort by merge timestamp for all workflows.

This was causing errors in the detection logic, as it was mixing up the order of jobs for commit evaluation and detecting rules where it should not.

```
==================================================
SUMMARY STATISTICS
==================================================
Workflow(s): Lint, trunk, pull, inductor, linux-binary-manywheel
Timeframe: 4380 hours
Commits checked: 33873
Auto revert patterns detected: 560
Actual reverts inside auto revert patterns detected (%): 204 (36.4%)
Total revert commits in period: 601

Revert categories:
  nosignal: 215 (35.8%)
  ghfirst: 151 (25.1%)
  uncategorized: 105 (17.5%)
  ignoredsignal: 70 (11.6%)
  weird: 46 (7.7%)
  landrace: 14 (2.3%)

Total reverts excluding ghfirst: 450
Reverts (excluding ghfirst) that dont match any auto revert pattern detected (%): (268) (59.6%)

*********************************************************************
STATS SUMMARY:
  PRECISION: 36.4%
  RECALL: 33.9%
  F1: 35.1%
*********************************************************************

Per workflow precision:
  Lint: 50 reverts out of 60 patterns (83.3%) [excluding ghfirst: 46 (76.7%)]
  trunk: 40 reverts out of 74 patterns (54.1%) [excluding ghfirst: 37 (50.0%)]
  pull: 79 reverts out of 276 patterns (28.6%) [excluding ghfirst: 74 (26.8%)]
  inductor: 34 reverts out of 144 patterns (23.6%) [excluding ghfirst: 31 (21.5%)]
  linux-binary-manywheel: 1 reverts out of 6 patterns (16.7%) [excluding ghfirst: 0 (0.0%)]
```
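
The summary figures can be reproduced from the counts in the output. The note does not state the formulas, so the sketch below assumes precision is reverts inside patterns over patterns detected (204 / 560) and recall is reverts inside patterns over total reverts (204 / 601); those assumptions happen to match the reported percentages. The function and field names are made up for illustration.

```typescript
// Counts taken from the summary output; formulas are assumed, not confirmed.
type Summary = { precision: number; recall: number; f1: number };

function summarize(
  revertsInsidePatterns: number,
  patternsDetected: number,
  totalReverts: number,
): Summary {
  const precision = revertsInsidePatterns / patternsDetected;
  const recall = revertsInsidePatterns / totalReverts;
  // F1 is the harmonic mean of precision and recall.
  const f1 = (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Formats a ratio as a percentage with one decimal place.
function pct(ratio: number): string {
  return `${(ratio * 100).toFixed(1)}%`;
}

const stats = summarize(204, 560, 601);
console.log(pct(stats.precision), pct(stats.recall), pct(stats.f1));
// prints "36.4% 33.9% 35.1%"
```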
v20250819-162243
Bump axios from 1.7.7 to 1.8.2 in /terraform-aws-github-runner/module…