Added torchrun compatibility for distributed training across multiple GPUs in a single node (single instance) #1552
Triggered via pull request
August 8, 2024 17:23
sage-maker
synchronize
#4766
Status
Cancelled
Total duration
43m 7s
Artifacts
–
codebuild-ci.yml
on: pull_request_target
Deployment protection rules
Reviewers, timers, and other rules protecting deployments in this run
Event | Environments |
---|---|
sage-maker approved | manual-approval |
Annotations
6 errors
unit-tests (py38) | Build status: FAILED |
unit-tests (py310) | Build status: FAILED |
unit-tests (py39) | Build status: FAILED |
unit-tests (py311) | Build status: FAILED |
integ-tests | Canceling since a higher priority waiting request for 'PR Checks-4766' exists |
integ-tests | The operation was canceled. |