Update run-readme-pr-linuxaarch64.yml to use correct runner #1469
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1469
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit 1a46c9a with merge base b2d8f2a:
NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
runner: linux.arm64.m7g.4xlarge
gpu-arch-type: cuda
gpu-arch-version: "12.1"
timeout: 60
Try passing
docker-image: "pytorch/manylinuxaarch64-builder:cuda12.1-main"
It looks like the error
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
is related to the fact that this job uses the docker-image=pytorch/conda-builder:cuda12.1 image by default, which is not correct for the linux.arm64.m7g.4xlarge runner.
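For illustration, the suggested override might look roughly like the sketch below. The job name, the `uses:` path, and the script body are assumptions made for the example and are not taken from this PR; only the `with:` values come from the diff and the suggestion above.

```yaml
# Sketch only: job name, `uses:` path, and script body are assumptions.
jobs:
  test-readme-aarch64:          # hypothetical job name
    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main   # assumed reusable workflow
    with:
      runner: linux.arm64.m7g.4xlarge
      gpu-arch-type: cuda
      gpu-arch-version: "12.1"
      # Explicit aarch64 image, replacing the amd64 default
      # (pytorch/conda-builder:cuda12.1) that triggered the platform warning.
      docker-image: "pytorch/manylinuxaarch64-builder:cuda12.1-main"
      timeout: 60
      script: |
        # Placeholder step: confirm the container architecture.
        uname -m
```

Whether this exact image tag is published is not confirmed here, so the sketch shows where the override goes, not that it resolves.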
It doesn't look like it can find that docker-image verbatim; testing with the 12.6 version found in pt/pt.
If you are using linux_job_v2.yml, you can try the latest image: pytorch/manylinux2_28_aarch64-builder:cuda12.6
It doesn't look like the CUDA tag manylinux2_28_aarch64-builder:cuda12.6 exists, but the CPU variant (:cpu-aarch64-main) with linux_job_v2 seems to be the right track.
Now we're just down to a missing devtoolset-10-binutils, which is curious since pt/pt uses v10 for aarch64.
Edit: Resolved; the pip installs were unnecessary.
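A rough sketch of the CPU-on-linux_job_v2 direction described above follows. The job name, the `uses:` path, the specific permissions, the full image name (repository joined with the `:cpu-aarch64-main` tag), and the script body are all assumptions for illustration; the commit list only says that permissions were added and that everything was switched to CPU on linux v2.

```yaml
# Sketch only: everything marked "assumed" is not confirmed by this PR.
jobs:
  test-readme-aarch64:          # hypothetical job name
    permissions:                # assumed permission set for linux_job_v2
      id-token: write
      contents: read
    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main   # assumed path
    with:
      runner: linux.arm64.m7g.4xlarge
      gpu-arch-type: cpu        # CPU variant, so no gpu-arch-version is passed
      # Assumed full name: repository from the thread plus the CPU tag.
      docker-image: "pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main"
      timeout: 60
      script: |
        # Placeholder step; no devtoolset or extra pip installs,
        # which the thread found unnecessary.
        uname -m
```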
fyi: @mikekgfb we are looking into it
* Update run-readme-pr-linuxaarch64.yml to use correct runner
* Move to linux.arm64.m7g.4xlarge
* Explicitly overriding the docker-image
* Bumping Cuda version to 12.6
* Updating GPU Arch type
* Testing various linux_job combos: v2 cuda, v2 cpu, v1 cpu
* Adding permissions to linux job v2
* Switch everything to CPU linux v2
* Test with devtoolset-11
* Remove devtoolset install
* Removing devtoolset from commands
#1350 used linux-aarch64 as the runner when we should be using linux.arm64.2xlarge for aarch64 instead.