[CI] Add AMDGPU testing to precommit #16189

sarnex · 2024-11-26T16:24:55Z

We now have two AMDGPU runners, but we had to use one of them for build because we lost the PVC build machines.

I just readded the PVC machines and made them our only build systems (they are really fast too), so now both AMDGPU runners only do AMDGPU testing.

The AMDGPU tests run in about 10 minutes, and with two runners, I think we have enough runner bandwidth to justify re-enabling it in precommit testing (we also have two NVIDIA runners and we run that in precommit and it takes double the time to run).

These two machines are stable, we don't use the unstable runner anymore.

Signed-off-by: Sarnie, Nick <[email protected]>

npmiller · 2024-12-03T14:14:57Z

We've had a PR where testing failed on the gfx1031 failed, but passed on the gfx1030, since gfx1031 is not officially supported by ROCm, maybe it make more sense to only have the one gfx1030 machine in pre-commit, what do you think? Would it still be enough bandwidth with just one machine?

sarnex · 2024-12-03T15:13:37Z

@npmiller I think one machine is too little. If that test fails either consistently or sporadically and we're confident it's because of the architecture, we can just make it UNSUPPORTED on that architecture IMO. All of the other tests pass on both machines so I don't think we need to get rid of the 1031 machine.

aelovikov-intel · 2024-12-03T15:22:30Z

@npmiller I think one machine is too little. If that test fails either consistently or sporadically and we're confident it's because of the architecture, we can just make it UNSUPPORTED on that architecture IMO. All of the other tests pass on both machines so I don't think we need to get rid of the 1031 machine.

That makes the result flaky (UNSUPPORTED/PASS) though... I'm not blocking anything, just highlighting this drawback.

sarnex · 2024-12-03T15:25:37Z

@aelovikov-intel Ah yeah I see, whether the test runs or not will depend on what runner gets allocated. Since we don't have that many changes that are likely to affect HIP only nor tests where we expected to have to restrict based on AMD arch, IMO it's fine because any issue should get caught eventually. If we do this for many tests then we definitely need a better solution.

[CI] Add AMDGPU testing to precommit

5b9de0f

Signed-off-by: Sarnie, Nick <[email protected]>

sarnex marked this pull request as ready for review November 26, 2024 17:10

sarnex requested a review from a team as a code owner November 26, 2024 17:10

sarnex requested review from aelovikov-intel and uditagarwal97 November 26, 2024 17:10

aelovikov-intel approved these changes Nov 26, 2024

View reviewed changes

uditagarwal97 approved these changes Nov 26, 2024

View reviewed changes

sarnex merged commit 9ccc9ee into intel:sycl Nov 26, 2024
12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[CI] Add AMDGPU testing to precommit #16189

[CI] Add AMDGPU testing to precommit #16189

Uh oh!

sarnex commented Nov 26, 2024 •

edited

Loading

Uh oh!

Uh oh!

npmiller commented Dec 3, 2024

Uh oh!

sarnex commented Dec 3, 2024

Uh oh!

aelovikov-intel commented Dec 3, 2024

Uh oh!

sarnex commented Dec 3, 2024 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[CI] Add AMDGPU testing to precommit #16189

[CI] Add AMDGPU testing to precommit #16189

Uh oh!

Conversation

sarnex commented Nov 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

npmiller commented Dec 3, 2024

Uh oh!

sarnex commented Dec 3, 2024

Uh oh!

aelovikov-intel commented Dec 3, 2024

Uh oh!

sarnex commented Dec 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

sarnex commented Nov 26, 2024 •

edited

Loading

sarnex commented Dec 3, 2024 •

edited

Loading