Add new device descriptor to query 2D block array capabilities of the Intel GPU #2261

againull · 2024-10-28T21:37:39Z

Some Intel GPU devices support 2D block array operations, which can be used to optimize applications on Intel GPUs.
This extension provides a device descriptor that lets applications query the 2D block array capabilities of a device.
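
As a rough illustration of how a caller might consume such a descriptor, here is a minimal, self-contained sketch. It assumes the capabilities come back as a bitmask with separate load and store bits. Every name in it (`BlockArrayCapability`, `MockDevice`, `queryBlockArrayCaps`, `supportsBlockLoadStore`) is hypothetical and stands in for the real UR types and query entry point, which are not reproduced here.

```cpp
#include <cstdint>

// Hypothetical capability bits (illustrative names, not the UR spelling).
enum BlockArrayCapability : std::uint32_t {
    kBlockArrayNone  = 0,
    kBlockArrayLoad  = 1u << 0,  // device can perform 2D block array loads
    kBlockArrayStore = 1u << 1,  // device can perform 2D block array stores
};

// Hypothetical stand-in for a device handle; a real runtime would hide this.
struct MockDevice {
    std::uint32_t blockArrayCaps;
};

// Hypothetical stand-in for a device-info query: writes the capability
// bitmask to outCaps and returns false on invalid arguments.
bool queryBlockArrayCaps(const MockDevice *device, std::uint32_t *outCaps) {
    if (device == nullptr || outCaps == nullptr) {
        return false;
    }
    *outCaps = device->blockArrayCaps;
    return true;
}

// Returns true only if the device reports both load and store support.
// A failed query is treated as "not supported" so callers can fall back.
bool supportsBlockLoadStore(const MockDevice &device) {
    std::uint32_t caps = 0;
    if (!queryBlockArrayCaps(&device, &caps)) {
        return false;
    }
    const std::uint32_t required = kBlockArrayLoad | kBlockArrayStore;
    return (caps & required) == required;
}
```

Treating a failed or empty query as "unsupported" keeps the optimized path opt-in, which seems a reasonable default for an experimental descriptor.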

This exposes the following low-level compute runtime experimental extension:
https://github.com/intel/compute-runtime/blob/master/level_zero/doc/experimental_extensions/2D_BLOCK_TRANSPOSE.md
Since this is an experimental extension at the GPU runtime level, I think it's reasonable to make it experimental at the UR level as well. I followed the UR documentation and added the new descriptor as an experimental extension; please advise if it should be done differently.

The headers for this experimental extension are only available in the compute-runtime repo and are not available in https://github.com/oneapi-src/level-zero. So this PR introduces changes to sparsely fetch the corresponding headers from the compute-runtime repo: only the required headers are fetched, not the entire repo.

Corresponding intel/llvm PR: intel/llvm#15905

steffenlarsen

HIP and CUDA changes look good!

nrspruit

LGTM for L0, thanks!

againull · 2024-10-30T17:33:35Z

@oneapi-src/unified-runtime-opencl-write @oneapi-src/unified-runtime-native-cpu-write Could you please take a look?

againull · 2024-10-30T19:00:28Z

Converting to a draft for now, as a question was raised in intel/llvm#15905 about whether we need to expose this at a higher level.

againull · 2024-11-25T18:25:41Z

The intel/llvm PR has been approved, so could you please help merge this PR?

againull · 2024-11-26T21:34:25Z

The urEnqueueKernelLaunchIncrementMultiDeviceMultiThreadTest.Success/NoUseEventsNoQueuePerThread failure is unrelated to this PR; it has also been seen in other PRs, for example here:
#2055
https://github.com/oneapi-src/unified-runtime/actions/runs/12010801641/job/33478467395

igchor · 2024-11-26T21:40:58Z

> urEnqueueKernelLaunchIncrementMultiDeviceMultiThreadTest.Success/NoUseEventsNoQueuePerThread failure is unrelated as it has been seen in other PRs as well, for example here: #2055 https://github.com/oneapi-src/unified-runtime/actions/runs/12010801641/job/33478467395

I added this to a match file: #2388

Add esimd device descriptor to check if 2d block operations are supported by the device. UR counterpart: oneapi-src/unified-runtime#2261

againull requested review from a team as code owners October 28, 2024 21:37

againull requested a review from steffenlarsen October 28, 2024 21:37

steffenlarsen approved these changes Oct 29, 2024


nrspruit approved these changes Oct 29, 2024


againull force-pushed the againull/2d_block_exp branch from e5e7fec to 680a2f3 Compare October 30, 2024 17:33

aarongreig approved these changes Oct 30, 2024


againull marked this pull request as draft October 30, 2024 18:59

againull force-pushed the againull/2d_block_exp branch 4 times, most recently from 10d1bd2 to 17ae982 Compare November 19, 2024 22:30

againull mentioned this pull request Nov 21, 2024

[SYCL] Add esimd device descriptor for 2d load/store/prefetch intel/llvm#15905

Merged

againull force-pushed the againull/2d_block_exp branch 2 times, most recently from 4f092a1 to 28b1429 Compare November 25, 2024 18:23

againull marked this pull request as ready for review November 25, 2024 18:24

kbenzie added the "ready to merge" and "v0.11.x" labels Nov 26, 2024

againull added 5 commits November 26, 2024 10:09

Define exp extension

6fca68d

Genereate sources

42c2b66

Add implementation

66025f0

OpenCL adapter implementation and update other adapters

a1a3a43

Remove redefinitions in L0 V2

c79df59

againull force-pushed the againull/2d_block_exp branch from 28b1429 to c79df59 Compare November 26, 2024 18:51

callumfare merged commit db83117 into oneapi-src:main Nov 27, 2024
72 of 74 checks passed

againull mentioned this pull request Dec 4, 2024

[L0] Shorten the dir name for the fecthed repo to avoid hitting Windows max limit #2413

Merged

pbalcer mentioned this pull request Dec 20, 2024

Skip compute-runtime fetch #2494

Open
