@againull (Contributor) commented Oct 28, 2024

Some Intel GPU devices support 2D block array operations which may be used to optimize applications on Intel GPUs.
This extension provides a device descriptor that allows querying the 2D block array capabilities of a device.

This exposes the following low-level compute runtime experimental extension:
https://github.com/intel/compute-runtime/blob/master/level_zero/doc/experimental_extensions/2D_BLOCK_TRANSPOSE.md
Since this is an experimental extension at the GPU runtime level, I think it's reasonable to make it experimental at the UR level as well. I followed the UR documentation and added the new descriptor as an experimental extension; please advise if it should be done differently.

Headers for this experimental extension are available only in the compute-runtime repo, not in https://github.com/oneapi-src/level-zero. This PR therefore adds a sparse fetch of the corresponding headers from the compute-runtime repo: only the required headers are fetched, not the entire repo.

Corresponding intel/llvm PR: intel/llvm#15905
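
For readers unfamiliar with capability descriptors, here is a minimal sketch of how a caller might interpret a capability bitmask once it has been queried from a device. Every name in it (`Block2DCapability`, `kBlock2DLoad`, `kBlock2DStore`, `supportsAll`, `describeCaps`) is a hypothetical stand-in rather than an identifier added by this PR. The split into separate load and store bits is also an assumption made for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the capability bitfield. The real UR
// identifiers are defined by this PR and may differ.
enum Block2DCapability : std::uint32_t {
    kBlock2DLoad  = 1u << 0,  // assumed: device supports 2D block loads
    kBlock2DStore = 1u << 1,  // assumed: device supports 2D block stores
};

// Returns true only if every bit in `required` is also set in `caps`.
inline bool supportsAll(std::uint32_t caps, std::uint32_t required) {
    return (caps & required) == required;
}

// Builds a short human-readable summary of a capability mask for logging.
inline std::string describeCaps(std::uint32_t caps) {
    if (caps == 0) return "none";
    std::string out;
    if (caps & kBlock2DLoad) out += "load";
    if (caps & kBlock2DStore) out += out.empty() ? "store" : "+store";
    // Bits outside the two assumed flags are reported as unknown.
    return out.empty() ? "unknown" : out;
}
```

An application would typically call something like `supportsAll(caps, kBlock2DLoad | kBlock2DStore)` before choosing a code path that relies on 2D block operations, and fall back to a generic path otherwise.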

@againull requested review from a team as code owners October 28, 2024 21:37

@steffenlarsen (Contributor) left a comment

HIP and CUDA changes look good!

@nrspruit (Contributor) left a comment

LGTM for L0, thanks!

@againull force-pushed the againull/2d_block_exp branch from e5e7fec to 680a2f3 October 30, 2024 17:33

@againull (Contributor Author) commented

@oneapi-src/unified-runtime-opencl-write @oneapi-src/unified-runtime-native-cpu-write Could you please take a look?

@againull marked this pull request as draft October 30, 2024 18:59

@againull (Contributor Author) commented

Converting to a draft for now, as a question was raised in intel/llvm#15905 about whether this needs to be exposed at a higher level.

@againull force-pushed the againull/2d_block_exp branch 4 times, most recently from 10d1bd2 to 17ae982 November 19, 2024 22:30
@againull force-pushed the againull/2d_block_exp branch 2 times, most recently from 4f092a1 to 28b1429 November 25, 2024 18:23
@againull marked this pull request as ready for review November 25, 2024 18:24

@againull (Contributor Author) commented Nov 25, 2024

The intel/llvm PR has been approved, so could you please help merge this PR?

@kbenzie added the "ready to merge" and "v0.11.x" labels Nov 26, 2024
@againull force-pushed the againull/2d_block_exp branch from 28b1429 to c79df59 November 26, 2024 18:51
@github-actions bot added the loader, conformance, specification, experimental, level-zero, cuda, hip, opencl, and native-cpu labels Nov 26, 2024

@againull (Contributor Author) commented Nov 26, 2024

urEnqueueKernelLaunchIncrementMultiDeviceMultiThreadTest.Success/NoUseEventsNoQueuePerThread failure is unrelated as it has been seen in other PRs as well, for example here:
#2055
https://github.com/oneapi-src/unified-runtime/actions/runs/12010801641/job/33478467395

@igchor (Contributor) commented Nov 26, 2024

> urEnqueueKernelLaunchIncrementMultiDeviceMultiThreadTest.Success/NoUseEventsNoQueuePerThread failure is unrelated as it has been seen in other PRs as well, for example here: #2055 https://github.com/oneapi-src/unified-runtime/actions/runs/12010801641/job/33478467395

I added this to a match file: #2388

@callumfare merged commit db83117 into oneapi-src:main Nov 27, 2024
72 of 74 checks passed
againull added a commit to intel/llvm that referenced this pull request Nov 28, 2024
Add esimd device descriptor to check if 2d block operations are
supported by the device.
UR counterpart: oneapi-src/unified-runtime#2261