[SYCL] Use kernel bundles to query reduction work-group sizes #16009
Draft: GeorgeWeb wants to merge 7 commits into intel:sycl from GeorgeWeb:georgi/sycl-reduction-wg-size
Problem statement
Reduction work-group sizes are currently derived from the device maximum rather than from the maximum reported by a query on the kernel itself.
Changes overview
This change aims to let the reduction implementations query their kernels by using kernel bundles.
To do that, all kernel functions have to be defined as function objects and given unique names (for the `kernel_id`s), so that they can be identified when a `kernel_bundle` is obtained.
Additionally, the `kernel_bundle` can be used with the command group execution to ensure that we execute the same kernel we queried. This has been added as an option and can be configured by the user via `SYCL_REDUCTION_ENABLE_USE_KERNEL_BUNDLES=1|0`. It is set to 0 by default, which means `use_kernel_bundle` is not called.
End Goal
The goal is to be able to do `kernel.get_info<kernel_device_specific::<query_value_here>>` (e.g. `work_group_size`) queries, primarily to make sure that safe work-group sizes (kernel maximums, not device maximums) are chosen. It will also let us query other kernel-device-specific capabilities, which can be used to implement (or optimize) the different kernel strategies to be as efficient as possible.
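To illustrate why the kernel maximum matters, here is a minimal sketch of the size selection in plain C++, so it runs without a SYCL toolchain. The helper name `chooseWorkGroupSize` and its parameters are hypothetical and do not come from this PR. In SYCL 2020 the two limits would typically come from `device.get_info<sycl::info::device::max_work_group_size>()` and `kernel.get_info<sycl::info::kernel_device_specific::work_group_size>(device)`; here they are passed in as plain numbers.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of this PR): pick a work-group size that is
// safe for one specific kernel on one specific device.
//
//   requested - the size the reduction strategy would like to use
//   deviceMax - the device-wide limit (assumed to come from a device query)
//   kernelMax - the per-kernel limit (assumed to come from a kernel query);
//               it can be lower than deviceMax, e.g. when the kernel uses
//               many registers
inline std::size_t chooseWorkGroupSize(std::size_t requested,
                                       std::size_t deviceMax,
                                       std::size_t kernelMax) {
  // The kernel can never run with more work-items per group than either
  // limit allows, so the effective limit is the smaller of the two.
  const std::size_t limit = std::min(deviceMax, kernelMax);
  // Clamp the request to the limit, but never return a size of zero.
  return std::max<std::size_t>(1, std::min(requested, limit));
}
```

For example, with a device maximum of 1024 and a kernel maximum of 256, a request for 1024 work-items is reduced to 256, whereas relying on the device maximum alone would have kept 1024 and risked a launch failure.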