SYCL RT: Using recommended shortcut API for kernel specific max work group size. #1059

@fengyuan14

🚀 The feature, motivation and pitch

Existing code, which queries the kernel-specific max work-group size through a kernel bundle:

auto q = c10::xpu::getCurrentXPUStream(dev_id).queue();
auto ctx = q.get_context();
auto dev = q.get_device();
auto kid = ::sycl::get_kernel_id<KernelClass>();
// The kernel is built only for the first device, so launching it on any
// other device raises a runtime error. As a temporary workaround, request
// the executable kernel bundle for the target device explicitly, which
// gives the SYCL runtime the extra hint it needs.
// https://github.com/intel/llvm/issues/15127
auto kbundle = ::sycl::get_kernel_bundle<::sycl::bundle_state::executable>(
    ctx, {dev}, {kid});
::sycl::kernel k = kbundle.get_kernel(kid);
return k.get_info<::sycl::info::kernel_device_specific::work_group_size>(dev);

TODO: Replace the code above with the recommended shortcut API. See intel/llvm#15650.

Alternatives

No response

Additional context

No response
