@lbushi25
Contributor

@lbushi25 lbushi25 commented Aug 20, 2025

When a kernel uses implicit local memory, for example through the `get_work_group_scratch_memory` function, the library is supposed to mark the kernel with the `WORK_GROUP_STATIC_ATTR` attribute so that the kernel works at runtime. This marking is done through the properties passed to the kernel invocation call. For free function kernels, however, there is no infrastructure to do this marking, so using the function typically results in a UR error.

This PR changes the middle end to traverse the call graph from every use of the compiler built-in functions `__sycl_allocateLocalMemory` and `__sycl_dynamicLocalMemoryPlaceholder`, and to mark each kernel found during this traversal, including free function kernels, with the `WORK_GROUP_STATIC_ATTR` attribute if it is not already present.
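The traversal described above can be illustrated on a toy call graph. This is a minimal sketch under stated assumptions, not the code in this PR: `ToyFunction`, `ToyModule`, `markKernelsUsingLocalMemory`, and the attribute string are hypothetical stand-ins for the LLVM IR types and for the real value behind `WORK_GROUP_STATIC_ATTR`. It starts at the built-ins, walks upward through their callers, and marks every kernel it reaches that is not already marked.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an IR function (hypothetical, not an LLVM type).
struct ToyFunction {
  bool IsKernel = false;            // models the SPIR_KERNEL calling convention
  std::set<std::string> Attrs;      // models function attributes
  std::vector<std::string> Callees; // functions this one calls directly
};

using ToyModule = std::map<std::string, ToyFunction>;

// Hypothetical attribute spelling, used only inside this sketch.
const std::string WorkGroupStaticAttr = "work-group-static";

// Walks from each built-in up through its (transitive) callers and marks
// every kernel reached. Returns the names of the kernels newly marked.
std::set<std::string>
markKernelsUsingLocalMemory(ToyModule &M,
                            const std::vector<std::string> &Builtins) {
  // Invert the call edges so the walk can go from callee to caller.
  std::map<std::string, std::vector<std::string>> Callers;
  for (const auto &Entry : M)
    for (const std::string &Callee : Entry.second.Callees)
      Callers[Callee].push_back(Entry.first);

  std::set<std::string> Visited;
  std::queue<std::string> Worklist;
  for (const std::string &B : Builtins)
    if (Visited.insert(B).second)
      Worklist.push(B);

  std::set<std::string> NewlyMarked;
  while (!Worklist.empty()) {
    std::string Name = Worklist.front();
    Worklist.pop();

    // Mark the function if it is a kernel that lacks the attribute.
    auto It = M.find(Name);
    if (It != M.end() && It->second.IsKernel &&
        It->second.Attrs.insert(WorkGroupStaticAttr).second)
      NewlyMarked.insert(Name);

    // Continue upward through every caller not seen yet.
    for (const std::string &Caller : Callers[Name])
      if (Visited.insert(Caller).second)
        Worklist.push(Caller);
  }
  return NewlyMarked;
}
```

In this sketch a kernel that reaches a built-in only through a helper function is still marked, a kernel that already carries the attribute is left unchanged, and a kernel that never reaches a built-in stays unmarked.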

@lbushi25 lbushi25 marked this pull request as ready for review August 21, 2025 06:07
@lbushi25 lbushi25 requested review from a team as code owners August 21, 2025 06:07
@lbushi25 lbushi25 requested a review from slawekptak August 21, 2025 06:07
Comment on lines 74 to 77
 if (F.getCallingConv() == CallingConv::SPIR_KERNEL) {
   int ArgPos = GetArgumentPos(F);
-  SPIRKernelNames.emplace_back(F.getName(), ArgPos);
+  if (ArgPos >= 0 || F.hasFnAttribute(WORK_GROUP_STATIC_ATTR))
+    SPIRKernelNames.emplace_back(F.getName(), ArgPos);
Contributor

I think that these changes can be reverted.

With the changes below, we automatically mark all kernels using the feature if they weren't marked by the headers, so there is no need to analyze the arguments of kernels without the attribute.

void double_kernel(float *src, float *dst) {
  size_t lid = syclext::this_work_item::get_nd_item<1>().get_local_linear_id();

  float *local_mem = (float *)syclexp::get_work_group_scratch_memory();
Contributor

While we are at it, we need to make sure that other extensions for local/work-group memory are covered as well (i.e. work with free function kernels):

Contributor Author

> While we are at it, we need to make sure that other extensions for local/work-group memory are covered as well (i.e. work with free function kernels):

I have added the test.

@AlexeySachkov
Contributor

@jzc, @asudarsa, @maarquitos14, @intel/dpcpp-tools-reviewers, could you please take a look as well? A notable portion of the change was actually authored by me, so I don't think that I should be the one approving it.

@AlexeySachkov AlexeySachkov requested a review from a team August 22, 2025 08:39

#include "helpers.hpp"

#include <cassert>
Contributor

I think this one should be the last one.

Contributor Author

The formatter disagrees. I have made the change manually, so hopefully it won't fail the formatter pre-commit check.

Contributor Author

Yeah, the formatter doesn't accept this so I am reverting it.


SYCL_EXT_ONEAPI_FUNCTION_PROPERTY((syclexp::nd_range_kernel<1>))
void scratch_kernel(float *src, float *dst) {
  size_t lid = syclext::this_work_item::get_nd_item<1>().get_local_linear_id();
Contributor

Note that variable names should start with an upper-case letter.

Contributor

@maarquitos14 maarquitos14 left a comment

Last few nits! Otherwise LGTM.

@lbushi25
Contributor Author

Failures unrelated, see #19767
@intel/llvm-gatekeepers Can you merge please?

@lbushi25
Contributor Author

lbushi25 commented Aug 27, 2025

> Failures unrelated, see #19767 @intel/llvm-gatekeepers Can you merge please?

@intel/llvm-gatekeepers, friendly ping for merge.

@ldrumm ldrumm merged commit e2093d2 into intel:sycl Aug 27, 2025
41 of 44 checks passed
AlexeySachkov pushed a commit to AlexeySachkov/llvm that referenced this pull request Sep 4, 2025
…kernels (intel#19837)

AlexeySachkov added a commit that referenced this pull request Sep 8, 2025
…kernels (#19978)

This is a cherry-pick of #19837

Patch-by: Lorenc Bushi <[email protected]>