Skip to content

Conversation

@tahonermann
Copy link
Contributor

@tahonermann tahonermann commented Aug 6, 2025

The sycl_kernel_entry_point attribute facilitates the generation of an offload kernel entry point function with parameters corresponding to the (potentially decomposed) kernel arguments and a body that (potentially reconstructs the arguments and) executes the kernel. This change adds symmetric support for the SYCL host through an interface that provides symbol names and (potentially decomposed) kernel arguments to the SYCL library.

Consider the following function declared with the sycl_kernel_entry_point attribute with a call to this function occurring in the implementation of a SYCL kernel invocation function such as sycl::handler::single_task().

  template<typename KernelNameType, typename KernelType>
  [[clang::sycl_kernel_entry_point(KernelNameType)]]
  void kernel_entry_point(KernelType kernel) {
    kernel();
  }

The body of the above function specifies the parameters and body of the generated offload kernel entry point. Clearly, a call to the above function by a SYCL kernel invocation function is not intended to execute the body as written. Previously, code generation emitted an empty function body so that calls to the function had no effect other than to trigger the generation of the offload kernel entry point. The function body is therefore available to hook for SYCL library support and is now substituted with a call to a (SYCL library provided) function template named sycl_kernel_launch() with the kernel name type passed as the first template argument, the symbol name of the offload kernel entry point passed as a string literal for the first function argument, and the (possibly decomposed) parameters passed as the remaining explicit function arguments. Given a call like this:

  kernel_entry_point<struct KN>([]{})

the body of the instantiated kernel_entry_point() specialization would be substituted as follows with "kernel-symbol-name" substituted for the generated symbol name and kernel forwarded (This assumes no kernel argument decomposition; if decomposition was required, kernel would be replaced with its corresponding decomposed arguments).

  sycl_kernel_launch<KN>("kernel-symbol-name", kernel)

Name lookup and overload resolution for the sycl_kernel_launch() function is performed at the point of definition of the sycl_kernel_entry_point attributed function (or the point of instantiation for an instantiated function template specialization). If overload resolution fails, the program is ill-formed.

Implementation of the sycl_kernel_launch() function might require additional information provided by the SYCL library. This is facilitated by removing the previous prohibition against use of the sycl_kernel_entry_point attribute with a non-static member function. If the sycl_kernel_entry_point attributed function is a non-static member function, then overload resolution for the sycl_kernel_launch() function template may select a non-static member function in which case, this will be implicitly passed as the implicit object argument.

If a sycl_kernel_entry_point attributed function is a non-static member function, use of this in a potentially evaluated expression is prohibited in the definition (since this is not a kernel argument and will not be available within the generated offload kernel entry point function).

Support for kernel argument decomposition and reconstruction is not yet implemented.

@tahonermann tahonermann added the SYCL https://registry.khronos.org/SYCL label Aug 6, 2025
@tahonermann tahonermann requested a review from bader August 6, 2025 22:57
@tahonermann tahonermann force-pushed the sycl-upstream-fe-sycl_kernel_entry_point-host branch from 56407fb to 5b42f6b Compare August 7, 2025 02:27
@tahonermann tahonermann force-pushed the sycl-upstream-fe-sycl_kernel_entry_point-host branch from 7a913b2 to 70f34c3 Compare August 25, 2025 18:22
@github-actions
Copy link

github-actions bot commented Sep 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@YuriPlyakhin
Copy link
Contributor

nit: typo in PR description: void kernel_entry_point(KernelType kerne) { -> void kernel_entry_point(KernelType kernel) {

…t attribute.

The `sycl_kernel_entry_point` attribute facilitates the generation of an
offload kernel entry point function with parameters corresponding to the
(potentially decomposed) kernel arguments and a body that (potentially
reconstructs the arguments and) executes the kernel. This change adds
symmetric support for the SYCL host through an interface that provides
symbol names and (potentially decomposed) kernel arguments to the SYCL
library.

Consider the following function declared with the `sycl_kernel_entry_point`
attribute with a call to this function occurring in the implementation of
a SYCL kernel invocation function such as `sycl::handler::single_task()`.
  template<typename KernelNameType, typename KernelType>
  [[clang::sycl_kernel_entry_point(KernelNameType)]]
  void kernel_entry_point(KernelType kerne) {
    kernel();
  }

The body of the above function specifies the parameters and body of the
generated offload kernel entry point. Clearly, a call to the above function
by a SYCL kernel invocation function is not intended to execute the body
as written. Previously, code generation emitted an empty function body so
that calls to the function had no effect other than to trigger the generation
of the offload kernel entry point. The function body is therefore available
to hook for SYCL library support and is now substituted with a call to a
(SYCL library provided) function template named `sycl_enqueue_kernel_launch()`
with the kernel name type passed as the first template argument, the
symbol name of the offload kernel entry point passed as a string literal for
the first function argument, and the (possibly decomposed) parameters passed
as the remaining explicit function arguments. Given a call like this:
  kernel_entry_point<struct KN>([]{})
the body of the instantiated `kernel_entry_point()` specialization would be
substituted as follows with "kernel-symbol-name" substituted for the
generated symbol name and `kernel` forwarded (This assumes no kernel
argument decomposition; if decomposition was required, `kernel` would be
replaced with its corresponding decomposed arguments).
  sycl_enqueue_kernel_launch<KN>("kernel-symbol-name", kernel)

Name lookup and overload resolution for the `sycl_enqueue_kernel_launch()`
function is performed at the point of definition of the
`sycl_kernel_entry_point` attributed function (or the point of instantiation
for an instantiated function template specialization). If overload
resolution fails, the program is ill-formed.

Implementation of the `sycl_enqueue_kernel_launch()` function might require
additional information provided by the SYCL library. This is facilitated by
removing the previous prohibition against use of the `sycl_kernel_entry_point`
attribute with a non-static member function. If the `sycl_kernel_entry_point`
attributed function is a non-static member function, then overload resolution
for the `sycl_enqueue_kernel_launch()` function template may select a
non-static member function in which case, `this` will be implicitly passed
as the implicit object argument.

If a `sycl_kernel_entry_point` attributed function is a non-static member
function, use of `this` in a potentially evaluated expression is prohibited
in the definition (since `this` is not a kernel argument and will not be
available within the generated offload kernel entry point function).

Support for kernel argument decomposition and reconstruction is not yet
implemented.
@tahonermann tahonermann force-pushed the sycl-upstream-fe-sycl_kernel_entry_point-host branch from 87f0c2b to 2a1c23d Compare November 10, 2025 22:35
Fznamznon and others added 6 commits November 10, 2025 14:58
…#51)

* Add support for host kernel launch stmt generation

This adds generation of a call to sycl_enqueue_kernel_launch function
aka "launcher" function. The launcher function can be a memeber of a
class or a free function defined at namespace scope. The lookup is
performed from SKEP attributed function scope. Because unqualified
lookup requires Scope object present and it only exists during parsing
stage and already EOLed at the point where templates instantiated, I had
to move some parts of SYCLKernelCallStmt generation to earlier stages
and now TreeTransform knows how to process SYCLKernelCallStmt.
I also had to invent a new expression - UnresolvedSYCLKernelExpr which
represents a string containing kernel name of a kernel that doesn't
exist yet. This expression is supposed to be transformed to a
StringLiteral during template instantiation phase. It should never reach
AST consumers like CodeGen of constexpr evaluators. This still requires
more testing and FIXME cleanups, but since it evolved into a quite
complicated patch I'm pushing it for earlier feedback.

* Remove a fixme from SemaSYCL

* Do not crash if original body was invalid

* Add AST test for skep-attributed member

* Fix a warning

* Extend codegen test a bit

* Find and replace

UnresolvedSYCLKernelNameExpr -> UnresolvedSYCLKernelLaunchExpr

* Implement the thing

* One more find and replace

* I don't know how it looks like

* Find and replace again

* Switch to UnresolvedSYCLKernelEntryPointStmt

* Apply suggestions from code review

* Remove log.txt

* Implement visiting

* Add tests

* Apply suggestions from code review

Co-authored-by: Tom Honermann <[email protected]>

* IdExpr -> KernelLaunchIdExpr

* Don't rely on compound

* UnresolvedSYCLKernelEntryPointStmt -> UnresolvedSYCLKernelCall

* Fix warnings

* Rename sycl_enqueue_kernel_launch -> sycl_kernel_launch

* Apply suggestions from code review

Co-authored-by: Tom Honermann <[email protected]>

* Remove array decay

* Add windows run line to the sema test

---------

Co-authored-by: Tom Honermann <[email protected]>
In case a function with skep attribute is instantiated two times with
the same kernel name the attribute is invalid due to the conflicting name.
Make sure to exit from instantiation of UnresolvedSYCLKernelCallStmt in
this case.
…ivially-copyable (#53)

device-copyable doesn't mean trivially-copyable, so we may encounter
arguments that need cleanup. Adds test that verifies presence of the
dtor call in the synthesized code.
- Added/edited lots of comments.
- Expanded testing for sycl_kernel_launch lookup and reorganized the tests
  to make it easier to identify and/or plug testing gaps.
- Added a missing sycl_kernel_entry_point attribute in a test.
- Reordered various declarations for better grouping and consistency.
- Renamed some variables and functions.
- Removed an unnecessary include directive.
- Adjusted the location used for the implicit sycl_kernel_launch call. The
  opening brace of the original function body is used where available and
  the general function location is used otherwise.
- Corrected lookup to unconditionally look for a template name; previously
  a spurious error about a 'template' keyword could be issued.
- Added missing code gen checks for use of a member function as the SYCL
  kernel entry point function.
- Other misc style edits for consistency.
…atic member function with an explicit object parameter.
Reworked diagnostics to include a synthesized code context and generation
of notes detailing implicit calls to sycl_kernel_launch().
@github-actions
Copy link

github-actions bot commented Nov 18, 2025

🐧 Linux x64 Test Results

  • 111821 tests passed
  • 4478 tests skipped

✅ The build succeeded and all tests passed.

@Fznamznon Fznamznon self-requested a review November 18, 2025 17:31
Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minors. I don't like the size of the PR but it is not horribly huge. I think this should be moved out of draft state.

Prompted by code review, this rewording is intended to clarify the
prose and better tie it to the example code. Mention of kernel
argument decomposition and reconstruction is removed since such support
is not yet implemented. Additional changes include use of identifiers
that match the SYCL specification for consistency, avoidance of
duplication, a change of std::forward<>() to std::move(), and oher
misc edits.
@tahonermann tahonermann marked this pull request as ready for review December 1, 2025 15:55
@tahonermann
Copy link
Contributor Author

@erichkeane, @bader, this PR is now ready for community review.

@bader bader requested a review from Fznamznon December 5, 2025 22:56
// empty body; the original body is emitted in the offload kernel entry
// point and the synthesized kernel launch code is only relevant for host
// compilation.
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find myself wondering whether we should insert some sort of 'halt' instruction here. It seems that this is an 'error' case, but we're emitting it if it is ODR used (since we don't have a way of determining reachability).

however, since we're skipping the body, this is obviously just a case where it is UB to actually EXECUTE this function. I know we can now do device asserts, so some sort of panic/terminate/etc would be appropriate here.

}
}

if (isa<CXXConstructorDecl>(FD)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is happening in this file change? What relationship does this have to the kernel-launch support?

}
if (const auto *MD = dyn_cast<CXXMethodDecl>(FD)) {
if (!MD->isStatic()) {
if (MD->isExplicitObjectMemberFunction()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this patch now allow new 'types' of functions? IMO, this should be a separate patch in a quite sizable patch.

That said, allowing member functions is interesting, but I don't see why we would reject explicit-object-member-functions, they are effectively static?

for (const Expr *Arg : Args) {
ExprValueKind EVK = Arg->getValueKind();
const char *ValueCategory =
(EVK == VK_LValue ? "lvalue"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats going on here? This sort of printing is more of a 'dump' kinda thing, right? IF we're using this for diagnostics, we've done SOMETHING seriously wrong either in implementation or design.


template<typename KernelName, typename... Ts>
void sycl_kernel_launch(const char *, Ts... Args) {}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this change doing here?


// A generic kernel launch function.
template<typename KN, typename... Ts>
void sycl_kernel_launch(const char *, Ts...) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would have been lovely for the purposes of actually reviewing such a large patch if these were just done in a separate review, even if they were ineffectual at the time.


// A generic kernel launch function.
template<typename KN, typename... Ts>
void sycl_kernel_launch(const char *, Ts...) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least... do we intend the library to support these? Maybe we should have an Inputs header for all of these...

template<typename KN, typename KT>
void sycl_kernel_launch(const char *, KT);
// expected-error@+4 {{no matching function for call to 'sycl_kernel_launch'}}
// expected-note-re@+2 {{in implicit call to 'sycl_kernel_launch' with template argument 'BADKN<5>' and function arguments (lvalue of type 'const char[{{[0-9]*}}]', xvalue of type 'BADKT<5>', xvalue of type 'int') required here}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm VERY much not a fan of how this is being displayed/printed here. This is really kind of an awful way of presenting this information. WE have planty of ways of printing argument lists, inventing a new one isn't a good idea.

@AaronBallman

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

SYCL https://registry.khronos.org/SYCL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants