-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[SYCL] SYCL host kernel launch support for the sycl_kernel_entry_point attribute. #152403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[SYCL] SYCL host kernel launch support for the sycl_kernel_entry_point attribute. #152403
Conversation
56407fb to
5b42f6b
Compare
7a913b2 to
70f34c3
Compare
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
nit: typo in PR description: |
…t attribute.
The `sycl_kernel_entry_point` attribute facilitates the generation of an
offload kernel entry point function with parameters corresponding to the
(potentially decomposed) kernel arguments and a body that (potentially
reconstructs the arguments and) executes the kernel. This change adds
symmetric support for the SYCL host through an interface that provides
symbol names and (potentially decomposed) kernel arguments to the SYCL
library.
Consider the following function declared with the `sycl_kernel_entry_point`
attribute with a call to this function occurring in the implementation of
a SYCL kernel invocation function such as `sycl::handler::single_task()`.
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void kernel_entry_point(KernelType kerne) {
kernel();
}
The body of the above function specifies the parameters and body of the
generated offload kernel entry point. Clearly, a call to the above function
by a SYCL kernel invocation function is not intended to execute the body
as written. Previously, code generation emitted an empty function body so
that calls to the function had no effect other than to trigger the generation
of the offload kernel entry point. The function body is therefore available
to hook for SYCL library support and is now substituted with a call to a
(SYCL library provided) function template named `sycl_enqueue_kernel_launch()`
with the kernel name type passed as the first template argument, the
symbol name of the offload kernel entry point passed as a string literal for
the first function argument, and the (possibly decomposed) parameters passed
as the remaining explicit function arguments. Given a call like this:
kernel_entry_point<struct KN>([]{})
the body of the instantiated `kernel_entry_point()` specialization would be
substituted as follows with "kernel-symbol-name" substituted for the
generated symbol name and `kernel` forwarded (This assumes no kernel
argument decomposition; if decomposition was required, `kernel` would be
replaced with its corresponding decomposed arguments).
sycl_enqueue_kernel_launch<KN>("kernel-symbol-name", kernel)
Name lookup and overload resolution for the `sycl_enqueue_kernel_launch()`
function is performed at the point of definition of the
`sycl_kernel_entry_point` attributed function (or the point of instantiation
for an instantiated function template specialization). If overload
resolution fails, the program is ill-formed.
Implementation of the `sycl_enqueue_kernel_launch()` function might require
additional information provided by the SYCL library. This is facilitated by
removing the previous prohibition against use of the `sycl_kernel_entry_point`
attribute with a non-static member function. If the `sycl_kernel_entry_point`
attributed function is a non-static member function, then overload resolution
for the `sycl_enqueue_kernel_launch()` function template may select a
non-static member function in which case, `this` will be implicitly passed
as the implicit object argument.
If a `sycl_kernel_entry_point` attributed function is a non-static member
function, use of `this` in a potentially evaluated expression is prohibited
in the definition (since `this` is not a kernel argument and will not be
available within the generated offload kernel entry point function).
Support for kernel argument decomposition and reconstruction is not yet
implemented.
87f0c2b to
2a1c23d
Compare
…#51) * Add support for host kernel launch stmt generation This adds generation of a call to sycl_enqueue_kernel_launch function aka "launcher" function. The launcher function can be a memeber of a class or a free function defined at namespace scope. The lookup is performed from SKEP attributed function scope. Because unqualified lookup requires Scope object present and it only exists during parsing stage and already EOLed at the point where templates instantiated, I had to move some parts of SYCLKernelCallStmt generation to earlier stages and now TreeTransform knows how to process SYCLKernelCallStmt. I also had to invent a new expression - UnresolvedSYCLKernelExpr which represents a string containing kernel name of a kernel that doesn't exist yet. This expression is supposed to be transformed to a StringLiteral during template instantiation phase. It should never reach AST consumers like CodeGen of constexpr evaluators. This still requires more testing and FIXME cleanups, but since it evolved into a quite complicated patch I'm pushing it for earlier feedback. * Remove a fixme from SemaSYCL * Do not crash if original body was invalid * Add AST test for skep-attributed member * Fix a warning * Extend codegen test a bit * Find and replace UnresolvedSYCLKernelNameExpr -> UnresolvedSYCLKernelLaunchExpr * Implement the thing * One more find and replace * I don't know how it looks like * Find and replace again * Switch to UnresolvedSYCLKernelEntryPointStmt * Apply suggestions from code review * Remove log.txt * Implement visiting * Add tests * Apply suggestions from code review Co-authored-by: Tom Honermann <[email protected]> * IdExpr -> KernelLaunchIdExpr * Don't rely on compound * UnresolvedSYCLKernelEntryPointStmt -> UnresolvedSYCLKernelCall * Fix warnings * Rename sycl_enqueue_kernel_launch -> sycl_kernel_launch * Apply suggestions from code review Co-authored-by: Tom Honermann <[email protected]> * Remove array decay * Add windows run line to the sema test --------- Co-authored-by: Tom Honermann <[email protected]>
In case a function with skep attribute is instantiated two times with the same kernel name the attribute is invalid due to the conflicting name. Make sure to exit from instantiation of UnresolvedSYCLKernelCallStmt in this case.
…ivially-copyable (#53) device-copyable doesn't mean trivially-copyable, so we may encounter arguments that need cleanup. Adds test that verifies presence of the dtor call in the synthesized code.
- Added/edited lots of comments. - Expanded testing for sycl_kernel_launch lookup and reorganized the tests to make it easier to identify and/or plug testing gaps. - Added a missing sycl_kernel_entry_point attribute in a test. - Reordered various declarations for better grouping and consistency. - Renamed some variables and functions. - Removed an unnecessary include directive. - Adjusted the location used for the implicit sycl_kernel_launch call. The opening brace of the original function body is used where available and the general function location is used otherwise. - Corrected lookup to unconditionally look for a template name; previously a spurious error about a 'template' keyword could be issued. - Added missing code gen checks for use of a member function as the SYCL kernel entry point function. - Other misc style edits for consistency.
…atic member function with an explicit object parameter.
Reworked diagnostics to include a synthesized code context and generation of notes detailing implicit calls to sycl_kernel_launch().
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed. |
Fznamznon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minors. I don't like the size of the PR but it is not horribly huge. I think this should be moved out of draft state.
Prompted by code review, this rewording is intended to clarify the prose and better tie it to the example code. Mention of kernel argument decomposition and reconstruction is removed since such support is not yet implemented. Additional changes include use of identifiers that match the SYCL specification for consistency, avoidance of duplication, a change of std::forward<>() to std::move(), and oher misc edits.
|
@erichkeane, @bader, this PR is now ready for community review. |
| // empty body; the original body is emitted in the offload kernel entry | ||
| // point and the synthesized kernel launch code is only relevant for host | ||
| // compilation. | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find myself wondering whether we should insert some sort of 'halt' instruction here. It seems that this is an 'error' case, but we're emitting it if it is ODR used (since we don't have a way of determining reachability).
however, since we're skipping the body, this is obviously just a case where it is UB to actually EXECUTE this function. I know we can now do device asserts, so some sort of panic/terminate/etc would be appropriate here.
| } | ||
| } | ||
|
|
||
| if (isa<CXXConstructorDecl>(FD)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is happening in this file change? What relationship does this have to the kernel-launch support?
| } | ||
| if (const auto *MD = dyn_cast<CXXMethodDecl>(FD)) { | ||
| if (!MD->isStatic()) { | ||
| if (MD->isExplicitObjectMemberFunction()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this patch now allow new 'types' of functions? IMO, this should be a separate patch in a quite sizable patch.
That said, allowing member functions is interesting, but I don't see why we would reject explicit-object-member-functions, they are effectively static?
| for (const Expr *Arg : Args) { | ||
| ExprValueKind EVK = Arg->getValueKind(); | ||
| const char *ValueCategory = | ||
| (EVK == VK_LValue ? "lvalue" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats going on here? This sort of printing is more of a 'dump' kinda thing, right? IF we're using this for diagnostics, we've done SOMETHING seriously wrong either in implementation or design.
|
|
||
| template<typename KernelName, typename... Ts> | ||
| void sycl_kernel_launch(const char *, Ts... Args) {} | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this change doing here?
|
|
||
| // A generic kernel launch function. | ||
| template<typename KN, typename... Ts> | ||
| void sycl_kernel_launch(const char *, Ts...) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would have been lovely for the purposes of actually reviewing such a large patch if these were just done in a separate review, even if they were ineffectual at the time.
|
|
||
| // A generic kernel launch function. | ||
| template<typename KN, typename... Ts> | ||
| void sycl_kernel_launch(const char *, Ts...) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or at least... do we intend the library to support these? Maybe we should have an Inputs header for all of these...
| template<typename KN, typename KT> | ||
| void sycl_kernel_launch(const char *, KT); | ||
| // expected-error@+4 {{no matching function for call to 'sycl_kernel_launch'}} | ||
| // expected-note-re@+2 {{in implicit call to 'sycl_kernel_launch' with template argument 'BADKN<5>' and function arguments (lvalue of type 'const char[{{[0-9]*}}]', xvalue of type 'BADKT<5>', xvalue of type 'int') required here}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm VERY much not a fan of how this is being displayed/printed here. This is really kind of an awful way of presenting this information. WE have planty of ways of printing argument lists, inventing a new one isn't a good idea.
The
sycl_kernel_entry_pointattribute facilitates the generation of an offload kernel entry point function with parameters corresponding to the (potentially decomposed) kernel arguments and a body that (potentially reconstructs the arguments and) executes the kernel. This change adds symmetric support for the SYCL host through an interface that provides symbol names and (potentially decomposed) kernel arguments to the SYCL library.Consider the following function declared with the
sycl_kernel_entry_pointattribute with a call to this function occurring in the implementation of a SYCL kernel invocation function such assycl::handler::single_task().The body of the above function specifies the parameters and body of the generated offload kernel entry point. Clearly, a call to the above function by a SYCL kernel invocation function is not intended to execute the body as written. Previously, code generation emitted an empty function body so that calls to the function had no effect other than to trigger the generation of the offload kernel entry point. The function body is therefore available to hook for SYCL library support and is now substituted with a call to a (SYCL library provided) function template named
sycl_kernel_launch()with the kernel name type passed as the first template argument, the symbol name of the offload kernel entry point passed as a string literal for the first function argument, and the (possibly decomposed) parameters passed as the remaining explicit function arguments. Given a call like this:the body of the instantiated
kernel_entry_point()specialization would be substituted as follows with "kernel-symbol-name" substituted for the generated symbol name andkernelforwarded (This assumes no kernel argument decomposition; if decomposition was required,kernelwould be replaced with its corresponding decomposed arguments).sycl_kernel_launch<KN>("kernel-symbol-name", kernel)Name lookup and overload resolution for the
sycl_kernel_launch()function is performed at the point of definition of thesycl_kernel_entry_pointattributed function (or the point of instantiation for an instantiated function template specialization). If overload resolution fails, the program is ill-formed.Implementation of the
sycl_kernel_launch()function might require additional information provided by the SYCL library. This is facilitated by removing the previous prohibition against use of thesycl_kernel_entry_pointattribute with a non-static member function. If thesycl_kernel_entry_pointattributed function is a non-static member function, then overload resolution for thesycl_kernel_launch()function template may select a non-static member function in which case,thiswill be implicitly passed as the implicit object argument.If a
sycl_kernel_entry_pointattributed function is a non-static member function, use ofthisin a potentially evaluated expression is prohibited in the definition (sincethisis not a kernel argument and will not be available within the generated offload kernel entry point function).Support for kernel argument decomposition and reconstruction is not yet implemented.