@@ -49,6 +49,8 @@ This extension also depends on the following other SYCL extensions:
4949 sycl_ext_oneapi_properties]
5050* link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[
5151 sycl_ext_oneapi_kernel_properties]
52+ * link:../experimental/sycl_ext_oneapi_enqueue_functions.asciidoc[
53+ sycl_ext_oneapi_enqueue_functions]
5254
5355
5456== Status
@@ -241,6 +243,13 @@ A function decorated with one of these properties can still be called as a
241243normal function in either host or device code.
242244The property has no effect in such cases.
243245
246+ [_Note:_ Many of the APIs specified below have a template parameter `Func`,
247+ which identifies a free function kernel.
248+ This kernel function may be defined in any translation unit in the application.
249+ It is not necessary for the function to be defined in the same translation unit
250+ as the instantiation of the template taking the `Func` parameter.
251+ _{endnote}_]
252+
244253=== New traits for kernel functions
245254
246255This extension defines the following traits that can be used to tell whether a
@@ -323,16 +332,123 @@ Otherwise `value` is `false`.
323332The helper trait `is_kernel_v` provides the value of `value`.
324333|====
325334
335+ === New free functions to launch a kernel
336+
337+ This extension adds the following helper which captures a kernel function
338+ address as a template parameter.
339+
340+ [frame=all,grid=none,separator="@"]
341+ !====
342+ a@
343+ [source,c++]
344+ ----
345+ namespace sycl::ext::oneapi::experimental {
346+
347+ template<auto *Func>
348+ struct kernel_function_s {};
349+
350+ template<auto *Func>
351+ inline constexpr kernel_function_s<Func> kernel_function;
352+
353+ } // namespace sycl::ext::oneapi::experimental
354+ ----
355+ !====
356+
357+ It also adds the following free functions which launch a free function kernel.
358+
359+ [frame=all,grid=none,separator="@"]
360+ !====
361+ a@
362+ [source,c++]
363+ ----
364+ namespace sycl::ext::oneapi::experimental {
365+
366+ template <auto *Func, typename Args...>
367+ void single_task(queue q, kernel_function_s<Func> k, Args&&... args);
368+
369+ template <auto *Func, typename Args...>
370+ void single_task(handler &h, kernel_function_s<Func> k, Args&&... args);
371+
372+ } // namespace sycl::ext::oneapi::experimental
373+ ----
374+ !====
375+
376+ _Constraints_: Available only if `is_single_task_kernel_v<Func>` is `true`.
377+ Available only if `+std::is_invocable_v<decltype(Func), Args...>+` is `true`.
378+
379+ _Effects_: Enqueues a kernel object to the `queue` or `handler` as a single task.
380+ Each value in the `args` pack is passed to the corresponding argument in
381+ `Func`, converting it to the argument's type if necessary.
382+
383+ '''
384+
385+ [frame=all,grid=none,separator="@"]
386+ !====
387+ a@
388+ [source,c++]
389+ ----
390+ namespace sycl::ext::oneapi::experimental {
391+
392+ template <auto *Func, int Dimensions, typename... Args>
393+ void nd_launch(queue q, nd_range<Dimensions> r,
394+ kernel_function_s<Func> k, Args&&... args);
395+
396+ template <auto *Func, int Dimensions, typename... Args>
397+ void nd_launch(handler &h, nd_range<Dimensions> r,
398+ kernel_function_s<Func> k, Args&&... args);
399+
400+ } // namespace sycl::ext::oneapi::experimental
401+ ----
402+ !====
403+
404+ _Constraints_: Available only if `is_nd_range_kernel_v<Func, Dimensions>` is
405+ `true`.
406+ Available only if `+std::is_invocable_v<decltype(Func), Args...>+` is `true`.
407+
408+ _Effects_: Enqueues a kernel object to the `queue` or `handler` as an ND-range
409+ kernel, using the number of work-items specified by the ND-range `r`.
410+ Each value in the `args` pack is passed to the corresponding argument in
411+ `Func`, converting it to the argument's type if necessary.
412+
413+ '''
414+
415+ [frame=all,grid=none,separator="@"]
416+ !====
417+ a@
418+ [source,c++]
419+ ----
420+ namespace sycl::ext::oneapi::experimental {
421+
422+ template <auto *Func, int Dimensions,
423+ typename Properties, typename... Args>
424+ void nd_launch(queue q,
425+ launch_config<nd_range<Dimensions>, Properties> c,
426+ kernel_function_s<Func> k, Args&& args...);
427+
428+ template <auto *Func, int Dimensions,
429+ typename Properties, typename... Args>
430+ void nd_launch(handler &h,
431+ launch_config<nd_range<Dimensions>, Properties> c,
432+ kernel_function_s<Func> k, Args&& args...);
433+
434+ } // namespace sycl::ext::oneapi::experimental
435+ ----
436+ !====
437+
438+ _Constraints_: Available only if `is_nd_range_kernel_v<Func, Dimensions>` is
439+ `true`.
440+ Available only if `+std::is_invocable_v<decltype(Func), Args...>+` is `true`.
441+
442+ _Effects_: Enqueues a kernel object to the `queue` or `handler` as an ND-range
443+ kernel, using the launch configuration specified by `c`.
444+ Each value in the `args` pack is passed to the corresponding argument in
445+ `Func`, converting it to the argument's type if necessary.
446+
326447=== New kernel bundle member functions
327448
328449This extension adds the following new functions which add kernel bundle support
329450for free function kernels.
330451
331- [_Note:_ Many of the functions in this section have a template parameter
332- `Func`, which identifies a free function kernel.
333- This kernel function may be defined in any translation unit in the application.
334- _{endnote}_]
335-
336452|====
337453a|
338454[frame=all,grid=none]
@@ -798,7 +914,8 @@ The allowed types are:
798914The following example demonstrates how to define a free function kernel and then
799915enqueue it on a device.
800916
801- ```
917+ [source,c++]
918+ ----
802919#include <sycl/sycl.hpp>
803920namespace syclext = sycl::ext::oneapi;
804921namespace syclexp = sycl::ext::oneapi::experimental;
@@ -818,23 +935,14 @@ int main() {
818935 sycl::queue q;
819936 sycl::context ctxt = q.get_context();
820937
821- // Get a kernel bundle that contains the free function kernel "iota".
822- auto exe_bndl =
823- syclexp::get_kernel_bundle<iota, sycl::bundle_state::executable>(ctxt);
824-
825- // Get a kernel object for the "iota" function from that bundle.
826- sycl::kernel k_iota = exe_bndl.ext_oneapi_get_kernel<iota>();
827-
828938 float *ptr = sycl::malloc_shared<float>(NUM, q);
829- q.submit([&](sycl::handler &cgh) {
830- // Set the values of the kernel arguments.
831- cgh.set_args(3.14f, ptr);
832939
833- sycl::nd_range ndr{{NUM}, {WGSIZE}};
834- cgh.parallel_for(ndr, k_iota);
835- }).wait();
940+ sycl::nd_range ndr{{NUM}, {WGSIZE}};
941+ syclexp::nd_launch(q, ndr, syclexp::kernel_function<iota>, 3.14f, ptr);
942+
943+ q.wait();
836944}
837- ```
945+ ----
838946
839947=== Free function kernels which are templates or overloaded
840948
@@ -843,7 +951,8 @@ It is also legal to define several overloads for a free function kernel.
843951The following example demonstrates how to get a kernel identifier in such
844952cases.
845953
846- ```
954+ [source,c++]
955+ ----
847956#include <sycl/sycl.hpp>
848957namespace syclexp = sycl::ext::oneapi::experimental;
849958
@@ -864,17 +973,26 @@ void ping(int *x) {
864973}
865974
866975int main() {
976+ sycl::queue q;
977+ sycl::context ctxt = q.get_context();
978+
979+ float *fptr = sycl::malloc_shared<float>(NUM, q);
980+ int *iptr = sycl::malloc_shared<int>(NUM, q);
981+ sycl::nd_range ndr{{NUM}, {WGSIZE}};
982+
867983 // When the free function kernel is templated, pass the address of a
868984 // specific instantiation.
869- sycl::kernel_id iota_float = syclexp::get_kernel_id <iota<float>>( );
870- sycl::kernel_id iota_int = syclexp::get_kernel_id <iota<int>>( );
985+ syclexp::nd_launch(q, ndr, syclexp::kernel_function <iota<float>>, 3.14f, fptr );
986+ syclexp::nd_launch(q, ndr, syclexp::kernel_function <iota<int>>, 3, iptr );
871987
872988 // When there are multiple overloads of a free function kernel, use a cast
873989 // to disambiguate.
874- sycl::kernel_id ping_float = syclexp::get_kernel_id<(void(*)(float))ping>();
875- sycl::kernel_id ping_int = syclexp::get_kernel_id<(void(*)(int))ping>();
990+ syclexp::nd_launch(q, ndr, syclexp::kernel_function<(void(*)(float))ping>, fptr);
991+ syclexp::nd_launch(q, ndr, syclexp::kernel_function<(void(*)(int))ping>, iptr);
992+
993+ q.wait();
876994}
877- ```
995+ ----
878996
879997
880998[[level-zero-and-opencl-compatibility]]
@@ -1065,46 +1183,6 @@ argument, effectively turning the call into a no-op.
10651183
10661184== Issues
10671185
1068- * We're pretty sure that we want to define some syntax that allows a free
1069- function kernel to be enqueued using the APIs defined in
1070- link:../experimental/sycl_ext_oneapi_enqueue_functions.asciidoc[
1071- sycl_ext_oneapi_enqueue_functions], but we haven't settled on the exact API
1072- yet.
1073- One option is like this:
1074- +
1075- ```
1076- SYCL_EXT_ONEAPI_FUNCTION_PROPERTY((syclexp::nd_range_kernel<1>))
1077- void iota(float start, float *ptr) { /*...*/ }
1078-
1079- int main() {
1080- sycl::queue q;
1081- float *ptr = sycl::malloc_shared<float>(N, q);
1082- sycl::nd_launch<iota>(q, sycl::nd_range{{N}, {WGS}}, 1.f, ptr);
1083- }
1084- ```
1085- +
1086- Another option is like this:
1087- +
1088- ```
1089- SYCL_EXT_ONEAPI_FUNCTION_PROPERTY((syclexp::nd_range_kernel<1>))
1090- void iota(float start, float *ptr) { /*...*/ }
1091-
1092- int main() {
1093- sycl::queue q;
1094- float *ptr = sycl::malloc_shared<float>(N, q);
1095- sycl::nd_launch(q, sycl::nd_range{{N}, {WGS}}, kfp<iota>, 1.f, ptr);
1096- }
1097- ```
1098- +
1099- Where `kfp` would have some nicer name.
1100- +
1101- With either form above, it seems like we have enough type information for the
1102- header to check that the types of the actual kernel arguments are implicitly
1103- convertible to the types of the formal kernel parameters, and we can raise a
1104- compile-time error if they are not.
1105- In addition, the header can perform any necessary implicit conversions when
1106- setting the kernel argument values.
1107-
11081186* We are debating whether we should allow a free function kernel to be defined
11091187 with an initial "iteration index" parameter such as:
11101188+
0 commit comments