intel · gmlueck · Oct 7, 2025 · gmlueck · Oct 10, 2025 · sergey-semenov
@@ -0,0 +1,213 @@
+= sycl_ext_oneapi_record_event
+
+:source-highlighter: coderay
+:coderay-linenums-mode: table
+
+// This section needs to be after the document title.
+:doctype: book
+:toc2:
+:toc: left
+:encoding: utf-8
+:lang: en
+:dpcpp: pass:[DPC++]
+:endnote: &#8212;{nbsp}end{nbsp}note
+
+// Set the default source code type in this document to C++,
+// for syntax highlighting purposes.  This is needed because
+// docbook uses c++ and html5 uses cpp.
+:language: {basebackend@docbook:c++:cpp}
+
+
+== Notice
+
+[%hardbreaks]
+Copyright (C) 2025 Intel Corporation.  All rights reserved.
+
+Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
+of The Khronos Group Inc.  OpenCL(TM) is a trademark of Apple Inc. used by
+permission by Khronos.
+
+
+== Contact
+
+To report problems with this extension, please open a new issue at:
+
+https://github.com/intel/llvm/issues
+
+
+== Dependencies
+
+This extension is written against the SYCL 2020 revision 10 specification.
+All references below to the "core SYCL specification" or to section numbers in
+the SYCL specification refer to that revision.
+
+This extension also depends on the following other SYCL extensions:
+
+* link:../experimental/sycl_ext_oneapi_enqueue_functions.asciidoc[
+  sycl_ext_oneapi_enqueue_functions]
+
+
+== Status
+
+This is a proposed extension specification, intended to gather community
+feedback.
+Interfaces defined in this specification may not be implemented yet or may be in
+a preliminary state.
+The specification itself may also change in incompatible ways before it is
+finalized.
+*Shipping software products should not rely on APIs defined in this
+specification.*
+
+
+== Overview
+
+This extension adds the ability to reuse the same `event` object in multiple
+command submissions, rather than creating a new event for each submission.
+This pattern may perform better on some implementations because fewer event
+objects need to be created and destroyed.
+The pattern may also be more familiar to users porting CUDA code to SYCL.
+
+
+== Specification
+
+=== Feature test macro
+
+This extension provides a feature-test macro as described in the core SYCL
+specification.  An implementation supporting this extension must predefine the
+macro `SYCL_EXT_ONEAPI_RECORD_EVENT` to one of the values defined in the table
+below.  Applications can test for the existence of this macro to determine if
+the implementation supports this feature, or applications can test the macro's
+value to determine which of the extension's features the implementation
+supports.
+
+[%header,cols="1,5"]
+|===
+|Value
+|Description
+
+|1
+|The APIs of this experimental extension are not versioned, so the
+ feature-test macro always has this value.
+|===
+
+=== New kernel launch property
+
+This extension adds a new kernel launch property:
+
+[source,c++]
+----
+namespace sycl::ext::oneapi::experimental {
+
+struct record_event {
+  record_event(event* evt);  (1)
+};
+using record_event_key = record_event;
+
+} // namespace sycl::ext::oneapi::experimental
+----
+
+This property may be passed as a launch property to the following command
+submission functions from
+link:../experimental/sycl_ext_oneapi_enqueue_functions.asciidoc[
+sycl_ext_oneapi_enqueue_functions]:
+
+* The `submit` overload that takes parameters of type `queue` and `Properties`.
+* The `single_task` overloads that take parameters of type `queue` and
+  `Properties`.
+* The `parallel_for` overloads that take parameters of type `queue` and
+  `launch_config`.
+* The `nd_launch` overloads that take parameters of type `queue` and
+  `launch_config`.
+* The `memcpy` overload that takes parameters of type `queue` and `Properties`.
+* The `copy` overload that takes parameters of type `queue` and `Properties`.
+* The `memset` overload that takes parameters of type `queue` and `Properties`.
+* The `fill` overload that takes parameters of type `queue` and `Properties`.
+* The `prefetch` overload that takes parameters of type `queue` and
+  `Properties`.
+* The `mem_advise` overload that takes parameters of type `queue` and
+  `Properties`.
+* The `barrier` overload that takes parameters of type `queue` and `Properties`.
+* The `partial_barrier` overload that takes parameters of type `queue` and
+  `Properties`.
+* The `execute_graph` overload that takes parameters of type `queue` and
+  `Properties`.
+
+_Effects (1)_: Constructs a `record_event` property with a pointer to an `event`
+object.
+When `evt` is not null, the following happens.
+The status of the event is disassociated with any previously submitted command,
+and its status is reset to `info::event_command_status::submitted`.
+For the `submit` function, this happens when the command group function returns
+back to `submit`.
+The event is then associated with the newly submitted command.
+Assuming the event remains associated with this command, the event's status
+changes according to the execution status of that command.
+When `evt` is null, the property has no effect on the command submission.
+
+_Remarks:_
+
+* If a recorded event is used as a command dependency for some other command
+  _C2_ (e.g. via `handler::depends_on`), the dependency is captured at the point
+  when _C2_ is submitted.
+  The dependency does _not_ change if the event is subsequently overwritten via
+  `record_event`.
+
+* If another host thread is blocked in a call to `event::wait` when that same
+  event is associated with a new command via `record_event`, it is unspecified
+  whether the call to `event:wait` unblocks.
+
+
+== Example
+
+[source,c++]
+----
+#include <sycl/sycl.hpp>
+namespace syclex = sycl::ext::oneapi::experimental;
+
+int main() {
+  sycl::queue q1;
+  sycl::queue q2;
+  sycl::event e;
+  sycl::range r{GLOBAL};
+
+  // Launch a command and record an event which tracks its completion.
+  syclex::launch_config cfg{r, syclex::record_event{&e}};
+  syclex::parallel_for(q1, cfg, [=](sycl::item<> it) { /* ... */ });
+
+  // Launch another command which depends on that event and also
+  // record completion of this new command using the same event.
+  syclex::submit(q2, syclex::record_event{&e}, [&](sycl::handler cgh) {
+    cgh.depends_on(e);
+    syclex::parallel_for(cgh, r, [=](sycl::item<> it) { /* ... */ });
+  });
+
+  // Wait for both commands to complete.
+  e.wait();
+}
+----
+
+
+== Implementation notes
+
+It is expected that the implementation will often be able to reuse the
+underlying backend event object when a SYCL event is passed to `record_event`.
+However, there will still be cases when the implementation needs to release the
+underlying backend event and create a new one.
+For example, this will happen when the existing backend event is from a
+different backend or from a different context than the command being submitted.
+In these cases, we expect that the implementation will release the backend event
+and associate the SYCL event with a new backend event.
+
+
+== Issues
+
+* Is it possible to implement the behavior specified above regarding
+  `event::wait` and `record_event`?
+  What if the implementation needs to release the backend event when another
+  host thread is blocked in a call to `event:wait`?
+  Can we guarantee that the call to `event::wait` either remains blocked or
+  becomes unblocked?
+  (Either is fine.)
+  Or, is it possible that this will lead to a crash?
+  If a crash is possible, we need to weaken the specification to say this
+  condition is UB.