Skip to content
Open
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
eec1fe5
[SYCL][Docs] Add sycl_ext_oneapi_inter_process_communication
steffenlarsen Sep 4, 2025
e4aefbf
Set ptracer permission in test
steffenlarsen Sep 15, 2025
7c85049
Document known issues
steffenlarsen Sep 15, 2025
6debadb
Disable on Windows L0
steffenlarsen Sep 15, 2025
d3b6c56
Add tracker
steffenlarsen Sep 16, 2025
29cbc37
Open handle directly from data
steffenlarsen Sep 17, 2025
baa805f
Adjust for UR changes
steffenlarsen Sep 17, 2025
7170da9
Address source-check issues
steffenlarsen Sep 17, 2025
ce89cfe
Address more source checks
steffenlarsen Sep 17, 2025
49ce33d
Add missing symbols
steffenlarsen Sep 17, 2025
c3f139b
Remove known issue
steffenlarsen Sep 18, 2025
d1ba0de
Add put explicit internal release arg
steffenlarsen Sep 22, 2025
4070634
Adjust sycl impl
steffenlarsen Sep 22, 2025
5d5dc0f
Merge remote-tracking branch 'intel/sycl' into ext_oneapi_inter_proce…
steffenlarsen Sep 22, 2025
db03d7b
Remove unused param
steffenlarsen Sep 22, 2025
0075694
Add missing Windows symbol
steffenlarsen Sep 22, 2025
7a332b9
Add missing newline
steffenlarsen Sep 24, 2025
ce651ce
Merge remote-tracking branch 'intel/sycl' into ext_oneapi_inter_proce…
steffenlarsen Sep 24, 2025
acf286f
Change IPC API (SYCL)
steffenlarsen Oct 6, 2025
fa44def
Change IPC API (UR)
steffenlarsen Oct 6, 2025
7873e13
Fix formatting
steffenlarsen Oct 6, 2025
2001baa
Fix failures
steffenlarsen Oct 6, 2025
3555827
Merge remote-tracking branch 'intel/sycl' into ext_oneapi_inter_proce…
steffenlarsen Oct 13, 2025
68ea80e
Address feedback
steffenlarsen Oct 13, 2025
d1d9a1a
Fix unittest
steffenlarsen Oct 13, 2025
01e3a65
Switch to std::byte
steffenlarsen Oct 13, 2025
3bf5099
Fix include
steffenlarsen Oct 13, 2025
ebdc9e3
Fix tests
steffenlarsen Oct 14, 2025
02e9613
Fix stylistic issues
steffenlarsen Oct 14, 2025
eeb0a8a
Make device lookup when failing
steffenlarsen Oct 15, 2025
0103ab4
Merge remote-tracking branch 'intel/sycl' into ext_oneapi_inter_proce…
steffenlarsen Oct 16, 2025
0f3b66f
Fix typo and add shortcut
steffenlarsen Oct 17, 2025
31aa6a3
Address code comments
steffenlarsen Oct 17, 2025
33d34a9
Address spec comments
steffenlarsen Oct 17, 2025
a1a7945
Clarify put result
steffenlarsen Oct 17, 2025
3bc26d0
Add note about matching open and close
steffenlarsen Oct 17, 2025
6cb2cf3
Add note about validity of opened ptrs after put
steffenlarsen Oct 17, 2025
7883b93
Clarify last comment a little more
steffenlarsen Oct 17, 2025
4410d55
Address newest comments
steffenlarsen Oct 17, 2025
63315d9
Make arguments const and make work-around
steffenlarsen Oct 17, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion llvm/include/llvm/SYCLLowerIR/DeviceConfigFile.td
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@ def Aspectext_oneapi_clock_sub_group : Aspect<"ext_oneapi_clock_sub_group">;
def Aspectext_oneapi_clock_work_group : Aspect<"ext_oneapi_clock_work_group">;
def Aspectext_oneapi_clock_device : Aspect<"ext_oneapi_clock_device">;
def Aspectext_oneapi_is_integrated_gpu : Aspect<"ext_oneapi_is_integrated_gpu">;
def Aspectext_oneapi_ipc_memory : Aspect<"ext_oneapi_ipc_memory">;

// Deprecated aspects
def AspectInt64_base_atomics : Aspect<"int64_base_atomics">;
Expand Down Expand Up @@ -176,7 +177,8 @@ def : TargetInfo<"__TestAspectList",
Aspectext_oneapi_clock_sub_group,
Aspectext_oneapi_clock_work_group,
Aspectext_oneapi_clock_device,
Aspectext_oneapi_is_integrated_gpu],
Aspectext_oneapi_is_integrated_gpu,
Aspectext_oneapi_ipc_memory],
[]>;
// This definition serves the only purpose of testing whether the deprecated aspect list defined in here and in SYCL RT
// match.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,225 @@
= sycl_ext_oneapi_inter_process_communication

:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en
:dpcpp: pass:[DPC++]
:endnote: &#8212;{nbsp}end{nbsp}note

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}


== Notice

[%hardbreaks]
Copyright (C) 2025 Intel Corporation. All rights reserved.

Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
permission by Khronos.


== Contact

To report problems with this extension, please open a new issue at:

https://github.com/intel/llvm/issues


== Dependencies

This extension is written against the SYCL 2020 revision 10 specification. All
references below to the "core SYCL specification" or to section numbers in the
SYCL specification refer to that revision.


== Status

This is an experimental extension specification, intended to provide early
access to features and gather community feedback. Interfaces defined in this
specification are implemented in {dpcpp}, but they are not finalized and may
change incompatibly in future versions of {dpcpp} without prior notice.
*Shipping software products should not rely on APIs defined in this
specification.*


== Overview

This extension adds the ability for SYCL programs to share device USM memory
allocations between processes. This is done by the allocating process creating
a new IPC memory handle through the new free frunctions and transferring the
returned handle data to the other processes. The other processes can use the
handle data to retrieve the corresponding device USM memory.


== Specification

=== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
specification. An implementation supporting this extension must predefine the
macro `SYCL_EXT_ONEAPI_INTER_PROCESS_COMMUNICATION` to one of the values defined
in the table below. Applications can test for the existence of this macro to
determine if the implementation supports this feature, or applications can test
the macro's value to determine which of the extension's features the
implementation supports.

[%header,cols="1,5"]
|===
|Value
|Description

|1
|The APIs of this experimental extension are not versioned, so the
feature-test macro always has this value.
|===

=== Extension to `enum class aspect`

[source]
----
namespace sycl {
enum class aspect {
...
ext_oneapi_ipc_memory
}
}
----

If a SYCL device has this aspect, that device supports the `get` and `open`
functions specified in the following section.


=== Inter-process communicable memory


This extension adds new free functions under the `ipc_memory` experimental
namespace.

```
namespace sycl::ext::oneapi::experimental::ipc_memory {

using handle_data_t = std::vector<std::byte>;

handle_data_t get(void *ptr, const sycl::context &ctx);

void put(handle_data_t &handle_data, const sycl::context &ctx);

static void *open(handle_data_t handle_data, const sycl::context &ctx,
const sycl::device &dev);

static void close(void *ptr, const sycl::context &ctx);

}
```

|====
a|
[frame=all,grid=none]
!====
a!
[source]
----
handle_data_t get(void *ptr, const sycl::context &ctx)
----
!====

_Preconditions:_ `ptr` is a pointer to USM device memory on some device _D_ in
context `ctx`. `ctx` is the same context as `ptr` was allocated against, using
the USM device memory allocation routines.

_Returns:_ An IPC "handle" to this USM memory allocation. The bytes of this
handle can be transferred to another process on the same system, and the other
process can use the handle to get a pointer to the same USM allocation through a
call to the `open` function.

_Throws:_ An exception with the `errc::feature_not_supported` error code if
device _D_ does not have `aspect::ext_oneapi_ipc_memory`.

!====
a!
[source]
----
void put(handle_data_t &handle_data, const sycl::context &ctx)
----
!====

_Preconditions:_ `handle_data` is the IPC "handle" to USM device memory that was
returned from a call to `get` either in this process or in some other process on
the same system. The USM device memory has not yet been freed in this process.

_Effects:_ Deallocates resources associated with the handle. These resources are
automatically deallocated when the USM device memory is freed, so it is not
strictly necessary to call the `put` function.

!====
a!
[source]
----
static void *open(handle_data_t &handle_data, const sycl::context &ctx,
const sycl::device &dev)
----
!====

_Preconditions:_ `handle_data` is the IPC "handle" to USM device memory that was
returned from a call to the `get` function either in this process or in some
other process on the same system. That USM device memory is accessible on device
`dev`.

_Returns:_ A pointer to the same USM device memory represented by `handle_data`.
This pointer can be used in any API taking a USM device memory pointer, except
it cannot be passed to `sycl::free`. Instead, use the `close` function to free
this memory pointer.

_Throws:_

* An exception with the `errc::feature_not_supported` error code if device
`dev` does not have `aspect::ext_oneapi_ipc_memory`.
* An exception with the `errc::invalid` error code if the handle data
`handle_data` has an unexpected number of bytes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the number of bytes is correct, but the content is garbage? Can we broaden this exception like this:

  • An exception with the errc::invalid error code if the handle data handle_data does not represent a valid IPC handle for USM memory on this host system.

That would also cover the case when the handle had the wrong number of bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am reluctant to promise an error code if handle data is corrupt. The byte size is something we can check at a SYCL/UR level, but the validity of the data is up to UMF to decide and the API documentation doesn't specify which error is returned if the data is off. Maybe a precondition?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should push this requirement down to the UMF and down to Level Zero if necessary. They should be able to define an error code for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will open an issue against UMF, but I think it might be tricky. What happens if the data looks correct to UMF, but using it causes UB or invalid access in the implementation? I suppose that's something they can look into though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UMF shouldn't look at the handle content at all. It should just pass it to the underlying driver API, and that API should return an error code if the handle content is invalid. This assumes that each driver is able to diagnose an error in such a case, but that seems quite reasonable to me. For example, cudaIpcOpenMemHandle is defined to return cudaErrorInvalidResourceHandle in this case. I don't see an error code listed for Level Zero, but that seems like a bug. They should be able to return an error code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be some wrapper information in the UMF IPC handle, but I don't know the exact details. Maybe @vinser52 has some insight into this?


!====
a!
[source]
----
static void close(void *ptr, const sycl::context &ctx)
----
!====

_Effects:_ Closes a device USM pointer previously returned by a call to
the `open` function.

|====


== Issues

=== Level Zero file descriptor duplication dependency

The IPC memory APIs in Level Zero on Linux currently requires the ability to
duplicate file descriptors between processes. For security this is not allowed
by default on Linux-based systems, so in order for the IPC memory APIs to work
with Level Zero on Linux the user must either call `prctl(PR_SET_PTRACER, ...)`
in the IPC handle owner process or enable the functionality globally using

```bash
sudo bash -c "echo 0 > /proc/sys/kernel/yama/ptrace_scope"
```

See also https://github.com/oneapi-src/unified-memory-framework/tree/main?tab=readme-ov-file#level-zero-memory-provider.


=== Level Zero IPC memory Windows support

The new IPC memory APIs are not currently supported on the Level Zero backend on
Windows systems.

42 changes: 42 additions & 0 deletions sycl/include/sycl/ext/oneapi/experimental/ipc_memory.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
//==------- ipc_memory.hpp --- SYCL inter-process communicable memory ------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#pragma once

#if (!defined(_HAS_STD_BYTE) || _HAS_STD_BYTE != 0)

#include <sycl/detail/defines_elementary.hpp>
#include <sycl/detail/export.hpp>

#include <cstddef>
#include <vector>

namespace sycl {
inline namespace _V1 {

class context;
class device;

namespace ext::oneapi::experimental::ipc_memory {

using handle_data_t = std::vector<std::byte>;

__SYCL_EXPORT handle_data_t get(void *Ptr, const sycl::context &Ctx);

__SYCL_EXPORT void put(handle_data_t &HandleData, const sycl::context &Ctx);

__SYCL_EXPORT void *open(handle_data_t &HandleData, const sycl::context &Ctx,
const sycl::device &Dev);

__SYCL_EXPORT void close(void *Ptr, const sycl::context &Ctx);

} // namespace ext::oneapi::experimental::ipc_memory
} // namespace _V1
} // namespace sycl

#endif
1 change: 1 addition & 0 deletions sycl/include/sycl/info/aspects.def
Original file line number Diff line number Diff line change
Expand Up @@ -84,3 +84,4 @@ __SYCL_ASPECT(ext_oneapi_clock_sub_group, 91)
__SYCL_ASPECT(ext_oneapi_clock_work_group, 92)
__SYCL_ASPECT(ext_oneapi_clock_device, 93)
__SYCL_ASPECT(ext_oneapi_is_integrated_gpu, 94)
__SYCL_ASPECT(ext_oneapi_ipc_memory, 95)
1 change: 1 addition & 0 deletions sycl/include/sycl/sycl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,7 @@ can be disabled by setting SYCL_DISABLE_FSYCL_SYCLHPP_WARNING macro.")
#include <sycl/ext/oneapi/experimental/group_helpers_sorters.hpp>
#include <sycl/ext/oneapi/experimental/group_load_store.hpp>
#include <sycl/ext/oneapi/experimental/group_sort.hpp>
#include <sycl/ext/oneapi/experimental/ipc_memory.hpp>
#include <sycl/ext/oneapi/experimental/prefetch.hpp>
#include <sycl/ext/oneapi/experimental/profiling_tag.hpp>
#include <sycl/ext/oneapi/experimental/raw_kernel_arg.hpp>
Expand Down
1 change: 1 addition & 0 deletions sycl/source/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -322,6 +322,7 @@ set(SYCL_COMMON_SOURCES
"handler.cpp"
"image.cpp"
"interop_handle.cpp"
"ipc_memory.cpp"
"kernel.cpp"
"kernel_bundle.cpp"
"physical_mem.cpp"
Expand Down
4 changes: 4 additions & 0 deletions sycl/source/detail/device_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1597,6 +1597,10 @@ class device_impl : public std::enable_shared_from_this<device_impl> {
get_info_impl_nocheck<UR_DEVICE_INFO_IS_INTEGRATED_GPU>().value_or(
0);
}
CASE(ext_oneapi_ipc_memory) {
return get_info_impl_nocheck<UR_DEVICE_INFO_IPC_MEMORY_SUPPORT_EXP>()
.value_or(0);
}
else {
return false; // This device aspect has not been implemented yet.
}
Expand Down
1 change: 1 addition & 0 deletions sycl/source/detail/ur_device_info_ret_types.inc
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,7 @@ MAP(UR_DEVICE_INFO_NODE_MASK, uint32_t)
// These aren't present in the specification, extracted from ur_api.h
// instead.
MAP(UR_DEVICE_INFO_2D_BLOCK_ARRAY_CAPABILITIES_EXP, ur_exp_device_2d_block_array_capability_flags_t)
MAP(UR_DEVICE_INFO_IPC_MEMORY_SUPPORT_EXP, ur_bool_t)
MAP(UR_DEVICE_INFO_ASYNC_USM_ALLOCATIONS_SUPPORT_EXP, ur_bool_t)
MAP(UR_DEVICE_INFO_BINDLESS_IMAGES_1D_USM_SUPPORT_EXP, ur_bool_t)
MAP(UR_DEVICE_INFO_BINDLESS_IMAGES_2D_USM_SUPPORT_EXP, ur_bool_t)
Expand Down
1 change: 1 addition & 0 deletions sycl/source/feature_test.hpp.in
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,7 @@ inline namespace _V1 {
#define SYCL_KHR_DEFAULT_CONTEXT 1
#define SYCL_EXT_INTEL_EVENT_MODE 1
#define SYCL_EXT_ONEAPI_TANGLE 1
#define SYCL_EXT_ONEAPI_IPC 1

// Unfinished KHR extensions. These extensions are only available if the
// __DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS macro is defined.
Expand Down
Loading