Skip to content

[offload] proposal to add context to the offloading API #171129

@pbalcer

Description

@pbalcer

The current liboffload API lacks a concept of context, relying instead on devices (e.g., in USM allocation) or a global map of allocations (e.g., for memory info queries or free). This design can negatively impact performance and makes it difficult to fully implement the SYCL 2020 specification, especially for interoperability with native APIs such as OpenCL and Level Zero.

Additionally, the lack of context makes access to host allocation ambiguous, for potential Level-Zero and OpenCL plugin implementations. Excerpts from specifications:
Level Zero: “Host allocations are accessible by the host and all devices within the driver’s context”.
https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/api.html#zememallochost

OpenCL: “If the specified OpenCL device supports cross-device access capabilities, the allocation is also accessible by other OpenCL devices in the context that have cross-device access capabilities.”.
https://registry.khronos.org/OpenCL/extensions/intel/cl_intel_unified_shared_memory.html#_shared_allocations

In SYCL, USM “host” memory is accessible on a device D if both of the following are true:

  • Device D has the aspect “usm_host_allocations”, and
  • Device D is a member of the context used to allocate the USM host memory.

Without context, the behavior is essentially implementation defined.

Adding context would help: a) improve performance by eliminating the global allocation map, b)
reduce implementation complexity, c) enable full support for SYCL 2020 native backend interop capabilities, d) allow us to map more cleanly to existing OpenCL and Level-Zero primitives.

Context could be defined in a way that would allow plugin implementations to omit using native platform context, instead relying on it purely for the purpose of dispatching.

The proposal is to introduce a new context entity, created at runtime as a collection of devices:

ol_result_t OL_APICALL olCreateContext(size_t DevicesCount, ol_device_handle_t *Devices, ol_context_handle_t *Context);

Or from a native backend (OpenCL, Level-Zero) context:

ol_result_t OL_APICALL olCreateContextFromNative(void *NativeContextHandle, ol_context_handle_t *Context);

Context would then be added as a parameter to olGetMemInfo, olMemAlloc, and olMemFree, like so:

OL_APIEXPORT ol_result_t OL_APICALL olMemFree(ol_context_handle_t Context, void *Address);

(Analogous changes would apply to olMemAlloc, olGetMemInfo, etc.)


SYCL specification defines context as "the runtime data structures and state required by a SYCL backend API to interact with a group of devices associated with a platform.". In practice, all runtime APIs that operate on memory, take context as an argument, in order to identify what memory is owned by what set of devices. For example, here's how SYCL defines its alloc and free methods:

void* sycl::malloc_host(size_t numBytes, const sycl::context& syclContext);

void sycl::free(void* ptr, const sycl::context& syclContext);

https://github.khronos.org/SYCL_Reference/iface/usm_allocations.html#sycl-malloc-host

Because these functions accept context, the implementation can dispatch these calls to the appropriate platform and set of devices.

SYCL contexts map 1:1 to OpenCL and Level-Zero primitives and, with the use of special context-management functions, to CUDA and HIP APIs, which is how Intel SYCL compiler supports context today on those platforms: https://github.com/intel/llvm/blob/2fea9b6ba4361a67ba45f6c786155c94be697ad1/unified-runtime/source/adapters/cuda/context.hpp#L157.

To facilitate interoperability with existing OpenCL / level-zero based applications, SYCL defines additional functions that allow users to create a context (or retrieve) using native backend handles:

const sycl::context sycl_context = sycl::make_context<sycl::backend::ext_oneapi_level_zero>(hContextInteropInput);

Here's an example of how this is used to enable interop between OpenMP, SYCL and Level-Zero in a single application:
https://github.com/argonne-lcf/HPC-Patterns/blob/main/sycl_omp_ze_interopt/interop_omp_ze_sycl.cpp#L65

ping @jhuber6 @KseniyaTikhomirova @gmlueck

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions