[EP ABI] Initial support for kernel-based EPs #26206
Pull Request Overview
This draft PR implements support for kernel-based execution providers (EPs) within the ONNX Runtime EP plugin architecture. The changes enable plugin EPs to register custom kernels directly with the ORT runtime, expanding beyond the current node-based computation model.
- Adds comprehensive kernel registration infrastructure for plugin EPs
- Implements memory copy kernels as examples (MemcpyFromHost/MemcpyToHost)
- Extends the EP API with kernel definition and creation functionality
Reviewed Changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
onnxruntime/test/framework/ep_plugin_provider_test.cc | Updates test to pass kernel registry parameter |
onnxruntime/test/autoep/library/kernels/utils.h | Defines kernel creation utilities and macros |
onnxruntime/test/autoep/library/kernels/memcpy.h | Declares example Memcpy kernel interface |
onnxruntime/test/autoep/library/kernels/memcpy.cc | Implements example Memcpy kernel with registration |
onnxruntime/test/autoep/library/kernels/data_types.h | Declares MLDataTypes singleton for type management |
onnxruntime/test/autoep/library/kernels/data_types.cc | Implements MLDataTypes for tensor type retrieval |
onnxruntime/test/autoep/library/ep_kernel_registration.h | Declares kernel registration functions |
onnxruntime/test/autoep/library/ep_kernel_registration.cc | Implements kernel registration logic |
onnxruntime/test/autoep/library/ep.h | Adds kernel creation method declarations to EP |
onnxruntime/test/autoep/library/ep.cc | Implements kernel creation methods in example EP |
onnxruntime/core/session/utils.h | Declares CopyTensors utility function |
onnxruntime/core/session/utils.cc | Implements CopyTensors utility function |
onnxruntime/core/session/provider_policy_context.cc | Updates EP creation to use new factory method |
onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.h | Extends PluginExecutionProvider with kernel registry support |
onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.cc | Implements kernel registry initialization in plugin EP |
onnxruntime/core/session/plugin_ep/ep_kernel_registration.h | Declares kernel registration infrastructure |
onnxruntime/core/session/plugin_ep/ep_kernel_registration.cc | Implements plugin EP kernel wrapper and registration |
onnxruntime/core/session/plugin_ep/ep_api.h | Declares new EP API functions for kernel support |
onnxruntime/core/session/plugin_ep/ep_api.cc | Implements new EP API functions for kernel support |
onnxruntime/core/session/onnxruntime_c_api.cc | Refactors CopyTensors to use shared utility |
include/onnxruntime/core/session/onnxruntime_ep_c_api.h | Adds kernel-related types and API declarations |
include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Implements C++ wrapper methods for kernel APIs |
include/onnxruntime/core/session/onnxruntime_cxx_api.h | Declares C++ KernelDefBuilder class |
cmake/onnxruntime_unittests.cmake | Updates build to include kernel source files |
Pull Request Overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
std::pair<int, int> GetSinceVersion() const;

///< Wraps OrtEpApi::KernelDef_GetExecutionProvider
const char* GetExecutionProvider() const;
If any of the information for any getters is optional, suggest returning a status
The underlying C API function `OrtEpApi::KernelDef_GetExecutionProvider` returns the `const char*` directly (doesn't return a status).
Sorry, I don't fully understand the comment. Is the request to make this return a status instead? What is meant by "information for any getters is optional"?
This was a general comment not to be bound to throwing exceptions by default. In this case, the data is returned without error reporting.
  return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "Invalid arguments provided to CopyTensors.");
}

const OrtMemoryInfo* src_memory_info = nullptr;
Note: moved this into a shared utility function that can be used by the new API KernelInfo_CopyTensors
 *
 * \since Version 1.24.
 */
typedef OrtStatus*(ORT_API_CALL* OrtKernelCreateFunc)(_In_ OrtKernelCreateContext* ctx,  // unused/reserved as of 1.24
We can probably remove this `ctx` parameter. It is a stand-in for the `FuncManager` parameter in the related `KernelCreateFn` used by provider-bridge EPs:
using KernelCreateFn = std::function<Status(FuncManager& func_mgr, const OpKernelInfo& info, std::unique_ptr<OpKernel>& out)>;
It doesn't look like any EPs in the ORT code base use the `FuncManager` parameter at all, but I kept it here (with a more generic name) just in case we find a use for it in the future. Would appreciate opinions.
/// Singleton that returns sets of OrtMLDataType instances using the public C API.
/// Analogous to the internal utilities in include/onnxruntime/core/framework/data_types.h
/// </summary>
class MLDataTypes {
Considering moving this to the public C++ API header (but not a singleton). Seems like all kernel-based plugin EPs would benefit from this.
ORT_API2_STATUS(GetTensorMLDataType, _In_ ONNXTensorElementDataType elem_type,
                _Outptr_ const OrtMLDataType** out);
I currently only added an API to get tensor data types. We would need to add similar APIs for sequences, maps, etc.
Also, I'm not too sure if we should keep using the term "ML data type". I kept it to remain consistent with the internal names, but perhaps we can rename?
 */
ORT_API2_STATUS(KernelDefBuilder_Build, _In_ OrtKernelDefBuilder* kernel_def_builder,
                _Outptr_ OrtKernelDef** kernel_def_out);
This PR does not yet add all KernelDefBuilder functions. It's missing aliasing, "may inplace". However, these things may not be used commonly and could be added later.
 */
ORT_API2_STATUS(KernelDef_GetOutputMemType, _In_ const OrtKernelDef* kernel_def,
                _In_ size_t output_index, _Out_ OrtMemType* mem_type);
I also have not added all getters for KernelDef because they are not really used by EPs. An EP retrieves a kernel def during `GetCapability` to check if a kernel for a node has been registered. Notably, there is only one EP (ACL EP) that actually gets a property from a `KernelDef` returned by a lookup, and that property is the operator type, which it could instead get from the node.
#include "utils.h"

ONNX_OPERATOR_KERNEL_EX(
    MemcpyFromHost,
does this EP need to register its own memcpy kernels or can it use the generic ones from the CPU EP (added in #26088)?
It doesn't need it. It was a way to test the kernel registration utilities. Perhaps it would be best to create a different kernel-based example EP and leave this one unchanged.
if we use this as an example EP implementation for reference, it might be better to show EP authors that they can avoid implementing their own memcpy kernels unless they require some special behavior not provided by the generic ones.
also, the testing of the generic memcpy kernels was relying on this EP not providing its own, but we could update the test set up if needed. on a semi-related note, I don't know how well OpTester will work with an EP that has both a kernel registry and support for compiling nodes.
KernelDefBuilder& SetExecutionProvider(const char* ep_name);
KernelDefBuilder& SetInputMemType(size_t input_index, OrtMemType mem_type);
KernelDefBuilder& SetOutputMemType(size_t output_index, OrtMemType mem_type);
KernelDefBuilder& AddTypeConstraint(const char* arg_name, const OrtMLDataType* data_types);
nit: it is one data type with this overload, right?
Suggested change:
- KernelDefBuilder& AddTypeConstraint(const char* arg_name, const OrtMLDataType* data_types);
+ KernelDefBuilder& AddTypeConstraint(const char* arg_name, const OrtMLDataType* data_type);
 *
 * \since Version 1.24
 */
ORT_API2_STATUS(KernelInfo_CopyTensors, _In_ const OrtKernelInfo* info,
is it possible to reuse the `CopyTensors` API or do we need a new one?
The existing `CopyTensors` API takes an `OrtEnv` as input, which is not available to EPs (if I'm not mistaken)
const onnxruntime::KernelCreateInfo* create_info =
    graph_support_info->kernel_lookup.LookUpKernel(ep_node->GetInternalNode());

*out_kernel_def = static_cast<const OrtKernelDef*>(create_info->kernel_def.get());
should we check whether the lookup fails to find anything (`create_info == nullptr`)?
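A guard along the following lines would cover the failed-lookup case raised here. This is a minimal, self-contained sketch: `KernelDef`, `KernelCreateInfo`, `KernelLookup`, and `LookUpKernelDef` are simplified stand-ins invented for illustration, not the ORT types. Reporting a miss as success with a null output (as assumed below), rather than as an error status, is one possible design and not necessarily what the PR settles on.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the internal ORT types; not the real definitions.
struct KernelDef {
  std::string op_type;
};

struct KernelCreateInfo {
  std::unique_ptr<KernelDef> kernel_def;
};

class KernelLookup {
 public:
  void Add(const std::string& node_name, const std::string& op_type) {
    auto info = std::make_unique<KernelCreateInfo>();
    info->kernel_def = std::make_unique<KernelDef>(KernelDef{op_type});
    entries_[node_name] = std::move(info);
  }

  // Returns nullptr when no kernel is registered for the node.
  const KernelCreateInfo* LookUpKernel(const std::string& node_name) const {
    auto it = entries_.find(node_name);
    return it == entries_.end() ? nullptr : it->second.get();
  }

 private:
  std::map<std::string, std::unique_ptr<KernelCreateInfo>> entries_;
};

// Assumed behavior: a miss is not an error; the output is set to nullptr.
// Returns false only for an invalid (null) output argument.
bool LookUpKernelDef(const KernelLookup& lookup, const std::string& node_name,
                     const KernelDef** out_kernel_def) {
  if (out_kernel_def == nullptr) return false;
  const KernelCreateInfo* create_info = lookup.LookUpKernel(node_name);
  // Only dereference create_info after confirming the lookup succeeded.
  *out_kernel_def = (create_info != nullptr) ? create_info->kernel_def.get() : nullptr;
  return true;
}
```

With this shape, a caller in `GetCapability` can treat a null kernel def as "no kernel registered for this node" without a separate error path.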
const OrtLogger& logger,
/*out*/ std::unique_ptr<PluginExecutionProvider>& plugin_ep);

explicit PluginExecutionProvider(UniqueOrtEp ep, const OrtSessionOptions& session_options, OrtEpFactory& ep_factory,
when should one use `PluginExecutionProvider::PluginExecutionProvider()` vs. `PluginExecutionProvider::Create()`?
// Table of BuildKernelCreateInfo functions for each operator
static const BuildKernelCreateInfoFn build_kernel_create_info_funcs[] = {
    BuildKernelCreateInfo<void>,  // Dummy to avoid table becoming empty.
I think the dummy entry was originally added to support reduced op builds for certain EPs. we probably don't need it in this example.
}

if (status != nullptr) {
  ep_api.ReleaseKernelRegistry(*kernel_registry);
should we add a C++ API type for OrtKernelRegistry?
}

static void CheckFileIsEmpty(const PathString& filename) {
  std::ifstream ifs{filename};
also check that the file was opened? ASSERT_TRUE(ifs)
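The open check matters because, depending on how emptiness is tested, a stream that failed to open can look the same as an empty file. The sketch below illustrates this with the standard library only: `FileState` and `CheckFile` are hypothetical names, and a plain `std::string` path is assumed in place of `PathString`. In the actual test, the first branch would correspond to `ASSERT_TRUE(ifs)`.

```cpp
#include <fstream>
#include <string>

// Hypothetical result type used only for this illustration.
enum class FileState { kNotOpened, kEmpty, kNotEmpty };

FileState CheckFile(const std::string& filename) {
  std::ifstream ifs{filename};
  // Without this check, peek() on an unopened stream also yields EOF,
  // so a missing file would be reported as empty.
  if (!ifs) return FileState::kNotOpened;
  // peek() returns EOF immediately when the file has no content.
  return ifs.peek() == std::ifstream::traits_type::eof() ? FileState::kEmpty
                                                         : FileState::kNotEmpty;
}
```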
Description
This PR adds an initial set of C APIs necessary to support kernel registration for plugin EPs.
Example use
The example plugin EP implementation now registers `MemcpyFromHost` and `MemcpyToHost` operator kernels using the new APIs. New utilities in the example implementation make the process of defining operator kernels very similar to the existing process used by provider-bridge EPs.
First, the operator kernel class is defined:
Then, a macro defines a function that can be called to register the operator with the EP's kernel registry:
Lastly, the functions defined by the above macro are entered into a table:
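The three steps above can be sketched as follows. This is a rough, self-contained illustration of the pattern only: `KernelBase`, `KernelCreateInfo`, `SKETCH_OPERATOR_KERNEL`, `kBuildFns`, and `BuildRegistry` are hypothetical stand-ins, not the PR's `OrtKernelImpl`, `ONNX_OPERATOR_KERNEL_EX`, or `OrtKernelRegistry`, whose real signatures carry more information (versions, type constraints, memory types).

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a kernel interface.
struct KernelBase {
  virtual ~KernelBase() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical stand-in for "kernel creation information".
struct KernelCreateInfo {
  std::string op_type;
  std::function<std::unique_ptr<KernelBase>()> create;
};

using BuildKernelCreateInfoFn = KernelCreateInfo (*)();

// Step 1: the operator kernel classes.
struct MemcpyFromHost : KernelBase {
  std::string Name() const override { return "MemcpyFromHost"; }
};
struct MemcpyToHost : KernelBase {
  std::string Name() const override { return "MemcpyToHost"; }
};

// Step 2: a macro that defines one registration function per operator.
// Simplified, hypothetical analogue of a kernel registration macro.
#define SKETCH_OPERATOR_KERNEL(op)                                         \
  KernelCreateInfo BuildKernelCreateInfo_##op() {                          \
    return KernelCreateInfo{                                               \
        #op, []() -> std::unique_ptr<KernelBase> {                         \
          return std::make_unique<op>();                                   \
        }};                                                                \
  }

SKETCH_OPERATOR_KERNEL(MemcpyFromHost)
SKETCH_OPERATOR_KERNEL(MemcpyToHost)

// Step 3: the functions defined by the macro are entered into a table.
static const BuildKernelCreateInfoFn kBuildFns[] = {
    BuildKernelCreateInfo_MemcpyFromHost,
    BuildKernelCreateInfo_MemcpyToHost,
};

// The EP walks the table to populate its registry.
std::vector<KernelCreateInfo> BuildRegistry() {
  std::vector<KernelCreateInfo> registry;
  for (BuildKernelCreateInfoFn fn : kBuildFns) registry.push_back(fn());
  return registry;
}
```

Keeping registration in a table of function pointers means adding an operator is a two-line change (one macro call, one table entry), which mirrors the workflow provider-bridge EP authors already know.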
The example EP processes the entries in the above table to add information about the supported operator kernels to the EP's kernel registry (`OrtKernelRegistry`).
Additionally, during the call to `OrtEp::GetCapability`, an EP can now look up registered kernel definitions via the new API `EpGraphSupportInfo_LookUpKernel`. Note that an EP would not normally look up kernels for `Memcpy**Host`, which are inserted by ORT. Instead, it would be used to look up other registered operator kernels like `Conv`, for example.
EP implementation details
An EP instance (i.e., `OrtEp`) that needs to register operator kernels with ONNX Runtime must implement the following `OrtEp::GetKernelRegistry()` function:
Returns: `OrtStatus*`
Parameters:
- `OrtEp* this_ptr`: The OrtEp instance.
- `const OrtKernelRegistry** kernel_registry`: Output parameter set to the EP's kernel registry, which must remain valid throughout the lifetime of the EP.
Remarks: A kernel registry contains kernel creation information for operator kernels supported by an EP.
Note: Implementation of this function is optional. If set to NULL, ORT assumes the EP compiles nodes.
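The optional-function convention can be sketched as below. `Ep`, `KernelRegistry`, `Status`, `ReturnOwnedRegistry`, and `QueryKernelRegistry` are hypothetical stand-ins for illustration, not the ORT definitions. The sketch assumes a null status means success and shows one way to satisfy the lifetime requirement, by storing the registry inside the EP object itself.

```cpp
// Hypothetical stand-ins for OrtKernelRegistry and OrtStatus.
struct KernelRegistry {};
struct Status {};  // assumed convention: nullptr means success

// Hypothetical stand-in for OrtEp.
struct Ep {
  // Optional: left as nullptr by an EP that compiles nodes.
  Status* (*GetKernelRegistry)(Ep* this_ptr, const KernelRegistry** kernel_registry) = nullptr;
  // Stored in the EP so it stays valid for the EP's lifetime.
  KernelRegistry registry;
};

// EP-side implementation: hands out a pointer to the EP-owned registry.
Status* ReturnOwnedRegistry(Ep* this_ptr, const KernelRegistry** kernel_registry) {
  *kernel_registry = &this_ptr->registry;
  return nullptr;
}

// Runtime-side consumption: nullptr means "no registry; EP compiles nodes".
const KernelRegistry* QueryKernelRegistry(Ep& ep) {
  if (ep.GetKernelRegistry == nullptr) return nullptr;
  const KernelRegistry* registry = nullptr;
  Status* status = ep.GetKernelRegistry(&ep, &registry);
  return status == nullptr ? registry : nullptr;
}
```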
If defined by the EP, the `OrtEp::GetKernelRegistry()` function is called by ONNX Runtime after creating an instance of the `OrtEp` in order to retrieve the EP's kernel registry.
APIs used by EP to add entries to kernel registry
An EP's kernel registry (`OrtKernelRegistry`) contains information necessary for the (later) creation of operator kernels supported by an EP. Conceptually, a kernel registry contains an array of "kernel creation information" elements, one per operator. Each such element consists of:
- A kernel definition (`OrtKernelDef`), which specifies operator type, supported versions, type constraints, I/O memory types, etc.
- A function of type `OrtKernelCreateFunc` that ORT calls to create an instance of the kernel (`OrtKernelImpl`).
- Custom state (e.g., the `OrtEp` instance) that is passed to the `OrtKernelCreateFunc`.
An EP uses the following `OrtEpApi::KernelRegistry_AddKernel()` function to add an entry for one supported operator.
Returns: `OrtStatus*`
Parameters:
- `OrtKernelRegistry* kernel_registry`: The OrtKernelRegistry instance.
- `const OrtKernelDef* kernel_def`: The kernel definition, which includes operator type, version, EP name, type constraints, etc.
- `OrtKernelCreateFunc kernel_create_func`: Function that creates an instance of the operator kernel as an OrtKernelImpl instance.
- `void* kernel_create_func_state`: Custom state passed to the kernel creation function. Can be null.
Remarks: Refer to OrtEp::GetKernelRegistry, which returns an EP's kernel registry to ORT.
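The registry entry described above (definition, creation function, optional state) can be modeled as in the following sketch. Every name here is a simplified, hypothetical stand-in rather than the ORT C API: `KernelCreateFunc` returns a `bool` instead of an `OrtStatus*`, and `KernelImpl` is a plain struct.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins; the real types are OrtKernelDef and OrtKernelImpl.
struct KernelDef {
  std::string op_type;
  std::string ep_name;
  int since_version;
};

struct KernelImpl {
  std::string op_type;
  int scale;
};

// Simplified analogue of a kernel creation function: it receives the state
// pointer that was registered alongside the kernel definition.
using KernelCreateFunc = bool (*)(void* state, const KernelDef& def, KernelImpl* out);

struct RegistryEntry {
  KernelDef def;
  KernelCreateFunc create;
  void* state;  // may be null
};

class KernelRegistry {
 public:
  // Adds an entry for one supported operator. State may be null; the
  // creation function may not.
  bool AddKernel(const KernelDef& def, KernelCreateFunc create, void* state) {
    if (create == nullptr) return false;
    entries_.push_back({def, create, state});
    return true;
  }

  // Returns the first entry for op_type, or nullptr if none is registered.
  const RegistryEntry* Find(const std::string& op_type) const {
    for (const RegistryEntry& entry : entries_) {
      if (entry.def.op_type == op_type) return &entry;
    }
    return nullptr;
  }

 private:
  std::vector<RegistryEntry> entries_;
};

// Example creation function: reads an int from the state, defaulting to 1.
bool CreateScaled(void* state, const KernelDef& def, KernelImpl* out) {
  const int scale = (state != nullptr) ? *static_cast<int*>(state) : 1;
  *out = KernelImpl{def.op_type, scale};
  return true;
}
```

Passing state through a `void*` keeps the creation function a plain function pointer, which is what a C ABI requires; the cost is that the EP must keep the pointed-to state alive for as long as the registry can invoke the function.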
Building a kernel definition
An EP uses a kernel definition builder (`OrtKernelDefBuilder`) to create a kernel definition (`OrtKernelDef`). The following table lists some of the C APIs related to building a kernel definition. The above `ONNX_OPERATOR_KERNEL_EX` macro uses these APIs.
Returns: `OrtStatus*`
Parameters:
- `OrtKernelDefBuilder* kernel_def_builder`: The OrtKernelDefBuilder instance.
- `const char* op_type`: A null-terminated string representing the operator type.
Returns: `OrtStatus*`
Parameters:
- `OrtKernelDefBuilder* kernel_def_builder`: The OrtKernelDefBuilder instance.
- `const char* domain`: A null-terminated string representing the operator's domain.
Returns: `OrtStatus*`
Parameters:
- `OrtKernelDefBuilder* kernel_def_builder`: The OrtKernelDefBuilder instance.
- `OrtKernelDef** kernel_def_out`: The new OrtKernelDef instance.
Defining a kernel implementation
An EP defines a kernel implementation by initializing an instance of `OrtKernelImpl` (shown below) with function pointers for computation, release, etc.
As shown previously, the example EP creates a `Memcpy` class that inherits from `OrtKernelImpl` and implements the above functions.
Defining a kernel creation function
An EP must provide a function of type `OrtKernelCreateFunc` that ORT can later call to create an instance of a kernel (`OrtKernelImpl`). The signature of the `OrtKernelCreateFunc` is shown below.
The example EP declares kernel creation functions via use of the previously mentioned `ONNX_OPERATOR_KERNEL_EX` macro. If one were to expand the macro call, the kernel creation function for `MemcpyFromHost` would look similar to the following snippet:
Motivation and Context