Commit 67a911e

Finish dpnp_offload section
1 parent 9762f43 commit 67a911e

1 file changed

+52
-31
lines changed

docs/source/user_guide/dpnp_offload.rst

Lines changed: 52 additions & 31 deletions
@@ -5,17 +5,13 @@ Compiling and Offloading ``dpnp`` Functions
55

66
Data Parallel Extension for NumPy* (``dpnp``) is a drop-in ``NumPy*``
77
replacement library built on top of oneMKL. ``numba-dpex`` allows various
8-
``dpnp`` library functions to be jit-compiled thorugh its ``dpjit`` decorator.
8+
``dpnp`` library function calls to be jit-compiled through its
9+
``numba_dpex.dpjit`` decorator.
910

1011
``numba-dpex`` implements its own runtime library to support offloading ``dpnp``
11-
library functions to SYCL devices. For ``dpnp`` function signatures that are
12-
offloaded, ``numba-dpex`` implements their corresponding function calls through
13-
Numba*'s |numba.extending.overload|_ and |numba.extending.intrinsic|_
14-
constructs.
15-
16-
During compiling a Python function decorated with the ``numba_dpex.dpjit``
17-
decorator, ``numba-dpex`` generates ``dpnp`` function calls through its runtime
18-
library and injects them into the LLVM IR through |numba.extending.intrinsic|_.
12+
library functions to SYCL devices. For each ``dpnp`` function signature to be
13+
offloaded, ``numba-dpex`` implements the corresponding direct SYCL function call
14+
in the runtime, and that call is inlined into the generated LLVM IR.
1915

2016
.. code-block:: python
2117
@@ -42,41 +38,62 @@ numba-dpex.
4238
Repository map
4339
--------------
4440

45-
- The code for numba-dpex's dpnp integration runtime resides in the
41+
- The code for numba-dpex's ``dpnp`` integration runtime resides in the
4642
:file:`numba_dpex/core/runtime` sub-module.
47-
- All the |numba.extending.overload|_ for ``dpnp`` function signatures are
48-
implemented in :file:`numba_dpex/dpnp_iface/arrayobj.py`
43+
- All the |numba.extending.overload|_ for ``dpnp`` array creation/initialization
44+
function signatures are implemented in
45+
:file:`numba_dpex/dpnp_iface/arrayobj.py`
46+
- Each overload's corresponding |numba.extending.intrinsic|_ is implemented in
47+
:file:`numba_dpex/dpnp_iface/_intrinsic.py`
4948
- Tests reside in :file:`numba_dpex/tests/dpjit_tests/dpnp`.
5049

5150
Design
5251
------
5352

54-
The rewrite logic to substitute NumPy functions with dpnp function calls in the
55-
Numba IR is implemented by the :class:`RewriteOverloadedNumPyFunctionsPass`
56-
pass. The :mod:`numba_dpex.dpnp_iface.stubs` module defines a set of `stub`
57-
classes for each of the NumPy functions calls that are currently substituted
58-
out. The outline of a stub class is as follows:
53+
``numba_dpex`` uses the |numba.extending.overload|_ decorator to create a Numba*
54+
implementation of a function that can be used in `nopython mode`_ functions.
55+
This is done by translating ``dpnp`` function signatures so that they can
56+
be called in ``numba_dpex.dpjit`` decorated code.
57+
58+
The specific SYCL operation for a certain ``dpnp`` function is performed by the
59+
runtime interface. While compiling a function decorated with the ``@dpjit``
60+
decorator, ``numba-dpex`` generates the corresponding SYCL function call through
61+
its runtime library and injects it into the LLVM IR through
62+
|numba.extending.intrinsic|_. The ``@intrinsic`` decorator marks a function that
63+
supplies both the typing and the nopython-mode implementation of a ``dpnp``
64+
function, written with the `llvmlite IRBuilder API`_. This is an escape hatch to
65+
build custom LLVM IR that will be inlined into the caller.
66+
67+
The code injection logic to enable ``dpnp`` function calls in the Numba IR is
68+
implemented by the :mod:`numba_dpex.dpnp_iface.arrayobj` module, which replaces
69+
Numba*'s :mod:`numba.np.arrayobj`. Each ``dpnp`` function signature is provided
70+
with a concrete implementation that generates the actual code using Numba's
71+
``overload`` function API. For example:
5972

6073
.. code-block:: python
6174
62-
# numba_dpex/dpnp_iface/stubs.py - imported in numba_dpex.__init__.py
63-
64-
65-
class dpnp(Stub):
66-
class sum(Stub): # stub function
67-
pass
75+
@overload(dpnp.ones, prefer_literal=True)
76+
def ol_dpnp_ones(
77+
shape, dtype=None, order="C", device=None, usm_type="device", sycl_queue=None
78+
):
79+
...
6880
69-
Each stub is provided with a concrete implementation to generates the actual
70-
code using Numba's ``overload`` function API. E.g.,
81+
The corresponding intrinsic implementation is in :file:`numba_dpex/dpnp_iface/_intrinsic.py`.
7182

7283
.. code-block:: python
7384
74-
@overload(stubs.dpnp.sum)
75-
def dpnp_sum_impl(a):
76-
...
77-
78-
The complete implementation is in
79-
:file:`numba_dpex/dpnp_iface/dpnp_transcendentalsimpl.py`.
85+
@intrinsic
86+
def impl_dpnp_ones(
87+
ty_context,
88+
ty_shape,
89+
ty_dtype,
90+
ty_order,
91+
ty_device,
92+
ty_usm_type,
93+
ty_sycl_queue,
94+
ty_retty_ref,
95+
):
96+
...
8097
8198
Parallel Range
8299
--------------
@@ -94,8 +111,12 @@ context. ``prange`` automatically takes care of data privatization:
94111
.. |numba.extending.overload| replace:: ``numba.extending.overload``
95112
.. |numba.extending.intrinsic| replace:: ``numba.extending.intrinsic``
96113
.. |ol_dpnp_ones(...)| replace:: ``ol_dpnp_ones(...)``
114+
.. |numba.np.arrayobj| replace:: ``numba.np.arrayobj``
97115

98116
.. _low-level API: https://github.com/IntelPython/dpnp/tree/master/dpnp/backend
99117
.. _`ol_dpnp_ones(...)`: https://github.com/IntelPython/numba-dpex/blob/main/numba_dpex/dpnp_iface/arrayobj.py#L358
100118
.. _`numba.extending.overload`: https://numba.pydata.org/numba-doc/latest/extending/high-level.html#implementing-functions
101119
.. _`numba.extending.intrinsic`: https://numba.pydata.org/numba-doc/latest/extending/high-level.html#implementing-intrinsics
120+
.. _nopython mode: https://numba.pydata.org/numba-doc/latest/glossary.html#term-nopython-mode
121+
.. _`numba.np.arrayobj`: https://github.com/numba/numba/blob/main/numba/np/arrayobj.py
122+
.. _`llvmlite IRBuilder API`: http://llvmlite.pydata.org/en/latest/user-guide/ir/ir-builder.html
