@@ -5,17 +5,13 @@ Compiling and Offloading ``dpnp`` Functions
Data Parallel Extension for NumPy* (``dpnp``) is a drop-in ``NumPy*``
replacement library built on top of oneMKL. ``numba-dpex`` allows various
- ``dpnp`` library functions to be jit-compiled through its ``dpjit`` decorator.
+ ``dpnp`` library function calls to be jit-compiled through its
+ ``numba_dpex.dpjit`` decorator.

``numba-dpex`` implements its own runtime library to support offloading ``dpnp``
- library functions to SYCL devices. For ``dpnp`` function signatures that are
- offloaded, ``numba-dpex`` implements their corresponding function calls through
- Numba*'s |numba.extending.overload|_ and |numba.extending.intrinsic|_
- constructs.
-
- While compiling a Python function decorated with the ``numba_dpex.dpjit``
- decorator, ``numba-dpex`` generates ``dpnp`` function calls through its runtime
- library and injects them into the LLVM IR through |numba.extending.intrinsic|_.
+ library functions to SYCL devices. For each ``dpnp`` function signature to be
+ offloaded, ``numba-dpex`` implements the corresponding direct SYCL function call
+ in the runtime and the function call is inlined in the generated LLVM IR.

.. code-block:: python
@@ -42,41 +38,62 @@ numba-dpex.
Repository map
--------------

- - The code for numba-dpex's dpnp integration runtime resides in the
+ - The code for numba-dpex's ``dpnp`` integration runtime resides in the
  :file:`numba_dpex/core/runtime` sub-module.
- - All the |numba.extending.overload|_ for ``dpnp`` function signatures are
-   implemented in :file:`numba_dpex/dpnp_iface/arrayobj.py`
+ - All the |numba.extending.overload|_ for ``dpnp`` array creation/initialization
+   function signatures are implemented in
+   :file:`numba_dpex/dpnp_iface/arrayobj.py`
+ - Each overload's corresponding |numba.extending.intrinsic|_ is implemented in
+   :file:`numba_dpex/dpnp_iface/_intrinsic.py`
- Tests reside in :file:`numba_dpex/tests/dpjit_tests/dpnp`.

Design
------
- The rewrite logic to substitute NumPy functions with dpnp function calls in the
- Numba IR is implemented by the :class:`RewriteOverloadedNumPyFunctionsPass`
- pass. The :mod:`numba_dpex.dpnp_iface.stubs` module defines a set of `stub`
- classes for each of the NumPy function calls that are currently substituted
- out. The outline of a stub class is as follows:
+ ``numba_dpex`` uses the |numba.extending.overload| decorator to create a Numba*
+ implementation of a function that can be used in `nopython mode`_ functions.
+ This is done by translating ``dpnp`` function signatures so that they can
+ be called in ``numba_dpex.dpjit`` decorated code.
+
+ The specific SYCL operation for a certain ``dpnp`` function is performed by the
+ runtime interface. While compiling a function decorated with the ``@dpjit``
+ decorator, ``numba-dpex`` generates the corresponding SYCL function call through
+ its runtime library and injects it into the LLVM IR through
+ |numba.extending.intrinsic|_. The ``@intrinsic`` decorator is used for marking a
+ ``dpnp`` function as typing and implementing the function in nopython mode using
+ the `llvmlite IRBuilder API`_. This is an escape hatch to build custom LLVM IR
+ that will be inlined into the caller.
+
+ The code injection logic to enable ``dpnp`` function calls in the Numba IR is
+ implemented by the :mod:`numba_dpex.core.dpnp_iface.arrayobj` module, which replaces
+ Numba*'s :mod:`numba.np.arrayobj`. Each ``dpnp`` function signature is provided
+ with a concrete implementation that generates the actual code using Numba's
+ ``overload`` function API, e.g.:
.. code-block:: python

-     # numba_dpex/dpnp_iface/stubs.py - imported in numba_dpex.__init__.py
-
-
-     class dpnp(Stub):
-         class sum(Stub):  # stub function
-             pass
+     @overload(dpnp.ones, prefer_literal=True)
+     def ol_dpnp_ones(
+         shape, dtype=None, order="C", device=None, usm_type="device", sycl_queue=None
+     ):
+         ...
- Each stub is provided with a concrete implementation to generate the actual
- code using Numba's ``overload`` function API. E.g.,
+ The corresponding intrinsic implementation is in :file:`numba_dpex/dpnp_iface/_intrinsic.py`.

.. code-block:: python

-     @overload(stubs.dpnp.sum)
-     def dpnp_sum_impl(a):
-         ...
-
- The complete implementation is in
- :file:`numba_dpex/dpnp_iface/dpnp_transcendentalsimpl.py`.
+     @intrinsic
+     def impl_dpnp_ones(
+         ty_context,
+         ty_shape,
+         ty_dtype,
+         ty_order,
+         ty_device,
+         ty_usm_type,
+         ty_sycl_queue,
+         ty_retty_ref,
+     ):
+         ...
Parallel Range
--------------
@@ -94,8 +111,12 @@ context. ``prange`` automatically takes care of data privatization:
.. |numba.extending.overload| replace:: ``numba.extending.overload``
.. |numba.extending.intrinsic| replace:: ``numba.extending.intrinsic``
.. |ol_dpnp_ones(...)| replace:: ``ol_dpnp_ones(...)``
+ .. |numba.np.arrayobj| replace:: ``numba.np.arrayobj``
.. _low-level API: https://github.com/IntelPython/dpnp/tree/master/dpnp/backend
.. _`ol_dpnp_ones(...)`: https://github.com/IntelPython/numba-dpex/blob/main/numba_dpex/dpnp_iface/arrayobj.py#L358
.. _`numba.extending.overload`: https://numba.pydata.org/numba-doc/latest/extending/high-level.html#implementing-functions
.. _`numba.extending.intrinsic`: https://numba.pydata.org/numba-doc/latest/extending/high-level.html#implementing-intrinsics
+ .. _nopython mode: https://numba.pydata.org/numba-doc/latest/glossary.html#term-nopython-mode
+ .. _`numba.np.arrayobj`: https://github.com/numba/numba/blob/main/numba/np/arrayobj.py
+ .. _`llvmlite IRBuilder API`: http://llvmlite.pydata.org/en/latest/user-guide/ir/ir-builder.html