-Data-parallel Extension for Numba* (numba-dpex) is a standalone extension for
-the [Numba](http://numba.pydata.org) Python JIT compiler. Numba-dpex provides
-a generic kernel programming API and an offload feature that extends Numba's
-auto-parallelizer to generate data-parallel kernels for `parfor` nodes.
-
-Numba-dpex's kernel API has a design and API similar to Numba's `cuda.jit`
-module, but is based on the [SYCL](https://sycl.tech/) language. The
-code-generation for the kernel API currently supports
-[SPIR-V](https://www.khronos.org/spir/)-based
-[OpenCL](https://www.khronos.org/opencl/) and
-[oneAPI Level Zero](https://spec.oneapi.io/level-zero/latest/index.html)
-devices that are supported by Intel® DPC++ SYCL compiler runtime. Supported
-devices include Intel® CPUs, integrated GPUs and discrete GPUs.
-
-The offload functionality in numba-dpex is based on Numba's `parfor`
-loop-parallelizer. Our compiler extends Numba's `parfor` feature to generate
-kernels and offload them to devices supported by DPC++ SYCL compiler runtime.
-The offload functionality is supported via a new NumPy drop-in replacement
-library: [dpnp](https://github.com/IntelPython/dpnp). Note that `dpnp` and NumPy-based
-expressions can be used together in the same function, with `dpnp` expressions getting
-offloaded by `numba-dpex` and NumPy expressions getting parallelized by Numba.
+Data-parallel Extension for Numba* (numba-dpex) is an open-source standalone
+extension for the [Numba](http://numba.pydata.org) Python JIT compiler.
+Numba-dpex provides a [SYCL*](https://sycl.tech/)-like API for kernel
+programming in Python. SYCL* is an open standard developed by the [Unified
+Acceleration Foundation](https://uxlfoundation.org/) as a vendor-agnostic way of
+programming different types of data-parallel hardware such as multi-core CPUs,
+GPUs, and FPGAs. Numba-dpex brings the same programming model and a similar API
+to Python. The API allows expressing portable data-parallel kernels in Python
+and then JIT compiling them for different hardware targets. JIT compilation is
+supported for hardware that uses the
+[SPIR-V](https://www.khronos.org/spir/) intermediate representation format,
+which includes [OpenCL](https://www.khronos.org/opencl/) CPU (Intel, AMD)
+devices, OpenCL GPU (Intel integrated and discrete GPUs) devices, and [oneAPI
+Level Zero](https://spec.oneapi.io/level-zero/latest/index.html) GPU (Intel
+integrated and discrete GPUs) devices.
+
+The kernel programming API does not yet support every SYCL* feature. Refer to
+the [SYCL* and numba-dpex feature comparison](https://intelpython.github.io/numba-dpex/latest/supported_sycl_features.html)
+page for a summary of supported features.
+Numba-dpex implements only SYCL*'s kernel programming API; all SYCL runtime
+Python bindings are provided by the [dpctl](https://github.com/IntelPython/dpctl)
+package.
+
+Along with the kernel programming API, numba-dpex extends Numba's
+auto-parallelizer to bring device offload capabilities to `prange` loops and
+NumPy-like vector expressions. The offload functionality is supported via the
+NumPy drop-in replacement library: [dpnp](https://github.com/IntelPython/dpnp).
+Note that `dpnp` and NumPy-based expressions can be used together in the same
+function, with `dpnp` expressions getting offloaded by `numba-dpex` and NumPy
+expressions getting parallelized by Numba.
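As a rough illustration of the SYCL*-style kernel model described above (a kernel is a function of a work-item index, launched once per index over an execution range), here is a pure-Python mock. The names `Range`, `call_kernel`, and `vector_add` only mirror the spirit of the API; this sketch does not use or require numba-dpex, and the real numba-dpex API differs in details.

```python
# Pure-Python mock of a SYCL*-style data-parallel kernel launch.
# NOTE: illustration of the programming model only; this does not use
# numba-dpex, and the actual numba-dpex API differs in details.

class Range:
    """A 1-D execution range: one kernel invocation per index."""
    def __init__(self, size):
        self.size = size

def call_kernel(kernel, global_range, *args):
    """Invoke `kernel` once per work-item index in `global_range`.

    A real SYCL runtime would dispatch these invocations in parallel
    on a CPU or GPU device; here we simply loop sequentially.
    """
    for i in range(global_range.size):
        kernel(i, *args)

def vector_add(item_id, a, b, c):
    """The 'kernel': each work item computes one output element."""
    c[item_id] = a[item_id] + b[item_id]

a = [1.0] * 8
b = [2.0] * 8
c = [0.0] * 8
call_kernel(vector_add, Range(8), a, b, c)
print(c)  # every work item wrote its one element of c
```

Because each work item touches only its own index, the loop body has no cross-iteration dependences, which is exactly what lets a JIT compiler map it onto parallel hardware.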
 
 Refer to the [documentation](https://intelpython.github.io/numba-dpex) and examples
 to learn more.