
Commit cc382d4
Author: Diptorup Deb (committed)

    Polish up overview and getting started docs.

1 parent c4b017c

File tree: 2 files changed (+213, −71 lines)


docs/source/getting_started.rst

Lines changed: 171 additions & 31 deletions
@@ -9,73 +9,213 @@ Getting Started
 ===============
 
 
-Installation
-------------
+Installing pre-built packages
+-----------------------------
 
-Numba-dpex depends on following components:
+``numba-dpex`` along with its dependencies can be installed using ``conda``.
+It is recommended to use conda packages from the ``anaconda.org/intel`` channel
+to get the latest production releases. Nightly builds of ``numba-dpex`` are
+available on the ``dppy/label/dev`` conda channel.
 
-* numba 0.57.*
-* dpctl 0.14.*
-* dpnp 0.11.*
-* dpcpp-cpp-rt
-* dpcpp-llvm-spirv
-* spirv-tools
-
-It is recommended to use conda packages from the ``anaconda.org/intel`` channel.
+.. code-block:: bash
 
-Create conda environment:
+    conda create -n numba-dpex-env numba-dpex dpnp dpctl dpcpp-llvm-spirv spirv-tools -c intel -c conda-forge
 
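The channel choice described above (production packages from ``intel``, nightly builds from ``dppy/label/dev``) can be sketched as a small shell helper. The ``NIGHTLY`` switch and the final ``echo`` (which only prints the command instead of running ``conda``) are illustrative assumptions, not part of the documented workflow:

```shell
# Sketch: compose the install command, optionally pulling nightly builds
# from the dppy/label/dev channel. The command is printed, not executed.
NIGHTLY=0
if [ "$NIGHTLY" -eq 1 ]; then
    CHANNELS="-c dppy/label/dev -c intel -c conda-forge"
else
    CHANNELS="-c intel -c conda-forge"
fi
echo "conda create -n numba-dpex-env numba-dpex dpnp dpctl dpcpp-llvm-spirv spirv-tools $CHANNELS"
```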
-.. code-block:: bash
+Building from source
+--------------------
 
-    conda create -n numba-dpex-env numba-dpex dpnp -c ${ONEAPI_ROOT}/conda_channel
+``numba-dpex`` can be built from source using either ``conda-build`` or ``setuptools``.
 
-Build and Install Conda Package
--------------------------------
+Steps to build using ``conda-build``:
 
-Create and activate conda build environment:
+1. Create a conda environment
 
 .. code-block:: bash
 
     conda create -n build-env conda-build
     conda activate build-env
 
+2. Build using the vendored conda recipe
+
 .. code-block:: bash
 
     conda build conda-recipe -c intel -c conda-forge
 
-Install conda package:
+3. Install the conda package
 
 .. code-block:: bash
 
     conda install numba-dpex
 
-Build and Install with setuptools
----------------------------------
+Steps to build using ``setup.py``:
 
 .. code-block:: bash
 
-    conda create -n numba-dpex-env dpctl dpnp numba spirv-tools dpcpp-llvm-spirv llvmdev cython pytest -c intel -c conda-forge
+    conda create -n numba-dpex-env dpctl dpnp numba spirv-tools dpcpp-llvm-spirv llvmdev pytest -c intel -c conda-forge
     conda activate numba-dpex-env
 
+Building inside Docker
+----------------------
 
-Testing
--------
+A Dockerfile is provided on the GitHub repository to easily build ``numba-dpex``
+as well as its direct dependencies: ``dpctl`` and ``dpnp``. Users can either use
+one of the pre-built images on the ``numba-dpex`` GitHub page or use the
+bundled Dockerfile to build ``numba-dpex`` from source.
+
+Building
+~~~~~~~~
+
+``numba-dpex`` ships with a multistage Dockerfile, which means there are
+different `targets <https://docs.docker.com/build/building/multi-stage/#stop-at-a-specific-build-stage>`_
+available to build. The most useful ones are:
 
-See folder ``numba_dpex/tests``.
+- runtime
+- runtime-gpu
+- numba-dpex-builder-runtime
+- numba-dpex-builder-runtime-gpu
 
-To run the tests:
+To build a docker image:
 
 .. code-block:: bash
 
-    python -m pytest --pyargs numba_dpex.tests
+    docker build --target runtime -t numba-dpex:runtime ./
 
-Examples
---------
 
-See folder ``numba_dpex/examples``.
+To run the docker image:
+
+.. code-block:: bash
+
+    docker run -it --rm numba-dpex:runtime
+
+.. note::
+
+   When trying to build a docker image with Intel GPU support, the Dockerfile
+   will attempt to use the GitHub API to get the latest Intel GPU drivers.
+   Users may run into an issue related to GitHub API call limits. The issue
+   can be bypassed by providing valid GitHub credentials using the
+   ``GITHUB_USER`` and ``GITHUB_PASSWORD``
+   `build args <https://docs.docker.com/engine/reference/commandline/build/#build-arg>`_
+   to increase the call limit. A GitHub
+   `access token <https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token>`_
+   can also be used instead of the password.
+
+.. note::
+
+   When building the docker image behind a firewall, the proxy server settings
+   should be provided using the ``http_proxy`` and ``https_proxy`` build args.
+   These build args must be specified in lowercase.
+
+The bundled Dockerfile supports different Python versions that can be specified
+via the ``PYTHON_VERSION`` build arg. By default, the docker image is based on
+the official slim Debian Python image. The requested Python version must be one
+of the available Python docker images.
+
+The ``BASE_IMAGE`` build arg can be used to build the docker image from a
+custom image. Note that, as the Dockerfile is Debian-based, any custom base
+image should also be Debian-based, such as debian or ubuntu.
+
+The list of other build args is as follows. Please refer to the Dockerfile to
+see all currently available build args.
+
+- ``PYTHON_VERSION``
+- ``CR_TAG``
+- ``IGC_TAG``
+- ``CM_TAG``
+- ``L0_TAG``
+- ``ONEAPI_VERSION``
+- ``DPCTL_GIT_BRANCH``
+- ``DPCTL_GIT_URL``
+- ``DPNP_GIT_BRANCH``
+- ``DPNP_GIT_URL``
+- ``NUMBA_DPEX_GIT_BRANCH``
+- ``NUMBA_DPEX_GIT_URL``
+- ``CMAKE_VERSION``
+- ``CMAKE_VERSION_BUILD``
+- ``INTEL_NUMPY_VERSION``
+- ``INTEL_NUMBA_VERSION``
+- ``CYTHON_VERSION``
+- ``SCIKIT_BUILD_VERSION``
+- ``http_proxy``
+- ``https_proxy``
+- ``GITHUB_USER``
+- ``GITHUB_PASSWORD``
+- ``BASE_IMAGE``
+
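As a sketch of how the build args compose into a ``docker build`` invocation, the snippet below assembles a hypothetical command line. The arg names come from the list above; the values are placeholder assumptions, and the command is only printed, not executed:

```shell
# Sketch: accumulate --build-arg flags for the multistage build.
# Values are illustrative; the final command is echoed, not run.
BUILD_ARGS="--build-arg PYTHON_VERSION=3.10"
BUILD_ARGS="$BUILD_ARGS --build-arg BASE_IMAGE=debian:bookworm-slim"
BUILD_ARGS="$BUILD_ARGS --build-arg http_proxy=${http_proxy:-}"
echo "docker build --target runtime $BUILD_ARGS -t numba-dpex:runtime ./"
```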
+Using the pre-built images
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are several pre-built docker images available:
+
+- ``runtime``: a package that provides a pre-built environment with
+  ``numba-dpex`` already installed. It is ideal to quickly set up and try
+  ``numba-dpex``.
+
+  .. code-block:: text
+
+      ghcr.io/intelpython/numba-dpex/runtime:<numba_dpex_version>-py<python_version>[-gpu]
+
+- ``builder``: a package that has all required dependencies pre-installed and
+  is ideal for building ``numba-dpex`` from source.
+
+  .. code-block:: text
+
+      ghcr.io/intelpython/numba-dpex/builder:<numba_dpex_version>-py<python_version>[-gpu]
 
-To run the examples:
+- ``stages``: a package primarily meant for creating a new docker image that
+  is built on top of one of the pre-built images.
+
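The tag pattern above can be illustrated with a small sketch; the variable names are assumptions, and the version values are only examples:

```shell
# Sketch: compose a pre-built image tag from its parts, following the
# ghcr.io/intelpython/numba-dpex/<pkg>:<version>-py<pyver>[-gpu] pattern.
PKG=runtime        # or: builder, stages
VERSION=0.20.0
PYVER=3.10
GPU=""             # set to "-gpu" for the GPU variants
IMAGE="ghcr.io/intelpython/numba-dpex/${PKG}:${VERSION}-py${PYVER}${GPU}"
echo "$IMAGE"
```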
+After setting up the docker image, the following snippet can be used to run
+``numba-dpex``:
+
+.. code-block:: bash
+
+    docker run --rm -it ghcr.io/intelpython/numba-dpex/runtime:0.20.0-py3.10 bash
+
+It is advisable to verify the SYCL runtime and driver installation within the
+container by running either
+
+.. code-block:: bash
+
+    sycl-ls
+
+or
+
+.. code-block:: bash
+
+    python -m dpctl -f
+
+.. note::
+
+   To enable a GPU device, one of the ``*-gpu`` images should be used and the
+   ``--device`` argument should be passed to ``docker run``.
+
+   For passing a GPU into the container on Linux, use the argument
+   ``--device=/dev/dri``. However, if you are using WSL, you need to pass
+   ``--device=/dev/dxg -v /usr/lib/wsl:/usr/lib/wsl`` instead.
+
+For example, to run ``numba-dpex`` with GPU support on WSL:
 
 .. code-block:: bash
 
-    python numba_dpex/examples/sum.py
+    docker run --rm -it \
+        --device=/dev/dxg -v /usr/lib/wsl:/usr/lib/wsl \
+        ghcr.io/intelpython/numba-dpex/runtime:0.20.0-py3.10-gpu
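The platform-dependent flag choice can be sketched as follows. Detecting WSL via ``/proc/version`` is an illustrative assumption, and the ``docker run`` command is printed rather than executed:

```shell
# Sketch: pick the GPU pass-through flags for native Linux vs. WSL,
# using the flag sets described in the docs. Command is echoed only.
if grep -qi microsoft /proc/version 2>/dev/null; then
    GPU_ARGS="--device=/dev/dxg -v /usr/lib/wsl:/usr/lib/wsl"   # WSL
else
    GPU_ARGS="--device=/dev/dri"                                # native Linux
fi
echo "docker run --rm -it $GPU_ARGS ghcr.io/intelpython/numba-dpex/runtime:0.20.0-py3.10-gpu"
```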
+
+Testing
+-------
+
+``numba-dpex`` uses pytest for unit testing and the following example shows a
+way to run the unit tests:
+
+.. code-block:: bash
+
+    python -m pytest --pyargs numba_dpex.tests
+
+Examples
+--------
+
+A set of examples on how to use ``numba-dpex`` can be found in
+``numba_dpex/examples``.

docs/source/overview.rst

Lines changed: 42 additions & 40 deletions
@@ -4,19 +4,19 @@
 Overview
 ========
 
-Data Parallel Extension for Numba* (`numba-dpex`_) is a standalone extension for
-the `Numba*`_ Python JIT compiler. ``numba-dpex`` adds two new features to
-Numba*: an architecture-agnostic kernel programming API, and a new compilation
-target that adds typing and compilation support for the Data Parallel Extension
-for Numpy* (`dpnp`_) library. ``dpnp`` is a Python package for numerical
-computing that provides a data-parallel reimplementation of `NumPy*`_'s API.
-``numba-dpex``'s support for ``dpnp`` compilation is a new way for Numba* users to write
-code in a NumPy-like API that is already supported by Numba*, while at the same
-time automatically running such code parallelly on various types of
-architecture.
-
-``numba-dpex`` is being developed as part of `Intel AI Analytics Toolkit`_ and is
-distributed with the `Intel Distribution for Python*`_. The extension is also
+Data Parallel Extension for Numba* (`numba-dpex`_) is an extension to
+the `Numba*`_ Python JIT compiler adding an architecture-agnostic kernel
+programming API, and a new front-end to compile the Data Parallel Extension
+for Numpy* (`dpnp`_) library. The ``dpnp`` Python library is a data-parallel
+implementation of `NumPy*`_'s API using the `SYCL*`_ language.
+
+.. ``numba-dpex``'s support for ``dpnp`` compilation is a new way for Numba* users
+.. to write code in a NumPy-like API that is already supported by Numba*, while at
+.. the same time automatically running such code parallelly on various types of
+.. architecture.
+
+``numba-dpex`` is developed as part of the `Intel AI Analytics Toolkit`_ and
+is distributed with the `Intel Distribution for Python*`_. The extension is
 available on Anaconda cloud and as a Docker image on GitHub. Please refer the
 :doc:`getting_started` page to learn more.
 

@@ -26,14 +26,15 @@ Main Features
 Portable Kernel Programming
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The kernel API has a design and API similar to Numba's ``cuda.jit`` module.
-However, the API uses the `SYCL*`_ language runtime and as such is extensible to
-various hardware types supported by a SYCL runtime. Presently, ``numba-dpex`` uses
-the `DPC++`_ SYCL runtime and only supports SPIR-V-based OpenCL and `oneAPI
-Level Zero`_ devices CPU and GPU devices.
+The ``numba-dpex`` kernel API has a design and API similar to Numba's
+``cuda.jit`` sub-module. The API is modeled after the `SYCL*`_ language and uses
+the `DPC++`_ SYCL runtime. Currently, compilation of kernels is supported for
+SPIR-V-based OpenCL and `oneAPI Level Zero`_ CPU and GPU devices. In the
+future, the API can be extended to other architectures that are supported by
+DPC++.
 
-The following vector addition example illustrates the basic features of the
-interface.
+The following example illustrates a vector addition kernel written with the
+``numba-dpex`` kernel API.
 
 .. code-block:: python
 
@@ -54,33 +55,34 @@ interface.
     vecadd_kernel[dpex.Range(1024)](a, b, c)
     print(c)
 
-In the above example, we allocated three arrays on a default ``gpu`` device
-using the ``dpnp`` library. These arrays are then passed as input arguments to the
-kernel function. The compilation target and the subsequent execution of the
+In the above example, three arrays are allocated on a default ``gpu`` device
+using the ``dpnp`` library. These arrays are then passed as input arguments to
+the kernel function. The compilation target and the subsequent execution of the
 kernel is determined completely by the input arguments and follow the
 "compute-follows-data" programming model as specified in the `Python* Array API
 Standard`_. To change the execution target to a CPU, the device keyword needs to
-be changed to ``cpu`` when allocating the ``dpnp`` arrays. It is also possible to
-leave the ``device`` keyword undefined and let the ``dpnp`` library select a default
-device based on environment flag settings. Refer the
+be changed to ``cpu`` when allocating the ``dpnp`` arrays. It is also possible
+to leave the ``device`` keyword undefined and let the ``dpnp`` library select a
+default device based on environment flag settings. Refer the
 :doc:`user_manual/kernel_programming/index` for further details.
 
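As a rough illustration of the array expression involved (a plain-NumPy stand-in, not the ``numba-dpex`` kernel itself), the same vector addition can be written against NumPy, whose API ``dpnp`` mirrors; swapping NumPy for ``dpnp`` arrays allocated on a device is what makes compute-follows-data apply:

```python
import numpy as np

# Plain-NumPy stand-in for the dpnp vector addition discussed above.
# With dpnp arrays, the identical expression would execute on whichever
# device the arrays were allocated on; this sketch only shows the shape
# of the computation, not the device dispatch.
a = np.arange(1024, dtype=np.float32)
b = np.ones(1024, dtype=np.float32)
c = a + b

print(c[:3])  # → [1. 2. 3.]
```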

-``dpnp`` compilation and offload
+``dpnp`` compilation support
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-``numba-dpex`` extends Numba's type system and compilation pipeline to compile ``dpnp``
-functions and expressions in the same way as NumPy. Unlike Numba's NumPy
-compilation that is serial by default, ``numba-dpex`` always compiles ``dpnp``
-expressions into offloadable kernels and executes them in parallel. The feature
-is provided using a decorator ``dpjit`` that behaves identically to
-``numba.njit(parallel=True)`` with the addition of ``dpnp`` compilation and offload.
-Offloading by ``numba-dpex`` is not just restricted to CPUs and supports all devices
-that are presently supported by the kernel API. ``dpjit`` allows using NumPy and
-``dpnp`` expressions in the same function. All NumPy compilation and parallelization
-is done via the default Numba code-generation pipeline, whereas ``dpnp`` expressions
-are compiled using the ``numba-dpex`` pipeline.
-
-The vector addition example depicted using the kernel API can be easily
+``numba-dpex`` extends Numba's type system and compilation pipeline to compile
+``dpnp`` functions and expressions in the same way as NumPy. Unlike Numba's
+NumPy compilation that is serial by default, ``numba-dpex`` always compiles
+``dpnp`` expressions into data-parallel kernels and executes them in parallel.
+The ``dpnp`` compilation feature is provided using a decorator ``dpjit`` that
+behaves identically to ``numba.njit(parallel=True)`` with the addition of
+``dpnp`` compilation and kernel offloading. Offloading by ``numba-dpex`` is not
+just restricted to CPUs and supports all devices that are presently supported by
+the kernel API. ``dpjit`` allows using NumPy and ``dpnp`` expressions in the
+same function. All NumPy compilation and parallelization is done via the default
+Numba code-generation pipeline, whereas ``dpnp`` expressions are compiled using
+the ``numba-dpex`` pipeline.
+
+The vector addition example depicted using the kernel API can also be
 expressed in several different ways using ``dpjit``.
 
 .. code-block:: python