.. ******************************************************************************
.. * Copyright 2020-2021 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. * http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. ******************************************************************************

.. _oneapi_gpu:

##############################################################
oneAPI and GPU support in Intel(R) Extension for Scikit-learn*
##############################################################

Intel(R) Extension for Scikit-learn* supports oneAPI concepts, which
means that algorithms can be executed on different devices: CPUs and GPUs.
This is done via integration with the
`dpctl <https://intelpython.github.io/dpctl/latest/index.html>`_ package, which
implements core oneAPI concepts such as queues and devices.

Prerequisites
-------------

To run algorithms on GPU, the DPC++ compiler runtime and a compatible GPU driver are required. Refer to `DPC++ system
requirements <https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements.html>`_ for details.

The DPC++ compiler runtime can be installed from either PyPI or Anaconda:

- Install from PyPI::

      pip install dpcpp-cpp-rt

- Install from Anaconda::

      conda install dpcpp_cpp_rt -c intel
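Once installed, you can verify the setup by listing the SYCL devices that dpctl can see. This is a sketch, not part of the original instructions: it assumes the ``dpctl`` package is installed and must run on a machine with the DPC++ runtime available.

```python
# List the SYCL devices visible to dpctl to verify that the DPC++ runtime
# and driver installation succeeded. If no GPU device appears in the output,
# GPU offloading will not be available.
import dpctl

for device in dpctl.get_devices():
    print(device.device_type, device.name)
```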

Device offloading
-----------------

Intel(R) Extension for Scikit-learn* offers two options for running an algorithm on a
specific device with the help of dpctl:

- Pass the input data as a `dpctl.tensor.usm_ndarray <https://intelpython.github.io/dpctl/latest/docfiles/dpctl.tensor_api.html#dpctl.tensor.usm_ndarray>`_ to the algorithm.

  The computation runs on the device where the input data is
  located, and the result is returned as a :code:`usm_ndarray` on the same
  device.

  .. note::
    All the input data for an algorithm must reside on the same device.

  .. warning::
    A :code:`usm_ndarray` can only be consumed by the base methods,
    such as :code:`fit`, :code:`predict`, and :code:`transform`.
    Only the algorithms in Intel(R) Extension for Scikit-learn* support
    :code:`usm_ndarray`; the algorithms in the stock version of scikit-learn
    do not.
- Use the global configurations of Intel(R) Extension for Scikit-learn\*:

  1. The :code:`target_offload` option sets the device primarily
     used to perform computations. Accepted data types are :code:`str` and
     :code:`dpctl.SyclQueue`. If you pass a string to :code:`target_offload`,
     it should either be ``"auto"``, which means that the execution
     context is deduced from the location of the input data, or a string
     with a SYCL* filter selector. The default value is ``"auto"``.

  2. The :code:`allow_fallback_to_host` option is a Boolean flag.
     If set to :code:`True`, the computation is allowed to fall back to the
     host device when a particular estimator does not support
     the selected device. The default value is :code:`False`.
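The first option, passing device-resident data directly, might look like the following sketch. It is not from the original document and assumes the ``dpctl`` package plus a system with a supported GPU; the ``"gpu"`` device name is illustrative.

```python
import numpy as np
import dpctl.tensor as dpt

from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

# Copy the input to the GPU as a usm_ndarray; the computation runs on the
# device holding the data, and fitted results are returned on that device.
X_gpu = dpt.asarray(X, device="gpu")
clustering = DBSCAN(eps=3, min_samples=2).fit(X_gpu)
```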

These options can be set using the :code:`sklearnex.set_config()` function or
the :code:`sklearnex.config_context` context manager. To obtain the current values of these options,
call :code:`sklearnex.get_config()`.

.. note::
    The :code:`set_config`, :code:`get_config`, and :code:`config_context`
    functions are always patched after the :code:`sklearnex.patch_sklearn()` call.
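Putting the two options together might look like this sketch (not from the original document; it assumes Intel(R) Extension for Scikit-learn* is installed, and the device filter strings are illustrative):

```python
from sklearnex import patch_sklearn, set_config, get_config, config_context
patch_sklearn()

# Set the options globally for the whole process ...
set_config(target_offload="gpu:0", allow_fallback_to_host=True)
current = get_config()  # a dict that includes the options set above

# ... or temporarily, only for the duration of a block.
with config_context(target_offload="auto"):
    pass  # estimators fitted here deduce the device from the input data
```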
.. rubric:: Compatibility considerations

For compatibility reasons, algorithms in Intel(R) Extension for
Scikit-learn* may also be offloaded to a device using
:code:`daal4py.oneapi.sycl_context`. However, it is recommended to use one of the options
described above for device offloading instead of :code:`sycl_context`.
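For reference, the legacy approach might look like the following sketch. It is not from the original document and assumes ``daal4py`` built with oneAPI support and a system with a supported GPU.

```python
import numpy as np
from daal4py.oneapi import sycl_context

from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

# Legacy device offloading; prefer target_offload or usm_ndarray inputs.
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```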
| 92 | + |
| 93 | +Example |
| 94 | +------- |
| 95 | + |
| 96 | +An example on how to patch your code with Intel CPU/GPU optimizations: |
| 97 | + |
| 98 | +.. code-block:: python |
| 99 | +
|
| 100 | + from sklearnex import patch_sklearn, config_context |
| 101 | + patch_sklearn() |
| 102 | +
|
| 103 | + from sklearn.cluster import DBSCAN |
| 104 | +
|
| 105 | + X = np.array([[1., 2.], [2., 2.], [2., 3.], |
| 106 | + [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) |
| 107 | + with config_context(target_offload="gpu:0"): |
| 108 | + clustering = DBSCAN(eps=3, min_samples=2).fit(X) |
0 commit comments