Commit ecba8c3

Update documentation of gpu section (#789) (#797)
* Update documentation of gpu section
* Apply suggestions from code review
* Update comments
* Fix format
* Update doc/sources/oneapi_gpu.rst

Co-authored-by: Ekaterina Mekhnetsova <[email protected]>
Co-authored-by: Nikolay Petrov <[email protected]>
Co-authored-by: vlad-nazarov <[email protected]>
Co-authored-by: Michael Smirnov <[email protected]>
(cherry picked from commit a63a6c8)
1 parent 09902a8 commit ecba8c3

File tree

3 files changed: +109 / -56 lines changed

doc/sources/contents.rst

Lines changed: 1 addition & 1 deletion
@@ -28,4 +28,4 @@ Intel(R) Extension for Scikit-learn*
     Supported Algorithms <algorithms>
     Intel(R) Extension for Scikit-learn* Verbose <verbose>
     Global patching <global_patching>
-   GPU support <gpu>
+   oneAPI and GPU support <oneapi_gpu>

doc/sources/gpu.rst

Lines changed: 0 additions & 55 deletions
This file was deleted.

doc/sources/oneapi_gpu.rst

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
.. ******************************************************************************
.. * Copyright 2020-2021 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. *     http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. ******************************************************************************

.. _oneapi_gpu:

##############################################################
oneAPI and GPU support in Intel(R) Extension for Scikit-learn*
##############################################################

Intel(R) Extension for Scikit-learn* supports oneAPI concepts, which
means that algorithms can be executed on different devices: CPUs and GPUs.
This is done via integration with the
`dpctl <https://intelpython.github.io/dpctl/latest/index.html>`_ package, which
implements core oneAPI concepts such as queues and devices.
Prerequisites
-------------

To run computations on a GPU, the DPC++ compiler runtime and a compatible driver
are required. Refer to the `DPC++ system requirements
<https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements.html>`_ for details.

The DPC++ compiler runtime can be installed either from PyPI or from Anaconda:

- Install from PyPI::

      pip install dpcpp-cpp-rt

- Install from Anaconda::

      conda install dpcpp_cpp_rt -c intel
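Before moving on to device offloading, it can help to confirm which devices dpctl actually sees. The snippet below is a minimal sketch, assuming the ``dpctl`` package is installed; if it is not, the device list simply stays empty:

```python
# List the SYCL devices visible to dpctl (sketch; dpctl may not be installed).
try:
    import dpctl
    devices = [d.name for d in dpctl.get_devices()]
except ImportError:
    devices = []  # dpctl is not installed
print("SYCL devices:", devices)
```

An empty list here means either that dpctl is missing or that no SYCL runtime is configured.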
Device offloading
-----------------

Intel(R) Extension for Scikit-learn* offers two options for running an algorithm on a
specific device with the help of dpctl:

- Pass the input data as `dpctl.tensor.usm_ndarray <https://intelpython.github.io/dpctl/latest/docfiles/dpctl.tensor_api.html#dpctl.tensor.usm_ndarray>`_ to the algorithm.

  The computation runs on the device where the input data is
  located, and the result is returned as a :code:`usm_ndarray` on the same
  device.

  .. note::
     All the input data for an algorithm must reside on the same device.

  .. warning::
     A :code:`usm_ndarray` can only be consumed by the base methods
     :code:`fit`, :code:`predict`, and :code:`transform`.
     Only the algorithms in Intel(R) Extension for Scikit-learn* support
     :code:`usm_ndarray`; the algorithms in the stock version of scikit-learn
     do not support this feature.

- Use the global configurations of Intel(R) Extension for Scikit-learn\*:

  1. The :code:`target_offload` option sets the device primarily
     used to perform computations. Accepted data types are :code:`str` and
     :code:`dpctl.SyclQueue`. A string value must be either ``"auto"``,
     which means that the execution context is deduced from the location
     of the input data, or a SYCL* filter selector. The default value is ``"auto"``.

  2. The :code:`allow_fallback_to_host` option
     is a Boolean flag. If set to :code:`True`, the computation is allowed
     to fall back to the host device when a particular estimator does not support
     the selected device. The default value is :code:`False`.
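A minimal sketch of the first option, assuming ``dpctl`` and a GPU device are available; ``dpt.asarray`` is used here to place the NumPy data on the device, and the guard falls back to the host array when no GPU is present:

```python
import numpy as np

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

try:
    import dpctl.tensor as dpt
    # Place the data on the GPU; a patched estimator's fit/predict
    # would then run on that device and return results there as well.
    X_device = dpt.asarray(X, device="gpu")
except Exception:
    X_device = X  # no dpctl or no GPU available: keep the host NumPy array

print(type(X_device), X_device.shape)
```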
These options can be set using the :code:`sklearnex.set_config()` function or
the :code:`sklearnex.config_context` context manager. To obtain the current values of
these options, call :code:`sklearnex.get_config()`.

.. note::
   The functions :code:`set_config`, :code:`get_config`, and :code:`config_context`
   are always patched after the :code:`sklearnex.patch_sklearn()` call.
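A short sketch of how these configuration functions fit together (assumes ``sklearnex`` is installed; the fallback dictionary is for illustration only):

```python
try:
    from sklearnex import set_config, get_config
    # Prefer the first GPU and allow falling back to the host if needed
    set_config(target_offload="gpu:0", allow_fallback_to_host=True)
    cfg = get_config()
except ImportError:
    # sklearnex is not installed: mimic the relevant part of get_config()
    cfg = {"target_offload": "gpu:0", "allow_fallback_to_host": True}

print(cfg["target_offload"], cfg["allow_fallback_to_host"])
```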
.. rubric:: Compatibility considerations

For compatibility reasons, algorithms in Intel(R) Extension for
Scikit-learn* may still be offloaded to a device using
:code:`daal4py.oneapi.sycl_context`. However, it is recommended to use one of the
offloading options described above instead of :code:`sycl_context`.
Example
-------

An example of how to patch your code with Intel CPU/GPU optimizations:

.. code-block:: python

    import numpy as np
    from sklearnex import patch_sklearn, config_context
    patch_sklearn()

    from sklearn.cluster import DBSCAN

    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    with config_context(target_offload="gpu:0"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)
