Commit 1749832

Restructure documentation (#838)
* Save changes to doc structure
* Restructure docs
* Fix links
* Fix links
* Update doc build instructions
* Update doc/build-doc.sh (Co-authored-by: Nikolay Petrov <[email protected]>)
* Update requirements for doc build
* Update doc/sources/usage.rst
* Add patching options
* Add note about interchangeability
* Update doc/sources/conf.py (Co-authored-by: Nikolay Petrov <[email protected]>)
* Fix pep
* Update gpu support table

Co-authored-by: Nikolay Petrov <[email protected]>
1 parent 2fcd1f7 commit 1749832

25 files changed

+656
-267
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -10,6 +10,7 @@ daal4py.egg-info
1010
*.dll
1111
**/.ipynb_checkpoints
1212
doc/_build
13+
doc/sources/samples/*.ipynb
1314

1415
# Cython generated code
1516
src/oneapi/oneapi_api.h

INSTALL.md

Lines changed: 4 additions & 2 deletions
@@ -117,14 +117,16 @@ to create a Python package that can be easily managed by the package manager on
117117
## Build documentation for Intel(R) Extension for Scikit-learn
118118
### Prerequisites for creating documentation
119119

120-
See [requirements-doc.txt](requirements-doc.txt).
120+
- [requirements-doc.txt](requirements-doc.txt)
121+
- [pandoc](https://pandoc.org/installing.html)
121122

122123
### Build documentation
123124

124125
Run:
125126

126127
```
127-
cd doc && make html
128+
cd doc
129+
./build-doc.sh
128130
```
129131

130132
The documentation will be in ```doc/_build/html```.

README.md

Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ One of the ways to patch scikit-learn is by modifying the code. You import an ad
4848
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
4949
```
5050

51-
👀 Read about [other ways to patch scikit-learn](https://intel.github.io/scikit-learn-intelex/index.html#usage) and [other methods for offloading to GPU devices](https://intel.github.io/scikit-learn-intelex/oneapi_gpu.html).
51+
👀 Read about [other ways to patch scikit-learn](https://intel.github.io/scikit-learn-intelex/index.html#usage) and [other methods for offloading to GPU devices](https://intel.github.io/scikit-learn-intelex/oneapi-gpu.html).
5252
Check out available [notebooks](https://github.com/intel/scikit-learn-intelex/tree/master/examples/notebooks) for more examples.
5353

5454
This software acceleration is achieved through the use of vector instructions, IA hardware-specific memory optimizations, threading, and optimizations for all upcoming Intel platforms at launch time.
@@ -70,7 +70,7 @@ Configurations:
7070

7171
## 🛠 Installation
7272

73-
[System Requirements](https://intel.github.io/scikit-learn-intelex/system_requirements.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp; [Install via pip or conda](https://github.com/intel/scikit-learn-intelex/blob/master/INSTALL.md)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Build from sources](INSTALL.md#build-from-sources)
73+
[System Requirements](https://intel.github.io/scikit-learn-intelex/system-requirements.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp; [Install via pip or conda](https://github.com/intel/scikit-learn-intelex/blob/master/INSTALL.md)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Build from sources](INSTALL.md#build-from-sources)
7474

7575
Intel(R) Extension for Scikit-learn is available at the [Python Package Index](https://pypi.org/project/scikit-learn-intelex/),
7676
on Anaconda Cloud in [Conda-Forge channel](https://anaconda.org/conda-forge/scikit-learn-intelex) and in [Intel channel](https://anaconda.org/intel/scikit-learn-intelex). You can also build the extension from [sources](INSTALL.md#build-from-sources).

doc/build-doc.sh

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
#!/bin/bash
2+
3+
# copy jupyter notebooks
4+
cd ..
5+
cp examples/notebooks/*.ipynb doc/sources/samples
6+
7+
# build the documentation
8+
cd doc
9+
make html
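
The script above assumes it is started from the `doc` directory and that `doc/sources/samples` already exists. As a rough illustration only, the same two steps could be sketched in Python with paths resolved from an explicit repository root. The function names `copy_notebooks` and `build_html` are hypothetical and are not part of this commit.

```python
import shutil
import subprocess
from pathlib import Path


def copy_notebooks(repo_root):
    """Copy example notebooks into the Sphinx sources (the `cp` step)."""
    src_dir = Path(repo_root) / "examples" / "notebooks"
    dst_dir = Path(repo_root) / "doc" / "sources" / "samples"
    # Unlike plain `cp`, create the target directory if it is missing.
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for notebook in sorted(src_dir.glob("*.ipynb")):
        copied.append(Path(shutil.copy(notebook, dst_dir)))
    return copied


def build_html(repo_root):
    """Run `make html` inside doc/ (the build step); needs make and Sphinx."""
    subprocess.run(["make", "html"], cwd=Path(repo_root) / "doc", check=True)
```

Calling `copy_notebooks(".")` followed by `build_html(".")` from the repository root would mirror the shell script, with the difference that the working directory no longer matters.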

doc/sources/acceleration.rst

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
.. ******************************************************************************
2+
.. * Copyright 2021 Intel Corporation
3+
.. *
4+
.. * Licensed under the Apache License, Version 2.0 (the "License");
5+
.. * you may not use this file except in compliance with the License.
6+
.. * You may obtain a copy of the License at
7+
.. *
8+
.. * http://www.apache.org/licenses/LICENSE-2.0
9+
.. *
10+
.. * Unless required by applicable law or agreed to in writing, software
11+
.. * distributed under the License is distributed on an "AS IS" BASIS,
12+
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
.. * See the License for the specific language governing permissions and
14+
.. * limitations under the License.
15+
.. *******************************************************************************/
16+
17+
##############
18+
Acceleration
19+
##############
20+
21+
The software acceleration provided by |intelex| is achieved through the use of vector instructions,
22+
IA hardware-specific memory optimizations, threading, and optimizations for all upcoming Intel platforms at launch time.
23+
24+
|intelex| dynamically patches scikit-learn estimators to use Intel(R) oneAPI Data Analytics Library
25+
as the underlying solver, while getting the same solution faster.
26+
27+
|intelex| depends on daal4py. You can learn more in `daal4py documentation <https://intelpython.github.io/daal4py>`_.
28+
29+
Speedup over original Scikit-learn
30+
----------------------------------
31+
32+
.. image:: _static/scikit-learn-acceleration-2021.2.3.PNG
33+
:width: 800
34+
35+
Configurations:
36+
37+
- HW: c5.24xlarge AWS EC2 Instance using an Intel Xeon Platinum 8275CL with 2 sockets and 24 cores per socket
38+
- SW: scikit-learn version 0.24.2, scikit-learn-intelex version 2021.2.3, Python 3.8

doc/sources/algorithms.rst

Lines changed: 65 additions & 6 deletions
@@ -14,14 +14,16 @@
1414
.. * limitations under the License.
1515
.. *******************************************************************************/
1616
17+
.. _sklearn_algorithms:
18+
1719
####################
18-
Supported algorithms
20+
Supported Algorithms
1921
####################
2022

21-
.. _sklearn_algorithms:
23+
Applying |intelex| will impact the following scikit-learn algorithms:
2224

23-
Applying Intel(R) Extension for Scikit-learn will impact the following existing scikit-learn
24-
algorithms:
25+
on CPU
26+
------
2527

2628
.. list-table::
2729
:widths: 10 10 30 15
@@ -121,10 +123,67 @@ algorithms:
121123
- Parameters ``average``, ``sample_weight``, ``max_fpr`` and ``multi_class`` are not supported.
122124
- No limitations.
123125

126+
on GPU
127+
------
128+
129+
.. list-table::
130+
:widths: 10 10 30 15
131+
:header-rows: 1
132+
:align: left
133+
134+
* - Task
135+
- Functionality
136+
- Parameters support
137+
- Data support
138+
* - Classification
139+
- SVC
140+
- All parameters except ``kernel`` = 'sigmoid_poly', ``class_weight`` != None.
141+
- Only binary dense data is supported.
142+
* - Classification
143+
- RandomForestClassifier
144+
- All parameters except ``warm_start`` = True, ``ccp_alpha`` != 0, ``criterion`` != 'gini', ``oob_score`` = True.
145+
- Multi-output, sparse data, out-of-bag score and sample_weight are not supported.
146+
* - Classification
147+
- KNeighborsClassifier
148+
- All parameters except ``algorithm`` != 'brute', ``weights`` = 'callable'
149+
- Only dense data is supported.
150+
* - Classification
151+
- LogisticRegression
152+
- All parameters except ``solver`` != 'newton-cg', ``class_weight`` != None, ``sample_weight`` != None, ``penalty`` != 'l2'
153+
- Only dense data is supported.
154+
* - Regression
155+
- RandomForestRegressor
156+
- All parameters except ``warm_start`` = True, ``ccp_alpha`` != 0, ``criterion`` != 'mse', ``oob_score`` = True.
157+
- Multi-output, sparse data, out-of-bag score and sample_weight are not supported.
158+
* - Regression
159+
- KNeighborsRegressor
160+
- All parameters except ``algorithm`` != 'brute', ``weights`` = 'callable'
161+
- Only dense data is supported.
162+
* - Regression
163+
- LinearRegression
164+
- All parameters except ``normalize`` != False and ``sample_weight`` != None.
165+
- Only dense data is supported, #observations should be >= #features.
166+
* - Clustering
167+
- KMeans
168+
- All parameters except ``precompute_distances`` and ``sample_weight`` != None. ``init`` = 'k-means++' falls back to CPU.
169+
- Sparse data is not supported.
170+
* - Clustering
171+
- DBSCAN
172+
- All parameters except ``metric`` != 'euclidean', ``algorithm`` != 'brute', ``algorithm`` != 'auto'.
173+
- Only dense data is supported.
174+
* - Dimensionality reduction
175+
- PCA
176+
- All parameters except ``svd_solver`` != 'full'.
177+
- Sparse data is not supported.
178+
179+
.. seealso:: :ref:`oneapi_gpu`
180+
181+
Scikit-learn tests
182+
------------------
124183

125184
Monkey-patched scikit-learn classes and functions pass scikit-learn's own test
126185
suite, with few exceptions, specified in `deselected_tests.yaml
127186
<https://github.com/intel/scikit-learn-intelex/blob/master/deselected_tests.yaml>`__.
128187

129-
The results of the entire latest scikit-learn test suite with Intel(R) Extension for Scikit-learn: `CircleCI
130-
<https://circleci.com/gh/intel/scikit-learn-intelex>`_.
188+
The results of the entire latest scikit-learn test suite with |intelex|: `CircleCI
189+
<https://circleci.com/gh/intel/scikit-learn-intelex>`_.

doc/sources/blogs.rst

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
.. ******************************************************************************
2+
.. * Copyright 2021 Intel Corporation
3+
.. *
4+
.. * Licensed under the Apache License, Version 2.0 (the "License");
5+
.. * you may not use this file except in compliance with the License.
6+
.. * You may obtain a copy of the License at
7+
.. *
8+
.. * http://www.apache.org/licenses/LICENSE-2.0
9+
.. *
10+
.. * Unless required by applicable law or agreed to in writing, software
11+
.. * distributed under the License is distributed on an "AS IS" BASIS,
12+
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
.. * See the License for the specific language governing permissions and
14+
.. * limitations under the License.
15+
.. *******************************************************************************/
16+
17+
.. _blogs:
18+
19+
Follow us on Medium
20+
--------------------
21+
We publish blogs on Medium, so `follow us <https://medium.com/intel-analytics-software/tagged/machine-learning>`_
22+
to learn tips and tricks for more efficient data analysis with the help of |intelex|.
23+
Here are our latest blogs:
24+
25+
- `Save Time and Money with Intel Extension for Scikit-learn <https://medium.com/intel-analytics-software/save-time-and-money-with-intel-extension-for-scikit-learn-33627425ae4>`_,
26+
- `Superior Machine Learning Performance on the Latest Intel Xeon Scalable Processors <https://medium.com/intel-analytics-software/superior-machine-learning-performance-on-the-latest-intel-xeon-scalable-processor-efdec279f5a3>`_,
27+
- `Leverage Intel Optimizations in Scikit-Learn <https://medium.com/intel-analytics-software/leverage-intel-optimizations-in-scikit-learn-f562cb9d5544>`_,
28+
- `Intel Gives Scikit-Learn the Performance Boost Data Scientists Need <https://medium.com/intel-analytics-software/intel-gives-scikit-learn-the-performance-boost-data-scientists-need-42eb47c80b18>`_,
29+
- `From Hours to Minutes: 600x Faster SVM <https://medium.com/intel-analytics-software/from-hours-to-minutes-600x-faster-svm-647f904c31ae>`_,
30+
- `Improve the Performance of XGBoost and LightGBM Inference <https://medium.com/intel-analytics-software/improving-the-performance-of-xgboost-and-lightgbm-inference-3b542c03447e>`_,
31+
- `Accelerate Kaggle Challenges Using Intel AI Analytics Toolkit <https://medium.com/intel-analytics-software/accelerate-kaggle-challenges-using-intel-ai-analytics-toolkit-beb148f66d5a>`_,
32+
- `Accelerate Your scikit-learn Applications <https://medium.com/intel-analytics-software/improving-the-performance-of-xgboost-and-lightgbm-inference-3b542c03447e>`_,
33+
- `Accelerate Linear Models for Machine Learning <https://medium.com/intel-analytics-software/accelerating-linear-models-for-machine-learning-5a75ff50a0fe>`_,
34+
- `Accelerate K-Means Clustering <https://medium.com/intel-analytics-software/accelerate-k-means-clustering-6385088788a1>`_.

doc/sources/conf.py

Lines changed: 10 additions & 2 deletions
@@ -61,6 +61,7 @@
6161
'sphinx.ext.viewcode',
6262
'sphinx.ext.githubpages',
6363
'sphinx.ext.autodoc',
64+
'nbsphinx'
6465
]
6566

6667
# Add any paths that contain templates here, relative to this directory.
@@ -73,7 +74,7 @@
7374
source_suffix = '.rst'
7475

7576
# The master toctree document.
76-
master_doc = 'contents'
77+
master_doc = 'index'
7778

7879
# The language for content autogenerated by Sphinx. Refer to documentation
7980
# for a list of supported languages.
@@ -85,11 +86,18 @@
8586
# List of patterns, relative to source directory, that match files and
8687
# directories to ignore when looking for source files.
8788
# This pattern also affects html_static_path and html_extra_path .
88-
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
89+
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'usage.rst', 'patching/*']
8990

9091
# The name of the Pygments (syntax highlighting) style to use.
9192
pygments_style = 'sphinx'
9293

94+
# substitutions
95+
96+
rst_prolog = """
97+
.. |reg| unicode:: U+000AE
98+
.. |intelex| replace:: Intel\\ |reg|\\ Extension for Scikit-learn*
99+
"""
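
The doubled backslashes in `rst_prolog` are easy to misread. As a small sketch of what Sphinx receives: in a regular (non-raw) Python string, `\\` is a single backslash, so the substitution definition contains backslash-escaped spaces. In reStructuredText an escaped space is dropped from the output, which lets `|reg|` sit directly against the neighboring text. The name `raw_prolog` is introduced here only for comparison.

```python
# Same text as in conf.py: "\\" in a regular string is one backslash.
rst_prolog = """
.. |reg| unicode:: U+000AE
.. |intelex| replace:: Intel\\ |reg|\\ Extension for Scikit-learn*
"""

# Equivalent raw string: single backslashes, no doubling needed.
raw_prolog = r"""
.. |reg| unicode:: U+000AE
.. |intelex| replace:: Intel\ |reg|\ Extension for Scikit-learn*
"""

# Both spellings produce exactly the same prolog text.
same = rst_prolog == raw_prolog
print(same)
```

Either spelling works in `conf.py`. The raw string is arguably easier to compare against the reStructuredText that ends up prepended to each source file.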
100+
93101
# -- Options for HTML output -------------------------------------------------
94102

95103
# The theme to use for HTML and HTML Help pages. See the documentation for

doc/sources/global_patching.rst renamed to doc/sources/global-patching.rst

Lines changed: 24 additions & 6 deletions
@@ -17,28 +17,37 @@
1717
.. _global_patching:
1818

1919
###############
20-
Global patching
20+
Global Patching
2121
###############
2222

2323
Use global patching to patch all your scikit-learn applications without any additional actions.
2424

25-
**Prerequisites for global patching**:
25+
.. rubric:: Prerequisites
2626

27-
- Intel(R) Extension for Scikit-learn
27+
- |intelex|
2828
- Scikit-learn
29+
- Read and write permissions to Scikit-learn files
2930

30-
.. note::
31-
For global patching to work, you need read and write permissions to Scikit-learn files.
31+
Patch all supported algorithms
32+
===============================
3233

3334
To patch all :ref:`supported algorithms <sklearn_algorithms>`, run::
3435

3536
python -m sklearnex.glob patch_sklearn
3637

38+
Patch selected algorithms
39+
=========================
40+
3741
If you want to patch only some algorithms, use ``--algorithm`` or ``-a`` keys
38-
with a list of algorithms to patch. For example, to patch only SVC and RandomForestClassifier estimators, run::
42+
with a list of algorithms to patch.
43+
44+
For example, to patch only SVC and RandomForestClassifier estimators, run::
3945

4046
python -m sklearnex.glob patch_sklearn -a svc random_forest_classifier
4147

48+
Disable patching notifications
49+
==============================
50+
4251
If you do not want to receive patching notifications, then use ``--no-verbose`` or ``-nv`` keys::
4352

4453
python -m sklearnex.glob patch_sklearn -a svc random_forest_classifier -nv
@@ -47,10 +56,16 @@ If you do not want to receive patching notifications, then use ``--no-verbose``
4756
If you run the global patching command several times with different parameters,
4857
then only the last configuration will be applied.
4958

59+
Disable global patching
60+
=======================
61+
5062
To disable global patching, use the following command::
5163

5264
python -m sklearnex.glob unpatch_sklearn
5365

66+
Enable global patching via code
67+
===============================
68+
5469
You can also enable global patching in your code. To do this,
5570
use the ``patch_sklearn`` function with the ``global_patch`` argument::
5671

@@ -61,6 +76,9 @@ use the ``patch_sklearn`` function with the ``global_patch`` argument::
6176
After that, Scikit-learn patches will be enabled in the current application and
6277
in all others that use the same environment.
6378

79+
Disable global patching via code
80+
================================
81+
6482
To disable global patching via code, use the ``global_patch``
6583
argument in the ``unpatch_sklearn`` function::
6684
