Update readme.md (#2252)

raistefintel · web-flow · commit 701c426803d0 · 2024-03-22T13:28:54.000-07:00
template 2024.1
diff --git a/AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_SKLearn_GettingStarted/readme.md b/AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_SKLearn_GettingStarted/readme.md
@@ -1,13 +1,13 @@
-# `Intel® Python Scikit-learn Extension Getting Started` Sample
+# Intel® Python Scikit-learn Extension Getting Started Sample
 
-The `Intel® Python Scikit-learn Extension Getting Started` sample demonstrates how to use a support vector machine classifier from Intel® Extension for Scikit-learn* for digit recognition problem. All other machine learning algorithms available with Scikit-learn can be used in the similar way. Intel® Extension for Scikit-learn* speeds up scikit-learn applications. The acceleration is achieved through the use of the Intel® oneAPI Data Analytics Library (oneDAL) [Intel oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html), which comes with [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).
+The `Intel® Python Scikit-learn Extension Getting Started` sample demonstrates how to use a support vector machine classifier from Intel® Extension for Scikit-learn* for a digit recognition problem. All other machine learning algorithms available with Scikit-learn can be used in a similar way. Intel® Extension for Scikit-learn* speeds up scikit-learn applications. The acceleration is achieved through the use of the [Intel® oneAPI Data Analytics Library (oneDAL)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html).
 
 
 | Area                     | Description
 |:---                      | :---
+| Category                 | Getting Started
 | What you will learn      | How to use a basic Intel® Extension for Scikit-learn* programming model for Intel CPUs
 | Time to complete         | 5 minutes
-| Category                 | Getting Started
 
 ## Prerequisites
 
@@ -28,113 +28,87 @@ In this sample, you will run a support vector classifier model from sklearn with
 
 ## Key Implementation Details
 
-This Getting Started sample code is implemented for CPU using the Python language. The example assumes you have Intel® Extension for Scikit-learn* installed inside a conda environment, similar to what is delivered with the installation of the Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit). Intel® Extension for Scikit-learn* is available as a part of Intel® AI Analytics Toolkit (AI kit).
-
-## Configure the Local Environment
+This Getting Started sample code is implemented for CPU using the Python language. Intel® Extension for Scikit-learn* is available as a part of Intel® AI Tools.
 
-> **Note**: If you have not already done so, set up your CLI
-> environment by sourcing  the `setvars` script in the root of your oneAPI installation.
->
-> Linux*:
-> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
-> - For private installations: ` . ~/intel/oneapi/setvars.sh`
-> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
->
-> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*.
+You will need to download and install the following toolkits, tools, and components to use the sample.
 
-### On Linux*
+**1. Get Intel® AI Tools**
 
-#### Activate Conda with Root Access
+Required AI Tools: Intel® Extension for Scikit-learn*
+<br>If you have not already done so, select and install these tools via the [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html). AI and Analytics samples are validated on the AI Tools Offline Installer, so selecting the Offline Installer option in the AI Tools Selector is recommended.
 
-Intel Python environment will be active by default. However, if you activated another environment, you can return with the following command.
+**2. Install dependencies**
 ```
-source activate base
 pip install -r requirements.txt
 ```
+**Install Jupyter Notebook** by running `pip install notebook`. Alternatively, see [Installing Jupyter](https://jupyter.org/install) for detailed installation instructions.
 
-#### Activate Conda without Root Access (Optional)
+## Run the Sample
+>**Note**: Before running the sample, make sure [Environment Setup](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_SKLearn_GettingStarted#environment-setup) is completed.
+Go to the section which corresponds to the installation method chosen in [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html) to see relevant instructions:
+* [AI Tools Offline Installer (Validated)](#ai-tools-offline-installer-validated)
+* [Conda/PIP](#condapip) 
+* [Docker](#docker)
 
-By default, the Intel® AI Analytics Toolkit is installed in the inteloneapi folder, which requires root privileges to manage it. If you would like to bypass using root access to manage your conda environment, then you can clone and activate your desired conda environment using the following commands.
+### AI Tools Offline Installer (Validated)  
+1. If you have not already done so, activate the AI Tools bundle base environment. If you used the default location to install AI Tools, open a terminal and type the following:
 ```
-conda create --name usr_intelpython --clone base
-source activate usr_intelpython
+source $HOME/intel/oneapi/intelpython/bin/activate
+```
+If you used a separate location, open a terminal and type the following:
+```
+source <custom_path>/bin/activate
+```
+2. Activate the Conda environment:
+```
+conda activate sklearnex
+``` 
+3. Clone the GitHub repository:
+``` 
+git clone https://github.com/oneapi-src/oneAPI-samples.git
+cd oneAPI-samples/AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_SKLearn_GettingStarted
 ```
 
-### Install Jupyter Notebook
-
-1. Change to the sample directory.
-2. Install Jupyter Notebook with the proper kernel.
-   ```
-   conda install jupyter nb_conda_kernels
-   ```
-
-#### View in Jupyter Notebook
-
->**Note**: This distributed execution cannot be launched from Jupyter Notebook, but you can still view inside the notebook to follow the included write-up and description.
-
-1. Change to the sample directory.
-2. Launch Jupyter Notebook.
-   ```
-   jupyter notebook
-   ```
-3. Locate and select the Notebook.
-   ```
-   Intel_Extension_For_SKLearn_GettingStarted.ipynb
-   ```
-4. Click the **Run** button to move through the cells in sequence.
-
-
-#### Troubleshooting
-
-If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the *[Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html)* for more information on using the utility.
-
-
-### Run the Sample on Intel® DevCloud (Optional)
-
-1. If you do not already have an account, request an Intel® DevCloud account at [*Create an Intel® DevCloud Account*](https://intelsoftwaresites.secure.force.com/DevCloud/oneapi).
-2. On a Linux* system, open a terminal.
-3. SSH into Intel® DevCloud.
-   ```
-   ssh DevCloud
-   ```
-   > **Note**: You can find information about configuring your Linux system and connecting to Intel DevCloud at Intel® DevCloud for oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started).
-
-#### Run the Notebook
-
-1. Locate and select the Notebook.
-   ```
-   Intel_Extension_For_SKLearn_GettingStarted.ipynb
-   ````
-2. Run every cell in the Notebook in sequence.
-
-#### Run the Python Script
-
-1. Change to the sample directory.
-2. Configure the sample for the appropriate node.
-   <details>
-   <summary>You can specify nodes using a single line script.</summary>
-
-   ```
-   qsub  -I  -l nodes=1:xeon:ppn=2 -d .
-   ```
-
-   - `-I` (upper case I) requests an interactive session.
-   - `-l nodes=1:xeon:ppn=2` (lower case L) assigns one full GPU node.
-   - `-d .` makes the current folder as the working directory for the task.
-
-     |Available Nodes    |Command Options
-     |:---               |:---
-     |GPU	             |`qsub -l nodes=1:gpu:ppn=2 -d .`
-     |CPU	             |`qsub -l nodes=1:xeon:ppn=2 -d .`
-
+4. Launch Jupyter Notebook: 
+> **Note**: You might need to register the Conda environment as a Jupyter Notebook kernel.
+For details, see [these instructions](https://github.com/IntelAI/models/tree/master/docs/notebooks/perf_analysis#option-1-conda-environment-creation).
+```
+jupyter notebook --ip=0.0.0.0
+```
+<!-- add other flags to jupyter notebook command if needed, such as port 8888 or allow-root -->
+5. Follow the instructions to open the URL with the token in your browser.
+6. Select the Notebook:
+```
+Intel_Extension_For_SKLearn_GettingStarted.ipynb
+```
+7. Change the kernel to `sklearnex`.
+  
+8. Run every cell in the Notebook in sequence.
+
+### Conda/PIP
+> **Note**: Make sure your Conda/Python environment with AI Tools installed is activated.
+1. Clone the GitHub repository:
+``` 
+git clone https://github.com/oneapi-src/oneAPI-samples.git
+cd oneAPI-samples/AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_SKLearn_GettingStarted
+```
+2. Launch Jupyter Notebook: 
+> **Note**: You might need to register the Conda environment as a Jupyter Notebook kernel.
+For details, see [these instructions](https://github.com/IntelAI/models/tree/master/docs/notebooks/perf_analysis#option-1-conda-environment-creation).
+```
+jupyter notebook --ip=0.0.0.0
+```
+<!-- add other flags to jupyter notebook command if needed, such as port 8888 or allow-root -->
+3. Follow the instructions to open the URL with the token in your browser.
+4. Select the Notebook:
+```
+Intel_Extension_For_SKLearn_GettingStarted.ipynb
+```
 
-    >**Note**: For more information on how to specify compute nodes read *[Launch and manage jobs](https://devcloud.intel.com/oneapi/documentation/job-submission/)* in  the Intel® DevCloud Documentation.
-   </details>
+5. Run every cell in the Notebook in sequence.
 
-3. Run the script.
-   ```
-   python Intel_Extension_For_SKLearn_GettingStarted.py 
-   ```
+### Docker
+AI Tools Docker images already have Get Started samples pre-installed. Refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker images and samples.
 
 ## Example Output
 
@@ -150,9 +124,16 @@ Model accuracy on test data: 0.9833333333333333
 [CODE_SAMPLE_COMPLETED_SUCCESFULLY]
 ```
 
+## Related Samples
+
+* [Intel® Python XGBoost* Getting Started](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelPython_XGBoost_GettingStarted)
+* [Intel® Python Daal4py Getting Started](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelPython_daal4py_GettingStarted)
+
 ## License
 
 Code samples are licensed under the MIT license. See
 [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
 
 Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
+
+*Other names and brands may be claimed as the property of others. [Trademarks](https://www.intel.com/content/www/us/en/legal/trademarks.html)
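
For readers reviewing this change without the notebook at hand, the workflow the README describes (enable Intel® Extension for Scikit-learn*, then train a support vector classifier on the digits dataset) can be sketched roughly as follows. This is a minimal illustration, not the sample's actual code: the function name `train_digits_svc`, the `gamma` value, the split ratio, the random seed, and the fallback to stock scikit-learn when `sklearnex` is not installed are all assumptions made for this sketch.

```python
# Illustrative sketch only; not the code shipped in the sample notebook.
try:
    # patch_sklearn() replaces supported estimators with oneDAL-backed ones.
    # It has to run before the scikit-learn estimators are imported.
    from sklearnex import patch_sklearn

    patch_sklearn()
except ImportError:
    # Assumption for this sketch: fall back to stock scikit-learn.
    print("sklearnex not found; using stock scikit-learn")

from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def train_digits_svc(test_size=0.2, random_state=0):
    """Train an SVC on the digits dataset and return its test accuracy."""
    digits = load_digits()
    x_train, x_test, y_train, y_test = train_test_split(
        digits.data,
        digits.target,
        test_size=test_size,
        random_state=random_state,
    )
    model = SVC(gamma=0.001)  # hypothetical hyperparameter choice
    model.fit(x_train, y_train)
    return accuracy_score(y_test, model.predict(x_test))


if __name__ == "__main__":
    print(f"Model accuracy on test data: {train_digits_svc()}")
```

The only extension-specific step is the `patch_sklearn()` call; the rest is ordinary scikit-learn code, which is why the README says other algorithms can be used in a similar way.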