
Commit 6684fb1

Update READMEs with new template for IntelPyTorch Inference/Training Optimizations (#2305)

* update READMEs with new template
* add note about AI samples under table
* remove excess old README sections
1 parent 09dd971 commit 6684fb1

File tree

2 files changed: +260 -240 lines changed
  • AI-and-Analytics/Features-and-Functionality
    • IntelPyTorch_InferenceOptimizations_AMX_BF16_INT8
    • IntelPyTorch_TrainingOptimizations_AMX_BF16

Lines changed: 133 additions & 124 deletions
# PyTorch* Inference Optimizations with Advanced Matrix Extensions Bfloat16 Integer8 Sample

The `PyTorch* Inference Optimizations with Advanced Matrix Extensions Bfloat16 Integer8` sample demonstrates how to perform inference with the ResNet50 and BERT models using the Intel® Extension for PyTorch (IPEX).

The Intel® Extension for PyTorch (IPEX) extends PyTorch* with optimizations for an extra performance boost on Intel® hardware. While most of the optimizations will be included in future PyTorch* releases, the extension delivers up-to-date features and optimizations for PyTorch on Intel® hardware. Examples of newer optimizations include AVX-512 Vector Neural Network Instructions (AVX512 VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX).

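As a minimal sketch of the basic usage pattern (illustrative only, not one of the sample's scripts; assumes a recent torchvision and the CPU build of the extension):

```python
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

# Load a pretrained FP32 model and put it in inference mode.
model = models.resnet50(weights="IMAGENET1K_V1")
model.eval()

# Apply the extension's operator and graph optimizations (FP32 here;
# dtype=torch.bfloat16 would request BF16, where Intel® AMX can be used).
model = ipex.optimize(model)

data = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    output = model(data)
```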
| Property | Description
|:--- |:---
| Category | Code Optimization
| What you will learn | How to start using Intel® Extension for PyTorch with Intel® AMX BF16/INT8 for inference performance improvements.
| Time to complete | 5 minutes

## Purpose

The Intel® Extension for PyTorch* allows you to speed up inference on Intel® Xeon Scalable processors with lower-precision data formats and specialized compute instructions. The bfloat16 (BF16) data format uses half the bit width of floating-point-32 (FP32), which reduces the memory and execution time needed for processing. Likewise, the integer8 (INT8) data format uses half the bit width of BF16. You should notice a performance improvement with the Intel® AMX instruction set compared to Intel® Vector Neural Network Instructions (Intel® VNNI).

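The bit widths can be verified directly in PyTorch; a small illustrative check (not part of the sample):

```python
import torch

print(torch.finfo(torch.float32).bits)   # 32
print(torch.finfo(torch.bfloat16).bits)  # 16 - half of FP32
print(torch.iinfo(torch.int8).bits)      # 8  - half of BF16
```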
## Prerequisites

| Optimized for | Description
|:--- |:---
| OS | Ubuntu* 22.04 or newer
| Hardware | 4th Gen Intel® Xeon® Scalable Processors or newer
| Software | Intel® Extension for PyTorch*

> **Note**: AI and Analytics samples are validated on the AI Tools Offline Installer. For the full list of validated platforms, refer to [Platform Validation](https://github.com/oneapi-src/oneAPI-samples/tree/master?tab=readme-ov-file#platform-validation).

## Key Implementation Details

- This code sample performs inference on the ResNet50 and BERT models using Intel® Extension for PyTorch*. For each pretrained model, there is a warm-up run of 20 samples before inference runs on the specified number of samples (for example, 1000) to record the time. Intel® AMX supports the BF16 and INT8 data types starting with 4th Gen Intel® Xeon® Scalable Processors. The inference times are then compared, showcasing the speedup over FP32 when using VNNI and Intel® AMX with both BF16 and INT8 (a minimal timing sketch appears after this list).

- The following run cases are executed:
  1. FP32 (baseline)
  2. BF16 using AVX512_CORE_AMX
  3. INT8 using AVX512_CORE_VNNI
  4. INT8 using AVX512_CORE_AMX

- The Intel® oneAPI Deep Neural Network Library (oneDNN) reference guide contains a page about [CPU Dispatcher Control](https://www.intel.com/content/www/us/en/develop/documentation/onednn-developer-guide-and-reference/top/performance-profiling-and-inspection/cpu-dispatcher-control.html), which explains how to restrict the instruction set to AVX-512 or Intel® AMX at runtime. Earlier instruction sets are also available (see the dispatcher sketch after this list).

- To run with INT8, the model is quantized using the quantization feature of Intel® Extension for PyTorch*. TorchScript is also used in all inference run cases to deploy the model in graph mode instead of imperative mode for a faster runtime (see the quantization sketch after this list).

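The warm-up and timing pattern described in the first bullet might look roughly like the following for the BF16 (Intel® AMX) case. This is a minimal sketch, assuming a recent torchvision and Intel® Extension for PyTorch*; the model choice and counts are illustrative, and the sample's own scripts remain the authoritative version:

```python
import time
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

WARMUP, STEPS = 20, 1000  # illustrative counts matching the description above

model = models.resnet50(weights="IMAGENET1K_V1").eval()
# Request BF16 kernels; on 4th Gen Xeon® these can dispatch to Intel® AMX.
model = ipex.optimize(model, dtype=torch.bfloat16)
data = torch.rand(1, 3, 224, 224)

with torch.no_grad(), torch.cpu.amp.autocast():
    traced = torch.jit.freeze(torch.jit.trace(model, data))  # graph mode
    for _ in range(WARMUP):   # warm-up runs are excluded from timing
        traced(data)
    start = time.time()
    for _ in range(STEPS):
        traced(data)
    print(f"average inference time: {(time.time() - start) / STEPS:.4f} s")
```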
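As an illustration of dispatcher control (a sketch, not part of the sample's scripts): oneDNN reads the `ONEDNN_MAX_CPU_ISA` environment variable before its first kernel is generated, so capping the ISA to compare VNNI against Intel® AMX can be done like this:

```python
import os

# Cap the ISA before torch/oneDNN initialize; use "AVX512_CORE_AMX"
# (or leave the variable unset) to allow Intel® AMX kernels.
os.environ["ONEDNN_MAX_CPU_ISA"] = "AVX512_CORE_VNNI"

import torch  # imported after the variable is set so the cap takes effect
```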
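A minimal sketch of that INT8 flow, using the static post-training quantization API of Intel® Extension for PyTorch* plus TorchScript; the model, calibration data, and counts are illustrative placeholders:

```python
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

model = models.resnet50(weights="IMAGENET1K_V1").eval()
data = torch.rand(1, 3, 224, 224)

# Insert observers, run a few calibration batches, then convert to INT8.
qconfig = ipex.quantization.default_static_qconfig
prepared = prepare(model, qconfig, example_inputs=data, inplace=False)
with torch.no_grad():
    for _ in range(10):      # calibration passes (illustrative count)
        prepared(data)
    quantized = convert(prepared)
    # Deploy in graph mode via TorchScript for a faster runtime.
    traced = torch.jit.freeze(torch.jit.trace(quantized, data))
    output = traced(data)
```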
## Environment Setup
You will need to download and install the following toolkits, tools, and components to use the sample.

**1. Get Intel® AI Tools**

Required AI Tools: Intel® Extension for PyTorch* (CPU)

If you have not already, select and install these Tools via the [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html). AI and Analytics samples are validated on the AI Tools Offline Installer. It is recommended to select the Offline Installer option in the AI Tools Selector.

> **Note**: If the Docker option is chosen in the AI Tools Selector, refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker images and samples.

**2. (Offline Installer) Activate the AI Tools bundle base environment**

If the default path is used during the installation of AI Tools:
```
source $HOME/intel/oneapi/intelpython/bin/activate
```
If a non-default path is used:
```
source <custom_path>/bin/activate
```

**3. (Offline Installer) Activate relevant Conda environment**
```
conda activate pytorch
```

**4. Clone the GitHub repository**
```
git clone https://github.com/oneapi-src/oneAPI-samples.git
cd oneAPI-samples/AI-and-Analytics/Features-and-Functionality/IntelPyTorch_InferenceOptimizations_AMX_BF16_INT8
```

**5. Install dependencies**
> **Note**: Before running the following commands, make sure your Conda/Python environment with AI Tools installed is activated.

```
pip install -r requirements.txt
pip install notebook
```
For Jupyter Notebook, refer to [Installing Jupyter](https://jupyter.org/install) for detailed installation instructions.
## Run the Sample

> **Note**: Before running the sample, make sure [Environment Setup](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Features-and-Functionality/IntelPyTorch_InferenceOptimizations_AMX_BF16_INT8#environment-setup) is completed.

Go to the section that corresponds to the installation method chosen in the [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html) to see the relevant instructions:
* [AI Tools Offline Installer (Validated)](#ai-tools-offline-installer-validated)
* [Conda/PIP](#condapip)
* [Docker](#docker)
### AI Tools Offline Installer (Validated)

**1. Register Conda kernel to Jupyter Notebook kernel**

If the default path is used during the installation of AI Tools:
```
$HOME/intel/oneapi/intelpython/envs/pytorch/bin/python -m ipykernel install --user --name=pytorch
```
If a non-default path is used:
```
<custom_path>/bin/python -m ipykernel install --user --name=pytorch
```

**2. Launch Jupyter Notebook**
```
jupyter notebook --ip=0.0.0.0 --port 8888 --allow-root
```

**3. Follow the instructions to open the URL with the token in your browser**

**4. Select the Notebook**
```
IntelPyTorch_InferenceOptimizations_AMX_BF16_INT8.ipynb
```

**5. Change the kernel to `pytorch`**

**6. Run every cell in the Notebook in sequence**
### Conda/PIP

> **Note**: Before running the instructions below, make sure your Conda/Python environment with AI Tools installed is activated.

**1. Register Conda/Python kernel to Jupyter Notebook kernel**

For Conda:
```
<CONDA_PATH_TO_ENV>/bin/python -m ipykernel install --user --name=<your-env-name>
```
To find `<CONDA_PATH_TO_ENV>`, run `conda env list` and note your Conda environment path.

For PIP:
```
python -m ipykernel install --user --name=<your-env-name>
```

**2. Launch Jupyter Notebook**
```
jupyter notebook --ip=0.0.0.0 --port 8888 --allow-root
```

**3. Follow the instructions to open the URL with the token in your browser**

**4. Select the Notebook**
```
IntelPyTorch_InferenceOptimizations_AMX_BF16_INT8.ipynb
```

**5. Change the kernel to `<your-env-name>`**

**6. Run every cell in the Notebook in sequence**

### Docker

AI Tools Docker images already have Get Started samples pre-installed. Refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker images and samples.

## Example Output

If successful, the sample displays `[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]`. The sample also prints the runtimes and charts of relative performance, using the unoptimized FP32 model as the baseline.

The performance speedups using Intel® AMX BF16 and INT8 on ResNet50 and BERT are approximate; performance will vary based on your hardware and software versions. To see a larger performance gap between VNNI and Intel® AMX, increase the batch size. For even more speedup, consider using the Intel® Extension for PyTorch* [Launch Script](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/launch_script.html).

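The relative numbers reported are simply the baseline runtime divided by each optimized runtime. A trivial sketch with placeholder values (not measured results):

```python
# Placeholder runtimes in seconds - not measured results.
times = {"FP32": 0.80, "INT8_VNNI": 0.40, "BF16_AMX": 0.25, "INT8_AMX": 0.20}

baseline = times["FP32"]
for name, t in times.items():
    print(f"{name}: {baseline / t:.2f}x relative to FP32")
```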
## Related Samples

* [PyTorch Training Optimizations with Advanced Matrix Extensions Bfloat16](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Features-and-Functionality/IntelPyTorch_TrainingOptimizations_AMX_BF16)
* [Intel PyTorch GPU Inference Optimization with AMP](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Features-and-Functionality/IntelPyTorch_GPU_InferenceOptimization_with_AMP)
## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
for details.

Third party program Licenses can be found here:
[third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)

*Other names and brands may be claimed as the property of others. [Trademarks](https://www.intel.com/content/www/us/en/legal/trademarks.html)
