|
1 | | -PyTorch 2 Export Quantization with OpenVINO backend |
| 1 | +PyTorch 2 Export Quantization for OpenVINO torch.compile backend. |
2 | 2 | =========================================================================== |
3 | 3 |
|
4 | 4 | **Author**: dlyakhov, asuslov, aamir, # TODO: add required authors |
5 | 5 |
|
6 | 6 | Introduction |
7 | 7 | -------------- |
8 | 8 |
|
9 | | -This tutorial introduces the steps for utilizing the `Neural Network Compression Framework (nncf) <https://github.com/openvinotoolkit/nncf/tree/develop>`_ to generate a quantized model customized |
10 | | -for the `OpenVINO torch.compile backend <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_ and explains how to lower the quantized model into the `OpenVINO <https://docs.openvino.ai/2024/index.html>`_ representation. |
| 9 | +This tutorial demonstrates how to use `OpenVINOQuantizer` from `Neural Network Compression Framework (NNCF) <https://github.com/openvinotoolkit/nncf/tree/develop>`_ in PyTorch 2 Export Quantization flow to generate a quantized model customized for the `OpenVINO torch.compile backend <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_ and explains how to lower the quantized model into the `OpenVINO <https://docs.openvino.ai/2024/index.html>`_ representation. |
11 | 10 |
|
12 | 11 | The PyTorch 2 Export Quantization flow uses torch.export to capture the model into a graph and performs quantization transformations on top of the ATen graph.
13 | 12 | This approach is expected to have significantly higher model coverage, better programmability, and a simplified UX. |
14 | | -OpenVINO is the new backend that compiles the FX Graph generated by TorchDynamo into an optimized OpenVINO model. |
| 13 | +The OpenVINO backend compiles the FX Graph generated by TorchDynamo into an optimized OpenVINO model.
15 | 14 |
|
16 | 15 | The quantization flow mainly includes four steps: |
17 | 16 |
|
18 | 17 | - Step 1: Install OpenVINO and NNCF. |
19 | 18 | - Step 2: Capture the FX Graph from the eager Model based on the `torch export mechanism <https://pytorch.org/docs/main/export.html>`_. |
20 | | -- Step 3: Apply the Quantization flow based on the captured FX Graph. |
21 | | -- Step 4: Lower the quantized model into OpenVINO representation with the API ``torch.compile``. |
| 19 | +- Step 3: Apply the PyTorch 2 Export Quantization flow with OpenVINOQuantizer based on the captured FX Graph. |
| 20 | +- Step 4: Lower the quantized model into OpenVINO representation with the API `torch.compile <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_. |
22 | 21 |
|
23 | 22 | The high-level architecture of this flow could look like this: |
24 | 23 |
|
@@ -80,7 +79,6 @@ We will start by performing the necessary imports, capturing the FX Graph from t |
80 | 79 | import torchvision.models as models |
81 | 80 | from torch.ao.quantization.quantize_pt2e import convert_pt2e |
82 | 81 | from torch.ao.quantization.quantize_pt2e import prepare_pt2e |
83 | | - from torch.ao.quantization.quantizer.openvino_quantizer import OpenVINOQuantizer |
84 | 82 |
|
85 | 83 | import nncf |
86 | 84 | from nncf.torch import disable_patching |
@@ -111,6 +109,8 @@ After we capture the FX Module to be quantized, we will import the OpenVINOQuant |
111 | 109 |
|
112 | 110 | .. code-block:: python |
113 | 111 |
|
| 112 | + from nncf.experimental.torch.fx.quantization.quantizer.openvino_quantizer import OpenVINOQuantizer |
| 113 | +
|
114 | 114 | quantizer = OpenVINOQuantizer() |
115 | 115 |
|
116 | 116 | ``OpenVINOQuantizer`` has several optional parameters that allow tuning the quantization process to get a more accurate model. |
@@ -208,4 +208,4 @@ Conclusion |
208 | 208 | ------------ |
209 | 209 |
|
210 | 210 | In this tutorial, we introduced how to use torch.compile with the OpenVINO backend and the OpenVINO quantizer.
211 | | -For further information, please visit `OpenVINO deploymet via torch.compile documentation <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_. |
| 211 | +For further information, please visit `OpenVINO deployment via torch.compile documentation <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_. |