@@ -1,24 +1,23 @@
-PyTorch 2 Export Quantization with OpenVINO backend
+PyTorch 2 Export Quantization for OpenVINO torch.compile backend
 ===========================================================================
 
 **Author**: dlyakhov, asuslov, aamir, # TODO: add required authors
 
 Introduction
 --------------
 
-This tutorial introduces the steps for utilizing the `Neural Network Compression Framework (nncf) <https://github.com/openvinotoolkit/nncf/tree/develop>`_ to generate a quantized model customized
-for the `OpenVINO torch.compile backend <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_ and explains how to lower the quantized model into the `OpenVINO <https://docs.openvino.ai/2024/index.html>`_ representation.
+This tutorial demonstrates how to use ``OpenVINOQuantizer`` from the `Neural Network Compression Framework (NNCF) <https://github.com/openvinotoolkit/nncf/tree/develop>`_ in the PyTorch 2 Export Quantization flow to generate a quantized model customized for the `OpenVINO torch.compile backend <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_ and explains how to lower the quantized model into the `OpenVINO <https://docs.openvino.ai/2024/index.html>`_ representation.
 
 The pytorch 2 export quantization flow uses the torch.export to capture the model into a graph and performs quantization transformations on top of the ATen graph.
 This approach is expected to have significantly higher model coverage, better programmability, and a simplified UX.
-OpenVINO is the new backend that compiles the FX Graph generated by TorchDynamo into an optimized OpenVINO model.
+The OpenVINO backend compiles the FX Graph generated by TorchDynamo into an optimized OpenVINO model.
 
 The quantization flow mainly includes four steps:
 
 - Step 1: Install OpenVINO and NNCF.
 - Step 2: Capture the FX Graph from the eager Model based on the `torch export mechanism <https://pytorch.org/docs/main/export.html>`_.
-- Step 3: Apply the Quantization flow based on the captured FX Graph.
-- Step 4: Lower the quantized model into OpenVINO representation with the API ``torch.compile``.
+- Step 3: Apply the PyTorch 2 Export Quantization flow with ``OpenVINOQuantizer`` based on the captured FX Graph.
+- Step 4: Lower the quantized model into OpenVINO representation with the API `torch.compile <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_.
 
 The high-level architecture of this flow could look like this:
 
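The four steps above describe the flow end to end. As a framework-free sketch of what the quantization transformations in Step 3 do numerically, int8 affine quantization maps each float value to ``round(x / scale) + zero_point`` and back; the helper names below are illustrative only, not NNCF or OpenVINO API:

```python
# Illustrative sketch of int8 affine quantization (not NNCF/OpenVINO API):
# quantize maps floats to clamped int8 codes, dequantize maps codes back.

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map float values to clamped int8 codes."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize(codes, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(c - zero_point) * scale for c in codes]

# Choose scale so that [-1.0, 1.0] spans the int8 range; zero_point = 0.
scale, zero_point = 2.0 / 255, 0
x = [-1.0, -0.5, 0.0, 0.25, 1.0]
q = quantize(x, scale, zero_point)
x_hat = dequantize(q, scale, zero_point)
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
assert max_err <= scale / 2 + 1e-9  # round-trip error is at most half a step
```

Each quantized operation in the lowered model carries such scale/zero-point parameters, which the calibration pass in Step 3 estimates from sample data.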
@@ -80,7 +79,6 @@ We will start by performing the necessary imports, capturing the FX Graph from t
     import torchvision.models as models
     from torch.ao.quantization.quantize_pt2e import convert_pt2e
     from torch.ao.quantization.quantize_pt2e import prepare_pt2e
-    from torch.ao.quantization.quantizer.openvino_quantizer import OpenVINOQuantizer
 
     import nncf
     from nncf.torch import disable_patching
@@ -111,6 +109,8 @@ After we capture the FX Module to be quantized, we will import the OpenVINOQuant
 
 .. code-block:: python
 
+    from nncf.experimental.torch.fx.quantization.quantizer.openvino_quantizer import OpenVINOQuantizer
+
     quantizer = OpenVINOQuantizer()
 
 ``OpenVINOQuantizer`` has several optional parameters that allow tuning the quantization process to get a more accurate model.
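Stitched together, Steps 2-4 of the flow being edited here might look like the sketch below. It is a hedged illustration, not the tutorial's verbatim code: it assumes torch, torchvision, NNCF, and OpenVINO are installed, ``resnet18`` with a random input is an arbitrary example choice, and the exact export API may differ across torch versions:

```python
import torch
import torchvision.models as models
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

from nncf.torch import disable_patching
from nncf.experimental.torch.fx.quantization.quantizer.openvino_quantizer import OpenVINOQuantizer

model = models.resnet18(weights=None).eval()
example_input = torch.randn(1, 3, 224, 224)

with torch.no_grad(), disable_patching():
    # Step 2: capture the FX Graph via torch.export.
    exported_model = torch.export.export(model, (example_input,)).module()

    # Step 3: annotate with OpenVINOQuantizer, run calibration, convert.
    quantizer = OpenVINOQuantizer()
    prepared_model = prepare_pt2e(exported_model, quantizer)
    prepared_model(example_input)  # real code would loop over a calibration set
    quantized_model = convert_pt2e(prepared_model)

    # Step 4: lower to OpenVINO through torch.compile.
    compiled_model = torch.compile(quantized_model, backend="openvino")
    output = compiled_model(example_input)
```

The ``disable_patching`` context manager, imported by the tutorial from ``nncf.torch``, keeps NNCF's eager-mode patching from interfering with the traced calls.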
@@ -208,4 +208,4 @@ Conclusion
 ------------
 
 With this tutorial, we introduce how to use torch.compile with the OpenVINO backend and the OpenVINO quantizer.
-For further information, please visit `OpenVINO deploymet via torch.compile documentation <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_.
+For further information, please visit `OpenVINO deployment via torch.compile documentation <https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html>`_.