Commit 7080627

merged v2.4.3.dev1(The XQuant Extension Tool has been added) (#1501)
* Developed the XQuant Extension Tool. * Developed XQuant to support "not Tensor"(not only "list") input. * Updated repository URLs in documents and tutorials.
1 parent 6a3dd1b commit 7080627

183 files changed

Lines changed: 13469 additions & 14792 deletions


README.md

Lines changed: 3 additions & 0 deletions
```diff
@@ -104,6 +104,9 @@ Modify your model's quantization configuration for specific layers or apply a cu
 **🖥️ Visualization**. Observe useful information for troubleshooting the quantized model's performance using TensorBoard. [Read more](https://sonysemiconductorsolutions.github.io/mct-model-optimization/guidelines/visualization.html).
 
 **🔑 XQuant (Explainable Quantization)** [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sony/model_optimization/blob/main/tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_xquant.ipynb). Get valuable insights regarding the quality and success of the quantization process of your model. The report includes histograms and similarity metrics between the original float model and the quantized model in key points of the model. The report can be visualized using TensorBoard.
+
+**🔑 XQuant Extension Tool.** Calculates the per-layer error by comparing the float and quantized models, using both models together with the quantization log, and presents the results in reports. It identifies the causes of the detected errors and recommends appropriate improvement measures for each cause. [Read more](docs/guidelines/XQuant_Extension_Tool.html) [Troubleshoot Manual](docs/docs_troubleshoot/index.html)
+
 __________________________________________________________________________________________________________
 ### Enhanced Post-Training Quantization (EPTQ)
 As part of the GPTQ capability, we provide an advanced optimization algorithm called EPTQ.
```

docs/.buildinfo

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
-# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 0ea336ad60b6f25f2eb2c0bcb6e36019
+# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
+config: 0acc28ca46f6cf8e16b36376430fc6b8
 tags: 645f666f9bcd5a90fca523b33c5a78b7
```

docs/_sources/api/api_docs/classes/XQuantConfig.rst.txt

Lines changed: 9 additions & 0 deletions
```diff
@@ -9,5 +9,14 @@ XQuant Configuration
 .. autoclass:: model_compression_toolkit.xquant.common.xquant_config.XQuantConfig
    :members:
 
+.. note::
 
+    The following parameters are only used in the **xquant_report_troubleshoot_pytorch_experimental** function.
 
+    - quantize_reported_dir
+    - threshold_quantize_error
+    - is_detect_under_threshold_quantize_error
+    - threshold_degrade_layer_ratio
+    - threshold_zscore_outlier_removal
+    - threshold_ratio_unbalanced_concatenation
+    - threshold_bitwidth_mixed_precision_with_model_output_loss_objective
```

docs/_sources/api/api_docs/index.rst.txt

Lines changed: 1 addition & 0 deletions
```diff
@@ -79,6 +79,7 @@ xquant
 ===========
 
 - :ref:`xquant_report_pytorch_experimental<ug-xquant_report_pytorch_experimental>`: A function to generate an explainable quantization report for a quantized Pytorch model (experimental).
+- :ref:`xquant_report_troubleshoot_pytorch_experimental<ug-xquant_report_troubleshoot_pytorch_experimental>`: A function to generate an explainable quantization report, detect degraded layers, and judge degradation causes for a quantized Pytorch model (experimental).
 - :ref:`xquant_report_keras_experimental<ug-xquant_report_keras_experimental>`: A function to generate an explainable quantization report for a quantized Keras model (experimental).
 
 - :ref:`XQuantConfig<ug-XQuantConfig>`: Configuration for the XQuant report (experimental).
```
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@

:orphan:

.. _ug-xquant_report_troubleshoot_pytorch_experimental:

================================================
XQuant Report Troubleshoot Pytorch
================================================

.. autofunction:: model_compression_toolkit.xquant.pytorch.facade_xquant_report.xquant_report_troubleshoot_pytorch_experimental
Lines changed: 286 additions & 0 deletions
@@ -0,0 +1,286 @@

===============================
XQuant Extension Tool
===============================

About XQuant Extension Tool
===============================
This tool calculates the error for each layer by comparing the float model and the quantized model, using both models together with the quantization log. The results are presented in reports. It identifies the causes of the detected errors and recommends appropriate improvement measures for each cause. The main components of the XQuant Extension Tool are:

* Troubleshooting Manual

  A document that outlines judgment methods and countermeasures for accuracy degradation, based on existing troubleshooting documentation.

* XQuant Extension Tool

  A tool that connects layers identified as having degraded accuracy to the relevant section of the manual and outputs the results.

Overall Process Flow
============================

.. image:: ../../images/flow.png

The overall process follows the steps below:

1. Input the float model, quantized model, and quantization log.
2. Detect layers whose accuracy has degraded due to quantization.
3. Judge the degradation causes for the detected layers.
4. Based on the judgment results, propose individual countermeasure procedures or general improvement measures from the troubleshooting manual.

Additionally, in the cases highlighted in red below, general improvement measures are suggested instead of specific countermeasures for each judged item:

* When no degraded layers can be found
* When the majority of layers are identified as degraded and the issue is judged not to lie with individual layers
* When accuracy is judged as degraded but none of the judgment items apply
* When accuracy does not improve after applying the proposed countermeasures

For details of the judgment items, refer to the linked manual.

How to Run
===============

| For execution instructions, refer to the linked tutorial.
| The XQuant Extension Tool was created based on xquant, shown in the link below. In addition to the conventional xquant functions, it is linked to a troubleshooting manual and provides appropriate countermeasures for each cause of degradation.

| It can suggest more specific countermeasures than conventional tools and provides manuals that are easy to understand even for users who are not familiar with quantization.

When running the tool, replace **xquant_report_pytorch_experimental** with **xquant_report_troubleshoot_pytorch_experimental** in the code from the XQuant (Explainable Quantization) tutorial.

.. code-block:: python

    from model_compression_toolkit.xquant import xquant_report_troubleshoot_pytorch_experimental

    # xquant_report_pytorch_experimental --> xquant_report_troubleshoot_pytorch_experimental
    result = xquant_report_troubleshoot_pytorch_experimental(
        float_model,
        quantized_model,
        random_data_gen,
        validation_dataset,
        xquant_config
    )

| When running XQuant, execute the steps in the following order:

1. mct.set_log_folder
2. mct.ptq.pytorch_post_training_quantization
3. XQuantConfig
4. xquant_report_troubleshoot_pytorch_experimental

.. code-block:: python

    mct.set_log_folder('./log/dir/path')

    quantized_model, quantized_info = mct.ptq.pytorch_post_training_quantization(
        in_module=float_model, representative_data_gen=random_data_gen)

    xquant_config = XQuantConfig(report_dir='./log_tensorboard_xquant')

    from model_compression_toolkit.xquant import xquant_report_troubleshoot_pytorch_experimental
    result = xquant_report_troubleshoot_pytorch_experimental(
        float_model,
        quantized_model,
        random_data_gen,
        validation_dataset,
        xquant_config
    )

| The log for TensorBoard is generated in the folder path set by *mct.set_log_folder*.

.. note::

    If the TensorBoard log does not exist, the *Unbalanced Concatenation* check described below will not be executed.
XQuantConfig Format and Examples
======================================

When running XQuant, the parameters can be set as shown in the table below.

.. list-table:: XQuantConfig parameters
   :header-rows: 1
   :widths: 15 15 50 20

   * - Input parameter
     - Type
     - Details
     - Default value

   * - report_dir
     - str
     - Directory where the results will be saved. **[Required]**
     - ``-``

   * - custom_similarity_metrics
     - dict[str, Callable]
     - User-specified quantization error metric functions. str: metric name; Callable: function that calculates the metric.
     - None

   * - quantize_reported_dir
     - str
     - Directory where the quantization log is saved. If not specified, the path set with *mct.set_log_folder* is used.
     - Most recently set value in mct.set_log_folder

   * - threshold_quantize_error
     - dict[str, float]
     - Threshold values for detecting accuracy degradation.
     - {"mse": 0.1, "cs": 0.1, "sqnr": 0.1}

   * - is_detect_under_threshold_quantize_error
     - dict[str, bool]
     - For each threshold specified in threshold_quantize_error: True detects a layer as degraded when the error is below the threshold; False detects it as degraded when the error is above the threshold. (Not required if custom metrics are not set.)
     - {"mse": False, "cs": True, "sqnr": True}

   * - threshold_degrade_layer_ratio
     - float
     - If the ratio of layers detected as degraded exceeds this value, the degradation-cause judgment is skipped.
     - 0.5

   * - threshold_zscore_outlier_removal
     - float
     - Used in the degradation-cause judgment (Outlier Removal). Z-score threshold for detecting outliers.
     - 5.0

   * - threshold_ratio_unbalanced_concatenation
     - float
     - Used in the degradation-cause judgment (Unbalanced Concatenation). Threshold for the ratio of range widths between concatenated layers.
     - 16.0

   * - threshold_bitwidth_mixed_precision_with_model_output_loss_objective
     - int
     - Used in the degradation-cause judgment (Mixed Precision with model output loss objective). Final-layer bit width at or below which the bit width is judged insufficient.
     - 2

You can configure each parameter by instantiating the XQuantConfig class as shown below.

.. code-block:: python

    XQuantConfig(report_dir: str,
                 custom_similarity_metrics: Dict[str, Callable] = None,
                 quantize_reported_dir: str = None,
                 threshold_quantize_error: Dict[str, float] = {"mse": 0.1, "cs": 0.1, "sqnr": 0.1},
                 is_detect_under_threshold_quantize_error: Dict[str, bool] = {"mse": False, "cs": True, "sqnr": True},
                 threshold_degrade_layer_ratio: float = 0.5,
                 threshold_zscore_outlier_removal: float = 5.0,
                 threshold_ratio_unbalanced_concatenation: float = 16.0,
                 threshold_bitwidth_mixed_precision_with_model_output_loss_objective: int = 2)
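As a sketch of how custom_similarity_metrics might be populated: the metric name maps to a callable that compares a layer's float and quantized outputs. The exact callable signature MCT expects is an assumption here, and the mean-absolute-error metric below is a hypothetical example, not a built-in.

```python
import numpy as np

# Hypothetical custom metric: mean absolute error between a layer's float
# and quantized outputs. The (float_out, quant_out) signature is an
# assumption, not MCT's documented API.
def mae_metric(float_out, quant_out):
    return float(np.mean(np.abs(np.asarray(float_out) - np.asarray(quant_out))))

custom_metrics = {"mae": mae_metric}

# The dict would then be passed to the config, e.g.:
# xquant_config = XQuantConfig(report_dir='./log_tensorboard_xquant',
#                              custom_similarity_metrics=custom_metrics)

print(mae_metric([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # ~0.333
```

If a custom metric is added, remember to also extend threshold_quantize_error and is_detect_under_threshold_quantize_error with an entry for it, per the table above.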
Understanding the Quantization Error Graph
=============================================================

| Quantization error graphs are generated for three calculation methods (mse, cs, sqnr) and two datasets (representative and validation), resulting in six graphs in total.
| These graphs are saved in the directory specified by report_dir in XQuantConfig.

.. image:: ../../images/quant_loss_mse_repr.png

* **X-axis**: Layer names (layers identified as degraded are highlighted in red)
* **Y-axis**: Quantization error
* **Red dashed line**: Accuracy-degradation threshold set in XQuantConfig
* **Red circle**: Layers judged to have degraded accuracy

| As an example, the graph above shows the error calculated with "mse" on the representative dataset.
| With the default threshold of 0.1, layers exceeding the threshold are marked with a red circle and the corresponding layer names on the X-axis are highlighted in red, so layers with accuracy degradation can be confirmed visually.
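The detection rule behind these graphs can be sketched as follows. This is a simplified illustration, not MCT's implementation: each layer's error is compared against the metric's threshold, and is_detect_under_threshold_quantize_error decides the direction of the comparison. Layer names and error values are made up.

```python
# Simplified sketch of the degraded-layer detection rule.
def detect_degraded_layers(errors, thresholds, detect_under):
    """errors: {metric: {layer: error value}} -> {metric: [degraded layers]}."""
    degraded = {}
    for metric, per_layer in errors.items():
        t, under = thresholds[metric], detect_under[metric]
        degraded[metric] = [name for name, v in per_layer.items()
                            if (v < t if under else v > t)]
    return degraded

errors = {"mse": {"conv1": 0.02, "conv2": 0.35},
          "cs":  {"conv1": 0.99, "conv2": 0.04}}
thresholds = {"mse": 0.1, "cs": 0.1}
detect_under = {"mse": False, "cs": True}  # the XQuantConfig defaults

print(detect_degraded_layers(errors, thresholds, detect_under))
# {'mse': ['conv2'], 'cs': ['conv2']}
```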
Understanding the Judgment Results
============================================

Outlier Removal
-----------------

| In Outlier Removal, output values whose z-score exceeds the threshold set in XQuantConfig's threshold_zscore_outlier_removal are detected.
| The console displays a message stating that there are output values that deviate significantly from the average and refers you to the "Outlier Removal" section of the troubleshooting manual.
| It also lists the paths of the saved histograms for the layers containing the detected outliers.
| The histograms are saved in a directory named "outlier_histgrams" created under the path specified by **report_dir** in **XQuantConfig**.

::

    WARNING:Model Compression Toolkit:There are output values that deviate significantly from the average. Refer to the following images and the TroubleShooting Documentation (MCT XQuant Extension Tool) of 'Outlier Removal'.
    WARNING:Model Compression Toolkit:./log_tensorboard_xquant/outlier_histgrams/stem_2_conv_kxk_0_conv_bn.png
    WARNING:Model Compression Toolkit:./log_tensorboard_xquant/outlier_histgrams/stages_0_blocks_0_token_mixer_mixer_conv_scale_conv_bn.png
    WARNING:Model Compression Toolkit:./log_tensorboard_xquant/outlier_histgrams/stages_0_blocks_0_token_mixer_mixer_conv_kxk_0_conv_bn.png

| Next, the output histogram is explained.

.. image:: ../../images/outlier.png

* **First X-axis (lower)**: Bins that finely divide the range of data values.
* **Second X-axis (upper)**: Z-score values corresponding to the first X-axis.
* **Red dashed line**: The z-score threshold set in XQuantConfig.
* **Black dashed lines**

  * **Lower zscore**: The maximum value on the lower side of the histogram.
  * **Upper zscore**: The maximum value on the upper side of the histogram.

| An example of a histogram detected by Outlier Removal is shown above.
| In this example, outliers appear in the range from about 3.9 to 5.3 on the lower end of the z-score.
| Therefore, setting the z-score threshold to 3.9 will remove these outliers.
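The z-score screening behind this histogram can be sketched as follows. This is a simplified illustration using numpy; the tool's exact statistics and binning may differ.

```python
import numpy as np

def zscore_extremes(values, threshold=5.0):
    """Return the lower/upper extreme z-scores of a layer's outputs and
    whether either crosses the outlier threshold (sketch of the check)."""
    v = np.asarray(values, dtype=float)
    z = (v - v.mean()) / v.std()
    lower, upper = float(-z.min()), float(z.max())
    return lower, upper, lower > threshold or upper > threshold

# A mostly flat distribution with one extreme value is flagged as an outlier:
data = [0.0] * 99 + [10.0]
lower, upper, flagged = zscore_extremes(data)
print(flagged)  # True
```

The lower/upper values correspond to the black dashed lines in the histogram; when one of them exceeds the red dashed threshold line, the layer is reported.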
| By passing the threshold confirmed with the XQuant Extension Tool as an argument to **mct.ptq.pytorch_post_training_quantization** as follows, the outliers can be removed.

.. code-block:: python

    core_config = mct.core.CoreConfig(mct.core.QuantizationConfig(z_threshold=3.9))
    quantized_model, quantized_info = mct.ptq.pytorch_post_training_quantization(in_module=float_model,
                                                                                 representative_data_gen=random_data_gen,
                                                                                 core_config=core_config)
Shift Negative Activation
------------------------------

| Shift Negative Activation is detected when the model contains activation layers that output negative values.
| The console displays a message indicating that such a layer has been found and recommends consulting the "Shift Negative Activation" section of the troubleshooting manual.

| The detected activation layers are also listed.

| Refer to the troubleshooting manual for further details.

::

    WARNING:Model Compression Toolkit:There are activations that contain negative values. Refer to the troubleshooting manual of "Shift Negative Activation".
    WARNING:Model Compression Toolkit:stem_0_act=GELU
    WARNING:Model Compression Toolkit:stem_1_act=GELU
    WARNING:Model Compression Toolkit:stem_2_act=GELU
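The check itself can be sketched as a scan over the model's activation layers for types whose output can be negative. This is an illustrative reconstruction, not MCT's implementation; the set of activation names and the layer mapping below are hypothetical.

```python
# Hypothetical sketch: flag activation layers whose function can output
# negative values, mirroring the Shift Negative Activation check.
NEGATIVE_OUTPUT_ACTIVATIONS = {"GELU", "PReLU", "ELU", "SiLU", "Hardswish"}

def find_negative_activations(layers):
    """layers: mapping of layer name -> activation type name."""
    return [f"{name}={act}" for name, act in layers.items()
            if act in NEGATIVE_OUTPUT_ACTIVATIONS]

layers = {"stem_0_act": "GELU", "stem_1_act": "GELU", "head_act": "ReLU"}
print(find_negative_activations(layers))
# ['stem_0_act=GELU', 'stem_1_act=GELU']
```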
Unbalanced Concatenation
---------------------------

| For Unbalanced Concatenation, the value range of each concatenated layer is calculated.
| If the ratio of range widths exceeds the threshold set in XQuantConfig's threshold_ratio_unbalanced_concatenation, the concatenation is considered unbalanced.
| The console displays a message indicating that an unbalanced concatenation was detected and suggests consulting the 'Unbalanced "concatenation"' section of the troubleshooting manual.
| Additionally, the names of the relevant layers and the recommended scaling factor are displayed.

::

    WARNING:Model Compression Toolkit:There are unbalanced range layers concatnated. Refer to the troubleshooting manual of 'Unbalanced "concatenation"'.
    WARNING:Model Compression Toolkit:first layer:features.15.conv.2, second layer:features.15.conv.3, if you add a scaling operation, recommended scaling:first layer * 5.758747418625537
    WARNING:Model Compression Toolkit:first layer:features.16.conv.2, second layer:features.16.conv.3, if you add a scaling operation, recommended scaling:first layer * 6.228137651975462
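The recommended scaling printed above can be sketched as the ratio of the two layers' range widths. This is an illustrative reconstruction under that assumption, not the tool's exact formula, and the ranges below are hypothetical.

```python
def recommended_scaling(first_range, second_range):
    """Ratio of range widths between two concatenated layers; scaling the
    first (narrower) layer by this factor balances the concatenation inputs."""
    first_width = first_range[1] - first_range[0]
    second_width = second_range[1] - second_range[0]
    return second_width / first_width

# Hypothetical (min, max) output ranges for two layers feeding one concat:
ratio = recommended_scaling((-0.5, 0.5), (-3.0, 3.0))
print(ratio)  # 6.0
# A ratio above threshold_ratio_unbalanced_concatenation (default 16.0)
# would be flagged as unbalanced.
```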
| To resolve an unbalanced concatenation, there are two methods: disable some graph optimizations as shown below, or apply the multiplier shown in the console to the layer before the concatenation.

| For details, refer to the troubleshooting manual.

.. code-block:: python

    core_config = mct.core.CoreConfig(mct.core.QuantizationConfig(linear_collapsing=False,
                                                                  residual_collapsing=False))
    quantized_model, _ = mct.ptq.pytorch_post_training_quantization(...,
                                                                    core_config=core_config)
Mixed Precision with model output loss objective
------------------------------------------------------------

| For Mixed Precision with model output loss objective, a warning is generated when the bit width of the final layer is less than or equal to the threshold (default: 2).
| If the quantization bit width of the last layer is unusually small, the console displays a message recommending the "Mixed Precision with model output loss objective" section of the troubleshooting manual.

::

    WARNING:Model Compression Toolkit:the quantization bitwidth of the last layer is an extremely small number. Refer to the troubleshooting manual of 'Mixed Precision with model output loss objective'.
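This check reduces to comparing the final layer's assigned bit width against the threshold; a minimal sketch follows. The layer names and bit-width assignments are hypothetical.

```python
def final_layer_bitwidth_too_small(bitwidths, threshold=2):
    """bitwidths: ordered mapping of layer name -> assigned bit width.
    True when the last layer's bit width is at or below the threshold."""
    last_layer = list(bitwidths)[-1]
    return bitwidths[last_layer] <= threshold

print(final_layer_bitwidth_too_small({"conv1": 8, "conv2": 4, "fc": 2}))  # True
```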

docs/_sources/index.rst.txt

Lines changed: 1 addition & 0 deletions
```diff
@@ -73,6 +73,7 @@ Visualization:
    :maxdepth: 1
 
    Visualize a model and other data within the TensorBoard UI. <../guidelines/visualization>
+   XQuant Extension Tool <../guidelines/XQuant_Extension_Tool>
 
 
 Quickstart
```
