Commit c9d664e

Update tutorial for XQuant Extension Tool: refine content and remove redundant sections for clarity
1 parent 1471725 commit c9d664e

File tree

2 files changed: +303 −150 lines changed


tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_XQuant_Extension_Tool.ipynb

Lines changed: 8 additions & 150 deletions
@@ -6,14 +6,16 @@
    "id": "ag0MtvPUkc8i"
   },
   "source": [
-    "# Quantization Troubleshooting with the Model Compression Toolkit (MCT) Using the XQuant Extension Tool\n",
+    "# Quantization Troubleshooting with the Model Compression Toolkit (MCT) Using the XQuant Extension Tool (Judgeable Troubleshoot)\n",
     "\n",
     "[Run this tutorial in Google Colab](https://colab.research.google.com/github/SonySemiconductorSolutions/mct-model-optimization/blob/main/tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_XQuant_Extension_Tool.ipynb)\n",
     "\n",
     "## Overview\n",
     "This notebook provides practical guidance for improving the quality of post‑training quantization for PyTorch models using the XQuant Extension Tool.\n",
     "It computes the error for each layer by comparing the floating‑point model and the quantized model, and combines these results with the quantization log. The analysis is presented as a report that highlights the causes of detected errors and suggests appropriate corrective actions for each.\n",
     "\n",
+    "General corrective actions are listed [here](https://colab.research.google.com/github/SonySemiconductorSolutions/mct-model-optimization/blob/main/tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_XQuant_Extension_Tool_2.ipynb).\n",
+    "\n",
     "## Summary\n",
     "We will cover the following steps:\n",
     "\n",
@@ -24,16 +26,8 @@
     " - Quantization Troubleshooting for MCT \n",
     "4. Judgeable Troubleshoot\n",
     " - Outlier Removal\n",
-    "5. General Troubleshoot\n",
-    " - Representative Dataset Size & Diversity\n",
-    " - Bias Correction\n",
-    " - Using More Samples in Mixed Precision Quantization\n",
-    " - Threshold Selection Error Method\n",
-    " - Enabling Hessian Based Mixed Precision\n",
-    " - GPTQ - Gradient-Based Post Training Quantization\n",
-    "6. Conclusion\n",
+    "5. Conclusion\n",
     "\n",
-    " \n",
     "## Setup\n",
     "Install the relevant packages:"
    ]
@@ -187,152 +181,16 @@
187181
"cell_type": "markdown",
188182
"metadata": {},
189183
"source": [
190-
"## General Troubleshoots\n",
191-
"If there is no significant improvement, comprehensively evaluate other areas for improvement.\n",
192-
"The following items are general troubleshoots for quantization accuracy improvement."
193-
]
194-
},
195-
{
196-
"cell_type": "markdown",
197-
"metadata": {},
198-
"source": [
199-
"### Representative Dataset Size & Diversity\n",
200-
"The representative dataset is used by MCT to derive the threshold for the model's activation tensor.\n",
201-
"If the representative dataset is too small or not diverse enough, accuracy may decrease.\n",
202-
"Increase the number of samples in the representative dataset or increase the diversity of the samples.\n",
203-
"\n",
204-
"See [TroubleShooting Documentation>>Representative Dataset Size & Diversity](https://sonysemiconductorsolutions.github.io/mct-model-optimization/docs_troubleshoot/troubleshoots/representative_dataset_size_and_diversity.html#ug-representative-dataset-size-and-diversity)"
205-
]
206-
},
207-
{
208-
"cell_type": "code",
209-
"execution_count": null,
210-
"metadata": {},
211-
"outputs": [],
212-
"source": [
213-
"# Change representative dataset and re-run the quantization process."
214-
]
215-
},
216-
{
217-
"cell_type": "markdown",
218-
"metadata": {},
219-
"source": [
220-
"### Bias Correction\n",
221-
"MCT applies bias correction by default to overcome induced bias shift caused by weights quantization.\n",
222-
"\n",
223-
"You can check if the bias correction causes a degradation in accuracy, by disabling the bias correction (setting weights_bias_correction to False of the QuantizationConfig in CoreConfig).\n",
224-
"\n",
225-
"See [TroubleShooting Documentation>>Bias Correction](https://sonysemiconductorsolutions.github.io/mct-model-optimization/docs_troubleshoot/troubleshoots/bias_correction.html#ug-bias-correction)"
226-
]
227-
},
228-
{
229-
"cell_type": "code",
230-
"execution_count": null,
231-
"metadata": {},
232-
"outputs": [],
233-
"source": [
234-
"# Change weights_bias_correction and re-run the quantization process."
235-
]
236-
},
237-
{
238-
"cell_type": "markdown",
239-
"metadata": {},
240-
"source": [
241-
"### Using More Samples in Mixed Precision Quantization\n",
242-
"\n",
243-
"In Mixed Precision quantization, MCT will assign a different bit width to each weight in the model, depending on the weight’s layer sensitivity and a resource constraint defined by the user, such as target model size.\n",
244-
"\n",
245-
"By default, MCT employs 32 samples from the provided representative dataset for the Mixed Precision search. Leveraging a larger dataset could enhance results, particularly when dealing with datasets exhibiting high variance.\n",
246-
"\n",
247-
"Set the num_of_images attribute to a larger value of the MixedPrecisionQuantizationConfig in CoreConfig.\n",
248-
"\n",
249-
"See [TroubleShooting Documentation>>Using More Samples in Mixed Precision Quantization](https://sonysemiconductorsolutions.github.io/mct-model-optimization/docs_troubleshoot/troubleshoots/using_more_samples_in_mixed_precision_quantization.html#ug-using-more-samples-in-mixed-precision-quantization"
250-
]
251-
},
252-
{
253-
"cell_type": "code",
254-
"execution_count": null,
255-
"metadata": {},
256-
"outputs": [],
257-
"source": [
258-
"# Change num_of_images and re-run the quantization process."
259-
]
260-
},
261-
{
262-
"cell_type": "markdown",
263-
"metadata": {},
264-
"source": [
265-
"### Threshold Selection Error Method\n",
266-
"MCT defaults to employing the Mean-Squared Error (MSE) metric for threshold optimization, however, it offers a range of alternative error metrics (e.g. using min/max values, KL-divergence, etc.) to accommodate different network requirements.\n",
267-
"\n",
268-
"We advise you to consider other error metrics if your model is suffering from significant accuracy degradation, especially if it contains unorthodox activation layers.\n",
269-
"\n",
270-
"For example, set NOCLIPPING to the activation_error_method attribute of the QuantizationConfig in CoreConfig.\n",
271-
"\n",
272-
"See [TroubleShooting Documentation>>Threshold Selection Error Method](https://sonysemiconductorsolutions.github.io/mct-model-optimization/docs_troubleshoot/troubleshoots/threhold_selection_error_method.html#ug-threshold-selection-error-method)"
273-
]
274-
},
275-
{
276-
"cell_type": "code",
277-
"execution_count": null,
278-
"metadata": {},
279-
"outputs": [],
280-
"source": [
281-
"# Change activation_error_method and re-run the quantization process."
282-
]
283-
},
284-
{
285-
"cell_type": "markdown",
286-
"metadata": {},
287-
"source": [
288-
"### Enabling Hessian Based Mixed Precision\n",
289-
"MCT offers a Hessian-based scoring mechanism to assess the importance of layers during the Mixed Precision search.\n",
290-
"This feature can notably enhance Mixed Precision outcomes for certain network architectures.\n",
291-
"\n",
292-
"Set the use_hessian_based_scores flag to True in the MixedPrecisionQuantizationConfig of the CoreConfig.\n",
293-
"\n",
294-
"See [TroubleShooting Documentation>>Enabling Hessian-Based Mixed Precision](https://sonysemiconductorsolutions.github.io/mct-model-optimization/docs_troubleshoot/troubleshoots/enabling_hessian-based_mixed_precision.html#ug-enabling-hessian-based-mixed-precision)\n"
295-
]
296-
},
297-
{
298-
"cell_type": "code",
299-
"execution_count": null,
300-
"metadata": {},
301-
"outputs": [],
302-
"source": [
303-
"# Change use_hessian_based_scores and re-run the quantization process."
304-
]
305-
},
306-
{
307-
"cell_type": "markdown",
308-
"metadata": {},
309-
"source": [
310-
"### GPTQ - Gradient-Based Post Training Quantization\n",
311-
"When PTQ (either with or without Mixed Precision) fails to deliver the required accuracy, GPTQ is potentially the remedy.\n",
312-
"\n",
313-
"MCT can configure GPTQ optimization options, such as the number of epochs for the optimization process.\n",
314-
"\n",
315-
"See [GPTQ - Gradient-Based Post Training Quantization](https://sonysemiconductorsolutions.github.io/mct-model-optimization/docs_troubleshoot/troubleshoots/gptq-gradient_based_post_training_quantization.html#ug-gptq-gradient-based-post-training-quantization)"
316-
]
317-
},
318-
{
319-
"cell_type": "code",
320-
"execution_count": null,
321-
"metadata": {},
322-
"outputs": [],
323-
"source": [
324-
"# Re-run the quantization process using GPTQ."
184+
"## Conclusion\n",
185+
"Through this XQuant analysis, accuracy improved by XX%\n",
186+
"By doing this, you can improve the accuracy of your model with the XQuant Extension Tool."
325187
]
326188
},
327189
{
328190
"cell_type": "markdown",
329191
"metadata": {},
330192
"source": [
331-
"## Conclusion\n",
332-
"Through this XQuant analysis, accuracy improved by XX%\n",
333-
"By doing this, you can improve the accuracy of your model with the XQuant Extension Tool.\n",
334-
"For more information, please see [this document](https://sonysemiconductorsolutions.github.io/mct-model-optimization/guidelines/XQuant_Extension_Tool.html#overall-process-flow\n",
335-
")"
193+
"To see general corrective actions and practical guidance to improve the post-training quantization quality of your PyTorch models, check out [the notebook](https://colab.research.google.com/github/SonySemiconductorSolutions/mct-model-optimization/blob/main/tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_XQuant_Extension_Tool_2.ipynb)."
336194
]
337195
},
338196
{
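The cells removed by this commit name several MCT configuration knobs without showing them wired together. As a rough orientation for readers of this diff, here is a minimal sketch of how those options are set, assuming the public `model_compression_toolkit` PyTorch API; `model` and `representative_data_gen` are hypothetical placeholders the user must supply.

```python
# Sketch of the troubleshooting knobs described in the removed cells.
# ASSUMPTIONS: `model` is a torch.nn.Module and `representative_data_gen`
# is a generator yielding representative input batches; both are defined
# elsewhere by the user.
import model_compression_toolkit as mct

core_config = mct.core.CoreConfig(
    quantization_config=mct.core.QuantizationConfig(
        # "Bias Correction": disable to check whether it degrades accuracy.
        weights_bias_correction=False,
        # "Threshold Selection Error Method": e.g. NOCLIPPING instead of MSE.
        activation_error_method=mct.core.QuantizationErrorMethod.NOCLIPPING),
    mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig(
        # "Using More Samples": raise from the default of 32.
        num_of_images=64,
        # "Enabling Hessian Based Mixed Precision".
        use_hessian_based_scores=True))

# Plain PTQ with the tuned configuration.
quantized_model, quantization_info = mct.ptq.pytorch_post_training_quantization(
    model, representative_data_gen, core_config=core_config)

# "GPTQ": gradient-based PTQ when plain PTQ accuracy is insufficient;
# the number of optimization epochs is one of the configurable options.
gptq_config = mct.gptq.get_pytorch_gptq_config(n_epochs=5)
quantized_model, quantization_info = \
    mct.gptq.pytorch_gradient_post_training_quantization(
        model, representative_data_gen,
        gptq_config=gptq_config, core_config=core_config)
```

Note that the mixed-precision settings only take effect when a target resource utilization (e.g. a model-size constraint) is also passed to the quantization call, per the MCT mixed-precision workflow.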

0 commit comments
