
Commit a3df4ed

Update FAQ and documentation
1 parent b8297d7 commit a3df4ed

3 files changed: +15 additions, −21 deletions


FAQ.md

Lines changed: 13 additions & 17 deletions
```diff
@@ -5,7 +5,7 @@
 1. [Why does the size of the quantized model remain the same as the original model size?](#1-why-does-the-size-of-the-quantized-model-remain-the-same-as-the-original-model-size)
 2. [Why does loading a quantized exported model from a file fail?](#2-why-does-loading-a-quantized-exported-model-from-a-file-fail)
 3. [Why am I getting a torch.fx error?](#3-why-am-i-getting-a-torchfx-error)
-
+4. [Does MCT support both per-tensor and per-channel quantization?](#4-does-mct-support-both-per-tensor-and-per-channel-quantization)
 
 ### 1. Why does the size of the quantized model remain the same as the original model size?
 
@@ -57,23 +57,19 @@ Check the `torch.fx` error, and search for an identical replacement. Some exampl
 
 ### 4. Does MCT support both per-tensor and per-channel quantization?
 
-MCT supports both per-tensor and per-channel quantization, as [defined in TPC](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#ug-target-platform-capabilities)
-
-#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold).
-To change this, please set the following parameters.
+MCT supports both per-tensor and per-channel quantization, as [defined in TPC](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold). To change this, please set the following parameters.
 
-Solution:
-You can switch between per-tensor quantization and per-channel quantization by switching the parameter (weights_per_channel_threshold) as shown below.
+**Solution**: You can switch between per-tensor quantization and per-channel quantization by switching the parameter (weights_per_channel_threshold) as shown below.
 
-In the object that configures the quantizer below:
-model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig()
-Set the following parameters:
-weights_per_channel_threshold(bool) - Indicates whether to quantize the weights per-channel or per-tensor.
-For more details, please refer to [this](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold) page.
+In the object that configures the quantizer below:
+* model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig()
+Set the following parameter:
+* weights_per_channel_threshold(bool) - Indicates whether to quantize the weights per-channel or per-tensor.
+For more details, please refer to [this page](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold).
 
 
-In QAT, the following object is used to set up a weight-learnable quantizer:
-model_compression_toolkit.trainable_infrastructure.TrainableQuantizerWeightsConfig()
-Set the following parameters:
-weights_per_channel_threshold (bool) – Whether to quantize the weights per-channel or not (per-tensor).
-For more details, please refer to [this](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/trainable_infrastructure.html#trainablequantizerweightsconfig) page.
+In QAT, the following object is used to set up a weight-learnable quantizer:
+* model_compression_toolkit.trainable_infrastructure.TrainableQuantizerWeightsConfig()
+Set the following parameter:
+* weights_per_channel_threshold (bool) – Whether to quantize the weights per-channel or not (per-tensor).
+For more details, please refer to [this page](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/trainable_infrastructure.html#trainablequantizerweightsconfig).
```
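The FAQ answer above contrasts per-tensor and per-channel weight quantization. A minimal NumPy sketch (a conceptual illustration only, not MCT's implementation; the `quantize` helper is hypothetical) shows why per-channel thresholds matter when channel ranges differ:

```python
import numpy as np

def quantize(w, threshold, n_bits=8):
    """Symmetric uniform quantization of w using the given threshold(s)."""
    scale = threshold / (2 ** (n_bits - 1))  # step size derived from the threshold
    return np.clip(np.round(w / scale), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1) * scale

# Two output channels with very different magnitude ranges.
w = np.array([[0.01, -0.02],
              [5.00, -4.00]])

# Per-tensor: one threshold for the whole tensor (weights_per_channel_threshold=False).
per_tensor = quantize(w, np.abs(w).max())

# Per-channel: one threshold per output channel (weights_per_channel_threshold=True).
per_channel = quantize(w, np.abs(w).max(axis=1, keepdims=True))
```

With a single shared threshold the small-magnitude channel collapses toward zero, while per-channel thresholds preserve it; the large channel is quantized identically in both cases.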

docs/api/api_docs/classes/QuantizationErrorMethod.html

Lines changed: 1 addition & 3 deletions
```diff
@@ -44,9 +44,7 @@ <h3>Navigation</h3>
 <dl class="py class">
 <dt class="sig sig-object py" id="model_compression_toolkit.core.QuantizationErrorMethod">
 <em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">model_compression_toolkit.core.</span></span><span class="sig-name descname"><span class="pre">QuantizationErrorMethod</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">value</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#model_compression_toolkit.core.QuantizationErrorMethod" title="Link to this definition"></a></dt>
-<dd><blockquote>
-<div><p>Method for quantization threshold selection:</p>
-</div></blockquote>
+<dd><p>Method for quantization threshold selection:</p>
 <p>NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.</p>
 <p>MSE - <strong>(default)</strong> Use mean square error for minimizing quantization noise.</p>
 <p>MAE - Use mean absolute error for minimizing quantization noise.</p>
```

model_compression_toolkit/core/common/quantization/quantization_config.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -39,7 +39,7 @@ class CustomOpsetLayers(NamedTuple):
 
 class QuantizationErrorMethod(Enum):
     """
-    Method for quantization threshold selection:
+    Method for quantization threshold selection:
 
     NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.
```

(The visible text of the changed line is identical; the commit's edit here is whitespace-only.)
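The enum touched by this commit selects the objective used when searching for quantization thresholds. A self-contained sketch of what each member corresponds to (a hypothetical dispatch for illustration, not MCT's internal code):

```python
from enum import Enum
import numpy as np

class QuantizationErrorMethod(Enum):
    NOCLIPPING = 0
    MSE = 1
    MAE = 2

def threshold_error(method, x, x_q):
    """Objective compared across candidate thresholds during threshold search."""
    if method is QuantizationErrorMethod.MSE:
        return np.mean((x - x_q) ** 2)   # penalizes large errors quadratically
    if method is QuantizationErrorMethod.MAE:
        return np.mean(np.abs(x - x_q))  # penalizes errors linearly
    # NOCLIPPING involves no search: the threshold is simply max(|x|).
    raise ValueError("NOCLIPPING does not use an error objective")
```

MSE (the documented default) weights large deviations more heavily than MAE, which is why the two methods can settle on different thresholds for the same tensor.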