Commit 83c8bec

Feature doc explane quant algorithms 2 (#1677)
Add more details to the FAQ, troubleshooting, and documentation.
1 parent 09dfd0e commit 83c8bec

76 files changed, with 552 additions and 452 deletions.

FAQ.md

Lines changed: 24 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -5,6 +5,7 @@
55
1. [Why does the size of the quantized model remain the same as the original model size?](#1-why-does-the-size-of-the-quantized-model-remain-the-same-as-the-original-model-size)
66
2. [Why does loading a quantized exported model from a file fail?](#2-why-does-loading-a-quantized-exported-model-from-a-file-fail)
77
3. [Why am I getting a torch.fx error?](#3-why-am-i-getting-a-torchfx-error)
8+
4. [Does MCT support both per-tensor and per-channel quantization?](#4-does-mct-support-both-per-tensor-and-per-channel-quantization)
89

910

1011
### 1. Why does the size of the quantized model remain the same as the original model size?
@@ -54,3 +55,26 @@ Despite these limitations, some adjustments can be made to facilitate MCT quanti
5455
Check the `torch.fx` error, and search for an identical replacement. Some examples:
5556
* An `if` statement in a module's `forward` method can often be skipped easily.
5657
* A call to Python's built-in `list()` can be replaced with an explicit list such as `[A, B, C]`.
58+
59+
### 4. Does MCT support both per-tensor and per-channel quantization?
60+
61+
MCT supports both per-tensor and per-channel quantization, as [defined in TPC](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold). To choose between them, set the parameter described below.
62+
63+
**Solution**: Switch between per-tensor and per-channel quantization by setting the `weights_per_channel_threshold` parameter, as described below.
64+
65+
In the following object, which configures the quantizer:
66+
* model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig()
67+
68+
Set the following parameter:
69+
* `weights_per_channel_threshold` (bool) - Indicates whether to quantize the weights per-channel or per-tensor.
70+
71+
For more details, please refer to [this page](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold).
72+
73+
74+
In QAT, the following object is used to set up a weight-learnable quantizer:
75+
* model_compression_toolkit.trainable_infrastructure.TrainableQuantizerWeightsConfig()
76+
77+
Set the following parameter:
78+
* weights_per_channel_threshold (bool) – Whether to quantize the weights per-channel or not (per-tensor).
79+
80+
For more details, please refer to [this page](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/trainable_infrastructure.html#trainablequantizerweightsconfig).
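
To make the difference between the two modes concrete, here is a minimal, library-free sketch. It does not use MCT and is not MCT's implementation: the helper `max_abs_thresholds`, the convention that each row is one output channel, and the max-abs threshold rule are all assumptions chosen for illustration. It only shows what a boolean per-channel switch changes conceptually, namely one threshold for the whole tensor versus one threshold per channel.

```python
from typing import List


def max_abs_thresholds(weights: List[List[float]], per_channel: bool) -> List[float]:
    """Return max-abs thresholds for a 2-D weight matrix (hypothetical helper).

    Each row is treated as one output channel (an assumption of this sketch).
    """
    if per_channel:
        # One threshold per output channel (row).
        return [max(abs(w) for w in row) for row in weights]
    # A single threshold shared by every weight in the tensor.
    return [max(abs(w) for row in weights for w in row)]


weights = [
    [0.1, -0.2, 0.05],  # channel with small magnitudes
    [4.0, -3.5, 2.0],   # channel with large magnitudes
]

per_tensor = max_abs_thresholds(weights, per_channel=False)
per_channel = max_abs_thresholds(weights, per_channel=True)

print(per_tensor)   # [4.0]
print(per_channel)  # [0.2, 4.0]
```

With a single shared threshold of 4.0, the small-magnitude channel uses only a fraction of the quantization range, which is the usual motivation for per-channel quantization.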

docs/api/api_docs/classes/BitWidthConfig.html

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>BitWidthConfig &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

docs/api/api_docs/classes/DataGenerationConfig.html

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>Data Generation Configuration &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

docs/api/api_docs/classes/DefaultDict.html

Lines changed: 8 additions & 8 deletions
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>DefaultDict Class &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

@@ -60,15 +60,15 @@ <h3>Navigation</h3>
6060
<dd><p>Get the value of the inner dictionary by the given key. If the key is not in the dictionary,
6161
it uses the default_factory to return a default value.</p>
6262
<dl class="field-list simple">
63-
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
64-
<dd class="field-odd"><p><strong>key</strong> – Key to use in inner dictionary.</p>
63+
<dt class="field-odd">Return type<span class="colon">:</span></dt>
64+
<dd class="field-odd"><p><span class="sphinx_autodoc_typehints-type"><code class="xref py py-data docutils literal notranslate"><span class="pre">Any</span></code></span></p>
6565
</dd>
66-
<dt class="field-even">Returns<span class="colon">:</span></dt>
67-
<dd class="field-even"><p>Value of the inner dictionary by the given key, or a default value if not exist.
68-
If default_factory was not passed at initialization, it returns None.</p>
66+
<dt class="field-even">Parameters<span class="colon">:</span></dt>
67+
<dd class="field-even"><p><strong>key</strong> – Key to use in inner dictionary.</p>
6968
</dd>
70-
<dt class="field-odd">Return type<span class="colon">:</span></dt>
71-
<dd class="field-odd"><p><code class="xref py py-data docutils literal notranslate"><span class="pre">Any</span></code></p>
69+
<dt class="field-odd">Returns<span class="colon">:</span></dt>
70+
<dd class="field-odd"><p>Value of the inner dictionary by the given key, or a default value if not exist.
71+
If default_factory was not passed at initialization, it returns None.</p>
7272
</dd>
7373
</dl>
7474
</dd></dl>
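
The lookup behavior described in this docstring can be sketched in plain Python. This is a hypothetical stand-in written only from the documented description, not MCT's `DefaultDict` source: the class name `SketchDefaultDict`, its constructor signature, and the assumption that `default_factory` is called with no arguments are all illustrative.

```python
from typing import Any, Callable, Dict, Optional


class SketchDefaultDict:
    """Hypothetical stand-in that mirrors the documented `get` behavior."""

    def __init__(self,
                 known_dict: Optional[Dict[Any, Any]] = None,
                 default_factory: Optional[Callable[[], Any]] = None):
        self._inner = dict(known_dict or {})
        self._default_factory = default_factory

    def get(self, key: Any) -> Any:
        # A known key returns its stored value.
        if key in self._inner:
            return self._inner[key]
        # Without a factory, an unknown key yields None.
        if self._default_factory is None:
            return None
        # Otherwise the factory supplies the default (assumed zero-argument).
        return self._default_factory()


mapping = SketchDefaultDict({"Conv2D": (3, 2)}, default_factory=lambda: (None, None))
conv_channels = mapping.get("Conv2D")    # stored value
dense_channels = mapping.get("Dense")    # value from the factory
no_factory = SketchDefaultDict({"Conv2D": (3, 2)}).get("Dense")  # None
```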

docs/api/api_docs/classes/FrameworkInfo.html

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>FrameworkInfo Class &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

@@ -66,7 +66,7 @@ <h3>Navigation</h3>
6666
<p class="rubric">Examples</p>
6767
<p>When quantizing a Keras model, if we want to quantize the kernels of Conv2D layers only, we can
6868
set, and we know its kernel out/in channel indices are (3, 2) respectively:</p>
69-
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="nn">tf</span>
69+
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span><span class="w"> </span><span class="nn">tensorflow</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">tf</span>
7070
<span class="gp">&gt;&gt;&gt; </span><span class="n">kernel_ops</span> <span class="o">=</span> <span class="p">[</span><span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Conv2D</span><span class="p">]</span>
7171
<span class="gp">&gt;&gt;&gt; </span><span class="n">kernel_channels_mapping</span> <span class="o">=</span> <span class="n">DefaultDict</span><span class="p">({</span><span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Conv2D</span><span class="p">:</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span><span class="mi">2</span><span class="p">)})</span>
7272
</pre></div>

docs/api/api_docs/classes/GradientPTQConfig.html

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>GradientPTQConfig Class &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

docs/api/api_docs/classes/MixedPrecisionQuantizationConfig.html

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>MixedPrecisionQuantizationConfig &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

docs/api/api_docs/classes/PruningConfig.html

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>Pruning Configuration &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

docs/api/api_docs/classes/PruningInfo.html

Lines changed: 1 addition & 7 deletions
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>Pruning Information &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

@@ -65,9 +65,6 @@ <h3>Navigation</h3>
6565
<dt class="field-even">Return type<span class="colon">:</span></dt>
6666
<dd class="field-even"><p>Dict[BaseNode, np.ndarray]</p>
6767
</dd>
68-
<dt class="field-odd">Return type<span class="colon">:</span></dt>
69-
<dd class="field-odd"><p><code class="xref py py-class docutils literal notranslate"><span class="pre">Dict</span></code>[<code class="xref py py-class docutils literal notranslate"><span class="pre">BaseNode</span></code>, <code class="xref py py-class docutils literal notranslate"><span class="pre">ndarray</span></code>]</p>
70-
</dd>
7168
</dl>
7269
</dd></dl>
7370

@@ -82,9 +79,6 @@ <h3>Navigation</h3>
8279
<dt class="field-even">Return type<span class="colon">:</span></dt>
8380
<dd class="field-even"><p>Dict[BaseNode, np.ndarray]</p>
8481
</dd>
85-
<dt class="field-odd">Return type<span class="colon">:</span></dt>
86-
<dd class="field-odd"><p><code class="xref py py-class docutils literal notranslate"><span class="pre">Dict</span></code>[<code class="xref py py-class docutils literal notranslate"><span class="pre">BaseNode</span></code>, <code class="xref py py-class docutils literal notranslate"><span class="pre">ndarray</span></code>]</p>
87-
</dd>
8882
</dl>
8983
</dd></dl>
9084

docs/api/api_docs/classes/QuantizationConfig.html

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@
77
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
88

99
<title>QuantizationConfig &#8212; MCT Documentation: ver 2.6.0</title>
10-
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=fa44fd50" />
10+
<link rel="stylesheet" type="text/css" href="../../../static/pygments.css?v=03e43079" />
1111
<link rel="stylesheet" type="text/css" href="../../../static/bizstyle.css?v=5283bb3d" />
1212
<link rel="stylesheet" type="text/css" href="../../../static/css/custom.css?v=01243f34" />
1313

@@ -50,7 +50,7 @@ <h3>Navigation</h3>
5050
activations using thresholds, with weight threshold selection based on MSE and activation threshold selection
5151
using NOCLIPPING (min/max), while enabling relu_bound_to_power_of_2 and weights_bias_correction,
5252
you can instantiate a quantization configuration like this:</p>
53-
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">model_compression_toolkit</span> <span class="k">as</span> <span class="nn">mct</span>
53+
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span><span class="w"> </span><span class="nn">model_compression_toolkit</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">mct</span>
5454
<span class="gp">&gt;&gt;&gt; </span><span class="n">qc</span> <span class="o">=</span> <span class="n">mct</span><span class="o">.</span><span class="n">core</span><span class="o">.</span><span class="n">QuantizationConfig</span><span class="p">(</span><span class="n">activation_error_method</span><span class="o">=</span><span class="n">mct</span><span class="o">.</span><span class="n">core</span><span class="o">.</span><span class="n">QuantizationErrorMethod</span><span class="o">.</span><span class="n">NOCLIPPING</span><span class="p">,</span> <span class="n">weights_error_method</span><span class="o">=</span><span class="n">mct</span><span class="o">.</span><span class="n">core</span><span class="o">.</span><span class="n">QuantizationErrorMethod</span><span class="o">.</span><span class="n">MSE</span><span class="p">,</span> <span class="n">relu_bound_to_power_of_2</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">weights_bias_correction</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
5555
</pre></div>
5656
</div>
