
Commit d9c3be7

Merge pull request #978 from wwwind:docs_cpc
PiperOrigin-RevId: 455000418
2 parents 39d91a2 + a63fb72

File tree: 3 files changed, +36 -1 lines changed

tensorflow_model_optimization/g3doc/guide/clustering/clustering_comprehensive_guide.ipynb

Lines changed: 22 additions & 0 deletions

@@ -279,6 +279,27 @@
 "clustered_model.summary()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "bU0SIhY2Q63C"
+},
+"source": [
+"### Cluster convolutional layers per channel\n",
+"\n",
+"The clustered model can be passed to further optimizations such as [post-training quantization](https://www.tensorflow.org/lite/performance/post_training_quantization). If the quantization is done per channel, the model should be clustered per channel as well. This increases the accuracy of the clustered and quantized model.\n",
+"\n",
+"**Note:** only Conv2D layers are clustered per channel.\n",
+"\n",
+"To cluster per channel, set the parameter `cluster_per_channel` to `True`. It can be set for individual layers or for the whole model.\n",
+"\n",
+"**Tips:**\n",
+"\n",
+"* If the model is to be quantized further, consider using the [cluster preserving QAT technique](https://www.tensorflow.org/model_optimization/guide/combine/collaborative_optimization).\n",
+"\n",
+"* The model can be pruned before applying per-channel clustering. If the parameter `preserve_sparsity` is set to `True`, sparsity is preserved during per-channel clustering. Note that the [sparsity and cluster preserving QAT technique](https://www.tensorflow.org/model_optimization/guide/combine/collaborative_optimization) should be used in this case."
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {

@@ -466,6 +487,7 @@
 "colab": {
 "collapsed_sections": [],
 "name": "clustering_comprehensive_guide.ipynb",
+"provenance": [],
 "toc_visible": true
 },
 "kernelspec": {
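The per-channel idea documented in the hunk above can be illustrated outside of tfmot. Below is a minimal NumPy sketch, not the tfmot implementation: the helper names (`cluster_channel`, `cluster_per_channel`) are hypothetical, and a tiny Lloyd's k-means stands in for the library's clustering. It clusters each output channel of a Conv2D-shaped kernel separately, so every channel ends up with its own small codebook of centroids.

```python
import numpy as np

def cluster_channel(weights, n_clusters, n_iter=10, seed=0):
    """Lloyd's k-means on a flat weight array; returns weights snapped to centroids."""
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    centroids = rng.choice(flat, size=n_clusters, replace=False)
    for _ in range(n_iter):
        # Assign each weight to its nearest centroid, then recompute centroids.
        assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centroids[k] = flat[assign == k].mean()
    return centroids[assign].reshape(weights.shape)

def cluster_per_channel(kernel, n_clusters):
    """Cluster a Conv2D kernel (H, W, Cin, Cout) one output channel at a time."""
    out = np.empty_like(kernel)
    for c in range(kernel.shape[-1]):
        out[..., c] = cluster_channel(kernel[..., c], n_clusters)
    return out

kernel = np.random.default_rng(1).normal(size=(3, 3, 8, 16))
clustered = cluster_per_channel(kernel, n_clusters=8)
# Each output channel now holds at most 8 unique values (its own codebook).
print(max(len(np.unique(clustered[..., c])) for c in range(16)))
```

With per-tensor clustering a single codebook must cover channels of very different scales; per-channel codebooks adapt to each channel's value range, which is why they combine better with per-channel quantization.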

tensorflow_model_optimization/g3doc/guide/combine/collaborative_optimization.md

Lines changed: 12 additions & 0 deletions

@@ -106,6 +106,18 @@ with PQAT and CQAT collaborative optimization paths.
 </table>
 </figure>

+### CQAT and PCQAT results for models clustered per channel
+The results below were obtained with the [per-channel clustering](https://www.tensorflow.org/model_optimization/guide/clustering) technique.
+They show that clustering the convolutional layers of a model per channel yields higher accuracy. If your model has many convolutional layers, we recommend clustering per channel: the compression ratio stays the same, but the model accuracy is higher. The model optimization pipeline in our experiments is 'clustered -> cluster preserving QAT -> post-training quantization, int8'.
+<figure>
+<table class="tableizer-table">
+<tr class="tableizer-firstrow"><th>Model</th><th>Clustered -> CQAT, int8 quantized</th><th>Clustered per channel -> CQAT, int8 quantized</th></tr>
+<tr><td>DS-CNN-L</td><td>95.949%</td><td>96.44%</td></tr>
+<tr><td>MobileNet-V2</td><td>71.538%</td><td>72.638%</td></tr>
+<tr><td>MobileNet-V2 (pruned)</td><td>71.45%</td><td>71.901%</td></tr>
+</table>
+</figure>
+
 ## Examples

 For end-to-end examples of the collaborative optimization techniques described
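The pairing of per-channel clustering with per-channel int8 quantization can be sketched in plain NumPy (this is an illustration of the general scheme, not the TFLite converter's code): symmetric per-channel quantization computes one scale per output channel, so a channel whose weights were clustered to a few centroids maps onto that channel's own int8 grid.

```python
import numpy as np

def quantize_per_channel(kernel):
    """Symmetric int8 quantization with one scale per output (last) axis."""
    scales = np.abs(kernel).max(axis=(0, 1, 2)) / 127.0
    q = np.round(kernel / scales).clip(-127, 127).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    return q.astype(np.float32) * scales

kernel = np.random.default_rng(0).normal(size=(3, 3, 4, 8)).astype(np.float32)
q, scales = quantize_per_channel(kernel)
error = np.abs(dequantize(q, scales) - kernel).max()
# Rounding error is bounded by half a quantization step of the widest channel.
print(error <= 0.5 * scales.max() + 1e-6)
```

Because each channel's scale adapts to that channel's range, centroids produced by per-channel clustering land close to representable int8 grid points, which is consistent with the accuracy gains reported in the table above.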

tensorflow_model_optimization/g3doc/guide/combine/cqat_example.ipynb

Lines changed: 3 additions & 2 deletions

@@ -253,6 +253,7 @@
 "clustering_params = {\n",
 " 'number_of_clusters': 8,\n",
-" 'cluster_centroids_init': CentroidInitialization.KMEANS_PLUS_PLUS\n",
+" 'cluster_centroids_init': CentroidInitialization.KMEANS_PLUS_PLUS,\n",
+" 'cluster_per_channel': True,\n",
 "}\n",
 "\n",
 "clustered_model = cluster_weights(model, **clustering_params)\n",

@@ -597,7 +598,7 @@
 "source": [
 "## Apply post-training quantization and compare to CQAT model\n",
 "\n",
-"Next, we use post-training quantization (no fine-tuning) on the clustered model and check its accuracy against the CQAT model. This demonstrates why you would need to use CQAT to improve the quantized model's accuracy.\n",
+"Next, we use post-training quantization (no fine-tuning) on the clustered model and check its accuracy against the CQAT model. This demonstrates why you would need CQAT to improve the quantized model's accuracy. The difference may not be very visible, because the MNIST model is quite small and overparameterized.\n",
 "\n",
 "First, define a generator for the calibration dataset from the first 1000 training images."
 ]
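The calibration generator mentioned at the end of the hunk above typically looks like the sketch below. The `train_images` array here is stand-in random data (the notebook uses the MNIST training set), and `representative_dataset` is the usual name for the generator handed to TFLite's post-training quantization hook; treat both names as assumptions, not the notebook's exact code.

```python
import numpy as np

# Stand-in for the MNIST training images used in the notebook.
train_images = np.random.default_rng(0).random((1000, 28, 28)).astype(np.float32)

def representative_dataset():
    """Yield one single-image float32 batch per calibration step."""
    for image in train_images[:1000]:
        # Add batch and channel axes: (28, 28) -> (1, 28, 28, 1).
        yield [image[np.newaxis, ..., np.newaxis]]

batches = list(representative_dataset())
print(len(batches), batches[0][0].shape)  # → 1000 (1, 28, 28, 1)
```

In the real notebook this generator would be assigned to the TFLite converter's `representative_dataset` attribute before conversion with int8 quantization.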
