
Commit c6fc392

Introduce the notebook on using a pruning policy for latency improvements
PiperOrigin-RevId: 374525721
1 parent e879836 commit c6fc392
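
The notebook added by this commit is not shown in the diff below. As orientation only, here is a minimal, hypothetical sketch of the pruning-policy workflow it covers, assuming a TF 2.x / Keras 2 environment (roughly TF 2.6+) and tensorflow-model-optimization 0.6.0+, where `prune_low_magnitude` accepts a `pruning_policy` argument and `PruneForLatencyOnXNNPack` is available. The toy architecture is not taken from the commit; it is only meant to satisfy the structural constraints documented for XNNPACK sparse inference.

```python
# Hypothetical sketch, not taken from the commit: pruning with a policy so
# the resulting sparse model can be accelerated by XNNPACK on device.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Toy architecture intended to meet the documented constraints of
# PruneForLatencyOnXNNPack: a ZeroPadding2D + 3x3/stride-2 'valid' Conv2D
# stem, 1x1 convolutions in the body (these are what actually get pruned),
# and GlobalAveragePooling2D(keepdims=True) before the classifier head.
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(32, 32, 3)),
    tf.keras.layers.ZeroPadding2D(padding=(1, 1)),
    tf.keras.layers.Conv2D(16, (3, 3), strides=(2, 2), padding='valid',
                           activation='relu'),
    tf.keras.layers.Conv2D(32, (1, 1), activation='relu'),
    tf.keras.layers.Conv2D(64, (1, 1), activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(keepdims=True),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

pruned_model = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(
        target_sparsity=0.75, begin_step=0),
    # The policy restricts pruning to layers XNNPACK can execute sparsely
    # and raises an error if the model structure is unsupported.
    pruning_policy=tfmot.sparsity.keras.PruneForLatencyOnXNNPack())

pruned_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# Training requires the pruning-step callback, e.g.:
# pruned_model.fit(x_train, y_train, epochs=5,
#                  callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
```

The new guide linked in the files below should be treated as the authoritative reference for the exact model and schedule.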

File tree

3 files changed: +542 −0 lines

tensorflow_model_optimization/g3doc/_book.yaml

Lines changed: 2 additions & 0 deletions

@@ -31,6 +31,8 @@ upper_tabs:
       path: /model_optimization/guide/pruning/pruning_with_keras
     - title: Pruning comprehensive guide
       path: /model_optimization/guide/pruning/comprehensive_guide
+    - title: Pruning for on-device inference with XNNPACK
+      path: /model_optimization/guide/pruning/pruning_for_on_device_inference

     - heading: Quantization
     - title: Quantization aware training overview

tensorflow_model_optimization/g3doc/guide/pruning/index.md

Lines changed: 2 additions & 0 deletions

@@ -9,6 +9,8 @@ fits with your use case.
     [Pruning with Keras](pruning_with_keras.ipynb) example.
 *   To quickly find the APIs you need for your use case, see the
     [pruning comprehensive guide](comprehensive_guide.ipynb).
+*   To explore the application of pruning for on-device inference, see the
+    [Pruning for on-device inference with XNNPACK](pruning_for_on_device_inference.ipynb).

 ## Overview
0 commit comments
