ModelTC
diff --git a/‎docs/source/algorithm/index.rst‎
Lines changed: 1 addition & 1 deletion b/‎docs/source/algorithm/index.rst‎
diff --git a/‎docs/source/example/deploy.rst‎
Lines changed: 10 additions & 0 deletions b/‎docs/source/example/deploy.rst‎
diff --git a/‎docs/source/example/index.rst‎
Lines changed: 8 additions & 62 deletions b/‎docs/source/example/index.rst‎
diff --git a/‎docs/source/example/platforms/snpe.rst‎
Lines changed: 53 additions & 0 deletions b/‎docs/source/example/platforms/snpe.rst‎
diff --git a/‎docs/source/example/platforms/tensorrt.rst‎
Lines changed: 56 additions & 0 deletions b/‎docs/source/example/platforms/tensorrt.rst‎
diff --git a/‎docs/source/example/quantization.rst‎
Lines changed: 54 additions & 0 deletions b/‎docs/source/example/quantization.rst‎
diff --git a/‎docs/source/example/setup.rst‎
Lines changed: 21 additions & 0 deletions b/‎docs/source/example/setup.rst‎
diff --git a/‎docs/source/hardware/index.rst‎
Lines changed: 1 addition & 0 deletions b/‎docs/source/hardware/index.rst‎
diff --git a/‎docs/source/hardware/snpe.rst‎
Lines changed: 29 additions & 0 deletions b/‎docs/source/hardware/snpe.rst‎
@@ -96,7 +96,7 @@ PTQ Algorithms
 
            \[ E[ L(x,y,\mathbf{w}) - L(x,y,\mathbf{w}+\Delta \mathbf{w}) ] \approx \Delta \mathbf{w}^T g^{(\mathbf{w})} + \frac12 \Delta \mathbf{w}^T H^{(\mathbf{w})} \Delta \mathbf{w} \approx \Delta \mathbf{w}_1^2 + \Delta \mathbf{w}_2^2 + \Delta \mathbf{w}_1 \Delta \mathbf{w}_2 \]
 
-Hence, it's benifitial to learn a rounding mask for each layer. One well-designed object function is given by the authors:
+Hence, it's beneficial to learn a rounding mask for each layer. One well-designed objective function is given by the authors:
 
 .. raw:: latex html
 
 
@@ -0,0 +1,10 @@
+Advanced instructions for different hardware platforms
+=======================================================
+
+.. toctree::
+   :titlesonly:
+
+   TensorRT <platforms/tensorrt.rst>
+   SNPE <platforms/snpe.rst>
+
+
@@ -1,65 +1,11 @@
 Get Started
-==========================
-We follow the `PyTorch official example <https://github.com/pytorch/examples/tree/master/imagenet/>`_ to build the example of Model Quantization Benchmark for ImageNet classification task.
- 
-Requirements
--------------
+============
+This tutorial walks through the whole quantization workflow with MQBench, including:
 
-- Install PyTorch following `pytorch.org <http://pytorch.org/>`_
-- Install dependencies::
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
 
-    pip install -r requirements.txt
-
-- Download the ImageNet dataset from `the official website <http://www.image-net.org/>`_
-
-  - Then, and move validation images to labeled subfolders, using `the following shell script <https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh/>`_
-
-- Install TensorRT=7.2.1.6 from `NVIDIA <https://developer.nvidia.com/tensorrt/>`_
-
-Usage
----------
-
-- **Quantization-Aware Training:**
-
-  - Training hyper-parameters:
-    
-    - batch size = 128
-    - epochs = 1 
-    - lr = 1e-4
-    - others like weight decay, momentum are kept as default.
-
-  - ResNet18 / ResNet50 / MobileNet_v2::
-
-        python main.py -a [model_name] --epochs 1 --lr 1e-4 --b 128 --seed 99 --pretrained
-
-
-- **Deployment**
-  We provide the example to deploy the quantized model to TensorRT.
-
-  1. First export the quantized model to ONNX [tensorrt_deploy_model.onnx] and dump the clip ranges [tensorrt_clip_ranges.json] for activations.::
-
-        python main.py -a [model_name] --resume [model_save_path]
-     
-
-  2. Second build the TensorRT INT8 engine and evaluate, please make sure [dataset_path] contains subfolder [val]::
-
-        python onnx2trt.py --onnx [tensorrt_deploy_model.onnx] --trt [model_name.trt] --clip [tensorrt_clip_ranges.json] --data [dataset_path] --evaluate
-    
-  3. If you don’t pass in external clip ranges [tensorrt_clip_ranges.json], TenosrRT will do calibration using default algorithm IInt8EntropyCalibrator2 with 100 images. So, please make sure [dataset_path] contains subfolder [cali]::
-
-        python onnx2trt.py --onnx [tensorrt_deploy_model.onnx] --trt [model_name.trt] --data [dataset_path] --evaluate
-
-Results
------------
-
-+-------------------+--------------------------------+------------------------------------------------------------------------------------------------------------------+
-|   Model           |       accuracy\@fp32           |                                           accuracy\@int8                                                         |
-|                   |                                +----------------------------------------+---------------------------------+---------------------------------------+
-|                   |                                |     TensoRT Calibration                |        MQBench QAT              |       TensorRT SetRange               |  
-+===================+================================+========================================+=================================+=======================================+
-|  **ResNet18**     |    Acc\@1 69.758 Acc\@5 89.078 |   Acc\@1 69.612 Acc\@5 88.980          |    Acc\@1 69.912 Acc\@5 89.150  |    Acc\@1 69.904 Acc\@5 89.182        |
-+-------------------+--------------------------------+----------------------------------------+---------------------------------+---------------------------------------+ 
-|  **ResNet50**     |    Acc\@1 76.130 Acc\@5 92.862 |   Acc\@1 76.074 Acc\@5 92.892          |    Acc\@1 76.114 Acc\@5 92.946  |    Acc\@1 76.320 Acc\@5 93.006        | 
-+-------------------+--------------------------------+----------------------------------------+---------------------------------+---------------------------------------+
-|  **MobileNet_v2** |    Acc\@1 71.878 Acc\@5 90.286 |   Acc\@1 70.700 Acc\@5 89.708          |    Acc\@1 70.826 Acc\@5 89.874  |    Acc\@1 70.724 Acc\@5 89.870        |  
-+-------------------+--------------------------------+----------------------------------------+---------------------------------+---------------------------------------+
+   setup
+   quantization
+   deploy
@@ -0,0 +1,53 @@
+SNPE
+=============
+Example of QAT and deployment on SNPE.
+
+**Requirements**:
+
+- Install the SNPE SDK from `Qualcomm <https://developer.qualcomm.com/sites/default/files/docs/snpe/setup.html>`_ (Ubuntu 18.04 is suggested)
+
+**QAT**:
+
+- Follow the QAT procedure to get a model checkpoint. We suggest a learning rate of 5e-5 with a cosine scheduler and the Adam optimizer, training for tens of epochs.
+
+**Deployment**:
+
+- Convert PyTorch checkpoint to `snpe_deploy.onnx` and dump clip ranges to `snpe_clip_ranges.json`::
+
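+    from mqbench.convert_deploy import convert_deploy
+    from mqbench.prepare_by_platform import BackendType
+
+    # `solver` is assumed to be the training object that holds the QAT model.
+    input_dict = {'x': [1, 3, 224, 224]}
+    convert_deploy(solver.model.module, BackendType.SNPE, input_dict)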
+
+- Convert `.onnx` file to `.dlc` format (supported by SNPE)::
+
+    snpe-onnx-to-dlc --input_network ./snpe_deploy.onnx --output_path ./snpe_deploy.dlc --quantization_overrides ./snpe_clip_ranges.json
+
+  - Note that the `.json` file contains the activation ranges for quantization. It is required at this step even though the model has not been quantized yet.
+
+- Quantize the model with parameters overridden::
+
+    snpe-dlc-quantize --input_dlc ./snpe_deploy.dlc --input_list ./data.txt --override_params  --bias_bitwidth 32
+
+  - `data.txt` lists the paths to the image data used for calibration. The data content is not important since the parameters will be overridden. Each file is loaded by `numpy.fromfile(dtype=np.float32)` and must have shape `(224, 224, 3)`. A file in this format is also required for testing.
+
+  - Now we get the final model `snpe_deploy_quantized.dlc`.
+
+**Results**:
+
+The test is done with the SNPE SDK tools, using the quantized model and a text file that lists the paths to the test data, each of shape (224, 224, 3)::
+
+    snpe-net-run --container ./snpe_deploy_quantized.dlc --input_list ./test_data.txt
+
+The results on several tested models:
+
++-------------------+--------------------------------+------------------------------------------------------------------------------------------------------------------+
+|   Model           |       accuracy\@fp32           |                                           accuracy\@int8                                                         |
+|                   |                                +-------------------------------------------------------+----------------------------------------------------------+
+|                   |                                |                      MQBench QAT                      |                            SNPE                          |
++===================+================================+=======================================================+==========================================================+
+|  **ResNet18**     |    70.65%                      |                      70.75%                           |                      70.74%                              |
++-------------------+--------------------------------+-------------------------------------------------------+----------------------------------------------------------+
+|  **ResNet50**     |    77.94%                      |                      77.75%                           |                      77.92%                              |
++-------------------+--------------------------------+-------------------------------------------------------+----------------------------------------------------------+
+|  **MobileNet_v2** |    72.67%                      |                      72.31%                           |                      72.65%                              |
++-------------------+--------------------------------+-------------------------------------------------------+----------------------------------------------------------+
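The SNPE steps above rely on an input list (`data.txt`) whose entries point to raw float32 files of shape `(224, 224, 3)` that `numpy.fromfile(dtype=np.float32)` can read back. The following is a minimal sketch of how such files could be produced with only the Python standard library. The helper name `write_input_list`, the file naming scheme, and the one-path-per-line list layout are assumptions made for illustration; preprocessing (normalization, channel order) is not covered and must match what the model was trained with.

```python
import os
import tempfile
from array import array


def write_input_list(samples, out_dir, list_path, shape=(224, 224, 3)):
    """Write each flattened sample as a raw float32 file and list the paths.

    `samples` is an iterable of flat sequences of floats, already
    preprocessed and laid out in the order given by `shape`.
    """
    expected = shape[0] * shape[1] * shape[2]
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for idx, sample in enumerate(samples):
        if len(sample) != expected:
            raise ValueError(
                f"sample {idx} has {len(sample)} values, expected {expected}"
            )
        # Hypothetical naming scheme; any file name works.
        path = os.path.join(out_dir, f"input_{idx:05d}.raw")
        with open(path, "wb") as fh:
            # array('f') stores native-endian C floats, the same layout
            # that numpy.fromfile(dtype=np.float32) reads.
            array("f", sample).tofile(fh)
        paths.append(path)
    # One path per line (assumed layout of the input list).
    with open(list_path, "w") as fh:
        for path in paths:
            fh.write(path + "\n")
    return paths


# Tiny demo with a reduced shape so that it runs quickly.
with tempfile.TemporaryDirectory() as tmp:
    demo_samples = [[0.0] * 12, [1.0] * 12]
    demo_paths = write_input_list(
        demo_samples,
        os.path.join(tmp, "raw"),
        os.path.join(tmp, "data.txt"),
        shape=(2, 2, 3),
    )
    demo_sizes = [os.path.getsize(p) for p in demo_paths]
    with open(os.path.join(tmp, "data.txt")) as fh:
        demo_lines = fh.read().splitlines()
```

With the default `shape`, each file holds 224 * 224 * 3 float32 values, and the resulting list file can be passed to `--input_list`.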
@@ -0,0 +1,56 @@
+TensorRT
+================
+Example of QAT and deployment on TensorRT.
+
+**Requirements**:
+
+- Install TensorRT=7.2.1.6 from `NVIDIA <https://developer.nvidia.com/tensorrt/>`_
+
+**QAT**:
+
+- Training hyper-parameters:
+
+  - batch size = 128
+  - epochs = 1
+  - learning rate = 1e-4 (for ResNet series) / 1e-5 (for MobileNet series)
+  - weight decay = 1e-4 (for ResNet series) / 0 (for MobileNet series)
+  - optimizer: SGD (for ResNet series) / Adam (for MobileNet series)
+  - others like momentum are kept as default.
+
+- [model_name] = ResNet18 / ResNet50 / MobileNet_v2 / ... ::
+
+      git clone https://github.com/TheGreatCold/MQBench.git
+      cd application/imagenet_example
+      python main.py -a [model_name] --epochs 1 --lr 1e-4 --b 128 --pretrained
+
+
+**Deployment**:
+
+We provide the example to deploy the quantized model to TensorRT.
+
+- First, export the quantized model to ONNX [tensorrt_deploy_model.onnx] and dump the clip ranges [tensorrt_clip_ranges.json] for activations. ::
+
+      python main.py -a [model_name] --resume [model_save_path]
+
+- Second, build the TensorRT INT8 engine and evaluate it. Please make sure [dataset_path] contains the subfolder [val]. ::
+
+      python onnx2trt.py --onnx [tensorrt_deploy_model.onnx] --trt [model_name.trt] --clip [tensorrt_clip_ranges.json] --data [dataset_path] --evaluate
+
+- If you don’t pass in external clip ranges [tensorrt_clip_ranges.json], TensorRT will do calibration using the default algorithm IInt8EntropyCalibrator2 with 100 images. In that case, please make sure [dataset_path] contains the subfolder [cali]. ::
+
+      python onnx2trt.py --onnx [tensorrt_deploy_model.onnx] --trt [model_name.trt] --data [dataset_path] --evaluate
+
+**Results**:
+
+
++-------------------+--------------------------------+------------------------------------------------------------------------------------------------------------------+
+|   Model           |       accuracy\@fp32           |                                           accuracy\@int8                                                         |
+|                   |                                +----------------------------------------+---------------------------------+---------------------------------------+
+|                   |                                |     TensorRT Calibration               |        MQBench QAT              |       TensorRT SetRange               |
++===================+================================+========================================+=================================+=======================================+
+|  **ResNet18**     |    Acc\@1 69.758 Acc\@5 89.078 |   Acc\@1 69.612 Acc\@5 88.980          |    Acc\@1 69.912 Acc\@5 89.150  |    Acc\@1 69.904 Acc\@5 89.182        |
++-------------------+--------------------------------+----------------------------------------+---------------------------------+---------------------------------------+
+|  **ResNet50**     |    Acc\@1 76.130 Acc\@5 92.862 |   Acc\@1 76.074 Acc\@5 92.892          |    Acc\@1 76.114 Acc\@5 92.946  |    Acc\@1 76.320 Acc\@5 93.006        |
++-------------------+--------------------------------+----------------------------------------+---------------------------------+---------------------------------------+
+|  **MobileNet_v2** |    Acc\@1 71.878 Acc\@5 90.286 |   Acc\@1 70.700 Acc\@5 89.708          |    Acc\@1 71.158 Acc\@5 89.990  |    Acc\@1 71.102 Acc\@5 89.932        |
++-------------------+--------------------------------+----------------------------------------+---------------------------------+---------------------------------------+
@@ -0,0 +1,54 @@
+How to do quantization with MQBench
+======================================
+
+QAT (Quantization-Aware Training)
+---------------------------------------------
+QAT requires only a few additional operations compared to ordinary fine-tuning.
+
+::
+
+      import torchvision.models as models
+      from mqbench.convert_deploy import convert_deploy
+      from mqbench.prepare_by_platform import prepare_qat_fx_by_platform, BackendType
+      from mqbench.utils.state import enable_calibration, enable_quantization
+
+      # first, initialize the FP32 model with pretrained parameters.
+      model = models.__dict__["resnet18"](pretrained=True)
+      model.train()
+
+      # then, trace the original model using torch.fx and insert fake quantize
+      # nodes according to the target hardware backend (e.g. TensorRT).
+      model = prepare_qat_fx_by_platform(model, BackendType.Tensorrt)
+
+      # before training, we recommend enabling the observers to calibrate on several batches,
+      # and then enabling quantization.
+      model.eval()
+      enable_calibration(model)
+      calibration_flag = True
+
+      # training loop
+      for i, batch in enumerate(data):
+          # do forward procedures
+          ...
+
+          if calibration_flag:
+              if i >= 0:
+                  calibration_flag = False
+                  model.zero_grad()
+                  model.train()
+                  enable_quantization(model)
+              else:
+                  continue
+
+          # do backward and optimization
+          ...
+
+      # deploy model, remove fake quantize nodes and dump quantization params like clip ranges.
+      convert_deploy(model.eval(), BackendType.Tensorrt, input_shape_dict={'data': [10, 3, 224, 224]})
+
+
+PTQ (Post-Training Quantization)
+---------------------------------------------
+To be finished.
+
+
+
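The QAT flow above has two phases: observers first collect statistics during calibration, and fake quantize nodes then simulate quantization during training. The following is a toy, pure-Python sketch of that idea under stated assumptions: a min/max observer and uniform asymmetric fake quantization. The class `FakeQuantNode`, the function `run`, and the parameter `calibration_batches` are hypothetical names for illustration only; this is not MQBench's implementation, and real fake quantize nodes operate on tensors and support gradients.

```python
class FakeQuantNode:
    """Toy stand-in for a fake quantize node with a min/max observer."""

    def __init__(self, bits=8):
        self.levels = 2 ** bits - 1   # number of quantization steps
        self.lo = float("inf")        # observed minimum
        self.hi = float("-inf")       # observed maximum
        self.observing = False        # analogous to the calibration phase
        self.quantizing = False       # analogous to the quantization phase

    def __call__(self, values):
        if self.observing:
            # Calibration: only update the statistics.
            self.lo = min(self.lo, min(values))
            self.hi = max(self.hi, max(values))
        if not self.quantizing:
            return list(values)
        if self.lo > self.hi:
            raise RuntimeError("quantization enabled before calibration")
        scale = (self.hi - self.lo) / self.levels
        if scale == 0:
            return list(values)
        out = []
        for v in values:
            # Quantize to an integer level, clamp, then map back to float.
            q = round((v - self.lo) / scale)
            q = max(0, min(self.levels, q))
            out.append(self.lo + q * scale)
        return out


def run(batches, calibration_batches=2):
    """Calibrate on the first batches, then switch to fake quantization."""
    node = FakeQuantNode()
    node.observing = True
    outputs = []
    for i, batch in enumerate(batches):
        if i == calibration_batches:
            node.observing = False
            node.quantizing = True
        outputs.append(node(batch))
    return node, outputs


node, outputs = run([[-1.0, 0.0], [0.5, 2.0], [0.37, 5.0]])
```

In this sketch the first two batches pass through unchanged while the range `[-1.0, 2.0]` is recorded; the third batch is rounded to the nearest representable level, and the out-of-range value `5.0` is clipped to the observed maximum.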
@@ -0,0 +1,21 @@
+Preparations
+===================================
+In general, we follow the `PyTorch official example <https://github.com/pytorch/examples/tree/master/imagenet/>`_ to build the Model Quantization Benchmark example for the ImageNet classification task.
+
+
+- Install PyTorch following `pytorch.org <http://pytorch.org/>`_
+- Install dependencies ::
+
+    pip install -r requirements.txt
+
+- Specific requirements for hardware platforms will be introduced later
+
+- Download the ImageNet dataset from `the official website <http://www.image-net.org/>`_
+
+  - Then move the validation images to labeled subfolders using `the following shell script <https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh/>`_
+
+  - Or process other datasets in a similar way.
+
+- Full precision pretrained models are preferred, but sometimes it's possible to do QAT from scratch.
+
+
@@ -6,3 +6,4 @@ Quantization Hardware
 
    nnie
    tensorrt
+   snpe
@@ -0,0 +1,29 @@
+SNPE
+=========
+
+`Snapdragon Neural Processing Engine (SNPE) <https://developer.qualcomm.com/sites/default/files/docs/snpe//index.html/>`_ is a Qualcomm Snapdragon software accelerated runtime for the execution of deep neural networks.
+
+.. _SNPE Quantization Scheme:
+
+Quantization Scheme
+--------------------
+8/16 bit per-layer asymmetric linear quantization.
+
+.. math::
+
+    \begin{equation}
+        q = \mathtt{clamp}\left(\left\lfloor R * \dfrac{x - cmin}{cmax - cmin} \right\rceil, lb, ub\right)
+    \end{equation}
+
+where :math:`R` is the integer range after quantization, :math:`cmax` and :math:`cmin` are the calculated bounds of the floating-point values, and :math:`lb` and :math:`ub` are the bounds of the integer range.
+Taking 8-bit as an example, :math:`R=255` and :math:`[lb, ub]=[0, 255]`.
+
+
+In fact, when building an SNPE model with the official tools, the model is first converted into a full-precision *.dlc* file, and then optionally converted into a quantized version.
+
+.. attention::
+    Users can provide a .json file to override the parameters.
+
+    The values of *scale* and *offset* are not required, but can be overridden.
+
+    SNPE will adjust the values of *cmin* and *cmax* to ensure zero is representable.
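The quantization formula above can be checked numerically with a short sketch. The function names below are hypothetical, the tie-breaking rule for rounding (halves round upward) is an assumption, and for bit widths other than 8 the sketch assumes :math:`R = 2^{bits} - 1` with :math:`[lb, ub] = [0, R]`. The adjustment SNPE applies to *cmin* and *cmax* so that zero is representable is not modeled here.

```python
import math


def snpe_style_quantize(x, cmin, cmax, bits=8):
    """Map a float x to an integer code: clamp(round(R*(x-cmin)/(cmax-cmin)), lb, ub)."""
    if cmax <= cmin:
        raise ValueError("cmax must be greater than cmin")
    r = 2 ** bits - 1      # integer range R, 255 for 8 bits
    lb, ub = 0, r          # integer bounds [lb, ub]
    scaled = r * (x - cmin) / (cmax - cmin)
    # Round to nearest; ties go upward (assumed tie-breaking rule).
    rounded = math.floor(scaled + 0.5)
    return max(lb, min(ub, rounded))


def dequantize(q, cmin, cmax, bits=8):
    """Map an integer code back to its floating-point value."""
    r = 2 ** bits - 1
    return cmin + q * (cmax - cmin) / r


codes = [snpe_style_quantize(v, -1.0, 1.0) for v in (-5.0, -1.0, 0.0, 1.0, 5.0)]
```

With `cmin = -1.0` and `cmax = 1.0`, values outside the range are clamped to the integer bounds, and the endpoints map back exactly through `dequantize`.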