Commit 673d0d4

add int8 mobilenet & faster_rcnn (#504)

* add mobilenet & faster_rcnn (Signed-off-by: mengniwa <[email protected]>)
* fix readme (Signed-off-by: mengniwa <[email protected]>)
* update model link (Signed-off-by: mengniwa <[email protected]>)

Co-authored-by: Chun-Wei Chen <[email protected]>

1 parent 03458a1 commit 673d0d4

File tree

10 files changed: +94 additions, -2 deletions

vision/classification/mobilenet/README.md

Lines changed: 32 additions & 0 deletions
@@ -16,6 +16,13 @@ The below model is using multiplier value as 1.0.
 |Model |Download |Download (with sample test data)| ONNX version |Opset version|Top-1 accuracy (%)|Top-5 accuracy (%)|
 |-------------|:--------------|:--------------|:--------------|:--------------|:--------------|:--------------|
 |MobileNet v2-1.0| [13.6 MB](model/mobilenetv2-7.onnx) | [14.1 MB](model/mobilenetv2-7.tar.gz) | 1.2.1 | 7| 70.94 | 89.99 |
+|MobileNet v2-1.0-fp32| [13.3 MB](model/mobilenetv2-12.onnx) | [12.9 MB](model/mobilenetv2-12.tar.gz) | 1.9.0 | 12| 69.48 | 89.26 |
+|MobileNet v2-1.0-int8| [3.5 MB](model/mobilenetv2-12-int8.onnx) | [3.7 MB](model/mobilenetv2-12-int8.tar.gz) | 1.9.0 | 12| 68.30 | 88.44 |
+> Compared with fp32 MobileNet v2-1.0, the int8 model's Top-1 accuracy declines by 1.70% (relative), its Top-5 accuracy declines by 0.92% (relative), and its performance improves by 1.05x.
+>
+> Note that performance depends on the test hardware.
+>
+> Performance data here is collected with an Intel® Xeon® Platinum 8280 Processor, 1 socket and 4 cores per instance, on CentOS Linux 8.3, with batch size 1.
 
 ## Inference
 We used MXNet as framework with gluon APIs to perform inference. View the notebook [imagenet_inference](../imagenet_inference.ipynb) to understand how to use above models for doing inference. Make sure to specify the appropriate model name in the notebook.
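The decline ratios quoted in the note above can be reproduced from the table's accuracy numbers. A quick check in Python (assuming "decline ratio" means the relative drop, (fp32 - int8) / fp32):

```python
# Relative accuracy decline: (fp32 - int8) / fp32, expressed in percent.
def decline_ratio(fp32_acc, int8_acc):
    return (fp32_acc - int8_acc) / fp32_acc * 100

# Values from the MobileNet table above:
top1 = decline_ratio(69.48, 68.30)
top5 = decline_ratio(89.26, 88.44)
print(f"Top-1 decline: {top1:.2f}%")  # Top-1 decline: 1.70%
print(f"Top-5 decline: {top5:.2f}%")  # Top-5 decline: 0.92%
```

The 1.05x speedup figure, by contrast, is hardware-dependent and cannot be derived from the table.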
@@ -48,15 +55,40 @@ We used MXNet as framework with gluon APIs to perform training. View the [traini
 ## Validation
 We used MXNet as framework with gluon APIs to perform validation. Use the notebook [imagenet_validation](../imagenet_validation.ipynb) to verify the accuracy of the model on the validation set. Make sure to specify the appropriate model name in the notebook.
 
+## Quantization
+MobileNet v2-1.0-int8 is obtained by quantizing the MobileNet v2-1.0-fp32 model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. View the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/image_recognition/onnx_model_zoo/mobilenet/quantization/ptq/README.md) to understand how to use Intel® Neural Compressor for quantization.
+
+### Environment
+onnx: 1.9.0
+onnxruntime: 1.8.0
+
+### Prepare model
+```shell
+# if wget fetches an HTML page instead of the model, append ?raw=true to the URL
+wget https://github.com/onnx/models/blob/main/vision/classification/mobilenet/model/mobilenetv2-12.onnx
+```
+
+### Model quantize
+Make sure to specify the appropriate dataset path in the configuration file.
+```bash
+# input_model is the fp32 *.onnx model path
+bash run_tuning.sh --input_model=path/to/model \
+                   --config=mobilenetv2.yaml \
+                   --output_model=path/to/save
+```
 
 ## References
 * **MobileNet-v2** Model from the paper [MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381)
 
 * [MXNet](http://mxnet.incubator.apache.org), [Gluon model zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html), [GluonCV](https://gluon-cv.mxnet.io)
 
+* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)
+
 ## Contributors
 * [ankkhedia](https://github.com/ankkhedia) (Amazon AI)
 * [abhinavs95](https://github.com/abhinavs95) (Amazon AI)
+* [mengniwang95](https://github.com/mengniwang95) (Intel)
+* [airMeng](https://github.com/airMeng) (Intel)
+* [ftian1](https://github.com/ftian1) (Intel)
+* [hshen14](https://github.com/hshen14) (Intel)
 
 ## License
 Apache 2.0
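A caveat on the `wget` commands in this commit: they point at GitHub `blob` URLs, and the model files are LFS-tracked, so a naive fetch may return an HTML page or a small Git LFS pointer rather than the model binary itself. The pointer entries added by this commit (below) show the format. A minimal sketch for detecting a pointer and verifying a completed download against its recorded digest; the helper names `parse_lfs_pointer` and `matches_pointer` are illustrative, not part of any tool here:

```python
import hashlib

def parse_lfs_pointer(text):
    """Return {'oid': ..., 'size': ...} if text looks like a Git LFS pointer, else None."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("version https://git-lfs.github.com/spec/"):
        return None
    fields = dict(line.split(" ", 1) for line in lines[1:])
    return {"oid": fields["oid"].split(":", 1)[1], "size": int(fields["size"])}

def matches_pointer(data, pointer):
    """Verify downloaded bytes against the pointer's recorded size and sha256 digest."""
    return len(data) == pointer["size"] and hashlib.sha256(data).hexdigest() == pointer["oid"]

# One of the pointer entries added in this commit:
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:cc028fe6cae7bc11a4ff53cfc9b79c920e8be65ce33a904ec3e2a8f66d77f95f
size 3655033"""
info = parse_lfs_pointer(pointer_text)
print(info["size"])  # 3655033
```

If a download parses as a pointer, the actual bytes still need to be fetched through the LFS media endpoint (e.g. via `git lfs pull` in a clone of the repository).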
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cc028fe6cae7bc11a4ff53cfc9b79c920e8be65ce33a904ec3e2a8f66d77f95f
+size 3655033

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f00ea406993d4b2a404a4c5f4258e862f68aedfbcc324510267e556612501211
+size 3910933

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0c3f76d93fa3fd6580652a45618618a220fced18babf65774ed169de0432ad5
+size 13964571

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f83b422708a708b592a5fd56d84710038511789df3ae510b84123930c37762d
+size 13498787

vision/object_detection_segmentation/faster-rcnn/README.md

Lines changed: 38 additions & 2 deletions
@@ -10,7 +10,13 @@ This model is a real-time neural network for object detection that detects 80 di
 |Model |Download | Download (with sample test data)|ONNX version|Opset version|Accuracy |
 |-------------|:--------------|:--------------|:--------------|:--------------|:--------------|
 |Faster R-CNN R-50-FPN |[167.3 MB](model/FasterRCNN-10.onnx) |[158.0 MB](model/FasterRCNN-10.tar.gz) |1.5 |10 |mAP of 0.35 |
-
+|Faster R-CNN R-50-FPN-fp32 |[168.5 MB](model/FasterRCNN-12.onnx) |[156.2 MB](model/FasterRCNN-12.tar.gz) |1.9 |12 |mAP of 0.3437 |
+|Faster R-CNN R-50-FPN-int8 |[42.6 MB](model/FasterRCNN-12-int8.onnx) |[36.2 MB](model/FasterRCNN-12-int8.tar.gz) |1.9 |12 |mAP of 0.3399 |
+> Compared with fp32 FasterRCNN-12, the int8 model's mAP declines by 1.11% (relative) and its performance improves by 1.43x.
+>
+> Note that performance depends on the test hardware.
+>
+> Performance data here is collected with an Intel® Xeon® Platinum 8280 Processor, 1 socket and 4 cores per instance, on CentOS Linux 8.3, with batch size 1.
 
 
 <hr>
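The same arithmetic reproduces the mAP decline quoted above, and the table's file sizes show the storage saving from int8 quantization:

```python
# Relative mAP decline: (fp32 - int8) / fp32, expressed in percent.
def decline_ratio(fp32_map, int8_map):
    return (fp32_map - int8_map) / fp32_map * 100

# Values from the Faster R-CNN table above:
print(f"mAP decline: {decline_ratio(0.3437, 0.3399):.2f}%")  # mAP decline: 1.11%

# Download-size reduction from quantization (168.5 MB fp32 vs. 42.6 MB int8):
print(f"size reduction: {168.5 / 42.6:.1f}x")  # size reduction: 4.0x
```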
@@ -112,14 +118,44 @@ Metric is COCO box mAP (averaged over IoU of 0.5:0.95), computed over 2017 COCO
 mAP of 0.353
 <hr>
 
+## Quantization
+Faster R-CNN R-50-FPN-int8 is obtained by quantizing the Faster R-CNN R-50-FPN-fp32 model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. View the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/object_detection/onnx_model_zoo/faster_rcnn/quantization/ptq/README.md) to understand how to use Intel® Neural Compressor for quantization.
+
+### Environment
+onnx: 1.9.0
+onnxruntime: 1.8.0
+
+### Prepare model
+```shell
+# if wget fetches an HTML page instead of the model, append ?raw=true to the URL
+wget https://github.com/onnx/models/blob/main/vision/object_detection_segmentation/faster-rcnn/model/FasterRCNN-12.onnx
+```
+
+### Model quantize
+Make sure to specify the appropriate dataset path.
+```bash
+# input_model is the fp32 *.onnx model path
+bash run_tuning.sh --input_model=path/to/model \
+                   --config=faster_rcnn.yaml \
+                   --data_path=path/to/COCO2017 \
+                   --output_model=path/to/save
+```
+<hr>
+
 ## Publication/Attribution
 Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Conference on Neural Information Processing Systems (NIPS), 2015.
 
 Massa, Francisco and Girshick, Ross. maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch. [facebookresearch/maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark).
 <hr>
 
 ## References
-This model is converted from [facebookresearch/maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) with modifications in [repository](https://github.com/BowenBao/maskrcnn-benchmark/tree/onnx_stage).
+* This model is converted from [facebookresearch/maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) with modifications in [repository](https://github.com/BowenBao/maskrcnn-benchmark/tree/onnx_stage).
+
+* [Intel® Neural Compressor](https://github.com/intel/neural-compressor)
+<hr>
+
+## Contributors
+* [mengniwang95](https://github.com/mengniwang95) (Intel)
+* [airMeng](https://github.com/airMeng) (Intel)
+* [ftian1](https://github.com/ftian1) (Intel)
+* [hshen14](https://github.com/hshen14) (Intel)
 <hr>
 
 ## License
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6bb169f20bd4bf1c08212f2c8b2693a22a6343454f8115fd8c0fadd2a49922a
+size 44626453

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:599fcd738eb08d8d5daee9edb332f785c59f9a3c704292ba420a41c1b43404e2
+size 37955516

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:789104f5a47f008450a37e02da8a8d3642766ed10d91238312b1532fe72789a6
+size 176713194

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67208685047b27a201c1b50a3a7a0ea25161e9988bc4211adb74c85637c33860
+size 163814449
