Computationally efficient CNN architecture designed specifically for mobile devices.
ShuffleNet is a deep convolutional network for image classification. [ShuffleNetV2](https://pytorch.org/hub/pytorch_vision_shufflenet_v2/) is an improved architecture that offers a state-of-the-art speed/accuracy trade-off for image classification.
> Compared with fp32 ShuffleNet-v2, int8 ShuffleNet-v2's Top-1 error rises by 0.59%, its Top-5 error rises by 1.71%, and inference performance improves by 1.62x.
>
> Note that performance depends on the test hardware.
>
> Performance data here was collected with an Intel® Xeon® Platinum 8280 Processor (1 socket, 4 cores per instance) on CentOS Linux 8.3, with batch size 1.
## Inference
[This script](ShufflenetV2-export.py) converts the ShuffleNetv2 model from PyTorch to ONNX and uses ONNX Runtime for inference.
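
For orientation, here is a minimal sketch of such an export-and-check flow, assuming the torchvision ShuffleNetV2 model and illustrative file names (the linked script is the source of truth):

```python
import torch
import torchvision.models as models
import onnxruntime as ort

# Load pretrained ShuffleNetV2 (1.0x) from torchvision and switch to eval mode
model = models.shufflenet_v2_x1_0(pretrained=True)
model.eval()

# Export to ONNX with a fixed 1x3x224x224 dummy input
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "shufflenet-v2.onnx",
                  input_names=["input"], output_names=["output"])

# Sanity-check the exported model with ONNX Runtime
session = ort.InferenceSession("shufflenet-v2.onnx",
                               providers=["CPUExecutionProvider"])
logits = session.run(None, {"input": dummy_input.numpy()})[0]
print("Output shape:", logits.shape)  # (1, 1000) ImageNet class logits
```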
For training we use the train+val set in COCO, except for 5000 images from the minival set.
Details of performance on COCO object detection are provided in [this paper](https://arxiv.org/pdf/1807.11164v1.pdf).
<hr>
## Quantization
ShuffleNet-v2-int8 is obtained by quantizing the ShuffleNet-v2-fp32 model. We use [Intel® Neural Compressor](https://github.com/intel/neural-compressor) with the onnxruntime backend to perform quantization. See the [instructions](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/onnx_model_zoo/shufflenet/README.md) to understand how to use Intel® Neural Compressor for quantization.
Make sure to specify the appropriate dataset path in the configuration file.
```bash
# --input_model: model path as *.onnx
bash run_tuning.sh --input_model=path/to/model \
                   --config=shufflenetv2.yaml \
                   --output_model=path/to/save
```
### Model inference
We use onnxruntime to run inference with ShuffleNetv2_fp32 and ShuffleNetv2_int8. See the [onnxrt_inference](../onnxrt_inference.ipynb) notebook for how to run inference with these two models, as well as the preprocessing and postprocessing steps we use.
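
As a rough guide, the sketch below shows a typical ImageNet-style preprocess/run/postprocess flow with onnxruntime; it is an assumption based on standard practice rather than a copy of the notebook, and the file names are illustrative:

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

def preprocess(image_path):
    # Resize to 256x256, then center-crop to 224x224
    img = Image.open(image_path).convert("RGB").resize((256, 256))
    img = img.crop((16, 16, 240, 240))
    x = np.asarray(img, dtype=np.float32) / 255.0
    # Normalize with the standard ImageNet mean/std, then HWC -> NCHW
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    x = (x - mean) / std
    return x.transpose(2, 0, 1)[np.newaxis, ...]

session = ort.InferenceSession("shufflenet-v2.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
logits = session.run(None, {input_name: preprocess("example.jpg")})[0]

# Postprocess: numerically stable softmax over the logits, then top-1 class
exps = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exps / exps.sum(axis=1, keepdims=True)
print("Top-1 class index:", int(probs.argmax()))
```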