
Commit 1963af7

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_clip_op
2 parents 4eb4438 + b5ebca4

File tree

1 file changed: +72 −0 lines
  • python/paddle/fluid/contrib/int8_inference

# Offline INT8 Calibration Tool

PaddlePaddle supports offline INT8 calibration to accelerate inference. This document explains how to enable INT8 calibration and reports the accuracy results for ResNet-50 and MobileNet-V1.

## 0. Prerequisite

You need to install at least the PaddlePaddle 1.3 Python package: `pip install paddlepaddle==1.3`.

## 1. How to generate INT8 model

You can refer to the unit test in [test_calibration.py](../tests/test_calibration.py). Basically, there are three steps:

* Construct the calibration object.

```python
calibrator = int8_utility.Calibrator(  # Step 1
    program=infer_program,            # required, FP32 program
    pretrained_model=model_path,      # required, FP32 pretrained model
    algo=algo,                        # required, calibration algorithm; default is max, the alternative is KL (Kullback–Leibler divergence)
    exe=exe,                          # required, executor
    output=int8_model,                # required, INT8 model output path
    feed_var_names=feed_dict,         # required, feed variable names
    fetch_list=fetch_targets)         # required, fetch targets
```

* Call `calibrator.sample_data()` after each executor run.

```python
_, acc1, _ = exe.run(
    program,
    feed={feed_dict[0]: image,
          feed_dict[1]: label},
    fetch_list=fetch_targets)

calibrator.sample_data()  # Step 2
```

* Call `calibrator.save_int8_model()` after sampling over the specified number of iterations (e.g., iterations = 50).

```python
calibrator.save_int8_model()  # Step 3
```
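For intuition, the two calibration algorithms mentioned above can be sketched in plain NumPy: `max` maps the largest absolute activation to 127, while `KL` searches for a clipping threshold whose quantized histogram stays closest (in Kullback–Leibler divergence) to the original activation distribution. This is an illustrative simplification, not PaddlePaddle's implementation:

```python
import numpy as np

def max_scale(activations):
    # "max" algorithm: map the largest absolute activation to 127.
    return float(np.abs(activations).max()) / 127.0

def _kl_divergence(p, q):
    # KL(p || q) over histogram counts, ignoring empty reference bins.
    p = p / p.sum()
    q = np.where(q > 0, q, 1e-12)
    q = q / q.sum()
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def kl_scale(activations, num_bins=2048, num_quantized_bins=128):
    # Simplified "KL" algorithm: try clipping thresholds and keep the one
    # whose 128-level requantized histogram diverges least from the original.
    hist, edges = np.histogram(np.abs(activations), bins=num_bins)
    best_div, best_threshold = float("inf"), float(edges[-1])
    for i in range(num_quantized_bins, num_bins + 1, num_quantized_bins):
        p = hist[:i].astype(np.float64)
        p[-1] += hist[i:].sum()              # fold the clipped tail into the last bin
        chunk = i // num_quantized_bins
        q = np.zeros_like(p)
        for j in range(num_quantized_bins):  # requantize to 128 levels
            seg = p[j * chunk:(j + 1) * chunk]
            nz = seg > 0
            if nz.any():
                q[j * chunk:(j + 1) * chunk][nz] = seg.sum() / nz.sum()
        div = _kl_divergence(p, q)
        if div < best_div:
            best_div, best_threshold = div, float(edges[i])
    return best_threshold / 127.0

x = np.random.default_rng(0).normal(size=10000)
print(max_scale(x), kl_scale(x))
```

With outlier-heavy activations, `kl_scale` typically picks a smaller clipping threshold than `max_scale`, which is why KL calibration tends to be more robust to rare large activations.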
## 2. How to run INT8 model

You can load the INT8 model with the load_inference_model [API](https://github.com/PaddlePaddle/Paddle/blob/8b50ad80ff6934512d3959947ac1e71ea3fb9ea3/python/paddle/fluid/io.py#L991) and run INT8 inference similarly to [FP32](https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/object_detection/eval.py "FP32").

```python
[infer_program, feed_dict,
 fetch_targets] = fluid.io.load_inference_model(model_path, exe)
```
## 3. Result

We provide the accuracy results measured on the [Intel® Xeon® Gold 6148 Processor](https://ark.intel.com/products/120489/Intel-Xeon-Gold-6148-Processor-27-5M-Cache-2-40-GHz- "Intel® Xeon® Gold 6148 Processor") (also known as Intel® Xeon® Skylake 6148).

| Model        | Dataset           | FP32 Accuracy | INT8 Accuracy | Accuracy Diff |
| ------------ | ----------------- | ------------- | ------------- | ------------- |
| ResNet-50    | Small             | 72.00%        | 72.00%        | 0.00%         |
| MobileNet-V1 | Small             | 62.00%        | 62.00%        | 0.00%         |
| ResNet-50    | Full ImageNet Val | 76.63%        | 76.17%        | 0.46%         |
| MobileNet-V1 | Full ImageNet Val | 70.78%        | 70.49%        | 0.29%         |
Please note that [Small](http://paddle-inference-dist.cdn.bcebos.com/int8/calibration_test_data.tar.gz "Small") is a subset of the [full ImageNet validation dataset](http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar "full ImageNet validation dataset").
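As a sanity check, the Accuracy Diff column in the table above is simply the FP32 accuracy minus the INT8 accuracy:

```python
# (FP32 accuracy %, INT8 accuracy %) from the table above
results = {
    ("ResNet-50", "Full ImageNet Val"): (76.63, 76.17),
    ("MobileNet-V1", "Full ImageNet Val"): (70.78, 70.49),
}
for (model, dataset), (fp32, int8) in results.items():
    # Accuracy Diff = FP32 accuracy − INT8 accuracy
    print(f"{model}, {dataset}: {fp32 - int8:.2f}% accuracy drop")
```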
Notes:
* The accuracy measurement requires the model to include a `label` variable.
* The INT8 theoretical speedup is ~1.33X on Intel® Xeon® Skylake Servers (see `This allows for 4x more input at the cost of 3x more instructions or 33.33% more compute` in this [Reference](https://software.intel.com/en-us/articles/lower-numerical-precision-deep-learning-inference-and-training "Reference")).
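The 1.33X figure follows from simple arithmetic: on Skylake (AVX-512 without VNNI), each INT8 vector instruction processes 4x more elements than FP32, but the INT8 multiply-accumulate sequence needs 3x more instructions:

```python
# INT8 vs FP32 on Skylake AVX-512 (no VNNI): 4x more input per vector
# instruction, at the cost of a 3-instruction multiply-accumulate sequence.
more_input = 4         # 4x more elements per vector instruction
more_instructions = 3  # 3x more instructions per multiply-accumulate
speedup = more_input / more_instructions
print(f"{speedup:.2f}X")  # 1.33X
```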
## 4. How to reproduce the results

* Small dataset

```bash
python python/paddle/fluid/contrib/tests/test_calibration.py
```

* Full dataset

```bash
DATASET=full python python/paddle/fluid/contrib/tests/test_calibration.py
```
