Commit dc4be7c
Qualcomm AI Engine Direct - Observer Fix and remove unused passes (pytorch#6225)
Summary:
- `ConvertToLinear()` is redundant in `qnn_preprocess.py` since this pass is already called in `executorch/backends/qualcomm/utils/utils.py`
- Some models are experiencing a significant drop in accuracy, with a few models having 0% accuracy. Adding new conditions to perform requantization and change ptq_per_channel_quant_config's IO from MinMaxObserver to MovingAverageMinMaxObserver to resolve the issue.
1. Why adding new conditions to do requantization? We noticed this change in PyTorch PR (pytorch/pytorch@b8eef50#diff-976c3b0c6f85048d3db01a0c394ce8eb16e2f7541f0983d0f4ef549baa4be822L152). Before this PR, quantization spec only checks whether 2 qspecs were same by comparing `dtype` and `is_dynamic`. After this change, it checks for more attributes such as `scale`, `zero_point`, etc. This causes some nodes having an extra pair of QDQ nodes. As shown in the image below, there are 2 pairs of QDQ nodes after the PyTorch PR, and these 2 pairs of QDQ nodes have different scale and offset. For QNN lowering process, node will only save the quant info right after the node output. For example, `cat` op below will use `quantize_per_tensor_default_18`'s scale and offset as the node's quant attribute, and all other quant and dequant nodes will be ignored.
This causes an accuracy drop, but by inserting a requantize node, we can see an improvement in accuracy for most models. Taking inceptionv3 as an example, the average top1 accuracy 0%->~75%. I have checked a couple other models and see accuracy either stays the same or have improvements.
I have also provided the option for users to skip this requant optimization if they preferred not to use it.
**Before:**

___
**After**

2. Why change ptq_per_channel_quant_config's IO from MinMaxObserver to MovingAverageMinMaxObserver?
After the above change, it seems like there is an inference speed drop due to requantization. By switching to MovingAverageMinMaxObserver, I observed an improvement in inference speed for some models such as inceptionv3.
Pull Request resolved: pytorch#6225
Reviewed By: kirklandsign
Differential Revision: D64413835
Pulled By: cccclai
fbshipit-source-id: a8be66b034c69ff403f9f2985f2b584695f3798b1 parent 423f65d commit dc4be7c
File tree
5 files changed
+30
-8
lines changed- backends/qualcomm
- _passes
- quantizer
- utils
5 files changed
+30
-8
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
30 | | - | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
31 | 33 | | |
32 | 34 | | |
| 35 | + | |
33 | 36 | | |
34 | 37 | | |
35 | 38 | | |
| |||
68 | 71 | | |
69 | 72 | | |
70 | 73 | | |
71 | | - | |
72 | | - | |
73 | | - | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
74 | 94 | | |
75 | 95 | | |
76 | 96 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
14 | | - | |
15 | 14 | | |
16 | 15 | | |
17 | 16 | | |
| |||
49 | 48 | | |
50 | 49 | | |
51 | 50 | | |
52 | | - | |
53 | 51 | | |
54 | 52 | | |
55 | 53 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
364 | 364 | | |
365 | 365 | | |
366 | 366 | | |
367 | | - | |
| 367 | + | |
368 | 368 | | |
369 | 369 | | |
370 | 370 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| 29 | + | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
69 | 69 | | |
70 | 70 | | |
71 | 71 | | |
| 72 | + | |
72 | 73 | | |
73 | 74 | | |
74 | 75 | | |
| |||
305 | 306 | | |
306 | 307 | | |
307 | 308 | | |
308 | | - | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
309 | 312 | | |
310 | 313 | | |
311 | 314 | | |
| |||
0 commit comments