Commit bbe9c2b
Don't constant fold Quantize/DequantizeLinear nodes by default (#2713)
I added support for exporting `QuantizeLinear`/`DequantizeLinear` nodes
(from `fake_quantize_per_*_affine` torch operators) in a previous PR.
Unfortunately, the current default onnxscript optimizer settings tend to
automatically remove any weight quantization. This is because every input
of the `Weight -> QDQ -> ...` pattern is constant, so the pattern looks
like it can simply be constant folded to `QDQ(Weight) -> ...`.
I believe that this behavior is not desirable, since the presence of
`QDQ` nodes in the graph is what allows inference engines to run the
supported computations using quantized data types. So the purpose of
`QDQ` nodes is to hold the relevant quantization "metadata". As such,
they normally shouldn't be constant folded.
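
To make the loss concrete, here is a minimal NumPy sketch of what folding a `Weight -> QDQ` chain ahead of time produces. The helper names, the int8 range, and the sample values are assumptions chosen for illustration; this is not the exporter's or the optimizer's actual code.

```python
import numpy as np

def quantize_linear(x, scale, zero_point, qmin=-128, qmax=127):
    # int8 QuantizeLinear semantics: round half to even, shift, saturate.
    q = np.rint(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize_linear(q, scale, zero_point):
    # DequantizeLinear semantics: (q - zero_point) * scale, back in float.
    return (q.astype(np.int32) - zero_point).astype(np.float32) * scale

# Illustrative values only.
weight = np.array([0.26, -0.51, 1.0, 0.03], dtype=np.float32)
scale, zero_point = np.float32(0.1), 0

quantized = quantize_linear(weight, scale, zero_point)
# What a constant folder would store in place of the QDQ chain:
folded = dequantize_linear(quantized, scale, zero_point)
```

After folding, only `folded` would remain in the graph: a plain float32 tensor whose values happen to sit on the quantization grid. The int8 values, `scale`, and `zero_point` are no longer visible to an inference engine.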
I have extended the existing logic in `FoldConstantsPass` that was used
to exclude `ConstantOfShape` from constant folding.
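
The idea of an exclusion list can be sketched with a toy folder in plain Python. Everything here (`Node`, `foldable_outputs`, `DEFAULT_EXCLUDED_OPS`) is a hypothetical stand-in written for this illustration and does not mirror the real `FoldConstantsPass` internals.

```python
from dataclasses import dataclass

# Hypothetical default: op types that are never folded, even when
# all of their inputs are constant.
DEFAULT_EXCLUDED_OPS = frozenset(
    {"ConstantOfShape", "QuantizeLinear", "DequantizeLinear"}
)

@dataclass
class Node:
    op_type: str
    inputs: list[str]
    output: str

def foldable_outputs(nodes, constants, excluded_ops=DEFAULT_EXCLUDED_OPS):
    """Return the outputs a toy constant folder would precompute.

    `nodes` is assumed to be in topological order.
    """
    known = set(constants)
    folded = []
    for node in nodes:
        if node.op_type in excluded_ops:
            continue  # node is kept; its output is not treated as constant
        if all(name in known for name in node.inputs):
            known.add(node.output)
            folded.append(node.output)
    return folded

graph = [
    Node("QuantizeLinear", ["w", "scale", "zp"], "q"),
    Node("DequantizeLinear", ["q", "scale", "zp"], "dq"),
    Node("Transpose", ["w"], "w_t"),
]
constants = {"w", "scale", "zp"}

with_exclusion = foldable_outputs(graph, constants)
without_exclusion = foldable_outputs(graph, constants, excluded_ops=frozenset())
```

With the exclusion set in place only the `Transpose` output is folded and the `QDQ` pair survives. Passing an empty set reproduces the old behavior, where the whole chain collapses.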
I haven't found any tests verifying this behavior for `ConstantOfShape`,
and I'm not sure how to set up such a unit test, so I have left this
code untested for now. If adding tests is mandatory, please give me a
hint on where I should add such a test and what the best way would be to
check/assert that the optimized graph matches the expectations
(hopefully without reinventing the wheel or manually introspecting the
`ir.Model` object).

1 parent 3e7d9fb commit bbe9c2b
1 file changed: +18 -7 lines changed

Only the line numbers of the diff survive; the changed code itself is not shown. The two hunks are:

- `@@ -26,6 +26,14 @@`: 8 lines added (new lines 29-36)
- `@@ -1226,14 +1234,17 @@`: 7 lines removed (old lines 1229 and 1231-1236), 10 lines added (new lines 1237 and 1239-1247)