[TorchFX] quantize_pt2e custom quantizers support
#3487
Conversation
UINT8 = "UINT8"

class ExtendedQuantizerSetup(ABC, SingleConfigQuantizerSetup):
This change requires offline discussion.
Discussed offline; ExtendedQuantizerSetup is replaced with ExtendedQuantizerConfig.
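For illustration, a minimal sketch of what such a config class could look like, assuming it extends NNCF's QuantizerConfig and mirrors the constructor signature quoted later in this thread; the import paths and the base-class signature are assumptions, not the PR's actual code:

```python
from nncf.common.quantization.structs import QuantizerConfig  # import path assumed
from nncf.tensor import TensorDataType  # import path assumed


class ExtendedQuantizerConfig(QuantizerConfig):
    """Quantizer config that also carries the target dtype of the q->dq pair."""

    def __init__(self, num_bits, mode, signedness_to_force, per_channel, narrow_range, dest_dtype):
        # Base parameters are passed through unchanged; only dest_dtype is new.
        super().__init__(num_bits, mode, signedness_to_force, per_channel, narrow_range)
        self.dest_dtype = dest_dtype
```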
…#3541) Splitting of the huge #3487 PR:

### Changes
The default `DuplicateDQPass`, which does not work without torch.ao annotations, is replaced with the working `DuplicateDQPassNoAnnotations`.

### Reason for changes
To fix `DuplicateDQPass`.

### Example
(image)

### Related tickets
#3487 #3231

### Tests
TorchFX conformance test references are updated to test the fixed `DuplicateDQPassNoAnnotations` pass.
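As background for that replacement, a hedged sketch of the duplicate-DQ idea itself (not the actual DuplicateDQPassNoAnnotations implementation): every extra consumer of a dequantize node gets its own copy, so later passes can handle each q->dq pair per branch. The `is_dq` predicate is a placeholder for matching dequantize ops such as `quantized_decomposed.dequantize_per_tensor`.

```python
import torch.fx as fx


def duplicate_dequantize_nodes(gm: fx.GraphModule, is_dq) -> fx.GraphModule:
    """Give every extra consumer of a dequantize node its own copy of that node."""
    for node in list(gm.graph.nodes):
        if not is_dq(node):
            continue
        users = list(node.users)
        if len(users) <= 1:
            continue
        # The first consumer keeps the original DQ node; the rest get clones
        # inserted right after it, with their inputs rewired to the clone.
        for user in users[1:]:
            with gm.graph.inserting_after(node):
                dq_copy = gm.graph.node_copy(node)
            user.replace_input_with(node, dq_copy)
    gm.graph.lint()
    gm.recompile()
    return gm
```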
@AlexanderDokuchaev, I hope we can merge this PR without the refactoring of TensorDataType and a registry for the new ExtendedQuantizerConfig class. Please let me know ASAP if it is not possible.
:param dest_dtype: Target integer data type for quantized values.
"""
super().__init__(num_bits, mode, signedness_to_force, per_channel, narrow_range)
if dest_dtype not in [TensorDataType.int8, TensorDataType.uint8]:
General question: why is ExtendedQuantizerConfig responsible for checking the type?
It is the algorithms' responsibility to use only valid types; the config is just a structure that, in the general case, can contain any type.
I suggest removing this check, checking the type in the algorithms if needed, and removing test_structs.py.
Because other types are not supported, no? I don't want to check types in all the other places; I want to check them at the beginning. A config with dest_dtype bfloat16 or int4 does not make any sense right now.
Type validation is the responsibility of the algorithms, not of a structure that just contains parameters.
The config will be valid with any dest_dtype, but the quantization algorithm supports only uint8 and int8.
Done
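Following that resolution, the dtype check lives in the algorithm rather than in the config; a hedged sketch of what such a guard might look like (the helper name and the error type are illustrative, and the import path is assumed):

```python
from nncf.tensor import TensorDataType  # import path assumed

SUPPORTED_DEST_DTYPES = (TensorDataType.int8, TensorDataType.uint8)


def _validate_dest_dtype(qconfig) -> None:
    # The config stays a plain parameter container; only the quantization
    # algorithm rejects dtypes it cannot handle.
    if qconfig.dest_dtype not in SUPPORTED_DEST_DTYPES:
        raise ValueError(f"Unsupported dest_dtype {qconfig.dest_dtype}; only int8 and uint8 are supported.")
```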
  metric_value: 0.2429
torchvision/vit_b_16_backend_X86_QUANTIZER_NNCF:
  metric_value: 0.80922
torchvision/vit_b_16_backend_X86_QUANTIZER_AO:
I will not merge tests that run for several days, even if they are disabled by default.
post_training_quantization/681/ - still in progress after 7 days.
What is the reason it takes so long? Did you check that there are no bugs?
I found the issue: validation using PyTorch runs on all available CPU cores by default, but there is a CPU limit set in the CI environment. As a result, validation runs slower due to CPU throttling.
I tried to fix it by setting the TORCH_NUM_THREADS env variable in CI.
post_training_quantization/685/
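For context, PyTorch does not read a TORCH_NUM_THREADS variable natively, so a harness would have to apply it explicitly; a minimal illustrative snippet (not the actual CI change):

```python
import os

import torch

# Cap intra-op parallelism when the CI runner's CPU quota is lower than the
# number of visible cores; otherwise oversubscription causes throttling.
num_threads = int(os.environ.get("TORCH_NUM_THREADS", "0"))
if num_threads > 0:
    torch.set_num_threads(num_threads)
```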
alexsu52 left a comment:
LGTM
Changes
ExtendedQuantizerSetup is introduced. It contains additional info about dtypes to use in q->dq pairs.

Reason for changes
To fully support quantization via quantize_pt2e with custom quantizers (like XNNPACKQuantizer).

Related tickets
#3231

Tests
tests/torch/fx/test_calculation_quantizer_params.py
Conformance runs with OV_QUANTIZER_NNCF and OV_QUANTIZER_AO
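To show the workflow this PR targets, a hedged usage sketch: the quantize_pt2e import path, its exact signature, and the XNNPACKQuantizer location vary between NNCF and PyTorch versions, and MyModel is a placeholder model, not something from the PR.

```python
import torch
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

import nncf
from nncf.experimental.torch.fx import quantize_pt2e  # import path assumed

model = MyModel().eval()                      # placeholder float model
example_input = torch.randn(1, 3, 224, 224)

# Capture the model in the PT2 export representation expected by quantize_pt2e.
exported_model = torch.export.export(model, (example_input,)).module()

# Any torch.ao-style quantizer can drive the annotation step.
quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

calibration_dataset = nncf.Dataset([example_input])
quantized_model = quantize_pt2e(exported_model, quantizer, calibration_dataset)
```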