
Conversation

@daniil-lyakhov (Collaborator) commented May 8, 2025

Changes

  • TorchAOAdapter is updated with a new entity, the ExtendedQuantizerSetup, which contains additional info about the dtypes to use in q->dq pairs (see the sketch below)
  • The TorchFX MinMax algorithm backend is migrated from the restrained strip procedure to flexible custom quantization parameter assignment code
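For illustration, a minimal editorial sketch (not from the PR) of the decomposed q->dq pair whose target integer dtype is the kind of information the extended setup carries; the exact ops and the registration import NNCF relies on may differ:

```python
# Editorial sketch: a decomposed q->dq pair parameterized by the target
# integer dtype. Scale/zero-point values are purely illustrative.
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  (private module; registers quantized_decomposed ops)

x = torch.randn(4, 4)
scale, zero_point = 0.02, 0
qmin, qmax = -128, 127  # int8 range; uint8 would use 0..255

q = torch.ops.quantized_decomposed.quantize_per_tensor.default(
    x, scale, zero_point, qmin, qmax, dtype=torch.int8
)
dq = torch.ops.quantized_decomposed.dequantize_per_tensor.default(
    q, scale, zero_point, qmin, qmax, dtype=torch.int8
)
```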

Reason for changes

To fully support quantization via quantize_pt2e with custom quantizers (such as XNNPACKQuantizer).
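A hedged sketch of the target workflow, assuming the quantize_pt2e entry point from nncf.experimental.torch.fx and torch.ao's XNNPACKQuantizer; exact import paths and signatures may differ between NNCF and PyTorch versions:

```python
# Sketch only: quantize a PT2E-exported model with a custom torch.ao
# quantizer while NNCF computes the quantization parameters.
# Import paths below are assumptions and may vary across versions.
import torch
import nncf
from nncf.experimental.torch.fx import quantize_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

model = MyModel().eval()  # hypothetical nn.Module
example_inputs = (torch.randn(1, 3, 224, 224),)

# Capture the model into an FX graph via PT2 export.
exported = torch.export.export_for_training(model, example_inputs).module()

# The custom quantizer decides where and how to quantize.
quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())

calibration_dataset = nncf.Dataset([example_inputs[0]])
quantized = quantize_pt2e(exported, quantizer, calibration_dataset)
```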

Related tickets

#3231

Tests

  • Flexible custom quantization parameter assignment is tested by tests/torch/fx/test_calculation_quantizer_params.py
  • The conformance test is updated with 2 new configurations:
    OV_QUANTIZER_NNCF, OV_QUANTIZER_AO
  • conformance: post_training_quantization/682/

@daniil-lyakhov changed the title from "Dl/fx/dont use nncf q" to "[TorchFX] quantize_pt2e custom quantizers support" Jun 6, 2025
@daniil-lyakhov marked this pull request as ready for review June 6, 2025
@daniil-lyakhov requested a review from a team as a code owner June 6, 2025
@alexsu52 self-requested a review June 7, 2025
UINT8 = "UINT8"


class ExtendedQuantizerSetup(ABC, SingleConfigQuantizerSetup):
Review comment:

This change requires offline discussion.

daniil-lyakhov (Collaborator, Author) replied:

Discussed offline; ExtendedQuantizerSetup is replaced with ExtendedQuantizerConfig.

AlexanderDokuchaev pushed a commit that referenced this pull request Jun 16, 2025
…#3541)

Splitting of the huge #3487 PR:

### Changes

The default `DuplicateDQPass`, which does not work without torch.ao
annotations, is replaced with the working `DuplicateDQPassNoAnnotations`.

### Reason for changes

To fix `DuplicateDQPass`

### Example

![image](https://github.com/user-attachments/assets/2f7ec0e5-e1ab-4e44-92ed-9f1fabbf13cb)
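Below is an editorial sketch (not from the PR) of the idea behind DQ duplication on a plain torch.fx graph: a node shared by several consumers is cloned so each consumer owns its own copy, which is what the pass does for dequantize nodes. The `shared` function and the `call_function` filter here are illustrative stand-ins:

```python
# Editorial sketch of DQ duplication: a node with several users is cloned
# so every user gets its own copy. The real DuplicateDQPassNoAnnotations
# operates on quantized_decomposed dequantize ops instead of relu.
import torch
import torch.fx as fx


def shared(x):
    y = torch.relu(x)    # stand-in for a shared dequantize node
    return y + 1, y * 2  # two users of the same node


gm = fx.symbolic_trace(shared)
for node in list(gm.graph.nodes):
    if node.op == "call_function" and len(node.users) > 1:
        for user in list(node.users)[1:]:
            with gm.graph.inserting_before(user):
                clone = gm.graph.node_copy(node)  # duplicate the shared node
            user.replace_input_with(node, clone)
gm.recompile()
```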

### Related tickets

#3487 
#3231

### Tests

TorchFX conformance test references are updated to test the fixed
`DuplicateDQPassNoAnnotations` pass.
@daniil-lyakhov (Collaborator, Author) commented:

@AlexanderDokuchaev, I hope we can merge this PR without the refactoring of TensorDataType and a registry for the new ExtendedQuantizerConfig class. Please let me know ASAP if it is not possible

:param dest_dtype: Target integer data type for quantized values.
"""
super().__init__(num_bits, mode, signedness_to_force, per_channel, narrow_range)
if dest_dtype not in [TensorDataType.int8, TensorDataType.uint8]:
Collaborator:

A general question: why is ExtendedQuantizerConfig responsible for checking the type?
It is the responsibility of the algorithms to use only valid types; the config is just a structure that in the general case can contain any type.
I suggest removing this check, checking the type in the algorithms if needed, and removing test_structs.py.

daniil-lyakhov (Collaborator, Author) replied:

Because other types are not supported, no? I don't want to check types in all other places; I want to check them at the beginning. A config with dest_dtype bfloat16 or int4 does not make any sense right now.

Collaborator:

Type validation is the responsibility of the algorithms, not of a structure that just contains parameters.
The config should be valid with any dest_dtype, while the quantization algorithm supports only uint8 and int8.

daniil-lyakhov (Collaborator, Author) replied:

Done
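For reference, a minimal editorial sketch of the agreed split, with the config as a plain parameter container and the dtype check living in the algorithm; class and field names follow the discussion, but the actual NNCF definitions may differ:

```python
# Sketch of the agreed separation of concerns; the names below follow the
# discussion, not the exact NNCF source.
from dataclasses import dataclass
from enum import Enum


class TensorDataType(Enum):  # simplified stand-in for NNCF's TensorDataType
    int8 = "int8"
    uint8 = "uint8"
    bfloat16 = "bfloat16"


@dataclass
class ExtendedQuantizerConfig:
    # Plain container: no validation here, any dest_dtype is storable.
    num_bits: int = 8
    per_channel: bool = False
    narrow_range: bool = False
    dest_dtype: TensorDataType = TensorDataType.int8


def run_quantization(config: ExtendedQuantizerConfig) -> None:
    # The algorithm, not the config, rejects unsupported target dtypes.
    if config.dest_dtype not in (TensorDataType.int8, TensorDataType.uint8):
        raise ValueError(f"Unsupported dest_dtype: {config.dest_dtype}")
    ...
```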

metric_value: 0.2429
torchvision/vit_b_16_backend_X86_QUANTIZER_NNCF:
metric_value: 0.80922
torchvision/vit_b_16_backend_X86_QUANTIZER_AO:
Collaborator:

I will not merge tests that run for several days, even if they are disabled by default.
post_training_quantization/681/ - still in progress after 7 days.

What is the reason it takes so long?
Did you check that there are no bugs?


metric_value: 0.2429
torchvision/vit_b_16_backend_X86_QUANTIZER_NNCF:
metric_value: 0.80922
torchvision/vit_b_16_backend_X86_QUANTIZER_AO:
Collaborator:

I found the issue: validation using PyTorch runs on all available CPU cores by default, but there is a CPU limit set in the CI environment. As a result, validation runs slower due to CPU throttling.
Tried to fix it by setting the TORCH_NUM_THREADS env variable in CI.
post_training_quantization/685/
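For reference, a hedged sketch of how such a thread cap can be applied; TORCH_NUM_THREADS is the CI variable mentioned above, and reading it into torch.set_num_threads is one plausible wiring, not necessarily how the pipeline does it:

```python
# Sketch: cap PyTorch intra-op parallelism to match the CI CPU quota.
# TORCH_NUM_THREADS is the env variable mentioned above; how the CI
# pipeline actually consumes it is an assumption.
import os
import torch

num_threads = os.environ.get("TORCH_NUM_THREADS")
if num_threads is not None:
    torch.set_num_threads(int(num_threads))  # intra-op thread pool size
print(torch.get_num_threads())
```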

@daniil-lyakhov requested a review from alexsu52 July 16, 2025
@alexsu52 left a comment:

LGTM

@alexsu52 merged commit 73b39a3 into openvinotoolkit:develop Jul 22, 2025
20 checks passed