
Conversation

daniil-lyakhov (Owner)

Summary

[PLEASE REMOVE] See CONTRIBUTING.md's Pull Requests for ExecuTorch PR guidelines.

[PLEASE REMOVE] If this PR closes an issue, please add a Fixes #<issue-id> line.

[PLEASE REMOVE] If this PR introduces a fix or feature that should be in the upcoming release notes, please add a "Release notes: " label. For a list of available release notes labels, check out CONTRIBUTING.md's Pull Requests.

Test plan

[PLEASE REMOVE] How did you test this PR? Please write down any manual commands you used and note down tests that you have written if applicable.

Enable model quantization. Default is False.

- **`--dataset`** (optional):
Path to the calibration dataset. TODO: decide in what form the dataset should be supported. For the experiment, tiny-imagenet is used; it can be downloaded from http://cs231n.stanford.edu/tiny-imagenet-200.zip, and the path to the extracted directory is passed via this option (see the loading sketch below).


?

daniil-lyakhov (Owner Author)


Fixed

default="CPU",
help="Target device for compiling the model (e.g., CPU, GPU). Default is CPU.",
)


What do you think about adding a --quantization_flow argument that would take the values "pt2e" and "nncf" to specify which flow should be used during quantization?

daniil-lyakhov (Owner Author)


Done, please check
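For reference, the resulting option might look roughly like this (a sketch; the parser variable and the default value are assumptions, not confirmed by the PR):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--quantization_flow",
    choices=["pt2e", "nncf"],
    default="nncf",  # assumed default, not confirmed by the PR
    help="Quantization flow to use: 'pt2e' or 'nncf'.",
)
```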

if suite == "torchvision":
transform = torchvision_models.get_model_weights(model_name).DEFAULT.transforms()
else:
transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))


Does it work for the huggingface suite?

daniil-lyakhov (Owner Author)


No, an exception is added for that case.
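The guard presumably looks something like this (a sketch; the suite name check and the message are assumptions):

```python
if suite == "huggingface":
    msg = "Calibration transforms are not supported for the huggingface suite."
    raise ValueError(msg)
```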

elif isinstance(input_shape, list):
    input_shape = tuple(input_shape)
else:
    msg = "Input shape must be a list or tuple."


It looks like if input_shape is a tuple, we end up in this error branch.

daniil-lyakhov (Owner Author)


Done
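The fixed branching presumably accepts tuples as-is (a sketch of the fix; variable names follow the excerpt above):

```python
if isinstance(input_shape, tuple):
    pass  # already in the expected form
elif isinstance(input_shape, list):
    input_shape = tuple(input_shape)
else:
    msg = "Input shape must be a list or tuple."
    raise ValueError(msg)
```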

# Export the model to the ATen dialect
aten_dialect: ExportedProgram = export(model, example_args)

if quantize:


Could you implement the quantization flow in a separate function?

daniil-lyakhov (Owner Author)


Done
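One way the extracted helper could look, using the standard PT2E entry points (a sketch; the PR may route through NNCF's quantize_pt2e instead, and the function name, arguments, and calibration loop below are assumptions):

```python
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e


def quantize_model(captured_model, quantizer, calibration_loader):
    """Post-training quantization of a captured graph with a PT2E quantizer."""
    prepared = prepare_pt2e(captured_model, quantizer)
    for sample, _target in calibration_loader:
        prepared(sample)  # run calibration on representative inputs
    return convert_pt2e(prepared)
```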

print(f"Model exported and saved as {model_name}.pte on {device}.")
print(f"Model exported and saved as {model_file_name} on {device}.")

if validate:


Could you implement validation in a separate function?

daniil-lyakhov (Owner Author)


Done
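Similarly, a standalone validation helper could compute top-1 accuracy (a sketch; the helper name and the metric are assumptions):

```python
import torch


def validate_model(model, val_loader) -> float:
    """Top-1 accuracy of `model` over `val_loader`."""
    correct = total = 0
    with torch.no_grad():
        for images, targets in val_loader:
            predictions = model(images).argmax(dim=1)
            correct += (predictions == targets).sum().item()
            total += targets.numel()
    return correct / total
```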

from nncf.common.graph.graph import NNCFGraph

QUANT_ANNOTATION_KEY = "quantization_annotation"


I suggest introducing the following class:

class QuantizationMode(StrEnum):
    """
    Defines special quantization modes.
 
    - INT8_SYM: INT8 symmetric quantization for both activations and weights.
    - INT8_MIXED: INT8 asymmetric quantization for activations, symmetric for weights.
    - INT8_TRANSFORMER: Optimized INT8 quantization for transformer-based models.
    """
 
    INT8_SYM = "int8_sym"
    INT8_MIXED = "int8_mixed"
    INT8_TRANSFORMER = "int8_transformer"

daniil-lyakhov (Owner Author)


Done

Comment on lines 36 to 54
def __init__(
    self,
    *,
    mode: Optional[p.QuantizationMode] = None,
    preset: Optional[q.structs.QuantizationPreset] = None,
    target_device: p.TargetDevice = p.TargetDevice.ANY,
    transformer_model: bool = False,
    ignored_scope: Optional[nncf.IgnoredScope] = None,
    overflow_fix: Optional[advanced_p.OverflowFix] = None,
    quantize_outputs: bool = False,
    activations_quantization_params: Optional[advanced_p.QuantizationParameters] = None,
    weights_quantization_params: Optional[advanced_p.QuantizationParameters] = None,
):


What do you think?

Suggested change
def __init__(
    self,
    *,
    mode: Optional[p.QuantizationMode] = None,
    preset: Optional[q.structs.QuantizationPreset] = None,
    target_device: p.TargetDevice = p.TargetDevice.ANY,
    transformer_model: bool = False,
    ignored_scope: Optional[nncf.IgnoredScope] = None,
    overflow_fix: Optional[advanced_p.OverflowFix] = None,
    quantize_outputs: bool = False,
    activations_quantization_params: Optional[advanced_p.QuantizationParameters] = None,
    weights_quantization_params: Optional[advanced_p.QuantizationParameters] = None,
):

def __init__(
    self,
    *,
    mode: QuantizationMode = QuantizationMode.INT8_MIXED,
    ignored_scope: Optional[nncf.IgnoredScope] = None,
    **kwargs
):

daniil-lyakhov (Owner Author)


I like the constructor parameter subset, refactored.

daniil-lyakhov (Owner Author)


Also, I set the default QuantizationMode to INT8_SYM to align with the default MinMax parameters (which were the defaults before): https://github.com/openvinotoolkit/nncf/blob/develop/nncf/quantization/algorithms/min_max/algorithm.py#L211-L215

daniil-lyakhov force-pushed the dl/openvino/model_enabling branch from ddcbb11 to 01b88f8 on February 11, 2025 17:01
self,
*,
mode: Optional[QuantizationMode] = QuantizationMode.INT8_SYM,
ignored_scope: Optional[nncf.IgnoredScope] = None,


Ignored scope is model-specific information. What do you think about introducing an additional method to set the ignored scope?

def set_ignored_scope(
    names: Optional[List[str]] = None,
    patterns: Optional[List[str]] = None,
    types: Optional[List[str]] = None,
    subgraphs: Optional[List[Tuple[List[str], List[str]]]] = None,
    validate: bool = True,
)

daniil-lyakhov (Owner Author)


Done
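Hypothetical usage of the new method (the quantizer construction and the ignored-scope values below are illustrative assumptions, not taken from the PR):

```python
quantizer = OpenVINOQuantizer(mode=QuantizationMode.INT8_SYM)
quantizer.set_ignored_scope(
    types=["mul", "sub"],  # skip these op types entirely
    names=["conv2d_3"],    # skip a specific node by name
)
```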

daniil-lyakhov force-pushed the dl/openvino/model_enabling branch 2 times, most recently from 573f316 to c499e3c on February 12, 2025 14:08
daniil-lyakhov force-pushed the dl/openvino/model_enabling branch 2 times, most recently from fa6744b to c7e0758 on February 12, 2025 14:31
alexsu52 pushed a commit to openvinotoolkit/nncf that referenced this pull request Feb 24, 2025
### Changes

Mark quantizer and quantize_pt2e as API

### Reason for changes

To introduce `OpenVINOQuantizer` and `quantize_pt2e` in the API docs:
https://openvinotoolkit.github.io/nncf/index.html

### Related tickets

daniil-lyakhov/executorch#2
daniil-lyakhov pushed a commit that referenced this pull request Jun 4, 2025
Differential Revision: D75034439

Pull Request resolved: pytorch#11011
daniil-lyakhov pushed a commit that referenced this pull request Jun 11, 2025
Differential Revision: D75982351

Pull Request resolved: pytorch#11456