[YN]Openvino backend #2
base: openvino_backend
Conversation
examples/openvino/aot/README.md
Enable model quantization. Default is False.

- **`--dataset`** (optional):
  Path to the calibration dataset. TODO: decide in what form the dataset should be supported. For the experiment, tiny-imagenet is used; it can be downloaded from http://cs231n.stanford.edu/tiny-imagenet-200.zip, and the path to it should be specified.
?
Fixed
default="CPU", | ||
help="Target device for compiling the model (e.g., CPU, GPU). Default is CPU.", | ||
) | ||
|
What do you think about adding a `--quantization_flow` argument that would accept the values "pt2e" and "nncf" to specify which flow should be used during quantization?
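A minimal sketch of how such an argument might be declared (the default value and help text are assumptions, not taken from the PR):

```python
import argparse

# Sketch of the proposed flag; the default value is an assumption.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--quantization_flow",
    choices=["pt2e", "nncf"],
    default="pt2e",
    help="Quantization flow to use: 'pt2e' or 'nncf'.",
)
args = parser.parse_args()
```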
Done, please check
```python
if suite == "torchvision":
    transform = torchvision_models.get_model_weights(model_name).DEFAULT.transforms()
else:
    transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))
```
Does it work for the huggingface suite?
No; an exception is now raised for that case.
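A sketch of what that guard might look like (the suite names and the exact message are assumptions):

```python
import torchvision.models as torchvision_models
from timm.data import create_transform, resolve_data_config

# Sketch of the guard described above; suite names and message are assumptions.
if suite == "torchvision":
    transform = torchvision_models.get_model_weights(model_name).DEFAULT.transforms()
elif suite == "timm":
    transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))
else:
    raise ValueError(f"Calibration transforms are not supported for the '{suite}' suite.")
```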
```python
elif isinstance(input_shape, list):
    input_shape = tuple(input_shape)
else:
    msg = "Input shape must be a list or tuple."
```
It looks like if `input_shape` is a tuple, execution ends up here.
Done
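A sketch of the corrected branching (the actual fix in the PR may differ in detail):

```python
# Sketch of the corrected check; the actual PR fix may differ.
if isinstance(input_shape, tuple):
    pass  # already in the expected form
elif isinstance(input_shape, list):
    input_shape = tuple(input_shape)
else:
    raise ValueError("Input shape must be a list or tuple.")
```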
```python
# Export the model to the aten dialect
aten_dialect: ExportedProgram = export(model, example_args)

if quantize:
```
Could you implement the quantization flow in a separate function?
Done
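For illustration, such an extraction might look roughly like this (a sketch; the helper name and signature are assumptions):

```python
from torch.export import ExportedProgram, export

# Hypothetical refactor sketch; the helper name and signature are assumptions.
def quantize_model(aten_dialect: ExportedProgram, calibration_dataset) -> ExportedProgram:
    """Run the quantization flow and return the quantized program."""
    ...  # quantization logic extracted from the main script

aten_dialect: ExportedProgram = export(model, example_args)
if quantize:
    aten_dialect = quantize_model(aten_dialect, calibration_dataset)
```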
print(f"Model exported and saved as {model_name}.pte on {device}.") | ||
print(f"Model exported and saved as {model_file_name} on {device}.") | ||
|
||
if validate: |
Could you implement validation in a separate function?
Done
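Similarly, a hedged sketch of a standalone validation helper (the helper name and the top-1 accuracy metric are assumptions):

```python
# Hypothetical sketch; the helper name and the top-1 metric are assumptions.
def validate_model(model, dataloader) -> float:
    """Return top-1 accuracy of `model` over `dataloader`."""
    correct = total = 0
    for images, labels in dataloader:
        predictions = model(images).argmax(dim=1)
        correct += (predictions == labels).sum().item()
        total += labels.numel()
    return correct / total

if validate:
    accuracy = validate_model(model, calibration_dataloader)
    print(f"Top-1 accuracy: {accuracy:.4f}")
```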
```python
from nncf.common.graph.graph import NNCFGraph

QUANT_ANNOTATION_KEY = "quantization_annotation"
```
I suggest introducing the following class:

```python
class QuantizationMode(StrEnum):
    """
    Defines special quantization modes.

    - INT8_SYM: INT8 symmetric quantization for both activations and weights.
    - INT8_MIXED: INT8 asymmetric quantization for activations, symmetric for weights.
    - INT8_TRANSFORMER: Optimized INT8 quantization for transformer-based models.
    """

    INT8_SYM = "int8_sym"
    INT8_MIXED = "int8_mixed"
    INT8_TRANSFORMER = "int8_transformer"
```
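One note: `StrEnum` was added to the standard library only in Python 3.11, so on older interpreters a small shim would be needed (a sketch, assuming no existing project helper):

```python
# Compatibility sketch: enum.StrEnum exists only on Python >= 3.11.
try:
    from enum import StrEnum
except ImportError:
    from enum import Enum

    class StrEnum(str, Enum):
        """Minimal stand-in for enum.StrEnum on older interpreters."""
```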
Done
```python
def __init__(
    self,
    *,
    mode: Optional[p.QuantizationMode] = None,
    preset: Optional[q.structs.QuantizationPreset] = None,
    target_device: p.TargetDevice = p.TargetDevice.ANY,
    transformer_model: bool = False,
    ignored_scope: Optional[nncf.IgnoredScope] = None,
    overflow_fix: Optional[advanced_p.OverflowFix] = None,
    quantize_outputs: bool = False,
    activations_quantization_params: Optional[advanced_p.QuantizationParameters] = None,
    weights_quantization_params: Optional[advanced_p.QuantizationParameters] = None,
):
```
What do you think?
Suggested change (replacing the full parameter list above):

```python
def __init__(
    self,
    *,
    mode: QuantizationMode = QuantizationMode.INT8_MIXED,
    ignored_scope: Optional[nncf.IgnoredScope] = None,
    **kwargs,
):
```
I like the constructor parameter subset; refactored.
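With the simplified signature, instantiating the quantizer might look like this (a sketch; the `IgnoredScope` arguments are illustrative only):

```python
import nncf

# Hypothetical usage of the simplified constructor; arguments are illustrative.
quantizer = OpenVINOQuantizer(
    mode=QuantizationMode.INT8_SYM,
    ignored_scope=nncf.IgnoredScope(types=["multiply"]),
)
```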
Also, I set the default `QuantizationMode` to `INT8_SYM` to align with the default MinMax parameters (which were the default parameters before): https://github.com/openvinotoolkit/nncf/blob/develop/nncf/quantization/algorithms/min_max/algorithm.py#L211-L215
ddcbb11 → 01b88f8
```python
self,
*,
mode: Optional[QuantizationMode] = QuantizationMode.INT8_SYM,
ignored_scope: Optional[nncf.IgnoredScope] = None,
```
Ignored scope is model-specific information. What do you think about introducing an additional method to set the ignored scope?

```python
def set_ignored_scope(
    names: Optional[List[str]] = None,
    patterns: Optional[List[str]] = None,
    types: Optional[List[str]] = None,
    subgraphs: Optional[List[Tuple[List[str], List[str]]]] = None,
    validate: bool = True,
)
```
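For illustration, such a setter might be implemented roughly like this (a sketch; the stored attribute name is an assumption):

```python
from typing import List, Optional, Tuple

import nncf

# Illustrative implementation; the attribute it sets is an assumption.
def set_ignored_scope(
    self,
    names: Optional[List[str]] = None,
    patterns: Optional[List[str]] = None,
    types: Optional[List[str]] = None,
    subgraphs: Optional[List[Tuple[List[str], List[str]]]] = None,
    validate: bool = True,
) -> None:
    """Store model-specific ignored-scope settings on the quantizer."""
    self._ignored_scope = nncf.IgnoredScope(
        names=names or [],
        patterns=patterns or [],
        types=types or [],
        subgraphs=subgraphs or [],
        validate=validate,
    )
```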
Done
573f316 → c499e3c
Co-authored-by: Alexander Suslov <[email protected]>
fa6744b → c7e0758
### Changes

Mark quantizer and quantize_pt2e as API

### Reason for changes

To introduce `OpenVINOQuantizer` and `quantize_pt2e` in the API docs: https://openvinotoolkit.github.io/nncf/index.html

### Related tickets

daniil-lyakhov/executorch#2
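For orientation, a hedged sketch of how these two entry points fit together (the import path, constructor arguments, and `quantize_pt2e` signature are assumptions based on this thread, not verified against the NNCF sources):

```python
import nncf
import torch
from torch.export import export

# Assumed import path; not verified against the NNCF sources.
from nncf.experimental.torch.fx import OpenVINOQuantizer, quantize_pt2e

model = MyModel().eval()                       # hypothetical model
example_args = (torch.randn(1, 3, 224, 224),)
captured = export(model, example_args).module()

quantizer = OpenVINOQuantizer()
calibration_dataset = nncf.Dataset(calibration_dataloader)  # hypothetical loader
quantized = quantize_pt2e(captured, quantizer, calibration_dataset)
```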
Differential Revision: D75034439 Pull Request resolved: pytorch#11011
Differential Revision: D75982351 Pull Request resolved: pytorch#11456