diff --git a/FAQ.md b/FAQ.md
index 5221abda4..0c79f6d78 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -5,6 +5,7 @@
 1. [Why does the size of the quantized model remain the same as the original model size?](#1-why-does-the-size-of-the-quantized-model-remain-the-same-as-the-original-model-size)
 2. [Why does loading a quantized exported model from a file fail?](#2-why-does-loading-a-quantized-exported-model-from-a-file-fail)
 3. [Why am I getting a torch.fx error?](#3-why-am-i-getting-a-torchfx-error)
+4. [Does MCT support both per-tensor and per-channel quantization?](#4-does-mct-support-both-per-tensor-and-per-channel-quantization)

 ### 1. Why does the size of the quantized model remain the same as the original model size?

@@ -54,3 +55,26 @@ Despite these limitations, some adjustments can be made to facilitate MCT quanti
 Check the `torch.fx` error, and search for an identical replacement. Some examples:
 * An `if` statement in a module's `forward` method can often be skipped or rewritten.
 * The `list()` Python method can be replaced with a concatenation operation `[A, B, C]`.
+
+### 4. Does MCT support both per-tensor and per-channel quantization?
+
+MCT supports both per-tensor and per-channel quantization, as [defined in TPC](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold).
+
+**Solution**: You can switch between per-tensor and per-channel quantization via the `weights_per_channel_threshold` parameter, as described below.
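For illustration, a minimal configuration sketch of the switch (this assumes an installed `model_compression_toolkit` and that all other `AttributeQuantizationConfig` arguments keep their defaults; the keyword-argument usage is an assumption based on the documented parameter name, not a definitive call):

```python
import model_compression_toolkit as mct

# Hedged sketch: only the parameter discussed in this FAQ entry is set.
# weights_per_channel_threshold=False -> per-tensor; True -> per-channel.
attr_cfg = mct.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig(
    weights_per_channel_threshold=False
)
```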
+
+In the object that configures the quantizer:
+* model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig()
+
+Set the following parameter:
+* weights_per_channel_threshold (bool) - Whether to quantize the weights per-channel (True) or per-tensor (False).
+
+For more details, please refer to [this page](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/target_platform_capabilities.html#model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold).
+
+In QAT, the following object is used to set up a weight-learnable quantizer:
+* model_compression_toolkit.trainable_infrastructure.TrainableQuantizerWeightsConfig()
+
+Set the following parameter:
+* weights_per_channel_threshold (bool) - Whether to quantize the weights per-channel (True) or per-tensor (False).
+
+For more details, please refer to [this page](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/modules/trainable_infrastructure.html#trainablequantizerweightsconfig).
diff --git a/docs/api/api_docs/classes/BitWidthConfig.html b/docs/api/api_docs/classes/BitWidthConfig.html
index 83473d0f6..34ff1f8af 100644
--- a/docs/api/api_docs/classes/BitWidthConfig.html
+++ b/docs/api/api_docs/classes/BitWidthConfig.html
@@ -7,7 +7,7 @@
Get the value of the inner dictionary by the given key. If the key is not in the dictionary, it uses the default_factory to return a default value.
key – Key to use in inner dictionary.
+Any
Value of the inner dictionary by the given key, or a default value if not exist. -If default_factory was not passed at initialization, it returns None.
+key – Key to use in inner dictionary.
Any
Value of the inner dictionary by the given key, or a default value if not exist. +If default_factory was not passed at initialization, it returns None.
Examples
When quantizing a Keras model, if we want to quantize the kernels of Conv2D layers only, and we know its kernel out/in channel indices are (3, 2) respectively, we can set:
->>> import tensorflow as tf
+>>> import tensorflow as tf
>>> kernel_ops = [tf.keras.layers.Conv2D]
>>> kernel_channels_mapping = DefaultDict({tf.keras.layers.Conv2D: (3,2)})
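The lookup semantics described above can be sketched in plain Python (a hypothetical stand-in for illustration, not MCT's actual DefaultDict class):

```python
class DefaultDictSketch:
    """Hypothetical stand-in for the documented behavior: get(key) returns
    the inner-dictionary value, else default_factory(), else None when no
    default_factory was passed at initialization."""

    def __init__(self, known_dict, default_factory=None):
        self.known_dict = known_dict
        self.default_factory = default_factory

    def get(self, key):
        # Value of the inner dictionary by the given key, if present.
        if key in self.known_dict:
            return self.known_dict[key]
        # Otherwise fall back to the factory, or None without one.
        if self.default_factory is not None:
            return self.default_factory()
        return None

d = DefaultDictSketch({"Conv2D": (3, 2)})
```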
diff --git a/docs/api/api_docs/classes/GradientPTQConfig.html b/docs/api/api_docs/classes/GradientPTQConfig.html
index c31461a72..f8c3485cc 100644
--- a/docs/api/api_docs/classes/GradientPTQConfig.html
+++ b/docs/api/api_docs/classes/GradientPTQConfig.html
@@ -7,7 +7,7 @@
GradientPTQConfig Class — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/classes/MixedPrecisionQuantizationConfig.html b/docs/api/api_docs/classes/MixedPrecisionQuantizationConfig.html
index 7ddeea6c4..8c2dfca9d 100644
--- a/docs/api/api_docs/classes/MixedPrecisionQuantizationConfig.html
+++ b/docs/api/api_docs/classes/MixedPrecisionQuantizationConfig.html
@@ -7,7 +7,7 @@
MixedPrecisionQuantizationConfig — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/classes/PruningConfig.html b/docs/api/api_docs/classes/PruningConfig.html
index aeb06f672..1abe2e370 100644
--- a/docs/api/api_docs/classes/PruningConfig.html
+++ b/docs/api/api_docs/classes/PruningConfig.html
@@ -7,7 +7,7 @@
Pruning Configuration — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/classes/PruningInfo.html b/docs/api/api_docs/classes/PruningInfo.html
index 962091a6d..b66597303 100644
--- a/docs/api/api_docs/classes/PruningInfo.html
+++ b/docs/api/api_docs/classes/PruningInfo.html
@@ -7,7 +7,7 @@
Pruning Information — MCT Documentation: ver 2.6.0
-
+
@@ -65,9 +65,6 @@ Navigation
Return type:
Dict[BaseNode, np.ndarray]
-Return type:
-Dict[BaseNode, ndarray]
-
@@ -82,9 +79,6 @@ Navigation
Return type:
Dict[BaseNode, np.ndarray]
-Return type:
-Dict[BaseNode, ndarray]
-
diff --git a/docs/api/api_docs/classes/QuantizationConfig.html b/docs/api/api_docs/classes/QuantizationConfig.html
index 4eab2f6ad..dfc3ab3a5 100644
--- a/docs/api/api_docs/classes/QuantizationConfig.html
+++ b/docs/api/api_docs/classes/QuantizationConfig.html
@@ -7,7 +7,7 @@
QuantizationConfig — MCT Documentation: ver 2.6.0
-
+
@@ -50,7 +50,7 @@ Navigation
activations using thresholds, with weight threshold selection based on MSE and activation threshold selection
using NOCLIPPING (min/max), while enabling relu_bound_to_power_of_2 and weights_bias_correction,
you can instantiate a quantization configuration like this:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
>>> qc = mct.core.QuantizationConfig(activation_error_method=mct.core.QuantizationErrorMethod.NOCLIPPING, weights_error_method=mct.core.QuantizationErrorMethod.MSE, relu_bound_to_power_of_2=True, weights_bias_correction=True)
diff --git a/docs/api/api_docs/classes/QuantizationErrorMethod.html b/docs/api/api_docs/classes/QuantizationErrorMethod.html
index a3d3b092a..4705b00e4 100644
--- a/docs/api/api_docs/classes/QuantizationErrorMethod.html
+++ b/docs/api/api_docs/classes/QuantizationErrorMethod.html
@@ -7,7 +7,7 @@
QuantizationErrorMethod — MCT Documentation: ver 2.6.0
-
+
@@ -45,12 +45,44 @@ Navigation
class model_compression_toolkit.core.QuantizationErrorMethod(value)¶
Method for quantization threshold selection:
-NOCLIPPING - Use min/max values as thresholds.
-MSE - Use mean square error for minimizing quantization noise.
+NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.
+MSE - (default) Use mean square error for minimizing quantization noise.
MAE - Use mean absolute error for minimizing quantization noise.
KL - Use KL-divergence to make the signal distributions as similar as possible.
-Lp - Use Lp-norm to minimizing quantization noise.
+Lp - Use Lp-norm to minimize quantization noise. The parameter p is specified by QuantizationConfig.l_p_value (default: 2; integer only). Lp is equivalent to MAE when p = 1 and to MSE when p = 2; use this method when p ≥ 3 is required.
HMSE - Use Hessian-based mean squared error for minimizing quantization noise. This method is using Hessian scores to factorize more valuable parameters when computing the error induced by quantization.
+How to select QuantizationErrorMethod
+
+
+
+
+
+
+Method
+Recommended Situations
+
+
+
+NOCLIPPING
+Research and debugging phases where you want to observe behavior across the entire value range. Effective when the full range must be preserved, especially when the data is skewed (for example, when very few samples lie near the minimum).
+
+MSE
+A good default choice. Effective when the data distribution is close to normal with few outliers, and when stable results are desired (for example, regression tasks).
+
+MAE
+Effective for data containing significant noise and many outliers.
+
+KL
+Useful for tasks where the output distribution matters (such as anomaly detection).
+
+LP
+With p ≥ 3, this method is more sensitive to outliers than MSE, which can help with sparse data.
+
+HMSE
+Recommended when using GPTQ. Effective for models in which specific layers strongly influence overall accuracy (such as Transformers).
+
+
+
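As a rough illustration of the table above, here is a plain-Python sketch (not MCT's implementation) of how the Lp criterion generalizes MAE (p = 1) and MSE (p = 2) when scoring a clipping threshold for a symmetric uniform quantizer:

```python
# Illustrative sketch only -- NOT MCT's implementation.

def quantize(x, threshold, n_bits=8):
    """Symmetric uniform quantization of a scalar, with clipping."""
    step = threshold / 2 ** (n_bits - 1)
    clipped = max(-threshold, min(threshold, x))
    return round(clipped / step) * step

def lp_error(data, threshold, p, n_bits=8):
    """Mean |x - Q(x)|^p over the data: the quantity minimized for a given p."""
    return sum(abs(x - quantize(x, threshold, n_bits)) ** p for x in data) / len(data)

data = [0.1, -0.5, 0.9, 3.0]               # 3.0 plays the role of an outlier
mae = lp_error(data, threshold=1.0, p=1)   # identical to mean absolute error
mse = lp_error(data, threshold=1.0, p=2)   # identical to mean squared error
```

A larger p weights the clipped outlier's error more heavily, which is why the table recommends p ≥ 3 when extra sensitivity to outliers is wanted.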
diff --git a/docs/api/api_docs/classes/ResourceUtilization.html b/docs/api/api_docs/classes/ResourceUtilization.html
index 9c0fe05c3..9e4ea601c 100644
--- a/docs/api/api_docs/classes/ResourceUtilization.html
+++ b/docs/api/api_docs/classes/ResourceUtilization.html
@@ -7,7 +7,7 @@
ResourceUtilization — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/classes/Wrapper.html b/docs/api/api_docs/classes/Wrapper.html
index 198ef4b24..36a729fc8 100644
--- a/docs/api/api_docs/classes/Wrapper.html
+++ b/docs/api/api_docs/classes/Wrapper.html
@@ -7,7 +7,7 @@
wrapper — MCT Documentation: ver 2.6.0
-
+
@@ -57,8 +57,11 @@ Navigation
quantize_and_export(float_model, representative_dataset, framework='pytorch', method='PTQ', use_mixed_precision=False, param_items=None)¶
Main function to perform model quantization and export.
-- Parameters:
-
+- Return type:
+Tuple[bool, Any]
+
+- Parameters:
+
float_model – The float model to be quantized.
representative_dataset (Callable, np.array, tf.Tensor) – Representative dataset for calibration.
framework (str) – ‘tensorflow’ or ‘pytorch’.
@@ -71,13 +74,13 @@
Navigation
[[key,value],…]. Default: None
-- Returns:
-tuple (quantization success flag, quantized model)
+- Returns:
+tuple (quantization success flag, quantized model)
Examples
Import MCT
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Prepare the float model and dataset
@@ -342,11 +345,6 @@ Navigation
-
-- Return type:
-Tuple[bool, Any]
-
-
diff --git a/docs/api/api_docs/classes/XQuantConfig.html b/docs/api/api_docs/classes/XQuantConfig.html
index 19fc4fbef..68b667bf7 100644
--- a/docs/api/api_docs/classes/XQuantConfig.html
+++ b/docs/api/api_docs/classes/XQuantConfig.html
@@ -7,7 +7,7 @@
XQuant Configuration — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/index.html b/docs/api/api_docs/index.html
index 5c7693b1c..7165a2c56 100644
--- a/docs/api/api_docs/index.html
+++ b/docs/api/api_docs/index.html
@@ -7,7 +7,7 @@
API Docs — MCT Documentation: ver 2.6.0
-
+
@@ -45,7 +45,7 @@ Navigation
API Docs¶
Init module for MCT API.
-import model_compression_toolkit as mct
+import model_compression_toolkit as mct
diff --git a/docs/api/api_docs/methods/get_keras_data_generation_config.html b/docs/api/api_docs/methods/get_keras_data_generation_config.html
index 88918209e..394e33802 100644
--- a/docs/api/api_docs/methods/get_keras_data_generation_config.html
+++ b/docs/api/api_docs/methods/get_keras_data_generation_config.html
@@ -7,7 +7,7 @@
Get DataGenerationConfig for Keras Models — MCT Documentation: ver 2.6.0
-
+
@@ -45,8 +45,11 @@ Navigation
model_compression_toolkit.data_generation.get_keras_data_generation_config(n_iter=DEFAULT_N_ITER, optimizer=Adam, data_gen_batch_size=DEFAULT_DATA_GEN_BS, initial_lr=DEFAULT_KERAS_INITIAL_LR, output_loss_multiplier=DEFAULT_KERAS_OUTPUT_LOSS_MULTIPLIER, scheduler_type=SchedulerType.REDUCE_ON_PLATEAU, bn_alignment_loss_type=BatchNormAlignemntLossType.L2_SQUARE, output_loss_type=OutputLossType.REGULARIZED_MIN_MAX_DIFF, data_init_type=DataInitType.Gaussian, layer_weighting_type=BNLayerWeightingType.AVERAGE, image_granularity=ImageGranularity.BatchWise, image_pipeline_type=ImagePipelineType.SMOOTHING_AND_AUGMENTATION, image_normalization_type=ImageNormalizationType.KERAS_APPLICATIONS, extra_pixels=DEFAULT_KERAS_EXTRA_PIXELS, bn_layer_types=[BatchNormalization], image_clipping=False)¶
Function to create a DataGenerationConfig object with the specified configuration parameters.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
n_iter (int) – Number of iterations for the data generation process.
optimizer (Optimizer) – The optimizer to use for the data generation process.
data_gen_batch_size (int) – Batch size for data generation.
@@ -65,14 +68,11 @@ Navigation
image_clipping (bool) – Whether to clip images during optimization.
-- Returns:
-Data generation configuration object.
-
-- Return type:
--
+
- Returns:
+Data generation configuration object.
- Return type:
--
+
-
diff --git a/docs/api/api_docs/methods/get_keras_gptq_config.html b/docs/api/api_docs/methods/get_keras_gptq_config.html
index 36099273c..bde134fb8 100644
--- a/docs/api/api_docs/methods/get_keras_gptq_config.html
+++ b/docs/api/api_docs/methods/get_keras_gptq_config.html
@@ -7,7 +7,7 @@
Get GradientPTQConfig for Keras Models — MCT Documentation: ver 2.6.0
-
+
@@ -45,8 +45,11 @@ Navigation
model_compression_toolkit.gptq.get_keras_gptq_config(n_epochs, optimizer=None, optimizer_rest=None, loss=None, log_function=None, use_hessian_based_weights=True, regularization_factor=None, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE, use_hessian_sample_attention=True, gradual_activation_quantization=True)¶
Create a GradientPTQConfig instance for Keras models.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
n_epochs (int) – Number of epochs for running the representative dataset for fine-tuning.
optimizer (OptimizerV2) – Keras optimizer to use for fine-tuning for auxiliary variable. Default: Adam(learning rate set to 3e-2).
optimizer_rest (OptimizerV2) – Keras optimizer to use for fine-tuning of the bias variable. Default: Adam(learning rate set to 1e-4).
@@ -59,14 +62,14 @@ Navigation
gradual_activation_quantization (bool, GradualActivationQuantizationConfig) – If False, GradualActivationQuantization is disabled. If True, GradualActivationQuantization is enabled with the default settings. GradualActivationQuantizationConfig object can be passed to use non-default settings.
-- Returns:
-a GradientPTQConfig object to use when fine-tuning the quantized model using gptq.
+- Returns:
+a GradientPTQConfig object to use when fine-tuning the quantized model using gptq.
Examples
Import MCT and TensorFlow:
->>> import model_compression_toolkit as mct
->>> import tensorflow as tf
+>>> import model_compression_toolkit as mct
+>>> import tensorflow as tf
Create a GradientPTQConfig to run for 5 epochs:
@@ -78,11 +81,6 @@ Navigation
The configuration can be passed to keras_gradient_post_training_quantization() in order to quantize a keras model using gptq.
-
-- Return type:
--
-
-
diff --git a/docs/api/api_docs/methods/get_pytorch_data_generation_config.html b/docs/api/api_docs/methods/get_pytorch_data_generation_config.html
index dae05e83c..9bd99a6b2 100644
--- a/docs/api/api_docs/methods/get_pytorch_data_generation_config.html
+++ b/docs/api/api_docs/methods/get_pytorch_data_generation_config.html
@@ -7,7 +7,7 @@
Get DataGenerationConfig for Pytorch Models — MCT Documentation: ver 2.6.0
-
+
@@ -45,8 +45,11 @@ Navigation
model_compression_toolkit.data_generation.get_pytorch_data_generation_config(n_iter=DEFAULT_N_ITER, optimizer=RAdam, data_gen_batch_size=DEFAULT_DATA_GEN_BS, initial_lr=DEFAULT_PYTORCH_INITIAL_LR, output_loss_multiplier=DEFAULT_PYTORCH_OUTPUT_LOSS_MULTIPLIER, scheduler_type=SchedulerType.REDUCE_ON_PLATEAU_WITH_RESET, bn_alignment_loss_type=BatchNormAlignemntLossType.L2_SQUARE, output_loss_type=OutputLossType.NEGATIVE_MIN_MAX_DIFF, data_init_type=DataInitType.Gaussian, layer_weighting_type=BNLayerWeightingType.AVERAGE, image_granularity=ImageGranularity.AllImages, image_pipeline_type=ImagePipelineType.SMOOTHING_AND_AUGMENTATION, image_normalization_type=ImageNormalizationType.TORCHVISION, extra_pixels=DEFAULT_PYTORCH_EXTRA_PIXELS, bn_layer_types=DEFAULT_PYTORCH_BN_LAYER_TYPES, last_layer_types=DEFAULT_PYTORCH_LAST_LAYER_TYPES, image_clipping=True)¶
Function to create a DataGenerationConfig object with the specified configuration parameters.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
n_iter (int) – Number of iterations for the data generation process.
optimizer (Optimizer) – The optimizer to use for the data generation process.
data_gen_batch_size (int) – Batch size for data generation.
@@ -66,14 +69,11 @@ Navigation
image_clipping (bool) – Whether to clip images during optimization.
-- Returns:
-Data generation configuration object.
-
-- Return type:
--
+
- Returns:
+Data generation configuration object.
- Return type:
--
+
-
diff --git a/docs/api/api_docs/methods/get_pytroch_gptq_config.html b/docs/api/api_docs/methods/get_pytroch_gptq_config.html
index a27547800..4b6893053 100644
--- a/docs/api/api_docs/methods/get_pytroch_gptq_config.html
+++ b/docs/api/api_docs/methods/get_pytroch_gptq_config.html
@@ -7,7 +7,7 @@
Get GradientPTQConfig for Pytorch Models — MCT Documentation: ver 2.6.0
-
+
@@ -45,8 +45,11 @@ Navigation
model_compression_toolkit.gptq.get_pytorch_gptq_config(n_epochs, optimizer=None, optimizer_rest=None, loss=None, log_function=None, use_hessian_based_weights=True, regularization_factor=None, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE, use_hessian_sample_attention=True, gradual_activation_quantization=True)¶
Create a GradientPTQConfig instance for Pytorch models.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
n_epochs (int) – Number of epochs for running the representative dataset for fine-tuning.
optimizer (Optimizer) – Pytorch optimizer to use for fine-tuning for auxiliary variable. Default: Adam(learning rate set to 3e-2).
optimizer_rest (Optimizer) – Pytorch optimizer to use for fine-tuning of the bias variable. Default: Adam(learning rate set to 1e-4).
@@ -59,27 +62,22 @@ Navigation
gradual_activation_quantization (bool, GradualActivationQuantizationConfig) – If False, GradualActivationQuantization is disabled. If True, GradualActivationQuantization is enabled with the default settings. GradualActivationQuantizationConfig object can be passed to use non-default settings.
-- Returns:
-a GradientPTQConfig object to use when fine-tuning the quantized model using gptq.
+- Returns:
+a GradientPTQConfig object to use when fine-tuning the quantized model using gptq.
Examples
Import MCT and Create a GradientPTQConfig to run for 5 epochs:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
>>> gptq_conf = mct.gptq.get_pytorch_gptq_config(n_epochs=5)
Other PyTorch optimizers can be passed with dummy params:
->>> import torch
+>>> import torch
>>> gptq_conf = mct.gptq.get_pytorch_gptq_config(n_epochs=3, optimizer=torch.optim.Adam([torch.Tensor(1)]))
The configuration can be passed to pytorch_gradient_post_training_quantization() in order to quantize a pytorch model using gptq.
-
-- Return type:
--
-
-
diff --git a/docs/api/api_docs/methods/get_target_platform_capabilities.html b/docs/api/api_docs/methods/get_target_platform_capabilities.html
index e846a07c3..d981a4e14 100644
--- a/docs/api/api_docs/methods/get_target_platform_capabilities.html
+++ b/docs/api/api_docs/methods/get_target_platform_capabilities.html
@@ -7,7 +7,7 @@
Get TargetPlatformCapabilities for tpc version — MCT Documentation: ver 2.6.0
-
+
@@ -45,17 +45,17 @@ Navigation
model_compression_toolkit.get_target_platform_capabilities(tpc_version=TPC_V1_0, device_type=IMX500_TP_MODEL)¶
Retrieves target platform capabilities model based on tpc version and the specified device type.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
tpc_version (str) – Target platform capabilities version.
device_type (str) – The type of device for the target platform.
-- Returns:
-The TargetPlatformCapabilities object matching the tpc version.
-
-- Return type:
--
+
- Returns:
+The TargetPlatformCapabilities object matching the tpc version.
diff --git a/docs/api/api_docs/methods/get_target_platform_capabilities_sdsp.html b/docs/api/api_docs/methods/get_target_platform_capabilities_sdsp.html
index 55a38f3f7..8ab54b9ae 100644
--- a/docs/api/api_docs/methods/get_target_platform_capabilities_sdsp.html
+++ b/docs/api/api_docs/methods/get_target_platform_capabilities_sdsp.html
@@ -7,7 +7,7 @@
Get TargetPlatformCapabilities for sdsp converter version — MCT Documentation: ver 2.6.0
-
+
@@ -45,14 +45,14 @@ Navigation
model_compression_toolkit.get_target_platform_capabilities_sdsp(sdsp_version=SDSP_V3_14)¶
Retrieves target platform capabilities model based on sdsp converter version.
-- Parameters:
-sdsp_version (str) – Sdsp converter version.
+- Return type:
+-
-- Returns:
-The TargetPlatformCapabilities object matching the sdsp converter version.
+- Parameters:
+sdsp_version (str) – Sdsp converter version.
-- Return type:
--
+
- Returns:
+The TargetPlatformCapabilities object matching the sdsp converter version.
diff --git a/docs/api/api_docs/methods/keras_data_generation_experimental.html b/docs/api/api_docs/methods/keras_data_generation_experimental.html
index 8a77338b8..3ecd40705 100644
--- a/docs/api/api_docs/methods/keras_data_generation_experimental.html
+++ b/docs/api/api_docs/methods/keras_data_generation_experimental.html
@@ -7,7 +7,7 @@
Keras Data Generation — MCT Documentation: ver 2.6.0
-
+
@@ -45,27 +45,30 @@ Navigation
model_compression_toolkit.data_generation.keras_data_generation_experimental(model, n_images, output_image_size, data_generation_config)¶
Function to perform data generation using the provided Keras model and data generation configuration.
-- Parameters:
-
+- Return type:
+Tensor
+
+- Parameters:
+
model (Model) – Keras model to generate data for.
n_images (int) – Number of images to generate.
output_image_size (Union[int, Tuple[int, int]]) – Size of the output images.
data_generation_config (DataGenerationConfig) – Configuration for data generation.
-- Returns:
-Finalized list containing generated images.
+- Returns:
+Finalized list containing generated images.
-- Return type:
-List[tf.Tensor]
+- Return type:
+List[tf.Tensor]
Examples
In this example, we’ll walk through generating images using a simple Keras model and a data generation configuration. The process involves creating a model, setting up a data generation configuration, and finally generating images with specified parameters.
Start by importing the Model Compression Toolkit (MCT), TensorFlow, and some layers from tensorflow.keras:
->>> import model_compression_toolkit as mct
->>> from tensorflow.keras.models import Sequential
->>> from tensorflow.keras.layers import Conv2D, BatchNormalization, Flatten, Dense, Reshape
+>>> import model_compression_toolkit as mct
+>>> from tensorflow.keras.models import Sequential
+>>> from tensorflow.keras.layers import Conv2D, BatchNormalization, Flatten, Dense, Reshape
Next, define a simple Keras model:
@@ -83,11 +86,6 @@ Navigation
The generated images can then be used for various purposes, such as data-free quantization.
-
-- Return type:
-Tensor
-
-
diff --git a/docs/api/api_docs/methods/keras_gradient_post_training_quantization.html b/docs/api/api_docs/methods/keras_gradient_post_training_quantization.html
index c80b34b77..c87c98636 100644
--- a/docs/api/api_docs/methods/keras_gradient_post_training_quantization.html
+++ b/docs/api/api_docs/methods/keras_gradient_post_training_quantization.html
@@ -7,7 +7,7 @@
Keras Gradient Based Post Training Quantization — MCT Documentation: ver 2.6.0
-
+
@@ -58,8 +58,11 @@ Navigation
training quantization by comparing points between the float and quantized models, and minimizing the observed
loss.
-- Parameters:
-
+- Return type:
+Tuple[Model, Optional[UserInformation]]
+
+- Parameters:
+
in_model (Model) – Keras model to quantize.
representative_data_gen (Callable) – Dataset used for calibration.
gptq_config (GradientPTQConfig) – Configuration for using gptq (e.g. optimizer).
@@ -69,21 +72,21 @@ Navigation
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – TargetPlatformCapabilities to optimize the Keras model according to.
-- Returns:
-A quantized model and information the user may need to handle the quantized model.
+- Returns:
+A quantized model and information the user may need to handle the quantized model.
Examples
Import a Keras model:
->>> from tensorflow.keras.applications.mobilenet import MobileNet
+>>> from tensorflow.keras.applications.mobilenet import MobileNet
>>> model = MobileNet()
Create a random dataset generator, for required number of calibration iterations (num_calibration_batches):
In this example a random dataset of 10 batches each containing 4 images is used.
->>> import numpy as np
+>>> import numpy as np
>>> num_calibration_batches = 10
->>> def repr_datagen():
+>>> def repr_datagen():
>>> for _ in range(num_calibration_batches):
>>> yield [np.random.random((4, 224, 224, 3))]
@@ -113,11 +116,6 @@ Navigation
>>> quantized_model, quantization_info = mct.gptq.keras_gradient_post_training_quantization(model, repr_datagen, gptq_config, target_resource_utilization=ru, core_config=config)
-
-- Return type:
-Tuple[Model, Optional[UserInformation]]
-
-
diff --git a/docs/api/api_docs/methods/keras_kpi_data.html b/docs/api/api_docs/methods/keras_kpi_data.html
index b9d168cc8..3bb212d12 100644
--- a/docs/api/api_docs/methods/keras_kpi_data.html
+++ b/docs/api/api_docs/methods/keras_kpi_data.html
@@ -7,7 +7,7 @@
Get Resource Utilization information for Keras Models — MCT Documentation: ver 2.6.0
-
+
@@ -48,39 +48,37 @@ Navigation
Builds the computation graph from the given model and hw modeling, and uses it to compute the
resource utilization data.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
in_model (Model) – Keras model to quantize.
representative_data_gen (Callable) – Dataset used for calibration.
core_config (CoreConfig) – CoreConfig containing parameters for quantization and mixed precision of how the model should be quantized.
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – FrameworkQuantizationCapabilities to optimize the Keras model according to.
-- Returns:
-A ResourceUtilization object with total weights parameters sum and max activation tensor.
+- Returns:
+A ResourceUtilization object with total weights parameters sum and max activation tensor.
Examples
Import a Keras model:
->>> from tensorflow.keras.applications.mobilenet import MobileNet
+>>> from tensorflow.keras.applications.mobilenet import MobileNet
>>> model = MobileNet()
Create a random dataset generator:
->>> import numpy as np
->>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
+>>> import numpy as np
+>>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
Import MCT and call for resource utilization data calculation:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
>>> ru_data = mct.core.keras_resource_utilization_data(model, repr_datagen)
-
-- Return type:
--
-
-
diff --git a/docs/api/api_docs/methods/keras_load_quantizad_model.html b/docs/api/api_docs/methods/keras_load_quantizad_model.html
index 8397bdd66..ed3b04e9a 100644
--- a/docs/api/api_docs/methods/keras_load_quantizad_model.html
+++ b/docs/api/api_docs/methods/keras_load_quantizad_model.html
@@ -7,7 +7,7 @@
Load Quantized Keras Model — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/methods/keras_post_training_quantization.html b/docs/api/api_docs/methods/keras_post_training_quantization.html
index b42467913..15ce8354b 100644
--- a/docs/api/api_docs/methods/keras_post_training_quantization.html
+++ b/docs/api/api_docs/methods/keras_post_training_quantization.html
@@ -7,7 +7,7 @@
Keras Post Training Quantization — MCT Documentation: ver 2.6.0
-
+
@@ -55,8 +55,11 @@ Navigation
In order to limit the maximal model’s size, a target ResourceUtilization needs to be passed after weights_memory
is set (in bytes).
-- Parameters:
-
+- Return type:
+Tuple[Model, Optional[UserInformation]]
+
+- Parameters:
+
in_model (Model) – Keras model to quantize.
representative_data_gen (Callable) – Dataset used for calibration.
target_resource_utilization (ResourceUtilization) – ResourceUtilization object to limit the search of the mixed-precision configuration as desired.
@@ -64,25 +67,25 @@ Navigation
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – TargetPlatformCapabilities to optimize the Keras model according to.
-- Returns:
-A quantized model and information the user may need to handle the quantized model.
+- Returns:
+A quantized model and information the user may need to handle the quantized model.
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Keras model:
->>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
+>>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
>>> model = MobileNetV2()
Create a random dataset generator, for required number of calibration iterations (num_calibration_batches):
In this example a random dataset of 10 batches each containing 4 images is used.
->>> import numpy as np
+>>> import numpy as np
>>> num_calibration_batches = 10
->>> def repr_datagen():
+>>> def repr_datagen():
>>> for _ in range(num_calibration_batches):
>>> yield [np.random.random((4, 224, 224, 3))]
@@ -110,11 +113,6 @@ Navigation
For more configuration options, please take a look at our API documentation.
-
-- Return type:
-Tuple[Model, Optional[UserInformation]]
-
-
diff --git a/docs/api/api_docs/methods/keras_pruning_experimental.html b/docs/api/api_docs/methods/keras_pruning_experimental.html
index 4732e318a..be1cb8550 100644
--- a/docs/api/api_docs/methods/keras_pruning_experimental.html
+++ b/docs/api/api_docs/methods/keras_pruning_experimental.html
@@ -7,7 +7,7 @@
Keras Structured Pruning — MCT Documentation: ver 2.6.0
-
+
@@ -53,8 +53,11 @@ Navigation
identify groups of channels that can be removed with minimal impact on performance.
Notice that the pruned model must be retrained to recover the compressed model’s performance.
-- Parameters:
-
+- Return type:
+Tuple[Model, PruningInfo]
+
+- Parameters:
+
model (Model) – The original Keras model to be pruned.
target_resource_utilization (ResourceUtilization) – The target Key Performance Indicators to be achieved through pruning.
representative_data_gen (Callable) – A function to generate representative data for pruning analysis.
@@ -62,11 +65,11 @@ Navigation
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – Platform-specific constraints and capabilities. Defaults to DEFAULT_KERAS_TPC.
-- Returns:
-A tuple containing the pruned Keras model and associated pruning information.
+- Returns:
+A tuple containing the pruned Keras model and associated pruning information.
-- Return type:
-Tuple[Model, PruningInfo]
+- Return type:
+Tuple[Model, PruningInfo]
@@ -75,17 +78,17 @@ Navigation
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Keras model:
->>> from tensorflow.keras.applications.resnet50 import ResNet50
+>>> from tensorflow.keras.applications.resnet50 import ResNet50
>>> model = ResNet50()
Create a random dataset generator:
->>> import numpy as np
->>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
+>>> import numpy as np
+>>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
Define a target resource utilization for pruning.
@@ -106,11 +109,6 @@
Navigation
>>> pruned_model, pruning_info = mct.pruning.keras_pruning_experimental(model=model, target_resource_utilization=target_resource_utilization, representative_data_gen=repr_datagen, pruning_config=pruning_config)
-
-- Return type:
-Tuple[Model, PruningInfo]
-
-
diff --git a/docs/api/api_docs/methods/keras_quantization_aware_training_finalize_experimental.html b/docs/api/api_docs/methods/keras_quantization_aware_training_finalize_experimental.html
index 4f7c0129c..a468b0226 100644
--- a/docs/api/api_docs/methods/keras_quantization_aware_training_finalize_experimental.html
+++ b/docs/api/api_docs/methods/keras_quantization_aware_training_finalize_experimental.html
@@ -7,7 +7,7 @@
Keras Quantization Aware Training Model Finalize — MCT Documentation: ver 2.6.0
-
+
@@ -45,26 +45,29 @@ Navigation
model_compression_toolkit.qat.keras_quantization_aware_training_finalize_experimental(in_model)¶
Convert a model fine-tuned by the user (Trainable quantizers) to a model with Inferable quantizers.
-- Parameters:
-in_model (Model) – Keras model to replace TrainableQuantizer with InferableQuantizer
+- Return type:
+Model
+
+- Parameters:
+in_model (Model) – Keras model to replace TrainableQuantizer with InferableQuantizer
-- Returns:
-A quantized model with Inferable quantizers
+- Returns:
+A quantized model with Inferable quantizers
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Keras model:
->>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
+>>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
>>> model = MobileNetV2()
Create a random dataset generator:
->>> import numpy as np
->>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
+>>> import numpy as np
+>>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
Create a MCT core config, containing the quantization configuration:
@@ -93,11 +96,6 @@ Navigation
>>> quantized_model = mct.qat.keras_quantization_aware_training_finalize_experimental(quantized_model)
-
-- Return type:
-Model
-
-
diff --git a/docs/api/api_docs/methods/keras_quantization_aware_training_init_experimental.html b/docs/api/api_docs/methods/keras_quantization_aware_training_init_experimental.html
index 6c0ff4112..c915dc2b8 100644
--- a/docs/api/api_docs/methods/keras_quantization_aware_training_init_experimental.html
+++ b/docs/api/api_docs/methods/keras_quantization_aware_training_init_experimental.html
@@ -7,7 +7,7 @@
Keras Quantization Aware Training Model Init — MCT Documentation: ver 2.6.0
-
+
@@ -75,19 +75,19 @@ Navigation
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Keras model:
->>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
+>>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
>>> model = MobileNetV2()
Create a random dataset generator for the required number of calibration iterations (num_calibration_batches):
In this example, a random dataset of 10 batches, each containing 4 images, is used.
->>> import numpy as np
+>>> import numpy as np
>>> num_calibration_batches = 10
->>> def repr_datagen():
+>>> def repr_datagen():
>>> for _ in range(num_calibration_batches):
>>> yield [np.random.random((4, 224, 224, 3))]
diff --git a/docs/api/api_docs/methods/pytorch_data_generation_experimental.html b/docs/api/api_docs/methods/pytorch_data_generation_experimental.html
index 990280a9e..ba62ecb96 100644
--- a/docs/api/api_docs/methods/pytorch_data_generation_experimental.html
+++ b/docs/api/api_docs/methods/pytorch_data_generation_experimental.html
@@ -7,7 +7,7 @@
Pytorch Data Generation — MCT Documentation: ver 2.6.0
-
+
@@ -45,27 +45,30 @@ Navigation
model_compression_toolkit.data_generation.pytorch_data_generation_experimental(model, n_images, output_image_size, data_generation_config)¶
Function to perform data generation using the provided model and data generation configuration.
-- Parameters:
-
+- Return type:
+List[Tensor]
+
+- Parameters:
+
model (Module) – PyTorch model to generate data for.
n_images (int) – Number of images to generate.
output_image_size (Union[int, Tuple[int, int]]) – The height and width of the output images.
data_generation_config (DataGenerationConfig) – Configuration for data generation.
-- Returns:
-Finalized list containing generated images.
+- Returns:
+Finalized list containing generated images.
-- Return type:
-List[Tensor]
+- Return type:
+List[Tensor]
Examples
In this example, we’ll walk through generating images using a simple PyTorch model and a data generation configuration. The process involves creating a model, setting up a data generation configuration, and finally generating images with specified parameters.
Start by importing the Model Compression Toolkit (MCT), PyTorch, and some modules from torch.nn:
->>> import model_compression_toolkit as mct
->>> import torch.nn as nn
->>> from torch.nn import Conv2d, BatchNorm2d, Flatten, Linear
+>>> import model_compression_toolkit as mct
+>>> import torch.nn as nn
+>>> from torch.nn import Conv2d, BatchNorm2d, Flatten, Linear
Next, define a simple PyTorch model:
@@ -83,11 +86,6 @@ Navigation
The generated images can then be used for various purposes, such as data-free quantization.
-
-- Return type:
-List[Tensor]
-
-
diff --git a/docs/api/api_docs/methods/pytorch_gradient_post_training_quantization.html b/docs/api/api_docs/methods/pytorch_gradient_post_training_quantization.html
index 4180ce4b4..572b6b545 100644
--- a/docs/api/api_docs/methods/pytorch_gradient_post_training_quantization.html
+++ b/docs/api/api_docs/methods/pytorch_gradient_post_training_quantization.html
@@ -7,7 +7,7 @@
Pytorch Gradient Based Post Training Quantization — MCT Documentation: ver 2.6.0
-
+
@@ -58,8 +58,11 @@ Navigation
training quantization by comparing points between the float and quantized models, and minimizing the observed
loss.
-- Parameters:
-
+- Return type:
+Tuple[Module, Optional[UserInformation]]
+
+- Parameters:
+
model (Module) – Pytorch model to quantize.
representative_data_gen (Callable) – Dataset used for calibration.
target_resource_utilization (ResourceUtilization) – ResourceUtilization object to limit the search of the mixed-precision configuration as desired.
@@ -69,25 +72,25 @@ Navigation
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – TargetPlatformCapabilities to optimize the PyTorch model according to.
-- Returns:
-A quantized module and information the user may need to handle the quantized module.
+- Returns:
+A quantized module and information the user may need to handle the quantized module.
Examples
Import Model Compression Toolkit:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Pytorch module:
->>> from torchvision import models
+>>> from torchvision import models
>>> module = models.mobilenet_v2()
Create a random dataset generator for the required number of calibration iterations (num_calibration_batches):
In this example, a random dataset of 10 batches, each containing 4 images, is used.
->>> import numpy as np
+>>> import numpy as np
>>> num_calibration_batches = 10
->>> def repr_datagen():
+>>> def repr_datagen():
>>> for _ in range(num_calibration_batches):
>>> yield [np.random.random((4, 3, 224, 224))]
@@ -100,11 +103,6 @@ Navigation
>>> quantized_module, quantization_info = mct.gptq.pytorch_gradient_post_training_quantization(module, repr_datagen, core_config=config, gptq_config=gptq_conf)
-
-- Return type:
-Tuple[Module, Optional[UserInformation]]
-
-
diff --git a/docs/api/api_docs/methods/pytorch_kpi_data.html b/docs/api/api_docs/methods/pytorch_kpi_data.html
index c6f2cc235..a566b86d8 100644
--- a/docs/api/api_docs/methods/pytorch_kpi_data.html
+++ b/docs/api/api_docs/methods/pytorch_kpi_data.html
@@ -7,7 +7,7 @@
Get Resource Utilization information for PyTorch Models — MCT Documentation: ver 2.6.0
-
+
@@ -46,39 +46,37 @@ Navigation
Computes resource utilization data that can be used to calculate the desired target resource utilization for mixed-precision quantization.
Builds the computation graph from the given model and target platform capabilities, and uses it to compute the resource utilization data.
-- Parameters:
-
+- Return type:
+-
+
+- Parameters:
+
in_model (Model) – PyTorch model to quantize.
representative_data_gen (Callable) – Dataset used for calibration.
core_config (CoreConfig) – CoreConfig containing parameters for quantization and mixed precision
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – FrameworkQuantizationCapabilities to optimize the PyTorch model according to.
-- Returns:
-A ResourceUtilization object with total weights parameters sum and max activation tensor.
+- Returns:
+A ResourceUtilization object with total weights parameters sum and max activation tensor.
Examples
Import a Pytorch model:
->>> from torchvision import models
+>>> from torchvision import models
>>> module = models.mobilenet_v2()
Create a random dataset generator:
->>> import numpy as np
->>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]
+>>> import numpy as np
+>>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]
Import mct and call for resource utilization data calculation:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
>>> ru_data = mct.core.pytorch_resource_utilization_data(module, repr_datagen)
-
-- Return type:
--
-
-
diff --git a/docs/api/api_docs/methods/pytorch_post_training_quantization.html b/docs/api/api_docs/methods/pytorch_post_training_quantization.html
index 0773fb24b..ac93413e2 100644
--- a/docs/api/api_docs/methods/pytorch_post_training_quantization.html
+++ b/docs/api/api_docs/methods/pytorch_post_training_quantization.html
@@ -7,7 +7,7 @@
Pytorch Post Training Quantization — MCT Documentation: ver 2.6.0
-
+
@@ -55,8 +55,11 @@ Navigation
training quantization by comparing points between the float and quantized modules, and minimizing the
observed loss.
-- Parameters:
-
+- Return type:
+Tuple[Module, Optional[UserInformation]]
+
+- Parameters:
+
in_module (Module) – Pytorch module to quantize.
representative_data_gen (Callable) – Dataset used for calibration.
target_resource_utilization (ResourceUtilization) – ResourceUtilization object to limit the search of the mixed-precision configuration as desired.
@@ -64,36 +67,31 @@ Navigation
target_platform_capabilities (Union[TargetPlatformCapabilities, str]) – TargetPlatformCapabilities to optimize the PyTorch model according to.
-- Returns:
-A quantized module and information the user may need to handle the quantized module.
+- Returns:
+A quantized module and information the user may need to handle the quantized module.
Examples
Import a Pytorch module:
->>> from torchvision import models
+>>> from torchvision import models
>>> module = models.mobilenet_v2()
Create a random dataset generator for the required number of calibration iterations (num_calibration_batches):
In this example, a random dataset of 10 batches, each containing 4 images, is used.
->>> import numpy as np
+>>> import numpy as np
>>> num_calibration_batches = 10
->>> def repr_datagen():
+>>> def repr_datagen():
>>> for _ in range(num_calibration_batches):
>>> yield [np.random.random((4, 3, 224, 224))]
Import MCT and pass the module with the representative dataset generator to get a quantized module
Set the number of calibration iterations to 1:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
>>> quantized_module, quantization_info = mct.ptq.pytorch_post_training_quantization(module, repr_datagen)
-
-- Return type:
-Tuple[Module, Optional[UserInformation]]
-
-
diff --git a/docs/api/api_docs/methods/pytorch_pruning_experimental.html b/docs/api/api_docs/methods/pytorch_pruning_experimental.html
index 706ec4862..b4e43bc86 100644
--- a/docs/api/api_docs/methods/pytorch_pruning_experimental.html
+++ b/docs/api/api_docs/methods/pytorch_pruning_experimental.html
@@ -7,7 +7,7 @@
Pytorch Structured Pruning — MCT Documentation: ver 2.6.0
-
+
@@ -53,8 +53,11 @@ Navigation
identify groups of channels that can be removed with minimal impact on performance.
Notice that the pruned model must be retrained to recover the compressed model’s performance.
-- Parameters:
-
+- Return type:
+Tuple[Module, PruningInfo]
+
+- Parameters:
+
model (Module) – The PyTorch model to be pruned.
target_resource_utilization (ResourceUtilization) – Key Performance Indicators specifying the pruning targets.
representative_data_gen (Callable) – A function to generate representative data for pruning analysis.
@@ -63,11 +66,11 @@ Navigation
Defaults to DEFAULT_PYTORCH_TPC.
-- Returns:
-A tuple containing the pruned Pytorch model and associated pruning information.
+- Returns:
+A tuple containing the pruned Pytorch model and associated pruning information.
-- Return type:
-Tuple[Model, PruningInfo]
+- Return type:
+Tuple[Model, PruningInfo]
@@ -76,17 +79,17 @@ Navigation
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Pytorch model:
->>> from torchvision.models import resnet50, ResNet50_Weights
+>>> from torchvision.models import resnet50, ResNet50_Weights
>>> model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
Create a random dataset generator:
->>> import numpy as np
->>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]
+>>> import numpy as np
+>>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]
Define a target resource utilization for pruning.
@@ -107,11 +110,6 @@
Navigation
>>> pruned_model, pruning_info = mct.pruning.pytorch_pruning_experimental(model=model, target_resource_utilization=target_resource_utilization, representative_data_gen=repr_datagen, pruning_config=pruning_config)
-
-- Return type:
-Tuple[Module, PruningInfo]
-
-
diff --git a/docs/api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental.html b/docs/api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental.html
index 1365711fa..a8a81cf18 100644
--- a/docs/api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental.html
+++ b/docs/api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental.html
@@ -7,7 +7,7 @@
PyTorch Quantization Aware Training Model Finalize — MCT Documentation: ver 2.6.0
-
+
@@ -55,17 +55,17 @@ Navigation
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Pytorch model:
->>> from torchvision.models import mobilenet_v2
+>>> from torchvision.models import mobilenet_v2
>>> model = mobilenet_v2(pretrained=True)
Create a random dataset generator:
->>> import numpy as np
->>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
+>>> import numpy as np
+>>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
Create a MCT core config, containing the quantization configuration:
diff --git a/docs/api/api_docs/methods/pytorch_quantization_aware_training_init_experimental.html b/docs/api/api_docs/methods/pytorch_quantization_aware_training_init_experimental.html
index a58199df2..c1e3a19f2 100644
--- a/docs/api/api_docs/methods/pytorch_quantization_aware_training_init_experimental.html
+++ b/docs/api/api_docs/methods/pytorch_quantization_aware_training_init_experimental.html
@@ -7,7 +7,7 @@
PyTorch Quantization Aware Training Model Init — MCT Documentation: ver 2.6.0
-
+
@@ -74,18 +74,18 @@ Navigation
Examples
Import MCT:
->>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
Import a Pytorch model:
->>> from torchvision.models import mobilenet_v2
+>>> from torchvision.models import mobilenet_v2
>>> model = mobilenet_v2(pretrained=True)
Create a random dataset generator for the required number of calibration iterations (num_calibration_batches). In this example, a random dataset of 10 batches, each containing 4 images, is used:
->>> import numpy as np
+>>> import numpy as np
>>> num_calibration_batches = 10
->>> def repr_datagen():
+>>> def repr_datagen():
>>> for _ in range(num_calibration_batches):
>>> yield [np.random.random((4, 3, 224, 224))]
diff --git a/docs/api/api_docs/methods/set_logger_path.html b/docs/api/api_docs/methods/set_logger_path.html
index 0f14537ca..66272e074 100644
--- a/docs/api/api_docs/methods/set_logger_path.html
+++ b/docs/api/api_docs/methods/set_logger_path.html
@@ -7,7 +7,7 @@
Enable a Logger — MCT Documentation: ver 2.6.0
-
+
@@ -45,8 +45,11 @@ Navigation
model_compression_toolkit.set_log_folder(folder, level=logging.INFO)¶
Set a directory path for saving a log file.
-- Parameters:
-
+- Return type:
+None
+
+- Parameters:
+
folder (str) – Folder path to save the log file.
level (int) – Level of verbosity to set to the logger and handlers.
@@ -58,11 +61,6 @@ Navigation
to set up logging.
Don’t use Python’s original logger.
-
-- Return type:
-None
-
-
diff --git a/docs/api/api_docs/methods/xquant_report_keras_experimental.html b/docs/api/api_docs/methods/xquant_report_keras_experimental.html
index 65d9d733a..2feee9bb7 100644
--- a/docs/api/api_docs/methods/xquant_report_keras_experimental.html
+++ b/docs/api/api_docs/methods/xquant_report_keras_experimental.html
@@ -7,7 +7,7 @@
XQuant Report Keras — MCT Documentation: ver 2.6.0
-
+
@@ -45,8 +45,11 @@ Navigation
model_compression_toolkit.xquant.keras.facade_xquant_report.xquant_report_keras_experimental(float_model, quantized_model, repr_dataset, validation_dataset, xquant_config)¶
Generate an explainable quantization report for a quantized Keras model.
-- Parameters:
-
+- Return type:
+Dict[str, Any]
+
+- Parameters:
+
float_model (keras.Model) – The original floating-point Keras model.
quantized_model (keras.Model) – The quantized Keras model.
repr_dataset (Callable) – The representative dataset used during quantization for similarity metrics computation.
@@ -54,14 +57,11 @@ Navigation
xquant_config (XQuantConfig) – Configuration settings for explainable quantization.
-- Returns:
-A dictionary containing the collected similarity metrics and report data.
-
-- Return type:
-Dict[str, Any]
+- Returns:
+A dictionary containing the collected similarity metrics and report data.
- Return type:
-Dict[str, Any]
+Dict[str, Any]
diff --git a/docs/api/api_docs/methods/xquant_report_pytorch_experimental.html b/docs/api/api_docs/methods/xquant_report_pytorch_experimental.html
index 8388913a1..696d88e5e 100644
--- a/docs/api/api_docs/methods/xquant_report_pytorch_experimental.html
+++ b/docs/api/api_docs/methods/xquant_report_pytorch_experimental.html
@@ -7,7 +7,7 @@
XQuant Report Pytorch — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental.html b/docs/api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental.html
index 6817c7f5e..0f3b1c42b 100644
--- a/docs/api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental.html
+++ b/docs/api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental.html
@@ -7,7 +7,7 @@
XQuant Report Troubleshoot Pytorch — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/modules/core_config.html b/docs/api/api_docs/modules/core_config.html
index f83bf0b59..a114722ad 100644
--- a/docs/api/api_docs/modules/core_config.html
+++ b/docs/api/api_docs/modules/core_config.html
@@ -7,7 +7,7 @@
CoreConfig — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/modules/debug_config.html b/docs/api/api_docs/modules/debug_config.html
index 64e31975f..61ad45a27 100644
--- a/docs/api/api_docs/modules/debug_config.html
+++ b/docs/api/api_docs/modules/debug_config.html
@@ -7,7 +7,7 @@
debug_config Module — MCT Documentation: ver 2.6.0
-
+
@@ -73,11 +73,11 @@ DebugConfig>>> class ProgressInfoCallback:
-... def __init__(self):
+>>> class ProgressInfoCallback:
+... def __init__(self):
... self.history = []
...
-... def __call__(self, info):
+... def __call__(self, info):
... current = info["currentComponent"]
... total = info["totalComponents"]
... component_name = info["completedComponents"]
@@ -92,7 +92,7 @@ DebugConfig>>> def progress_info_callback(info):
+>>> def progress_info_callback(info):
... current = info["currentComponent"]
... total = info["totalComponents"]
... component_name = info["completedComponents"]
@@ -129,7 +129,7 @@ DebugConfig>>> import model_compression_toolkit as mct
+>>> import model_compression_toolkit as mct
>>> debug_config = mct.core.DebugConfig(progress_info_callback=progress_info_callback)
>>> core_config = mct.core.CoreConfig(debug_config=debug_config)
diff --git a/docs/api/api_docs/modules/exporter.html b/docs/api/api_docs/modules/exporter.html
index 2399c6b1b..435f7315a 100644
--- a/docs/api/api_docs/modules/exporter.html
+++ b/docs/api/api_docs/modules/exporter.html
@@ -7,7 +7,7 @@
exporter Module — MCT Documentation: ver 2.6.0
-
+
@@ -78,8 +78,11 @@ keras_export_model
-Parameters:
-
+- Return type:
+Dict[str, type]
+
+- Parameters:
+
model – Model to export.
save_model_path – Path to save the model.
is_layer_exportable_fn – Callable to check whether a layer can be exported or not.
@@ -87,11 +90,8 @@ keras_export_modelReturns:
-Custom objects dictionary needed to load the model.
-
-- Return type:
-Dict[str, type]
+- Returns:
+Custom objects dictionary needed to load the model.
@@ -101,9 +101,9 @@ keras_export_model¶
To export a TensorFlow model as a quantized model, it is necessary to first apply quantization
to the model using MCT:
-import numpy as np
-from keras.applications import ResNet50
-import model_compression_toolkit as mct
+import numpy as np
+from keras.applications import ResNet50
+import model_compression_toolkit as mct
# Create a model
float_model = ResNet50()
@@ -122,7 +122,7 @@ keras serialization format¶
By default, mct.exporter.keras_export_model will export the quantized Keras model to
a .keras model with custom quantizers from mct_quantizers module.
-import tempfile
+import tempfile
# Path of exported model
_, keras_file_path = tempfile.mkstemp('.keras')
@@ -160,8 +160,11 @@ pytorch_export_model
-- Parameters:
-
+- Return type:
+None
+
+- Parameters:
+
model (Module) – Model to export.
save_model_path (str) – Path to save the model.
repr_dataset (Callable) – Representative dataset for tracing the pytorch model (mandatory for exporting it).
@@ -172,9 +175,6 @@ pytorch_export_modeloutput_names (Optional[List[str]]) – Optional list of output node names for export compatibility. This argument is relevant only when using PytorchExportSerializationFormat.ONNX.
-- Return type:
-None
-
@@ -186,17 +186,17 @@ Pytorch Tutorialimport model_compression_toolkit as mct
-import numpy as np
-import torch
-from torchvision.models.mobilenetv2 import mobilenet_v2
+import model_compression_toolkit as mct
+import numpy as np
+import torch
+from torchvision.models.mobilenetv2 import mobilenet_v2
# Create a model
float_model = mobilenet_v2()
# Notice that here the representative dataset is random for demonstration only.
-def representative_data_gen():
+def representative_data_gen():
yield [np.random.random((1, 3, 224, 224))]
@@ -254,8 +254,8 @@ ONNX model output names
Use exported model for inference¶
To load and run inference with the exported model, which was exported to an ONNX file in MCTQ format, use the mct_quantizers method get_ort_session_options when creating the onnxruntime session. Note that inference with models exported in this format is slower and suffers from higher latency. However, inference with these models on IMX500 does not suffer from this issue.
-import mct_quantizers as mctq
-import onnxruntime as ort
+import mct_quantizers as mctq
+import onnxruntime as ort
sess = ort.InferenceSession(onnx_file_path,
mctq.get_ort_session_options(),
diff --git a/docs/api/api_docs/modules/layer_filters.html b/docs/api/api_docs/modules/layer_filters.html
index da4e14b6e..941a9ebaa 100644
--- a/docs/api/api_docs/modules/layer_filters.html
+++ b/docs/api/api_docs/modules/layer_filters.html
@@ -7,7 +7,7 @@
Layer Attributes Filters — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/modules/network_editor.html b/docs/api/api_docs/modules/network_editor.html
index 56cd07d0d..54006edb3 100644
--- a/docs/api/api_docs/modules/network_editor.html
+++ b/docs/api/api_docs/modules/network_editor.html
@@ -7,7 +7,7 @@
network_editor Module — MCT Documentation: ver 2.6.0
-
+
@@ -50,9 +50,9 @@ EditRule
and the action is applied to these nodes during the quantization process.
Examples
Create an EditRule to quantize all Conv2D kernel attribute weights using 9 bits:
->>> import model_compression_toolkit as mct
->>> from model_compression_toolkit.core.keras.constants import KERNEL
->>> from tensorflow.keras.layers import Conv2D
+>>> import model_compression_toolkit as mct
+>>> from model_compression_toolkit.core.keras.constants import KERNEL
+>>> from tensorflow.keras.layers import Conv2D
>>> er_list = [mct.core.network_editor.EditRule(filter=mct.core.network_editor.NodeTypeFilter(Conv2D), action=mct.core.network_editor.ChangeCandidatesWeightsQuantConfigAttr(attr_name=KERNEL, weights_n_bits=9))]
diff --git a/docs/api/api_docs/modules/qat_config.html b/docs/api/api_docs/modules/qat_config.html
index a6ad8b503..da879fe16 100644
--- a/docs/api/api_docs/modules/qat_config.html
+++ b/docs/api/api_docs/modules/qat_config.html
@@ -7,7 +7,7 @@
qat_config Module — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/modules/target_platform_capabilities.html b/docs/api/api_docs/modules/target_platform_capabilities.html
index 76f1524bb..d6b785708 100644
--- a/docs/api/api_docs/modules/target_platform_capabilities.html
+++ b/docs/api/api_docs/modules/target_platform_capabilities.html
@@ -7,7 +7,7 @@
target_platform_capabilities Module — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/api/api_docs/modules/trainable_infrastructure.html b/docs/api/api_docs/modules/trainable_infrastructure.html
index 15de91ad3..52e173b98 100644
--- a/docs/api/api_docs/modules/trainable_infrastructure.html
+++ b/docs/api/api_docs/modules/trainable_infrastructure.html
@@ -7,7 +7,7 @@
trainable_infrastructure Module — MCT Documentation: ver 2.6.0
-
+
@@ -128,8 +128,8 @@ TrainableQuantizerWeightsConfigfrom model_compression_toolkit.target_platform_capabilities.target_platform_capabilities import QuantizationMethod
-from model_compression_toolkit.constants import THRESHOLD, MIN_THRESHOLD
+from model_compression_toolkit.target_platform_capabilities.target_platform_capabilities import QuantizationMethod
+from model_compression_toolkit.constants import THRESHOLD, MIN_THRESHOLD
TrainableQuantizerWeightsConfig(weights_quantization_method=QuantizationMethod.SYMMETRIC,
weights_n_bits=8,
@@ -165,8 +165,8 @@ TrainableQuantizerActivationConfigfrom model_compression_toolkit.target_platform_capabilities.target_platform_capabilities import QuantizationMethod
-from model_compression_toolkit.constants import THRESHOLD, MIN_THRESHOLD
+from model_compression_toolkit.target_platform_capabilities.target_platform_capabilities import QuantizationMethod
+from model_compression_toolkit.constants import THRESHOLD, MIN_THRESHOLD
TrainableQuantizerActivationConfig(activation_quantization_method=QuantizationMethod.UNIFORM,
activation_n_bits=8,
diff --git a/docs/api/api_docs/notes/tpc_note.html b/docs/api/api_docs/notes/tpc_note.html
index 885ac18cf..a93c45d53 100644
--- a/docs/api/api_docs/notes/tpc_note.html
+++ b/docs/api/api_docs/notes/tpc_note.html
@@ -7,7 +7,7 @@
<no title> — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/docs_troubleshoot/_sources/troubleshoots/threhold_selection_error_method.rst.txt b/docs/docs_troubleshoot/_sources/troubleshoots/threhold_selection_error_method.rst.txt
index 9acad965a..fd592d7f0 100644
--- a/docs/docs_troubleshoot/_sources/troubleshoots/threhold_selection_error_method.rst.txt
+++ b/docs/docs_troubleshoot/_sources/troubleshoots/threhold_selection_error_method.rst.txt
@@ -23,12 +23,28 @@ Solution
=================================
Use a different error method for activations. You can set the following values:
- * NOCLIPPING - Use min/max values
- * MSE (default) - Use mean square error
- * MAE - Use mean absolute error
- * KL - Use KL-divergence
- * Lp - Use Lp-norm
- * HMSE - Use Hessian-based mean squared error
+ * NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.
+ * MSE - **(default)** Use mean square error to minimize quantization noise.
+ * MAE - Use mean absolute error to minimize quantization noise.
+ * KL - Use KL-divergence to keep the quantized signal's distribution as similar as possible to the original.
+ * Lp - Use the Lp-norm to minimize quantization noise. The parameter p is specified by QuantizationConfig.l_p_value (default: 2; integer only). It equals MAE when p = 1 and MSE when p = 2; use this method if you need p ≧ 3.
+ * HMSE - Use Hessian-based mean squared error to minimize quantization noise. This method uses Hessian scores to give greater weight to the most impactful parameters when computing the error induced by quantization.
+
+ **How to select QuantizationErrorMethod**
+
+ .. csv-table::
+    :header: "Method", "Recommended Situations"
+    :widths: 20, 80
+
+    "NOCLIPPING", "Research and debugging, where you want to observe behavior across the entire value range. Effective when the full range must be preserved, especially when the data is skewed (for example, when very few samples lie near the minimum)."
+    "MSE", "**The recommended default.** Effective when the data distribution is close to normal with few outliers, and when you want stable results, such as in regression tasks."
+    "MAE", "Effective for data with heavy noise and many outliers."
+    "KL", "Useful for tasks where the output distribution matters (such as anomaly detection)."
+    "Lp", "Effective with p ≧ 3 when you want to be more sensitive to outliers than MSE (for example, with sparse data)."
+    "HMSE", "Recommended when using GPTQ. Effective for models where specific layers strongly influence overall accuracy (such as Transformers)."
+
+
+
For example, set NOCLIPPING to the ``activation_error_method`` attribute of the ``QuantizationConfig`` in ``CoreConfig``.
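The tradeoff between these error methods can be illustrated with a toy sketch. This is plain NumPy, not MCT's actual threshold search: the function names, the 4-bit uniform quantizer, and the candidate grid are illustrative assumptions. With a single extreme outlier in the data, NOCLIPPING keeps the full range (and so wastes quantization resolution on empty space), while an MSE-driven search selects a much smaller threshold because clipping the outlier costs less error than the lost resolution.

```python
import numpy as np

def noclipping_threshold(x):
    # NOCLIPPING: take the maximal absolute value, so no sample is clipped.
    return float(np.max(np.abs(x)))

def mse_threshold(x, n_bits=4, n_candidates=50):
    # MSE: search candidate thresholds below the max and keep the one that
    # minimizes the mean squared quantization error (clipping included).
    max_t = float(np.max(np.abs(x)))
    best_t, best_err = max_t, np.inf
    for t in np.linspace(max_t / n_candidates, max_t, n_candidates):
        step = 2 * t / (2 ** n_bits)               # uniform step over [-t, t]
        q = np.clip(np.round(x / step) * step, -t, t)
        err = float(np.mean((x - q) ** 2))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)                        # roughly normal activations
x[0] = 40.0                                        # a single extreme outlier

print(noclipping_threshold(x))                     # 40.0 -- the outlier sets the range
print(mse_threshold(x))                            # a much smaller threshold: clipping
                                                   # the outlier beats losing resolution
```

The same intuition explains the table above: the more outlier-heavy and skewed the data, the more an error-minimizing method (MSE, MAE, Lp) diverges from NOCLIPPING.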
diff --git a/docs/docs_troubleshoot/genindex.html b/docs/docs_troubleshoot/genindex.html
index 1350a4150..e2f12927f 100644
--- a/docs/docs_troubleshoot/genindex.html
+++ b/docs/docs_troubleshoot/genindex.html
@@ -6,7 +6,7 @@
Index — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/index.html b/docs/docs_troubleshoot/index.html
index 0ddb16140..73d959f8c 100644
--- a/docs/docs_troubleshoot/index.html
+++ b/docs/docs_troubleshoot/index.html
@@ -7,7 +7,7 @@
TroubleShooting Manual (MCT XQuant Extension Tool) — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/search.html b/docs/docs_troubleshoot/search.html
index 16ece6516..7cda3fba8 100644
--- a/docs/docs_troubleshoot/search.html
+++ b/docs/docs_troubleshoot/search.html
@@ -6,7 +6,7 @@
Search — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/searchindex.js b/docs/docs_troubleshoot/searchindex.js
index 5ab413c60..d0972db40 100644
--- a/docs/docs_troubleshoot/searchindex.js
+++ b/docs/docs_troubleshoot/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"Appendix: How to Read the Outlier Histograms": [[5, "appendix-how-to-read-the-outlier-histograms"]], "Bias Correction": [[1, null]], "Enabling Hessian-based Mixed Precision": [[2, null]], "GPTQ - Gradient-Based Post Training Quantization": [[3, null]], "Mixed Precision with model output loss objective": [[4, null]], "Outlier Removal": [[5, null]], "Overview": [[0, "overview"], [1, "overview"], [2, "overview"], [3, "overview"], [4, "overview"], [5, "overview"], [6, "overview"], [7, "overview"], [8, "overview"], [9, "overview"], [10, "overview"], [11, "overview"]], "Quantization Troubleshooting for MCT": [[0, "quantization-troubleshooting-for-mct"]], "References": [[0, "references"]], "Representative Dataset size and diversity": [[7, null]], "Representative and Validation Dataset Mismatch": [[6, null]], "Shift Negative Activation": [[8, null]], "Solution": [[1, "solution"], [2, "solution"], [3, "solution"], [4, "solution"], [5, "solution"], [6, "solution"], [7, "solution"], [8, "solution"], [9, "solution"], [10, "solution"], [11, "solution"]], "Threshold selection error method": [[9, null]], "Trouble Situation": [[1, "trouble-situation"], [4, "trouble-situation"], [5, "trouble-situation"], [6, "trouble-situation"], [7, "trouble-situation"], [8, "trouble-situation"], [10, "trouble-situation"], [11, "trouble-situation"]], "TroubleShooting Manual (MCT XQuant Extension Tool)": [[0, null]], "Unbalanced \u201cconcatenation\u201d": [[10, null]], "Using more samples in Mixed Precision quantization": [[11, null]]}, "docnames": ["index", "troubleshoots/bias_correction", "troubleshoots/enabling_hessian-based_mixed_precision", "troubleshoots/gptq-gradient_based_post_training_quantization", "troubleshoots/mixed_precision_with_model_output_loss_objective", "troubleshoots/outlier_removal", "troubleshoots/representative_and_validation_dataset_mismatch", "troubleshoots/representative_dataset_size_and_diversity", 
"troubleshoots/shift_negative_activation", "troubleshoots/threhold_selection_error_method", "troubleshoots/unbalanced_concatenation", "troubleshoots/using_more_samples_in_mixed_precision_quantization"], "envversion": {"sphinx": 64, "sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2}, "filenames": ["index.rst", "troubleshoots/bias_correction.rst", "troubleshoots/enabling_hessian-based_mixed_precision.rst", "troubleshoots/gptq-gradient_based_post_training_quantization.rst", "troubleshoots/mixed_precision_with_model_output_loss_objective.rst", "troubleshoots/outlier_removal.rst", "troubleshoots/representative_and_validation_dataset_mismatch.rst", "troubleshoots/representative_dataset_size_and_diversity.rst", "troubleshoots/shift_negative_activation.rst", "troubleshoots/threhold_selection_error_method.rst", "troubleshoots/unbalanced_concatenation.rst", "troubleshoots/using_more_samples_in_mixed_precision_quantization.rst"], "indexentries": {}, "objects": {}, "objnames": {}, "objtypes": {}, "terms": {"": [2, 3, 4, 6, 11], "0": 5, "1": [0, 10], "15": 10, "1bit": 8, "1x1": 10, "2": [0, 4, 10], "20": 5, "3": [5, 10], "32": 11, "5": [5, 10], "50": 3, "64": [10, 11], "758747418625537": 10, "8": 5, "9": 5, "A": 7, "As": 3, "By": [4, 11], "For": [0, 3, 4, 5, 9, 10], "If": [0, 1, 6, 7, 8, 10], "In": [0, 2, 3, 4, 5, 7, 11], "Such": 5, "The": [0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11], "_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "about": [0, 5, 8], "absolut": 9, "accommod": 9, "accur": 10, "accuraci": [0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11], "achiev": 4, "activ": [0, 4, 5, 6, 7, 9], "activation_error_method": 9, "actual": 4, "add": 10, "adjust": [0, 4], "advis": 9, "after": [0, 5, 8, 10], "after_statistics_correct": 4, "aim": [0, 4], "all": [7, 8], "allow": 5, "altern": 9, 
"an": [1, 3, 4, 5, 7, 8, 9], "ani": 3, "anoth": 9, "api": 4, "appear": 5, "appli": [1, 6], "applic": 0, "ar": [0, 1, 5, 6, 7, 8, 10], "architectur": 2, "assess": 2, "assign": [2, 4, 11], "attribut": [4, 5, 9, 11], "automat": 0, "averag": 5, "axi": 5, "base": [0, 1, 9], "becom": [5, 9], "below": [0, 4, 5], "between": [5, 10], "bia": 0, "bin": 5, "bit": [2, 4, 11], "bitwidth": 4, "black": 5, "can": [0, 1, 2, 3, 4, 5, 8, 9, 10], "case": [0, 6, 7], "caus": [1, 7], "certain": [2, 5], "chang": 8, "check": [1, 2, 3, 4, 5, 11], "class": [7, 8], "classif": 7, "collaps": 10, "collect": 1, "combin": 10, "come": 6, "common": 4, "complex": 4, "compress": [0, 4, 10], "compromis": [4, 9], "comput": [1, 2], "computation": 2, "concaten": 0, "configur": [0, 3, 8], "consid": [9, 10], "constraint": [2, 4, 11], "contain": [5, 8, 9, 10], "control": 8, "conv": 10, "convolut": 10, "core": [1, 2, 4, 5, 8, 9, 10, 11], "core_config": [1, 2, 4, 5, 8, 9, 10, 11], "coreconfig": [1, 2, 4, 5, 8, 9, 10, 11], "correct": 0, "correl": 4, "correspond": 5, "could": [9, 11], "creat": 5, "crucial": 9, "dash": 5, "data": [1, 5, 9], "dataset": [0, 1, 3, 5, 11], "deal": 11, "decreas": [0, 1], "deeper": 2, "default": [1, 4, 8, 9, 11], "defin": [2, 4, 11], "degrad": [1, 4, 5, 6, 7, 8, 9, 10, 11], "deliv": 3, "depend": [2, 4, 11], "deriv": [6, 7], "descript": 8, "detect": [5, 10], "determin": [4, 9], "detriment": 4, "differ": [2, 4, 6, 9, 10, 11], "directori": 5, "disabl": [1, 10], "discov": 10, "displai": 10, "distance_weighting_method": 4, "distribut": [1, 5, 7], "diverg": 9, "divers": 0, "diversifi": 1, "divid": 5, "document": 0, "doe": 0, "domain": 6, "don": 1, "down": 10, "driven": 9, "dure": [2, 11], "dynam": 8, "e": [4, 7, 8, 9, 11], "each": [2, 4, 7, 10, 11], "effect": 1, "either": 3, "element": 10, "elu": 8, "emphas": 4, "emploi": [9, 11], "enabl": 0, "end": 5, "enhanc": [2, 4, 9, 11], "enough": 7, "entir": 9, "epoch": 3, "error": [0, 8], "especi": [0, 9], "essenti": 5, "estim": 1, "etc": [8, 9], 
"exampl": [3, 4, 5, 9, 10], "execut": [5, 10], "exhibit": 11, "expand": 11, "expect": 6, "experi": 0, "extend": [2, 9, 11], "extens": 5, "extra": 8, "extrem": 4, "fail": 3, "fals": [1, 8, 10], "far": 5, "fear": 0, "featur": [2, 10], "few": 8, "fine": 5, "finetun": 3, "finish": 3, "first": [5, 10], "flag": [2, 8], "flexibl": 9, "float": 10, "follow": [0, 8, 9, 10], "formula": 10, "from": [0, 4, 5, 6, 7, 9, 11], "function": [0, 4, 8], "furthermor": 2, "g": [4, 7, 8, 9, 11], "gelu": 8, "gener": [0, 1], "get": 9, "get_pytorch_gptq_config": 3, "gptq": 0, "gptq_config": 3, "gradient": 0, "graph": 10, "greater": 4, "ha": 7, "harder": 8, "hardswish": 8, "have": [1, 5, 6, 8, 10], "hessian": [0, 9], "high": [4, 11], "hmse": 9, "how": 9, "howev": [0, 4, 9], "hyperparamet": 3, "i": [1, 3, 4, 5, 6, 7, 8, 9, 10], "ident": 6, "identifi": 0, "imag": [4, 5, 6, 7], "implement": 3, "import": [2, 4], "improv": [0, 3, 8], "inaccuraci": 10, "includ": 7, "incompat": 1, "increas": [1, 7, 11], "indic": [0, 5], "induc": 1, "inform": [2, 3, 4, 11], "input": 10, "insight": 9, "intens": 2, "introduc": 2, "involv": [4, 9, 10], "isn": 7, "item": 0, "its": [1, 6, 8], "judgeabl": 0, "kl": 9, "label": 3, "larger": 11, "last": 4, "last_lay": 4, "layer": [2, 4, 5, 8, 9, 10, 11], "lead": [2, 4, 10, 11], "leaki": 8, "less": [3, 10], "leverag": 11, "limit": 5, "line": 5, "linear_collaps": 10, "longer": 3, "loss": 0, "lost": 0, "lower": 5, "lp": 9, "mae": 9, "mai": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "make": [0, 5, 7], "manual": 5, "match": 7, "max": 9, "maxim": 4, "maximum": 5, "mct": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "mean": 9, "mechan": 2, "mess": 5, "messag": 10, "method": [0, 2, 4], "metric": 9, "might": 7, "min": 9, "minim": 0, "mismatch": 0, "mitig": 4, "mix": [0, 3], "mixed_precis": 4, "mixed_precision_config": [2, 4, 11], "mixedprecisionquantizationconfig": [2, 4, 11], "model": [0, 2, 3, 5, 6, 7, 8, 9, 10, 11], "model_compression_toolkit": 4, "modifi": 10, "more": [0, 2, 3, 4, 7, 8], 
"mpdistanceweight": 4, "mse": 9, "much": 3, "multipl": 10, "n_epoch": 3, "name": 5, "necessit": 2, "neg": 0, "network": [0, 2, 9, 10], "neural": 0, "nn": 8, "noclip": 9, "nois": 2, "norm": 9, "notabl": [2, 10], "num_of_imag": 11, "number": [3, 4, 7, 11], "numer": 0, "object": [0, 9], "occur": 7, "offer": [0, 2, 4, 9], "often": 0, "one": 9, "onli": [3, 8], "oper": [8, 10], "opt": 9, "optim": [3, 4, 9, 10], "option": [1, 3], "other": [0, 8, 9], "out": [2, 3, 4, 11], "outcom": 2, "outlier": 0, "outlier_histgram": 5, "outlin": 0, "output": [0, 5, 8], "overcom": 1, "overfit": 7, "page": 0, "paramet": [2, 3, 5], "part": [5, 8], "particularli": [5, 9, 11], "path": 5, "perform": 10, "period": 9, "place": 4, "pleas": 0, "point": 10, "posit": 8, "post": 0, "potenti": [2, 3, 4, 9, 10], "preced": 10, "precis": [0, 3], "predefin": 9, "prelu": 8, "preprocess": 6, "primari": 5, "priorit": 4, "problemat": 10, "process": [2, 3, 9, 11], "produc": 5, "program": 2, "provid": [1, 11], "ptq": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "pytorch_gradient_post_training_quant": 3, "pytorch_post_training_quant": [1, 2, 4, 5, 6, 7, 8, 9, 10, 11], "quant_config": 9, "quantiz": [1, 2, 4, 5, 6, 7, 8, 9, 10], "quantization_config": 9, "quantizationconfig": [1, 5, 8, 9, 10], "quantizationerrormethod": 9, "quantized_model": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "rang": [5, 8, 9, 10], "rate": 4, "read": [0, 8], "recalibr": 2, "reclaim": 0, "recommend": [1, 10], "recov": 0, "red": 5, "reduc": 4, "reduct": 4, "reli": [4, 9], "relu": 8, "remedi": 3, "remov": 0, "report_dir": 5, "repres": [0, 1, 3, 5, 11], "representative_dataset": [3, 6, 7], "requir": [3, 9], "residual_collaps": 10, "resourc": [2, 4, 11], "restor": 1, "result": [0, 4, 11], "risk": 4, "runtim": [2, 9, 11], "same": 6, "sampl": [0, 7], "save": [5, 8], "scale": 10, "scenario": [4, 5], "score": [2, 5], "search": [2, 4, 11], "second": [5, 10], "section": 0, "see": 0, "select": [0, 5], "sensit": [1, 2, 4, 10, 11], "separ": 10, "seri": 0, "set": [0, 
1, 2, 3, 4, 5, 6, 8, 9, 10, 11], "set_log_fold": 0, "setup": 0, "shift": [0, 1], "shift_negative_activation_correct": 8, "shift_negative_params_search": 8, "shift_negative_ratio": 8, "shift_negative_threshold_recalcul": 8, "should": 6, "show": 5, "shown": 5, "side": 5, "signific": [0, 9], "significantli": 10, "silu": 8, "similar": 7, "singl": 7, "size": [0, 1, 2, 4, 11], "skew": 5, "slow": 10, "small": [4, 7, 8, 11], "solut": 0, "some": [0, 3, 8, 9, 10], "span": 9, "specif": 9, "specifi": 5, "speed": 10, "squar": 9, "statist": [1, 9], "step": 0, "suffer": 9, "sure": 7, "swish": 8, "t": [1, 7], "take": 3, "taken": 6, "target": [2, 4, 7, 11], "tensor": [5, 6, 7, 9, 10], "tensorboard": [0, 4, 5], "than": [3, 8, 10], "them": [6, 8], "therefor": [1, 5], "thi": [0, 2, 5, 7, 8, 9], "threshold": [0, 5, 6, 7, 10], "thresholds_select": 5, "togeth": [3, 10], "too": 7, "tool": 5, "toolkit": [0, 10], "torch": 8, "tradition": 4, "train": [0, 6], "true": [2, 8], "tutori": [2, 3, 4, 11], "tweak": 8, "typic": 5, "unbalanc": 0, "underli": 2, "understand": 2, "unexpect": 2, "unorthodox": 9, "up": 5, "upper": 5, "us": [0, 3, 5, 6, 7, 8, 9, 10], "use_hessian_based_scor": 2, "user": [2, 4, 11], "usual": 6, "valid": [0, 7], "valu": [5, 6, 7, 8, 9, 10, 11], "varianc": 11, "visual": [0, 4, 5], "warn": 10, "we": 9, "weight": [1, 2, 3, 4, 11], "weights_bias_correct": 1, "when": [1, 3, 4, 5, 6, 7, 8, 10, 11], "where": [5, 9], "which": [7, 9, 10], "while": [0, 9], "width": [2, 11], "without": 3, "work": 8, "x": 5, "xquant": 5, "xquantconfig": 5, "you": [0, 1, 3, 4, 5, 8, 9, 10], "your": [0, 1, 5, 8, 9, 10], "z": 5, "z_threshold": 5, "zscore": 5}, "titles": ["TroubleShooting Manual (MCT XQuant Extension Tool)", "Bias Correction", "Enabling Hessian-based Mixed Precision", "GPTQ - Gradient-Based Post Training Quantization", "Mixed Precision with model output loss objective", "Outlier Removal", "Representative and Validation Dataset Mismatch", "Representative Dataset size and diversity", "Shift 
Negative Activation", "Threshold selection error method", "Unbalanced \u201cconcatenation\u201d", "Using more samples in Mixed Precision quantization"], "titleterms": {"activ": 8, "appendix": 5, "base": [2, 3], "bia": 1, "concaten": 10, "correct": 1, "dataset": [6, 7], "divers": 7, "enabl": 2, "error": 9, "extens": 0, "gptq": 3, "gradient": 3, "hessian": 2, "histogram": 5, "how": 5, "loss": 4, "manual": 0, "mct": 0, "method": 9, "mismatch": 6, "mix": [2, 4, 11], "model": 4, "more": 11, "neg": 8, "object": 4, "outlier": 5, "output": 4, "overview": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "post": 3, "precis": [2, 4, 11], "quantiz": [0, 3, 11], "read": 5, "refer": 0, "remov": 5, "repres": [6, 7], "sampl": 11, "select": 9, "shift": 8, "situat": [1, 4, 5, 6, 7, 8, 10, 11], "size": 7, "solut": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "threshold": 9, "tool": 0, "train": 3, "troubl": [1, 4, 5, 6, 7, 8, 10, 11], "troubleshoot": 0, "unbalanc": 10, "us": 11, "valid": 6, "xquant": 0}})
\ No newline at end of file
+Search.setIndex({"alltitles": {"Appendix: How to Read the Outlier Histograms": [[5, "appendix-how-to-read-the-outlier-histograms"]], "Bias Correction": [[1, null]], "Enabling Hessian-based Mixed Precision": [[2, null]], "GPTQ - Gradient-Based Post Training Quantization": [[3, null]], "Mixed Precision with model output loss objective": [[4, null]], "Outlier Removal": [[5, null]], "Overview": [[0, "overview"], [1, "overview"], [2, "overview"], [3, "overview"], [4, "overview"], [5, "overview"], [6, "overview"], [7, "overview"], [8, "overview"], [9, "overview"], [10, "overview"], [11, "overview"]], "Quantization Troubleshooting for MCT": [[0, "quantization-troubleshooting-for-mct"]], "References": [[0, "references"]], "Representative Dataset size and diversity": [[7, null]], "Representative and Validation Dataset Mismatch": [[6, null]], "Shift Negative Activation": [[8, null]], "Solution": [[1, "solution"], [2, "solution"], [3, "solution"], [4, "solution"], [5, "solution"], [6, "solution"], [7, "solution"], [8, "solution"], [9, "solution"], [10, "solution"], [11, "solution"]], "Threshold selection error method": [[9, null]], "Trouble Situation": [[1, "trouble-situation"], [4, "trouble-situation"], [5, "trouble-situation"], [6, "trouble-situation"], [7, "trouble-situation"], [8, "trouble-situation"], [10, "trouble-situation"], [11, "trouble-situation"]], "TroubleShooting Manual (MCT XQuant Extension Tool)": [[0, null]], "Unbalanced \u201cconcatenation\u201d": [[10, null]], "Using more samples in Mixed Precision quantization": [[11, null]]}, "docnames": ["index", "troubleshoots/bias_correction", "troubleshoots/enabling_hessian-based_mixed_precision", "troubleshoots/gptq-gradient_based_post_training_quantization", "troubleshoots/mixed_precision_with_model_output_loss_objective", "troubleshoots/outlier_removal", "troubleshoots/representative_and_validation_dataset_mismatch", "troubleshoots/representative_dataset_size_and_diversity", 
"troubleshoots/shift_negative_activation", "troubleshoots/threhold_selection_error_method", "troubleshoots/unbalanced_concatenation", "troubleshoots/using_more_samples_in_mixed_precision_quantization"], "envversion": {"sphinx": 64, "sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2}, "filenames": ["index.rst", "troubleshoots/bias_correction.rst", "troubleshoots/enabling_hessian-based_mixed_precision.rst", "troubleshoots/gptq-gradient_based_post_training_quantization.rst", "troubleshoots/mixed_precision_with_model_output_loss_objective.rst", "troubleshoots/outlier_removal.rst", "troubleshoots/representative_and_validation_dataset_mismatch.rst", "troubleshoots/representative_dataset_size_and_diversity.rst", "troubleshoots/shift_negative_activation.rst", "troubleshoots/threhold_selection_error_method.rst", "troubleshoots/unbalanced_concatenation.rst", "troubleshoots/using_more_samples_in_mixed_precision_quantization.rst"], "indexentries": {}, "objects": {}, "objnames": {}, "objtypes": {}, "terms": {"": [2, 3, 4, 6, 11], "0": 5, "1": [0, 9, 10], "15": 10, "1bit": 8, "1x1": 10, "2": [0, 4, 9, 10], "20": 5, "3": [5, 9, 10], "32": 11, "5": [5, 10], "50": 3, "64": [10, 11], "758747418625537": 10, "8": 5, "9": 5, "A": 7, "As": 3, "By": [4, 11], "For": [0, 3, 4, 5, 9, 10], "If": [0, 1, 6, 7, 8, 9, 10], "In": [0, 2, 3, 4, 5, 7, 11], "It": 9, "Such": 5, "The": [0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11], "_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "about": [0, 5, 8], "absolut": 9, "accommod": 9, "accur": 10, "accuraci": [0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11], "achiev": 4, "across": 9, "activ": [0, 4, 5, 6, 7, 9], "activation_error_method": 9, "actual": 4, "add": 10, "adjust": [0, 4], "advis": 9, "after": [0, 5, 8, 10], "after_statistics_correct": 4, "aim": [0, 4], "all": 
[7, 8], "allow": 5, "altern": 9, "amount": 9, "an": [1, 3, 4, 5, 7, 8, 9], "ani": 3, "anomali": 9, "anoth": 9, "api": 4, "appear": 5, "appli": [1, 6], "applic": 0, "ar": [0, 1, 5, 6, 7, 8, 9, 10], "architectur": 2, "assess": 2, "assign": [2, 4, 11], "attribut": [4, 5, 9, 11], "automat": 0, "averag": 5, "avoid": 9, "axi": 5, "base": [0, 1, 9], "basic": 9, "becom": [5, 9], "behavior": 9, "below": [0, 4, 5], "between": [5, 10], "bia": [0, 9], "bias": 9, "bin": 5, "bit": [2, 4, 11], "bitwidth": 4, "black": 5, "can": [0, 1, 2, 3, 4, 5, 8, 9, 10], "case": [0, 6, 7], "caus": [1, 7], "certain": [2, 5], "chang": 8, "check": [1, 2, 3, 4, 5, 11], "class": [7, 8], "classif": 7, "clip": 9, "close": 9, "collaps": 10, "collect": 1, "combin": 10, "come": 6, "common": 4, "complex": 4, "compress": [0, 4, 10], "compromis": [4, 9], "comput": [1, 2, 9], "computation": 2, "concaten": 0, "configur": [0, 3, 8], "consid": [9, 10], "constraint": [2, 4, 11], "contain": [5, 8, 9, 10], "control": 8, "conv": 10, "convolut": 10, "core": [1, 2, 4, 5, 8, 9, 10, 11], "core_config": [1, 2, 4, 5, 8, 9, 10, 11], "coreconfig": [1, 2, 4, 5, 8, 9, 10, 11], "correct": 0, "correl": 4, "correspond": 5, "could": [9, 11], "creat": 5, "crucial": 9, "dash": 5, "data": [1, 5, 9], "dataset": [0, 1, 3, 5, 11], "deal": 11, "debug": 9, "decreas": [0, 1], "deeper": 2, "default": [1, 4, 8, 9, 11], "defin": [2, 4, 11], "degrad": [1, 4, 5, 6, 7, 8, 9, 10, 11], "deliv": 3, "depend": [2, 4, 11], "deriv": [6, 7], "descript": 8, "detect": [5, 9, 10], "determin": [4, 9], "detriment": 4, "differ": [2, 4, 6, 9, 10, 11], "directori": 5, "disabl": [1, 10], "discov": 10, "displai": 10, "distance_weighting_method": 4, "distribut": [1, 5, 7, 9], "diverg": 9, "divers": 0, "diversifi": 1, "divid": 5, "document": 0, "doe": 0, "domain": 6, "don": 1, "down": 10, "driven": 9, "dure": [2, 11], "dynam": 8, "e": [4, 7, 8, 9, 11], "each": [2, 4, 7, 10, 11], "effect": [1, 9], "either": 3, "element": 10, "elu": 8, "emphas": 4, "emploi": [9, 
11], "enabl": 0, "end": 5, "enhanc": [2, 4, 9, 11], "enough": 7, "entir": 9, "epoch": 3, "equal": 9, "error": [0, 8], "especi": [0, 9], "essenti": 5, "estim": 1, "etc": [8, 9], "exampl": [3, 4, 5, 9, 10], "execut": [5, 10], "exhibit": 11, "expand": 11, "expect": 6, "experi": 0, "extend": [2, 9, 11], "extens": 5, "extra": 8, "extrem": [4, 9], "factor": 9, "fail": 3, "fals": [1, 8, 10], "far": 5, "fear": 0, "featur": [2, 10], "few": [8, 9], "fine": 5, "finetun": 3, "finish": 3, "first": [5, 10], "flag": [2, 8], "flexibl": 9, "float": 10, "follow": [0, 8, 9, 10], "formula": 10, "from": [0, 4, 5, 6, 7, 9, 11], "function": [0, 4, 8], "furthermor": 2, "g": [4, 7, 8, 9, 11], "gelu": 8, "gener": [0, 1], "get": 9, "get_pytorch_gptq_config": 3, "gptq": [0, 9], "gptq_config": 3, "gradient": 0, "graph": 10, "greater": 4, "ha": 7, "harder": 8, "hardswish": 8, "have": [1, 5, 6, 8, 10], "hessian": [0, 9], "high": [4, 11], "hmse": 9, "how": 9, "howev": [0, 4, 9], "hyperparamet": 3, "i": [1, 3, 4, 5, 6, 7, 8, 9, 10], "ident": 6, "identifi": 0, "imag": [4, 5, 6, 7], "implement": 3, "import": [2, 4, 9], "improv": [0, 3, 8], "inaccuraci": 10, "includ": 7, "incompat": 1, "increas": [1, 7, 11], "indic": [0, 5], "induc": [1, 9], "influenc": 9, "inform": [2, 3, 4, 11], "input": 10, "insight": 9, "integ": 9, "intens": 2, "introduc": 2, "involv": [4, 9, 10], "isn": 7, "item": 0, "its": [1, 6, 8], "judgeabl": 0, "kl": 9, "l_p_valu": 9, "label": 3, "larger": 11, "last": 4, "last_lay": 4, "layer": [2, 4, 5, 8, 9, 10, 11], "lead": [2, 4, 10, 11], "leaki": 8, "less": [3, 10], "leverag": 11, "limit": 5, "line": 5, "linear_collaps": 10, "longer": 3, "loss": 0, "lost": 0, "lot": 9, "lower": 5, "lp": 9, "mae": 9, "mai": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "maintain": 9, "make": [0, 5, 7, 9], "manual": 5, "match": 7, "max": 9, "maxim": 4, "maximum": 5, "mct": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "mean": 9, "mechan": 2, "mess": 5, "messag": 10, "method": [0, 2, 4], "metric": 9, "might": 7, "min": 
9, "minim": [0, 9], "minimum": 9, "mismatch": 0, "mitig": 4, "mix": [0, 3], "mixed_precis": 4, "mixed_precision_config": [2, 4, 11], "mixedprecisionquantizationconfig": [2, 4, 11], "model": [0, 2, 3, 5, 6, 7, 8, 9, 10, 11], "model_compression_toolkit": 4, "modifi": 10, "more": [0, 2, 3, 4, 7, 8, 9], "mpdistanceweight": 4, "mse": 9, "much": 3, "multipl": 10, "n_epoch": 3, "name": 5, "necessit": 2, "neg": 0, "network": [0, 2, 9, 10], "neural": 0, "nn": 8, "noclip": 9, "nois": [2, 9], "norm": 9, "normal": 9, "notabl": [2, 10], "num_of_imag": 11, "number": [3, 4, 7, 11], "numer": 0, "object": [0, 9], "observ": 9, "occur": 7, "offer": [0, 2, 4, 9], "often": 0, "one": 9, "onli": [3, 8, 9], "oper": [8, 10], "opt": 9, "optim": [3, 4, 9, 10], "option": [1, 3], "other": [0, 8, 9], "out": [2, 3, 4, 11], "outcom": 2, "outlier": [0, 9], "outlier_histgram": 5, "outlin": 0, "output": [0, 5, 8, 9], "overal": 9, "overcom": 1, "overfit": 7, "p": 9, "page": 0, "paramet": [2, 3, 5, 9], "part": [5, 8], "particularli": [5, 9, 11], "path": 5, "perform": 10, "period": 9, "phase": 9, "place": 4, "pleas": [0, 9], "point": 10, "posit": 8, "possibl": 9, "post": 0, "potenti": [2, 3, 4, 9, 10], "preced": 10, "precis": [0, 3], "predefin": 9, "prelu": 8, "preprocess": 6, "primari": 5, "priorit": 4, "problemat": 10, "process": [2, 3, 9, 11], "produc": 5, "program": 2, "provid": [1, 11], "ptq": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "pytorch_gradient_post_training_quant": 3, "pytorch_post_training_quant": [1, 2, 4, 5, 6, 7, 8, 9, 10, 11], "quant_config": 9, "quantiz": [1, 2, 4, 5, 6, 7, 8, 9, 10], "quantization_config": 9, "quantizationconfig": [1, 5, 8, 9, 10], "quantizationerrormethod": 9, "quantized_model": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "rang": [5, 8, 9, 10], "rate": 4, "read": [0, 8], "recalibr": 2, "reclaim": 0, "recommend": [1, 9, 10], "recov": 0, "red": 5, "reduc": [4, 9], "reduct": 4, "regress": 9, "reli": [4, 9], "relu": 8, "remedi": 3, "remov": 0, "report_dir": 5, "repres": [0, 1, 3, 
5, 11], "representative_dataset": [3, 6, 7], "requir": [3, 9], "research": 9, "residual_collaps": 10, "resolut": 9, "resourc": [2, 4, 11], "restor": 1, "result": [0, 4, 9, 11], "risk": 4, "runtim": [2, 9, 11], "same": 6, "sampl": [0, 7], "save": [5, 8], "scale": 10, "scenario": [4, 5], "score": [2, 5, 9], "search": [2, 4, 11], "second": [5, 10], "section": 0, "see": 0, "select": [0, 5], "sensit": [1, 2, 4, 9, 10, 11], "separ": 10, "seri": 0, "set": [0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11], "set_log_fold": 0, "setup": 0, "shift": [0, 1], "shift_negative_activation_correct": 8, "shift_negative_params_search": 8, "shift_negative_ratio": 8, "shift_negative_threshold_recalcul": 8, "should": [6, 9], "show": 5, "shown": 5, "side": [5, 9], "signal": 9, "signific": [0, 9], "significantli": 10, "silu": 8, "similar": [7, 9], "singl": 7, "situat": 9, "size": [0, 1, 2, 4, 11], "skew": 5, "slow": 10, "small": [4, 7, 8, 9, 11], "solut": 0, "some": [0, 3, 8, 9, 10], "span": 9, "spars": 9, "specif": 9, "specifi": [5, 9], "speed": 10, "squar": 9, "stabl": 9, "statist": [1, 9], "step": 0, "strongli": 9, "suffer": 9, "sure": 7, "swish": 8, "t": [1, 7], "take": 3, "taken": 6, "target": [2, 4, 7, 11], "task": 9, "tensor": [5, 6, 7, 9, 10], "tensorboard": [0, 4, 5], "than": [3, 8, 9, 10], "them": [6, 8], "therefor": [1, 5], "thi": [0, 2, 5, 7, 8, 9], "threshold": [0, 5, 6, 7, 10], "thresholds_select": 5, "togeth": [3, 10], "too": 7, "tool": 5, "toolkit": [0, 10], "torch": 8, "tradition": 4, "train": [0, 6], "transform": 9, "true": [2, 8], "tutori": [2, 3, 4, 11], "tweak": 8, "typic": 5, "unbalanc": 0, "underli": 2, "understand": 2, "unexpect": 2, "unorthodox": 9, "up": 5, "upper": 5, "us": [0, 3, 5, 6, 7, 8, 9, 10], "use_hessian_based_scor": 2, "user": [2, 4, 11], "usual": 6, "valid": [0, 7], "valu": [5, 6, 7, 8, 9, 10, 11], "valuabl": 9, "varianc": 11, "visual": [0, 4, 5], "want": 9, "warn": 10, "we": 9, "weight": [1, 2, 3, 4, 11], "weights_bias_correct": 1, "when": [1, 3, 4, 5, 6, 7, 8, 9, 
10, 11], "where": [5, 9], "which": [7, 9, 10], "while": [0, 9], "width": [2, 11], "without": 3, "work": 8, "x": 5, "xquant": 5, "xquantconfig": 5, "you": [0, 1, 3, 4, 5, 8, 9, 10], "your": [0, 1, 5, 8, 9, 10], "z": 5, "z_threshold": 5, "zscore": 5}, "titles": ["TroubleShooting Manual (MCT XQuant Extension Tool)", "Bias Correction", "Enabling Hessian-based Mixed Precision", "GPTQ - Gradient-Based Post Training Quantization", "Mixed Precision with model output loss objective", "Outlier Removal", "Representative and Validation Dataset Mismatch", "Representative Dataset size and diversity", "Shift Negative Activation", "Threshold selection error method", "Unbalanced \u201cconcatenation\u201d", "Using more samples in Mixed Precision quantization"], "titleterms": {"activ": 8, "appendix": 5, "base": [2, 3], "bia": 1, "concaten": 10, "correct": 1, "dataset": [6, 7], "divers": 7, "enabl": 2, "error": 9, "extens": 0, "gptq": 3, "gradient": 3, "hessian": 2, "histogram": 5, "how": 5, "loss": 4, "manual": 0, "mct": 0, "method": 9, "mismatch": 6, "mix": [2, 4, 11], "model": 4, "more": 11, "neg": 8, "object": 4, "outlier": 5, "output": 4, "overview": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "post": 3, "precis": [2, 4, 11], "quantiz": [0, 3, 11], "read": 5, "refer": 0, "remov": 5, "repres": [6, 7], "sampl": 11, "select": 9, "shift": 8, "situat": [1, 4, 5, 6, 7, 8, 10, 11], "size": 7, "solut": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "threshold": 9, "tool": 0, "train": 3, "troubl": [1, 4, 5, 6, 7, 8, 10, 11], "troubleshoot": 0, "unbalanc": 10, "us": 11, "valid": 6, "xquant": 0}})
\ No newline at end of file
diff --git a/docs/docs_troubleshoot/static/pygments.css b/docs/docs_troubleshoot/static/pygments.css
index 0d49244ed..5f2b0a250 100644
--- a/docs/docs_troubleshoot/static/pygments.css
+++ b/docs/docs_troubleshoot/static/pygments.css
@@ -6,26 +6,26 @@ span.linenos.special { color: #000000; background-color: #ffffc0; padding-left:
.highlight .hll { background-color: #ffffcc }
.highlight { background: #eeffcc; }
.highlight .c { color: #408090; font-style: italic } /* Comment */
-.highlight .err { border: 1px solid #FF0000 } /* Error */
+.highlight .err { border: 1px solid #F00 } /* Error */
.highlight .k { color: #007020; font-weight: bold } /* Keyword */
-.highlight .o { color: #666666 } /* Operator */
+.highlight .o { color: #666 } /* Operator */
.highlight .ch { color: #408090; font-style: italic } /* Comment.Hashbang */
.highlight .cm { color: #408090; font-style: italic } /* Comment.Multiline */
.highlight .cp { color: #007020 } /* Comment.Preproc */
.highlight .cpf { color: #408090; font-style: italic } /* Comment.PreprocFile */
.highlight .c1 { color: #408090; font-style: italic } /* Comment.Single */
-.highlight .cs { color: #408090; background-color: #fff0f0 } /* Comment.Special */
+.highlight .cs { color: #408090; background-color: #FFF0F0 } /* Comment.Special */
.highlight .gd { color: #A00000 } /* Generic.Deleted */
.highlight .ge { font-style: italic } /* Generic.Emph */
.highlight .ges { font-weight: bold; font-style: italic } /* Generic.EmphStrong */
-.highlight .gr { color: #FF0000 } /* Generic.Error */
+.highlight .gr { color: #F00 } /* Generic.Error */
.highlight .gh { color: #000080; font-weight: bold } /* Generic.Heading */
.highlight .gi { color: #00A000 } /* Generic.Inserted */
-.highlight .go { color: #333333 } /* Generic.Output */
-.highlight .gp { color: #c65d09; font-weight: bold } /* Generic.Prompt */
+.highlight .go { color: #333 } /* Generic.Output */
+.highlight .gp { color: #C65D09; font-weight: bold } /* Generic.Prompt */
.highlight .gs { font-weight: bold } /* Generic.Strong */
.highlight .gu { color: #800080; font-weight: bold } /* Generic.Subheading */
-.highlight .gt { color: #0044DD } /* Generic.Traceback */
+.highlight .gt { color: #04D } /* Generic.Traceback */
.highlight .kc { color: #007020; font-weight: bold } /* Keyword.Constant */
.highlight .kd { color: #007020; font-weight: bold } /* Keyword.Declaration */
.highlight .kn { color: #007020; font-weight: bold } /* Keyword.Namespace */
@@ -33,43 +33,43 @@ span.linenos.special { color: #000000; background-color: #ffffc0; padding-left:
.highlight .kr { color: #007020; font-weight: bold } /* Keyword.Reserved */
.highlight .kt { color: #902000 } /* Keyword.Type */
.highlight .m { color: #208050 } /* Literal.Number */
-.highlight .s { color: #4070a0 } /* Literal.String */
-.highlight .na { color: #4070a0 } /* Name.Attribute */
+.highlight .s { color: #4070A0 } /* Literal.String */
+.highlight .na { color: #4070A0 } /* Name.Attribute */
.highlight .nb { color: #007020 } /* Name.Builtin */
-.highlight .nc { color: #0e84b5; font-weight: bold } /* Name.Class */
-.highlight .no { color: #60add5 } /* Name.Constant */
-.highlight .nd { color: #555555; font-weight: bold } /* Name.Decorator */
-.highlight .ni { color: #d55537; font-weight: bold } /* Name.Entity */
+.highlight .nc { color: #0E84B5; font-weight: bold } /* Name.Class */
+.highlight .no { color: #60ADD5 } /* Name.Constant */
+.highlight .nd { color: #555; font-weight: bold } /* Name.Decorator */
+.highlight .ni { color: #D55537; font-weight: bold } /* Name.Entity */
.highlight .ne { color: #007020 } /* Name.Exception */
-.highlight .nf { color: #06287e } /* Name.Function */
+.highlight .nf { color: #06287E } /* Name.Function */
.highlight .nl { color: #002070; font-weight: bold } /* Name.Label */
-.highlight .nn { color: #0e84b5; font-weight: bold } /* Name.Namespace */
+.highlight .nn { color: #0E84B5; font-weight: bold } /* Name.Namespace */
.highlight .nt { color: #062873; font-weight: bold } /* Name.Tag */
-.highlight .nv { color: #bb60d5 } /* Name.Variable */
+.highlight .nv { color: #BB60D5 } /* Name.Variable */
.highlight .ow { color: #007020; font-weight: bold } /* Operator.Word */
-.highlight .w { color: #bbbbbb } /* Text.Whitespace */
+.highlight .w { color: #BBB } /* Text.Whitespace */
.highlight .mb { color: #208050 } /* Literal.Number.Bin */
.highlight .mf { color: #208050 } /* Literal.Number.Float */
.highlight .mh { color: #208050 } /* Literal.Number.Hex */
.highlight .mi { color: #208050 } /* Literal.Number.Integer */
.highlight .mo { color: #208050 } /* Literal.Number.Oct */
-.highlight .sa { color: #4070a0 } /* Literal.String.Affix */
-.highlight .sb { color: #4070a0 } /* Literal.String.Backtick */
-.highlight .sc { color: #4070a0 } /* Literal.String.Char */
-.highlight .dl { color: #4070a0 } /* Literal.String.Delimiter */
-.highlight .sd { color: #4070a0; font-style: italic } /* Literal.String.Doc */
-.highlight .s2 { color: #4070a0 } /* Literal.String.Double */
-.highlight .se { color: #4070a0; font-weight: bold } /* Literal.String.Escape */
-.highlight .sh { color: #4070a0 } /* Literal.String.Heredoc */
-.highlight .si { color: #70a0d0; font-style: italic } /* Literal.String.Interpol */
-.highlight .sx { color: #c65d09 } /* Literal.String.Other */
+.highlight .sa { color: #4070A0 } /* Literal.String.Affix */
+.highlight .sb { color: #4070A0 } /* Literal.String.Backtick */
+.highlight .sc { color: #4070A0 } /* Literal.String.Char */
+.highlight .dl { color: #4070A0 } /* Literal.String.Delimiter */
+.highlight .sd { color: #4070A0; font-style: italic } /* Literal.String.Doc */
+.highlight .s2 { color: #4070A0 } /* Literal.String.Double */
+.highlight .se { color: #4070A0; font-weight: bold } /* Literal.String.Escape */
+.highlight .sh { color: #4070A0 } /* Literal.String.Heredoc */
+.highlight .si { color: #70A0D0; font-style: italic } /* Literal.String.Interpol */
+.highlight .sx { color: #C65D09 } /* Literal.String.Other */
.highlight .sr { color: #235388 } /* Literal.String.Regex */
-.highlight .s1 { color: #4070a0 } /* Literal.String.Single */
+.highlight .s1 { color: #4070A0 } /* Literal.String.Single */
.highlight .ss { color: #517918 } /* Literal.String.Symbol */
.highlight .bp { color: #007020 } /* Name.Builtin.Pseudo */
-.highlight .fm { color: #06287e } /* Name.Function.Magic */
-.highlight .vc { color: #bb60d5 } /* Name.Variable.Class */
-.highlight .vg { color: #bb60d5 } /* Name.Variable.Global */
-.highlight .vi { color: #bb60d5 } /* Name.Variable.Instance */
-.highlight .vm { color: #bb60d5 } /* Name.Variable.Magic */
+.highlight .fm { color: #06287E } /* Name.Function.Magic */
+.highlight .vc { color: #BB60D5 } /* Name.Variable.Class */
+.highlight .vg { color: #BB60D5 } /* Name.Variable.Global */
+.highlight .vi { color: #BB60D5 } /* Name.Variable.Instance */
+.highlight .vm { color: #BB60D5 } /* Name.Variable.Magic */
.highlight .il { color: #208050 } /* Literal.Number.Integer.Long */
\ No newline at end of file
diff --git a/docs/docs_troubleshoot/troubleshoots/bias_correction.html b/docs/docs_troubleshoot/troubleshoots/bias_correction.html
index 839a46043..14ba9f6be 100644
--- a/docs/docs_troubleshoot/troubleshoots/bias_correction.html
+++ b/docs/docs_troubleshoot/troubleshoots/bias_correction.html
@@ -7,7 +7,7 @@
Bias Correction — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/enabling_hessian-based_mixed_precision.html b/docs/docs_troubleshoot/troubleshoots/enabling_hessian-based_mixed_precision.html
index 9e72dbe68..a1aa13592 100644
--- a/docs/docs_troubleshoot/troubleshoots/enabling_hessian-based_mixed_precision.html
+++ b/docs/docs_troubleshoot/troubleshoots/enabling_hessian-based_mixed_precision.html
@@ -7,7 +7,7 @@
Enabling Hessian-based Mixed Precision — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/gptq-gradient_based_post_training_quantization.html b/docs/docs_troubleshoot/troubleshoots/gptq-gradient_based_post_training_quantization.html
index 263064897..acd49af61 100644
--- a/docs/docs_troubleshoot/troubleshoots/gptq-gradient_based_post_training_quantization.html
+++ b/docs/docs_troubleshoot/troubleshoots/gptq-gradient_based_post_training_quantization.html
@@ -7,7 +7,7 @@
GPTQ - Gradient-Based Post Training Quantization — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/mixed_precision_with_model_output_loss_objective.html b/docs/docs_troubleshoot/troubleshoots/mixed_precision_with_model_output_loss_objective.html
index 7286296f5..ad0957cbb 100644
--- a/docs/docs_troubleshoot/troubleshoots/mixed_precision_with_model_output_loss_objective.html
+++ b/docs/docs_troubleshoot/troubleshoots/mixed_precision_with_model_output_loss_objective.html
@@ -7,7 +7,7 @@
Mixed Precision with model output loss objective — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
@@ -62,7 +62,7 @@ Solution
MCT offers an API to adjust the Mixed Precision objective method (MpDistanceWeighting).
Set the distance_weighting_method attribute to MpDistanceWeighting.LAST_LAYER in the MixedPrecisionQuantizationConfig of the CoreConfig.
By emphasizing a loss function that places greater importance on enhancing the model’s quantized output, users can mitigate the risk of detrimental precision reductions in the last layer.
-from model_compression_toolkit.core.common.mixed_precision import MpDistanceWeighting
+from model_compression_toolkit.core.common.mixed_precision import MpDistanceWeighting
mixed_precision_config = mct.core.MixedPrecisionQuantizationConfig(distance_weighting_method=MpDistanceWeighting.LAST_LAYER)
core_config = mct.core.CoreConfig(mixed_precision_config=mixed_precision_config)
diff --git a/docs/docs_troubleshoot/troubleshoots/outlier_removal.html b/docs/docs_troubleshoot/troubleshoots/outlier_removal.html
index ef4b50923..fd731cd2d 100644
--- a/docs/docs_troubleshoot/troubleshoots/outlier_removal.html
+++ b/docs/docs_troubleshoot/troubleshoots/outlier_removal.html
@@ -7,7 +7,7 @@
Outlier Removal — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/representative_and_validation_dataset_mismatch.html b/docs/docs_troubleshoot/troubleshoots/representative_and_validation_dataset_mismatch.html
index 0efd370fd..bdab2b0fb 100644
--- a/docs/docs_troubleshoot/troubleshoots/representative_and_validation_dataset_mismatch.html
+++ b/docs/docs_troubleshoot/troubleshoots/representative_and_validation_dataset_mismatch.html
@@ -7,7 +7,7 @@
Representative and Validation Dataset Mismatch — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/representative_dataset_size_and_diversity.html b/docs/docs_troubleshoot/troubleshoots/representative_dataset_size_and_diversity.html
index f901696c8..ee4a881f8 100644
--- a/docs/docs_troubleshoot/troubleshoots/representative_dataset_size_and_diversity.html
+++ b/docs/docs_troubleshoot/troubleshoots/representative_dataset_size_and_diversity.html
@@ -7,7 +7,7 @@
Representative Dataset size and diversity — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/shift_negative_activation.html b/docs/docs_troubleshoot/troubleshoots/shift_negative_activation.html
index fd5fd1d39..7d7a2338b 100644
--- a/docs/docs_troubleshoot/troubleshoots/shift_negative_activation.html
+++ b/docs/docs_troubleshoot/troubleshoots/shift_negative_activation.html
@@ -7,7 +7,7 @@
Shift Negative Activation — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/threhold_selection_error_method.html b/docs/docs_troubleshoot/troubleshoots/threhold_selection_error_method.html
index 2b2fb42df..1295f98fa 100644
--- a/docs/docs_troubleshoot/troubleshoots/threhold_selection_error_method.html
+++ b/docs/docs_troubleshoot/troubleshoots/threhold_selection_error_method.html
@@ -7,7 +7,7 @@
Threshold selection error method — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
@@ -52,13 +52,46 @@ Overview
Solution¶
Use a different error method for activations. You can set the following values:
-
-NOCLIPPING - Use min/max values
-MSE (default) - Use mean square error
-MAE - Use mean absolute error
-KL - Use KL-divergence
-Lp - Use Lp-norm
-HMSE - Use Hessian-based mean squared error
+
+NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.
+MSE - (default) Use mean square error to minimize quantization noise.
+MAE - Use mean absolute error to minimize quantization noise.
+KL - Use KL-divergence to keep the quantized signal distribution as close as possible to the original distribution.
+Lp - Use the Lp-norm to minimize quantization noise. The parameter p is specified by QuantizationConfig.l_p_value (default: 2; integer only). The metric is equivalent to MAE when p = 1 and to MSE when p = 2; choose this method when you need p ≥ 3.
+HMSE - Use Hessian-based mean squared error to minimize quantization noise. This method uses Hessian scores to give greater weight to the most influential parameters when computing the error induced by quantization.
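The Lp entry above can be sanity-checked with a short sketch. This uses an illustrative mean-of-powers error (an assumption about the metric's general shape, not MCT's exact implementation), and the helper name lp_error is hypothetical:

```python
# Illustrative sketch: an Lp-style error between a float tensor and its
# quantized version reduces to MAE at p = 1 and MSE at p = 2.
def lp_error(x, xq, p):
    """Mean of |x - xq|**p over all elements (hypothetical helper)."""
    return sum(abs(a - b) ** p for a, b in zip(x, xq)) / len(x)

x = [0.1, -0.4, 0.25, 0.8]      # float activations
xq = [0.0, -0.5, 0.25, 0.75]    # their quantized values

mae = lp_error(x, xq, 1)  # identical to mean absolute error
mse = lp_error(x, xq, 2)  # identical to mean squared error
l3 = lp_error(x, xq, 3)   # p >= 3 penalizes large errors even more strongly
```

Raising p makes the metric increasingly dominated by the largest per-element errors, which is why p ≥ 3 is suggested when outlier sensitivity beyond MSE is desired.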
+How to select QuantizationErrorMethod
+
+
+
+
+
+
+Method
+Recommended Situations
+
+
+
+NOCLIPPING
+Research and debugging phases where you want to observe behavior across the full value range. Effective when the entire range must be preserved, especially for skewed data (for example, when only a very small amount of data lies near the minimum).
+
+MSE
+The recommended default. Effective when the data distribution is close to normal with few outliers, and when you want stable results, such as in regression tasks.
+
+MAE
+Effective for noisy data with many outliers.
+
+KL
+Useful for tasks where the output distribution is important (such as anomaly detection).
+
+LP
+Using p ≥ 3 is effective when you want greater sensitivity to outliers than MSE provides (for example, with sparse data).
+
+HMSE
+Recommended when using GPTQ. Effective for models in which specific layers strongly influence overall accuracy (such as Transformers).
+
+
+
+
For example, set NOCLIPPING as the activation_error_method attribute of the QuantizationConfig in CoreConfig.
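A configuration sketch for that last step, assuming the public mct.core API used elsewhere in these docs (MCT must be installed):

```python
import model_compression_toolkit as mct

# Select min/max (NOCLIPPING) threshold selection for activations.
q_config = mct.core.QuantizationConfig(
    activation_error_method=mct.core.QuantizationErrorMethod.NOCLIPPING)

# Wrap it in a CoreConfig to pass to the quantization entry point.
core_config = mct.core.CoreConfig(quantization_config=q_config)
```

The resulting core_config is then passed to the PTQ call, e.g. mct.ptq.pytorch_post_training_quantization(model, representative_data_gen, core_config=core_config).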
diff --git a/docs/docs_troubleshoot/troubleshoots/unbalanced_concatenation.html b/docs/docs_troubleshoot/troubleshoots/unbalanced_concatenation.html
index 7f38ead4d..c95c95fa6 100644
--- a/docs/docs_troubleshoot/troubleshoots/unbalanced_concatenation.html
+++ b/docs/docs_troubleshoot/troubleshoots/unbalanced_concatenation.html
@@ -7,7 +7,7 @@
Unbalanced “concatenation” — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/docs_troubleshoot/troubleshoots/using_more_samples_in_mixed_precision_quantization.html b/docs/docs_troubleshoot/troubleshoots/using_more_samples_in_mixed_precision_quantization.html
index e1a640647..50db9dad6 100644
--- a/docs/docs_troubleshoot/troubleshoots/using_more_samples_in_mixed_precision_quantization.html
+++ b/docs/docs_troubleshoot/troubleshoots/using_more_samples_in_mixed_precision_quantization.html
@@ -7,7 +7,7 @@
Using more samples in Mixed Precision quantization — TroubleShooting Documentation (MCT XQuant Extension Tool): ver 1.0
-
+
diff --git a/docs/genindex.html b/docs/genindex.html
index 4b0eddcc2..e1f138f60 100644
--- a/docs/genindex.html
+++ b/docs/genindex.html
@@ -6,7 +6,7 @@
Index — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/guidelines/XQuant_Extension_Tool.html b/docs/guidelines/XQuant_Extension_Tool.html
index 5970df91c..84606063a 100644
--- a/docs/guidelines/XQuant_Extension_Tool.html
+++ b/docs/guidelines/XQuant_Extension_Tool.html
@@ -7,7 +7,7 @@
XQuant Extension Tool — MCT Documentation: ver 2.6.0
-
+
@@ -86,7 +86,7 @@ How to Runthe XQuant tutorial with xquant_report_troubleshoot_pytorch_experimental.
-from model_compression_toolkit.xquant import xquant_report_troubleshoot_pytorch_experimental
+from model_compression_toolkit.xquant import xquant_report_troubleshoot_pytorch_experimental
# xquant_report_pytorch_experimental --> xquant_report_troubleshoot_pytorch_experimental
result = xquant_report_troubleshoot_pytorch_experimental(
float_model,
@@ -111,7 +111,7 @@ How to Runxquant_config = XQuantConfig(report_dir='./log_tensorboard_xquant')
-from model_compression_toolkit.xquant import xquant_report_troubleshoot_pytorch_experimental
+from model_compression_toolkit.xquant import xquant_report_troubleshoot_pytorch_experimental
result = xquant_report_troubleshoot_pytorch_experimental(
float_model,
quantized_model,
@@ -208,7 +208,7 @@ Understanding the Judgeable Troubleshoots
-WARNING:Model Compression Toolkit:There are output values that deviate significantly from the average. Refer to the following images and the TroubleShooting Documentation (MCT XQuant Extension Tool) of 'Outlier Removal'.
+WARNING:Model Compression Toolkit:There are output values that deviate significantly from the average. Refer to the following images and the TroubleShooting Documentation (MCT XQuant Extension Tool) of 'Outlier Removal'.
diff --git a/docs/guidelines/visualization.html b/docs/guidelines/visualization.html
index e713d300b..5783d82f7 100644
--- a/docs/guidelines/visualization.html
+++ b/docs/guidelines/visualization.html
@@ -7,7 +7,7 @@
Visualization within TensorBoard — MCT Documentation: ver 2.6.0
-
+
@@ -50,7 +50,7 @@ Navigation
Visualization within TensorBoard¶
One may log various graphs and data collected in different phases of the model quantization and display them within the Tensorboard UI.
To use it, all you have to do is to set a logger path. Setting a path is done by calling set_log_folder.
-import model_compression_toolkit as mct
+import model_compression_toolkit as mct
mct.set_log_folder('/logger/dir/path')
diff --git a/docs/index.html b/docs/index.html
index 0a5b860a4..2ec291518 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -7,7 +7,7 @@
Model Compression Toolkit User Guide — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/search.html b/docs/search.html
index 37d780dc0..740d162e7 100644
--- a/docs/search.html
+++ b/docs/search.html
@@ -6,7 +6,7 @@
Search — MCT Documentation: ver 2.6.0
-
+
diff --git a/docs/searchindex.js b/docs/searchindex.js
index 31df9fddc..eda5370b3 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"API Docs": [[13, null]], "API Documentation": [[50, "api-documentation"]], "About XQuant Extension Tool": [[48, "about-xquant-extension-tool"]], "Actions": [[43, "actions"]], "Attribute Filters": [[42, "attribute-filters"]], "AttributeQuantizationConfig": [[45, "attributequantizationconfig"]], "BNLayerWeightingType": [[1, "bnlayerweightingtype"]], "BaseKerasTrainableQuantizer": [[46, "basekerastrainablequantizer"]], "BasePytorchTrainableQuantizer": [[46, "basepytorchtrainablequantizer"]], "BatchNormAlignemntLossType": [[1, "batchnormalignemntlosstype"]], "BitWidthConfig": [[0, null]], "ChannelAxis": [[3, "channelaxis"]], "ChannelsFilteringStrategy": [[6, "channelsfilteringstrategy"]], "CoreConfig": [[39, null]], "Cosine Similarity Comparison": [[49, "cosine-similarity-comparison"]], "Data Generation Configuration": [[1, null]], "DataInitType": [[1, "datainittype"]], "DebugConfig": [[40, "debugconfig"]], "DefaultDict Class": [[2, null]], "EditRule": [[43, "editrule"]], "Enable a Logger": [[35, null]], "Filters": [[43, "filters"]], "FrameworkInfo Class": [[3, null]], "Fusing": [[45, "fusing"]], "GPTQHessianScoresConfig Class": [[4, "gptqhessianscoresconfig-class"]], "Get DataGenerationConfig for Keras Models": [[14, null]], "Get DataGenerationConfig for Pytorch Models": [[16, null]], "Get GradientPTQConfig for Keras Models": [[15, null]], "Get GradientPTQConfig for Pytorch Models": [[17, null]], "Get Resource Utilization information for Keras Models": [[22, null]], "Get Resource Utilization information for PyTorch Models": [[30, null]], "Get TargetPlatformCapabilities for sdsp converter version": [[19, null]], "Get TargetPlatformCapabilities for tpc version": [[18, null]], "GradientPTQConfig Class": [[4, null]], "GradualActivationQuantizationConfig": [[4, "gradualactivationquantizationconfig"]], "How to Run": [[48, "how-to-run"]], "ImageGranularity": [[1, "imagegranularity"]], "ImageNormalizationType": [[1, "imagenormalizationtype"]], 
"ImagePipelineType": [[1, "imagepipelinetype"]], "ImportanceMetric": [[6, "importancemetric"]], "Indices and tables": [[13, "indices-and-tables"]], "Install": [[50, "install"]], "Keras Data Generation": [[20, null]], "Keras Gradient Based Post Training Quantization": [[21, null]], "Keras Post Training Quantization": [[24, null]], "Keras Quantization Aware Training Model Finalize": [[26, null]], "Keras Quantization Aware Training Model Init": [[27, null]], "Keras Structured Pruning": [[25, null]], "Keras Tutorial": [[41, "keras-tutorial"]], "KerasExportSerializationFormat": [[41, "kerasexportserializationformat"]], "Keys in the processing state dictionary": [[40, "id1"]], "Layer Attributes Filters": [[42, null]], "Load Quantized Keras Model": [[23, null]], "MCTQ": [[41, "mctq"]], "MCTQ Quantization Format": [[41, "mctq-quantization-format"]], "ManualBitWidthSelection": [[0, "manualbitwidthselection"]], "Mixed-precision Configuration Bit-width": [[49, "mixed-precision-configuration-bit-width"]], "MixedPrecisionQuantizationConfig": [[5, null]], "Model Compression Toolkit User Guide": [[50, null]], "MpDistanceWeighting": [[5, "mpdistanceweighting"]], "MpMetricNormalization": [[5, "mpmetricnormalization"]], "ONNX": [[41, "onnx"]], "ONNX model output names": [[41, "onnx-model-output-names"]], "ONNX opset version": [[41, "onnx-opset-version"]], "OpQuantizationConfig": [[45, "opquantizationconfig"]], "OperatorSetGroup": [[45, "operatorsetgroup"]], "OperatorsSet": [[45, "operatorsset"]], "OutputLossType": [[1, "outputlosstype"]], "Overall Process Flow": [[48, "overall-process-flow"]], "Overview": [[50, "overview"]], "Pruning Configuration": [[6, null]], "Pruning Information": [[7, null]], "PyTorch Quantization Aware Training Model Finalize": [[33, null]], "PyTorch Quantization Aware Training Model Init": [[34, null]], "Pytorch Data Generation": [[28, null]], "Pytorch Gradient Based Post Training Quantization": [[29, null]], "Pytorch Post Training Quantization": [[31, 
null]], "Pytorch Structured Pruning": [[32, null]], "Pytorch Tutorial": [[41, "pytorch-tutorial"]], "PytorchExportSerializationFormat": [[41, "pytorchexportserializationformat"]], "QATConfig": [[44, "qatconfig"]], "QFractionLinearAnnealingConfig": [[4, "qfractionlinearannealingconfig"]], "QuantizationConfig": [[8, null]], "QuantizationConfigOptions": [[45, "quantizationconfigoptions"]], "QuantizationErrorMethod": [[9, null]], "QuantizationFormat": [[41, "quantizationformat"]], "QuantizationMethod": [[45, "quantizationmethod"]], "Quickstart": [[50, "quickstart"]], "References": [[50, "references"]], "ResourceUtilization": [[10, null]], "RoundingType": [[4, "roundingtype"]], "SchedulerType": [[1, "schedulertype"]], "Supported Features": [[50, "supported-features"]], "TargetPlatformCapabilities": [[45, "targetplatformcapabilities"]], "Technical Constraints": [[50, "technical-constraints"]], "TrainableQuantizerActivationConfig": [[46, "trainablequantizeractivationconfig"]], "TrainableQuantizerWeightsConfig": [[46, "trainablequantizerweightsconfig"]], "TrainingMethod": [[44, "trainingmethod"], [46, "trainingmethod"]], "Understanding the General Troubleshoots": [[48, "understanding-the-general-troubleshoots"]], "Understanding the Judgeable Troubleshoots": [[48, "understanding-the-judgeable-troubleshoots"]], "Understanding the Quantization Error Graph": [[48, "understanding-the-quantization-error-graph"]], "Use exported model for inference": [[41, "use-exported-model-for-inference"]], "Visualization within TensorBoard": [[49, null]], "XQuant Configuration": [[12, null]], "XQuant Extension Tool": [[48, null]], "XQuant Report Keras": [[36, null]], "XQuant Report Pytorch": [[37, null]], "XQuant Report Troubleshoot Pytorch": [[38, null]], "XQuantConfig Format and Examples": [[48, "xquantconfig-format-and-examples"]], "XQuantConfig parameter": [[48, "id3"]], "core": [[13, "core"]], "data_generation": [[13, "data-generation"]], "debug_config Module": [[40, null]], "exporter": 
[[13, "exporter"]], "exporter Module": [[41, null]], "gptq": [[13, "gptq"]], "keras serialization format": [[41, "keras-serialization-format"]], "keras_export_model": [[41, "keras-export-model"]], "keras_load_quantized_model": [[13, "keras-load-quantized-model"]], "network_editor Module": [[43, null]], "pruning": [[13, "pruning"]], "ptq": [[13, "ptq"]], "pytorch_export_model": [[41, "pytorch-export-model"]], "qat": [[13, "qat"]], "qat_config Module": [[44, null]], "set_log_folder": [[13, "set-log-folder"]], "target_platform_capabilities": [[13, "target-platform-capabilities"]], "target_platform_capabilities Module": [[45, null]], "trainable_infrastructure": [[13, "trainable-infrastructure"]], "trainable_infrastructure Module": [[46, null]], "wrapper": [[11, null], [13, "wrapper"]], "xquant": [[13, "xquant"]]}, "docnames": ["api/api_docs/classes/BitWidthConfig", "api/api_docs/classes/DataGenerationConfig", "api/api_docs/classes/DefaultDict", "api/api_docs/classes/FrameworkInfo", "api/api_docs/classes/GradientPTQConfig", "api/api_docs/classes/MixedPrecisionQuantizationConfig", "api/api_docs/classes/PruningConfig", "api/api_docs/classes/PruningInfo", "api/api_docs/classes/QuantizationConfig", "api/api_docs/classes/QuantizationErrorMethod", "api/api_docs/classes/ResourceUtilization", "api/api_docs/classes/Wrapper", "api/api_docs/classes/XQuantConfig", "api/api_docs/index", "api/api_docs/methods/get_keras_data_generation_config", "api/api_docs/methods/get_keras_gptq_config", "api/api_docs/methods/get_pytorch_data_generation_config", "api/api_docs/methods/get_pytroch_gptq_config", "api/api_docs/methods/get_target_platform_capabilities", "api/api_docs/methods/get_target_platform_capabilities_sdsp", "api/api_docs/methods/keras_data_generation_experimental", "api/api_docs/methods/keras_gradient_post_training_quantization", "api/api_docs/methods/keras_kpi_data", "api/api_docs/methods/keras_load_quantizad_model", "api/api_docs/methods/keras_post_training_quantization", 
"api/api_docs/methods/keras_pruning_experimental", "api/api_docs/methods/keras_quantization_aware_training_finalize_experimental", "api/api_docs/methods/keras_quantization_aware_training_init_experimental", "api/api_docs/methods/pytorch_data_generation_experimental", "api/api_docs/methods/pytorch_gradient_post_training_quantization", "api/api_docs/methods/pytorch_kpi_data", "api/api_docs/methods/pytorch_post_training_quantization", "api/api_docs/methods/pytorch_pruning_experimental", "api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental", "api/api_docs/methods/pytorch_quantization_aware_training_init_experimental", "api/api_docs/methods/set_logger_path", "api/api_docs/methods/xquant_report_keras_experimental", "api/api_docs/methods/xquant_report_pytorch_experimental", "api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental", "api/api_docs/modules/core_config", "api/api_docs/modules/debug_config", "api/api_docs/modules/exporter", "api/api_docs/modules/layer_filters", "api/api_docs/modules/network_editor", "api/api_docs/modules/qat_config", "api/api_docs/modules/target_platform_capabilities", "api/api_docs/modules/trainable_infrastructure", "api/api_docs/notes/tpc_note", "guidelines/XQuant_Extension_Tool", "guidelines/visualization", "index"], "envversion": {"sphinx": 64, "sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2}, "filenames": ["api/api_docs/classes/BitWidthConfig.rst", "api/api_docs/classes/DataGenerationConfig.rst", "api/api_docs/classes/DefaultDict.rst", "api/api_docs/classes/FrameworkInfo.rst", "api/api_docs/classes/GradientPTQConfig.rst", "api/api_docs/classes/MixedPrecisionQuantizationConfig.rst", "api/api_docs/classes/PruningConfig.rst", "api/api_docs/classes/PruningInfo.rst", 
"api/api_docs/classes/QuantizationConfig.rst", "api/api_docs/classes/QuantizationErrorMethod.rst", "api/api_docs/classes/ResourceUtilization.rst", "api/api_docs/classes/Wrapper.rst", "api/api_docs/classes/XQuantConfig.rst", "api/api_docs/index.rst", "api/api_docs/methods/get_keras_data_generation_config.rst", "api/api_docs/methods/get_keras_gptq_config.rst", "api/api_docs/methods/get_pytorch_data_generation_config.rst", "api/api_docs/methods/get_pytroch_gptq_config.rst", "api/api_docs/methods/get_target_platform_capabilities.rst", "api/api_docs/methods/get_target_platform_capabilities_sdsp.rst", "api/api_docs/methods/keras_data_generation_experimental.rst", "api/api_docs/methods/keras_gradient_post_training_quantization.rst", "api/api_docs/methods/keras_kpi_data.rst", "api/api_docs/methods/keras_load_quantizad_model.rst", "api/api_docs/methods/keras_post_training_quantization.rst", "api/api_docs/methods/keras_pruning_experimental.rst", "api/api_docs/methods/keras_quantization_aware_training_finalize_experimental.rst", "api/api_docs/methods/keras_quantization_aware_training_init_experimental.rst", "api/api_docs/methods/pytorch_data_generation_experimental.rst", "api/api_docs/methods/pytorch_gradient_post_training_quantization.rst", "api/api_docs/methods/pytorch_kpi_data.rst", "api/api_docs/methods/pytorch_post_training_quantization.rst", "api/api_docs/methods/pytorch_pruning_experimental.rst", "api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental.rst", "api/api_docs/methods/pytorch_quantization_aware_training_init_experimental.rst", "api/api_docs/methods/set_logger_path.rst", "api/api_docs/methods/xquant_report_keras_experimental.rst", "api/api_docs/methods/xquant_report_pytorch_experimental.rst", "api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental.rst", "api/api_docs/modules/core_config.rst", "api/api_docs/modules/debug_config.rst", "api/api_docs/modules/exporter.rst", "api/api_docs/modules/layer_filters.rst", 
"api/api_docs/modules/network_editor.rst", "api/api_docs/modules/qat_config.rst", "api/api_docs/modules/target_platform_capabilities.rst", "api/api_docs/modules/trainable_infrastructure.rst", "api/api_docs/notes/tpc_note.rst", "guidelines/XQuant_Extension_Tool.rst", "guidelines/visualization.rst", "index.rst"], "indexentries": {"add_metadata (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.add_metadata", false]], "attributefilter (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.AttributeFilter", false]], "attributequantizationconfig (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig", false]], "base_config (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.quantizationconfigoptions attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions.base_config", false]], "basekerastrainablequantizer (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.BaseKerasTrainableQuantizer", false]], "basepytorchtrainablequantizer (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.BasePytorchTrainableQuantizer", false]], "batchnormalignemntlosstype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.BatchNormAlignemntLossType", false]], "bit_width (model_compression_toolkit.core.common.quantization.bit_width_config.manualbitwidthselection attribute)": [[0, 
"model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection.bit_width", false]], "bitwidthconfig (class in model_compression_toolkit.core)": [[0, "model_compression_toolkit.core.BitWidthConfig", false]], "bnlayerweightingtype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.BNLayerWeightingType", false]], "changecandidatesactivationquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesActivationQuantConfigAttr", false]], "changecandidatesactivationquantizationmethod (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesActivationQuantizationMethod", false]], "changecandidatesweightsquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesWeightsQuantConfigAttr", false]], "changecandidatesweightsquantizationmethod (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesWeightsQuantizationMethod", false]], "changefinalactivationquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeFinalActivationQuantConfigAttr", false]], "changefinalweightsquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeFinalWeightsQuantConfigAttr", false]], "changefinalweightsquantizationmethod (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeFinalWeightsQuantizationMethod", false]], "changequantizationparamfunction (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeQuantizationParamFunction", 
false]], "channelaxis (class in model_compression_toolkit.core)": [[3, "model_compression_toolkit.core.ChannelAxis", false]], "channels_filtering_strategy (model_compression_toolkit.pruning.pruningconfig attribute)": [[6, "model_compression_toolkit.pruning.PruningConfig.channels_filtering_strategy", false]], "channelsfilteringstrategy (class in model_compression_toolkit.pruning)": [[6, "model_compression_toolkit.pruning.ChannelsFilteringStrategy", false]], "coreconfig (class in model_compression_toolkit.core)": [[39, "model_compression_toolkit.core.CoreConfig", false]], "datagenerationconfig (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.DataGenerationConfig", false]], "datainittype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.DataInitType", false]], "debugconfig (class in model_compression_toolkit.core)": [[40, "model_compression_toolkit.core.DebugConfig", false]], "default_qco (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.default_qco", false]], "defaultdict (class in model_compression_toolkit)": [[2, "model_compression_toolkit.DefaultDict", false]], "editrule (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.EditRule", false]], "enable_weights_quantization (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.enable_weights_quantization", false]], "eq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.Eq", false]], "filter 
(model_compression_toolkit.core.common.quantization.bit_width_config.manualbitwidthselection attribute)": [[0, "model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection.filter", false]], "frameworkinfo (class in model_compression_toolkit.core)": [[3, "model_compression_toolkit.core.FrameworkInfo", false]], "fuse_op_quantization_config (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.fusing attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing.fuse_op_quantization_config", false]], "fusing (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing", false]], "fusing_patterns (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.fusing_patterns", false]], "get() (model_compression_toolkit.defaultdict method)": [[2, "model_compression_toolkit.DefaultDict.get", false]], "get_keras_data_generation_config() (in module model_compression_toolkit.data_generation)": [[14, "model_compression_toolkit.data_generation.get_keras_data_generation_config", false]], "get_keras_gptq_config() (in module model_compression_toolkit.gptq)": [[15, "model_compression_toolkit.gptq.get_keras_gptq_config", false]], "get_pytorch_data_generation_config() (in module model_compression_toolkit.data_generation)": [[16, "model_compression_toolkit.data_generation.get_pytorch_data_generation_config", false]], "get_pytorch_gptq_config() (in module model_compression_toolkit.gptq)": [[17, "model_compression_toolkit.gptq.get_pytorch_gptq_config", false]], "get_target_platform_capabilities() (in module model_compression_toolkit)": [[18, 
"model_compression_toolkit.get_target_platform_capabilities", false]], "get_target_platform_capabilities_sdsp() (in module model_compression_toolkit)": [[19, "model_compression_toolkit.get_target_platform_capabilities_sdsp", false]], "gptqhessianscoresconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.GPTQHessianScoresConfig", false]], "gradientptqconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.GradientPTQConfig", false]], "gradualactivationquantizationconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.GradualActivationQuantizationConfig", false]], "greater (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.Greater", false]], "greatereq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.GreaterEq", false]], "imagegranularity (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.ImageGranularity", false]], "imagenormalizationtype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.ImageNormalizationType", false]], "imagepipelinetype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.ImagePipelineType", false]], "importance_metric (model_compression_toolkit.pruning.pruningconfig attribute)": [[6, "model_compression_toolkit.pruning.PruningConfig.importance_metric", false]], "importance_scores (model_compression_toolkit.pruning.pruninginfo property)": [[7, "model_compression_toolkit.pruning.PruningInfo.importance_scores", false]], "importancemetric (class in model_compression_toolkit.pruning)": [[6, "model_compression_toolkit.pruning.ImportanceMetric", false]], "insert_preserving_quantizers 
(model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.insert_preserving_quantizers", false]], "is_simd_padding (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.is_simd_padding", false]], "keras_data_generation_experimental() (in module model_compression_toolkit.data_generation)": [[20, "model_compression_toolkit.data_generation.keras_data_generation_experimental", false]], "keras_export_model (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.keras_export_model", false]], "keras_gradient_post_training_quantization() (in module model_compression_toolkit.gptq)": [[21, "model_compression_toolkit.gptq.keras_gradient_post_training_quantization", false]], "keras_load_quantized_model() (in module model_compression_toolkit)": [[23, "model_compression_toolkit.keras_load_quantized_model", false]], "keras_post_training_quantization() (in module model_compression_toolkit.ptq)": [[24, "model_compression_toolkit.ptq.keras_post_training_quantization", false]], "keras_pruning_experimental() (in module model_compression_toolkit.pruning)": [[25, "model_compression_toolkit.pruning.keras_pruning_experimental", false]], "keras_quantization_aware_training_finalize_experimental() (in module model_compression_toolkit.qat)": [[26, "model_compression_toolkit.qat.keras_quantization_aware_training_finalize_experimental", false]], "keras_quantization_aware_training_init_experimental() (in module model_compression_toolkit.qat)": [[27, "model_compression_toolkit.qat.keras_quantization_aware_training_init_experimental", false]], "keras_resource_utilization_data() (in module 
model_compression_toolkit.core)": [[22, "model_compression_toolkit.core.keras_resource_utilization_data", false]], "kerasexportserializationformat (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.KerasExportSerializationFormat", false]], "keys() (model_compression_toolkit.defaultdict method)": [[2, "model_compression_toolkit.DefaultDict.keys", false]], "lut_values_bitwidth (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.lut_values_bitwidth", false]], "manual_activation_bit_width_selection_list (model_compression_toolkit.core.bitwidthconfig attribute)": [[0, "model_compression_toolkit.core.BitWidthConfig.manual_activation_bit_width_selection_list", false]], "manual_weights_bit_width_selection_list (model_compression_toolkit.core.bitwidthconfig attribute)": [[0, "model_compression_toolkit.core.BitWidthConfig.manual_weights_bit_width_selection_list", false]], "manualbitwidthselection (class in model_compression_toolkit.core.common.quantization.bit_width_config)": [[0, "model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection", false]], "mctwrapper (class in model_compression_toolkit.wrapper.mct_wrapper)": [[11, "model_compression_toolkit.wrapper.mct_wrapper.MCTWrapper", false]], "mixedprecisionquantizationconfig (class in model_compression_toolkit.core)": [[5, "model_compression_toolkit.core.MixedPrecisionQuantizationConfig", false]], "mpdistanceweighting (class in model_compression_toolkit.core)": [[5, "model_compression_toolkit.core.MpDistanceWeighting", false]], "mpmetricnormalization (class in model_compression_toolkit.core)": [[5, "model_compression_toolkit.core.MpMetricNormalization", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.fusing attribute)": 
[[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing.name", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsetgroup attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup.name", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsset attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet.name", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.name", false]], "nodenamefilter (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.NodeNameFilter", false]], "nodenamescopefilter (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.NodeNameScopeFilter", false]], "nodetypefilter (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.NodeTypeFilter", false]], "noteq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.NotEq", false]], "num_score_approximations (model_compression_toolkit.pruning.pruningconfig attribute)": [[6, "model_compression_toolkit.pruning.PruningConfig.num_score_approximations", false]], "operator_groups (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.fusing attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing.operator_groups", false]], "operator_set 
(model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.operator_set", false]], "operators_set (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsetgroup attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup.operators_set", false]], "operatorsetgroup (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup", false]], "operatorsset (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet", false]], "opquantizationconfig (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OpQuantizationConfig", false]], "outputlosstype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.OutputLossType", false]], "pruning_masks (model_compression_toolkit.pruning.pruninginfo property)": [[7, "model_compression_toolkit.pruning.PruningInfo.pruning_masks", false]], "pruningconfig (class in model_compression_toolkit.pruning)": [[6, "model_compression_toolkit.pruning.PruningConfig", false]], "pruninginfo (class in model_compression_toolkit.pruning)": [[7, "model_compression_toolkit.pruning.PruningInfo", false]], "pytorch_data_generation_experimental() (in module model_compression_toolkit.data_generation)": [[28, "model_compression_toolkit.data_generation.pytorch_data_generation_experimental", false]], "pytorch_export_model (class in 
model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.pytorch_export_model", false]], "pytorch_gradient_post_training_quantization() (in module model_compression_toolkit.gptq)": [[29, "model_compression_toolkit.gptq.pytorch_gradient_post_training_quantization", false]], "pytorch_post_training_quantization() (in module model_compression_toolkit.ptq)": [[31, "model_compression_toolkit.ptq.pytorch_post_training_quantization", false]], "pytorch_pruning_experimental() (in module model_compression_toolkit.pruning)": [[32, "model_compression_toolkit.pruning.pytorch_pruning_experimental", false]], "pytorch_quantization_aware_training_finalize_experimental() (in module model_compression_toolkit.qat)": [[33, "model_compression_toolkit.qat.pytorch_quantization_aware_training_finalize_experimental", false]], "pytorch_quantization_aware_training_init_experimental() (in module model_compression_toolkit.qat)": [[34, "model_compression_toolkit.qat.pytorch_quantization_aware_training_init_experimental", false]], "pytorch_resource_utilization_data() (in module model_compression_toolkit.core)": [[30, "model_compression_toolkit.core.pytorch_resource_utilization_data", false]], "pytorchexportserializationformat (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.PytorchExportSerializationFormat", false]], "qatconfig (class in model_compression_toolkit.qat)": [[44, "model_compression_toolkit.qat.QATConfig", false]], "qc_options (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsset attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet.qc_options", false]], "qfractionlinearannealingconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.QFractionLinearAnnealingConfig", false]], "quantization_configurations 
(model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.quantizationconfigoptions attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions.quantization_configurations", false]], "quantizationconfig (class in model_compression_toolkit.core)": [[8, "model_compression_toolkit.core.QuantizationConfig", false]], "quantizationconfigoptions (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions", false]], "quantizationerrormethod (class in model_compression_toolkit.core)": [[9, "model_compression_toolkit.core.QuantizationErrorMethod", false]], "quantizationformat (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.QuantizationFormat", false]], "quantizationmethod (class in model_compression_toolkit.target_platform_capabilities)": [[45, "model_compression_toolkit.target_platform_capabilities.QuantizationMethod", false]], "quantize_and_export() (model_compression_toolkit.wrapper.mct_wrapper.mctwrapper method)": [[11, "model_compression_toolkit.wrapper.mct_wrapper.MCTWrapper.quantize_and_export", false]], "resourceutilization (class in model_compression_toolkit.core)": [[10, "model_compression_toolkit.core.ResourceUtilization", false]], "roundingtype (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.RoundingType", false]], "schedulertype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.SchedulerType", false]], "schema_version (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.SCHEMA_VERSION", false]], "set_log_folder() (in 
module model_compression_toolkit)": [[35, "model_compression_toolkit.set_log_folder", false]], "smaller (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.Smaller", false]], "smallereq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.SmallerEq", false]], "targetplatformcapabilities (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities", false]], "tpc_minor_version (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.tpc_minor_version", false]], "tpc_patch_version (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.tpc_patch_version", false]], "tpc_platform_type (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.tpc_platform_type", false]], "trainablequantizeractivationconfig (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.TrainableQuantizerActivationConfig", false]], "trainablequantizerweightsconfig (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.TrainableQuantizerWeightsConfig", false]], "trainingmethod (class in model_compression_toolkit.trainable_infrastructure)": 
[[46, "model_compression_toolkit.trainable_infrastructure.TrainingMethod", false]], "type (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsset attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet.type", false]], "weights_n_bits (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_n_bits", false]], "weights_per_channel_threshold (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold", false]], "weights_quantization_method (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_quantization_method", false]], "xquant_report_keras_experimental() (in module model_compression_toolkit.xquant.keras.facade_xquant_report)": [[36, "model_compression_toolkit.xquant.keras.facade_xquant_report.xquant_report_keras_experimental", false]], "xquant_report_pytorch_experimental() (in module model_compression_toolkit.xquant.pytorch.facade_xquant_report)": [[37, "model_compression_toolkit.xquant.pytorch.facade_xquant_report.xquant_report_pytorch_experimental", false]], "xquant_report_troubleshoot_pytorch_experimental() (in module model_compression_toolkit.xquant.pytorch.facade_xquant_report)": [[38, "model_compression_toolkit.xquant.pytorch.facade_xquant_report.xquant_report_troubleshoot_pytorch_experimental", false]], "xquantconfig (class in model_compression_toolkit.xquant.common.xquant_config)": 
[[12, "model_compression_toolkit.xquant.common.xquant_config.XQuantConfig", false]]}, "objects": {"model_compression_toolkit": [[2, 0, 1, "", "DefaultDict"], [18, 3, 1, "", "get_target_platform_capabilities"], [19, 3, 1, "", "get_target_platform_capabilities_sdsp"], [23, 3, 1, "", "keras_load_quantized_model"], [35, 3, 1, "", "set_log_folder"]], "model_compression_toolkit.DefaultDict": [[2, 1, 1, "", "get"], [2, 1, 1, "", "keys"]], "model_compression_toolkit.core": [[0, 0, 1, "", "BitWidthConfig"], [3, 0, 1, "", "ChannelAxis"], [39, 0, 1, "", "CoreConfig"], [40, 0, 1, "", "DebugConfig"], [3, 0, 1, "", "FrameworkInfo"], [5, 0, 1, "", "MixedPrecisionQuantizationConfig"], [5, 0, 1, "", "MpDistanceWeighting"], [5, 0, 1, "", "MpMetricNormalization"], [8, 0, 1, "", "QuantizationConfig"], [9, 0, 1, "", "QuantizationErrorMethod"], [10, 0, 1, "", "ResourceUtilization"], [22, 3, 1, "", "keras_resource_utilization_data"], [30, 3, 1, "", "pytorch_resource_utilization_data"]], "model_compression_toolkit.core.BitWidthConfig": [[0, 2, 1, "", "manual_activation_bit_width_selection_list"], [0, 2, 1, "", "manual_weights_bit_width_selection_list"]], "model_compression_toolkit.core.common.quantization.bit_width_config": [[0, 0, 1, "", "ManualBitWidthSelection"]], "model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection": [[0, 2, 1, "", "bit_width"], [0, 2, 1, "", "filter"]], "model_compression_toolkit.core.network_editor": [[43, 0, 1, "", "ChangeCandidatesActivationQuantConfigAttr"], [43, 0, 1, "", "ChangeCandidatesActivationQuantizationMethod"], [43, 0, 1, "", "ChangeCandidatesWeightsQuantConfigAttr"], [43, 0, 1, "", "ChangeCandidatesWeightsQuantizationMethod"], [43, 0, 1, "", "ChangeFinalActivationQuantConfigAttr"], [43, 0, 1, "", "ChangeFinalWeightsQuantConfigAttr"], [43, 0, 1, "", "ChangeFinalWeightsQuantizationMethod"], [43, 0, 1, "", "ChangeQuantizationParamFunction"], [43, 0, 1, "", "EditRule"], [43, 0, 1, "", "NodeNameFilter"], [43, 0, 1, 
"", "NodeNameScopeFilter"], [43, 0, 1, "", "NodeTypeFilter"]], "model_compression_toolkit.data_generation": [[1, 0, 1, "", "BNLayerWeightingType"], [1, 0, 1, "", "BatchNormAlignemntLossType"], [1, 0, 1, "", "DataGenerationConfig"], [1, 0, 1, "", "DataInitType"], [1, 0, 1, "", "ImageGranularity"], [1, 0, 1, "", "ImageNormalizationType"], [1, 0, 1, "", "ImagePipelineType"], [1, 0, 1, "", "OutputLossType"], [1, 0, 1, "", "SchedulerType"], [14, 3, 1, "", "get_keras_data_generation_config"], [16, 3, 1, "", "get_pytorch_data_generation_config"], [20, 3, 1, "", "keras_data_generation_experimental"], [28, 3, 1, "", "pytorch_data_generation_experimental"]], "model_compression_toolkit.exporter": [[41, 0, 1, "", "KerasExportSerializationFormat"], [41, 0, 1, "", "PytorchExportSerializationFormat"], [41, 0, 1, "", "QuantizationFormat"], [41, 0, 1, "", "keras_export_model"], [41, 0, 1, "", "pytorch_export_model"]], "model_compression_toolkit.gptq": [[4, 0, 1, "", "GPTQHessianScoresConfig"], [4, 0, 1, "", "GradientPTQConfig"], [4, 0, 1, "", "GradualActivationQuantizationConfig"], [4, 0, 1, "", "QFractionLinearAnnealingConfig"], [4, 0, 1, "", "RoundingType"], [15, 3, 1, "", "get_keras_gptq_config"], [17, 3, 1, "", "get_pytorch_gptq_config"], [21, 3, 1, "", "keras_gradient_post_training_quantization"], [29, 3, 1, "", "pytorch_gradient_post_training_quantization"]], "model_compression_toolkit.pruning": [[6, 0, 1, "", "ChannelsFilteringStrategy"], [6, 0, 1, "", "ImportanceMetric"], [6, 0, 1, "", "PruningConfig"], [7, 0, 1, "", "PruningInfo"], [25, 3, 1, "", "keras_pruning_experimental"], [32, 3, 1, "", "pytorch_pruning_experimental"]], "model_compression_toolkit.pruning.PruningConfig": [[6, 2, 1, "", "channels_filtering_strategy"], [6, 2, 1, "", "importance_metric"], [6, 2, 1, "", "num_score_approximations"]], "model_compression_toolkit.pruning.PruningInfo": [[7, 4, 1, "", "importance_scores"], [7, 4, 1, "", "pruning_masks"]], "model_compression_toolkit.ptq": [[24, 3, 1, "", 
"keras_post_training_quantization"], [31, 3, 1, "", "pytorch_post_training_quantization"]], "model_compression_toolkit.qat": [[44, 0, 1, "", "QATConfig"], [26, 3, 1, "", "keras_quantization_aware_training_finalize_experimental"], [27, 3, 1, "", "keras_quantization_aware_training_init_experimental"], [33, 3, 1, "", "pytorch_quantization_aware_training_finalize_experimental"], [34, 3, 1, "", "pytorch_quantization_aware_training_init_experimental"]], "model_compression_toolkit.target_platform_capabilities": [[42, 0, 1, "", "AttributeFilter"], [42, 0, 1, "", "Eq"], [42, 0, 1, "", "Greater"], [42, 0, 1, "", "GreaterEq"], [42, 0, 1, "", "NotEq"], [45, 0, 1, "", "QuantizationMethod"], [42, 0, 1, "", "Smaller"], [42, 0, 1, "", "SmallerEq"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema": [[45, 0, 1, "", "AttributeQuantizationConfig"], [45, 0, 1, "", "Fusing"], [45, 0, 1, "", "OpQuantizationConfig"], [45, 0, 1, "", "OperatorSetGroup"], [45, 0, 1, "", "OperatorsSet"], [45, 0, 1, "", "QuantizationConfigOptions"], [45, 0, 1, "", "TargetPlatformCapabilities"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig": [[45, 2, 1, "", "enable_weights_quantization"], [45, 2, 1, "", "lut_values_bitwidth"], [45, 2, 1, "", "weights_n_bits"], [45, 2, 1, "", "weights_per_channel_threshold"], [45, 2, 1, "", "weights_quantization_method"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing": [[45, 2, 1, "", "fuse_op_quantization_config"], [45, 2, 1, "", "name"], [45, 2, 1, "", "operator_groups"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup": [[45, 2, 1, "", "name"], [45, 2, 1, "", "operators_set"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet": [[45, 2, 1, "", "name"], [45, 2, 1, "", "qc_options"], [45, 2, 1, "", "type"]], 
"model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions": [[45, 2, 1, "", "base_config"], [45, 2, 1, "", "quantization_configurations"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities": [[45, 2, 1, "", "SCHEMA_VERSION"], [45, 2, 1, "", "add_metadata"], [45, 2, 1, "", "default_qco"], [45, 2, 1, "", "fusing_patterns"], [45, 2, 1, "", "insert_preserving_quantizers"], [45, 2, 1, "", "is_simd_padding"], [45, 2, 1, "", "name"], [45, 2, 1, "", "operator_set"], [45, 2, 1, "", "tpc_minor_version"], [45, 2, 1, "", "tpc_patch_version"], [45, 2, 1, "", "tpc_platform_type"]], "model_compression_toolkit.trainable_infrastructure": [[46, 0, 1, "", "BaseKerasTrainableQuantizer"], [46, 0, 1, "", "BasePytorchTrainableQuantizer"], [46, 0, 1, "", "TrainableQuantizerActivationConfig"], [46, 0, 1, "", "TrainableQuantizerWeightsConfig"], [46, 0, 1, "", "TrainingMethod"]], "model_compression_toolkit.wrapper.mct_wrapper": [[11, 0, 1, "", "MCTWrapper"]], "model_compression_toolkit.wrapper.mct_wrapper.MCTWrapper": [[11, 1, 1, "", "quantize_and_export"]], "model_compression_toolkit.xquant.common.xquant_config": [[12, 0, 1, "", "XQuantConfig"]], "model_compression_toolkit.xquant.keras.facade_xquant_report": [[36, 3, 1, "", "xquant_report_keras_experimental"]], "model_compression_toolkit.xquant.pytorch.facade_xquant_report": [[37, 3, 1, "", "xquant_report_pytorch_experimental"], [38, 3, 1, "", "xquant_report_troubleshoot_pytorch_experimental"]]}, "objnames": {"0": ["py", "class", "Python class"], "1": ["py", "method", "Python method"], "2": ["py", "attribute", "Python attribute"], "3": ["py", "function", "Python function"], "4": ["py", "property", "Python property"]}, "objtypes": {"0": "py:class", "1": "py:method", "2": "py:attribute", "3": "py:function", "4": "py:property"}, "terms": {"": [3, 6, 8, 10, 21, 24, 25, 26, 27, 29, 31, 32, 34, 35, 41, 42, 43, 45, 46, 48, 50], "0": [1, 3, 
4, 5, 7, 8, 11, 12, 14, 16, 21, 24, 25, 26, 27, 32, 40, 41, 46, 48], "05": 8, "06": 5, "08153": 46, "1": [1, 3, 4, 5, 7, 8, 11, 12, 17, 20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32, 33, 40, 41, 48, 50], "10": [20, 21, 24, 27, 28, 29, 31, 34], "100": 40, "10000000000": 5, "14": 11, "15": 41, "16": [12, 41, 48], "1902": 46, "1e": [5, 15, 17], "1st": 15, "2": [3, 8, 12, 15, 17, 20, 28, 40, 45, 46, 48, 50], "20": 49, "2021": 50, "2023": 50, "224": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "2f": 40, "2nd": 15, "3": [3, 11, 15, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 41, 46], "32": [4, 5, 11], "3e": [15, 17], "3rd": 15, "4": [15, 17, 20, 21, 24, 25, 27, 28, 29, 31, 32, 34, 48], "4th": 15, "5": [11, 12, 15, 17, 25, 32, 48], "50": [25, 32], "52587890625e": 8, "6": [28, 40], "75": [11, 21, 24, 26, 27], "8": [20, 21, 24, 26, 27, 28, 41, 45, 46], "9": 43, "A": [0, 3, 4, 5, 7, 8, 13, 15, 17, 21, 22, 23, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 39, 40, 43, 44, 45, 50], "And": 48, "As": [5, 48, 49], "By": [4, 5, 11, 25, 29, 31, 32, 41, 49], "For": [3, 8, 12, 18, 19, 20, 21, 24, 26, 27, 28, 34, 41, 45, 46, 47, 48, 49, 50], "If": [2, 3, 4, 5, 12, 15, 17, 21, 24, 26, 27, 29, 31, 39, 40, 41, 42, 45, 48], "In": [5, 20, 21, 24, 27, 28, 29, 31, 34, 41, 42, 44, 48], "It": [2, 11, 12, 45, 46, 48], "No": 1, "One": 49, "The": [0, 1, 3, 4, 5, 6, 7, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 34, 36, 37, 38, 40, 41, 43, 45, 46, 48, 49], "Then": [3, 21, 24, 27, 29, 31, 34, 43, 49], "There": [41, 48, 49], "These": [48, 49], "To": [41, 48, 49], "With": 48, "_": [21, 24, 27, 29, 31, 34, 41], "__call__": 40, "__import__": 40, "__init__": 40, "_input_data": 41, "_model_input_nam": 41, "_model_output_nam": 41, "_with_model_output_loss_object": 48, "about": [3, 4, 7, 13, 15, 17, 21, 24, 26, 27, 40, 41, 45, 46], "abov": [12, 48], "absolut": 9, "abstract": [13, 46], "accept": [15, 40, 45], "access": 7, "accord": [13, 21, 22, 24, 
25, 27, 29, 30, 31, 32, 34, 41, 42], "accordingli": 45, "accuraci": [12, 48], "achiev": 25, "act": 7, "act_hessian_default_batch_s": [15, 17], "action": 40, "activ": [0, 3, 4, 5, 8, 10, 11, 21, 22, 24, 27, 29, 30, 31, 34, 41, 43, 44, 45, 46, 48, 49], "activation_bias_correct": 8, "activation_bias_correction_threshold": 8, "activation_channel_equ": 8, "activation_error_method": [8, 11], "activation_memori": 10, "activation_min_max_map": 3, "activation_n_bit": [45, 46], "activation_op": 3, "activation_quantization_candid": 46, "activation_quantization_method": [43, 45, 46], "activation_quantization_param": 46, "activation_quantization_params_fn": 43, "activation_quantizer_map": 3, "activation_quantizer_params_overrid": 44, "activation_training_method": 44, "ad": 45, "adam": [14, 15, 17], "add": [1, 3, 12, 14, 16, 23, 46], "add_metadata": 45, "addit": [23, 41, 48], "address": 45, "advanc": 3, "affect": [21, 24, 26, 27], "after": [13, 21, 23, 24, 27, 34, 48, 50], "aim": [25, 32], "algorithm": 5, "align": [1, 14, 16], "all": [1, 3, 4, 5, 8, 43, 46, 49], "allimag": [1, 16], "allow": [6, 12, 20, 28, 40, 41, 45], "along": 49, "also": [25, 32, 45], "an": [1, 2, 3, 4, 7, 11, 13, 21, 24, 27, 34, 36, 37, 38, 40, 41, 42, 43, 45, 46, 48, 50], "analysi": [25, 32], "analyz": [25, 32, 38], "analyze_similar": 40, "ani": [1, 2, 3, 5, 11, 36, 37, 38, 41, 42, 46], "anneal": 4, "api": [3, 4, 24, 27, 34, 44, 48], "append": 40, "appli": [0, 1, 5, 8, 13, 41, 42, 43, 45, 48], "applic": [21, 22, 24, 25, 26, 27, 41], "approach": 6, "appropri": 48, "approxim": [6, 25, 32], "ar": [3, 5, 12, 18, 19, 21, 24, 25, 27, 29, 31, 32, 34, 40, 41, 45, 46, 47, 48, 49], "architectur": [25, 32], "argument": [4, 40, 41, 45], "arrai": [7, 11], "art": 50, "arxiv": [46, 50], "assess": [25, 32], "associ": [25, 32], "assum": [25, 32], "astyp": 41, "attent": [4, 15, 17, 46], "attirbut": 3, "attr": 42, "attr_nam": 43, "attr_valu": 43, "attr_weights_configs_map": 45, "attribut": [43, 45, 46], "attributefilt": 42, 
"auto": 13, "automat": 48, "auxiliari": [15, 17], "avail": 41, "averag": [1, 5, 14, 15, 16, 17, 48], "avg": 5, "awar": [13, 44, 46, 50], "axi": [3, 46, 48], "backend": 45, "bar": 40, "base": [1, 4, 5, 8, 9, 11, 13, 15, 17, 18, 19, 20, 25, 28, 31, 32, 46, 48, 50], "base_config": 45, "basenod": 7, "basenodematch": 0, "basic": 46, "batch": [1, 4, 5, 14, 15, 16, 17, 20, 21, 24, 27, 28, 29, 31, 34], "batchnorm": [1, 14, 16, 20, 21, 24, 27, 29, 31, 34], "batchnorm2d": 28, "batchnormalignemntlosstyp": [14, 16], "batchwis": [1, 14], "been": [7, 40], "begin": 4, "behavior": [40, 48], "being": [21, 24, 27, 29, 31, 34, 40, 45, 46], "below": [12, 48], "between": [4, 5, 12, 21, 29, 31, 45, 48, 49], "bia": [4, 11, 15, 17, 21, 24, 26, 27], "bidwidth": 5, "bit": [0, 5, 10, 13, 21, 24, 26, 27, 34, 39, 41, 43, 45, 46, 50], "bit_width": 0, "bit_width_config": [0, 39], "bitwidth": [5, 12, 21, 24, 26, 27, 48], "bitwidthconfig": [13, 39], "block": [46, 49], "bn_alignment_loss_typ": [1, 14, 16], "bn_layer_typ": [1, 14, 16], "bnlayerweightingtyp": [14, 16], "bool": [1, 4, 5, 11, 12, 14, 15, 16, 17, 40, 45, 46], "boolean": 23, "bop": 10, "both": [11, 21, 24, 29, 31, 33, 46, 49], "build": [22, 30, 46, 50], "built": [27, 34, 46], "bypass": 40, "byte": [10, 21, 24, 25, 27, 32, 34, 49], "c": [12, 48], "calcul": [5, 6, 13, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 48], "calibr": [11, 21, 22, 24, 27, 29, 30, 31, 34], "call": [22, 30, 35, 45, 49], "callabl": [3, 5, 11, 12, 15, 17, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 36, 37, 38, 40, 41, 42], "callback": 40, "can": [3, 4, 8, 11, 13, 15, 17, 20, 22, 25, 28, 30, 32, 40, 41, 43, 45, 46, 48, 49, 50], "candid": [5, 21, 24, 26, 27, 43], "cannot": 45, "capabl": [11, 18, 19, 25, 30, 32], "case": 5, "caus": [12, 13, 38, 48], "chang": [20, 28, 41, 43, 48, 49], "changecandidatesactivationquantconfigattr": 43, "changecandidatesactivationquantizationmethod": 43, "changecandidatesweightsquantconfigattr": 43, "changecandidatesweightsquantizationmethod": 43, 
"changefinalactivationquantconfigattr": 43, "changefinalweightsquantconfigattr": 43, "changefinalweightsquantizationmethod": 43, "changequantizationmethod": 43, "changequantizationparamfunct": 43, "channel": [3, 6, 7, 13, 25, 32, 45, 46, 49], "channels_filtering_strategi": 6, "check": [5, 41, 42, 43], "choos": [1, 4, 41], "chosen": 49, "circl": 48, "class": [0, 1, 5, 6, 7, 8, 9, 10, 11, 12, 13, 23, 39, 40, 41, 42, 43, 44, 45, 46], "clibrat": 31, "click": 49, "clip": [1, 14, 16], "clone": 50, "coeffici": [3, 21, 24, 26, 27, 29, 31, 45, 46], "cohen": 50, "collaps": 11, "collect": [3, 21, 24, 27, 29, 31, 34, 36, 37, 38, 49], "com": 50, "combin": 45, "common": [0, 12], "compar": [5, 21, 29, 31, 48, 49], "comparison": 50, "compat": 41, "compil": 23, "complet": [4, 11, 40], "completedcompon": 40, "compon": [40, 45, 46, 48], "component_nam": 40, "compress": [11, 13, 20, 25, 28, 29, 32, 48], "comput": [3, 4, 5, 9, 12, 13, 15, 17, 22, 30, 36, 40, 49], "compute_distance_fn": 5, "concat_threshold_upd": 8, "concaten": [12, 45, 48], "concatn": [12, 48], "config": [4, 20, 21, 24, 25, 26, 27, 28, 29, 32, 33, 34, 39, 43, 46], "configur": [0, 4, 5, 8, 10, 11, 13, 14, 15, 16, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 33, 34, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 48, 50], "configuration_overwrit": 5, "confirm": 48, "connect": 11, "consid": [6, 14, 16, 25, 32, 45], "consol": 48, "constant": [6, 43, 46], "constraint": [21, 24, 25, 29, 31, 32], "contain": [7, 13, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 36, 37, 38, 46, 48], "conv2d": [3, 20, 21, 24, 26, 27, 28, 43, 45], "conveni": 35, "convent": 48, "convert": [11, 13, 26, 33, 45], "core": [0, 3, 5, 8, 9, 10, 11, 21, 22, 24, 25, 26, 27, 29, 30, 32, 33, 34, 39, 40, 43], "core_config": [21, 22, 24, 26, 27, 29, 30, 31, 33, 34, 40], "coreconfig": [13, 21, 22, 24, 26, 27, 29, 30, 31, 33, 34, 40], "correct": 11, "correspond": [7, 48], "cosin": [48, 50], "count_param": [21, 24, 25, 26, 27], "countermeasur": 48, 
"cpuexecutionprovid": 41, "creat": [3, 4, 8, 11, 13, 14, 15, 16, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 41, 42, 43, 45, 48], "creation": 41, "crop": 1, "cudaexecutionprovid": 41, "cui": 40, "current": [4, 40, 41], "currentcompon": 40, "custom": [5, 12, 20, 23, 27, 28, 41], "custom_metric_fn": 5, "custom_object": [23, 26, 27], "custom_similarity_metr": 12, "custom_tpc_opset_to_lay": 8, "cut": 40, "dash": 48, "data": [13, 14, 16, 22, 25, 30, 32, 36, 37, 38, 40, 41, 45, 49, 50], "data_gen_batch_s": [1, 14, 16, 20, 28], "data_gener": [1, 14, 16, 20, 28], "data_generation_config": [20, 28], "data_init_typ": [1, 14, 16], "dataclass": [39, 40], "datagenerationconfig": [1, 13, 20, 28], "datainittyp": [14, 16], "dataset": [4, 11, 15, 17, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 41, 48, 49], "debug": [39, 40], "debug_config": 39, "debugconfig": 39, "deeper": 49, "def": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 40, 41], "default": [1, 2, 4, 5, 6, 11, 14, 15, 16, 17, 21, 24, 25, 29, 31, 32, 39, 41, 44, 45, 49], "default_data_gen_b": [14, 16], "default_factori": 2, "default_keras_extra_pixel": 14, "default_keras_initial_lr": 14, "default_keras_output_loss_multipli": 14, "default_keras_tpc": [21, 24, 25, 27], "default_n_it": [14, 16], "default_onnx_opset_vers": 41, "default_pytorch_bn_layer_typ": 16, "default_pytorch_extra_pixel": 16, "default_pytorch_initial_lr": 16, "default_pytorch_last_layer_typ": 16, "default_pytorch_output_loss_multipli": 16, "default_pytorch_tpc": [29, 31, 32, 34], "default_qco": 45, "default_valu": 2, "default_weight_attr_config": 45, "defaultdict": [3, 13], "defin": [0, 4, 5, 15, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 40, 45, 46, 48], "degrad": [12, 13, 38, 48], "demonstr": [41, 45], "dens": [3, 20], "dense_nparam": [25, 32], "depend": [1, 21, 24, 27, 29, 31, 34], "describ": 48, "descript": [11, 40], "desir": [13, 21, 22, 24, 26, 27, 29, 30, 31, 34], "detail": [41, 45, 48], "detect": [12, 13, 38, 
48], "determin": [6, 25, 32, 45], "develop": 50, "deviat": 48, "devic": [13, 18], "device_typ": 18, "diagram": 45, "diamant": 50, "dict": [3, 7, 12, 36, 37, 38, 41, 45, 46, 48], "dictionari": [2, 3, 4, 12, 26, 27, 36, 37, 38, 41, 43, 44, 46], "differ": [1, 8, 13, 21, 24, 26, 27, 41, 45, 48, 49], "dikstein": 50, "dir": [12, 48, 49], "directori": [12, 13, 35, 48], "disabl": [15, 17, 40], "displai": [40, 48, 49], "distanc": [5, 11], "distance_weighting_method": [5, 11], "distil": [4, 50], "distribut": 9, "diverg": [9, 49], "divers": 1, "divid": 3, "divis": 49, "dnn": 46, "do": [1, 48, 49], "document": [13, 24, 27, 34, 48], "doe": 48, "doesn": 50, "don": 35, "done": 49, "dot": 49, "dqa": 46, "dror": 50, "dtype": 41, "dummi": 17, "durat": [25, 32], "dure": [4, 13, 14, 15, 16, 17, 18, 19, 36, 37, 38, 41, 43, 45, 46, 47, 49], "e": [3, 5, 11, 21, 24, 27, 29, 31, 34, 50], "each": [5, 6, 7, 12, 21, 24, 25, 27, 29, 31, 32, 34, 43, 45, 46, 48, 49], "easi": 48, "easili": [13, 50], "edit": [39, 40, 43], "editrul": 40, "either": 45, "element": [7, 45], "empti": 2, "emul": 46, "enabl": [1, 5, 8, 11, 13, 15, 17, 40, 46, 50], "enable_activation_quant": [45, 46], "enable_weights_quant": [45, 46], "encapsul": [0, 8], "end_step": 4, "engin": 50, "enhanc": 50, "ensur": 5, "entir": 13, "enum": [1, 3, 4, 6, 9, 46], "epoch": [4, 11, 15, 17], "epsilon": 5, "eptq": 50, "eq": 42, "equal": 42, "er_list": 43, "error": [9, 11, 12, 40], "estim": [4, 46], "etc": [3, 10, 13, 21, 24, 27, 29, 31, 34, 49], "euclidean": 49, "evalu": [5, 36, 37, 38], "even": 48, "exact": 17, "exampl": [3, 8, 11, 15, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 43, 45, 46, 50], "exceed": 48, "execut": 48, "exist": [2, 43, 48], "exp": 5, "exp_distance_weighting_sigma": 5, "expect": [4, 49], "experiment": [13, 20, 28, 50], "explain": [12, 13, 36, 37, 38, 46], "explicitli": 45, "expon": 5, "exponenti": 5, "export": 11, "extend": [25, 32], "extens": [11, 41, 50], "extra": [1, 14, 16], "extra_pixel": [1, 
14, 16], "extrem": 48, "f": 40, "facade_xquant_report": [36, 37, 38], "factor": [4, 5, 9, 15, 17], "factori": [0, 4, 39, 40], "fake": 41, "fake_qu": [27, 34], "fakely_qu": 41, "fallback": 45, "fals": [4, 5, 8, 11, 12, 14, 15, 17, 40, 46], "familiar": 48, "featur": 40, "fetch": 45, "few": [49, 50], "field": [18, 19, 42, 45, 47], "figur": [40, 49], "file": [23, 26, 27, 35, 40, 41], "filepath": 23, "filter": [0, 1, 6], "final": [4, 5, 12, 13, 20, 28, 43, 48, 49, 50], "find": [21, 24, 27, 34], "fine": [15, 17, 25, 26, 27, 32, 33, 34], "first": [1, 21, 24, 27, 29, 31, 34, 41, 49], "first_layer_multipli": 1, "fix": 45, "fixed_scal": [18, 19, 45, 47], "fixed_zero_point": [18, 19, 45, 47], "flag": [1, 11, 40, 45], "flatten": [20, 28], "flip": 1, "float": [1, 4, 5, 11, 12, 14, 15, 16, 17, 21, 27, 29, 31, 34, 36, 37, 38, 41, 45, 46, 48, 49], "float32": [25, 32, 41], "float_model": [11, 36, 37, 38, 41, 48], "flush": 40, "fold": [21, 24, 27, 29, 31, 34], "folder": [35, 48], "follow": [3, 4, 11, 12, 40, 46, 48, 49], "footprint": [25, 32], "form": 45, "format": [3, 13], "fraction": 4, "framework": [3, 11, 46], "frameworkquantizationcap": [22, 29, 30, 31], "free": [6, 20, 25, 28, 32, 50], "freez": 46, "freeze_quant_param": 46, "friendli": [25, 32, 50], "from": [3, 4, 11, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 41, 43, 45, 46, 47, 48, 49, 50], "from_config": 46, "function": [3, 4, 5, 11, 12, 13, 14, 15, 16, 17, 20, 23, 25, 28, 32, 35, 40, 43, 45, 46, 48], "fuse_op_quantization_config": 45, "fusing_pattern": 45, "futur": [18, 19, 20, 28, 45, 47], "g": [3, 11, 21, 24, 27, 29, 31, 34], "gather": [45, 49], "gaussian": [1, 14, 16], "gener": [2, 12, 13, 14, 16, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 45, 49, 50], "generated_imag": [20, 28], "get": [2, 3, 4, 5, 13, 21, 24, 26, 27, 29, 31, 33, 34, 45, 49], "get_config": 46, "get_input": 41, "get_keras_data_generation_config": [13, 14, 20], "get_keras_gptq_config": [11, 13, 15, 21], 
"get_ort_session_opt": 41, "get_output": 41, "get_pytorch_data_generation_config": [13, 16, 28], "get_pytorch_gptq_config": [11, 13, 17], "get_target_platform_cap": [13, 18, 45], "get_target_platform_capabilities_sdsp": [13, 19, 45], "git": 50, "github": [41, 50], "given": [2, 21, 22, 24, 27, 29, 30, 31, 34], "gordon": 50, "gptq": [4, 11, 15, 17, 21, 29, 40], "gptq_conf": [15, 17, 29], "gptq_config": [21, 29, 31], "gptq_quantizer_params_overrid": 4, "gptq_representative_data_gen": [21, 29], "grad": 1, "gradient": [1, 4, 11, 13, 31, 50], "gradientptq": [4, 13], "gradientptqconfig": [13, 21, 29], "gradual": 4, "gradual_activation_quant": [15, 17], "gradual_activation_quantization_config": 4, "gradualactivationquant": [15, 17], "gradualactivationquantizationconfig": [15, 17], "granular": [1, 14, 16], "graph": [22, 30, 43, 49], "greater": 42, "greatereq": 42, "greedi": [5, 6], "group": [3, 6, 25, 32, 45], "h": 50, "ha": [7, 40, 41, 42, 43], "habi": 50, "handl": [11, 21, 24, 27, 29, 31, 34], "handler": 35, "hardwar": [13, 25, 32, 45, 46, 50], "have": [3, 41, 42, 48, 49], "henc": 45, "here": [12, 25, 32, 41, 45, 48, 50], "hessian": [4, 5, 6, 9, 11, 15, 17, 25, 32, 50], "hessian_batch_s": [4, 5, 15, 17], "hessian_weights_config": 4, "hessians_num_sampl": 4, "higher": [25, 32], "highlight": 48, "hight": 28, "histogram": [21, 24, 27, 29, 31, 34, 49], "histori": 40, "hmse": 9, "hold": [3, 39, 42, 45], "holder": 46, "how": [3, 6, 21, 22, 24, 27, 29, 31, 34, 40, 41, 46, 50], "howev": 41, "hptq": [45, 50], "http": [46, 50], "hw": 22, "i": [1, 2, 3, 4, 5, 6, 7, 9, 11, 12, 13, 15, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 34, 35, 39, 40, 41, 42, 43, 45, 46, 48, 49, 50], "ident": [1, 5], "identifi": [25, 32, 45, 48], "ignor": [18, 19, 45, 47], "ilp": [21, 24, 27, 34], "imag": [1, 4, 5, 11, 14, 16, 20, 21, 24, 27, 28, 29, 31, 34, 48, 49], "image_clip": [1, 14, 16], "image_granular": [1, 14, 16], "image_normalization_typ": [1, 14, 16], "image_pipeline_typ": [1, 14, 16], 
"imagegranular": [14, 16], "imagenet": 1, "imagenet1k_v1": 32, "imagenormalizationtyp": [14, 16], "imagepipelinetyp": [14, 16], "imagewis": 1, "impact": [25, 32], "implement": [12, 46], "implment": 46, "import": [3, 6, 7, 8, 11, 13, 15, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 41, 43, 46, 48, 49], "importance_metr": 6, "importance_scor": 7, "improv": [5, 25, 32, 48], "imx500": [11, 41, 45], "imx500_tp_model": 18, "in_model": [21, 22, 24, 26, 27, 30, 33, 34], "in_modul": [31, 48], "includ": [4, 7, 11, 21, 24, 27, 29, 31, 34, 45, 46], "increas": [4, 5], "index": [3, 13], "indic": [3, 7, 25, 32, 45, 48], "individu": 48, "induc": 9, "inf": [8, 10, 11], "infer": [13, 26, 33, 45, 46], "inferablequant": [26, 33], "inferencesess": 41, "info": [6, 35, 40], "inform": [3, 4, 13, 15, 17, 18, 19, 21, 24, 25, 27, 29, 31, 32, 34, 40, 45, 46, 47], "infrastructur": 46, "init": [13, 43, 50], "initi": [1, 2, 4, 6, 11, 12, 14, 16, 27, 34, 46, 48], "initial_lr": [1, 14, 16], "initial_q_fract": 4, "inner": 2, "input": [1, 5, 11, 14, 16, 21, 24, 27, 29, 31, 34, 40, 45, 48], "input_sc": 8, "input_shap": 20, "insert": 49, "insert_preserving_quant": 45, "instal": 41, "instanc": [4, 11, 13, 15, 17, 43, 45, 49], "instanti": [4, 8, 44], "instruct": 45, "insuffici": [12, 48], "int": [0, 1, 4, 5, 6, 12, 14, 15, 16, 17, 20, 28, 35, 40, 41, 45, 46, 48], "int8": 41, "integ": [5, 41, 45], "interest": 5, "interfac": [4, 11, 17], "introduc": 46, "inverse_min_max_diff": 1, "involv": [20, 25, 28, 32], "is_detect_under_threshold_quantize_error": 12, "is_keras_layer_export": 41, "is_layer_exportable_fn": 41, "is_pytorch_layer_export": 41, "is_simd_pad": 45, "issu": [5, 41, 48], "item": 48, "iter": [1, 14, 16, 20, 21, 24, 27, 28, 29, 31, 34, 40], "its": [2, 3, 11, 13, 23, 25, 32, 42, 45, 49], "jen": 50, "judg": [12, 13, 38, 48], "judgment": 48, "just": 50, "keep": [33, 40, 50], "kei": [2, 11, 12, 25, 32, 42], "kept": [7, 27, 34], "ker": 27, "kera": [3, 11, 13, 43, 46, 50], 
"keras_appl": [1, 14], "keras_data_generation_experiment": [13, 20], "keras_default_tpc": 22, "keras_file_path": 41, "keras_gradient_post_training_quant": [13, 15, 21], "keras_load_quantized_model": 23, "keras_post_training_quant": [13, 24, 41, 43, 49], "keras_pruning_experiment": [13, 25], "keras_quantization_aware_training_finalize_experiment": [13, 26], "keras_quantization_aware_training_init_experiment": [13, 26, 27], "keras_resource_utilization_data": [13, 22], "kernel": [3, 21, 24, 26, 27, 43, 46], "kernel_channels_map": 3, "kernel_op": 3, "kernel_ops_attributes_map": 3, "keyword": 45, "kl": [9, 49], "know": [3, 13], "knowledg": [4, 50], "known_dict": 2, "kwarg": 43, "l": [25, 50], "l2": 1, "l2_squar": [1, 14, 16], "l_p_valu": 8, "label": [6, 25, 32, 45, 50], "lambda": 41, "larg": [12, 48], "larger": 5, "last": [3, 4, 5, 48], "last_lay": 5, "last_layer_typ": [1, 16], "latenc": 41, "latest": 50, "launch": 49, "layaer": [13, 38], "layer": [1, 3, 5, 7, 11, 12, 14, 15, 16, 17, 20, 21, 24, 25, 26, 27, 29, 31, 32, 33, 34, 40, 41, 43, 45, 46, 48, 49], "layer_min_max_map": 3, "layer_weighting_typ": [1, 14, 16], "layerfilterparam": 42, "learn": [1, 14, 15, 16, 17, 46], "learnabl": 46, "least": 6, "left": 11, "let": 41, "level": 35, "lfh": [6, 25, 32], "librari": [3, 8], "like": [8, 45], "limit": [6, 21, 24, 26, 27, 29, 31, 34], "line": 48, "linear": [4, 11, 28], "linear_collaps": [8, 11], "linearli": 4, "link": 48, "list": [0, 1, 3, 5, 11, 14, 15, 16, 20, 28, 40, 41, 43, 50], "liter": 45, "ll": [20, 28], "load": [13, 26, 27, 41, 46], "load_model": [26, 27], "loadopt": 23, "log": [4, 12, 13, 15, 17, 35, 48, 49], "log_funct": [4, 15, 17], "log_norm": 4, "log_tensorboard_xqu": 48, "logdir": 49, "logger": [13, 40, 49], "longer": 41, "look": [24, 27, 34, 45, 50], "lookup": 45, "loss": [1, 4, 12, 14, 15, 16, 17, 21, 25, 29, 31, 32, 48], "low": 11, "lp": 9, "lsq": 46, "lut_pot_quant": 45, "lut_sym_quant": 45, "lut_values_bitwidth": 45, "mae": [9, 49], "mai": [20, 21, 24, 27, 
28, 29, 31, 34, 42, 49], "main": [11, 45, 48, 49], "make": [9, 40], "manag": [0, 11], "mandatori": 41, "mani": 49, "manipul": [0, 1], "manner": 45, "manual": [0, 13, 39, 48], "manual_activation_bit_width_selection_list": 0, "manual_weights_bit_width_selection_list": 0, "manualweightsbitwidthselect": 0, "map": [3, 45], "mask": 7, "match": [18, 19, 42, 43], "mathemat": 49, "max": [1, 3, 5, 8, 9, 21, 22, 24, 27, 29, 30, 31, 34, 49], "maxbit": 5, "maxim": [21, 24, 27, 34], "mct": [3, 8, 11, 13, 15, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 39, 40, 41, 43, 45, 46, 47, 48, 49, 50], "mct_current_schema": 45, "mct_quantiz": 41, "mct_wrapper": 11, "mctwrapper": 11, "mean": [1, 4, 9, 49], "measur": [6, 10, 12, 48, 49], "meet": [25, 32], "memori": [10, 25, 32, 49], "messag": 48, "metadata": [7, 45], "method": [4, 5, 6, 9, 11, 13, 25, 32, 35, 41, 43, 44, 45, 46], "metric": [4, 5, 6, 12, 36, 37, 38, 48], "metric_epsilon": 5, "metric_norm": 5, "metric_normalization_threshold": 5, "min": [1, 3, 5, 8, 9, 21, 24, 27, 29, 31, 34, 49], "min_threshold": [8, 46], "minbit": 5, "minim": [5, 9, 21, 25, 29, 31, 32], "minimum": 46, "minor": 45, "minut": 50, "mix": [5, 10, 11, 12, 13, 21, 22, 24, 26, 27, 29, 30, 31, 34, 39, 45, 48, 50], "mixed_precis": 11, "mixed_precision_config": [21, 22, 24, 26, 27, 39], "mixedprecisionquantizationconfig": [11, 13, 21, 22, 24, 26, 27, 39], "mkstemp": 41, "mobilenet": [21, 22], "mobilenet_v2": [24, 26, 27, 29, 30, 31, 33, 34, 41], "mobilenetv2": [24, 26, 27, 41, 49], "model": [3, 4, 5, 7, 8, 10, 11, 12, 13, 18, 19, 20, 21, 24, 25, 28, 29, 31, 32, 36, 37, 38, 39, 40, 43, 44, 45, 46, 48, 49], "model_compression_toolkit": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 48, 49], "model_fil": [26, 27], "model_format_onnx_mctq": 41, "model_mp": 5, "model_output": 41, "modifi": [13, 43], "modul": [13, 28, 29, 
30, 31, 32, 37, 38], "more": [9, 18, 19, 24, 25, 27, 32, 34, 41, 45, 47, 48, 49], "most": 48, "mse": [8, 9, 11, 12, 48, 49], "much": 40, "multipl": [3, 5, 35, 45], "multiple_tensors_mse_loss": 4, "multipli": [1, 12, 14, 16, 48], "must": [25, 32, 45], "n_epoch": [4, 11, 15, 17, 21], "n_imag": [20, 28], "n_iter": [1, 14, 16, 20, 28], "nadam": 15, "name": [12, 40, 43, 45, 48, 49], "nchw": 3, "ndarrai": 7, "necessari": [4, 11, 41, 46, 48], "need": [3, 11, 13, 21, 24, 27, 29, 31, 34, 41, 42, 46, 48], "neg": [1, 5, 48], "negative_min_max_diff": [1, 16], "network": [3, 6, 11, 33, 39, 40, 43, 49, 50], "network_editor": [13, 40], "netzer": 50, "neural": [6, 11, 50], "neuron": 7, "new": [43, 45], "next": [20, 28, 41, 42], "nhwc": 3, "nn": [28, 37, 38], "no_norm": 1, "no_quantization_op": 3, "noclip": [8, 9], "node": [0, 27, 34, 41, 43, 46, 49], "node_nam": 43, "node_name_scop": 43, "node_typ": 43, "nodenamefilt": 43, "nodenamescopefilt": 43, "nodetypefilt": 43, "nois": 9, "non": [5, 15, 17, 45], "none": [1, 2, 4, 5, 8, 11, 12, 15, 17, 21, 23, 24, 27, 29, 31, 34, 35, 39, 40, 41, 43, 44, 45, 46], "norm": [9, 49], "norm_scor": [4, 5], "normal": [1, 4, 5, 14, 16], "note": [21, 24, 26, 27], "notebook": 50, "noteq": 42, "notic": [20, 25, 28, 32, 41], "now": [6, 18, 19, 34, 41, 45, 46, 47, 49], "np": [7, 11, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "num_calibration_batch": [21, 24, 27, 29, 31, 34], "num_interest_points_factor": 5, "num_of_imag": [5, 11, 21, 24], "num_score_approxim": [6, 25, 32], "number": [1, 4, 5, 6, 11, 12, 14, 15, 16, 17, 20, 21, 24, 25, 27, 28, 29, 31, 32, 34, 40, 45, 46, 48], "numel": 32, "numer": 5, "numpi": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "o": 50, "object": [0, 3, 4, 5, 6, 10, 12, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 26, 27, 29, 30, 31, 34, 41, 43, 45, 46, 48], "observ": [21, 29, 31, 45, 49], "one": [5, 42, 49], "onli": [3, 4, 5, 6, 12, 21, 24, 26, 27, 41, 45], "onlin": [27, 34], "onnx": 11, "onnx_file_path": 41, 
"onnx_opset_vers": 41, "onnxruntim": 41, "op": [42, 45], "open": [41, 49, 50], "oper": [3, 10, 40, 42, 45], "operator_group": 45, "operator_set": 45, "operators_set": 45, "operatorsetnam": 45, "opquantizationconfig": [18, 19, 47], "optim": [1, 3, 4, 10, 11, 13, 14, 15, 16, 17, 18, 19, 21, 22, 24, 27, 29, 30, 31, 34, 39, 45, 46, 47, 50], "optimizer_bia": 4, "optimizer_quantization_paramet": 4, "optimizer_rest": [4, 15, 17], "optimizerv2": 15, "option": [11, 13, 21, 23, 24, 25, 27, 29, 31, 32, 34, 41, 45], "order": [15, 17, 21, 24, 27, 34, 40, 41, 42, 44], "org": 46, "orient": [13, 46], "origin": [25, 35, 36, 37, 38, 49], "ort": 41, "other": [1, 11, 15, 17, 48], "otherwis": 45, "our": [21, 24, 26, 27, 34, 50], "out": [3, 6], "out1": 50, "out2": 50, "out3": 50, "out_channel_axis_map": 3, "outlier": [12, 48], "output": [1, 3, 12, 14, 16, 20, 21, 24, 27, 28, 29, 31, 33, 34, 40, 45, 48, 49, 50], "output_image_s": [20, 28], "output_loss_multipli": [1, 14, 16], "output_loss_typ": [1, 14, 16], "output_nam": 41, "outputlosstyp": [14, 16], "over": 5, "overrid": [4, 44], "overwrit": 5, "p": 32, "packag": [41, 46, 50], "pad": 45, "page": 13, "pair": 49, "param": [17, 40, 43, 46], "param_item": 11, "paramet": [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46], "pars": 45, "part": 41, "pass": [2, 3, 5, 15, 17, 21, 24, 25, 26, 27, 29, 31, 32, 33, 34, 43], "patch": 45, "path": [11, 13, 23, 35, 41, 48, 49], "pattern": 45, "pdf": 46, "per": [1, 3, 4, 21, 24, 27, 34, 45, 46, 49], "per_sampl": 4, "percentag": [5, 40], "peretz": 50, "perform": [6, 10, 11, 20, 25, 28, 32], "phase": 49, "pinpoint": 40, "pip": [41, 50], "pipelin": [1, 11, 14, 16], "pixel": [1, 14, 16], "place": 45, "plan": 41, "platform": [11, 18, 19, 21, 24, 25, 26, 27, 30, 32, 45], "pleas": [24, 27, 34, 41, 44, 48, 50], "plot": [40, 49], "point": [4, 5, 15, 17, 21, 29, 31, 36, 37, 38, 45, 49], "posit": 
45, "possibl": [9, 21, 24, 27, 34, 45, 49], "post": [4, 11, 13, 25, 27, 32, 34, 50], "power": [21, 24, 27, 29, 31, 34, 45], "power_of_two": 45, "poweroftwo": 46, "pre": 5, "preced": [21, 24, 27, 29, 31, 34], "precis": [5, 10, 11, 12, 13, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 34, 39, 45, 48, 50], "predefin": [5, 6], "predict": 41, "prepar": [11, 13, 27, 34], "preprint": 50, "present": [2, 48, 49], "preserv": 45, "pretrain": [33, 34], "prevent": 5, "print": 40, "prior": 5, "prioriti": 11, "problemat": 40, "procedur": 48, "process": [4, 5, 8, 13, 14, 15, 16, 17, 18, 19, 20, 25, 28, 32, 39, 43, 44, 45, 47, 49], "product": 49, "progress": 40, "progress_info_callback": 40, "progress_perc": 40, "progressinfocallback": 40, "project": [41, 50], "properti": 7, "propos": [46, 48], "provid": [2, 11, 20, 25, 28, 32, 40, 41, 45, 46, 48, 49], "prune": [10, 50], "pruned_model": [25, 32], "pruning_config": [25, 32], "pruning_info": [25, 32], "pruning_mask": 7, "pruning_num_score_approxim": 6, "pruningconfig": [6, 13, 25, 32], "pruninginfo": [7, 13, 25, 32], "ptq": [11, 24, 31, 41, 48], "purpos": [20, 28, 40], "py": 50, "pydantic_cor": 45, "pypi": 50, "python": [35, 50], "pytorch": [11, 13, 45, 46, 50], "pytorch_data_generation_experiment": [13, 28], "pytorch_default_tpc": 30, "pytorch_gradient_post_training_quant": [13, 17, 29], "pytorch_post_training_quant": [13, 31, 41, 48], "pytorch_pruning_experiment": [13, 32], "pytorch_quantization_aware_training_finalize_experiment": [13, 33], "pytorch_quantization_aware_training_init_experiment": [13, 33, 34], "pytorch_resource_utilization_data": [13, 30], "q": 41, "q_fraction_scheduler_polici": 4, "qat": [26, 27, 33, 34, 44], "qat_config": [13, 27, 34], "qatconfig": [27, 34], "qc": 8, "qc_option": 45, "qmodel": 11, "qnnpack": 45, "quant": 41, "quantifi": [7, 49], "quantiz": [0, 3, 4, 5, 8, 9, 11, 12, 13, 15, 17, 20, 22, 28, 30, 36, 37, 38, 39, 40, 43, 44, 45, 46, 49, 50], "quantization_config": [39, 46], "quantization_configur": 45, 
"quantization_format": 41, "quantization_info": [21, 24, 26, 27, 29, 31, 33, 34], "quantization_preserv": [18, 19, 45, 47], "quantizationconfig": [13, 39], "quantizationerrormethod": [8, 11, 13], "quantizationmethod": [3, 46], "quantize_and_export": 11, "quantize_reported_dir": [12, 48], "quantized_exportable_model": 41, "quantized_info": 48, "quantized_model": [11, 21, 24, 26, 27, 33, 34, 36, 37, 38, 48], "quantized_modul": [29, 31], "quantizewrapp": [13, 27, 33, 34], "question": 41, "r": 50, "radam": 16, "rais": 45, "random": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "random_data_gen": 48, "rang": [3, 12, 21, 24, 27, 29, 31, 34, 48], "rate": [1, 14, 15, 16, 17], "ratio": [11, 12, 48], "readi": 33, "readm": 41, "receiv": [11, 40], "recent": 48, "recommend": 48, "recov": [25, 32], "red": 48, "reduc": [5, 25, 32], "reduce_on_plateau": [1, 14], "reduce_on_plateau_with_reset": 16, "reduceonplateau": 1, "refer": [41, 48], "refine_mp_solut": 5, "regard": 42, "regular": [1, 4, 15, 17], "regularization_factor": [4, 15, 17], "regularized_min_max_diff": [1, 14], "relat": [3, 7, 13, 45], "releas": 50, "relev": 41, "relu": 3, "relu_bound_to_power_of_2": 8, "remain": 40, "remov": [12, 25, 32, 33, 48], "replac": [26, 48], "report": [12, 13, 48], "report_dir": [12, 48], "repositori": 41, "repr_datagen": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34], "repr_dataset": [36, 37, 38, 41], "repres": [4, 5, 10, 11, 15, 17, 21, 24, 25, 26, 27, 29, 31, 32, 33, 34, 36, 37, 38, 40, 41, 43, 45, 48, 49], "representative_data_gen": [21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 41, 48], "representative_dataset": 11, "request": 2, "requir": [21, 24, 27, 29, 31, 34, 46, 49], "research": 50, "reshap": [3, 20], "residu": 11, "residual_collaps": [8, 11], "resnet50": [25, 32, 41], "resnet50_weight": 32, "resourc": [6, 10, 11, 13, 21, 24, 25, 26, 27, 32, 33, 34, 49], "resourceutil": [13, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 34], "respect": 48, "respectivli": 3, "rest": 4, "result": 48, 
"retrain": [25, 32], "retriev": [18, 19, 40, 45], "return": [2, 4, 5, 7, 11, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41], "round": 4, "rounding_typ": 4, "ru": [21, 24, 26, 27], "ru_data": [22, 30], "rule": [40, 43], "run": [4, 15, 17, 40, 41, 49], "runner": 40, "same": [1, 41, 45], "sampl": [4, 15, 17, 49], "save": [3, 11, 12, 27, 35, 41, 46, 48], "save_model_path": [11, 41], "saved_model": 23, "savedmodel": 23, "scalar": 49, "scale": [4, 5, 45], "scale_log_norm": 4, "schedul": [1, 4, 14, 16, 40], "scheduler_typ": [1, 14, 16], "schedulertyp": [14, 16], "schema": 45, "schema_vers": 45, "score": [4, 5, 6, 7, 9, 11, 15, 17, 25, 32], "sdsp": [11, 13, 45], "sdsp_v3_14": 19, "sdsp_version": [11, 19], "search": [5, 10, 13, 21, 24, 27, 29, 31, 34], "second": 49, "section": 40, "see": [4, 17, 48, 50], "seen": 49, "select": [0, 3, 6, 8, 9, 11, 13, 39, 41, 44, 45, 46], "self": [40, 45], "semiconductor": 50, "sensit": [5, 6, 25, 32], "sequenti": [20, 28], "serial": 13, "serialization_format": 41, "sess": 41, "session": 41, "set": [3, 11, 12, 13, 15, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 34, 35, 36, 37, 38, 40, 41, 43, 45, 46, 48, 49], "set_log_fold": [35, 48, 49], "setup": [11, 50], "sever": [21, 24, 27, 29, 31, 34, 49], "shift": 48, "shift_negative_activation_correct": 8, "shift_negative_params_search": 8, "shift_negative_ratio": 8, "shift_negative_threshold_recalcul": 8, "shortli": 45, "should": [3, 6, 15, 21, 22, 24, 25, 26, 27, 29, 31, 32, 34, 41, 45, 49], "show": 49, "shown": 48, "sigma": 5, "signal": 9, "signed": 45, "signific": [7, 48], "significantli": 48, "simd": [25, 32, 45], "simd_siz": 45, "similar": [9, 12, 36, 37, 38, 40, 48, 50], "similarli": 45, "simpl": [20, 28], "simplic": [20, 28], "simul": 40, "simulate_schedul": 40, "simultan": 45, "singl": 45, "six": 48, "size": [1, 4, 5, 14, 15, 16, 17, 20, 21, 24, 26, 27, 28, 34, 41, 46], "skip": [12, 40, 41, 48], "slowli": 41, "small": 48, "smaller": 
42, "smallereq": 42, "smooth": [1, 46], "smoothing_and_augment": [1, 14, 16], "so": [11, 41], "softmax": 3, "softmax_shift": 8, "softquant": 4, "solut": 50, "solver": [21, 24, 27, 34], "some": [18, 19, 20, 28, 41, 45, 47, 49], "soni": 50, "sonysemiconductorsolut": 50, "sourc": 50, "specif": [0, 3, 11, 13, 25, 32, 43, 48, 49], "specifi": [6, 11, 12, 14, 16, 18, 20, 23, 25, 28, 32, 41, 45, 48], "sphinx": 13, "sqnr": [12, 48], "squar": [1, 9], "stabl": 50, "stage": 49, "standard": [25, 32, 40, 46], "start": [20, 28, 41, 46, 50], "start_step": 4, "state": 50, "state_dict": 32, "statist": [3, 21, 24, 27, 29, 31, 34, 49], "stderr": 40, "ste": [4, 44, 46], "step": [1, 4, 40, 46, 48], "store": [7, 46], "str": [3, 11, 12, 18, 19, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 35, 36, 37, 38, 40, 41, 42, 45, 48], "straight": [4, 46], "strategi": [6, 25, 32], "string": 43, "structur": [13, 50], "student": 4, "success": 11, "suffer": 41, "suggest": 48, "sum": [10, 22, 25, 30, 32], "support": [4, 11, 41], "supported_input_activation_n_bit": 45, "sure": 40, "sy": 40, "symmetr": [21, 24, 27, 29, 31, 34, 45, 46], "t": [35, 50], "tab": 49, "tabl": 45, "tag": 49, "take": [5, 24, 27, 34, 50], "target": [4, 11, 13, 18, 19, 21, 22, 24, 25, 26, 27, 30, 32, 33, 34, 45], "target_platform_cap": [21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 42, 46], "target_q_fract": 4, "target_resource_util": [21, 24, 25, 27, 29, 31, 32, 34], "targetplatformcap": [13, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34], "teacher": 4, "tempfil": 41, "tensor": [5, 11, 12, 15, 17, 20, 22, 28, 30, 45, 46, 49, 50], "tensorboard": [40, 50], "tensorflow": [3, 11, 13, 15, 20, 21, 22, 24, 25, 26, 27, 41, 43, 45, 50], "tf": [3, 11, 15, 20, 23, 26, 27], "tflite": [41, 45], "than": [5, 42, 48], "thei": 3, "them": [45, 49], "thi": [5, 7, 8, 9, 11, 13, 20, 21, 23, 24, 25, 26, 27, 28, 29, 31, 32, 34, 35, 40, 41, 45, 46, 48, 50], "those": 48, "three": [3, 48], "threshold": [5, 8, 9, 11, 12, 21, 24, 27, 29, 31, 34, 45, 46, 48], 
"threshold_bitwidth_mixed_precis": 48, "threshold_bitwidth_mixed_precision_with_model_output_loss_object": 12, "threshold_degrade_layer_ratio": [12, 48], "threshold_quantize_error": [12, 48], "threshold_ratio_unbalanced_concaten": [12, 48], "threshold_zscore_outlier_remov": [12, 48], "through": [4, 20, 25, 28, 46], "throughout": 4, "thu": [25, 32, 49], "time": [3, 6, 46], "togeth": [25, 32], "tool": [11, 13, 46, 50], "toolkit": [11, 13, 20, 28, 29, 48], "torch": [17, 28, 37, 38, 41, 50], "torchscript": 41, "torchvis": [1, 16, 29, 30, 31, 32, 33, 34, 41], "total": [10, 22, 30, 40], "total_memori": 10, "totalcompon": 40, "tpc": [11, 13, 25, 32, 45], "tpc_minor_vers": 45, "tpc_patch_vers": 45, "tpc_platform_typ": 45, "tpc_v1_0": 18, "tpc_version": 18, "trace": 41, "track": 40, "train": [4, 11, 13, 44, 46, 50], "train_bia": 4, "trainabl": [23, 26, 46], "trainable_infrastructur": 44, "trainablequant": 26, "transform": [1, 21, 24, 27, 29, 31, 34], "transpos": 3, "treat": 45, "troubleshoot": 13, "true": [1, 5, 8, 11, 12, 15, 16, 17, 23, 33, 34, 40, 46], "try": 5, "tun": 34, "tune": [15, 17, 25, 26, 27, 32, 33], "tupl": [1, 3, 11, 14, 16, 20, 21, 24, 25, 28, 29, 31, 32, 43, 45], "tutori": 48, "two": [5, 12, 21, 24, 27, 29, 31, 34, 41, 45, 48, 49], "type": [0, 1, 2, 4, 5, 6, 7, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32, 35, 36, 37, 38, 40, 41, 43, 45, 48], "ui": 49, "unbalanc": [12, 48], "unchang": 40, "under": 49, "unifi": 11, "uniform": [45, 46], "union": [1, 14, 16, 20, 21, 22, 24, 25, 27, 28, 29, 30, 31, 32, 34, 45], "uniqu": 45, "up": [6, 20, 28, 35, 45, 49], "updat": [4, 11], "upon": 46, "us": [0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 47, 48, 49, 50], "use_hessian_based_scor": [5, 11], "use_hessian_based_weight": [15, 17], "use_hessian_sample_attent": [15, 17], "use_mixed_precis": 11, "user": [11, 13, 21, 24, 
26, 27, 29, 31, 33, 34, 40, 48], "userinform": [21, 24, 29, 31], "util": [6, 11, 13, 21, 24, 25, 26, 27, 32, 33, 34, 46], "v": 50, "valid": [36, 37, 38, 45, 46, 48], "validation_dataset": [36, 37, 38, 48], "validationerror": 45, "valu": [1, 2, 3, 4, 5, 6, 9, 11, 12, 21, 24, 25, 26, 27, 32, 40, 41, 42, 43, 45, 46, 48], "valuabl": 9, "variabl": [11, 15, 17], "variou": [11, 20, 28, 49], "vector": [4, 49], "verbos": 35, "version": [11, 13, 20, 28, 45], "via": [41, 50], "view": 49, "visit": [44, 50], "visual": [40, 48, 50], "wa": [2, 41, 48], "wai": [49, 50], "walk": [20, 28], "want": 3, "warn": [11, 48], "we": [3, 20, 21, 24, 25, 27, 28, 32, 34, 41, 43, 45, 46, 49], "weight": [0, 1, 3, 4, 5, 8, 10, 11, 14, 15, 16, 17, 21, 22, 25, 27, 29, 30, 31, 32, 33, 34, 41, 43, 44, 45, 46, 49], "weight_quantizer_params_overrid": 44, "weight_training_method": 44, "weights_bias_correct": [8, 11], "weights_channels_axi": 46, "weights_compression_ratio": 11, "weights_error_method": 8, "weights_memori": [6, 10, 21, 24, 25, 27, 32, 34], "weights_n_bit": [43, 45, 46], "weights_per_channel_threshold": [45, 46], "weights_quantization_candid": 46, "weights_quantization_method": [43, 45, 46], "weights_quantization_param": 46, "weights_quantization_params_fn": 43, "weights_second_moment_correct": 8, "were": 49, "when": [1, 2, 3, 4, 5, 6, 9, 10, 12, 13, 15, 17, 21, 24, 26, 27, 40, 41, 42, 44, 45, 46, 48, 49], "where": [7, 12, 41, 43, 48, 49], "whether": [4, 5, 7, 11, 14, 15, 16, 17, 23, 40, 41, 45, 46], "which": [4, 6, 40, 41, 42, 43, 45, 46], "while": [8, 21, 24, 26, 27, 34, 40, 45], "who": 48, "width": [0, 5, 12, 13, 21, 24, 27, 28, 34, 39, 45, 48, 50], "within": [40, 45, 48, 50], "without": 13, "work": 50, "would": 49, "wrap": [2, 3, 23, 27, 34, 42, 45, 46], "wrapper": [27, 33, 34, 46], "writer": 49, "x": 48, "xquant": [11, 50], "xquant_config": [12, 36, 37, 38, 48], "xquant_report_keras_experiment": [13, 36], "xquant_report_pytorch_experiment": [13, 37, 48], 
"xquant_report_troubleshoot_pytorch_experiment": [12, 13, 38, 48], "xquantconfig": [12, 13, 36, 37, 38], "y": 48, "yield": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "you": [8, 11, 40, 41, 45, 49, 50], "your": [41, 48], "z": 11, "z_score": [12, 48], "z_threshold": [8, 11], "zero": [5, 45]}, "titles": ["BitWidthConfig", "Data Generation Configuration", "DefaultDict Class", "FrameworkInfo Class", "GradientPTQConfig Class", "MixedPrecisionQuantizationConfig", "Pruning Configuration", "Pruning Information", "QuantizationConfig", "QuantizationErrorMethod", "ResourceUtilization", "wrapper", "XQuant Configuration", "API Docs", "Get DataGenerationConfig for Keras Models", "Get GradientPTQConfig for Keras Models", "Get DataGenerationConfig for Pytorch Models", "Get GradientPTQConfig for Pytorch Models", "Get TargetPlatformCapabilities for tpc version", "Get TargetPlatformCapabilities for sdsp converter version", "Keras Data Generation", "Keras Gradient Based Post Training Quantization", "Get Resource Utilization information for Keras Models", "Load Quantized Keras Model", "Keras Post Training Quantization", "Keras Structured Pruning", "Keras Quantization Aware Training Model Finalize", "Keras Quantization Aware Training Model Init", "Pytorch Data Generation", "Pytorch Gradient Based Post Training Quantization", "Get Resource Utilization information for PyTorch Models", "Pytorch Post Training Quantization", "Pytorch Structured Pruning", "PyTorch Quantization Aware Training Model Finalize", "PyTorch Quantization Aware Training Model Init", "Enable a Logger", "XQuant Report Keras", "XQuant Report Pytorch", "XQuant Report Troubleshoot Pytorch", "CoreConfig", "debug_config Module", "exporter Module", "Layer Attributes Filters", "network_editor Module", "qat_config Module", "target_platform_capabilities Module", "trainable_infrastructure Module", "<no title>", "XQuant Extension Tool", "Visualization within TensorBoard", "Model Compression Toolkit User Guide"], 
"titleterms": {"about": 48, "action": 43, "api": [13, 50], "attribut": 42, "attributequantizationconfig": 45, "awar": [26, 27, 33, 34], "base": [21, 29], "basekerastrainablequant": 46, "basepytorchtrainablequant": 46, "batchnormalignemntlosstyp": 1, "bit": 49, "bitwidthconfig": 0, "bnlayerweightingtyp": 1, "channelaxi": 3, "channelsfilteringstrategi": 6, "class": [2, 3, 4], "comparison": 49, "compress": 50, "configur": [1, 6, 12, 49], "constraint": 50, "convert": 19, "core": 13, "coreconfig": 39, "cosin": 49, "data": [1, 20, 28], "data_gener": 13, "datagenerationconfig": [14, 16], "datainittyp": 1, "debug_config": 40, "debugconfig": 40, "defaultdict": 2, "dictionari": 40, "doc": 13, "document": 50, "editrul": 43, "enabl": 35, "error": 48, "exampl": 48, "export": [13, 41], "extens": 48, "featur": 50, "filter": [42, 43], "final": [26, 33], "flow": 48, "format": [41, 48], "frameworkinfo": 3, "fuse": 45, "gener": [1, 20, 28, 48], "get": [14, 15, 16, 17, 18, 19, 22, 30], "gptq": 13, "gptqhessianscoresconfig": 4, "gradient": [21, 29], "gradientptqconfig": [4, 15, 17], "gradualactivationquantizationconfig": 4, "graph": 48, "guid": 50, "how": 48, "imagegranular": 1, "imagenormalizationtyp": 1, "imagepipelinetyp": 1, "importancemetr": 6, "indic": 13, "infer": 41, "inform": [7, 22, 30], "init": [27, 34], "instal": 50, "judgeabl": 48, "kei": 40, "kera": [14, 15, 20, 21, 22, 23, 24, 25, 26, 27, 36, 41], "keras_export_model": 41, "keras_load_quantized_model": 13, "kerasexportserializationformat": 41, "layer": 42, "load": 23, "logger": 35, "manualbitwidthselect": 0, "mctq": 41, "mix": 49, "mixedprecisionquantizationconfig": 5, "model": [14, 15, 16, 17, 22, 23, 26, 27, 30, 33, 34, 41, 50], "modul": [40, 41, 43, 44, 45, 46], "mpdistanceweight": 5, "mpmetricnorm": 5, "name": 41, "network_editor": 43, "onnx": 41, "operatorsetgroup": 45, "operatorsset": 45, "opquantizationconfig": 45, "opset": 41, "output": 41, "outputlosstyp": 1, "overal": 48, "overview": 50, "paramet": 48, "post": 
[21, 24, 29, 31], "precis": 49, "process": [40, 48], "prune": [6, 7, 13, 25, 32], "ptq": 13, "pytorch": [16, 17, 28, 29, 30, 31, 32, 33, 34, 37, 38, 41], "pytorch_export_model": 41, "pytorchexportserializationformat": 41, "qat": 13, "qat_config": 44, "qatconfig": 44, "qfractionlinearannealingconfig": 4, "quantiz": [21, 23, 24, 26, 27, 29, 31, 33, 34, 41, 48], "quantizationconfig": 8, "quantizationconfigopt": 45, "quantizationerrormethod": 9, "quantizationformat": 41, "quantizationmethod": 45, "quickstart": 50, "refer": 50, "report": [36, 37, 38], "resourc": [22, 30], "resourceutil": 10, "roundingtyp": 4, "run": 48, "schedulertyp": 1, "sdsp": 19, "serial": 41, "set_log_fold": 13, "similar": 49, "state": 40, "structur": [25, 32], "support": 50, "tabl": 13, "target_platform_cap": [13, 45], "targetplatformcap": [18, 19, 45], "technic": 50, "tensorboard": 49, "tool": 48, "toolkit": 50, "tpc": 18, "train": [21, 24, 26, 27, 29, 31, 33, 34], "trainable_infrastructur": [13, 46], "trainablequantizeractivationconfig": 46, "trainablequantizerweightsconfig": 46, "trainingmethod": [44, 46], "troubleshoot": [38, 48], "tutori": 41, "understand": 48, "us": 41, "user": 50, "util": [22, 30], "version": [18, 19, 41], "visual": 49, "width": 49, "within": 49, "wrapper": [11, 13], "xquant": [12, 13, 36, 37, 38, 48], "xquantconfig": 48}})
\ No newline at end of file
+Search.setIndex({"alltitles": {"API Docs": [[13, null]], "API Documentation": [[50, "api-documentation"]], "About XQuant Extension Tool": [[48, "about-xquant-extension-tool"]], "Actions": [[43, "actions"]], "Attribute Filters": [[42, "attribute-filters"]], "AttributeQuantizationConfig": [[45, "attributequantizationconfig"]], "BNLayerWeightingType": [[1, "bnlayerweightingtype"]], "BaseKerasTrainableQuantizer": [[46, "basekerastrainablequantizer"]], "BasePytorchTrainableQuantizer": [[46, "basepytorchtrainablequantizer"]], "BatchNormAlignemntLossType": [[1, "batchnormalignemntlosstype"]], "BitWidthConfig": [[0, null]], "ChannelAxis": [[3, "channelaxis"]], "ChannelsFilteringStrategy": [[6, "channelsfilteringstrategy"]], "CoreConfig": [[39, null]], "Cosine Similarity Comparison": [[49, "cosine-similarity-comparison"]], "Data Generation Configuration": [[1, null]], "DataInitType": [[1, "datainittype"]], "DebugConfig": [[40, "debugconfig"]], "DefaultDict Class": [[2, null]], "EditRule": [[43, "editrule"]], "Enable a Logger": [[35, null]], "Filters": [[43, "filters"]], "FrameworkInfo Class": [[3, null]], "Fusing": [[45, "fusing"]], "GPTQHessianScoresConfig Class": [[4, "gptqhessianscoresconfig-class"]], "Get DataGenerationConfig for Keras Models": [[14, null]], "Get DataGenerationConfig for Pytorch Models": [[16, null]], "Get GradientPTQConfig for Keras Models": [[15, null]], "Get GradientPTQConfig for Pytorch Models": [[17, null]], "Get Resource Utilization information for Keras Models": [[22, null]], "Get Resource Utilization information for PyTorch Models": [[30, null]], "Get TargetPlatformCapabilities for sdsp converter version": [[19, null]], "Get TargetPlatformCapabilities for tpc version": [[18, null]], "GradientPTQConfig Class": [[4, null]], "GradualActivationQuantizationConfig": [[4, "gradualactivationquantizationconfig"]], "How to Run": [[48, "how-to-run"]], "ImageGranularity": [[1, "imagegranularity"]], "ImageNormalizationType": [[1, "imagenormalizationtype"]], 
"ImagePipelineType": [[1, "imagepipelinetype"]], "ImportanceMetric": [[6, "importancemetric"]], "Indices and tables": [[13, "indices-and-tables"]], "Install": [[50, "install"]], "Keras Data Generation": [[20, null]], "Keras Gradient Based Post Training Quantization": [[21, null]], "Keras Post Training Quantization": [[24, null]], "Keras Quantization Aware Training Model Finalize": [[26, null]], "Keras Quantization Aware Training Model Init": [[27, null]], "Keras Structured Pruning": [[25, null]], "Keras Tutorial": [[41, "keras-tutorial"]], "KerasExportSerializationFormat": [[41, "kerasexportserializationformat"]], "Keys in the processing state dictionary": [[40, "id1"]], "Layer Attributes Filters": [[42, null]], "Load Quantized Keras Model": [[23, null]], "MCTQ": [[41, "mctq"]], "MCTQ Quantization Format": [[41, "mctq-quantization-format"]], "ManualBitWidthSelection": [[0, "manualbitwidthselection"]], "Mixed-precision Configuration Bit-width": [[49, "mixed-precision-configuration-bit-width"]], "MixedPrecisionQuantizationConfig": [[5, null]], "Model Compression Toolkit User Guide": [[50, null]], "MpDistanceWeighting": [[5, "mpdistanceweighting"]], "MpMetricNormalization": [[5, "mpmetricnormalization"]], "ONNX": [[41, "onnx"]], "ONNX model output names": [[41, "onnx-model-output-names"]], "ONNX opset version": [[41, "onnx-opset-version"]], "OpQuantizationConfig": [[45, "opquantizationconfig"]], "OperatorSetGroup": [[45, "operatorsetgroup"]], "OperatorsSet": [[45, "operatorsset"]], "OutputLossType": [[1, "outputlosstype"]], "Overall Process Flow": [[48, "overall-process-flow"]], "Overview": [[50, "overview"]], "Pruning Configuration": [[6, null]], "Pruning Information": [[7, null]], "PyTorch Quantization Aware Training Model Finalize": [[33, null]], "PyTorch Quantization Aware Training Model Init": [[34, null]], "Pytorch Data Generation": [[28, null]], "Pytorch Gradient Based Post Training Quantization": [[29, null]], "Pytorch Post Training Quantization": [[31, 
null]], "Pytorch Structured Pruning": [[32, null]], "Pytorch Tutorial": [[41, "pytorch-tutorial"]], "PytorchExportSerializationFormat": [[41, "pytorchexportserializationformat"]], "QATConfig": [[44, "qatconfig"]], "QFractionLinearAnnealingConfig": [[4, "qfractionlinearannealingconfig"]], "QuantizationConfig": [[8, null]], "QuantizationConfigOptions": [[45, "quantizationconfigoptions"]], "QuantizationErrorMethod": [[9, null]], "QuantizationFormat": [[41, "quantizationformat"]], "QuantizationMethod": [[45, "quantizationmethod"]], "Quickstart": [[50, "quickstart"]], "References": [[50, "references"]], "ResourceUtilization": [[10, null]], "RoundingType": [[4, "roundingtype"]], "SchedulerType": [[1, "schedulertype"]], "Supported Features": [[50, "supported-features"]], "TargetPlatformCapabilities": [[45, "targetplatformcapabilities"]], "Technical Constraints": [[50, "technical-constraints"]], "TrainableQuantizerActivationConfig": [[46, "trainablequantizeractivationconfig"]], "TrainableQuantizerWeightsConfig": [[46, "trainablequantizerweightsconfig"]], "TrainingMethod": [[44, "trainingmethod"], [46, "trainingmethod"]], "Understanding the General Troubleshoots": [[48, "understanding-the-general-troubleshoots"]], "Understanding the Judgeable Troubleshoots": [[48, "understanding-the-judgeable-troubleshoots"]], "Understanding the Quantization Error Graph": [[48, "understanding-the-quantization-error-graph"]], "Use exported model for inference": [[41, "use-exported-model-for-inference"]], "Visualization within TensorBoard": [[49, null]], "XQuant Configuration": [[12, null]], "XQuant Extension Tool": [[48, null]], "XQuant Report Keras": [[36, null]], "XQuant Report Pytorch": [[37, null]], "XQuant Report Troubleshoot Pytorch": [[38, null]], "XQuantConfig Format and Examples": [[48, "xquantconfig-format-and-examples"]], "XQuantConfig parameter": [[48, "id3"]], "core": [[13, "core"]], "data_generation": [[13, "data-generation"]], "debug_config Module": [[40, null]], "exporter": 
[[13, "exporter"]], "exporter Module": [[41, null]], "gptq": [[13, "gptq"]], "keras serialization format": [[41, "keras-serialization-format"]], "keras_export_model": [[41, "keras-export-model"]], "keras_load_quantized_model": [[13, "keras-load-quantized-model"]], "network_editor Module": [[43, null]], "pruning": [[13, "pruning"]], "ptq": [[13, "ptq"]], "pytorch_export_model": [[41, "pytorch-export-model"]], "qat": [[13, "qat"]], "qat_config Module": [[44, null]], "set_log_folder": [[13, "set-log-folder"]], "target_platform_capabilities": [[13, "target-platform-capabilities"]], "target_platform_capabilities Module": [[45, null]], "trainable_infrastructure": [[13, "trainable-infrastructure"]], "trainable_infrastructure Module": [[46, null]], "wrapper": [[11, null], [13, "wrapper"]], "xquant": [[13, "xquant"]]}, "docnames": ["api/api_docs/classes/BitWidthConfig", "api/api_docs/classes/DataGenerationConfig", "api/api_docs/classes/DefaultDict", "api/api_docs/classes/FrameworkInfo", "api/api_docs/classes/GradientPTQConfig", "api/api_docs/classes/MixedPrecisionQuantizationConfig", "api/api_docs/classes/PruningConfig", "api/api_docs/classes/PruningInfo", "api/api_docs/classes/QuantizationConfig", "api/api_docs/classes/QuantizationErrorMethod", "api/api_docs/classes/ResourceUtilization", "api/api_docs/classes/Wrapper", "api/api_docs/classes/XQuantConfig", "api/api_docs/index", "api/api_docs/methods/get_keras_data_generation_config", "api/api_docs/methods/get_keras_gptq_config", "api/api_docs/methods/get_pytorch_data_generation_config", "api/api_docs/methods/get_pytroch_gptq_config", "api/api_docs/methods/get_target_platform_capabilities", "api/api_docs/methods/get_target_platform_capabilities_sdsp", "api/api_docs/methods/keras_data_generation_experimental", "api/api_docs/methods/keras_gradient_post_training_quantization", "api/api_docs/methods/keras_kpi_data", "api/api_docs/methods/keras_load_quantizad_model", "api/api_docs/methods/keras_post_training_quantization", 
"api/api_docs/methods/keras_pruning_experimental", "api/api_docs/methods/keras_quantization_aware_training_finalize_experimental", "api/api_docs/methods/keras_quantization_aware_training_init_experimental", "api/api_docs/methods/pytorch_data_generation_experimental", "api/api_docs/methods/pytorch_gradient_post_training_quantization", "api/api_docs/methods/pytorch_kpi_data", "api/api_docs/methods/pytorch_post_training_quantization", "api/api_docs/methods/pytorch_pruning_experimental", "api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental", "api/api_docs/methods/pytorch_quantization_aware_training_init_experimental", "api/api_docs/methods/set_logger_path", "api/api_docs/methods/xquant_report_keras_experimental", "api/api_docs/methods/xquant_report_pytorch_experimental", "api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental", "api/api_docs/modules/core_config", "api/api_docs/modules/debug_config", "api/api_docs/modules/exporter", "api/api_docs/modules/layer_filters", "api/api_docs/modules/network_editor", "api/api_docs/modules/qat_config", "api/api_docs/modules/target_platform_capabilities", "api/api_docs/modules/trainable_infrastructure", "api/api_docs/notes/tpc_note", "guidelines/XQuant_Extension_Tool", "guidelines/visualization", "index"], "envversion": {"sphinx": 64, "sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2}, "filenames": ["api/api_docs/classes/BitWidthConfig.rst", "api/api_docs/classes/DataGenerationConfig.rst", "api/api_docs/classes/DefaultDict.rst", "api/api_docs/classes/FrameworkInfo.rst", "api/api_docs/classes/GradientPTQConfig.rst", "api/api_docs/classes/MixedPrecisionQuantizationConfig.rst", "api/api_docs/classes/PruningConfig.rst", "api/api_docs/classes/PruningInfo.rst", 
"api/api_docs/classes/QuantizationConfig.rst", "api/api_docs/classes/QuantizationErrorMethod.rst", "api/api_docs/classes/ResourceUtilization.rst", "api/api_docs/classes/Wrapper.rst", "api/api_docs/classes/XQuantConfig.rst", "api/api_docs/index.rst", "api/api_docs/methods/get_keras_data_generation_config.rst", "api/api_docs/methods/get_keras_gptq_config.rst", "api/api_docs/methods/get_pytorch_data_generation_config.rst", "api/api_docs/methods/get_pytroch_gptq_config.rst", "api/api_docs/methods/get_target_platform_capabilities.rst", "api/api_docs/methods/get_target_platform_capabilities_sdsp.rst", "api/api_docs/methods/keras_data_generation_experimental.rst", "api/api_docs/methods/keras_gradient_post_training_quantization.rst", "api/api_docs/methods/keras_kpi_data.rst", "api/api_docs/methods/keras_load_quantizad_model.rst", "api/api_docs/methods/keras_post_training_quantization.rst", "api/api_docs/methods/keras_pruning_experimental.rst", "api/api_docs/methods/keras_quantization_aware_training_finalize_experimental.rst", "api/api_docs/methods/keras_quantization_aware_training_init_experimental.rst", "api/api_docs/methods/pytorch_data_generation_experimental.rst", "api/api_docs/methods/pytorch_gradient_post_training_quantization.rst", "api/api_docs/methods/pytorch_kpi_data.rst", "api/api_docs/methods/pytorch_post_training_quantization.rst", "api/api_docs/methods/pytorch_pruning_experimental.rst", "api/api_docs/methods/pytorch_quantization_aware_training_finalize_experimental.rst", "api/api_docs/methods/pytorch_quantization_aware_training_init_experimental.rst", "api/api_docs/methods/set_logger_path.rst", "api/api_docs/methods/xquant_report_keras_experimental.rst", "api/api_docs/methods/xquant_report_pytorch_experimental.rst", "api/api_docs/methods/xquant_report_troubleshoot_pytorch_experimental.rst", "api/api_docs/modules/core_config.rst", "api/api_docs/modules/debug_config.rst", "api/api_docs/modules/exporter.rst", "api/api_docs/modules/layer_filters.rst", 
"api/api_docs/modules/network_editor.rst", "api/api_docs/modules/qat_config.rst", "api/api_docs/modules/target_platform_capabilities.rst", "api/api_docs/modules/trainable_infrastructure.rst", "api/api_docs/notes/tpc_note.rst", "guidelines/XQuant_Extension_Tool.rst", "guidelines/visualization.rst", "index.rst"], "indexentries": {"add_metadata (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.add_metadata", false]], "attributefilter (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.AttributeFilter", false]], "attributequantizationconfig (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig", false]], "base_config (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.quantizationconfigoptions attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions.base_config", false]], "basekerastrainablequantizer (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.BaseKerasTrainableQuantizer", false]], "basepytorchtrainablequantizer (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.BasePytorchTrainableQuantizer", false]], "batchnormalignemntlosstype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.BatchNormAlignemntLossType", false]], "bit_width (model_compression_toolkit.core.common.quantization.bit_width_config.manualbitwidthselection attribute)": [[0, 
"model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection.bit_width", false]], "bitwidthconfig (class in model_compression_toolkit.core)": [[0, "model_compression_toolkit.core.BitWidthConfig", false]], "bnlayerweightingtype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.BNLayerWeightingType", false]], "changecandidatesactivationquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesActivationQuantConfigAttr", false]], "changecandidatesactivationquantizationmethod (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesActivationQuantizationMethod", false]], "changecandidatesweightsquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesWeightsQuantConfigAttr", false]], "changecandidatesweightsquantizationmethod (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeCandidatesWeightsQuantizationMethod", false]], "changefinalactivationquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeFinalActivationQuantConfigAttr", false]], "changefinalweightsquantconfigattr (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeFinalWeightsQuantConfigAttr", false]], "changefinalweightsquantizationmethod (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeFinalWeightsQuantizationMethod", false]], "changequantizationparamfunction (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.ChangeQuantizationParamFunction", 
false]], "channelaxis (class in model_compression_toolkit.core)": [[3, "model_compression_toolkit.core.ChannelAxis", false]], "channels_filtering_strategy (model_compression_toolkit.pruning.pruningconfig attribute)": [[6, "model_compression_toolkit.pruning.PruningConfig.channels_filtering_strategy", false]], "channelsfilteringstrategy (class in model_compression_toolkit.pruning)": [[6, "model_compression_toolkit.pruning.ChannelsFilteringStrategy", false]], "coreconfig (class in model_compression_toolkit.core)": [[39, "model_compression_toolkit.core.CoreConfig", false]], "datagenerationconfig (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.DataGenerationConfig", false]], "datainittype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.DataInitType", false]], "debugconfig (class in model_compression_toolkit.core)": [[40, "model_compression_toolkit.core.DebugConfig", false]], "default_qco (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.default_qco", false]], "defaultdict (class in model_compression_toolkit)": [[2, "model_compression_toolkit.DefaultDict", false]], "editrule (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.EditRule", false]], "enable_weights_quantization (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.enable_weights_quantization", false]], "eq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.Eq", false]], "filter 
(model_compression_toolkit.core.common.quantization.bit_width_config.manualbitwidthselection attribute)": [[0, "model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection.filter", false]], "frameworkinfo (class in model_compression_toolkit.core)": [[3, "model_compression_toolkit.core.FrameworkInfo", false]], "fuse_op_quantization_config (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.fusing attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing.fuse_op_quantization_config", false]], "fusing (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing", false]], "fusing_patterns (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.fusing_patterns", false]], "get() (model_compression_toolkit.defaultdict method)": [[2, "model_compression_toolkit.DefaultDict.get", false]], "get_keras_data_generation_config() (in module model_compression_toolkit.data_generation)": [[14, "model_compression_toolkit.data_generation.get_keras_data_generation_config", false]], "get_keras_gptq_config() (in module model_compression_toolkit.gptq)": [[15, "model_compression_toolkit.gptq.get_keras_gptq_config", false]], "get_pytorch_data_generation_config() (in module model_compression_toolkit.data_generation)": [[16, "model_compression_toolkit.data_generation.get_pytorch_data_generation_config", false]], "get_pytorch_gptq_config() (in module model_compression_toolkit.gptq)": [[17, "model_compression_toolkit.gptq.get_pytorch_gptq_config", false]], "get_target_platform_capabilities() (in module model_compression_toolkit)": [[18, 
"model_compression_toolkit.get_target_platform_capabilities", false]], "get_target_platform_capabilities_sdsp() (in module model_compression_toolkit)": [[19, "model_compression_toolkit.get_target_platform_capabilities_sdsp", false]], "gptqhessianscoresconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.GPTQHessianScoresConfig", false]], "gradientptqconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.GradientPTQConfig", false]], "gradualactivationquantizationconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.GradualActivationQuantizationConfig", false]], "greater (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.Greater", false]], "greatereq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.GreaterEq", false]], "imagegranularity (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.ImageGranularity", false]], "imagenormalizationtype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.ImageNormalizationType", false]], "imagepipelinetype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.ImagePipelineType", false]], "importance_metric (model_compression_toolkit.pruning.pruningconfig attribute)": [[6, "model_compression_toolkit.pruning.PruningConfig.importance_metric", false]], "importance_scores (model_compression_toolkit.pruning.pruninginfo property)": [[7, "model_compression_toolkit.pruning.PruningInfo.importance_scores", false]], "importancemetric (class in model_compression_toolkit.pruning)": [[6, "model_compression_toolkit.pruning.ImportanceMetric", false]], "insert_preserving_quantizers 
(model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.insert_preserving_quantizers", false]], "is_simd_padding (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.is_simd_padding", false]], "keras_data_generation_experimental() (in module model_compression_toolkit.data_generation)": [[20, "model_compression_toolkit.data_generation.keras_data_generation_experimental", false]], "keras_export_model (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.keras_export_model", false]], "keras_gradient_post_training_quantization() (in module model_compression_toolkit.gptq)": [[21, "model_compression_toolkit.gptq.keras_gradient_post_training_quantization", false]], "keras_load_quantized_model() (in module model_compression_toolkit)": [[23, "model_compression_toolkit.keras_load_quantized_model", false]], "keras_post_training_quantization() (in module model_compression_toolkit.ptq)": [[24, "model_compression_toolkit.ptq.keras_post_training_quantization", false]], "keras_pruning_experimental() (in module model_compression_toolkit.pruning)": [[25, "model_compression_toolkit.pruning.keras_pruning_experimental", false]], "keras_quantization_aware_training_finalize_experimental() (in module model_compression_toolkit.qat)": [[26, "model_compression_toolkit.qat.keras_quantization_aware_training_finalize_experimental", false]], "keras_quantization_aware_training_init_experimental() (in module model_compression_toolkit.qat)": [[27, "model_compression_toolkit.qat.keras_quantization_aware_training_init_experimental", false]], "keras_resource_utilization_data() (in module 
model_compression_toolkit.core)": [[22, "model_compression_toolkit.core.keras_resource_utilization_data", false]], "kerasexportserializationformat (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.KerasExportSerializationFormat", false]], "keys() (model_compression_toolkit.defaultdict method)": [[2, "model_compression_toolkit.DefaultDict.keys", false]], "lut_values_bitwidth (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.lut_values_bitwidth", false]], "manual_activation_bit_width_selection_list (model_compression_toolkit.core.bitwidthconfig attribute)": [[0, "model_compression_toolkit.core.BitWidthConfig.manual_activation_bit_width_selection_list", false]], "manual_weights_bit_width_selection_list (model_compression_toolkit.core.bitwidthconfig attribute)": [[0, "model_compression_toolkit.core.BitWidthConfig.manual_weights_bit_width_selection_list", false]], "manualbitwidthselection (class in model_compression_toolkit.core.common.quantization.bit_width_config)": [[0, "model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection", false]], "mctwrapper (class in model_compression_toolkit.wrapper.mct_wrapper)": [[11, "model_compression_toolkit.wrapper.mct_wrapper.MCTWrapper", false]], "mixedprecisionquantizationconfig (class in model_compression_toolkit.core)": [[5, "model_compression_toolkit.core.MixedPrecisionQuantizationConfig", false]], "mpdistanceweighting (class in model_compression_toolkit.core)": [[5, "model_compression_toolkit.core.MpDistanceWeighting", false]], "mpmetricnormalization (class in model_compression_toolkit.core)": [[5, "model_compression_toolkit.core.MpMetricNormalization", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.fusing attribute)": 
[[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing.name", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsetgroup attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup.name", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsset attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet.name", false]], "name (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.name", false]], "nodenamefilter (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.NodeNameFilter", false]], "nodenamescopefilter (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.NodeNameScopeFilter", false]], "nodetypefilter (class in model_compression_toolkit.core.network_editor)": [[43, "model_compression_toolkit.core.network_editor.NodeTypeFilter", false]], "noteq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.NotEq", false]], "num_score_approximations (model_compression_toolkit.pruning.pruningconfig attribute)": [[6, "model_compression_toolkit.pruning.PruningConfig.num_score_approximations", false]], "operator_groups (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.fusing attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing.operator_groups", false]], "operator_set 
(model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.operator_set", false]], "operators_set (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsetgroup attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup.operators_set", false]], "operatorsetgroup (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup", false]], "operatorsset (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet", false]], "opquantizationconfig (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OpQuantizationConfig", false]], "outputlosstype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.OutputLossType", false]], "pruning_masks (model_compression_toolkit.pruning.pruninginfo property)": [[7, "model_compression_toolkit.pruning.PruningInfo.pruning_masks", false]], "pruningconfig (class in model_compression_toolkit.pruning)": [[6, "model_compression_toolkit.pruning.PruningConfig", false]], "pruninginfo (class in model_compression_toolkit.pruning)": [[7, "model_compression_toolkit.pruning.PruningInfo", false]], "pytorch_data_generation_experimental() (in module model_compression_toolkit.data_generation)": [[28, "model_compression_toolkit.data_generation.pytorch_data_generation_experimental", false]], "pytorch_export_model (class in 
model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.pytorch_export_model", false]], "pytorch_gradient_post_training_quantization() (in module model_compression_toolkit.gptq)": [[29, "model_compression_toolkit.gptq.pytorch_gradient_post_training_quantization", false]], "pytorch_post_training_quantization() (in module model_compression_toolkit.ptq)": [[31, "model_compression_toolkit.ptq.pytorch_post_training_quantization", false]], "pytorch_pruning_experimental() (in module model_compression_toolkit.pruning)": [[32, "model_compression_toolkit.pruning.pytorch_pruning_experimental", false]], "pytorch_quantization_aware_training_finalize_experimental() (in module model_compression_toolkit.qat)": [[33, "model_compression_toolkit.qat.pytorch_quantization_aware_training_finalize_experimental", false]], "pytorch_quantization_aware_training_init_experimental() (in module model_compression_toolkit.qat)": [[34, "model_compression_toolkit.qat.pytorch_quantization_aware_training_init_experimental", false]], "pytorch_resource_utilization_data() (in module model_compression_toolkit.core)": [[30, "model_compression_toolkit.core.pytorch_resource_utilization_data", false]], "pytorchexportserializationformat (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.PytorchExportSerializationFormat", false]], "qatconfig (class in model_compression_toolkit.qat)": [[44, "model_compression_toolkit.qat.QATConfig", false]], "qc_options (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsset attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet.qc_options", false]], "qfractionlinearannealingconfig (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.QFractionLinearAnnealingConfig", false]], "quantization_configurations 
(model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.quantizationconfigoptions attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions.quantization_configurations", false]], "quantizationconfig (class in model_compression_toolkit.core)": [[8, "model_compression_toolkit.core.QuantizationConfig", false]], "quantizationconfigoptions (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions", false]], "quantizationerrormethod (class in model_compression_toolkit.core)": [[9, "model_compression_toolkit.core.QuantizationErrorMethod", false]], "quantizationformat (class in model_compression_toolkit.exporter)": [[41, "model_compression_toolkit.exporter.QuantizationFormat", false]], "quantizationmethod (class in model_compression_toolkit.target_platform_capabilities)": [[45, "model_compression_toolkit.target_platform_capabilities.QuantizationMethod", false]], "quantize_and_export() (model_compression_toolkit.wrapper.mct_wrapper.mctwrapper method)": [[11, "model_compression_toolkit.wrapper.mct_wrapper.MCTWrapper.quantize_and_export", false]], "resourceutilization (class in model_compression_toolkit.core)": [[10, "model_compression_toolkit.core.ResourceUtilization", false]], "roundingtype (class in model_compression_toolkit.gptq)": [[4, "model_compression_toolkit.gptq.RoundingType", false]], "schedulertype (class in model_compression_toolkit.data_generation)": [[1, "model_compression_toolkit.data_generation.SchedulerType", false]], "schema_version (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.SCHEMA_VERSION", false]], "set_log_folder() (in 
module model_compression_toolkit)": [[35, "model_compression_toolkit.set_log_folder", false]], "smaller (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.Smaller", false]], "smallereq (class in model_compression_toolkit.target_platform_capabilities)": [[42, "model_compression_toolkit.target_platform_capabilities.SmallerEq", false]], "targetplatformcapabilities (class in model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities", false]], "tpc_minor_version (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.tpc_minor_version", false]], "tpc_patch_version (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.tpc_patch_version", false]], "tpc_platform_type (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.targetplatformcapabilities attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities.tpc_platform_type", false]], "trainablequantizeractivationconfig (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.TrainableQuantizerActivationConfig", false]], "trainablequantizerweightsconfig (class in model_compression_toolkit.trainable_infrastructure)": [[46, "model_compression_toolkit.trainable_infrastructure.TrainableQuantizerWeightsConfig", false]], "trainingmethod (class in model_compression_toolkit.trainable_infrastructure)": 
[[46, "model_compression_toolkit.trainable_infrastructure.TrainingMethod", false]], "type (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.operatorsset attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet.type", false]], "weights_n_bits (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_n_bits", false]], "weights_per_channel_threshold (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_per_channel_threshold", false]], "weights_quantization_method (model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.attributequantizationconfig attribute)": [[45, "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig.weights_quantization_method", false]], "xquant_report_keras_experimental() (in module model_compression_toolkit.xquant.keras.facade_xquant_report)": [[36, "model_compression_toolkit.xquant.keras.facade_xquant_report.xquant_report_keras_experimental", false]], "xquant_report_pytorch_experimental() (in module model_compression_toolkit.xquant.pytorch.facade_xquant_report)": [[37, "model_compression_toolkit.xquant.pytorch.facade_xquant_report.xquant_report_pytorch_experimental", false]], "xquant_report_troubleshoot_pytorch_experimental() (in module model_compression_toolkit.xquant.pytorch.facade_xquant_report)": [[38, "model_compression_toolkit.xquant.pytorch.facade_xquant_report.xquant_report_troubleshoot_pytorch_experimental", false]], "xquantconfig (class in model_compression_toolkit.xquant.common.xquant_config)": 
[[12, "model_compression_toolkit.xquant.common.xquant_config.XQuantConfig", false]]}, "objects": {"model_compression_toolkit": [[2, 0, 1, "", "DefaultDict"], [18, 3, 1, "", "get_target_platform_capabilities"], [19, 3, 1, "", "get_target_platform_capabilities_sdsp"], [23, 3, 1, "", "keras_load_quantized_model"], [35, 3, 1, "", "set_log_folder"]], "model_compression_toolkit.DefaultDict": [[2, 1, 1, "", "get"], [2, 1, 1, "", "keys"]], "model_compression_toolkit.core": [[0, 0, 1, "", "BitWidthConfig"], [3, 0, 1, "", "ChannelAxis"], [39, 0, 1, "", "CoreConfig"], [40, 0, 1, "", "DebugConfig"], [3, 0, 1, "", "FrameworkInfo"], [5, 0, 1, "", "MixedPrecisionQuantizationConfig"], [5, 0, 1, "", "MpDistanceWeighting"], [5, 0, 1, "", "MpMetricNormalization"], [8, 0, 1, "", "QuantizationConfig"], [9, 0, 1, "", "QuantizationErrorMethod"], [10, 0, 1, "", "ResourceUtilization"], [22, 3, 1, "", "keras_resource_utilization_data"], [30, 3, 1, "", "pytorch_resource_utilization_data"]], "model_compression_toolkit.core.BitWidthConfig": [[0, 2, 1, "", "manual_activation_bit_width_selection_list"], [0, 2, 1, "", "manual_weights_bit_width_selection_list"]], "model_compression_toolkit.core.common.quantization.bit_width_config": [[0, 0, 1, "", "ManualBitWidthSelection"]], "model_compression_toolkit.core.common.quantization.bit_width_config.ManualBitWidthSelection": [[0, 2, 1, "", "bit_width"], [0, 2, 1, "", "filter"]], "model_compression_toolkit.core.network_editor": [[43, 0, 1, "", "ChangeCandidatesActivationQuantConfigAttr"], [43, 0, 1, "", "ChangeCandidatesActivationQuantizationMethod"], [43, 0, 1, "", "ChangeCandidatesWeightsQuantConfigAttr"], [43, 0, 1, "", "ChangeCandidatesWeightsQuantizationMethod"], [43, 0, 1, "", "ChangeFinalActivationQuantConfigAttr"], [43, 0, 1, "", "ChangeFinalWeightsQuantConfigAttr"], [43, 0, 1, "", "ChangeFinalWeightsQuantizationMethod"], [43, 0, 1, "", "ChangeQuantizationParamFunction"], [43, 0, 1, "", "EditRule"], [43, 0, 1, "", "NodeNameFilter"], [43, 0, 1, 
"", "NodeNameScopeFilter"], [43, 0, 1, "", "NodeTypeFilter"]], "model_compression_toolkit.data_generation": [[1, 0, 1, "", "BNLayerWeightingType"], [1, 0, 1, "", "BatchNormAlignemntLossType"], [1, 0, 1, "", "DataGenerationConfig"], [1, 0, 1, "", "DataInitType"], [1, 0, 1, "", "ImageGranularity"], [1, 0, 1, "", "ImageNormalizationType"], [1, 0, 1, "", "ImagePipelineType"], [1, 0, 1, "", "OutputLossType"], [1, 0, 1, "", "SchedulerType"], [14, 3, 1, "", "get_keras_data_generation_config"], [16, 3, 1, "", "get_pytorch_data_generation_config"], [20, 3, 1, "", "keras_data_generation_experimental"], [28, 3, 1, "", "pytorch_data_generation_experimental"]], "model_compression_toolkit.exporter": [[41, 0, 1, "", "KerasExportSerializationFormat"], [41, 0, 1, "", "PytorchExportSerializationFormat"], [41, 0, 1, "", "QuantizationFormat"], [41, 0, 1, "", "keras_export_model"], [41, 0, 1, "", "pytorch_export_model"]], "model_compression_toolkit.gptq": [[4, 0, 1, "", "GPTQHessianScoresConfig"], [4, 0, 1, "", "GradientPTQConfig"], [4, 0, 1, "", "GradualActivationQuantizationConfig"], [4, 0, 1, "", "QFractionLinearAnnealingConfig"], [4, 0, 1, "", "RoundingType"], [15, 3, 1, "", "get_keras_gptq_config"], [17, 3, 1, "", "get_pytorch_gptq_config"], [21, 3, 1, "", "keras_gradient_post_training_quantization"], [29, 3, 1, "", "pytorch_gradient_post_training_quantization"]], "model_compression_toolkit.pruning": [[6, 0, 1, "", "ChannelsFilteringStrategy"], [6, 0, 1, "", "ImportanceMetric"], [6, 0, 1, "", "PruningConfig"], [7, 0, 1, "", "PruningInfo"], [25, 3, 1, "", "keras_pruning_experimental"], [32, 3, 1, "", "pytorch_pruning_experimental"]], "model_compression_toolkit.pruning.PruningConfig": [[6, 2, 1, "", "channels_filtering_strategy"], [6, 2, 1, "", "importance_metric"], [6, 2, 1, "", "num_score_approximations"]], "model_compression_toolkit.pruning.PruningInfo": [[7, 4, 1, "", "importance_scores"], [7, 4, 1, "", "pruning_masks"]], "model_compression_toolkit.ptq": [[24, 3, 1, "", 
"keras_post_training_quantization"], [31, 3, 1, "", "pytorch_post_training_quantization"]], "model_compression_toolkit.qat": [[44, 0, 1, "", "QATConfig"], [26, 3, 1, "", "keras_quantization_aware_training_finalize_experimental"], [27, 3, 1, "", "keras_quantization_aware_training_init_experimental"], [33, 3, 1, "", "pytorch_quantization_aware_training_finalize_experimental"], [34, 3, 1, "", "pytorch_quantization_aware_training_init_experimental"]], "model_compression_toolkit.target_platform_capabilities": [[42, 0, 1, "", "AttributeFilter"], [42, 0, 1, "", "Eq"], [42, 0, 1, "", "Greater"], [42, 0, 1, "", "GreaterEq"], [42, 0, 1, "", "NotEq"], [45, 0, 1, "", "QuantizationMethod"], [42, 0, 1, "", "Smaller"], [42, 0, 1, "", "SmallerEq"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema": [[45, 0, 1, "", "AttributeQuantizationConfig"], [45, 0, 1, "", "Fusing"], [45, 0, 1, "", "OpQuantizationConfig"], [45, 0, 1, "", "OperatorSetGroup"], [45, 0, 1, "", "OperatorsSet"], [45, 0, 1, "", "QuantizationConfigOptions"], [45, 0, 1, "", "TargetPlatformCapabilities"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.AttributeQuantizationConfig": [[45, 2, 1, "", "enable_weights_quantization"], [45, 2, 1, "", "lut_values_bitwidth"], [45, 2, 1, "", "weights_n_bits"], [45, 2, 1, "", "weights_per_channel_threshold"], [45, 2, 1, "", "weights_quantization_method"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.Fusing": [[45, 2, 1, "", "fuse_op_quantization_config"], [45, 2, 1, "", "name"], [45, 2, 1, "", "operator_groups"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorSetGroup": [[45, 2, 1, "", "name"], [45, 2, 1, "", "operators_set"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.OperatorsSet": [[45, 2, 1, "", "name"], [45, 2, 1, "", "qc_options"], [45, 2, 1, "", "type"]], 
"model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.QuantizationConfigOptions": [[45, 2, 1, "", "base_config"], [45, 2, 1, "", "quantization_configurations"]], "model_compression_toolkit.target_platform_capabilities.schema.mct_current_schema.TargetPlatformCapabilities": [[45, 2, 1, "", "SCHEMA_VERSION"], [45, 2, 1, "", "add_metadata"], [45, 2, 1, "", "default_qco"], [45, 2, 1, "", "fusing_patterns"], [45, 2, 1, "", "insert_preserving_quantizers"], [45, 2, 1, "", "is_simd_padding"], [45, 2, 1, "", "name"], [45, 2, 1, "", "operator_set"], [45, 2, 1, "", "tpc_minor_version"], [45, 2, 1, "", "tpc_patch_version"], [45, 2, 1, "", "tpc_platform_type"]], "model_compression_toolkit.trainable_infrastructure": [[46, 0, 1, "", "BaseKerasTrainableQuantizer"], [46, 0, 1, "", "BasePytorchTrainableQuantizer"], [46, 0, 1, "", "TrainableQuantizerActivationConfig"], [46, 0, 1, "", "TrainableQuantizerWeightsConfig"], [46, 0, 1, "", "TrainingMethod"]], "model_compression_toolkit.wrapper.mct_wrapper": [[11, 0, 1, "", "MCTWrapper"]], "model_compression_toolkit.wrapper.mct_wrapper.MCTWrapper": [[11, 1, 1, "", "quantize_and_export"]], "model_compression_toolkit.xquant.common.xquant_config": [[12, 0, 1, "", "XQuantConfig"]], "model_compression_toolkit.xquant.keras.facade_xquant_report": [[36, 3, 1, "", "xquant_report_keras_experimental"]], "model_compression_toolkit.xquant.pytorch.facade_xquant_report": [[37, 3, 1, "", "xquant_report_pytorch_experimental"], [38, 3, 1, "", "xquant_report_troubleshoot_pytorch_experimental"]]}, "objnames": {"0": ["py", "class", "Python class"], "1": ["py", "method", "Python method"], "2": ["py", "attribute", "Python attribute"], "3": ["py", "function", "Python function"], "4": ["py", "property", "Python property"]}, "objtypes": {"0": "py:class", "1": "py:method", "2": "py:attribute", "3": "py:function", "4": "py:property"}, "terms": {"": [3, 6, 8, 10, 21, 24, 25, 26, 27, 29, 31, 32, 34, 35, 41, 42, 43, 45, 46, 48, 50], "0": [1, 3, 
4, 5, 7, 8, 11, 12, 14, 16, 21, 24, 25, 26, 27, 32, 40, 41, 46, 48], "05": 8, "06": 5, "08153": 46, "1": [1, 3, 4, 5, 7, 8, 9, 11, 12, 17, 20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32, 33, 40, 41, 48, 50], "10": [20, 21, 24, 27, 28, 29, 31, 34], "100": 40, "10000000000": 5, "14": 11, "15": 41, "16": [12, 41, 48], "1902": 46, "1e": [5, 15, 17], "1st": 15, "2": [3, 8, 9, 12, 15, 17, 20, 28, 40, 45, 46, 48, 50], "20": 49, "2021": 50, "2023": 50, "224": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "2f": 40, "2nd": 15, "3": [3, 9, 11, 15, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 41, 46], "32": [4, 5, 11], "3e": [15, 17], "3rd": 15, "4": [15, 17, 20, 21, 24, 25, 27, 28, 29, 31, 32, 34, 48], "4th": 15, "5": [11, 12, 15, 17, 25, 32, 48], "50": [25, 32], "52587890625e": 8, "6": [28, 40], "75": [11, 21, 24, 26, 27], "8": [20, 21, 24, 26, 27, 28, 41, 45, 46], "9": 43, "A": [0, 3, 4, 5, 7, 8, 13, 15, 17, 21, 22, 23, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 39, 40, 43, 44, 45, 50], "And": 48, "As": [5, 48, 49], "By": [4, 5, 11, 25, 29, 31, 32, 41, 49], "For": [3, 8, 12, 18, 19, 20, 21, 24, 26, 27, 28, 34, 41, 45, 46, 47, 48, 49, 50], "If": [2, 3, 4, 5, 9, 12, 15, 17, 21, 24, 26, 27, 29, 31, 39, 40, 41, 42, 45, 48], "In": [5, 20, 21, 24, 27, 28, 29, 31, 34, 41, 42, 44, 48], "It": [2, 9, 11, 12, 45, 46, 48], "No": 1, "One": 49, "The": [0, 1, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 34, 36, 37, 38, 40, 41, 43, 45, 46, 48, 49], "Then": [3, 21, 24, 27, 29, 31, 34, 43, 49], "There": [41, 48, 49], "These": [48, 49], "To": [41, 48, 49], "With": 48, "_": [21, 24, 27, 29, 31, 34, 41], "__call__": 40, "__import__": 40, "__init__": 40, "_input_data": 41, "_model_input_nam": 41, "_model_output_nam": 41, "_with_model_output_loss_object": 48, "about": [3, 4, 7, 13, 15, 17, 21, 24, 26, 27, 40, 41, 45, 46], "abov": [12, 48], "absolut": 9, "abstract": [13, 46], "accept": [15, 40, 45], "access": 7, "accord": 
[13, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 41, 42], "accordingli": 45, "accuraci": [9, 12, 48], "achiev": 25, "across": 9, "act": 7, "act_hessian_default_batch_s": [15, 17], "action": 40, "activ": [0, 3, 4, 5, 8, 10, 11, 21, 22, 24, 27, 29, 30, 31, 34, 41, 43, 44, 45, 46, 48, 49], "activation_bias_correct": 8, "activation_bias_correction_threshold": 8, "activation_channel_equ": 8, "activation_error_method": [8, 11], "activation_memori": 10, "activation_min_max_map": 3, "activation_n_bit": [45, 46], "activation_op": 3, "activation_quantization_candid": 46, "activation_quantization_method": [43, 45, 46], "activation_quantization_param": 46, "activation_quantization_params_fn": 43, "activation_quantizer_map": 3, "activation_quantizer_params_overrid": 44, "activation_training_method": 44, "ad": 45, "adam": [14, 15, 17], "add": [1, 3, 12, 14, 16, 23, 46], "add_metadata": 45, "addit": [23, 41, 48], "address": 45, "advanc": 3, "affect": [21, 24, 26, 27], "after": [13, 21, 23, 24, 27, 34, 48, 50], "aim": [25, 32], "algorithm": 5, "align": [1, 14, 16], "all": [1, 3, 4, 5, 8, 43, 46, 49], "allimag": [1, 16], "allow": [6, 12, 20, 28, 40, 41, 45], "along": 49, "also": [25, 32, 45], "amount": 9, "an": [1, 2, 3, 4, 7, 9, 11, 13, 21, 24, 27, 34, 36, 37, 38, 40, 41, 42, 43, 45, 46, 48, 50], "analysi": [25, 32], "analyz": [25, 32, 38], "analyze_similar": 40, "ani": [1, 2, 3, 5, 11, 36, 37, 38, 41, 42, 46], "anneal": 4, "anomali": 9, "api": [3, 4, 24, 27, 34, 44, 48], "append": 40, "appli": [0, 1, 5, 8, 13, 41, 42, 43, 45, 48], "applic": [21, 22, 24, 25, 26, 27, 41], "approach": 6, "appropri": 48, "approxim": [6, 25, 32], "ar": [3, 5, 9, 12, 18, 19, 21, 24, 25, 27, 29, 31, 32, 34, 40, 41, 45, 46, 47, 48, 49], "architectur": [25, 32], "argument": [4, 40, 41, 45], "arrai": [7, 11], "art": 50, "arxiv": [46, 50], "assess": [25, 32], "associ": [25, 32], "assum": [25, 32], "astyp": 41, "attent": [4, 15, 17, 46], "attirbut": 3, "attr": 42, "attr_nam": 43, "attr_valu": 43, 
"attr_weights_configs_map": 45, "attribut": [43, 45, 46], "attributefilt": 42, "auto": 13, "automat": 48, "auxiliari": [15, 17], "avail": 41, "averag": [1, 5, 14, 15, 16, 17, 48], "avg": 5, "avoid": 9, "awar": [13, 44, 46, 50], "axi": [3, 46, 48], "backend": 45, "bar": 40, "base": [1, 4, 5, 8, 9, 11, 13, 15, 17, 18, 19, 20, 25, 28, 31, 32, 46, 48, 50], "base_config": 45, "basenod": 7, "basenodematch": 0, "basic": [9, 46], "batch": [1, 4, 5, 14, 15, 16, 17, 20, 21, 24, 27, 28, 29, 31, 34], "batchnorm": [1, 14, 16, 20, 21, 24, 27, 29, 31, 34], "batchnorm2d": 28, "batchnormalignemntlosstyp": [14, 16], "batchwis": [1, 14], "been": [7, 40], "begin": 4, "behavior": [9, 40, 48], "being": [21, 24, 27, 29, 31, 34, 40, 45, 46], "below": [12, 48], "between": [4, 5, 12, 21, 29, 31, 45, 48, 49], "bia": [4, 9, 11, 15, 17, 21, 24, 26, 27], "bias": 9, "bidwidth": 5, "bit": [0, 5, 10, 13, 21, 24, 26, 27, 34, 39, 41, 43, 45, 46, 50], "bit_width": 0, "bit_width_config": [0, 39], "bitwidth": [5, 12, 21, 24, 26, 27, 48], "bitwidthconfig": [13, 39], "block": [46, 49], "bn_alignment_loss_typ": [1, 14, 16], "bn_layer_typ": [1, 14, 16], "bnlayerweightingtyp": [14, 16], "bool": [1, 4, 5, 11, 12, 14, 15, 16, 17, 40, 45, 46], "boolean": 23, "bop": 10, "both": [11, 21, 24, 29, 31, 33, 46, 49], "build": [22, 30, 46, 50], "built": [27, 34, 46], "bypass": 40, "byte": [10, 21, 24, 25, 27, 32, 34, 49], "c": [12, 48], "calcul": [5, 6, 13, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 48], "calibr": [11, 21, 22, 24, 27, 29, 30, 31, 34], "call": [22, 30, 35, 45, 49], "callabl": [3, 5, 11, 12, 15, 17, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 36, 37, 38, 40, 41, 42], "callback": 40, "can": [3, 4, 8, 11, 13, 15, 17, 20, 22, 25, 28, 30, 32, 40, 41, 43, 45, 46, 48, 49, 50], "candid": [5, 21, 24, 26, 27, 43], "cannot": 45, "capabl": [11, 18, 19, 25, 30, 32], "case": 5, "caus": [12, 13, 38, 48], "chang": [20, 28, 41, 43, 48, 49], "changecandidatesactivationquantconfigattr": 43, 
"changecandidatesactivationquantizationmethod": 43, "changecandidatesweightsquantconfigattr": 43, "changecandidatesweightsquantizationmethod": 43, "changefinalactivationquantconfigattr": 43, "changefinalweightsquantconfigattr": 43, "changefinalweightsquantizationmethod": 43, "changequantizationmethod": 43, "changequantizationparamfunct": 43, "channel": [3, 6, 7, 13, 25, 32, 45, 46, 49], "channels_filtering_strategi": 6, "check": [5, 41, 42, 43], "choos": [1, 4, 41], "chosen": 49, "circl": 48, "class": [0, 1, 5, 6, 7, 8, 9, 10, 11, 12, 13, 23, 39, 40, 41, 42, 43, 44, 45, 46], "clibrat": 31, "click": 49, "clip": [1, 9, 14, 16], "clone": 50, "close": 9, "coeffici": [3, 21, 24, 26, 27, 29, 31, 45, 46], "cohen": 50, "collaps": 11, "collect": [3, 21, 24, 27, 29, 31, 34, 36, 37, 38, 49], "com": 50, "combin": 45, "common": [0, 12], "compar": [5, 21, 29, 31, 48, 49], "comparison": 50, "compat": 41, "compil": 23, "complet": [4, 11, 40], "completedcompon": 40, "compon": [40, 45, 46, 48], "component_nam": 40, "compress": [11, 13, 20, 25, 28, 29, 32, 48], "comput": [3, 4, 5, 9, 12, 13, 15, 17, 22, 30, 36, 40, 49], "compute_distance_fn": 5, "concat_threshold_upd": 8, "concaten": [12, 45, 48], "concatn": [12, 48], "config": [4, 20, 21, 24, 25, 26, 27, 28, 29, 32, 33, 34, 39, 43, 46], "configur": [0, 4, 5, 8, 10, 11, 13, 14, 15, 16, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 33, 34, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 48, 50], "configuration_overwrit": 5, "confirm": 48, "connect": 11, "consid": [6, 14, 16, 25, 32, 45], "consol": 48, "constant": [6, 43, 46], "constraint": [21, 24, 25, 29, 31, 32], "contain": [7, 13, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 36, 37, 38, 46, 48], "conv2d": [3, 20, 21, 24, 26, 27, 28, 43, 45], "conveni": 35, "convent": 48, "convert": [11, 13, 26, 33, 45], "core": [0, 3, 5, 8, 9, 10, 11, 21, 22, 24, 25, 26, 27, 29, 30, 32, 33, 34, 39, 40, 43], "core_config": [21, 22, 24, 26, 27, 29, 30, 31, 33, 34, 40], "coreconfig": [13, 21, 22, 24, 
26, 27, 29, 30, 31, 33, 34, 40], "correct": 11, "correspond": [7, 48], "cosin": [48, 50], "count_param": [21, 24, 25, 26, 27], "countermeasur": 48, "cpuexecutionprovid": 41, "creat": [3, 4, 8, 11, 13, 14, 15, 16, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 41, 42, 43, 45, 48], "creation": 41, "crop": 1, "cudaexecutionprovid": 41, "cui": 40, "current": [4, 40, 41], "currentcompon": 40, "custom": [5, 12, 20, 23, 27, 28, 41], "custom_metric_fn": 5, "custom_object": [23, 26, 27], "custom_similarity_metr": 12, "custom_tpc_opset_to_lay": 8, "cut": 40, "dash": 48, "data": [9, 13, 14, 16, 22, 25, 30, 32, 36, 37, 38, 40, 41, 45, 49, 50], "data_gen_batch_s": [1, 14, 16, 20, 28], "data_gener": [1, 14, 16, 20, 28], "data_generation_config": [20, 28], "data_init_typ": [1, 14, 16], "dataclass": [39, 40], "datagenerationconfig": [1, 13, 20, 28], "datainittyp": [14, 16], "dataset": [4, 11, 15, 17, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 41, 48, 49], "debug": [9, 39, 40], "debug_config": 39, "debugconfig": 39, "deeper": 49, "def": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 40, 41], "default": [1, 2, 4, 5, 6, 9, 11, 14, 15, 16, 17, 21, 24, 25, 29, 31, 32, 39, 41, 44, 45, 49], "default_data_gen_b": [14, 16], "default_factori": 2, "default_keras_extra_pixel": 14, "default_keras_initial_lr": 14, "default_keras_output_loss_multipli": 14, "default_keras_tpc": [21, 24, 25, 27], "default_n_it": [14, 16], "default_onnx_opset_vers": 41, "default_pytorch_bn_layer_typ": 16, "default_pytorch_extra_pixel": 16, "default_pytorch_initial_lr": 16, "default_pytorch_last_layer_typ": 16, "default_pytorch_output_loss_multipli": 16, "default_pytorch_tpc": [29, 31, 32, 34], "default_qco": 45, "default_valu": 2, "default_weight_attr_config": 45, "defaultdict": [3, 13], "defin": [0, 4, 5, 15, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 40, 45, 46, 48], "degrad": [12, 13, 38, 48], "demonstr": [41, 45], "dens": [3, 20], "dense_nparam": [25, 32], "depend": [1, 21, 
24, 27, 29, 31, 34], "describ": 48, "descript": [11, 40], "desir": [13, 21, 22, 24, 26, 27, 29, 30, 31, 34], "detail": [41, 45, 48], "detect": [9, 12, 13, 38, 48], "determin": [6, 25, 32, 45], "develop": 50, "deviat": 48, "devic": [13, 18], "device_typ": 18, "diagram": 45, "diamant": 50, "dict": [3, 7, 12, 36, 37, 38, 41, 45, 46, 48], "dictionari": [2, 3, 4, 12, 26, 27, 36, 37, 38, 41, 43, 44, 46], "differ": [1, 8, 13, 21, 24, 26, 27, 41, 45, 48, 49], "dikstein": 50, "dir": [12, 48, 49], "directori": [12, 13, 35, 48], "disabl": [15, 17, 40], "displai": [40, 48, 49], "distanc": [5, 11], "distance_weighting_method": [5, 11], "distil": [4, 50], "distribut": 9, "diverg": [9, 49], "divers": 1, "divid": 3, "divis": 49, "dnn": 46, "do": [1, 48, 49], "document": [13, 24, 27, 34, 48], "doe": 48, "doesn": 50, "don": 35, "done": 49, "dot": 49, "dqa": 46, "dror": 50, "dtype": 41, "dummi": 17, "durat": [25, 32], "dure": [4, 13, 14, 15, 16, 17, 18, 19, 36, 37, 38, 41, 43, 45, 46, 47, 49], "e": [3, 5, 11, 21, 24, 27, 29, 31, 34, 50], "each": [5, 6, 7, 12, 21, 24, 25, 27, 29, 31, 32, 34, 43, 45, 46, 48, 49], "easi": 48, "easili": [13, 50], "edit": [39, 40, 43], "editrul": 40, "effect": 9, "either": 45, "element": [7, 45], "empti": 2, "emul": 46, "enabl": [1, 5, 8, 11, 13, 15, 17, 40, 46, 50], "enable_activation_quant": [45, 46], "enable_weights_quant": [45, 46], "encapsul": [0, 8], "end_step": 4, "engin": 50, "enhanc": 50, "ensur": 5, "entir": [9, 13], "enum": [1, 3, 4, 6, 9, 46], "epoch": [4, 11, 15, 17], "epsilon": 5, "eptq": 50, "eq": 42, "equal": [9, 42], "er_list": 43, "error": [9, 11, 12, 40], "especi": 9, "estim": [4, 46], "etc": [3, 10, 13, 21, 24, 27, 29, 31, 34, 49], "euclidean": 49, "evalu": [5, 36, 37, 38], "even": 48, "exact": 17, "exampl": [3, 8, 9, 11, 15, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 43, 45, 46, 50], "exceed": 48, "execut": 48, "exist": [2, 43, 48], "exp": 5, "exp_distance_weighting_sigma": 5, "expect": [4, 49], "experiment": [13, 
20, 28, 50], "explain": [12, 13, 36, 37, 38, 46], "explicitli": 45, "expon": 5, "exponenti": 5, "export": 11, "extend": [25, 32], "extens": [11, 41, 50], "extra": [1, 14, 16], "extra_pixel": [1, 14, 16], "extrem": [9, 48], "f": 40, "facade_xquant_report": [36, 37, 38], "factor": [4, 5, 9, 15, 17], "factori": [0, 4, 39, 40], "fake": 41, "fake_qu": [27, 34], "fakely_qu": 41, "fallback": 45, "fals": [4, 5, 8, 11, 12, 14, 15, 17, 40, 46], "familiar": 48, "featur": 40, "fetch": 45, "few": [9, 49, 50], "field": [18, 19, 42, 45, 47], "figur": [40, 49], "file": [23, 26, 27, 35, 40, 41], "filepath": 23, "filter": [0, 1, 6], "final": [4, 5, 12, 13, 20, 28, 43, 48, 49, 50], "find": [21, 24, 27, 34], "fine": [15, 17, 25, 26, 27, 32, 33, 34], "first": [1, 21, 24, 27, 29, 31, 34, 41, 49], "first_layer_multipli": 1, "fix": 45, "fixed_scal": [18, 19, 45, 47], "fixed_zero_point": [18, 19, 45, 47], "flag": [1, 11, 40, 45], "flatten": [20, 28], "flip": 1, "float": [1, 4, 5, 11, 12, 14, 15, 16, 17, 21, 27, 29, 31, 34, 36, 37, 38, 41, 45, 46, 48, 49], "float32": [25, 32, 41], "float_model": [11, 36, 37, 38, 41, 48], "flush": 40, "fold": [21, 24, 27, 29, 31, 34], "folder": [35, 48], "follow": [3, 4, 11, 12, 40, 46, 48, 49], "footprint": [25, 32], "form": 45, "format": [3, 13], "fraction": 4, "framework": [3, 11, 46], "frameworkquantizationcap": [22, 29, 30, 31], "free": [6, 20, 25, 28, 32, 50], "freez": 46, "freeze_quant_param": 46, "friendli": [25, 32, 50], "from": [3, 4, 11, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 41, 43, 45, 46, 47, 48, 49, 50], "from_config": 46, "function": [3, 4, 5, 11, 12, 13, 14, 15, 16, 17, 20, 23, 25, 28, 32, 35, 40, 43, 45, 46, 48], "fuse_op_quantization_config": 45, "fusing_pattern": 45, "futur": [18, 19, 20, 28, 45, 47], "g": [3, 11, 21, 24, 27, 29, 31, 34], "gather": [45, 49], "gaussian": [1, 14, 16], "gener": [2, 12, 13, 14, 16, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 45, 49, 50], "generated_imag": [20, 
28], "get": [2, 3, 4, 5, 13, 21, 24, 26, 27, 29, 31, 33, 34, 45, 49], "get_config": 46, "get_input": 41, "get_keras_data_generation_config": [13, 14, 20], "get_keras_gptq_config": [11, 13, 15, 21], "get_ort_session_opt": 41, "get_output": 41, "get_pytorch_data_generation_config": [13, 16, 28], "get_pytorch_gptq_config": [11, 13, 17], "get_target_platform_cap": [13, 18, 45], "get_target_platform_capabilities_sdsp": [13, 19, 45], "git": 50, "github": [41, 50], "given": [2, 21, 22, 24, 27, 29, 30, 31, 34], "gordon": 50, "gptq": [4, 9, 11, 15, 17, 21, 29, 40], "gptq_conf": [15, 17, 29], "gptq_config": [21, 29, 31], "gptq_quantizer_params_overrid": 4, "gptq_representative_data_gen": [21, 29], "grad": 1, "gradient": [1, 4, 11, 13, 31, 50], "gradientptq": [4, 13], "gradientptqconfig": [13, 21, 29], "gradual": 4, "gradual_activation_quant": [15, 17], "gradual_activation_quantization_config": 4, "gradualactivationquant": [15, 17], "gradualactivationquantizationconfig": [15, 17], "granular": [1, 14, 16], "graph": [22, 30, 43, 49], "greater": 42, "greatereq": 42, "greedi": [5, 6], "group": [3, 6, 25, 32, 45], "h": 50, "ha": [7, 40, 41, 42, 43], "habi": 50, "handl": [11, 21, 24, 27, 29, 31, 34], "handler": 35, "hardwar": [13, 25, 32, 45, 46, 50], "have": [3, 41, 42, 48, 49], "henc": 45, "here": [12, 25, 32, 41, 45, 48, 50], "hessian": [4, 5, 6, 9, 11, 15, 17, 25, 32, 50], "hessian_batch_s": [4, 5, 15, 17], "hessian_weights_config": 4, "hessians_num_sampl": 4, "higher": [25, 32], "highlight": 48, "hight": 28, "histogram": [21, 24, 27, 29, 31, 34, 49], "histori": 40, "hmse": 9, "hold": [3, 39, 42, 45], "holder": 46, "how": [3, 6, 9, 21, 22, 24, 27, 29, 31, 34, 40, 41, 46, 50], "howev": 41, "hptq": [45, 50], "http": [46, 50], "hw": 22, "i": [1, 2, 3, 4, 5, 6, 7, 9, 11, 12, 13, 15, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 34, 35, 39, 40, 41, 42, 43, 45, 46, 48, 49, 50], "ident": [1, 5], "identifi": [25, 32, 45, 48], "ignor": [18, 19, 45, 47], "ilp": [21, 24, 27, 34], "imag": 
[1, 4, 5, 11, 14, 16, 20, 21, 24, 27, 28, 29, 31, 34, 48, 49], "image_clip": [1, 14, 16], "image_granular": [1, 14, 16], "image_normalization_typ": [1, 14, 16], "image_pipeline_typ": [1, 14, 16], "imagegranular": [14, 16], "imagenet": 1, "imagenet1k_v1": 32, "imagenormalizationtyp": [14, 16], "imagepipelinetyp": [14, 16], "imagewis": 1, "impact": [25, 32], "implement": [12, 46], "implment": 46, "import": [3, 6, 7, 8, 9, 11, 13, 15, 17, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 40, 41, 43, 46, 48, 49], "importance_metr": 6, "importance_scor": 7, "improv": [5, 25, 32, 48], "imx500": [11, 41, 45], "imx500_tp_model": 18, "in_model": [21, 22, 24, 26, 27, 30, 33, 34], "in_modul": [31, 48], "includ": [4, 7, 11, 21, 24, 27, 29, 31, 34, 45, 46], "increas": [4, 5], "index": [3, 13], "indic": [3, 7, 25, 32, 45, 48], "individu": 48, "induc": 9, "inf": [8, 10, 11], "infer": [13, 26, 33, 45, 46], "inferablequant": [26, 33], "inferencesess": 41, "influenc": 9, "info": [6, 35, 40], "inform": [3, 4, 13, 15, 17, 18, 19, 21, 24, 25, 27, 29, 31, 32, 34, 40, 45, 46, 47], "infrastructur": 46, "init": [13, 43, 50], "initi": [1, 2, 4, 6, 11, 12, 14, 16, 27, 34, 46, 48], "initial_lr": [1, 14, 16], "initial_q_fract": 4, "inner": 2, "input": [1, 5, 11, 14, 16, 21, 24, 27, 29, 31, 34, 40, 45, 48], "input_sc": 8, "input_shap": 20, "insert": 49, "insert_preserving_quant": 45, "instal": 41, "instanc": [4, 11, 13, 15, 17, 43, 45, 49], "instanti": [4, 8, 44], "instruct": 45, "insuffici": [12, 48], "int": [0, 1, 4, 5, 6, 12, 14, 15, 16, 17, 20, 28, 35, 40, 41, 45, 46, 48], "int8": 41, "integ": [5, 9, 41, 45], "interest": 5, "interfac": [4, 11, 17], "introduc": 46, "inverse_min_max_diff": 1, "involv": [20, 25, 28, 32], "is_detect_under_threshold_quantize_error": 12, "is_keras_layer_export": 41, "is_layer_exportable_fn": 41, "is_pytorch_layer_export": 41, "is_simd_pad": 45, "issu": [5, 41, 48], "item": 48, "iter": [1, 14, 16, 20, 21, 24, 27, 28, 29, 31, 34, 40], "its": [2, 3, 11, 13, 
23, 25, 32, 42, 45, 49], "jen": 50, "judg": [12, 13, 38, 48], "judgment": 48, "just": 50, "keep": [33, 40, 50], "kei": [2, 11, 12, 25, 32, 42], "kept": [7, 27, 34], "ker": 27, "kera": [3, 11, 13, 43, 46, 50], "keras_appl": [1, 14], "keras_data_generation_experiment": [13, 20], "keras_default_tpc": 22, "keras_file_path": 41, "keras_gradient_post_training_quant": [13, 15, 21], "keras_load_quantized_model": 23, "keras_post_training_quant": [13, 24, 41, 43, 49], "keras_pruning_experiment": [13, 25], "keras_quantization_aware_training_finalize_experiment": [13, 26], "keras_quantization_aware_training_init_experiment": [13, 26, 27], "keras_resource_utilization_data": [13, 22], "kernel": [3, 21, 24, 26, 27, 43, 46], "kernel_channels_map": 3, "kernel_op": 3, "kernel_ops_attributes_map": 3, "keyword": 45, "kl": [9, 49], "know": [3, 13], "knowledg": [4, 50], "known_dict": 2, "kwarg": 43, "l": [25, 50], "l2": 1, "l2_squar": [1, 14, 16], "l_p_valu": [8, 9], "label": [6, 25, 32, 45, 50], "lambda": 41, "larg": [12, 48], "larger": 5, "last": [3, 4, 5, 48], "last_lay": 5, "last_layer_typ": [1, 16], "latenc": 41, "latest": 50, "launch": 49, "layaer": [13, 38], "layer": [1, 3, 5, 7, 9, 11, 12, 14, 15, 16, 17, 20, 21, 24, 25, 26, 27, 29, 31, 32, 33, 34, 40, 41, 43, 45, 46, 48, 49], "layer_min_max_map": 3, "layer_weighting_typ": [1, 14, 16], "layerfilterparam": 42, "learn": [1, 14, 15, 16, 17, 46], "learnabl": 46, "least": 6, "left": 11, "let": 41, "level": 35, "lfh": [6, 25, 32], "librari": [3, 8], "like": [8, 45], "limit": [6, 21, 24, 26, 27, 29, 31, 34], "line": 48, "linear": [4, 11, 28], "linear_collaps": [8, 11], "linearli": 4, "link": 48, "list": [0, 1, 3, 5, 11, 14, 15, 16, 20, 28, 40, 41, 43, 50], "liter": 45, "ll": [20, 28], "load": [13, 26, 27, 41, 46], "load_model": [26, 27], "loadopt": 23, "log": [4, 12, 13, 15, 17, 35, 48, 49], "log_funct": [4, 15, 17], "log_norm": 4, "log_tensorboard_xqu": 48, "logdir": 49, "logger": [13, 40, 49], "longer": 41, "look": [24, 27, 34, 45, 
50], "lookup": 45, "loss": [1, 4, 12, 14, 15, 16, 17, 21, 25, 29, 31, 32, 48], "lot": 9, "low": 11, "lp": 9, "lsq": 46, "lut_pot_quant": 45, "lut_sym_quant": 45, "lut_values_bitwidth": 45, "mae": [9, 49], "mai": [20, 21, 24, 27, 28, 29, 31, 34, 42, 49], "main": [11, 45, 48, 49], "maintain": 9, "make": [9, 40], "manag": [0, 11], "mandatori": 41, "mani": 49, "manipul": [0, 1], "manner": 45, "manual": [0, 13, 39, 48], "manual_activation_bit_width_selection_list": 0, "manual_weights_bit_width_selection_list": 0, "manualweightsbitwidthselect": 0, "map": [3, 45], "mask": 7, "match": [18, 19, 42, 43], "mathemat": 49, "max": [1, 3, 5, 8, 9, 21, 22, 24, 27, 29, 30, 31, 34, 49], "maxbit": 5, "maxim": [21, 24, 27, 34], "mct": [3, 8, 11, 13, 15, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 39, 40, 41, 43, 45, 46, 47, 48, 49, 50], "mct_current_schema": 45, "mct_quantiz": 41, "mct_wrapper": 11, "mctwrapper": 11, "mean": [1, 4, 9, 49], "measur": [6, 10, 12, 48, 49], "meet": [25, 32], "memori": [10, 25, 32, 49], "messag": 48, "metadata": [7, 45], "method": [4, 5, 6, 9, 11, 13, 25, 32, 35, 41, 43, 44, 45, 46], "metric": [4, 5, 6, 12, 36, 37, 38, 48], "metric_epsilon": 5, "metric_norm": 5, "metric_normalization_threshold": 5, "min": [1, 3, 5, 8, 9, 21, 24, 27, 29, 31, 34, 49], "min_threshold": [8, 46], "minbit": 5, "minim": [5, 9, 21, 25, 29, 31, 32], "minimum": [9, 46], "minor": 45, "minut": 50, "mix": [5, 10, 11, 12, 13, 21, 22, 24, 26, 27, 29, 30, 31, 34, 39, 45, 48, 50], "mixed_precis": 11, "mixed_precision_config": [21, 22, 24, 26, 27, 39], "mixedprecisionquantizationconfig": [11, 13, 21, 22, 24, 26, 27, 39], "mkstemp": 41, "mobilenet": [21, 22], "mobilenet_v2": [24, 26, 27, 29, 30, 31, 33, 34, 41], "mobilenetv2": [24, 26, 27, 41, 49], "model": [3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 18, 19, 20, 21, 24, 25, 28, 29, 31, 32, 36, 37, 38, 39, 40, 43, 44, 45, 46, 48, 49], "model_compression_toolkit": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 48, 49], "model_fil": [26, 27], "model_format_onnx_mctq": 41, "model_mp": 5, "model_output": 41, "modifi": [13, 43], "modul": [13, 28, 29, 30, 31, 32, 37, 38], "more": [9, 18, 19, 24, 25, 27, 32, 34, 41, 45, 47, 48, 49], "most": 48, "mse": [8, 9, 11, 12, 48, 49], "much": 40, "multipl": [3, 5, 35, 45], "multiple_tensors_mse_loss": 4, "multipli": [1, 12, 14, 16, 48], "must": [25, 32, 45], "n_epoch": [4, 11, 15, 17, 21], "n_imag": [20, 28], "n_iter": [1, 14, 16, 20, 28], "nadam": 15, "name": [12, 40, 43, 45, 48, 49], "nchw": 3, "ndarrai": 7, "necessari": [4, 11, 41, 46, 48], "need": [3, 11, 13, 21, 24, 27, 29, 31, 34, 41, 42, 46, 48], "neg": [1, 5, 48], "negative_min_max_diff": [1, 16], "network": [3, 6, 11, 33, 39, 40, 43, 49, 50], "network_editor": [13, 40], "netzer": 50, "neural": [6, 11, 50], "neuron": 7, "new": [43, 45], "next": [20, 28, 41, 42], "nhwc": 3, "nn": [28, 37, 38], "no_norm": 1, "no_quantization_op": 3, "noclip": [8, 9], "node": [0, 27, 34, 41, 43, 46, 49], "node_nam": 43, "node_name_scop": 43, "node_typ": 43, "nodenamefilt": 43, "nodenamescopefilt": 43, "nodetypefilt": 43, "nois": 9, "non": [5, 15, 17, 45], "none": [1, 2, 4, 5, 8, 11, 12, 15, 17, 21, 23, 24, 27, 29, 31, 34, 35, 39, 40, 41, 43, 44, 45, 46], "norm": [9, 49], "norm_scor": [4, 5], "normal": [1, 4, 5, 9, 14, 16], "note": [21, 24, 26, 27], "notebook": 50, "noteq": 42, "notic": [20, 25, 28, 32, 41], "now": [6, 18, 19, 34, 41, 45, 46, 47, 49], "np": [7, 11, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "num_calibration_batch": [21, 24, 27, 29, 31, 34], "num_interest_points_factor": 5, "num_of_imag": [5, 11, 21, 24], "num_score_approxim": [6, 25, 32], "number": [1, 4, 5, 6, 11, 12, 14, 15, 16, 17, 20, 21, 24, 25, 27, 28, 29, 31, 32, 34, 40, 45, 46, 48], "numel": 32, "numer": 5, "numpi": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "o": 50, "object": [0, 3, 4, 5, 
6, 10, 12, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 26, 27, 29, 30, 31, 34, 41, 43, 45, 46, 48], "observ": [9, 21, 29, 31, 45, 49], "one": [5, 42, 49], "onli": [3, 4, 5, 6, 9, 12, 21, 24, 26, 27, 41, 45], "onlin": [27, 34], "onnx": 11, "onnx_file_path": 41, "onnx_opset_vers": 41, "onnxruntim": 41, "op": [42, 45], "open": [41, 49, 50], "oper": [3, 10, 40, 42, 45], "operator_group": 45, "operator_set": 45, "operators_set": 45, "operatorsetnam": 45, "opquantizationconfig": [18, 19, 47], "optim": [1, 3, 4, 10, 11, 13, 14, 15, 16, 17, 18, 19, 21, 22, 24, 27, 29, 30, 31, 34, 39, 45, 46, 47, 50], "optimizer_bia": 4, "optimizer_quantization_paramet": 4, "optimizer_rest": [4, 15, 17], "optimizerv2": 15, "option": [11, 13, 21, 23, 24, 25, 27, 29, 31, 32, 34, 41, 45], "order": [15, 17, 21, 24, 27, 34, 40, 41, 42, 44], "org": 46, "orient": [13, 46], "origin": [25, 35, 36, 37, 38, 49], "ort": 41, "other": [1, 11, 15, 17, 48], "otherwis": 45, "our": [21, 24, 26, 27, 34, 50], "out": [3, 6], "out1": 50, "out2": 50, "out3": 50, "out_channel_axis_map": 3, "outlier": [9, 12, 48], "output": [1, 3, 9, 12, 14, 16, 20, 21, 24, 27, 28, 29, 31, 33, 34, 40, 45, 48, 49, 50], "output_image_s": [20, 28], "output_loss_multipli": [1, 14, 16], "output_loss_typ": [1, 14, 16], "output_nam": 41, "outputlosstyp": [14, 16], "over": 5, "overal": 9, "overrid": [4, 44], "overwrit": 5, "p": [9, 32], "packag": [41, 46, 50], "pad": 45, "page": 13, "pair": 49, "param": [17, 40, 43, 46], "param_item": 11, "paramet": [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46], "pars": 45, "part": 41, "pass": [2, 3, 5, 15, 17, 21, 24, 25, 26, 27, 29, 31, 32, 33, 34, 43], "patch": 45, "path": [11, 13, 23, 35, 41, 48, 49], "pattern": 45, "pdf": 46, "per": [1, 3, 4, 21, 24, 27, 34, 45, 46, 49], "per_sampl": 4, "percentag": [5, 40], "peretz": 50, "perform": [6, 10, 11, 20, 25, 28, 32], "phase": [9, 49], 
"pinpoint": 40, "pip": [41, 50], "pipelin": [1, 11, 14, 16], "pixel": [1, 14, 16], "place": 45, "plan": 41, "platform": [11, 18, 19, 21, 24, 25, 26, 27, 30, 32, 45], "pleas": [9, 24, 27, 34, 41, 44, 48, 50], "plot": [40, 49], "point": [4, 5, 15, 17, 21, 29, 31, 36, 37, 38, 45, 49], "posit": 45, "possibl": [9, 21, 24, 27, 34, 45, 49], "post": [4, 11, 13, 25, 27, 32, 34, 50], "power": [21, 24, 27, 29, 31, 34, 45], "power_of_two": 45, "poweroftwo": 46, "pre": 5, "preced": [21, 24, 27, 29, 31, 34], "precis": [5, 10, 11, 12, 13, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 34, 39, 45, 48, 50], "predefin": [5, 6], "predict": 41, "prepar": [11, 13, 27, 34], "preprint": 50, "present": [2, 48, 49], "preserv": 45, "pretrain": [33, 34], "prevent": 5, "print": 40, "prior": 5, "prioriti": 11, "problemat": 40, "procedur": 48, "process": [4, 5, 8, 13, 14, 15, 16, 17, 18, 19, 20, 25, 28, 32, 39, 43, 44, 45, 47, 49], "product": 49, "progress": 40, "progress_info_callback": 40, "progress_perc": 40, "progressinfocallback": 40, "project": [41, 50], "properti": 7, "propos": [46, 48], "provid": [2, 11, 20, 25, 28, 32, 40, 41, 45, 46, 48, 49], "prune": [10, 50], "pruned_model": [25, 32], "pruning_config": [25, 32], "pruning_info": [25, 32], "pruning_mask": 7, "pruning_num_score_approxim": 6, "pruningconfig": [6, 13, 25, 32], "pruninginfo": [7, 13, 25, 32], "ptq": [11, 24, 31, 41, 48], "purpos": [20, 28, 40], "py": 50, "pydantic_cor": 45, "pypi": 50, "python": [35, 50], "pytorch": [11, 13, 45, 46, 50], "pytorch_data_generation_experiment": [13, 28], "pytorch_default_tpc": 30, "pytorch_gradient_post_training_quant": [13, 17, 29], "pytorch_post_training_quant": [13, 31, 41, 48], "pytorch_pruning_experiment": [13, 32], "pytorch_quantization_aware_training_finalize_experiment": [13, 33], "pytorch_quantization_aware_training_init_experiment": [13, 33, 34], "pytorch_resource_utilization_data": [13, 30], "q": 41, "q_fraction_scheduler_polici": 4, "qat": [26, 27, 33, 34, 44], "qat_config": [13, 27, 
34], "qatconfig": [27, 34], "qc": 8, "qc_option": 45, "qmodel": 11, "qnnpack": 45, "quant": 41, "quantifi": [7, 49], "quantiz": [0, 3, 4, 5, 8, 9, 11, 12, 13, 15, 17, 20, 22, 28, 30, 36, 37, 38, 39, 40, 43, 44, 45, 46, 49, 50], "quantization_config": [39, 46], "quantization_configur": 45, "quantization_format": 41, "quantization_info": [21, 24, 26, 27, 29, 31, 33, 34], "quantization_preserv": [18, 19, 45, 47], "quantizationconfig": [9, 13, 39], "quantizationerrormethod": [8, 11, 13], "quantizationmethod": [3, 46], "quantize_and_export": 11, "quantize_reported_dir": [12, 48], "quantized_exportable_model": 41, "quantized_info": 48, "quantized_model": [11, 21, 24, 26, 27, 33, 34, 36, 37, 38, 48], "quantized_modul": [29, 31], "quantizewrapp": [13, 27, 33, 34], "question": 41, "r": 50, "radam": 16, "rais": 45, "random": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "random_data_gen": 48, "rang": [3, 9, 12, 21, 24, 27, 29, 31, 34, 48], "rate": [1, 14, 15, 16, 17], "ratio": [11, 12, 48], "readi": 33, "readm": 41, "receiv": [11, 40], "recent": 48, "recommend": [9, 48], "recov": [25, 32], "red": 48, "reduc": [5, 9, 25, 32], "reduce_on_plateau": [1, 14], "reduce_on_plateau_with_reset": 16, "reduceonplateau": 1, "refer": [41, 48], "refine_mp_solut": 5, "regard": 42, "regress": 9, "regular": [1, 4, 15, 17], "regularization_factor": [4, 15, 17], "regularized_min_max_diff": [1, 14], "relat": [3, 7, 13, 45], "releas": 50, "relev": 41, "relu": 3, "relu_bound_to_power_of_2": 8, "remain": 40, "remov": [12, 25, 32, 33, 48], "replac": [26, 48], "report": [12, 13, 48], "report_dir": [12, 48], "repositori": 41, "repr_datagen": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34], "repr_dataset": [36, 37, 38, 41], "repres": [4, 5, 10, 11, 15, 17, 21, 24, 25, 26, 27, 29, 31, 32, 33, 34, 36, 37, 38, 40, 41, 43, 45, 48, 49], "representative_data_gen": [21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 41, 48], "representative_dataset": 11, "request": 2, "requir": [21, 24, 27, 29, 31, 34, 46, 49], 
"research": [9, 50], "reshap": [3, 20], "residu": 11, "residual_collaps": [8, 11], "resnet50": [25, 32, 41], "resnet50_weight": 32, "resolut": 9, "resourc": [6, 10, 11, 13, 21, 24, 25, 26, 27, 32, 33, 34, 49], "resourceutil": [13, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 34], "respect": 48, "respectivli": 3, "rest": 4, "result": [9, 48], "retrain": [25, 32], "retriev": [18, 19, 40, 45], "return": [2, 4, 5, 7, 11, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41], "round": 4, "rounding_typ": 4, "ru": [21, 24, 26, 27], "ru_data": [22, 30], "rule": [40, 43], "run": [4, 15, 17, 40, 41, 49], "runner": 40, "same": [1, 41, 45], "sampl": [4, 15, 17, 49], "save": [3, 11, 12, 27, 35, 41, 46, 48], "save_model_path": [11, 41], "saved_model": 23, "savedmodel": 23, "scalar": 49, "scale": [4, 5, 45], "scale_log_norm": 4, "schedul": [1, 4, 14, 16, 40], "scheduler_typ": [1, 14, 16], "schedulertyp": [14, 16], "schema": 45, "schema_vers": 45, "score": [4, 5, 6, 7, 9, 11, 15, 17, 25, 32], "sdsp": [11, 13, 45], "sdsp_v3_14": 19, "sdsp_version": [11, 19], "search": [5, 10, 13, 21, 24, 27, 29, 31, 34], "second": 49, "section": 40, "see": [4, 17, 48, 50], "seen": 49, "select": [0, 3, 6, 8, 9, 11, 13, 39, 41, 44, 45, 46], "self": [40, 45], "semiconductor": 50, "sensit": [5, 6, 9, 25, 32], "sequenti": [20, 28], "serial": 13, "serialization_format": 41, "sess": 41, "session": 41, "set": [3, 11, 12, 13, 15, 17, 20, 21, 24, 25, 26, 27, 28, 29, 31, 32, 34, 35, 36, 37, 38, 40, 41, 43, 45, 46, 48, 49], "set_log_fold": [35, 48, 49], "setup": [11, 50], "sever": [21, 24, 27, 29, 31, 34, 49], "shift": 48, "shift_negative_activation_correct": 8, "shift_negative_params_search": 8, "shift_negative_ratio": 8, "shift_negative_threshold_recalcul": 8, "shortli": 45, "should": [3, 6, 9, 15, 21, 22, 24, 25, 26, 27, 29, 31, 32, 34, 41, 45, 49], "show": 49, "shown": 48, "side": 9, "sigma": 5, "signal": 9, "signed": 45, "signific": [7, 48], 
"significantli": 48, "simd": [25, 32, 45], "simd_siz": 45, "similar": [9, 12, 36, 37, 38, 40, 48, 50], "similarli": 45, "simpl": [20, 28], "simplic": [20, 28], "simul": 40, "simulate_schedul": 40, "simultan": 45, "singl": 45, "situat": 9, "six": 48, "size": [1, 4, 5, 14, 15, 16, 17, 20, 21, 24, 26, 27, 28, 34, 41, 46], "skip": [12, 40, 41, 48], "slowli": 41, "small": [9, 48], "smaller": 42, "smallereq": 42, "smooth": [1, 46], "smoothing_and_augment": [1, 14, 16], "so": [11, 41], "softmax": 3, "softmax_shift": 8, "softquant": 4, "solut": 50, "solver": [21, 24, 27, 34], "some": [18, 19, 20, 28, 41, 45, 47, 49], "soni": 50, "sonysemiconductorsolut": 50, "sourc": 50, "spars": 9, "specif": [0, 3, 9, 11, 13, 25, 32, 43, 48, 49], "specifi": [6, 9, 11, 12, 14, 16, 18, 20, 23, 25, 28, 32, 41, 45, 48], "sphinx": 13, "sqnr": [12, 48], "squar": [1, 9], "stabl": [9, 50], "stage": 49, "standard": [25, 32, 40, 46], "start": [20, 28, 41, 46, 50], "start_step": 4, "state": 50, "state_dict": 32, "statist": [3, 21, 24, 27, 29, 31, 34, 49], "stderr": 40, "ste": [4, 44, 46], "step": [1, 4, 40, 46, 48], "store": [7, 46], "str": [3, 11, 12, 18, 19, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 35, 36, 37, 38, 40, 41, 42, 45, 48], "straight": [4, 46], "strategi": [6, 25, 32], "string": 43, "strongli": 9, "structur": [13, 50], "student": 4, "success": 11, "suffer": 41, "suggest": 48, "sum": [10, 22, 25, 30, 32], "support": [4, 11, 41], "supported_input_activation_n_bit": 45, "sure": 40, "sy": 40, "symmetr": [21, 24, 27, 29, 31, 34, 45, 46], "t": [35, 50], "tab": 49, "tabl": 45, "tag": 49, "take": [5, 24, 27, 34, 50], "target": [4, 11, 13, 18, 19, 21, 22, 24, 25, 26, 27, 30, 32, 33, 34, 45], "target_platform_cap": [21, 22, 24, 25, 27, 29, 30, 31, 32, 34, 42, 46], "target_q_fract": 4, "target_resource_util": [21, 24, 25, 27, 29, 31, 32, 34], "targetplatformcap": [13, 21, 22, 24, 25, 27, 29, 30, 31, 32, 34], "task": 9, "teacher": 4, "tempfil": 41, "tensor": [5, 11, 12, 15, 17, 20, 22, 28, 30, 45, 
46, 49, 50], "tensorboard": [40, 50], "tensorflow": [3, 11, 13, 15, 20, 21, 22, 24, 25, 26, 27, 41, 43, 45, 50], "tf": [3, 11, 15, 20, 23, 26, 27], "tflite": [41, 45], "than": [5, 9, 42, 48], "thei": 3, "them": [45, 49], "thi": [5, 7, 8, 9, 11, 13, 20, 21, 23, 24, 25, 26, 27, 28, 29, 31, 32, 34, 35, 40, 41, 45, 46, 48, 50], "those": 48, "three": [3, 48], "threshold": [5, 8, 9, 11, 12, 21, 24, 27, 29, 31, 34, 45, 46, 48], "threshold_bitwidth_mixed_precis": 48, "threshold_bitwidth_mixed_precision_with_model_output_loss_object": 12, "threshold_degrade_layer_ratio": [12, 48], "threshold_quantize_error": [12, 48], "threshold_ratio_unbalanced_concaten": [12, 48], "threshold_zscore_outlier_remov": [12, 48], "through": [4, 20, 25, 28, 46], "throughout": 4, "thu": [25, 32, 49], "time": [3, 6, 46], "togeth": [25, 32], "tool": [11, 13, 46, 50], "toolkit": [11, 13, 20, 28, 29, 48], "torch": [17, 28, 37, 38, 41, 50], "torchscript": 41, "torchvis": [1, 16, 29, 30, 31, 32, 33, 34, 41], "total": [10, 22, 30, 40], "total_memori": 10, "totalcompon": 40, "tpc": [11, 13, 25, 32, 45], "tpc_minor_vers": 45, "tpc_patch_vers": 45, "tpc_platform_typ": 45, "tpc_v1_0": 18, "tpc_version": 18, "trace": 41, "track": 40, "train": [4, 11, 13, 44, 46, 50], "train_bia": 4, "trainabl": [23, 26, 46], "trainable_infrastructur": 44, "trainablequant": 26, "transform": [1, 9, 21, 24, 27, 29, 31, 34], "transpos": 3, "treat": 45, "troubleshoot": 13, "true": [1, 5, 8, 11, 12, 15, 16, 17, 23, 33, 34, 40, 46], "try": 5, "tun": 34, "tune": [15, 17, 25, 26, 27, 32, 33], "tupl": [1, 3, 11, 14, 16, 20, 21, 24, 25, 28, 29, 31, 32, 43, 45], "tutori": 48, "two": [5, 12, 21, 24, 27, 29, 31, 34, 41, 45, 48, 49], "type": [0, 1, 2, 4, 5, 6, 7, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32, 35, 36, 37, 38, 40, 41, 43, 45, 48], "ui": 49, "unbalanc": [12, 48], "unchang": 40, "under": 49, "unifi": 11, "uniform": [45, 46], "union": [1, 14, 16, 20, 21, 22, 24, 25, 27, 28, 29, 30, 31, 32, 34, 45], 
"uniqu": 45, "up": [6, 20, 28, 35, 45, 49], "updat": [4, 11], "upon": 46, "us": [0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 47, 48, 49, 50], "use_hessian_based_scor": [5, 11], "use_hessian_based_weight": [15, 17], "use_hessian_sample_attent": [15, 17], "use_mixed_precis": 11, "user": [11, 13, 21, 24, 26, 27, 29, 31, 33, 34, 40, 48], "userinform": [21, 24, 29, 31], "util": [6, 11, 13, 21, 24, 25, 26, 27, 32, 33, 34, 46], "v": 50, "valid": [36, 37, 38, 45, 46, 48], "validation_dataset": [36, 37, 38, 48], "validationerror": 45, "valu": [1, 2, 3, 4, 5, 6, 9, 11, 12, 21, 24, 25, 26, 27, 32, 40, 41, 42, 43, 45, 46, 48], "valuabl": 9, "variabl": [11, 15, 17], "variou": [11, 20, 28, 49], "vector": [4, 49], "verbos": 35, "version": [11, 13, 20, 28, 45], "via": [41, 50], "view": 49, "visit": [44, 50], "visual": [40, 48, 50], "wa": [2, 41, 48], "wai": [49, 50], "walk": [20, 28], "want": [3, 9], "warn": [11, 48], "we": [3, 20, 21, 24, 25, 27, 28, 32, 34, 41, 43, 45, 46, 49], "weight": [0, 1, 3, 4, 5, 8, 10, 11, 14, 15, 16, 17, 21, 22, 25, 27, 29, 30, 31, 32, 33, 34, 41, 43, 44, 45, 46, 49], "weight_quantizer_params_overrid": 44, "weight_training_method": 44, "weights_bias_correct": [8, 11], "weights_channels_axi": 46, "weights_compression_ratio": 11, "weights_error_method": 8, "weights_memori": [6, 10, 21, 24, 25, 27, 32, 34], "weights_n_bit": [43, 45, 46], "weights_per_channel_threshold": [45, 46], "weights_quantization_candid": 46, "weights_quantization_method": [43, 45, 46], "weights_quantization_param": 46, "weights_quantization_params_fn": 43, "weights_second_moment_correct": 8, "were": 49, "when": [1, 2, 3, 4, 5, 6, 9, 10, 12, 13, 15, 17, 21, 24, 26, 27, 40, 41, 42, 44, 45, 46, 48, 49], "where": [7, 9, 12, 41, 43, 48, 49], "whether": [4, 5, 7, 11, 14, 15, 16, 17, 23, 40, 41, 45, 46], "which": [4, 6, 40, 41, 42, 43, 45, 46], "while": [8, 21, 24, 26, 
27, 34, 40, 45], "who": 48, "width": [0, 5, 12, 13, 21, 24, 27, 28, 34, 39, 45, 48, 50], "within": [40, 45, 48, 50], "without": 13, "work": 50, "would": 49, "wrap": [2, 3, 23, 27, 34, 42, 45, 46], "wrapper": [27, 33, 34, 46], "writer": 49, "x": 48, "xquant": [11, 50], "xquant_config": [12, 36, 37, 38, 48], "xquant_report_keras_experiment": [13, 36], "xquant_report_pytorch_experiment": [13, 37, 48], "xquant_report_troubleshoot_pytorch_experiment": [12, 13, 38, 48], "xquantconfig": [12, 13, 36, 37, 38], "y": 48, "yield": [21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 41], "you": [8, 9, 11, 40, 41, 45, 49, 50], "your": [41, 48], "z": 11, "z_score": [12, 48], "z_threshold": [8, 11], "zero": [5, 45]}, "titles": ["BitWidthConfig", "Data Generation Configuration", "DefaultDict Class", "FrameworkInfo Class", "GradientPTQConfig Class", "MixedPrecisionQuantizationConfig", "Pruning Configuration", "Pruning Information", "QuantizationConfig", "QuantizationErrorMethod", "ResourceUtilization", "wrapper", "XQuant Configuration", "API Docs", "Get DataGenerationConfig for Keras Models", "Get GradientPTQConfig for Keras Models", "Get DataGenerationConfig for Pytorch Models", "Get GradientPTQConfig for Pytorch Models", "Get TargetPlatformCapabilities for tpc version", "Get TargetPlatformCapabilities for sdsp converter version", "Keras Data Generation", "Keras Gradient Based Post Training Quantization", "Get Resource Utilization information for Keras Models", "Load Quantized Keras Model", "Keras Post Training Quantization", "Keras Structured Pruning", "Keras Quantization Aware Training Model Finalize", "Keras Quantization Aware Training Model Init", "Pytorch Data Generation", "Pytorch Gradient Based Post Training Quantization", "Get Resource Utilization information for PyTorch Models", "Pytorch Post Training Quantization", "Pytorch Structured Pruning", "PyTorch Quantization Aware Training Model Finalize", "PyTorch Quantization Aware Training Model Init", "Enable a Logger", "XQuant 
Report Keras", "XQuant Report Pytorch", "XQuant Report Troubleshoot Pytorch", "CoreConfig", "debug_config Module", "exporter Module", "Layer Attributes Filters", "network_editor Module", "qat_config Module", "target_platform_capabilities Module", "trainable_infrastructure Module", "<no title>", "XQuant Extension Tool", "Visualization within TensorBoard", "Model Compression Toolkit User Guide"], "titleterms": {"about": 48, "action": 43, "api": [13, 50], "attribut": 42, "attributequantizationconfig": 45, "awar": [26, 27, 33, 34], "base": [21, 29], "basekerastrainablequant": 46, "basepytorchtrainablequant": 46, "batchnormalignemntlosstyp": 1, "bit": 49, "bitwidthconfig": 0, "bnlayerweightingtyp": 1, "channelaxi": 3, "channelsfilteringstrategi": 6, "class": [2, 3, 4], "comparison": 49, "compress": 50, "configur": [1, 6, 12, 49], "constraint": 50, "convert": 19, "core": 13, "coreconfig": 39, "cosin": 49, "data": [1, 20, 28], "data_gener": 13, "datagenerationconfig": [14, 16], "datainittyp": 1, "debug_config": 40, "debugconfig": 40, "defaultdict": 2, "dictionari": 40, "doc": 13, "document": 50, "editrul": 43, "enabl": 35, "error": 48, "exampl": 48, "export": [13, 41], "extens": 48, "featur": 50, "filter": [42, 43], "final": [26, 33], "flow": 48, "format": [41, 48], "frameworkinfo": 3, "fuse": 45, "gener": [1, 20, 28, 48], "get": [14, 15, 16, 17, 18, 19, 22, 30], "gptq": 13, "gptqhessianscoresconfig": 4, "gradient": [21, 29], "gradientptqconfig": [4, 15, 17], "gradualactivationquantizationconfig": 4, "graph": 48, "guid": 50, "how": 48, "imagegranular": 1, "imagenormalizationtyp": 1, "imagepipelinetyp": 1, "importancemetr": 6, "indic": 13, "infer": 41, "inform": [7, 22, 30], "init": [27, 34], "instal": 50, "judgeabl": 48, "kei": 40, "kera": [14, 15, 20, 21, 22, 23, 24, 25, 26, 27, 36, 41], "keras_export_model": 41, "keras_load_quantized_model": 13, "kerasexportserializationformat": 41, "layer": 42, "load": 23, "logger": 35, "manualbitwidthselect": 0, "mctq": 41, "mix": 49, 
"mixedprecisionquantizationconfig": 5, "model": [14, 15, 16, 17, 22, 23, 26, 27, 30, 33, 34, 41, 50], "modul": [40, 41, 43, 44, 45, 46], "mpdistanceweight": 5, "mpmetricnorm": 5, "name": 41, "network_editor": 43, "onnx": 41, "operatorsetgroup": 45, "operatorsset": 45, "opquantizationconfig": 45, "opset": 41, "output": 41, "outputlosstyp": 1, "overal": 48, "overview": 50, "paramet": 48, "post": [21, 24, 29, 31], "precis": 49, "process": [40, 48], "prune": [6, 7, 13, 25, 32], "ptq": 13, "pytorch": [16, 17, 28, 29, 30, 31, 32, 33, 34, 37, 38, 41], "pytorch_export_model": 41, "pytorchexportserializationformat": 41, "qat": 13, "qat_config": 44, "qatconfig": 44, "qfractionlinearannealingconfig": 4, "quantiz": [21, 23, 24, 26, 27, 29, 31, 33, 34, 41, 48], "quantizationconfig": 8, "quantizationconfigopt": 45, "quantizationerrormethod": 9, "quantizationformat": 41, "quantizationmethod": 45, "quickstart": 50, "refer": 50, "report": [36, 37, 38], "resourc": [22, 30], "resourceutil": 10, "roundingtyp": 4, "run": 48, "schedulertyp": 1, "sdsp": 19, "serial": 41, "set_log_fold": 13, "similar": 49, "state": 40, "structur": [25, 32], "support": 50, "tabl": 13, "target_platform_cap": [13, 45], "targetplatformcap": [18, 19, 45], "technic": 50, "tensorboard": 49, "tool": 48, "toolkit": 50, "tpc": 18, "train": [21, 24, 26, 27, 29, 31, 33, 34], "trainable_infrastructur": [13, 46], "trainablequantizeractivationconfig": 46, "trainablequantizerweightsconfig": 46, "trainingmethod": [44, 46], "troubleshoot": [38, 48], "tutori": 41, "understand": 48, "us": 41, "user": 50, "util": [22, 30], "version": [18, 19, 41], "visual": 49, "width": 49, "within": 49, "wrapper": [11, 13], "xquant": [12, 13, 36, 37, 38, 48], "xquantconfig": 48}})
\ No newline at end of file
diff --git a/docs/static/pygments.css b/docs/static/pygments.css
index 0d49244ed..5f2b0a250 100644
--- a/docs/static/pygments.css
+++ b/docs/static/pygments.css
@@ -6,26 +6,26 @@ span.linenos.special { color: #000000; background-color: #ffffc0; padding-left:
.highlight .hll { background-color: #ffffcc }
.highlight { background: #eeffcc; }
.highlight .c { color: #408090; font-style: italic } /* Comment */
-.highlight .err { border: 1px solid #FF0000 } /* Error */
+.highlight .err { border: 1px solid #F00 } /* Error */
.highlight .k { color: #007020; font-weight: bold } /* Keyword */
-.highlight .o { color: #666666 } /* Operator */
+.highlight .o { color: #666 } /* Operator */
.highlight .ch { color: #408090; font-style: italic } /* Comment.Hashbang */
.highlight .cm { color: #408090; font-style: italic } /* Comment.Multiline */
.highlight .cp { color: #007020 } /* Comment.Preproc */
.highlight .cpf { color: #408090; font-style: italic } /* Comment.PreprocFile */
.highlight .c1 { color: #408090; font-style: italic } /* Comment.Single */
-.highlight .cs { color: #408090; background-color: #fff0f0 } /* Comment.Special */
+.highlight .cs { color: #408090; background-color: #FFF0F0 } /* Comment.Special */
.highlight .gd { color: #A00000 } /* Generic.Deleted */
.highlight .ge { font-style: italic } /* Generic.Emph */
.highlight .ges { font-weight: bold; font-style: italic } /* Generic.EmphStrong */
-.highlight .gr { color: #FF0000 } /* Generic.Error */
+.highlight .gr { color: #F00 } /* Generic.Error */
.highlight .gh { color: #000080; font-weight: bold } /* Generic.Heading */
.highlight .gi { color: #00A000 } /* Generic.Inserted */
-.highlight .go { color: #333333 } /* Generic.Output */
-.highlight .gp { color: #c65d09; font-weight: bold } /* Generic.Prompt */
+.highlight .go { color: #333 } /* Generic.Output */
+.highlight .gp { color: #C65D09; font-weight: bold } /* Generic.Prompt */
.highlight .gs { font-weight: bold } /* Generic.Strong */
.highlight .gu { color: #800080; font-weight: bold } /* Generic.Subheading */
-.highlight .gt { color: #0044DD } /* Generic.Traceback */
+.highlight .gt { color: #04D } /* Generic.Traceback */
.highlight .kc { color: #007020; font-weight: bold } /* Keyword.Constant */
.highlight .kd { color: #007020; font-weight: bold } /* Keyword.Declaration */
.highlight .kn { color: #007020; font-weight: bold } /* Keyword.Namespace */
@@ -33,43 +33,43 @@ span.linenos.special { color: #000000; background-color: #ffffc0; padding-left:
.highlight .kr { color: #007020; font-weight: bold } /* Keyword.Reserved */
.highlight .kt { color: #902000 } /* Keyword.Type */
.highlight .m { color: #208050 } /* Literal.Number */
-.highlight .s { color: #4070a0 } /* Literal.String */
-.highlight .na { color: #4070a0 } /* Name.Attribute */
+.highlight .s { color: #4070A0 } /* Literal.String */
+.highlight .na { color: #4070A0 } /* Name.Attribute */
.highlight .nb { color: #007020 } /* Name.Builtin */
-.highlight .nc { color: #0e84b5; font-weight: bold } /* Name.Class */
-.highlight .no { color: #60add5 } /* Name.Constant */
-.highlight .nd { color: #555555; font-weight: bold } /* Name.Decorator */
-.highlight .ni { color: #d55537; font-weight: bold } /* Name.Entity */
+.highlight .nc { color: #0E84B5; font-weight: bold } /* Name.Class */
+.highlight .no { color: #60ADD5 } /* Name.Constant */
+.highlight .nd { color: #555; font-weight: bold } /* Name.Decorator */
+.highlight .ni { color: #D55537; font-weight: bold } /* Name.Entity */
.highlight .ne { color: #007020 } /* Name.Exception */
-.highlight .nf { color: #06287e } /* Name.Function */
+.highlight .nf { color: #06287E } /* Name.Function */
.highlight .nl { color: #002070; font-weight: bold } /* Name.Label */
-.highlight .nn { color: #0e84b5; font-weight: bold } /* Name.Namespace */
+.highlight .nn { color: #0E84B5; font-weight: bold } /* Name.Namespace */
.highlight .nt { color: #062873; font-weight: bold } /* Name.Tag */
-.highlight .nv { color: #bb60d5 } /* Name.Variable */
+.highlight .nv { color: #BB60D5 } /* Name.Variable */
.highlight .ow { color: #007020; font-weight: bold } /* Operator.Word */
-.highlight .w { color: #bbbbbb } /* Text.Whitespace */
+.highlight .w { color: #BBB } /* Text.Whitespace */
.highlight .mb { color: #208050 } /* Literal.Number.Bin */
.highlight .mf { color: #208050 } /* Literal.Number.Float */
.highlight .mh { color: #208050 } /* Literal.Number.Hex */
.highlight .mi { color: #208050 } /* Literal.Number.Integer */
.highlight .mo { color: #208050 } /* Literal.Number.Oct */
-.highlight .sa { color: #4070a0 } /* Literal.String.Affix */
-.highlight .sb { color: #4070a0 } /* Literal.String.Backtick */
-.highlight .sc { color: #4070a0 } /* Literal.String.Char */
-.highlight .dl { color: #4070a0 } /* Literal.String.Delimiter */
-.highlight .sd { color: #4070a0; font-style: italic } /* Literal.String.Doc */
-.highlight .s2 { color: #4070a0 } /* Literal.String.Double */
-.highlight .se { color: #4070a0; font-weight: bold } /* Literal.String.Escape */
-.highlight .sh { color: #4070a0 } /* Literal.String.Heredoc */
-.highlight .si { color: #70a0d0; font-style: italic } /* Literal.String.Interpol */
-.highlight .sx { color: #c65d09 } /* Literal.String.Other */
+.highlight .sa { color: #4070A0 } /* Literal.String.Affix */
+.highlight .sb { color: #4070A0 } /* Literal.String.Backtick */
+.highlight .sc { color: #4070A0 } /* Literal.String.Char */
+.highlight .dl { color: #4070A0 } /* Literal.String.Delimiter */
+.highlight .sd { color: #4070A0; font-style: italic } /* Literal.String.Doc */
+.highlight .s2 { color: #4070A0 } /* Literal.String.Double */
+.highlight .se { color: #4070A0; font-weight: bold } /* Literal.String.Escape */
+.highlight .sh { color: #4070A0 } /* Literal.String.Heredoc */
+.highlight .si { color: #70A0D0; font-style: italic } /* Literal.String.Interpol */
+.highlight .sx { color: #C65D09 } /* Literal.String.Other */
.highlight .sr { color: #235388 } /* Literal.String.Regex */
-.highlight .s1 { color: #4070a0 } /* Literal.String.Single */
+.highlight .s1 { color: #4070A0 } /* Literal.String.Single */
.highlight .ss { color: #517918 } /* Literal.String.Symbol */
.highlight .bp { color: #007020 } /* Name.Builtin.Pseudo */
-.highlight .fm { color: #06287e } /* Name.Function.Magic */
-.highlight .vc { color: #bb60d5 } /* Name.Variable.Class */
-.highlight .vg { color: #bb60d5 } /* Name.Variable.Global */
-.highlight .vi { color: #bb60d5 } /* Name.Variable.Instance */
-.highlight .vm { color: #bb60d5 } /* Name.Variable.Magic */
+.highlight .fm { color: #06287E } /* Name.Function.Magic */
+.highlight .vc { color: #BB60D5 } /* Name.Variable.Class */
+.highlight .vg { color: #BB60D5 } /* Name.Variable.Global */
+.highlight .vi { color: #BB60D5 } /* Name.Variable.Instance */
+.highlight .vm { color: #BB60D5 } /* Name.Variable.Magic */
.highlight .il { color: #208050 } /* Literal.Number.Integer.Long */
\ No newline at end of file
diff --git a/docsrc/source_troubleshoot/troubleshoots/threhold_selection_error_method.rst b/docsrc/source_troubleshoot/troubleshoots/threhold_selection_error_method.rst
index 9acad965a..fd592d7f0 100644
--- a/docsrc/source_troubleshoot/troubleshoots/threhold_selection_error_method.rst
+++ b/docsrc/source_troubleshoot/troubleshoots/threhold_selection_error_method.rst
@@ -23,12 +23,28 @@ Solution
=================================
Use a different error method for activations. You can set the following values:
- * NOCLIPPING - Use min/max values
- * MSE (default) - Use mean square error
- * MAE - Use mean absolute error
- * KL - Use KL-divergence
- * Lp - Use Lp-norm
- * HMSE - Use Hessian-based mean squared error
+ * NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.
+ * MSE - **(default)** Use mean square error to minimize quantization noise.
+ * MAE - Use mean absolute error to minimize quantization noise.
+ * KL - Use KL-divergence to make the signal distributions as similar as possible.
+ * Lp - Use the Lp-norm to minimize quantization noise. The parameter p is set by ``QuantizationConfig.l_p_value`` (default: 2; integers only). It is equivalent to MAE when p = 1 and to MSE when p = 2; choose this method when you need p ≥ 3.
+ * HMSE - Use Hessian-based mean squared error to minimize quantization noise. This method uses Hessian scores to give more weight to the parameters that most affect the error induced by quantization.
+
+ **How to select QuantizationErrorMethod**
+
+ .. csv-table::
+ :header: "Method", "Recommended Situations"
+ :widths: 20, 80
+
 "NOCLIPPING", "Research and debugging phases where you want to observe behavior across the entire range. Effective when the full range must be preserved, especially for skewed data (for example, when there are very few samples near the minimum)."
 "MSE", "**Use this method by default.** It works well when the data distribution is close to normal with few outliers, and when stable results are needed, such as in regression tasks."
 "MAE", "Effective for data with heavy noise and many outliers."
 "KL", "Useful for tasks where the output distribution matters (such as anomaly detection)."
 "LP", "Effective with p ≥ 3 when you need more sensitivity to outliers than MSE provides (for example, sparse data)."
 "HMSE", "Recommended when using GPTQ. Effective for models where specific layers strongly influence overall accuracy (such as Transformers)."
+
+
+
For example, set NOCLIPPING to the ``activation_error_method`` attribute of the ``QuantizationConfig`` in ``CoreConfig``.
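A minimal sketch of that configuration (assuming the current ``mct.core`` API surface; verify the exact names against your installed MCT version):

```python
import model_compression_toolkit as mct

# Select NOCLIPPING for the activation threshold search.
quant_config = mct.core.QuantizationConfig(
    activation_error_method=mct.core.QuantizationErrorMethod.NOCLIPPING,
)

# Pass the quantization config through CoreConfig to the PTQ entry point.
core_config = mct.core.CoreConfig(quantization_config=quant_config)
```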
diff --git a/model_compression_toolkit/core/common/quantization/quantization_config.py b/model_compression_toolkit/core/common/quantization/quantization_config.py
index 8d6ca918b..df49f6555 100644
--- a/model_compression_toolkit/core/common/quantization/quantization_config.py
+++ b/model_compression_toolkit/core/common/quantization/quantization_config.py
@@ -41,18 +41,31 @@ class QuantizationErrorMethod(Enum):
"""
Method for quantization threshold selection:
- NOCLIPPING - Use min/max values as thresholds.
+ NOCLIPPING - Use min/max values as thresholds. This avoids clipping bias but reduces quantization resolution.
- MSE - Use mean square error for minimizing quantization noise.
+ MSE - **(default)** Use mean square error for minimizing quantization noise.
MAE - Use mean absolute error for minimizing quantization noise.
KL - Use KL-divergence to make signals distributions to be similar as possible.
- Lp - Use Lp-norm to minimizing quantization noise.
+ Lp - Use Lp-norm to minimize quantization noise. The exponent p is set by QuantizationConfig.l_p_value (default: 2; integer only). Lp equals MAE when p = 1 and MSE when p = 2; choose this method when p ≥ 3 is needed.
HMSE - Use Hessian-based mean squared error for minimizing quantization noise. This method is using Hessian scores to factorize more valuable parameters when computing the error induced by quantization.
+
+ **How to select QuantizationErrorMethod**
+
+ .. csv-table::
+ :header: "Method", "Recommended Situations"
+ :widths: 20, 80
+
+ "NOCLIPPING", "Research and debugging, where you want to observe behavior over the entire value range. Also useful when the full range must be preserved, especially for skewed data (for example, when only a few samples lie near the minimum)."
+ "MSE", "**The recommended default.** Works well when the data distribution is close to normal with few outliers, and when stable results are desired, such as in regression tasks."
+ "MAE", "Effective for noisy data with many outliers."
+ "KL", "Useful for tasks where the output distribution matters (for example, anomaly detection)."
+ "Lp", "With p ≥ 3, more sensitive to outliers than MSE (for example, sparse data)."
+ "HMSE", "Recommended when using GPTQ. Effective for models in which specific layers strongly influence overall accuracy (for example, Transformers)."
+
"""
NOCLIPPING = 0
diff --git a/quantization_troubleshooting.md b/quantization_troubleshooting.md
index fdfd0e7ef..f8219ff5b 100644
--- a/quantization_troubleshooting.md
+++ b/quantization_troubleshooting.md
@@ -114,6 +114,8 @@ Some error methods (specifically, the KL-Divergence method) may suffer from exte
Opting for a different error metric could enhance threshold selection for one layer while potentially compromising another.
Therefore, thorough investigation and consideration are necessary.
+Read more about these methods in the [QuantizationErrorMethod](https://sonysemiconductorsolutions.github.io/mct-model-optimization/api/api_docs/classes/QuantizationErrorMethod.html#model_compression_toolkit.core.QuantizationErrorMethod) class description.
+
___
## Model Structure Quantization Issues