Use float mp model baseline instead of maxbit configuration (#1441)
* float mp model for sensitivity - disable all quantizers except current candidate, instead of using maxbit config
* ensure non-negative value in similarity analyzer (due to float precision)
* add normalization method and epsilon for sensitivity metric to MP config
* print only the solutions changed by refinement, instead of all solutions
* remove support for non-configurable quant layer in mp back2framework
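
The normalization and non-negativity items above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the names `Normalization` and `normalize_layer_metrics` are hypothetical, and using epsilon as a floor on the denominator is an assumption about how the epsilon might be applied.

```python
from enum import Enum
from typing import Dict


class Normalization(Enum):
    # Hypothetical stand-in mirroring the three options named in the commit.
    MAXBIT = 'MAXBIT'
    MINBIT = 'MINBIT'
    NONE = 'NONE'


def normalize_layer_metrics(metrics_by_bitwidth: Dict[int, float],
                            method: Normalization,
                            eps: float = 1e-6) -> Dict[int, float]:
    """Normalize one layer's candidate metrics (keyed by bit-width).

    Assumption: each candidate's metric is divided by the metric of the
    layer's max-bitwidth (or min-bitwidth) candidate, with `eps` used as
    a lower bound on the denominator to avoid division by zero.
    """
    # Clamp tiny negative values that can appear from float rounding.
    clamped = {bits: max(value, 0.0) for bits, value in metrics_by_bitwidth.items()}
    if method is Normalization.NONE:
        return clamped
    # Pick the reference candidate of this layer by bit-width.
    ref_bits = max(clamped) if method is Normalization.MAXBIT else min(clamped)
    denom = max(clamped[ref_bits], eps)
    return {bits: value / denom for bits, value in clamped.items()}
```

For a layer with candidate metrics `{8: 0.5, 4: 1.0, 2: 2.0}`, `MAXBIT` divides every value by the 8-bit metric (0.5) and `MINBIT` divides by the 2-bit metric (2.0).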
+    MAXBIT: normalize sensitivity metrics of layer candidates by max-bitwidth candidate (of that layer).
+    MINBIT: normalize sensitivity metrics of layer candidates by min-bitwidth candidate (of that layer).
+    NONE: no normalization.
+    """
+    MAXBIT = 'MAXBIT'
+    MINBIT = 'MINBIT'
+    NONE = 'NONE'
+
+
 @dataclass
 class MixedPrecisionQuantizationConfig:
     """
@@ -27,7 +39,6 @@ class MixedPrecisionQuantizationConfig:
     Args:
         compute_distance_fn (Callable): Function to compute a distance between two tensors. If None, using pre-defined distance methods based on the layer type for each layer.
         distance_weighting_method (MpDistanceWeighting): MpDistanceWeighting enum value that provides a function to use when weighting the distances among different layers when computing the sensitivity metric.
-        custom_metric_fn (Callable): Function to compute a custom metric. As input gets the model_mp and returns a float value for metric. If None, uses interest point metric.
         num_of_images (int): Number of images to use to evaluate the sensitivity of a mixed-precision model comparing to the float model.
         configuration_overwrite (List[int]): A list of integers that enables overwrite of mixed precision with a predefined one.
         num_interest_points_factor (float): A multiplication factor between zero and one (represents percentage) to reduce the number of interest points used to calculate the distance metric.
@@ -36,11 +47,16 @@ class MixedPrecisionQuantizationConfig:
         refine_mp_solution (bool): Whether to try to improve the final mixed-precision configuration using a greedy algorithm that searches layers to increase their bit-width, or not.
         metric_normalization_threshold (float): A threshold for checking the mixed precision distance metric values, In case of values larger than this threshold, the metric will be scaled to prevent numerical issues.
         hessian_batch_size (int): The Hessian computation batch size. used only if using mixed precision with Hessian-based objective.
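
The docstring for `metric_normalization_threshold` says that metrics above the threshold are scaled, without saying how. One plausible reading is sketched below. The helper `scale_if_large` is hypothetical, and the rule it uses (rescale all values so the largest magnitude equals the threshold) is an assumption, not necessarily what the library does.

```python
from typing import List


def scale_if_large(metrics: List[float], threshold: float = 1e10) -> List[float]:
    """Rescale metrics only when their largest magnitude exceeds `threshold`.

    Assumption: all values are multiplied by a common factor so that the
    largest magnitude becomes `threshold`, which preserves their ratios.
    """
    if not metrics:
        return []
    max_val = max(abs(value) for value in metrics)
    if max_val <= threshold:
        # Values are already in a numerically safe range; leave them as-is.
        return list(metrics)
    factor = threshold / max_val
    return [value * factor for value in metrics]
```

Because every value is multiplied by the same factor, the relative ordering of candidates is unchanged, which is what matters when the metrics are only compared against each other.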