docs/advanced/model_optimization.rst (2 additions, 2 deletions)
@@ -3,7 +3,7 @@ hls4ml Optimization API
========================

Pruning and weight sharing are effective techniques to reduce model footprint and computational requirements. The hls4ml Optimization API introduces hardware-aware pruning and weight sharing.
By defining custom objectives, the algorithm solves a Knapsack optimization problem aimed at maximizing model performance while keeping the target resource(s) to a minimum. Out-of-the-box objectives include network sparsity, GPU FLOPs, Vivado DSPs, memory utilization, etc.

The code block below showcases three use cases of the hls4ml Optimization API - network sparsity (unstructured pruning), GPU FLOPs (structured pruning) and Vivado DSP utilization (pattern pruning). First, we start with unstructured pruning:
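For context, here is a minimal, hedged sketch of what such an unstructured-pruning call might look like. The import paths, the optimize_model signature, and the argument ordering are assumptions inferred from the docstrings quoted later in this PR, not a verbatim copy of the documented example:

    import numpy as np
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.losses import CategoricalCrossentropy
    from tensorflow.keras.metrics import CategoricalAccuracy
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.optimizers import Adam

    from hls4ml.optimization.attributes import get_attributes_from_keras_model  # assumed import path
    from hls4ml.optimization.keras import optimize_model                        # assumed import path
    from hls4ml.optimization.objectives import ParameterEstimator               # network-sparsity objective
    from hls4ml.optimization.scheduler import PolynomialScheduler

    # Toy stand-ins for the user's baseline model and data
    baseline_model = Sequential([Dense(32, activation='relu', input_shape=(16,)), Dense(5, activation='softmax')])
    X_train, y_train = np.random.rand(256, 16), np.eye(5)[np.random.randint(0, 5, 256)]
    X_val, y_val = np.random.rand(64, 16), np.eye(5)[np.random.randint(0, 5, 64)]

    model_attributes = get_attributes_from_keras_model(baseline_model)
    scheduler = PolynomialScheduler(final_sparsity=0.5)  # constructor arguments assumed; target up to 50% sparsity

    # rtol = 0.975: stop pruning once the validation metric drops below 97.5% of the baseline
    optimized_model = optimize_model(
        baseline_model, model_attributes, ParameterEstimator, scheduler,
        X_train, y_train, X_val, y_val, 128, 10,  # data, batch size, epochs (assumed ordering)
        Adam(), CategoricalCrossentropy(from_logits=False), CategoricalAccuracy(), True, 0.975,
    )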
@@ -115,6 +115,6 @@ Finally, optimizing Vivado DSPs is possible, given a hls4ml config:
There are two more Vivado "optimizers" - VivadoFFEstimator, aimed at reducing register utilisation, and VivadoMultiObjectiveEstimator, aimed at optimising BRAM and DSP utilisation.
Note, to ensure DSPs are optimized, "unrolled" Dense multiplication must be used before synthesizing HLS, by modifying the config:
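For illustration, a hedged sketch of such a config change follows; the 'Unrolled' strategy string is an assumption and should be checked against the strategy names supported by the hls4ml release in use:

    import hls4ml

    # Sketch only: request "unrolled" Dense multiplication before HLS synthesis.
    # The 'Unrolled' strategy value is an assumption; verify it for your hls4ml version.
    # optimized_model refers to the pruned model from the earlier sketch.
    hls_config = hls4ml.utils.config_from_keras_model(optimized_model, granularity='model')
    hls_config['Model']['Strategy'] = 'Unrolled'

    hls_model = hls4ml.converters.convert_from_keras_model(optimized_model, hls_config=hls_config)
    hls_model.compile()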
pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric

Kwargs:
    - callbacks (list of keras.callbacks.Callback): Currently not supported, to be developed in future versions
    - ranking_metric (string): Metric used for ranking weights and structures; currently supported: l1, l2, saliency and Oracle
    - local (boolean): Layer-wise or global pruning
    - verbose (boolean): Display debug logs during model optimization
    - rewinding_epochs (int): Number of epochs to retrain the model without weight freezing, allowing regrowth of previously pruned weights
    - cutoff_bad_trials (int): Number of bad trials (performance below threshold) after which model pruning / weight sharing stops
    - directory (string): Directory to store temporary results
    - tuner (str): Tuning algorithm; choose between Bayesian, Hyperband and None
    - knapsack_solver (str): Algorithm used to solve the Knapsack problem when optimizing; the default usually works well; for very large networks, the greedy algorithm might be more suitable
    - regularization_range (list): List of suitable hyperparameters for weight decay
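To make these keyword arguments concrete, a hypothetical call is sketched below. Only the keyword names are taken from the docstring above; the positional arguments and the exact accepted string values are assumptions carried over from the earlier sketch:

    # Hypothetical usage; positional arguments follow the earlier unstructured-pruning sketch.
    optimized_model = optimize_model(
        baseline_model, model_attributes, ParameterEstimator, scheduler,
        X_train, y_train, X_val, y_val, 128, 10,
        Adam(), CategoricalCrossentropy(from_logits=False), CategoricalAccuracy(), True, 0.975,
        ranking_metric='l1',       # rank weights/structures by their L1 norm
        local=False,               # global pruning rather than layer-wise
        verbose=True,              # print debug logs during optimization
        rewinding_epochs=5,        # retrain without freezing so pruned weights can regrow
        cutoff_bad_trials=3,       # stop after 3 trials below the rtol threshold
        directory='hls4ml_prune',  # where temporary results are stored
        tuner='Bayesian',          # hyperparameter tuning algorithm
    )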
hls4ml/optimization/config.py (5 additions, 3 deletions)
@@ -17,15 +17,17 @@
    - Dense: Neurons, determined by their outgoing connections (columns in Keras weight tensors)
    - Conv2D: Filters (structures of size filt_width x filt_height x n_chan)
    - Notes:
        - For Dense, it was also possible to optimize by incoming connections (rows);
          however, removing zero neurons becomes harder because of Keras Surgeon
        - For Conv2D, significant literature explored pruning channels; currently not supported
    - Supports: All layers in SUPPORTED_LAYERS (hls4ml.optimization.keras)

3. Pattern:
    - Pruning: Y
    - Weight sharing: Y
    - Description: Zeroes out or quantizes all the weights in a group.
      Groups are determined by a variable, n, and every n-th weight in the flattened,
      transposed (Resource) weight tensor is collected and stored in the same group.
      Equivalent to pruning/quantizing weights processed by the same DSP in hls4ml.
    - Supports: All layers in SUPPORTED_LAYERS (hls4ml.optimization.keras)
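To illustrate the pattern grouping described above, here is a small, self-contained NumPy sketch (illustration only, not hls4ml code) showing how every n-th weight of a flattened, transposed weight tensor lands in the same group:

    import numpy as np

    # Toy illustration of pattern grouping: group i collects weights i, i+n, i+2n, ...
    # of the flattened, transposed (Resource-ordered) weight tensor.
    np.random.seed(0)
    kernel = np.random.randn(4, 6)   # Keras Dense kernel: (n_in, n_out)
    n = 3                            # grouping stride, related to how weights map onto DSPs

    flat = kernel.T.flatten()
    groups = [flat[i::n] for i in range(n)]

    # Pattern pruning would zero out an entire group; weight sharing would quantize
    # all members of a group to one shared value.
    for i, g in enumerate(groups):
        print(f'group {i}: {g.size} weights')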
pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric

Kwargs:
    - callbacks (list of keras.callbacks.Callback): Currently not supported, to be developed in future versions
    - ranking_metric (string): Metric used for ranking weights and structures; currently supported: l1, l2, saliency and Oracle
    - local (boolean): Layer-wise or global pruning
    - verbose (boolean): Display debug logs during model optimization
    - rewinding_epochs (int): Number of epochs to retrain the model without weight freezing, allowing regrowth of previously pruned weights
    - cutoff_bad_trials (int): Number of bad trials (performance below threshold) after which model pruning / weight sharing stops
    - directory (string): Directory to store temporary results
    - tuner (str): Tuning algorithm; choose between Bayesian, Hyperband and None
    - knapsack_solver (str): Algorithm used to solve the Knapsack problem when optimizing; the default usually works well; for very large networks, the greedy algorithm might be more suitable
    - regularization_range (list): List of suitable hyperparameters for weight decay
- learning_rate_range (list): List of suitable hyperparameters for learning rate

Notes:
    - In general, the regularization and learning rate ranges do not need to be provided, as the implementation sets a generic enough range. However, if the user has an idea of the likely hyperparameter ranges (e.g. VGG-16 weight decay ~10^-5), the tuning will complete faster.
    - The default tuner is Bayesian and, when coupled with the correct hyperparameter ranges, it performs well and fast. However, older versions of Keras Tuner had a crashing bug with the Bayesian tuner.
    - In general, the directory does not need to be specified. However, when pruning several models simultaneously, it is useful to specify a directory to avoid conflicting intermediate results.
'''
# User-provided manual hyper-parameters for regularisation loss
# TODO - Maybe we could extend this to be hyper-parameters per layer? or layer-type?
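As an illustration of the note above, a hypothetical way to supply explicit ranges follows; the kwarg names come from the docstring, while the helper used and the concrete values are assumptions:

    import numpy as np

    # Hypothetical example: narrow the search ranges around values the user already
    # expects to work (e.g. a weight decay near 1e-5), so the tuner converges faster.
    regularization_range = np.geomspace(1e-6, 1e-4, num=10).tolist()  # weight-decay candidates
    learning_rate_range = np.geomspace(1e-4, 1e-2, num=10).tolist()   # learning-rate candidates

    # These lists would then be passed as the regularization_range and learning_rate_range
    # kwargs documented above, together with a per-model directory (e.g. directory='prune_model_a')
    # when several models are pruned simultaneously.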