
Commit ad47f41

Merge pull request #3 from vloncar/opt1
Cleanup docstrings
2 parents c4a5a0f + 7a26a9a commit ad47f41

File tree

13 files changed, +278 -247 lines changed


docs/advanced/model_optimization.rst

Lines changed: 10 additions & 3 deletions
@@ -1,13 +1,14 @@
-========================
-hls4ml Optimization API
-========================
+=================================
+Hardware-aware Optimization API
+=================================

 Pruning and weight sharing are effective techniques to reduce model footprint and computational requirements. The hls4ml Optimization API introduces hardware-aware pruning and weight sharing.
 By defining custom objectives, the algorithm solves a Knapsack optimization problem aimed at maximizing model performance, while keeping the target resource(s) at a minimum. Out-of-the box objectives include network sparsity, GPU FLOPs, Vivado DSPs, memory utilization etc.

 The code block below showcases three use cases of the hls4ml Optimization API - network sparsity (unstructured pruning), GPU FLOPs (structured pruning) and Vivado DSP utilization (pattern pruning). First, we start with unstructured pruning:

 .. code-block:: Python
+
 from sklearn.metrics import accuracy_score
 from tensorflow.keras.optimizers import Adam
 from tensorflow.keras.metrics import CategoricalAccuracy
@@ -71,7 +72,9 @@ In a similar manner, it is possible to target GPU FLOPs or Vivado DSPs. However,
 Instead, it is the sparsity of the target resource. As an example: Starting with a network utilizing 512 DSPs and a final sparsity of 50%; the optimized network will use 256 DSPs.

 To optimize GPU FLOPs, the code is similar to above:
+
 .. code-block:: Python
+
 from hls4ml.optimization.objectives.gpu_objectives import GPUFLOPEstimator

 # Optimize model
@@ -91,7 +94,9 @@ To optimize GPU FLOPs, the code is similar to above:
 print(optimized_model.summary())

 Finally, optimizing Vivado DSPs is possible, given a hls4ml config:
+
 .. code-block:: Python
+
 from hls4ml.utils.config import config_from_keras_model
 from hls4ml.optimization.objectives.vivado_objectives import VivadoDSPEstimator

@@ -121,7 +126,9 @@ Finally, optimizing Vivado DSPs is possible, given a hls4ml config:

 There are two more Vivado "optimizers" - VivadoFFEstimator, aimed at reducing register utilisation and VivadoMultiObjectiveEstimator, aimed at optimising BRAM and DSP utilisation.
 Note, to ensure DSPs are optimized, "unrolled" Dense multiplication must be used before synthesing HLS, by modifying the config:
+
 .. code-block:: Python
+
 hls_config = config_from_keras_model(optimized_model)
 hls_config['Model']['DenseResourceImplementation'] = 'Unrolled'
 # Any addition hls4ml config, such as strategy, reuse factor etc...
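
For orientation, a minimal sketch of the Vivado DSP use case this page describes, written against the docstring of optimize_keras_for_hls4ml shown further down in this diff; it is not code from this commit. The argument order follows that docstring, PolynomialScheduler is an assumed class name (the docstring only says "constant, polynomial and binary"), and whether the objective is passed as a class or an instance is an assumption.

.. code-block:: Python

    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.losses import CategoricalCrossentropy
    from tensorflow.keras.metrics import CategoricalAccuracy

    from hls4ml.utils.config import config_from_keras_model
    from hls4ml.optimization import optimize_keras_for_hls4ml
    from hls4ml.optimization.objectives.vivado_objectives import VivadoDSPEstimator
    # Assumed name; the docstring only lists constant, polynomial and binary schedulers
    from hls4ml.optimization.scheduler import PolynomialScheduler

    # baseline_model, X_train, y_train, X_val, y_val are placeholders for your own data
    hls_config = config_from_keras_model(baseline_model)

    optimized_model = optimize_keras_for_hls4ml(
        baseline_model, hls_config, VivadoDSPEstimator, PolynomialScheduler(),
        X_train, y_train, X_val, y_val,
        batch_size=256, epochs=10,
        optimizer=Adam(), loss_fn=CategoricalCrossentropy(),
        validation_metric=CategoricalAccuracy(),
        increasing=True,   # accuracy improves as it increases
        rtol=0.98,         # stop once accuracy falls below 98% of the baseline
    )

    # As noted above, unrolled Dense multiplication is needed for the DSP savings
    # to materialise in synthesis
    hls_config = config_from_keras_model(optimized_model)
    hls_config['Model']['DenseResourceImplementation'] = 'Unrolled'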

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,7 @@
 advanced/fifo_depth
 advanced/extension
 advanced/accelerator
+advanced/model_optimization

 .. toctree::
 :hidden:
@@ -34,6 +35,7 @@
 autodoc/hls4ml.backends
 autodoc/hls4ml.converters
 autodoc/hls4ml.model
+autodoc/hls4ml.optimization
 autodoc/hls4ml.report
 autodoc/hls4ml.utils
 autodoc/hls4ml.writer

hls4ml/optimization/__init__.py

Lines changed: 33 additions & 32 deletions
@@ -37,41 +37,42 @@ def optimize_keras_for_hls4ml(
 Top-level function for optimizing a Keras model, given hls4ml config and a hardware objective(s)

 Args:
-- keras_model (keras.Model): Model to be optimized
-- hls_config (dict): hls4ml configuration, obtained from hls4ml.utils.config.config_from_keras_model(...)
-- objective (hls4ml.optimization.objectives.ObjectiveEstimator):
+keras_model (keras.Model): Model to be optimized
+hls_config (dict): hls4ml configuration, obtained from hls4ml.utils.config.config_from_keras_model(...)
+objective (hls4ml.optimization.objectives.ObjectiveEstimator):
 Parameter, hardware or user-defined objective of optimization
-- scheduler (hls4ml.optimization.schduler.OptimizationScheduler):
+scheduler (hls4ml.optimization.scheduler.OptimizationScheduler):
 Sparsity scheduler, choose between constant, polynomial and binary
-- X_train (np.array): Training inputs
-- y_train (np.array): Training labels
-- X_val (np.array): Validation inputs
-- y_val (np.array): Validation labels
-- batch_size (int): Batch size during training
-- epochs (int): Maximum number of epochs to fine-tune model, in one iteration of pruning
-- optimizer (keras.optimizers.Optimizer or equivalent-string description): Optimizer used during training
-- loss_fn (keras.losses.Loss or equivalent loss description): Loss function used during training
-- validation_metric (keras.metrics.Metric or equivalent loss description): Validation metric, used as a baseline
-- increasing (boolean): If the metric improves with increased values;
-e.g. accuracy -> increasing = True, MSE -> increasing = False
-- rtol (float): Relative tolerance;
-pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric
+X_train (np.array): Training inputs
+y_train (np.array): Training labels
+X_val (np.array): Validation inputs
+y_val (np.array): Validation labels
+batch_size (int): Batch size during training
+epochs (int): Maximum number of epochs to fine-tune model, in one iteration of pruning
+optimizer (keras.optimizers.Optimizer or equivalent-string description): Optimizer used during training
+loss_fn (keras.losses.Loss or equivalent loss description): Loss function used during training
+validation_metric (keras.metrics.Metric or equivalent loss description): Validation metric, used as a baseline
+increasing (boolean): If the metric improves with increased values;
+e.g. accuracy -> increasing = True, MSE -> increasing = False
+rtol (float): Relative tolerance;
+pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric
+callbacks (list of keras.callbacks.Callback) Currently not supported, developed in future versions
+ranking_metric (string): Metric used for ranking weights and structures;
+currently supported l1, l2, saliency and Oracle
+local (boolean): Layer-wise or global pruning
+verbose (boolean): Display debug logs during model optimization
+rewinding_epochs (int): Number of epochs to retrain model without weight freezing,
+allows regrowth of previously pruned weights
+cutoff_bad_trials (int): After how many bad trials (performance below threshold),
+should model pruning / weight sharing stop
+directory (string): Directory to store temporary results
+tuner (str): Tuning algorithm, choose between Bayesian, Hyperband and None
+knapsack_solver (str): Algorithm to solve Knapsack problem when optimizing;
+default usually works well; for very large networks, greedy algorithm might be more suitable
+regularization_range (list): List of suitable hyperparameters for weight decay

-Kwargs:
-- callbacks (list of keras.callbacks.Callback) Currently not supported, developed in future versions
-- ranking_metric (string): Metric used for rannking weights and structures;
-currently supported l1, l2, saliency and Oracle
-- local (boolean): Layer-wise or global pruning
-- verbose (boolean): Display debug logs during model optimization
-- rewinding_epochs (int): Number of epochs to retrain model without weight freezing,
-allows regrowth of previously pruned weights
-- cutoff_bad_trials (int): After how many bad trials (performance below threshold),
-should model pruning / weight sharing stop
-- directory (string): Directory to store temporary results
-- tuner (str): Tuning alogorithm, choose between Bayesian, Hyperband and None
-- knapsack_solver (str): Algorithm to solve Knapsack problem when optimizing;
-default usually works well; for very large networks, greedy algorithm might be more suitable
-- regularization_range (list): List of suitable hyperparameters for weight decay
+Returns:
+keras.Model: Optimized model
 '''

 # Extract model attributes
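
The rtol semantics documented above can be read as the following stand-alone check; this is an illustration of the documented stopping rule, not code from this commit.

.. code-block:: Python

    def pruning_should_stop(pruned_metric, baseline_metric, rtol, increasing):
        """Stop once the pruned model falls outside the tolerated band around the baseline."""
        if increasing:  # e.g. accuracy: higher is better
            return pruned_metric < rtol * baseline_metric
        return pruned_metric > rtol * baseline_metric  # e.g. MSE: lower is better

    # Baseline accuracy 0.920 with rtol 0.98 -> stop once pruned accuracy drops below 0.9016
    assert pruning_should_stop(0.890, 0.920, 0.98, increasing=True)
    assert not pruning_should_stop(0.915, 0.920, 0.98, increasing=True)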

hls4ml/optimization/attributes.py

Lines changed: 30 additions & 30 deletions
@@ -11,14 +11,14 @@ class hls4mlAttributes:
 A class for storing hls4ml information of a single layer

 Args:
-- n_in (int): Number of inputs (rows) for Dense matrix multiplication
-- n_out (int): Number of outputs (cols) for Dense matrix multiplication
-- io_type (string): io_parallel or io_stream
-- strategy (string): Resource or Latency
-- weight_precision (FixedPrecisionType): Layer weight precision
-- output_precision (FixedPrecisionType): Layer output precision
-- reuse_factor (int): Layer reuse factor
-- parallelization_factor (int): Layer parallelization factor - [applicable to io_parallel Conv2D]
+n_in (int): Number of inputs (rows) for Dense matrix multiplication
+n_out (int): Number of outputs (cols) for Dense matrix multiplication
+io_type (string): io_parallel or io_stream
+strategy (string): Resource or Latency
+weight_precision (FixedPrecisionType): Layer weight precision
+output_precision (FixedPrecisionType): Layer output precision
+reuse_factor (int): Layer reuse factor
+parallelization_factor (int): Layer parallelization factor - [applicable to io_parallel Conv2D]
 '''

 def __init__(
@@ -51,12 +51,12 @@ class OptimizationAttributes:
 A class for storing layer optimization attributes

 Args:
-- structure_type (enum): Targeted structure - unstructured, structured, pattern, block
-- pruning (boolean): Should pruning be applied to the layer
-- weight_sharing (boolean): Should weight sharing be applied to the layer
-- block_shape (tuple): Block shape if structure_type == block
-- pattern_offset (int): Length of each pattern if structure_type == pattern
-- consecutive_patterns (int): How many consecutive patterns are grouped together if structure_type == pattern
+structure_type (enum): Targeted structure - unstructured, structured, pattern, block
+pruning (boolean): Should pruning be applied to the layer
+weight_sharing (boolean): Should weight sharing be applied to the layer
+block_shape (tuple): Block shape if structure_type == block
+pattern_offset (int): Length of each pattern if structure_type == pattern
+consecutive_patterns (int): How many consecutive patterns are grouped together if structure_type == pattern

 Notes:
 - In the case of hls4ml, pattern_offset is equivalent to the number of weights processed in parallel
@@ -88,16 +88,16 @@ class LayerAttributes:
 A class for storing layer information

 Args:
-- name (string): Layer name
-- layer_type (keras.Layer): Layer type (e.g. Dense, Conv2D etc.)
-- inbound_layers (list): List of parent nodes, identified by name
-- weight_shape (tuple): Layer weight shape
-- input_shape (tuple): Layer input shape
-- output_shape (tuple): Layer output shape
-- optimizable (bool): Should optimizations (pruning, weight sharing) be applied to this layer
-- optimization_attributes (OptimizationAttributes): Type of optimization,
+name (string): Layer name
+layer_type (keras.Layer): Layer type (e.g. Dense, Conv2D etc.)
+inbound_layers (list): List of parent nodes, identified by name
+weight_shape (tuple): Layer weight shape
+input_shape (tuple): Layer input shape
+output_shape (tuple): Layer output shape
+optimizable (bool): Should optimizations (pruning, weight sharing) be applied to this layer
+optimization_attributes (OptimizationAttributes): Type of optimization,
 pruning or weight sharing, block shape and pattern offset
-- args (dict): Additional information,
+args (dict): Additional information,
 e.g. hls4mlAttributes; dictionary so it can be generic enough for different platforms
 '''

@@ -147,10 +147,10 @@ def get_attributes_from_keras_model(model):
 Per-layer pruning sype (structured, pattern etc.), depend on the pruning objective and are inserted later

 Args:
-- model (keras.model): Model to extract attributes from
+model (keras.model): Model to extract attributes from

-Return:
-- model_attributes (dict): Each key corresponds to a layer name, values are instances of LayerAttribute
+Returns:
+model_attributes (dict): Each key corresponds to a layer name, values are instances of LayerAttribute
 '''
 is_sequential = model.__class__.__name__ == 'Sequential'
 model_attributes = {}
@@ -188,11 +188,11 @@ def get_attributes_from_keras_model_and_hls4ml_config(model, config):
 Per-layer pruning sype (structured, pruning etc.), depend on the pruning objective and are inserted later

 Args:
-- model (keras.model): Model to extract attributes from
-- config (dict): hls4ml dictionary
+model (keras.model): Model to extract attributes from
+config (dict): hls4ml dictionary

-Return:
-- model_attributes (dict): Each key corresponds to a layer name, values are LayerAttribute instances
+Returns:
+model_attributes (dict): Each key corresponds to a layer name, values are LayerAttribute instances
 '''

 # Extract Keras attributes
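
A small usage sketch of the helper documented above, not part of this commit; it assumes the returned LayerAttributes objects expose the documented fields (weight_shape, optimizable) as plain attributes.

.. code-block:: Python

    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential

    from hls4ml.optimization.attributes import get_attributes_from_keras_model

    model = Sequential([Dense(64, activation='relu', input_shape=(16,)), Dense(10)])

    # Each key is a layer name, each value a LayerAttributes instance
    model_attributes = get_attributes_from_keras_model(model)
    for name, attrs in model_attributes.items():
        print(name, attrs.weight_shape, attrs.optimizable)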

hls4ml/optimization/keras/__init__.py

Lines changed: 39 additions & 41 deletions
@@ -50,45 +50,43 @@ def optimize_model(
 Top-level function for optimizing a Keras model, given objectives

 Args:
-- model (keras.Model): Model to be optimized
-- model_attributes (dict): Layer-wise model attributes,
-obtained from hls4ml.optimization.get_attributes_from_keras_model(...)
-- objective (hls4ml.optimization.objectives.ObjectiveEstimator):
-Parameter, hardware or user-defined objective of optimization
-- scheduler (hls4ml.optimization.schduler.OptimizationScheduler):
-Sparsity scheduler, choose between constant, polynomial and binary
-- X_train (np.array): Training inputs
-- y_train (np.array): Training labels
-- X_val (np.array): Validation inputs
-- y_val (np.array): Validation labels
-- batch_size (int): Batch size during training
-- epochs (int): Maximum number of epochs to fine-tune model, in one iteration of pruning
-- optimizer (keras.optimizers.Optimizer or equivalent-string description):
-Optimizer used during training
-- loss_fn (keras.losses.Loss or equivalent loss description):
-Loss function used during training
-- validation_metric (keras.metrics.Metric or equivalent loss description):
-Validation metric, used as a baseline
-- increasing (boolean): If the metric improves with increased values;
-e.g. accuracy -> increasing = True, MSE -> increasing = False
-- rtol (float): Relative tolerance;
-pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric
-
-Kwargs:
-- callbacks (list of keras.callbacks.Callback) Currently not supported, developed in future versions
-- ranking_metric (string): Metric used for rannking weights and structures;
-currently supported l1, l2, saliency and Oracle
-- local (boolean): Layer-wise or global pruning
-- verbose (boolean): Display debug logs during model optimization
-- rewinding_epochs (int): Number of epochs to retrain model without weight freezing,
-allows regrowth of previously pruned weights
-- cutoff_bad_trials (int): After how many bad trials (performance below threshold),
-should model pruning / weight sharing stop
-- directory (string): Directory to store temporary results
-- tuner (str): Tuning alogorithm, choose between Bayesian, Hyperband and None
-- knapsack_solver (str): Algorithm to solve Knapsack problem when optimizing;
-default usually works well; for very large networks, greedy algorithm might be more suitable
-- regularization_range (list): List of suitable hyperparameters for weight decay
+model (keras.Model): Model to be optimized
+model_attributes (dict): Layer-wise model attributes,
+obtained from hls4ml.optimization.get_attributes_from_keras_model(...)
+objective (hls4ml.optimization.objectives.ObjectiveEstimator):
+Parameter, hardware or user-defined objective of optimization
+scheduler (hls4ml.optimization.scheduler.OptimizationScheduler):
+Sparsity scheduler, choose between constant, polynomial and binary
+X_train (np.array): Training inputs
+y_train (np.array): Training labels
+X_val (np.array): Validation inputs
+y_val (np.array): Validation labels
+batch_size (int): Batch size during training
+epochs (int): Maximum number of epochs to fine-tune model, in one iteration of pruning
+optimizer (keras.optimizers.Optimizer or equivalent-string description): Optimizer used during training
+loss_fn (keras.losses.Loss or equivalent loss description): Loss function used during training
+validation_metric (keras.metrics.Metric or equivalent loss description): Validation metric, used as a baseline
+increasing (boolean): If the metric improves with increased values;
+e.g. accuracy -> increasing = True, MSE -> increasing = False
+rtol (float): Relative tolerance;
+pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric
+callbacks (list of keras.callbacks.Callback) Currently not supported, developed in future versions
+ranking_metric (string): Metric used for ranking weights and structures;
+currently supported l1, l2, saliency and Oracle
+local (boolean): Layer-wise or global pruning
+verbose (boolean): Display debug logs during model optimization
+rewinding_epochs (int): Number of epochs to retrain model without weight freezing,
+allows regrowth of previously pruned weights
+cutoff_bad_trials (int): After how many bad trials (performance below threshold),
+should model pruning / weight sharing stop
+directory (string): Directory to store temporary results
+tuner (str): Tuning algorithm, choose between Bayesian, Hyperband and None
+knapsack_solver (str): Algorithm to solve Knapsack problem when optimizing;
+default usually works well; for very large networks, greedy algorithm might be more suitable
+regularization_range (list): List of suitable hyperparameters for weight decay
+
+Returns:
+keras.Model: Optimized model
 '''

 if not isinstance(scheduler, OptimizationScheduler):
@@ -213,7 +211,7 @@ def optimize_model(

 # Mask gradients
 # Before training the model at the next sparsity level, reset internal states
-# Furthemore, modern optimizers (e.g. Adam) accumulate gradients during backprop
+# Furthermore, modern optimizers (e.g. Adam) accumulate gradients during backprop
 # Therefore, even if the gradient for a weight is zero, it might be updated, due to previous gradients
 # Avoid this by resetting the internal variables of an optimizer
 optimizable_model.reset_metrics()
@@ -329,7 +327,7 @@ def __call__(self, X, y, s):
 - y (tf.Tensor): Output data
 - s (float): Sparsity

-Return:
+Returns:
 - loss (tf.Varilable): Model loss with input X and output y
 '''
 grads = []
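
The behaviour described in the corrected comment above (a weight whose gradient is masked to zero can still move, because Adam keeps per-weight moment estimates) can be reproduced in isolation; this snippet is illustrative only and uses plain TensorFlow, not the optimization API.

.. code-block:: Python

    import tensorflow as tf

    w = tf.Variable([1.0, 1.0])
    opt = tf.keras.optimizers.Adam(learning_rate=0.1)

    # Step 1: both weights receive a gradient, so Adam builds up moment estimates for both
    opt.apply_gradients([(tf.constant([1.0, 1.0]), w)])
    # Step 2: the second weight's gradient is masked to zero...
    opt.apply_gradients([(tf.constant([1.0, 0.0]), w)])
    # ...yet it still moves, because its first-moment estimate is non-zero;
    # hence the need to reset optimizer state after masking
    print(w.numpy())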
