`src/content/get-started/sparsify-a-model/supported-integrations.mdx`

This page walks through an example of creating a sparsification recipe to prune a dense model from scratch and applying a recipe to a supported integration.

SparseML has pre-made integrations with many popular model repositories, such as HuggingFace Transformers and Ultralytics YOLOv5.
For these integrations, a sparsification recipe is all you need, and you can apply state-of-the-art sparsification algorithms, including
pruning, distillation, and quantization, with a single command line call.

## Pruning and Pruning Recipes

Pruning is a systematic way of removing redundant weights and connections within a neural network. An applied pruning algorithm must determine which
weights are redundant and can be removed without affecting accuracy.

A standard algorithm for pruning is gradual magnitude pruning, or GMP for short.
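
To make this concrete, here is a minimal sketch of the core GMP loop in PyTorch: a sparsity schedule ramps from an initial to a final sparsity over a range of epochs, and at each update the lowest-magnitude weights are masked to zero. The helper names and the cubic ramp are illustrative assumptions for this page, not SparseML's internal implementation; in practice the recipe below drives this for you.

```python
import torch

def gmp_sparsity(epoch: float, init_sparsity: float = 0.05, final_sparsity: float = 0.8,
                 start_epoch: float = 0.0, end_epoch: float = 30.0) -> float:
    """Interpolate the target sparsity for the current epoch (cubic ramp, an assumption)."""
    if epoch <= start_epoch:
        return init_sparsity
    if epoch >= end_epoch:
        return final_sparsity
    progress = (epoch - start_epoch) / (end_epoch - start_epoch)
    return final_sparsity + (init_sparsity - final_sparsity) * (1.0 - progress) ** 3

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Build a 0/1 mask that zeroes out the smallest-magnitude fraction of the weights."""
    num_to_prune = int(sparsity * weight.numel())
    if num_to_prune == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(num_to_prune).values
    return (weight.abs() > threshold).to(weight.dtype)

# At each update step during training, re-apply the mask so pruned weights stay at zero:
# sparsity = gmp_sparsity(current_epoch)
# layer.weight.data.mul_(magnitude_mask(layer.weight.data, sparsity))
```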
SparseML conveniently encodes these hyperparameters into a YAML-based **Recipe** file. The rest of the system parses the arguments in the YAML file to set the parameters of the algorithm.

For example, the following `recipe.yaml` file encodes the default values listed above:
```yaml
modifiers:
  - !GlobalMagnitudePruningModifier
    init_sparsity: 0.05
    final_sparsity: 0.8
    start_epoch: 0.0
    end_epoch: 30.0
    update_frequency: 1.0
    params: __ALL_PRUNABLE__

  - !SetLearningRateModifier
    start_epoch: 0.0
    learning_rate: 0.05

  - !LearningRateFunctionModifier
    start_epoch: 30.0
    end_epoch: 50.0
    lr_func: cosine
    init_lr: 0.05
    final_lr: 0.001

  - !EpochRangeModifier
    start_epoch: 0.0
    end_epoch: 50.0
```
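With a supported integration, this recipe is simply passed to the integration's training command and everything else is handled for you. For a custom PyTorch loop, the recipe can be applied directly with SparseML's `ScheduledModifierManager`, roughly as sketched below; the toy model, data, and epoch count are placeholders, and the exact import path may vary across SparseML versions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from sparseml.pytorch.optim import ScheduledModifierManager

# Toy model and data stand in for a real integration's model and dataloader.
model = torch.nn.Sequential(torch.nn.Linear(128, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10))
dataset = TensorDataset(torch.randn(512, 128), torch.randint(0, 10, (512,)))
train_loader = DataLoader(dataset, batch_size=64)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Parse the recipe and hand control of pruning and the learning rate over to it.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

for epoch in range(50):  # matches the EpochRangeModifier's end_epoch above
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), labels)
        loss.backward()
        optimizer.step()  # masks and LR schedules are updated behind the scenes

manager.finalize(model)  # remove the modifier hooks once training is done
```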
## Quantization and Quantization Recipes

A quantization recipe systematically reduces the precision of the weights and activations within a neural network, generally from `FP32` to `INT8`. Running a quantized
model increases speed and reduces memory consumption while sacrificing very little in terms of accuracy.

Quantization-aware training (QAT) is the standard algorithm. With QAT, fake quantization operators are injected into the graph before quantizable nodes for activations, and weights are wrapped with fake quantization operators.
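
As an illustration of what a fake quantization operator does, the toy function below rounds a tensor onto an `INT8` grid and immediately maps it back to floating point, so training still runs in `FP32` while the forward pass experiences quantization error. It is a conceptual sketch with a simple symmetric, per-tensor scale, not SparseML's or PyTorch's actual QAT machinery.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Quantize onto the INT8 grid and dequantize again, keeping the tensor in FP32."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = x.abs().max() / qmax  # symmetric, per-tensor scale for simplicity
    if scale == 0:
        return x  # an all-zero tensor has nothing to quantize
    q = torch.clamp(torch.round(x / scale), qmin, qmax)
    return q * scale  # values stay FP32 but are restricted to representable INT8 levels

x = torch.randn(4, 4)
print((x - fake_quantize(x)).abs().max())  # the error QAT teaches the network to absorb
```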
The following are reasonably good values to start with:

- The number of quantized training epochs is set to 5.
- The batch normalization statistics are frozen at the start of the third epoch.

For example, the following `recipe.yaml` file encodes the default values listed above:
```yaml
modifiers:
  - !QuantizationModifier
    start_epoch: 0.0
    submodules: ['model']
    freeze_bn_stats_epoch: 3.0

  - !SetLearningRateModifier
    start_epoch: 0.0
    learning_rate: 10e-6

  - !EpochRangeModifier
    start_epoch: 0.0
    end_epoch: 5.0
```
Combining the two previous recipes creates the following new `recipe.yaml` file: