@@ -47,14 +47,14 @@ index: 3000
47
47
48
48
[SparseZoo is a constantly-growing repository](https://sparsezoo.neuralmagic.com) of sparsified (pruned and pruned-quantized) models with matching sparsification recipes for neural networks.
49
49
It simplifies and accelerates your time-to-value in building performant deep learning models with a collection of inference-optimized models and recipes to prototype from.
50
- Read more about sparsification [ here.] ( https://docs.neuralmagic.com/main/source/getstarted.html#sparsification )
50
+ Read more about sparsification [here](https://docs.neuralmagic.com/main/source/getstarted.html#sparsification).
51
51
52
52
Available via API and hosted in the cloud, the SparseZoo contains both baseline models and models sparsified to different degrees, each trading inference performance against recovery of the baseline loss.
53
- Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your architectures.
53
+ Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your architectures.
54
54
55
55
The [GitHub repository](https://github.com/neuralmagic/sparsezoo) contains the Python API code to handle the connection and authentication to the cloud.
56
56
57
- <img alt = " SparseZoo Flow" src = " https://docs.neuralmagic.com/docs/source/infographics/sparsezoo.png" width = " 100% " />
57
+ <img alt="SparseZoo Flow" src="https://docs.neuralmagic.com/docs/source/infographics/sparsezoo.png" width="960px" />
58
58
59
59
## Highlights
60
60
@@ -64,8 +64,8 @@ The [GitHub repository](https://github.com/neuralmagic/sparsezoo) contains the P
64
64
65
65
## Installation
66
66
67
- This repository is tested on Python 3.6 -3.9, and Linux/Debian systems.
68
- It is recommended to install in a [ virtual environment] ( https://docs.python.org/3/library/venv.html ) to keep your system in order.
67
+ This repository is tested on Python 3.7-3.9 and Linux/Debian systems.
+ It is recommended to install in a [virtual environment](https://docs.python.org/3/library/venv.html) to keep your system in order.
69
69
70
70
Install with pip using:
71
71
@@ -75,47 +75,271 @@ pip install sparsezoo
75
75
76
76
## Quick Tour
77
77
78
- ### Python APIs
78
+ The SparseZoo Python API enables you to search and download sparsified models. Code examples are given below.
79
+ We encourage users to load SparseZoo models by copying a stub directly from a [model page](https://sparsezoo.neuralmagic.com/).
79
80
80
- The Python APIs respect this format enabling you to search and download models. Some code examples are given below.
81
- The [ SparseZoo UI] ( https://sparsezoo.neuralmagic.com/ ) also enables users to load models by copying
82
- a stub directly from a model page.
81
+ ### Introduction to Model Class Object
83
82
83
+ The `Model` class is the fundamental object and the main interface to the SparseZoo library.
84
+ It represents a SparseZoo model, together with all its directories and files.
84
85
85
- #### Loading from a Stub
86
+ #### Creating a Model Class Object From SparseZoo Stub
87
+ ```python
+ from sparsezoo import Model
+
+ stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
+
+ model = Model(stub)
+ print(str(model))
+
+ # >> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)
+ ```
97
+
98
+ #### Creating a Model Class Object From Local Model Directory
99
+ ```python
+ from sparsezoo import Model
+
+ directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"
+
+ model = Model(directory)
+ print(str(model))
+
+ # >> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)
+ ```
109
+
110
+ #### Manually Specifying the Model Download Path
111
+
112
+ Unless specified otherwise, the model created from the SparseZoo stub is saved to the local sparsezoo cache directory.
113
+ This can be overridden by passing the optional ` download_path ` argument to the constructor:
114
+
115
+ ```python
+ from sparsezoo import Model
+
+ stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
+ download_directory = "./model_download_directory"
+
+ model = Model(stub, download_path=download_directory)
+ ```
123
+ #### Downloading the Model Files
124
+ Once the model is initialized from a stub, it can be downloaded either by calling the `download()` method or by accessing the `path` property. Both approaches work for every file in SparseZoo. Accessing the `path` property triggers a download unless the file has already been downloaded.
125
+
126
+ ``` python
127
+ # method 1
128
+ model.download()
129
+
130
+ # method 2
131
+ model_path = model.path
132
+ ```
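The "download on first access" behavior described above can be illustrated with a minimal, self-contained sketch. `LazyFile` and `fake_fetch` are hypothetical names invented for this illustration; this is not SparseZoo's implementation, only the general lazy-property pattern under the assumption that a file is fetched once and then reused from disk.

```python
import tempfile
from pathlib import Path


class LazyFile:
    """Hypothetical stand-in for a remote file that is fetched on demand."""

    def __init__(self, name, directory, fetch):
        self.name = name
        self._target = Path(directory) / name
        self._fetch = fetch  # callable returning the file contents as bytes

    def download(self):
        # Explicit download: always (re)writes the file to disk.
        self._target.parent.mkdir(parents=True, exist_ok=True)
        self._target.write_bytes(self._fetch(self.name))

    @property
    def path(self):
        # Implicit download: only fetch when the file is not on disk yet.
        if not self._target.exists():
            self.download()
        return str(self._target)


fetch_calls = []


def fake_fetch(name):
    # Assumption: replaces a real network request for the sake of the demo.
    fetch_calls.append(name)
    return b"demo contents"


with tempfile.TemporaryDirectory() as tmp_dir:
    lazy_file = LazyFile("model.md", tmp_dir, fake_fetch)
    first_path = lazy_file.path   # triggers the fetch
    second_path = lazy_file.path  # reuses the file already on disk
    fetch_count = len(fetch_calls)
```

Accessing `path` twice leaves `fetch_count` at one, which mirrors the rule that a file is not downloaded again once it is present locally.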
133
+
134
+ #### Inspecting the Contents of the SparseZoo Model
135
+
136
+ We inspect the `available_files` property to see which files are present in the SparseZoo model. Then, we select a file by accessing the corresponding attribute:
137
+
138
+ ```python
+ model.available_files
+
+ # >> {'training': Directory(name=training),
+ # >>  'deployment': Directory(name=deployment),
+ # >>  'sample_inputs': Directory(name=sample_inputs.tar.gz),
+ # >>  'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
+ # >>  'sample_labels': Directory(name=sample_labels.tar.gz),
+ # >>  'model_card': File(name=model.md),
+ # >>  'recipes': Directory(name=recipe),
+ # >>  'onnx_model': File(name=model.onnx)}
+ ```
150
+ Then, we might take a closer look at the contents of the SparseZoo model:
151
+ ```python
+ model_card = model.model_card
+ print(model_card)
+
+ # >> File(name=model.md)
+ ```
+ ```python
+ model_card_path = model.model_card.path
+ print(model_card_path)
+
+ # >> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md
+ ```
163
+
164
+
165
+ ### Model, Directory, and File
166
+
167
+ In general, every file in the SparseZoo model shares a set of attributes: `name`, `path`, `url`, and `parent_directory`:
+ - `name` serves as an identifier of the file/directory
+ - `path` points to the location of the file/directory
+ - `url` specifies the server address of the file/directory in question
+ - `parent_directory` points to the location of the parent directory of the file/directory in question
172
+
173
+ A directory is a unique type of file that contains other files. For that reason, it has an additional ` files ` attribute.
174
+
175
+ ```python
+ print(model.onnx_model)
+
+ # >> File(name=model.onnx)
+
+ print(f"File name: {model.onnx_model.name}\n"
+       f"File path: {model.onnx_model.path}\n"
+       f"File URL: {model.onnx_model.url}\n"
+       f"Parent directory: {model.onnx_model.parent_directory}")
+
+ # >> File name: model.onnx
+ # >> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
+ # >> File URL: https://models.neuralmagic.com/cv-classification/...
+ # >> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
+ ```
190
+
191
+ ```python
+ print(model.recipes)
+
+ # >> Directory(name=recipe)
+
+ print(f"File name: {model.recipes.name}\n"
+       f"Contains: {[file.name for file in model.recipes.files]}\n"
+       f"File path: {model.recipes.path}\n"
+       f"File URL: {model.recipes.url}\n"
+       f"Parent directory: {model.recipes.parent_directory}")
+
+ # >> File name: recipe
+ # >> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
+ # >> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
+ # >> File URL: None
+ # >> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
+ ```
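To make the file/directory relationship concrete, here is a minimal sketch that models the attributes described above with plain dataclasses and walks a directory tree. `FileSketch`, `DirectorySketch`, and `iter_file_names` are hypothetical names used only for illustration; they are assumptions and not the classes shipped with SparseZoo.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class FileSketch:
    # Mirrors the attributes listed above; all but `name` are optional here.
    name: str
    path: Optional[str] = None
    url: Optional[str] = None
    parent_directory: Optional[str] = None


@dataclass
class DirectorySketch(FileSketch):
    # A directory is a file that additionally holds other files.
    files: List[FileSketch] = field(default_factory=list)


def iter_file_names(node: FileSketch, prefix: str = "") -> Iterator[str]:
    """Yield the relative name of every plain file below `node`."""
    full_name = f"{prefix}{node.name}"
    if isinstance(node, DirectorySketch):
        for child in node.files:
            yield from iter_file_names(child, prefix=f"{full_name}/")
    else:
        yield full_name


recipes = DirectorySketch(
    name="recipe",
    files=[
        FileSketch(name="recipe_original.md"),
        FileSketch(name="recipe_transfer-classification.md"),
    ],
)
recipe_names = list(iter_file_names(recipes))
```

Treating a directory as a file with an extra `files` attribute lets the same traversal code handle both kinds of node.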
208
+
209
+ ### Selecting Checkpoint-Specific Data
210
+
211
+ A SparseZoo model may contain several checkpoints. One may have been saved before the model was quantized; that checkpoint is used for transfer learning. Another may have been saved after the quantization step; that one is usually used directly for inference.
212
+
213
+ The recipes may also vary depending on the use case. We may want to access a recipe that was used to sparsify the dense model (`recipe_original`) or the one that enables us to sparse transfer learn from the already sparsified model (`recipe_transfer`).
214
+
215
+ There are two ways to access those specific files.
216
+
217
+ #### Accessing Recipes (Through Python API)
218
+ ```python
+ available_recipes = model.recipes.available
+ print(available_recipes)
+
+ # >> ['original', 'transfer-classification']
+
+ transfer_recipe = model.recipes["transfer-classification"]
+ print(transfer_recipe)
+
+ # >> File(name=recipe_transfer-classification.md)
+
+ original_recipe = model.recipes.default  # recipe defaults to `original`
+ original_recipe_path = original_recipe.path  # downloads the recipe and returns its path
+ print(original_recipe_path)
+
+ # >> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md
+ ```
235
+
236
+ #### Accessing Checkpoints (Through Python API)
237
+ In general, a model is expected to include the following checkpoints:
238
+
239
+ - ` checkpoint_prepruning `
240
+ - ` checkpoint_postpruning `
241
+ - ` checkpoint_preqat `
242
+ - ` checkpoint_postqat `
243
+
244
+ The checkpoint that the model defaults to is the ` preqat ` state (just before the quantization step).
245
+
246
+ ```python
+ from sparsezoo import Model
+
+ stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"
+
+ model = Model(stub)
+ available_checkpoints = model.training.available
+ print(available_checkpoints)
+
+ # >> ['preqat']
+
+ preqat_checkpoint = model.training.default  # checkpoint defaults to `preqat`
+ preqat_checkpoint_path = preqat_checkpoint.path  # downloads the checkpoint and returns its path
+ print(preqat_checkpoint_path)
+
+ # >> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training
+
+ # List the files that make up the checkpoint
+ for file in preqat_checkpoint.files:
+     print(file.name)
+
+ # >> vocab.txt
+ # >> special_tokens_map.json
+ # >> pytorch_model.bin
+ # >> config.json
+ # >> training_args.bin
+ # >> tokenizer_config.json
+ # >> trainer_state.json
+ # >> tokenizer.json
+ ```
274
+
275
+
276
+ #### Accessing Recipes (Through Stub String Arguments)
277
+
278
+ You can also directly request a specific recipe/checkpoint type by appending the appropriate URL query arguments to the stub:
87
279
``` python
88
280
from sparsezoo import Model
89
281
90
- # copied from https://sparsezoo.neuralmagic.com/
91
- stub = " zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned90_quant-none "
282
+ stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"
283
+
92
284
model = Model(stub)
93
- print (model)
285
+
286
+ # Inspect which files are present.
287
+ # Note that the available recipes are restricted
288
+ # according to the specified URL query arguments
289
+ print(model.recipes.available)
+
+ # >> ['transfer-classification']
+
+ # The recipes now default to the one selected by the stub string arguments
+ transfer_recipe = model.recipes.default
+ print(transfer_recipe)
+
+ # >> File(name=recipe_transfer-classification.md)
297
+ ```
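Because the arguments are appended to the stub in URL query syntax, a stub can be taken apart with the standard library. The following is a small sketch of that idea; `split_stub` is a hypothetical helper written for this illustration and is not part of the SparseZoo API.

```python
from urllib.parse import parse_qs


def split_stub(stub):
    """Split a stub into its base part and a dict of query arguments.

    Hypothetical helper: it only illustrates the `?key=value` syntax and
    does not reproduce how SparseZoo itself interprets stubs.
    """
    base, _, query = stub.partition("?")
    # parse_qs returns a list per key; keep the first value of each.
    arguments = {key: values[0] for key, values in parse_qs(query).items()}
    return base, arguments


base_stub, stub_arguments = split_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/"
    "pruned95_quant-none?recipe=transfer"
)
print(base_stub)
print(stub_arguments)
```

A stub without a `?` suffix yields an empty argument dictionary, so the same helper handles both forms.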
298
+
299
+ ### Accessing Sample Data
300
+
301
+ The user may easily request a sample batch of data that represents the inputs and outputs of the model.
302
+
303
+ ``` python
304
+ sample_data = model.sample_batch(batch_size=10)
+
+ print(sample_data['sample_inputs'][0].shape)
+ # >> (10, 3, 224, 224)  (batch_size, num_channels, image_dim, image_dim)
+
+ print(sample_data['sample_outputs'][0].shape)
+ # >> (10, 1000)  (batch_size, num_classes)
94
311
```
95
312
96
- #### Searching the Zoo
313
+ ### Model Search
314
+ The function `search_models` enables the user to quickly filter the contents of the SparseZoo repository to find the stubs of interest:
97
315
98
316
``` python
99
317
from sparsezoo import search_models
100
318
101
- models = search_models(
102
- domain = " cv" ,
103
- sub_domain = " classification" ,
104
- return_stubs = True ,
105
- )
106
- print (models)
319
+ args = {
+     "domain": "cv",
+     "sub_domain": "segmentation",
+     "architecture": "yolact",
+ }
+
+ models = search_models(**args)
+ for model in models:
+     print(model)
+
+ # >> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
+ # >> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
+ # >> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)
107
331
```
108
332
109
333
### Environmental Variables
110
334
111
335
Users can specify the directories where models (temporarily, during download) and the required credentials are saved on their machine.
112
- ` SPARSEZOO_MODELS_PATH ` is the path where the downloaded models will be saved temporarily. Default ` ~/.cache/sparsezoo/ `
113
- ` SPARSEZOO_CREDENTIALS_PATH ` is the path where ` credentials.yaml ` will be saved. Default ` ~/.cache/sparsezoo/ `
336
+ ` SPARSEZOO_MODELS_PATH ` is the path where the downloaded models will be saved temporarily. Default ` ~/.cache/sparsezoo/ `
337
+ ` SPARSEZOO_CREDENTIALS_PATH ` is the path where ` credentials.yaml ` will be saved. Default ` ~/.cache/sparsezoo/ `
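
The override-or-default behavior of these variables can be sketched as follows. `resolve_dir` is a hypothetical helper written for illustration; it assumes only the variable names and the default `~/.cache/sparsezoo/` given above and does not show how SparseZoo reads them internally.

```python
import os
from pathlib import Path

# Default taken from the description above.
DEFAULT_SPARSEZOO_DIR = "~/.cache/sparsezoo/"


def resolve_dir(variable_name, environment=None):
    """Return the directory named by an environment variable, or the default.

    `environment` defaults to the real process environment; passing a dict
    makes the behavior easy to try out without touching `os.environ`.
    """
    environment = os.environ if environment is None else environment
    raw_value = environment.get(variable_name, DEFAULT_SPARSEZOO_DIR)
    return Path(raw_value).expanduser()


# Example with an explicit override (the paths are made up for the demo).
demo_environment = {"SPARSEZOO_MODELS_PATH": "/tmp/zoo_models"}
models_dir = resolve_dir("SPARSEZOO_MODELS_PATH", demo_environment)
credentials_dir = resolve_dir("SPARSEZOO_CREDENTIALS_PATH", demo_environment)
print(models_dir)
print(credentials_dir)
```

Here only the models path is overridden, so the credentials path falls back to the default location under the home directory.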
114
338
115
339
### Console Scripts
116
340
117
341
In addition to the Python APIs, a console script entry point is installed with the package ` sparsezoo ` .
118
- This enables easy interaction straight from your console/terminal.
342
+ This enables easy interaction straight from your console/terminal.
119
343
120
344
#### Downloading
121
345
@@ -125,15 +349,13 @@ Download command help
125
349
sparsezoo.download -h
126
350
```
127
351
128
- <br ></br >
129
- Download ResNet-50 Model
352
+ <br />Download ResNet-50 Model
130
353
131
354
``` shell script
132
355
sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none
133
356
```
134
357
135
- <br ></br >
136
- Download pruned and quantized ResNet-50 Model
358
+ <br />Download pruned and quantized ResNet-50 Model
137
359
138
360
``` shell script
139
361
sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned_quant-moderate
@@ -147,15 +369,13 @@ Search command help
147
369
sparsezoo search -h
148
370
```
149
371
150
- <br ></br >
151
- Searching for all classification MobileNetV1 models in the computer vision domain
372
+ <br />Searching for all classification MobileNetV1 models in the computer vision domain
152
373
153
374
``` shell script
154
375
sparsezoo search --domain cv --sub-domain classification --architecture mobilenet_v1
155
376
```
156
377
157
- <br ></br >
158
- Searching for all ResNet-50 models
378
+ <br />Searching for all ResNet-50 models
159
379
160
380
``` shell script
161
381
sparsezoo search --domain cv --sub-domain classification \