This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Commit 79b2c7c

update SparseZoo to latest readme
1 parent b509acf commit 79b2c7c

src/content/products/sparsezoo.mdx

Lines changed: 251 additions & 31 deletions
@@ -47,14 +47,14 @@ index: 3000
4747

4848
[SparseZoo is a constantly-growing repository](https://sparsezoo.neuralmagic.com) of sparsified (pruned and pruned-quantized) models with matching sparsification recipes for neural networks.
4949
It simplifies and accelerates your time-to-value in building performant deep learning models with a collection of inference-optimized models and recipes to prototype from.
50-
Read more about sparsification [here.](https://docs.neuralmagic.com/main/source/getstarted.html#sparsification)
50+
Read more about sparsification [here.](https://docs.neuralmagic.com/main/source/getstarted.html#sparsification)
5151

5252
Available via API and hosted in the cloud, the SparseZoo contains both baseline models and models sparsified to different degrees, trading inference performance against baseline loss recovery.
53-
Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your architectures.
53+
Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your architectures.
5454

5555
The [GitHub repository](https://github.com/neuralmagic/sparsezoo) contains the Python API code to handle the connection and authentication to the cloud.
5656

57-
<img alt="SparseZoo Flow" src="https://docs.neuralmagic.com/docs/source/infographics/sparsezoo.png" width="100%" />
57+
<img alt="SparseZoo Flow" src="https://docs.neuralmagic.com/docs/source/infographics/sparsezoo.png" width="960px" />
5858

5959
## Highlights
6060

@@ -64,8 +64,8 @@ The [GitHub repository](https://github.com/neuralmagic/sparsezoo) contains the P
6464

6565
## Installation
6666

67-
This repository is tested on Python 3.6-3.9, and Linux/Debian systems.
68-
It is recommended to install in a [virtual environment](https://docs.python.org/3/library/venv.html) to keep your system in order.
67+
This repository is tested on Python 3.7-3.9, and Linux/Debian systems.
68+
It is recommended to install in a [virtual environment](https://docs.python.org/3/library/venv.html) to keep your system in order.
6969

7070
Install with pip using:
7171

@@ -75,47 +75,271 @@ pip install sparsezoo
7575

7676
## Quick Tour
7777

78-
### Python APIs
78+
The SparseZoo Python API enables you to search and download sparsified models. Code examples are given below.
79+
We encourage users to load SparseZoo models by copying a stub directly from a [model page](https://sparsezoo.neuralmagic.com/).
7980

80-
The Python APIs respect this format enabling you to search and download models. Some code examples are given below.
81-
The [SparseZoo UI](https://sparsezoo.neuralmagic.com/) also enables users to load models by copying
82-
a stub directly from a model page.
81+
### Introduction to Model Class Object
8382

83+
The `Model` class is the fundamental object and the main interface to the SparseZoo library.
84+
It represents a SparseZoo model, together with all its directories and files.
8485

85-
#### Loading from a Stub
86+
#### Creating a Model Class Object From SparseZoo Stub
87+
```python
88+
from sparsezoo import Model
89+
90+
stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
91+
92+
model = Model(stub)
93+
print(str(model))
94+
95+
>> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)
96+
```
97+
98+
#### Creating a Model Class Object From Local Model Directory
99+
```python
100+
from sparsezoo import Model
101+
102+
directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"
103+
104+
model = Model(directory)
105+
print(str(model))
106+
107+
>> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)
108+
```
109+
110+
#### Manually Specifying the Model Download Path
111+
112+
Unless specified otherwise, the model created from the SparseZoo stub is saved to the local sparsezoo cache directory.
113+
This can be overridden by passing the optional `download_path` argument to the constructor:
114+
115+
```python
116+
from sparsezoo import Model
117+
118+
stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
119+
download_directory = "./model_download_directory"
120+
121+
model = Model(stub, download_path=download_directory)
122+
```
123+
#### Downloading the Model Files
124+
Once the model is initialized from a stub, it can be downloaded either by calling the `download()` method or by accessing the `path` property. Both approaches work for every file and directory in a SparseZoo model. Accessing `path` triggers a download unless the file has already been downloaded.
125+
126+
```python
127+
# method 1
128+
model.download()
129+
130+
# method 2
131+
model_path = model.path
132+
```
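The download-on-access behaviour described above can be sketched as a lazy property. This is only an illustration of the pattern, not the SparseZoo implementation: the `LazyFile` class, its `download_count` counter, and the placeholder file contents are all hypothetical.

```python
import tempfile
from pathlib import Path


class LazyFile:
    """Hypothetical stand-in for a remote file; not the SparseZoo class."""

    def __init__(self, name: str, cache_dir: Path):
        self.name = name
        self._cache_dir = cache_dir
        self.download_count = 0  # how many times a simulated download ran

    def download(self) -> None:
        """Simulate a download by writing a placeholder file into the cache."""
        self._cache_dir.mkdir(parents=True, exist_ok=True)
        (self._cache_dir / self.name).write_text("placeholder contents")
        self.download_count += 1

    @property
    def path(self) -> str:
        """Return the local path, downloading first only if the file is missing."""
        target = self._cache_dir / self.name
        if not target.exists():
            self.download()
        return str(target)


cache = Path(tempfile.mkdtemp()) / "demo_cache"
lazy_file = LazyFile("model.md", cache)
first = lazy_file.path   # file is missing, so the simulated download runs
second = lazy_file.path  # file is already present, so nothing is downloaded
print(lazy_file.download_count)
```

Because the check lives inside the property, callers never need to know whether the file was fetched earlier.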
133+
134+
#### Inspecting the Contents of the SparseZoo Model
135+
136+
Use the `available_files` attribute to inspect which files are present in the SparseZoo model. Then select a file through the corresponding attribute:
137+
138+
```python
139+
model.available_files
140+
141+
>> {'training': Directory(name=training),
142+
>> 'deployment': Directory(name=deployment),
143+
>> 'sample_inputs': Directory(name=sample_inputs.tar.gz),
144+
>> 'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
145+
>> 'sample_labels': Directory(name=sample_labels.tar.gz),
146+
>> 'model_card': File(name=model.md),
147+
>> 'recipes': Directory(name=recipe),
148+
>> 'onnx_model': File(name=model.onnx)}
149+
```
150+
Then, we might take a closer look at the contents of the SparseZoo model:
151+
```python
152+
model_card = model.model_card
153+
print(model_card)
154+
155+
>> File(name=model.md)
156+
```
157+
```python
158+
model_card_path = model.model_card.path
159+
print(model_card_path)
160+
161+
>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md
162+
```
163+
164+
165+
### Model, Directory, and File
166+
167+
In general, every file in the SparseZoo model shares a set of attributes: `name`, `path`, `url`, and `parent_directory`:
168+
- `name` serves as an identifier of the file/directory
169+
- `path` points to the location of the file/directory
170+
- `url` specifies the server address of the file/directory in question
171+
- `parent_directory` points to the location of the parent directory of the file/directory in question
172+
173+
A directory is a unique type of file that contains other files. For that reason, it has an additional `files` attribute.
174+
175+
```python
176+
print(model.onnx_model)
177+
178+
>> File(name=model.onnx)
179+
180+
print(f"File name: {model.onnx_model.name}\n"
181+
f"File path: {model.onnx_model.path}\n"
182+
f"File URL: {model.onnx_model.url}\n"
183+
f"Parent directory: {model.onnx_model.parent_directory}")
184+
185+
>> File name: model.onnx
186+
>> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
187+
>> File URL: https://models.neuralmagic.com/cv-classification/...
188+
>> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
189+
```
190+
191+
```python
192+
print(model.recipes)
193+
194+
>> Directory(name=recipe)
195+
196+
print(f"File name: {model.recipes.name}\n"
197+
f"Contains: {[file.name for file in model.recipes.files]}\n"
198+
f"File path: {model.recipes.path}\n"
199+
f"File URL: {model.recipes.url}\n"
200+
f"Parent directory: {model.recipes.parent_directory}")
201+
202+
>> File name: recipe
203+
>> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
204+
>> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
205+
>> File URL: None
206+
>> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
207+
```
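The relationship between files and directories described in this section can be sketched with two small dataclasses. These are illustrative stand-ins that only mirror the attribute names used in the examples (`name`, `path`, `url`, `parent_directory`, `files`); the actual SparseZoo classes are not shown in this document and may be structured differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    """Illustrative stand-in for a file entry; not the SparseZoo class."""

    name: str
    path: Optional[str] = None
    url: Optional[str] = None
    parent_directory: Optional[str] = None


@dataclass
class Directory(File):
    """A directory is a file that additionally holds other files."""

    files: List[File] = field(default_factory=list)


recipe_dir = Directory(
    name="recipe",
    files=[
        File(name="recipe_original.md"),
        File(name="recipe_transfer-classification.md"),
    ],
)
print([f.name for f in recipe_dir.files])
print(isinstance(recipe_dir, File))
```

Modelling a directory as a subclass of a file keeps the shared attributes in one place, so code that only needs `name` or `path` can treat both uniformly.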
208+
209+
### Selecting Checkpoint-Specific Data
210+
211+
A SparseZoo model may contain several checkpoints. For example, one checkpoint may have been saved before the model was quantized; that checkpoint would be used for transfer learning. Another may have been saved after the quantization step; that one is usually used directly for inference.
212+
213+
The recipes may also vary depending on the use case. We may want to access a recipe that was used to sparsify the dense model (`recipe_original`) or the one that enables us to sparse transfer learn from the already sparsified model (`recipe_transfer`).
214+
215+
There are two ways to access those specific files.
216+
217+
#### Accessing Recipes (Through Python API)
218+
```python
219+
available_recipes = model.recipes.available
220+
print(available_recipes)
221+
222+
>> ['original', 'transfer-classification']
223+
224+
transfer_recipe = model.recipes["transfer-classification"]
225+
print(transfer_recipe)
226+
227+
>> File(name=recipe_transfer-classification.md)
228+
229+
original_recipe = model.recipes.default # recipe defaults to `original`
230+
original_recipe_path = original_recipe.path # downloads the recipe and returns its path
231+
print(original_recipe_path)
232+
233+
>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md
234+
```
235+
236+
#### Accessing Checkpoints (Through Python API)
237+
In general, a model is expected to include the following checkpoints:
238+
239+
- `checkpoint_prepruning`
240+
- `checkpoint_postpruning`
241+
- `checkpoint_preqat`
242+
- `checkpoint_postqat`
243+
244+
The checkpoint that the model defaults to is the `preqat` state (just before the quantization step).
245+
246+
```python
247+
from sparsezoo import Model
248+
249+
stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"
86250

251+
model = Model(stub)
252+
available_checkpoints = model.training.available
253+
print(available_checkpoints)
254+
255+
>> ['preqat']
256+
257+
preqat_checkpoint = model.training.default # checkpoint defaults to `preqat`
258+
preqat_checkpoint_path = preqat_checkpoint.path # downloads the checkpoint and returns its path
259+
print(preqat_checkpoint_path)
260+
261+
>> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training
262+
263+
[print(file.name) for file in preqat_checkpoint.files]
264+
265+
>> vocab.txt
266+
>> special_tokens_map.json
267+
>> pytorch_model.bin
268+
>> config.json
269+
>> training_args.bin
270+
>> tokenizer_config.json
271+
>> trainer_state.json
272+
>> tokenizer.json
273+
```
274+
275+
276+
#### Accessing Recipes (Through Stub String Arguments)
277+
278+
You can also directly request a specific recipe/checkpoint type by appending the appropriate URL query arguments to the stub:
87279
```python
88280
from sparsezoo import Model
89281

90-
# copied from https://sparsezoo.neuralmagic.com/
91-
stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned90_quant-none"
282+
stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"
283+
92284
model = Model(stub)
93-
print(model)
285+
286+
# Inspect which files are present.
287+
# Note that the available recipes are restricted
288+
# according to the specified URL query arguments
289+
print(model.recipes.available)
290+
291+
>> ['transfer-classification']
292+
293+
transfer_recipe = model.recipes.default # Now the recipes default to the one selected by the stub string arguments
294+
print(transfer_recipe)
295+
296+
>> File(name=recipe_transfer-classification.md)
297+
```
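To make the structure of a stub with query arguments concrete, the sketch below splits one into its path segments and its arguments using only the standard library. The `split_stub` helper is hypothetical and is not part of the SparseZoo API; how the library parses stubs internally is not shown here.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qs


def split_stub(stub: str) -> Tuple[List[str], Dict[str, str]]:
    """Split a stub into path segments and query arguments (illustrative only)."""
    prefix = "zoo:"
    if not stub.startswith(prefix):
        raise ValueError(f"expected a stub starting with {prefix!r}: {stub}")
    path, _, query = stub[len(prefix):].partition("?")
    # parse_qs returns a list per key; keep the first value for simplicity
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return path.split("/"), params


segments, params = split_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/"
    "pruned95_quant-none?recipe=transfer"
)
print(segments[:2])
print(params)
```

A stub without a `?` yields an empty argument dictionary, which corresponds to the case where no specific recipe or checkpoint is requested.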
298+
299+
### Accessing Sample Data
300+
301+
The user may easily request a sample batch of data that represents the inputs and outputs of the model.
302+
303+
```python
304+
sample_data = model.sample_batch(batch_size=10)
305+
306+
print(sample_data['sample_inputs'][0].shape)
307+
>> (10, 3, 224, 224) # (batch_size, num_channels, image_dim, image_dim)
308+
309+
print(sample_data['sample_outputs'][0].shape)
310+
>> (10, 1000) # (batch_size, num_classes)
94311
```
95312

96-
#### Searching the Zoo
313+
### Model Search
314+
The function `search_models` enables the user to quickly filter the contents of the SparseZoo repository to find the stubs of interest:
97315

98316
```python
99317
from sparsezoo import search_models
100318

101-
models = search_models(
102-
domain="cv",
103-
sub_domain="classification",
104-
return_stubs=True,
105-
)
106-
print(models)
319+
args = {
320+
"domain": "cv",
321+
"sub_domain": "segmentation",
322+
"architecture": "yolact",
323+
}
324+
325+
models = search_models(**args)
326+
[print(model) for model in models]
327+
328+
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
329+
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
330+
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)
107331
```
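The filtering idea can be illustrated locally on a handful of stubs taken from the examples in this document. This is a hypothetical stand-in, not how `search_models` works: it assumes that the first three stub segments are the domain, the sub-domain, and an architecture name that may carry a suffix such as `-darknet53`.

```python
from typing import Iterable, List, Optional

# Stubs copied from the examples in this document.
STUBS = [
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none",
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none",
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none",
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none",
]


def filter_stubs(
    stubs: Iterable[str],
    domain: Optional[str] = None,
    sub_domain: Optional[str] = None,
    architecture: Optional[str] = None,
) -> List[str]:
    """Hypothetical local filter over stub strings (segment layout is assumed)."""
    matches = []
    for stub in stubs:
        segments = stub[len("zoo:"):].split("/")
        if domain is not None and segments[0] != domain:
            continue
        if sub_domain is not None and segments[1] != sub_domain:
            continue
        # assumption: the architecture segment starts with the architecture name
        if architecture is not None and not segments[2].startswith(architecture):
            continue
        matches.append(stub)
    return matches


found = filter_stubs(STUBS, domain="cv", sub_domain="segmentation", architecture="yolact")
print(len(found))
```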
108332

109333
### Environmental Variables
110334

111335
Users can specify the directories where models are saved (temporarily, during download) and where the required credentials are stored on the working machine.
112-
`SPARSEZOO_MODELS_PATH` is the path where the downloaded models will be saved temporarily. Default `~/.cache/sparsezoo/`
113-
`SPARSEZOO_CREDENTIALS_PATH` is the path where `credentials.yaml` will be saved. Default `~/.cache/sparsezoo/`
336+
`SPARSEZOO_MODELS_PATH` is the path where the downloaded models will be saved temporarily. Default `~/.cache/sparsezoo/`
337+
`SPARSEZOO_CREDENTIALS_PATH` is the path where `credentials.yaml` will be saved. Default `~/.cache/sparsezoo/`
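As a rough sketch of how such a variable with a default could be resolved, the snippet below reads an environment mapping and falls back to `~/.cache/sparsezoo/`. The `resolve_dir` helper is hypothetical; how SparseZoo itself reads these variables is not shown in this document. The example passes explicit dictionaries instead of the real environment so that it has no side effects.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_CACHE = "~/.cache/sparsezoo/"  # default stated in this README


def resolve_dir(variable: str, environ: Mapping[str, str] = os.environ) -> Path:
    """Return the directory named by `variable`, or the documented default."""
    return Path(environ.get(variable, DEFAULT_CACHE)).expanduser()


# Variable set explicitly: its value is used.
models_dir = resolve_dir(
    "SPARSEZOO_MODELS_PATH", {"SPARSEZOO_MODELS_PATH": "/tmp/zoo_models"}
)
# Variable not set: the default cache directory is used.
default_dir = resolve_dir("SPARSEZOO_CREDENTIALS_PATH", {})
print(models_dir)
print(default_dir)
```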
114338

115339
### Console Scripts
116340

117341
In addition to the Python APIs, a console script entry point is installed with the package `sparsezoo`.
118-
This enables easy interaction straight from your console/terminal.
342+
This enables easy interaction straight from your console/terminal.
119343

120344
#### Downloading
121345

@@ -125,15 +349,13 @@ Download command help
125349
sparsezoo.download -h
126350
```
127351

128-
<br></br>
129-
Download ResNet-50 Model
352+
<br/>Download ResNet-50 Model
130353

131354
```shell script
132355
sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none
133356
```
134357

135-
<br></br>
136-
Download pruned and quantized ResNet-50 Model
358+
<br/>Download pruned and quantized ResNet-50 Model
137359

138360
```shell script
139361
sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned_quant-moderate
@@ -147,15 +369,13 @@ Search command help
147369
sparsezoo search -h
148370
```
149371

150-
<br></br>
151-
Searching for all classification MobileNetV1 models in the computer vision domain
372+
<br/>Searching for all classification MobileNetV1 models in the computer vision domain
152373

153374
```shell script
154375
sparsezoo search --domain cv --sub-domain classification --architecture mobilenet_v1
155376
```
156377

157-
<br></br>
158-
Searching for all ResNet-50 models
378+
<br/>Searching for all ResNet-50 models
159379

160380
```shell script
161381
sparsezoo search --domain cv --sub-domain classification \
