
Commit 4d7ba00

Authored by robertgshaw2-redhat, markurtz, and jeanniefinks
Docs 1.0 refactor rs (#89)
* RS Changes. Note to Mark, I called out all of the spots for your review with
* trying to repost
* Updated changes post-discussion with Mark
* Update src/content/get-started/deploy-a-model/cv-object-detection.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/cv-object-detection.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/cv-object-detection.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/sparsify-a-model/custom-integrations.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/products/sparseml.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/use-cases/deploying-deepsparse/deepsparse-server.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/transfer-a-sparsified-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/transfer-a-sparsified-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/transfer-a-sparsified-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/transfer-a-sparsified-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/transfer-a-sparsified-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/products/sparseml.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/products/sparseml.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/products/sparseml.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/cv-object-detection.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/deploy-a-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/try-a-model/cv-object-detection.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/get-started/try-a-model/nlp-text-classification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)
* Update src/content/user-guide/sparsification.mdx (Co-authored-by: Jeannie Finks <[email protected]>)

Co-authored-by: Mark Kurtz <[email protected]>
Co-authored-by: Jeannie Finks <[email protected]>
1 parent bbec735 · commit 4d7ba00

36 files changed: +898 additions, -527 deletions

package-lock.json

Lines changed: 74 additions & 6 deletions
Some generated files are not rendered by default.

src/content/get-started/deploy-a-model/cv-object-detection.mdx

Lines changed: 13 additions & 9 deletions
@@ -8,15 +8,19 @@ index: 2000
 
 # Deploy an Object Detection Model
 
-The DeepSparse Server wraps pipelines, including the object detection pipeline.
-Therefore, the server supports images and image files as inputs and outputs the labeled predictions without extra effort.
-With all of this built on top of the DeepSparse Engine, the simplicity of servable pipelines is combined with GPU class performance on CPUs for sparse models.
+This page walks through an example of deploying an object detection model with DeepSparse Server.
+
+The DeepSparse Server is a server wrapper around `Pipelines`, including the object detection pipeline. As such,
+the server provides an HTTP interface that accepts images and image files as inputs and outputs the labeled predictions.
+With all of this built on top of the DeepSparse Engine, the simplicity of servable pipelines is combined with GPU-class performance on CPUs for sparse models.
 
 ## Start the Server
 
-Before starting the server, the model must be set up in the format expected for DeepSparse Pipelines.
-The expectations for this are found in the [Test a Model](../../test-a-model) section.
-With that expectation set, the **deepsparse.server** command can be used with either a local model or a SparseZoo stub.
+Before starting the server, the model must be set up in the format expected for DeepSparse `Pipelines`.
+See an example of how to set up `Pipelines` in the [Try a Model](../../try-a-model) section.
+
+Once the `Pipelines` are set up, the `deepsparse.server` command launches a server with the model specified by `--model_path`. The `model_path` can either
+be a SparseZoo stub or a path to a local `model.onnx` file.
 
 The command below shows how to start up the DeepSparse Server for a sparsified YOLOv5l model trained on the COCO dataset from the SparseZoo.
 The output confirms the server was started on port `:5543` with a `/docs` route for general info and a `/predict/from_files` route for inference.
@@ -39,22 +43,22 @@ $ deepsparse.server \
 
 ## View the Request Specs
 
-As noted in the startup command, a **/docs route** was created; it contains OpenAPI specs and definitions for the expected inputs and responses.
+As noted in the startup command, a `/docs` route was created; it contains OpenAPI specs and definitions for the expected inputs and responses.
 Visiting the `http://localhost:5543/docs` in a browser shows the available routes on the server.
 The important one for object detection is the `/predict/from_files` POST route which takes the form of a standard files argument.
 The files argument enables uploading one or more image files for object detection processing.
 
 ## Make a Request
 
 With the expected input payload and method type defined, any HTTP request package can be used to make the request.
-The code below shows how to request the same instance the server was started.
+
 First, a CURL request is made to download a sample image for use with the sample request.
 
 ```bash
 wget -O basilica.jpg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/src/deepsparse/yolo/sample_images/basilica.jpg
 ```
 
-Next, for simplicity and generality, the Python requests package is used to make a POST method request to the /predict/from_files pathway on localhost:5543 with the downloaded file.
+Next, for simplicity and generality, the Python requests package is used to make a POST method request to the `/predict/from_files` pathway on `localhost:5543` with the downloaded file.
 The predicted outputs can then be printed out or used in a later pipeline.
 
 ```python
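
The hunk above is truncated at the opening of the Python block. For context, a minimal sketch of the request the new text describes might look like the following (assumptions: the server from this page is running on `localhost:5543`, and the multipart field name `request` matches the `/predict/from_files` spec shown at `/docs`):

```python
import requests  # third-party HTTP client: pip install requests

# POST the downloaded sample image to the object detection route.
url = "http://localhost:5543/predict/from_files"
with open("basilica.jpg", "rb") as img:
    # The multipart field name "request" is an assumption; confirm it
    # against the OpenAPI spec served at /docs.
    response = requests.post(url, files=[("request", img)])

print(response.text)  # labeled predictions returned by the server
```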

src/content/get-started/deploy-a-model/nlp-text-classification.mdx

Lines changed: 16 additions & 12 deletions
@@ -8,17 +8,21 @@ index: 1000
 
 # Deploy a Text Classification Model
 
-The DeepSparse Server wraps pipelines, including the sentiment analysis pipeline.
-Therefore, the server supports raw text sequences as inputs and outputs the labeled predictions without extra effort.
-With all of this built on top of the DeepSparse Engine, the simplicity of servable pipelines is combined with GPU class performance on CPUs for sparse models.
+This page walks through an example of deploying a text-classification model with DeepSparse Server.
+
+The DeepSparse Server is a server wrapper around `Pipelines`, including the sentiment analysis pipeline. As such,
+the server provides an HTTP interface that accepts raw text sequences as inputs and responds with the labeled predictions.
+With all of this built on top of the DeepSparse Engine, the simplicity of servable pipelines is combined with GPU-class performance on CPUs for sparse models.
 
 ## Start the Server
 
-Before starting the server, the model must be set up in the format expected for DeepSparse Pipelines.
-The expectations for this are found in the [Test a Model](../../test-a-model) section.
-With that expectation set, the **deepsparse.server** command can be used with either a local model or a SparseZoo stub.
+Before starting the server, the model must be set up in the format expected for DeepSparse `Pipelines`.
+See an example of how to set up `Pipelines` in the [Try a Model](../../try-a-model) section.
+
+Once the `Pipelines` are set up, the `deepsparse.server` command launches a server with the model specified by `--model_path`. The `model_path` can either
+be a SparseZoo stub or a local model path.
 
-The command below shows how to start up the DeepSparse Server for a sparsified DistilBERT model trained on the SST-2 dataset for sentiment analysis from the SparseZoo.
+The command below starts up the DeepSparse Server for a sparsified DistilBERT model (from the SparseZoo) trained on the SST2 dataset for sentiment analysis.
 The output confirms the server was started on port `:5543` with a `/docs` route for general info and a `/predict` route for inference.
 
 ```bash
@@ -39,9 +43,9 @@ $ deepsparse.server \
 
 ## View the Request Specs
 
-As noted in the startup command, a **/docs route** was created; it contains OpenAPI specs and definitions for the expected inputs and responses.
+As noted in the startup command, a `/docs` route was created; it contains OpenAPI specs and definitions for the expected inputs and responses.
 Visiting the `http://localhost:5543/docs` in a browser shows the available routes on the server.
-For the /predict route specifically, it shows the following as the expected input schema:
+For the `/predict` route specifically, it shows the following as the expected input schema:
 
 ```text
 TextClassificationInput{
@@ -68,9 +72,9 @@ Utilizing the request spec, a valid input for the sentiment analysis would be:
 ## Make a Request
 
 With the expected input payload and method type defined, any HTTP request package can be used to make the request.
-For simplicity and generality, the curl command is used.
-The command below shows how to request the same instance the server was started.
-Specifically, it makes a POST method request to the /predict pathway on localhost:5543 with the JSON payload created above.
+For simplicity and generality, the `curl` command is used.
+
+The code below makes a POST method request to the `/predict` pathway on `localhost:5543` with the JSON payload created above.
 The predicted outputs from the model are then printed in the terminal.
 
 ```bash
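
The hunk is truncated at the opening of the `curl` block. For readers who prefer Python, a hedged equivalent with the `requests` package is sketched below (assumptions: the server is running on `localhost:5543`, the payload shape follows the `TextClassificationInput` schema shown above, and the input sentence is an arbitrary example):

```python
import requests  # pip install requests

# POST a JSON payload to the /predict route of the running server.
url = "http://localhost:5543/predict"
# "sequences" follows the TextClassificationInput schema shown above;
# treat the exact payload shape as an assumption and confirm it at /docs.
payload = {"sequences": "The new docs made setup much easier!"}

response = requests.post(url, json=payload)
print(response.text)  # labeled sentiment prediction
```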

src/content/get-started/install/deepsparse.mdx

Lines changed: 3 additions & 3 deletions
@@ -8,10 +8,10 @@ index: 1000
 
 # DeepSparse Installation
 
-The [DeepSparse Engine](../../products/deepsparse) enables GPU-class performance on CPUs for neural network deployments exported to the [ONNX model format](https://onnx.ai/).
-It leverages sparsity within models to reduce compute and the unique cache hierarchy on CPUs to reduce memory movement.
+The [DeepSparse Engine](../../products/deepsparse) enables GPU-class performance on CPUs, leveraging sparsity within models to reduce FLOPs and the unique cache hierarchy on CPUs to reduce memory movement.
+The engine accepts models in the open-source [ONNX format](https://onnx.ai/), which are easily created from PyTorch and TensorFlow models.
 
-Currently, DeepSparse is tested on Python 3.6-3.9, ONNX 1.5.0-1.10.1, ONNX opset version 11+ and is [manylinux compliant](https://peps.python.org/pep-0513/).
+Currently, DeepSparse is tested on Python 3.7-3.9, ONNX 1.5.0-1.10.1, ONNX opset version 11+ and is [manylinux compliant](https://peps.python.org/pep-0513/).
 It is limited to [Linux systems](https://www.linux.org/) running on [X86 CPU architectures](https://en.wikipedia.org/wiki/X86).
 
 ## General Install
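
The updated text notes that ONNX files are easily created from PyTorch models. As an illustrative sketch (not part of this commit; the two-layer model is a stand-in), an export using opset 11 to match the tested range above might look like:

```python
import torch

# Stand-in model; any torch.nn.Module with a fixed input shape works here.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU())
model.eval()

dummy_input = torch.randn(1, 128)  # example input used to trace the graph
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    opset_version=11,  # within the "opset version 11+" range tested above
)
```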

src/content/get-started/install/sparseml.mdx

Lines changed: 3 additions & 4 deletions
@@ -8,10 +8,9 @@ index: 2000
 
 # SparseML Installation
 
-[SparseML](/products/sparseml) leverages [recipes](/user-guide/what-are-recipes) to allow model [sparsification](/user-guide/what-is-sparsification) with only a few lines of code in most pipelines.
-It supports applying state-of-the-art sparsification algorithms such as pruning and quantization to any neural network.
+[SparseML](/products/sparseml) enables you to create sparse models trained on your data. It supports transfer learning from sparse models to new data and sparsifying dense models from scratch with state-of-the-art algorithms for pruning and quantization.
 
-Currently, SparseML is tested on Python 3.6-3.9 and is limited to [Linux](https://www.linux.org/) and [MacOS](https://www.apple.com/mac/) systems.
+Currently, SparseML is tested on Python 3.7-3.9 and is limited to [Linux](https://www.linux.org/) and [MacOS](https://www.apple.com/mac/) systems.
 
 ## General Install
 
@@ -31,7 +30,7 @@ To install, use the following extra option:
 pip install sparseml[torch]
 ```
 
-To install torchvision as well, use the following extra options:
+To install torchvision, use the following extra options:
 
 ```bash
 pip install sparseml[torch,torchvision]

src/content/get-started/install/sparsezoo.mdx

Lines changed: 3 additions & 2 deletions
@@ -10,9 +10,10 @@ index: 3000
 
 The [SparseZoo](/products/sparsezoo) stores presparsified models and sparsification recipes so you can easily apply them to your data.
 This installs the Python API and CLIs for downloading models and recipes from the [SparseZoo UI](https://sparsezoo.neuralmagic.com/).
-Note, that the SparseZoo package is automatically installed with both SparseML and DeepSparse.
 
-Currently, the SparseZoo Python APIs and CLIs are tested on Python 3.6-3.9 and are limited to [Linux](https://www.linux.org/) and [MacOS](https://www.apple.com/mac/) systems.
+Note that the SparseZoo package is automatically installed with both SparseML and DeepSparse.
+
+Currently, the SparseZoo Python APIs and CLIs are tested on Python 3.7-3.9 and are limited to [Linux](https://www.linux.org/) and [MacOS](https://www.apple.com/mac/) systems.
 
 ## General Install
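
To illustrate the Python API mentioned above, a heavily hedged sketch of downloading a model follows (assumptions: the `Model` class from the 1.0-era `sparsezoo` package and a stub copied from the SparseZoo UI; verify both against the SparseZoo docs):

```python
from sparsezoo import Model  # assumed 1.0 API; verify against the docs

# Placeholder stub; copy a real one from https://sparsezoo.neuralmagic.com/
stub = "zoo:<model-stub-copied-from-the-sparsezoo-ui>"

model = Model(stub)
print(model.path)  # accessing .path downloads the files locally (assumption)
```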

src/content/get-started/sparsify-a-model.mdx

Lines changed: 4 additions & 4 deletions
@@ -8,10 +8,10 @@ index: 4000
 
 # Sparsify a Model
 
-SparseML contains many state-of-the-art, advanced sparsification algorithms, including pruning, distillation, and quantization techniques.
-These algorithms are built on top of sparsification recipes enabling easy integration into ML pipelines to sparsify most neural networks.
-In addition to integrating into custom pipelines, it contains integrations with many popular ML repositories.
-With these integrations, creating a recipe is all needed to sparsify any model the repos contain.
+SparseML enables you to create a sparse model from scratch. The library contains state-of-the-art sparsification algorithms, including pruning, distillation, and quantization techniques.
+
+These algorithms are built on top of sparsification recipes, enabling easy integration into custom ML training pipelines to sparsify most neural networks.
+Additionally, SparseML integrates with popular ML repositories like HuggingFace Transformers and Ultralytics YOLO. With these integrations, creating a recipe and passing it to a CLI is all you need to sparsify a model.
 
 Aside from sparsification algorithms, SparseML contains generic export pathways for performant deployments.
 These export pathways ensure the model saves in the correct format and rewrites the inference graphs for performance, such as quantized operator folding.
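
To make the recipe-driven workflow concrete, a hedged sketch of applying a recipe inside a custom PyTorch training loop follows (assumptions: a local `recipe.yaml` exists and the `ScheduledModifierManager` API from SparseML's PyTorch integration is used; the model, optimizer, and step count are placeholders):

```python
import torch
from sparseml.pytorch.optim import ScheduledModifierManager

# Placeholder model and optimizer standing in for a real training setup.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU())
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Load the sparsification recipe and wrap the optimizer so pruning and
# quantization modifiers fire on the schedule the recipe defines.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=100)

# ... run the usual training loop with the wrapped optimizer ...

manager.finalize(model)  # clean up modifier hooks after training
```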
