
Commit 8063962

Authored by Beth-Kosis, jeanniefinks, and robertgshaw2-redhat

B kosis use cases (#133)

* Update question-answering.mdx
* Update question-answering.mdx
* Update question-answering.mdx (Line 29: What command?)
* Update text-classification.mdx
* Update token-classification.mdx
* Update deploying.mdx (Line 307: What is the "appropriate" documentation?)
* Update sparsifying.mdx (Line 12: What is the convention for "torch and torchvision"? Should this be "PyTorch and torchvision"?)
* Update deploying.mdx
* Update sparsifying.mdx
* Update deploying.mdx (Line 177: What is "appropriate documentation?")
* Update deepsparse-server.mdx (Line 143: Determine consistent use of graphic icons. Line 143: Do we want to use the word "awesome"? Line 149: What is this image?)
* Update aws-sagemaker.mdx (Line 207: The parenthetical statement needs to have a link or be rewritten to reference something specific.)
* Update docker.mdx
* Update sparsifying.mdx
* Update deepsparse-server.mdx
* Update question-answering.mdx
* Update text-classification.mdx
* Update token-classification.mdx
* Update sparsifying.mdx
* Update sparsifying.mdx
* Update src/content/use-cases/object-detection/sparsifying.mdx

Co-authored-by: Robert Shaw <[email protected]>
Co-authored-by: Jeannie Finks <[email protected]>

1 parent 813ef4a, commit 8063962

11 files changed: +201 additions, -221 deletions


src/content/use-cases/deploying-deepsparse/aws-sagemaker.mdx

Lines changed: 41 additions & 45 deletions
````diff
@@ -15,14 +15,14 @@ Deployments benefit from both sparse-CPU acceleration with
 DeepSparse and automatic scaling from SageMaker.
 
 ## Installation Requirements
-The listed steps can be easily completed using a `python` and `bash`. The following
+The listed steps can be easily completed using `python` and `bash`. The following
 credentials, tools, and libraries are also required:
-* The [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) version 2.X that is [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html). Double check if the `region` that is configured in your AWS CLI matches the region in the SparseMaker class found in the `endpoint.py` file. Currently, the default region being used is `us-east-1`.
+* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) version 2.X that is [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html). Double-check if the `region` that is configured in your AWS CLI matches the region in the SparseMaker class found in the `endpoint.py` file. Currently, the default region being used is `us-east-1`.
 * The [ARN](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) of your AWS role requires access to full SageMaker permissions.
-  * `AmazonSageMakerFullAccess`
-  * In the following steps, we will refer to this as `ROLE_ARN`. It should take the form `"arn:aws:iam::XXX:role/service-role/XXX"`. In addition to role permissions, make sure the AWS user who configured the AWS CLI configuration has ECR/SageMaker permissions.
-* [Docker and the `docker` cli](https://docs.docker.com/get-docker/).
-* The `boto3` python AWS sdk (`pip install boto3`).
+  * `AmazonSageMakerFullAccess`
+  * In the following steps, we will refer to this as `ROLE_ARN`. It should take the form `"arn:aws:iam::XXX:role/service-role/XXX"`. In addition to role permissions, make sure the AWS user who configured the AWS CLI configuration has ECR/SageMaker permissions.
+* [Docker and the `docker` CLI](https://docs.docker.com/get-docker/).
+* The `boto3` Python AWS SDK (`pip install boto3`).
 
 ### Quick Start
 
````

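As a quick way to confirm these prerequisites before starting, a sketch like the following can be run from a shell; these are standard AWS/Docker/pip invocations, not commands from the original guide:

```bash
# Sanity checks for the requirements listed above.
aws --version                 # expect aws-cli/2.x
aws configure get region      # should match the SparseMaker region (us-east-1 by default)
docker --version              # Docker and the docker CLI must be available
pip install boto3             # Python AWS SDK used throughout this guide
```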
````diff
@@ -40,7 +40,7 @@ Run the following command to build your SageMaker endpoint.
 python endpoint.py create
 ```
 
-After the endpoint has been staged (~1 minute), you can start making requests by passing your endpoint `region name` and your `endpoint name`. Afterwards you can run inference by passing in your question and context:
+After the endpoint has been staged (~1 minute), you can start making requests by passing your endpoint `region name` and your `endpoint name`. Afterwards, you can run inference by passing in your question and context:
 
 
 ```python
````
````diff
@@ -53,9 +53,9 @@ answer = qa.predict(question="who is batman?", context="Mark is batman.")
 print(answer)
 ```
 
-answer: `b'{"score":0.6484262943267822,"answer":"Mark","start":0,"end":4}'`
+The answer is: `b'{"score":0.6484262943267822,"answer":"Mark","start":0,"end":4}'`
 
-If you want to delete your endpoint, please use:
+If you want to delete your endpoint, use:
 
 ```bash
 python endpoint.py destroy
````
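The client code surrounding `qa.predict(...)` is elided by these hunks. Based on the visible call and the boto3 `sagemaker-runtime` API, a minimal client in the spirit of `qa_client.py` could be sketched as follows; the class name, endpoint name, and payload keys are assumptions, not the repository's actual code:

```python
import json
import boto3

class QAClient:
    """Hypothetical stand-in for the client object in qa_client.py."""

    def __init__(self, region_name: str, endpoint_name: str):
        self.endpoint_name = endpoint_name
        # sagemaker-runtime is the boto3 client that invokes SageMaker endpoints
        self.runtime = boto3.client("sagemaker-runtime", region_name=region_name)

    def predict(self, question: str, context: str) -> bytes:
        response = self.runtime.invoke_endpoint(
            EndpointName=self.endpoint_name,
            Body=json.dumps({"question": question, "context": context}),
            ContentType="application/json",
        )
        return response["Body"].read()

qa = QAClient(region_name="us-east-1", endpoint_name="question-answering-example-endpoint")
print(qa.predict(question="who is batman?", context="Mark is batman."))
```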
````diff
@@ -65,68 +65,66 @@ Continue reading to learn more about the files in this directory, the build requ
 
 ## Contents
 In addition to the step-by-step instructions below, the directory contains
-additional files to aid in the deployment.
+files to aid in the deployment.
 
 ### Dockerfile
 The included `Dockerfile` builds an image on top of the standard `python:3.8` image
-with `deepsparse` installed and creates an executable command `serve` that runs
+with `deepsparse` installed, and creates an executable command `serve` that runs
 `deepsparse.server` on port 8080. SageMaker will execute this image by running
 `docker run serve` and expects the image to serve inference requests at the
 `invocations/` endpoint.
 
 For general customization of the server, changes should not need to be made
-to the Dockerfile, but to the `config.yaml` file that the Dockerfile reads from
-instead.
+to the Dockerfile but, instead, to the `config.yaml` file from which the Dockerfile reads.
 
 ### config.yaml
 `config.yaml` is used to configure the DeepSparse server running in the Dockerfile.
-The config must contain the line `integration: sagemaker` so
+The configuration must contain the line `integration: sagemaker` so
 endpoints may be provisioned correctly to match SageMaker specifications.
 
 Notice that the `model_path` and `task` are set to run a sparse-quantized
-question-answering model from [SparseZoo](https://sparsezoo.neuralmagic.com/).
+question answering model from [SparseZoo](https://sparsezoo.neuralmagic.com/).
 To use a model directory stored in `s3`, set `model_path` to `/opt/ml/model` in
-the config and add `ModelDataUrl=<MODEL-S3-PATH>` to the `CreateModel` arguments.
+the configuration and add `ModelDataUrl=<MODEL-S3-PATH>` to the `CreateModel` arguments.
 SageMaker will automatically copy the files from the s3 path into `/opt/ml/model`
-which the server can then read from.
+from which the server can then read.
 
 ### push_image.sh
 
-Bash script for pushing your local Docker image to the AWS ECR repository.
+This is a `Bash` script for pushing your local Docker image to the AWS ECR repository.
 
 ### endpoint.py
 
-Contains the SparseMaker object for automating the build of a SageMaker endpoint from a Docker Image. You have the option to customize the parameters of the class in order to match the prefered state of your deployment.
+This file contains the SparseMaker object for automating the build of a SageMaker endpoint from a Docker image. You can customize the parameters of the class to match the preferred state of your deployment.
 
 ### qa_client.py
 
-Contains a client object for making requests to the SageMaker inference endpoint for the question answering task.
-____
-More information on the DeepSparse server and its configuration can be found
-[here](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server#readme).
+This file contains a client object for making requests to the SageMaker inference endpoint for the question answering task.
+
+Review [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server#readme) for more information about the server and its configuration.
 
 ## Deploying to SageMaker
 The following steps are required to provision and deploy DeepSparse to SageMaker
 for inference:
-* Build the DeepSparse-SageMaker `Dockerfile` into a local docker image
-* Create an [Amazon ECR](https://aws.amazon.com/ecr/) repository to host the image
-* Push the image to the ECR repository
-* Create a SageMaker `Model` that reads from the hosted ECR image
-* Build a SageMaker `EndpointConfig` that defines how to provision the model deployment
-* Launch the SageMaker `Endpoint` defined by the `Model` and `EndpointConfig`
+* Build the DeepSparse-SageMaker `Dockerfile` into a local Docker image.
+* Create an [Amazon ECR](https://aws.amazon.com/ecr/) repository to host the image.
+* Push the image to the ECR repository.
+* Create a SageMaker `Model` that reads from the hosted ECR image.
+* Build a SageMaker `EndpointConfig` that defines how to provision the model deployment.
+* Launch the SageMaker `Endpoint` defined by the `Model` and `EndpointConfig`.
 
 ### Building the DeepSparse-SageMaker Image Locally
-The `Dockerfile` can be build from this directory from a bash shell using the following command.
+Build the `Dockerfile` from this directory from a bash shell using the following command.
 The image will be tagged locally as `deepsparse-sagemaker-example`.
 
 ```bash
 docker build -t deepsparse-sagemaker-example .
 ```
 
 ### Creating an ECR Repository
-The following code snippet can be used in Python to create an ECR repository.
+Use the following code snippet in Python to create an ECR repository.
 The `region_name` can be swapped to a preferred region. The repository will be named
-`deepsparse-sagemaker`. If the repository is already created, this step may be skipped.
+`deepsparse-sagemaker`. If the repository is already created, you may skip this step.
 
 ```python
 import boto3
````
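As context for the `### config.yaml` section in the hunk above, a configuration of the kind it describes might look like the sketch below. The field names follow the prose (`integration`, `task`, `model_path`); verify the exact schema against the DeepSparse Server README, and note the SparseZoo stub is borrowed from the deepsparse-server.mdx diff later in this commit:

```yaml
# Hedged sketch only; confirm field names against the DeepSparse Server README.
integration: sagemaker   # required so endpoints match SageMaker specifications
task: question_answering
model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
# To serve a model directory staged from s3 instead, per the prose above:
# model_path: /opt/ml/model
```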
````diff
@@ -136,7 +134,7 @@ create_repository_res = ecr.create_repository(repositoryName="deepsparse-sagemak
 ```
 
 ### Pushing the Local Image to the ECR Repository
-Once the image is built and the ECR repository is created, the image can be pushed using the following
+Once the image is built and the ECR repository is created, you can push the image using the following
 bash commands.
 
 ```bash
````
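The push commands themselves fall between this hunk and the next, so they are not visible here. For orientation, the standard ECR push sequence looks roughly like the following; the account ID is a placeholder, and the repository's authoritative commands live in `push_image.sh`:

```bash
# Hedged sketch of a standard ECR push; see push_image.sh for the real steps.
account=XXX        # your AWS account ID
region=us-east-1
aws ecr get-login-password --region $region \
  | docker login --username AWS --password-stdin $account.dkr.ecr.$region.amazonaws.com
docker tag deepsparse-sagemaker-example:latest \
  $account.dkr.ecr.$region.amazonaws.com/deepsparse-sagemaker:latest
docker push $account.dkr.ecr.$region.amazonaws.com/deepsparse-sagemaker:latest
```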
````diff
@@ -172,7 +170,7 @@ latest: digest: sha256:XXX size: 3884
 ```
 
 ### Creating a SageMaker Model
-A SageMaker `Model` can now be created referencing the pushed image.
+Create a SageMaker `Model` referencing the pushed image.
 The example model will be named `question-answering-example`.
 As mentioned in the requirements, `ROLE_ARN` should be a string arn of an AWS
 role with full access to SageMaker.
````
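The `create_model` call is elided between this hunk and the next. Pieced together from the visible fragment (`create_model_res = sm_boto3.create_model(`) and boto3's documented `CreateModel` parameters, it plausibly resembles the following; the image URI is a placeholder:

```python
import boto3

sm_boto3 = boto3.client("sagemaker", region_name="us-east-1")

ROLE_ARN = "arn:aws:iam::XXX:role/service-role/XXX"  # from the requirements section

create_model_res = sm_boto3.create_model(
    ModelName="question-answering-example",
    ExecutionRoleArn=ROLE_ARN,
    PrimaryContainer={
        "Image": "XXX.dkr.ecr.us-east-1.amazonaws.com/deepsparse-sagemaker:latest",
        # For an s3-hosted model directory, the config.yaml notes above add:
        # "ModelDataUrl": "<MODEL-S3-PATH>",
    },
)
```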
````diff
@@ -199,9 +197,7 @@ create_model_res = sm_boto3.create_model(
 )
 ```
 
-More information about options for configuring SageMaker `Model` instances can
-be found [here](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html).
-
+Refer to [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) for more information about options for configuring SageMaker `Model` instances.
 
 ### Building a SageMaker EndpointConfig
 The `EndpointConfig` is used to set the instance type to provision, how many, scaling
````
238234
```
239235

240236
### Launching a SageMaker Endpoint
241-
Once the `EndpointConfig` is defined, the endpoint can be easily launched using
237+
Once the `EndpointConfig` is defined, launch the endpoint using
242238
the `create_endpoint` command:
243239

244240
```python
@@ -248,10 +244,10 @@ endpoint_res = sm_boto3.create_endpoint(
248244
)
249245
```
250246

251-
After creating the endpoint, its status can be checked by running the following.
247+
After creating the endpoint, you can check its status by running the following.
252248
Initially, the `EndpointStatus` will be `Creating`. Checking after the image is
253249
successfully launched, it will be `InService`. If there are any errors, it will
254-
become `Failed`.
250+
be `Failed`.
255251

256252
```python
257253
from pprint import pprint
@@ -260,8 +256,8 @@ pprint(sm_boto3.describe_endpoint(EndpointName=endpoint_name))
260256

261257

262258
## Making a Request to the Endpoint
263-
After the endpoint is in service, requests can be made to it through the
264-
`invoke_endpoint` api. Inputs will be passed as a JSON payload.
259+
After the endpoint is in service, you can make requests to it through the
260+
`invoke_endpoint` API. Inputs will be passed as a JSON payload.
265261

266262
```python
267263
import json
@@ -291,7 +287,7 @@ print(res["Body"].readlines())
291287

292288
### Cleanup
293289

294-
The model and endpoint can be deleted with the following commands:
290+
You can delete the model and endpoint with the following commands:
295291
```python
296292
sm_boto3.delete_endpoint(EndpointName=endpoint_name)
297293
sm_boto3.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
@@ -304,5 +300,5 @@ These steps create an invokable SageMaker inference endpoint powered by the Deep
304300
Engine. The `EndpointConfig` settings may be adjusted to set instance scaling rules based
305301
on deployment needs.
306302

307-
More information on deploying custom models with SageMaker can be found
308-
[here](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html).
303+
Refer to [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html) for more information on deploying custom models with SageMaker.
304+

src/content/use-cases/deploying-deepsparse/deepsparse-server.mdx

Lines changed: 9 additions & 9 deletions
````diff
@@ -11,16 +11,16 @@ This section explains how to deploy with DeepSparse Server.
 
 ## Installation Requirements
 
-This section requires the [DeepSparse Server Install](/get-started/install/deepsparse).
+This use case requires the installation of [DeepSparse Server](/get-started/install/deepsparse).
 
 ## Usage
 
 The DeepSparse Server allows you to serve models and `Pipelines` for deployment in HTTP. The server runs on top of the popular FastAPI web framework and Uvicorn web server.
 The server supports any task from DeepSparse, such as `Pipelines` including NLP, image classification, and object detection tasks.
 An updated list of available tasks can be found
-[on the DeepSparse Pipelines Introduction](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/PIPELINES.md).
+[in the DeepSparse Pipelines Introduction](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/PIPELINES.md).
 
-Run the help CLI to lookup the available arguments.
+Run the help CLI to look up the available arguments.
 
 ```
 $ deepsparse.server --help
````
````diff
@@ -65,7 +65,7 @@ $ deepsparse.server --help
 
 ## Single Model Inference
 
-Example CLI command for serving a single model for the **question answering** task:
+Here is an example CLI command for serving a single model for the question answering task:
 
 ```bash
 deepsparse.server \
````
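The command is truncated by the hunk; the same invocation appears in full in the docker.mdx diff later in this commit, so the completion is presumably:

```bash
deepsparse.server \
    --task question_answering \
    --model_path "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"
```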
````diff
@@ -88,7 +88,7 @@ obj = {
 response = requests.post(url, json=obj)
 ```
 
-In addition, you can make a request with a `curl` command from terminal:
+In addition, you can make a request with a `curl` command from the terminal:
 
 ```bash
 curl -X POST \
````
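The `curl` example is truncated by the hunk. Assuming the single-model server's default port and route (the `/docs` page mentioned at the end of this file shows the real routes), a complete request would look something like:

```bash
# Hedged completion; confirm the port and route on your server's /docs page.
curl -X POST \
  'http://localhost:5543/predict' \
  -H 'Content-Type: application/json' \
  -d '{"question": "who is batman?", "context": "Mark is batman."}'
```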
````diff
@@ -103,8 +103,8 @@ curl -X POST \
 
 ## Multiple Model Inference
 
-To serve multiple models you can build a `config.yaml` file.
-In the sample YAML file below, we are defining two BERT models to be served by the `deepsparse.server` for the **question answering** task:
+To serve multiple models, you can build a `config.yaml` file.
+In the sample YAML file below, we are defining two BERT models to be served by the `deepsparse.server` for the question answering task:
 
 ```yaml
 num_cores: 2
````
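The YAML example is split across this hunk and the next, so most of it is not visible here. Stitched together from the visible pieces (`num_cores: 2` above; the `endpoints:`, `model:`, and `batch_size:` lines in the next hunk), a two-model configuration plausibly looks like the sketch below; the second SparseZoo stub and the `route:` values are illustrative assumptions, not taken from the diff:

```yaml
# Hedged reconstruction; only num_cores, one model stub, and batch_size
# are visible in the diff. Routes and the second stub are assumptions.
num_cores: 2
endpoints:
  - task: question_answering
    route: /unpruned/predict          # assumed route name
    model: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
    batch_size: 1
  - task: question_answering
    route: /pruned/predict            # assumed route name
    model: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
    batch_size: 1
```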
````diff
@@ -119,7 +119,7 @@ endpoints:
     model: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
     batch_size: 1
 ```
-You can now run the server with the config file path using the `config` sub command:
+You can now run the server with the configuration file path using the `config` subcommand:
 
 ```bash
 deepsparse.server config config.yaml
````
````diff
@@ -140,7 +140,7 @@ obj = {
 response = requests.post(url, json=obj)
 ```
 
-💡 **PRO TIP** 💡: While your server is running, you can always use the awesome swagger UI that's built into FastAPI to view your model's pipeline `POST` routes.
+**PRO TIP:** While your server is running, you can always use the Swagger UI that's built into FastAPI to view your model's pipeline `POST` routes.
 The UI also enables you to easily make sample requests to your server.
 All you need is to add `/docs` at the end of your host URL:
 
````
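The concrete URL was shown as an image in the original page. Assuming the server's default host and port (commonly `localhost:5543` for `deepsparse.server`, though this should be verified for your setup), it would be:

```
http://localhost:5543/docs
```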
src/content/use-cases/deploying-deepsparse/docker.mdx

Lines changed: 5 additions & 5 deletions
````diff
@@ -7,12 +7,12 @@ index: 3000
 
 # Using/Creating a DeepSparse Docker Image
 
-DeepSparse is setup with a default Dockerfile for a minimal DeepSparse docker image.
+DeepSparse is set up with a default Dockerfile for a minimal DeepSparse Docker image.
 This image is based off the latest official Ubuntu image.
 
 ## Pull
 
-You can access the already built image detailed at https://github.com/orgs/neuralmagic/packages/container/package/deepsparse:
+You can access the already-built image detailed at https://github.com/orgs/neuralmagic/packages/container/package/deepsparse:
 
 ```bash
 docker pull ghcr.io/neuralmagic/deepsparse:1.0.2-debian11
````
````diff
@@ -21,7 +21,7 @@ docker tag ghcr.io/neuralmagic/deepsparse:1.0.2-debian11 deepsparse_docker
 
 ## Extend
 
-If you would like to customize the docker image, you can use the pre-built images as a base in your own `Dockerfile`:
+To customize the Docker image, you can use the pre-built images as a base in your own `Dockerfile`:
 
 ```Dockerfile
 from ghcr.io/neuralmagic/deepsparse:1.0.2-debian11
````
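The body of the extension example is cut off by the hunk. A customized image built on the pre-built base might look like this sketch; the added layers are illustrative assumptions:

```Dockerfile
# Hypothetical extension of the pre-built DeepSparse base image.
FROM ghcr.io/neuralmagic/deepsparse:1.0.2-debian11

# Example customizations: extra Python dependencies and a default command.
RUN pip install boto3
CMD ["deepsparse.server", "--help"]
```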
````diff
@@ -31,7 +31,7 @@ from ghcr.io/neuralmagic/deepsparse:1.0.2-debian11
 
 ## Build
 
-In order to build and launch this image, run from the `docker/` directory under the DeepSparse Repo:
+To build and launch this image, run the following from the `docker/` directory under the DeepSparse Repo:
 ```bash
 $ docker build -t deepsparse_docker . && docker run -it deepsparse_docker ${python_command}
 ```
````
`````diff
@@ -42,7 +42,7 @@ For example:
 docker build -t deepsparse_docker . && docker run -it deepsparse_docker deepsparse.server --task question_answering --model_path "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"
 ````
 
-If you want to use a specific branch from deepsparse you can use the `GIT_CHECKOUT` build arg:
+To use a specific branch from DeepSparse, you can use the `GIT_CHECKOUT` build argument:
 
 ```bash
 docker build --build-arg GIT_CHECKOUT=main -t deepsparse_docker .
`````
