* Update question-answering.mdx
* Update question-answering.mdx
* Update question-answering.mdx
  * Line 29: What command?
* Update text-classification.mdx
* Update token-classification.mdx
* Update deploying.mdx
  * Line 307: What is the "appropriate" documentation?
* Update sparsifying.mdx
  * Line 12: What is the convention for "torch and torchvision"? Should this be "PyTorch and torchvision"?
* Update deploying.mdx
* Update sparsifying.mdx
* Update deploying.mdx
  * Line 177: What is "appropriate documentation"?
* Update deepsparse-server.mdx
  * Line 143: Determine consistent use of graphic icons.
  * Line 143: Do we want to use the word "awesome"?
  * Line 149: What is this image?
* Update aws-sagemaker.mdx
  * Line 207: The parenthetical statement needs to have a link or be rewritten to reference something specific.
* Update docker.mdx
* Update sparsifying.mdx
* Update deepsparse-server.mdx
* Update question-answering.mdx
* Update text-classification.mdx
* Update token-classification.mdx
* Update sparsifying.mdx
* Update sparsifying.mdx
* Update src/content/use-cases/object-detection/sparsifying.mdx
Co-authored-by: Robert Shaw <[email protected]>
Co-authored-by: Jeannie Finks <[email protected]>
Co-authored-by: Robert Shaw <[email protected]>
File: src/content/use-cases/deploying-deepsparse/aws-sagemaker.mdx (+41 −45)
@@ -15,14 +15,14 @@ Deployments benefit from both sparse-CPU acceleration with
 DeepSparse and automatic scaling from SageMaker.
 
 ## Installation Requirements
-The listed steps can be easily completed using a `python` and `bash`. The following
+The listed steps can be easily completed using `python` and `bash`. The following
 credentials, tools, and libraries are also required:
-* The [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) version 2.X that is [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html). Doublecheck if the `region` that is configured in your AWS CLI matches the region in the SparseMaker class found in the `endpoint.py` file. Currently, the default region being used is `us-east-1`.
+* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) version 2.X that is [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html). Double-check that the `region` configured in your AWS CLI matches the region in the SparseMaker class found in the `endpoint.py` file. Currently, the default region being used is `us-east-1`.
 * The [ARN](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) of your AWS role requires access to full SageMaker permissions.
-  * `AmazonSageMakerFullAccess`
-* In the following steps, we will refer to this as `ROLE_ARN`. It should take the form `"arn:aws:iam::XXX:role/service-role/XXX"`. In addition to role permissions, make sure the AWS user who configured the AWS CLI configuration has ECR/SageMaker permissions.
-* [Docker and the `docker` cli](https://docs.docker.com/get-docker/).
-* The `boto3` python AWS sdk (`pip install boto3`).
+  * `AmazonSageMakerFullAccess`
+* In the following steps, we will refer to this as `ROLE_ARN`. It should take the form `"arn:aws:iam::XXX:role/service-role/XXX"`. In addition to role permissions, make sure the AWS user who configured the AWS CLI has ECR/SageMaker permissions.
+* [Docker and the `docker` CLI](https://docs.docker.com/get-docker/).
+* The `boto3` Python AWS SDK (`pip install boto3`).
 
 ### Quick Start
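The region requirement above is easy to get wrong silently. As a quick sanity check (an editor's sketch, not a step from the diff), you can print the region your credentials resolve to and compare it against the `us-east-1` default mentioned above:

```python
import boto3

# Print the region the AWS CLI/boto3 configuration resolves to; it should
# match the region used by the SparseMaker class in endpoint.py
# (us-east-1 by default, per the requirements above).
session = boto3.session.Session()
print(session.region_name)
```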
@@ -40,7 +40,7 @@ Run the following command to build your SageMaker endpoint.
 python endpoint.py create
 ```
 
-After the endpoint has been staged (~1 minute), you can start making requests by passing your endpoint `region name` and your `endpoint name`. Afterwards you can run inference by passing in your question and context:
+After the endpoint has been staged (~1 minute), you can start making requests by passing your endpoint `region name` and your `endpoint name`. Afterwards, you can run inference by passing in your question and context:
 
 ```python
@@ -53,9 +53,9 @@ answer = qa.predict(question="who is batman?", context="Mark is batman.")
 The answer is: `b'{"score":0.6484262943267822,"answer":"Mark","start":0,"end":4}'`
 
-If you want to delete your endpoint, please use:
+If you want to delete your endpoint, use:
 
 ```bash
 python endpoint.py destroy
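The Quick Start above relies on a small client (`qa_client.py`, described later in this diff) whose internals are not shown. Here is a minimal sketch of what such a client plausibly does, assuming it wraps boto3's `sagemaker-runtime` `invoke_endpoint` API; the endpoint name is a placeholder:

```python
import json
import boto3

# Hypothetical client logic; the real qa_client.py is not shown in this diff.
runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

body = json.dumps({"question": "who is batman?", "context": "Mark is batman."})
response = runtime.invoke_endpoint(
    EndpointName="<YOUR-ENDPOINT-NAME>",  # placeholder
    ContentType="application/json",
    Body=body,
)
print(response["Body"].read())  # e.g. b'{"score":...,"answer":"Mark","start":0,"end":4}'
```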
@@ -65,68 +65,66 @@ Continue reading to learn more about the files in this directory, the build requ
 
 ## Contents
 In addition to the step-by-step instructions below, the directory contains
-additional files to aid in the deployment.
+files to aid in the deployment.
 
 ### Dockerfile
 The included `Dockerfile` builds an image on top of the standard `python:3.8` image
-with `deepsparse` installed and creates an executable command `serve` that runs
+with `deepsparse` installed, and creates an executable command `serve` that runs
 `deepsparse.server` on port 8080. SageMaker will execute this image by running
 `docker run serve` and expects the image to serve inference requests at the
 `invocations/` endpoint.
 
 For general customization of the server, changes should not need to be made
-to the Dockerfile, but to the `config.yaml` file that the Dockerfile reads from
-instead.
+to the Dockerfile but, instead, to the `config.yaml` file from which the Dockerfile reads.
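Since SageMaker invokes the container at the `invocations/` route, one way to sanity-check the image before pushing it is to run it locally and post a request to that route. This is a sketch under assumptions (local port mapping, JSON body shape), not a documented step from the diff:

```python
import requests

# Assumes the image is running locally, e.g.:
#   docker run -p 8080:8080 deepsparse-sagemaker-example serve
# The JSON body shape mirrors the question-answering examples above.
resp = requests.post(
    "http://localhost:8080/invocations",
    json={"question": "who is batman?", "context": "Mark is batman."},
)
print(resp.status_code, resp.text)
```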
 ### config.yaml
 `config.yaml` is used to configure the DeepSparse server running in the Dockerfile.
-The config must contain the line `integration: sagemaker` so
+The configuration must contain the line `integration: sagemaker` so
 endpoints may be provisioned correctly to match SageMaker specifications.
 
 Notice that the `model_path` and `task` are set to run a sparse-quantized
-question-answering model from [SparseZoo](https://sparsezoo.neuralmagic.com/).
+question-answering model from [SparseZoo](https://sparsezoo.neuralmagic.com/).
 To use a model directory stored in `s3`, set `model_path` to `/opt/ml/model` in
-the config and add `ModelDataUrl=<MODEL-S3-PATH>` to the `CreateModel` arguments.
+the configuration and add `ModelDataUrl=<MODEL-S3-PATH>` to the `CreateModel` arguments.
 SageMaker will automatically copy the files from the s3 path into `/opt/ml/model`
-which the server can then read from.
+from which the server can then read.
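To make the `integration: sagemaker` and `model_path` discussion concrete, here is a hypothetical `config.yaml` along the lines described above, written out from Python for consistency with the other sketches. The exact schema is an assumption (the file itself is not shown in this diff), and the SparseZoo stub is a placeholder:

```python
# Writes a hypothetical config.yaml; keys are assumed from the text above,
# and the model stub is a placeholder, not a real SparseZoo path.
config = """\
integration: sagemaker
models:
  - task: question_answering
    model_path: zoo:<SPARSEZOO-QA-MODEL-STUB>
    batch_size: 1
"""
with open("config.yaml", "w") as f:
    f.write(config)
```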
 ### push_image.sh
 
-Bash script for pushing your local Docker image to the AWS ECR repository.
+This is a Bash script for pushing your local Docker image to the AWS ECR repository.
 
 ### endpoint.py
 
-Contains the SparseMaker object for automating the build of a SageMaker endpoint from a Docker Image. You have the option to customize the parameters of the class in order to match the prefered state of your deployment.
+This file contains the SparseMaker object for automating the build of a SageMaker endpoint from a Docker image. You can customize the parameters of the class to match the preferred state of your deployment.
 ### qa_client.py
 
-Contains a client object for making requests to the SageMaker inference endpoint for the question answering task.
-____
-More information on the DeepSparse server and its configuration can be found
+This file contains a client object for making requests to the SageMaker inference endpoint for the question answering task.
+
+Review [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server#readme) for more information about the server and its configuration.
 ## Deploying to SageMaker
 The following steps are required to provision and deploy DeepSparse to SageMaker
 for inference:
-* Build the DeepSparse-SageMaker `Dockerfile` into a local docker image
-* Create an [Amazon ECR](https://aws.amazon.com/ecr/) repository to host the image
-* Push the image to the ECR repository
-* Create a SageMaker `Model` that reads from the hosted ECR image
-* Build a SageMaker `EndpointConfig` that defines how to provision the model deployment
-* Launch the SageMaker `Endpoint` defined by the `Model` and `EndpointConfig`
+* Build the DeepSparse-SageMaker `Dockerfile` into a local Docker image.
+* Create an [Amazon ECR](https://aws.amazon.com/ecr/) repository to host the image.
+* Push the image to the ECR repository.
+* Create a SageMaker `Model` that reads from the hosted ECR image.
+* Build a SageMaker `EndpointConfig` that defines how to provision the model deployment.
+* Launch the SageMaker `Endpoint` defined by the `Model` and `EndpointConfig`.
 ### Building the DeepSparse-SageMaker Image Locally
-The `Dockerfile` can be build from this directory from a bash shell using the following command.
+Build the `Dockerfile` from this directory in a bash shell using the following command.
 The image will be tagged locally as `deepsparse-sagemaker-example`.
 
 ```bash
 docker build -t deepsparse-sagemaker-example .
 ```
 
 ### Creating an ECR Repository
-The following code snippet can be used in Python to create an ECR repository.
+Use the following code snippet in Python to create an ECR repository.
 The `region_name` can be swapped to a preferred region. The repository will be named
-`deepsparse-sagemaker`. If the repository is already created, this step may be skipped.
+`deepsparse-sagemaker`. If the repository is already created, you may skip this step.
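The snippet referenced above is collapsed out of this diff view. A minimal sketch of what it plausibly does with `boto3` (repository name taken from the text above; region is the default mentioned earlier):

```python
import boto3

ecr = boto3.client("ecr", region_name="us-east-1")
# Creates the repository described above; skip this call if it already exists.
repository = ecr.create_repository(repositoryName="deepsparse-sagemaker")
print(repository["repository"]["repositoryUri"])
```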
@@ ... @@
-More information about options for configuring SageMaker `Model` instances can
-be found [here](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html).
-
+Refer to [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) for more information about options for configuring SageMaker `Model` instances.
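For context, a `CreateModel` call of the kind linked above looks roughly like this with `boto3`; every name, URI, and ARN here is a placeholder, not a value from the diff:

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")
sm.create_model(
    ModelName="deepsparse-sagemaker-example",  # placeholder
    PrimaryContainer={
        # Placeholder ECR image URI from the push step above.
        "Image": "<ACCOUNT-ID>.dkr.ecr.us-east-1.amazonaws.com/deepsparse-sagemaker:latest",
    },
    ExecutionRoleArn="arn:aws:iam::XXX:role/service-role/XXX",  # ROLE_ARN from the requirements
)
```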
 ### Building a SageMaker EndpointConfig
 The `EndpointConfig` is used to set the instance type to provision, how many, scaling
@@ ... @@
 Refer to [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html) for more information on deploying custom models with SageMaker.
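To round out the step list, the `EndpointConfig` and `Endpoint` creation calls look roughly like this in `boto3`; the instance type, counts, and names are illustrative assumptions:

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")
sm.create_endpoint_config(
    EndpointConfigName="deepsparse-endpoint-config",  # placeholder
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "deepsparse-sagemaker-example",  # from the CreateModel step
            "InstanceType": "ml.c5.xlarge",               # assumed instance type
            "InitialInstanceCount": 1,
        }
    ],
)
sm.create_endpoint(
    EndpointName="<YOUR-ENDPOINT-NAME>",  # placeholder
    EndpointConfigName="deepsparse-endpoint-config",
)
```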
File: src/content/use-cases/deploying-deepsparse/deepsparse-server.mdx (+9 −9)
@@ -11,16 +11,16 @@ This section explains how to deploy with DeepSparse Server.
 
 ## Installation Requirements
 
-This section requires the [DeepSparse Server Install](/get-started/install/deepsparse).
+This use case requires the installation of [DeepSparse Server](/get-started/install/deepsparse).
 
 ## Usage
 
 The DeepSparse Server allows you to serve models and `Pipelines` for deployment in HTTP. The server runs on top of the popular FastAPI web framework and Uvicorn web server.
 The server supports any task from DeepSparse, such as `Pipelines` including NLP, image classification, and object detection tasks.
 An updated list of available tasks can be found
-[on the DeepSparse Pipelines Introduction](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/PIPELINES.md).
+[in the DeepSparse Pipelines Introduction](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/PIPELINES.md).
 
-Run the help CLI to lookup the available arguments.
+Run the help CLI to look up the available arguments.
 
 ```
 $ deepsparse.server --help
@@ -65,7 +65,7 @@ $ deepsparse.server --help
 
 ## Single Model Inference
 
-Example CLI command for serving a single model for the **question answering** task:
+Here is an example CLI command for serving a single model for the question answering task:
 
 ```bash
 deepsparse.server \
@@ -88,7 +88,7 @@ obj = {
 response = requests.post(url, json=obj)
 ```
 
-In addition, you can make a request with a `curl` command from terminal:
+In addition, you can make a request with a `curl` command from the terminal:
 
 ```bash
 curl -X POST \
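Both the Python and `curl` snippets in this hunk are truncated by the diff view, so here is a self-contained version of the request pattern they follow; the port and route are assumptions (based on the host URL convention mentioned later in this file), not values from the diff:

```python
import requests

# Assumes deepsparse.server is running locally; port and route are assumed.
url = "http://localhost:5543/predict"
obj = {"question": "Who is Mark?", "context": "Mark is batman."}

response = requests.post(url, json=obj)
print(response.json())
```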
@@ -103,8 +103,8 @@ curl -X POST \
 
 ## Multiple Model Inference
 
-To serve multiple models you can build a `config.yaml` file.
-In the sample YAML file below, we are defining two BERT models to be served by the `deepsparse.server` for the **question answering** task:
+To serve multiple models, you can build a `config.yaml` file.
+In the sample YAML file below, we are defining two BERT models to be served by the `deepsparse.server` for the question answering task:
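The sample YAML itself is collapsed out of this diff view. Here is a hypothetical two-model configuration consistent with that description, written out from Python for consistency with the other sketches; the schema and SparseZoo stubs are assumptions, not content from the diff:

```python
# Hypothetical config.yaml for two question-answering models; the schema
# and model stubs are assumptions, not copied from the repository.
config = """\
models:
  - task: question_answering
    model_path: zoo:<BERT-QA-STUB-1>
    batch_size: 1
  - task: question_answering
    model_path: zoo:<BERT-QA-STUB-2>
    batch_size: 1
"""
with open("config.yaml", "w") as f:
    f.write(config)
```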
@@ ... @@
-You can now run the server with the config file path using the `config` sub command:
+You can now run the server with the configuration file path using the `config` subcommand:
 
 ```bash
 deepsparse.server config config.yaml
@@ -140,7 +140,7 @@ obj = {
 response = requests.post(url, json=obj)
 ```
 
-💡 **PRO TIP** 💡: While your server is running, you can always use the awesome swagger UI that's built into FastAPI to view your model's pipeline `POST` routes.
+**PRO TIP:** While your server is running, you can always use the Swagger UI built into FastAPI to view your model's pipeline `POST` routes.
 The UI also enables you to easily make sample requests to your server.
 All you need is to add `/docs` at the end of your host URL:
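As a small illustration of the `/docs` tip (the host URL and port here are assumptions for a local run, not values from the diff):

```python
import webbrowser

# Opens FastAPI's auto-generated Swagger UI for a locally running server;
# the host and port are assumed for a local deployment.
webbrowser.open("http://localhost:5543/docs")
```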