Commit e33c2f4

Merge pull request #114435 from MicrosoftDocs/j-martens-patch-34
Update how-to-deploy-fpga-web-service.md
2 parents 3d1fa27 + c961d26 commit e33c2f4

1 file changed: +168, -187 lines changed

articles/machine-learning/how-to-deploy-fpga-web-service.md

Lines changed: 168 additions & 187 deletions
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,7 @@ ms.custom: seodec18
1717
# What are field-programmable gate arrays (FPGA) and how to deploy
1818
[!INCLUDE [applies-to-skus](../../includes/aml-applies-to-basic-enterprise-sku.md)]
1919

20-
This article provides an introduction to field-programmable gate arrays (FPGA), and shows you how to deploy your models using Azure Machine Learning to an Azure FPGA.
20+
This article provides an introduction to field-programmable gate arrays (FPGA), and shows you how to deploy your models using [Azure Machine Learning](overview-what-is-azure-ml.md) to an Azure FPGA.
2121

2222
FPGAs contain an array of programmable logic blocks, and a hierarchy of reconfigurable interconnects. The interconnects allow these blocks to be configured in various ways after manufacturing. Compared to other chips, FPGAs provide a combination of programmability and performance.
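To make "configured in various ways after manufacturing" concrete, the following is a minimal sketch that models one programmable logic block as a small lookup table (LUT). The `LookupTable` class and its methods are hypothetical and exist only for illustration. Real FPGA toolchains work with hardware description languages, not Python.

```python
from itertools import product


class LookupTable:
    """Toy k-input lookup table standing in for a programmable logic block."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        # One output bit per input combination; all zeros until configured.
        self.table = [0] * (2 ** num_inputs)

    def configure(self, func):
        """Load the truth table of `func` into the block ("programming" it)."""
        for bits in product((0, 1), repeat=self.num_inputs):
            self.table[self._index(bits)] = int(bool(func(*bits)))

    def evaluate(self, *bits):
        """Return the stored output bit for the given input bits."""
        return self.table[self._index(bits)]

    @staticmethod
    def _index(bits):
        # Interpret the input bits as a binary number, most significant first.
        index = 0
        for bit in bits:
            index = (index << 1) | bit
        return index


lut = LookupTable(2)
lut.configure(lambda a, b: a & b)  # the block now behaves as an AND gate
and_result = lut.evaluate(1, 1)
lut.configure(lambda a, b: a ^ b)  # same block, reconfigured as an XOR gate
xor_result = lut.evaluate(1, 1)
```

The same block produces different outputs for the same inputs depending only on what was loaded into it, which is the property the paragraph above describes.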
2323

@@ -77,9 +77,9 @@ The following scenarios use FPGAs:
7777

7878
+ [Land cover mapping](https://blogs.technet.microsoft.com/machinelearning/2018/05/29/how-to-use-fpgas-for-deep-learning-inference-to-perform-land-cover-mapping-on-terabytes-of-aerial-images/)
7979

80-
## Example: Deploy models on FPGAs
80+
## Deploy models on FPGAs
8181

82-
You can deploy a model as a web service on FPGAs with Azure Machine Learning Hardware Accelerated Models. Using FPGAs provides ultra-low latency inference, even with a single batch size. Inference, or model scoring, is the phase where the deployed model is used for prediction, most commonly on production data.
82+
You can deploy a model as a web service on FPGAs with [Azure Machine Learning Hardware Accelerated Models](https://docs.microsoft.com/python/api/azureml-accel-models/azureml.accel?view=azure-ml-py). Using FPGAs provides ultra-low latency inference, even with a single batch size. Inference, or model scoring, is the phase where the deployed model is used for prediction, most commonly on production data.
8383

8484
### Prerequisites
8585

@@ -114,7 +114,7 @@ You can deploy a model as a web service on FPGAs with Azure Machine Learning Har
114114
pip install --upgrade azureml-accel-models[cpu]
115115
```
116116
117-
## 1. Create and containerize models
117+
### 1. Create and containerize models
118118
119119
This document will describe how to create a TensorFlow graph to preprocess the input image, make it a featurizer using ResNet 50 on an FPGA, and then run the features through a classifier trained on the ImageNet data set.
120120
@@ -128,189 +128,170 @@ Follow the instructions to:
128128
129129
Use the [Azure Machine Learning SDK for Python](https://docs.microsoft.com/python/api/overview/azure/ml/intro?view=azure-ml-py) to create a service definition. A service definition is a file describing a pipeline of graphs (input, featurizer, and classifier) based on TensorFlow. The deployment command automatically compresses the definition and graphs into a ZIP file, and uploads the ZIP to Azure Blob storage. The DNN is already deployed to run on the FPGA.
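The SDK performs the packaging step for you. Purely to illustrate what "compresses the definition and graphs into a ZIP file" amounts to, here is a minimal sketch using only the Python standard library. The file names (`service_def.json`, `input.pb`, and so on), the definition layout, and the `package_service_definition` helper are assumptions made for this example, not the format the SDK actually uses.

```python
import io
import json
import zipfile


def package_service_definition(definition, graphs):
    """Bundle a service definition and its graph files into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical file name for the pipeline description.
        archive.writestr("service_def.json", json.dumps(definition, indent=2))
        for file_name, payload in graphs.items():
            archive.writestr(file_name, payload)
    return buffer.getvalue()


# Stand-in content; real graphs would be serialized TensorFlow graphs.
definition = {"pipeline": ["input", "featurizer", "classifier"]}
graphs = {
    "input.pb": b"stand-in input graph",
    "featurizer.pb": b"stand-in featurizer graph",
    "classifier.pb": b"stand-in classifier graph",
}

zip_bytes = package_service_definition(definition, graphs)

# Read the archive back to confirm what was packaged.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    packaged_names = sorted(archive.namelist())
```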
130130
131-
### Load Azure Machine Learning workspace
132-
133-
Load your Azure Machine Learning workspace.
134-
135-
```python
136-
import os
137-
import tensorflow as tf
138-
139-
from azureml.core import Workspace
140-
141-
ws = Workspace.from_config()
142-
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')
143-
```
144-
145-
### Preprocess image
146-
147-
The input to the web service is a JPEG image. The first step is to decode the JPEG image and preprocess it. The JPEG images are treated as strings, and the results are tensors that will be the input to the ResNet 50 model.
148-
149-
```python
150-
# Input images as a two-dimensional tensor containing an arbitrary number of images represented as strings
151-
import azureml.accel.models.utils as utils
152-
tf.reset_default_graph()
153-
154-
in_images = tf.placeholder(tf.string)
155-
image_tensors = utils.preprocess_array(in_images)
156-
print(image_tensors.shape)
157-
```
158-
159-
### Load featurizer
160-
161-
Initialize the model and download a TensorFlow checkpoint of the quantized version of ResNet50 to be used as a featurizer. You can replace "QuantizedResnet50" in the code snippet below by importing one of the other deep neural networks:
162-
163-
- QuantizedResnet152
164-
- QuantizedVgg16
165-
- Densenet121
166-
167-
```python
168-
from azureml.accel.models import QuantizedResnet50
169-
save_path = os.path.expanduser('~/models')
170-
model_graph = QuantizedResnet50(save_path, is_frozen=True)
171-
feature_tensor = model_graph.import_graph_def(image_tensors)
172-
print(model_graph.version)
173-
print(feature_tensor.name)
174-
print(feature_tensor.shape)
175-
```
176-
177-
### Add classifier
178-
179-
This classifier has been trained on the ImageNet data set. Examples for transfer learning and training your customized weights are available in the set of [sample notebooks](https://aka.ms/aml-notebooks).
180-
181-
```python
182-
classifier_output = model_graph.get_default_classifier(feature_tensor)
183-
print(classifier_output)
184-
```
185-
186-
### Save the model
187-
188-
Now that the preprocessor, ResNet 50 featurizer, and the classifier have been loaded, save the graph and associated variables as a model.
189-
190-
```python
191-
model_name = "resnet50"
192-
model_save_path = os.path.join(save_path, model_name)
193-
print("Saving model in {}".format(model_save_path))
194-
195-
with tf.Session() as sess:
196-
    model_graph.restore_weights(sess)
197-
    tf.saved_model.simple_save(sess, model_save_path,
198-
                               inputs={'images': in_images},
199-
                               outputs={'output_alias': classifier_output})
200-
```
201-
202-
### Save input and output tensors
203-
The input and output tensors that were created during the preprocessing and classifier steps will be needed for model conversion and inference.
204-
205-
```python
206-
input_tensors = in_images.name
207-
output_tensors = classifier_output.name
208-
209-
print(input_tensors)
210-
print(output_tensors)
211-
```
212-
213-
> [!IMPORTANT]
214-
> Save the input and output tensors because you will need them for model conversion and inference requests.
215-
216-
The available models and the corresponding default classifier output tensors are below, which is what you would use for inference if you used the default classifier.
217-
218-
+ Resnet50, QuantizedResnet50
219-
```python
220-
output_tensors = "classifier_1/resnet_v1_50/predictions/Softmax:0"
221-
```
222-
+ Resnet152, QuantizedResnet152
223-
```python
224-
output_tensors = "classifier/resnet_v1_152/predictions/Softmax:0"
225-
```
226-
+ Densenet121, QuantizedDensenet121
227-
```python
228-
output_tensors = "classifier/densenet121/predictions/Softmax:0"
229-
```
230-
+ Vgg16, QuantizedVgg16
231-
```python
232-
output_tensors = "classifier/vgg_16/fc8/squeezed:0"
233-
```
234-
+ SsdVgg, QuantizedSsdVgg
235-
```python
236-
output_tensors = ['ssd_300_vgg/block4_box/Reshape_1:0', 'ssd_300_vgg/block7_box/Reshape_1:0', 'ssd_300_vgg/block8_box/Reshape_1:0', 'ssd_300_vgg/block9_box/Reshape_1:0', 'ssd_300_vgg/block10_box/Reshape_1:0', 'ssd_300_vgg/block11_box/Reshape_1:0', 'ssd_300_vgg/block4_box/Reshape:0', 'ssd_300_vgg/block7_box/Reshape:0', 'ssd_300_vgg/block8_box/Reshape:0', 'ssd_300_vgg/block9_box/Reshape:0', 'ssd_300_vgg/block10_box/Reshape:0', 'ssd_300_vgg/block11_box/Reshape:0']
237-
```
238-
239-
### Register model
240-
241-
[Register](concept-model-management-and-deployment.md) the model by using the SDK with the ZIP file in Azure Blob storage. Adding tags and other metadata about the model helps you keep track of your trained models.
242-
243-
```python
244-
from azureml.core.model import Model
245-
246-
registered_model = Model.register(workspace=ws,
247-
model_path=model_save_path,
248-
model_name=model_name)
249-
250-
print("Successfully registered: ", registered_model.name,
251-
registered_model.description, registered_model.version, sep='\t')
252-
```
253-
254-
If you've already registered a model and want to load it, you may retrieve it.
255-
256-
```python
257-
from azureml.core.model import Model
258-
model_name = "resnet50"
259-
# By default, the latest version is retrieved. You can specify the version, e.g. version=1
260-
registered_model = Model(ws, name="resnet50")
261-
print(registered_model.name, registered_model.description,
262-
registered_model.version, sep='\t')
263-
```
264-
265-
### Convert model
266-
267-
Convert the TensorFlow graph to the Open Neural Network Exchange format ([ONNX](https://onnx.ai/)). You will need to provide the names of the input and output tensors, and these names will be used by your client when you consume the web service.
268-
269-
```python
270-
from azureml.accel import AccelOnnxConverter
271-
272-
convert_request = AccelOnnxConverter.convert_tf_model(
273-
ws, registered_model, input_tensors, output_tensors)
274-
275-
# If it fails, you can run wait_for_completion again with show_output=True.
276-
convert_request.wait_for_completion(show_output=False)
277-
278-
# If the above call succeeded, get the converted model
279-
converted_model = convert_request.result
280-
print("\nSuccessfully converted: ", converted_model.name, converted_model.url, converted_model.version,
281-
converted_model.id, converted_model.created_time, '\n')
282-
```
283-
284-
### Create Docker image
285-
286-
The converted model and all dependencies are added to a Docker image. This Docker image can then be deployed and instantiated. Supported deployment targets include AKS in the cloud or an edge device such as [Azure Data Box Edge](https://docs.microsoft.com/azure/databox-online/data-box-edge-overview). You can also add tags and descriptions for your registered Docker image.
287-
288-
```python
289-
from azureml.core.image import Image
290-
from azureml.accel import AccelContainerImage
291-
292-
image_config = AccelContainerImage.image_configuration()
293-
# Image name must be lowercase
294-
image_name = "{}-image".format(model_name)
295-
296-
image = Image.create(name=image_name,
297-
models=[converted_model],
298-
image_config=image_config,
299-
workspace=ws)
300-
image.wait_for_creation(show_output=False)
301-
```
302-
303-
List the images by tag and get the detailed logs for any debugging.
304-
305-
```python
306-
for i in Image.list(workspace=ws):
307-
    print('{}(v.{} [{}]) stored at {} with build log {}'.format(
308-
        i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))
309-
```
310-
311-
## 2. Deploy to cloud or edge
312-
313-
### Deploy to the cloud
131+
1. Load Azure Machine Learning workspace
132+
133+
```python
134+
import os
135+
import tensorflow as tf
136+
137+
from azureml.core import Workspace
138+
139+
ws = Workspace.from_config()
140+
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')
141+
```
142+
143+
2. Preprocess image. The input to the web service is a JPEG image. The first step is to decode the JPEG image and preprocess it. The JPEG images are treated as strings, and the results are tensors that will be the input to the ResNet 50 model.
144+
145+
```python
146+
# Input images as a two-dimensional tensor containing an arbitrary number of images represented as strings
147+
import azureml.accel.models.utils as utils
148+
tf.reset_default_graph()
149+
150+
in_images = tf.placeholder(tf.string)
151+
image_tensors = utils.preprocess_array(in_images)
152+
print(image_tensors.shape)
153+
```
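"Treated as strings" means each image reaches the graph as the raw bytes of the JPEG file, not as a decoded pixel array. The sketch below shows one way to collect such a batch on the client side. The `load_jpeg_batch` helper is hypothetical, and the demo file is a stand-in that only carries the JPEG start-of-image marker, not a real picture.

```python
import os
import tempfile

JPEG_SOI = b"\xff\xd8"  # every JPEG file starts with this Start Of Image marker


def load_jpeg_batch(paths):
    """Read each file as raw bytes and do a cheap check that it looks like a JPEG."""
    batch = []
    for path in paths:
        with open(path, "rb") as handle:
            data = handle.read()
        if not data.startswith(JPEG_SOI):
            raise ValueError("{} does not look like a JPEG file".format(path))
        batch.append(data)
    return batch


# Demo with a stand-in file so the example runs without a real image.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.jpg")
    with open(sample_path, "wb") as handle:
        handle.write(JPEG_SOI + b"stand-in payload, not a real image")
    batch = load_jpeg_batch([sample_path])
```

A list like `batch` is the kind of value you would feed to a string placeholder such as `in_images`; decoding and resizing then happen inside the graph.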
154+
155+
1. Load featurizer. Initialize the model and download a TensorFlow checkpoint of the quantized version of ResNet50 to be used as a featurizer. You can replace "QuantizedResnet50" in the code snippet below by importing one of the other deep neural networks:
156+
157+
- QuantizedResnet152
158+
- QuantizedVgg16
159+
- Densenet121
160+
161+
```python
162+
from azureml.accel.models import QuantizedResnet50
163+
save_path = os.path.expanduser('~/models')
164+
model_graph = QuantizedResnet50(save_path, is_frozen=True)
165+
feature_tensor = model_graph.import_graph_def(image_tensors)
166+
print(model_graph.version)
167+
print(feature_tensor.name)
168+
print(feature_tensor.shape)
169+
```
170+
171+
1. Add a classifier. This classifier has been trained on the ImageNet data set. Examples for transfer learning and training your customized weights are available in the set of [sample notebooks](https://aka.ms/aml-notebooks).
172+
173+
```python
174+
classifier_output = model_graph.get_default_classifier(feature_tensor)
175+
print(classifier_output)
176+
```
177+
178+
1. Save the model. Now that the preprocessor, ResNet 50 featurizer, and the classifier have been loaded, save the graph and associated variables as a model.
179+
180+
```python
181+
model_name = "resnet50"
182+
model_save_path = os.path.join(save_path, model_name)
183+
print("Saving model in {}".format(model_save_path))
184+
185+
with tf.Session() as sess:
186+
    model_graph.restore_weights(sess)
187+
    tf.saved_model.simple_save(sess, model_save_path,
188+
                               inputs={'images': in_images},
189+
                               outputs={'output_alias': classifier_output})
190+
```
191+
192+
1. Save input and output tensors. The input and output tensors that were created during the preprocessing and classifier steps will be needed for model conversion and inference.
193+
194+
```python
195+
input_tensors = in_images.name
196+
output_tensors = classifier_output.name
197+
198+
print(input_tensors)
199+
print(output_tensors)
200+
```
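Because the tensor names are needed again at conversion and inference time, possibly in a different session or on a different machine, it can help to write them to disk. This is a minimal sketch using JSON. The helper names, the file name, and the input tensor name `Placeholder:0` are assumptions for illustration; use the values printed by the previous step.

```python
import json
import os
import tempfile


def save_tensor_names(path, input_tensors, output_tensors):
    """Write the tensor names to a small JSON file."""
    with open(path, "w") as handle:
        json.dump({"input_tensors": input_tensors,
                   "output_tensors": output_tensors}, handle, indent=2)


def load_tensor_names(path):
    """Read the tensor names back as an (input, output) pair."""
    with open(path) as handle:
        names = json.load(handle)
    return names["input_tensors"], names["output_tensors"]


with tempfile.TemporaryDirectory() as tmp_dir:
    names_path = os.path.join(tmp_dir, "tensor_names.json")
    save_tensor_names(
        names_path,
        "Placeholder:0",  # assumed input name for this example
        "classifier_1/resnet_v1_50/predictions/Softmax:0",
    )
    restored_input, restored_output = load_tensor_names(names_path)
```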
201+
202+
> [!IMPORTANT]
203+
> Save the input and output tensors because you will need them for model conversion and inference requests.
204+
205+
The available models and the corresponding default classifier output tensors are below, which is what you would use for inference if you used the default classifier.
206+
207+
+ Resnet50, QuantizedResnet50
208+
```python
209+
output_tensors = "classifier_1/resnet_v1_50/predictions/Softmax:0"
210+
```
211+
+ Resnet152, QuantizedResnet152
212+
```python
213+
output_tensors = "classifier/resnet_v1_152/predictions/Softmax:0"
214+
```
215+
+ Densenet121, QuantizedDensenet121
216+
```python
217+
output_tensors = "classifier/densenet121/predictions/Softmax:0"
218+
```
219+
+ Vgg16, QuantizedVgg16
220+
```python
221+
output_tensors = "classifier/vgg_16/fc8/squeezed:0"
222+
```
223+
+ SsdVgg, QuantizedSsdVgg
224+
```python
225+
output_tensors = ['ssd_300_vgg/block4_box/Reshape_1:0', 'ssd_300_vgg/block7_box/Reshape_1:0', 'ssd_300_vgg/block8_box/Reshape_1:0', 'ssd_300_vgg/block9_box/Reshape_1:0', 'ssd_300_vgg/block10_box/Reshape_1:0', 'ssd_300_vgg/block11_box/Reshape_1:0', 'ssd_300_vgg/block4_box/Reshape:0', 'ssd_300_vgg/block7_box/Reshape:0', 'ssd_300_vgg/block8_box/Reshape:0', 'ssd_300_vgg/block9_box/Reshape:0', 'ssd_300_vgg/block10_box/Reshape:0', 'ssd_300_vgg/block11_box/Reshape:0']
226+
```
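If you switch between models, the list above can be kept in one place as a lookup table. The sketch below collects the tensor names listed in this section. The `default_output_tensors` helper is hypothetical and is not part of the `azureml-accel-models` package.

```python
_SSD_BLOCKS = (4, 7, 8, 9, 10, 11)

# Default classifier output tensors, keyed by the non-quantized model name.
DEFAULT_OUTPUT_TENSORS = {
    "Resnet50": "classifier_1/resnet_v1_50/predictions/Softmax:0",
    "Resnet152": "classifier/resnet_v1_152/predictions/Softmax:0",
    "Densenet121": "classifier/densenet121/predictions/Softmax:0",
    "Vgg16": "classifier/vgg_16/fc8/squeezed:0",
    "SsdVgg": (
        ["ssd_300_vgg/block{}_box/Reshape_1:0".format(b) for b in _SSD_BLOCKS]
        + ["ssd_300_vgg/block{}_box/Reshape:0".format(b) for b in _SSD_BLOCKS]
    ),
}


def default_output_tensors(model_name):
    """Return the default output tensor name(s) for a model or its quantized variant."""
    prefix = "Quantized"
    key = model_name[len(prefix):] if model_name.startswith(prefix) else model_name
    if key not in DEFAULT_OUTPUT_TENSORS:
        raise KeyError("No default output tensors known for {}".format(model_name))
    return DEFAULT_OUTPUT_TENSORS[key]


output_tensors = default_output_tensors("QuantizedResnet50")
ssd_output_tensors = default_output_tensors("SsdVgg")
```

Note that the SSD-VGG entry is a list of twelve tensor names, while the classification models each use a single name.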
227+
228+
1. [Register](concept-model-management-and-deployment.md) the model by using the SDK with the ZIP file in Azure Blob storage. Adding tags and other metadata about the model helps you keep track of your trained models.
229+
230+
```python
231+
from azureml.core.model import Model
232+
233+
registered_model = Model.register(workspace=ws,
234+
model_path=model_save_path,
235+
model_name=model_name)
236+
237+
print("Successfully registered: ", registered_model.name,
238+
registered_model.description, registered_model.version, sep='\t')
239+
```
240+
241+
If you've already registered a model and want to load it, you may retrieve it.
242+
243+
```python
244+
from azureml.core.model import Model
245+
model_name = "resnet50"
246+
# By default, the latest version is retrieved. You can specify the version, e.g. version=1
247+
registered_model = Model(ws, name="resnet50")
248+
print(registered_model.name, registered_model.description,
249+
registered_model.version, sep='\t')
250+
```
251+
252+
1. Convert the TensorFlow graph to the Open Neural Network Exchange format ([ONNX](https://onnx.ai/)). You will need to provide the names of the input and output tensors, and these names will be used by your client when you consume the web service.
253+
254+
```python
255+
from azureml.accel import AccelOnnxConverter
256+
257+
convert_request = AccelOnnxConverter.convert_tf_model(
258+
ws, registered_model, input_tensors, output_tensors)
259+
260+
# If it fails, you can run wait_for_completion again with show_output=True.
261+
convert_request.wait_for_completion(show_output=False)
262+
263+
# If the above call succeeded, get the converted model
264+
converted_model = convert_request.result
265+
print("\nSuccessfully converted: ", converted_model.name, converted_model.url, converted_model.version,
266+
converted_model.id, converted_model.created_time, '\n')
267+
```
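A conversion request can take a while, so it may be worth checking the tensor names before you submit them. TensorFlow tensor names have the form `operation_name:output_index`, as in the default classifier names listed earlier. The following is a small sketch of such a check. The `split_tensor_name` helper is hypothetical and is not part of the Azure Machine Learning SDK.

```python
def split_tensor_name(tensor_name):
    """Split 'op_name:index' into (op_name, index), raising on malformed names."""
    op_name, separator, index = tensor_name.rpartition(":")
    if not separator or not op_name or not index.isdigit():
        raise ValueError(
            "Expected a tensor name like 'op_name:0', got {!r}".format(tensor_name))
    return op_name, int(index)


op_name, output_index = split_tensor_name(
    "classifier_1/resnet_v1_50/predictions/Softmax:0")

# A name without the ':index' suffix is an operation name, not a tensor name.
try:
    split_tensor_name("classifier_1/resnet_v1_50/predictions/Softmax")
    rejected_bare_op_name = False
except ValueError:
    rejected_bare_op_name = True
```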
268+
269+
1. Create Docker image from the converted model and all dependencies. This Docker image can then be deployed and instantiated. Supported deployment targets include AKS in the cloud or an edge device such as [Azure Data Box Edge](https://docs.microsoft.com/azure/databox-online/data-box-edge-overview). You can also add tags and descriptions for your registered Docker image.
270+
271+
```python
272+
from azureml.core.image import Image
273+
from azureml.accel import AccelContainerImage
274+
275+
image_config = AccelContainerImage.image_configuration()
276+
# Image name must be lowercase
277+
image_name = "{}-image".format(model_name)
278+
279+
image = Image.create(name=image_name,
280+
models=[converted_model],
281+
image_config=image_config,
282+
workspace=ws)
283+
image.wait_for_creation(show_output=False)
284+
```
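The comment in the snippet above notes that the image name must be lowercase. If your model name contains capitals, a small helper like the hypothetical `make_image_name` below can derive a conforming name. It only enforces the lowercase rule stated here and does not check any other naming constraints the service may apply.

```python
def make_image_name(model_name, suffix="image"):
    """Build '<model_name>-<suffix>' and force it to lowercase."""
    return "{}-{}".format(model_name, suffix).lower()


image_name = make_image_name("ResNet50")
```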
285+
286+
List the images by tag and get the detailed logs for any debugging.
287+
288+
```python
289+
for i in Image.list(workspace=ws):
290+
    print('{}(v.{} [{}]) stored at {} with build log {}'.format(
291+
        i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))
292+
```
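When you debug, you usually only care about the images whose build didn't succeed. The sketch below filters a list of image records down to the failed ones, so that only their build logs need to be opened. It uses stand-in `ImageRecord` tuples with the same attribute names as the loop above, and it assumes that `creation_state` is reported as the string `"Failed"` for a failed build. Check the values your workspace actually returns.

```python
from collections import namedtuple

# Stand-in for the objects returned by Image.list(); attribute names match the loop above.
ImageRecord = namedtuple(
    "ImageRecord",
    ["name", "version", "creation_state", "image_location", "image_build_log_uri"],
)


def failed_builds(images, failed_state="Failed"):
    """Return only the images whose creation state equals `failed_state` (assumed value)."""
    return [image for image in images if image.creation_state == failed_state]


# Illustrative records with placeholder locations and log URIs.
records = [
    ImageRecord("resnet50-image", 1, "Succeeded",
                "example.invalid/resnet50-image:1", "https://example.invalid/logs/1"),
    ImageRecord("resnet50-image", 2, "Failed",
                None, "https://example.invalid/logs/2"),
]

for image in failed_builds(records):
    print("{} v.{} failed; build log: {}".format(
        image.name, image.version, image.image_build_log_uri))

failed_versions = [image.version for image in failed_builds(records)]
```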
293+
294+
### 2. Deploy to cloud or edge
314295

315296
To deploy your model as a high-scale production web service, use Azure Kubernetes Service (AKS). You can create a new one using the Azure Machine Learning SDK, CLI, or [Azure Machine Learning studio](https://ml.azure.com).
316297
