Commit bce0459

Merge pull request #217389 from santiagxf/santiagxf/azureml-batch-example
Santiagxf/azureml batch example
2 parents 0e8c101 + 2ef2397 commit bce0459

File tree

5 files changed

+117
-52
lines changed

articles/machine-learning/batch-inference/how-to-deploy-model-custom-output.md

Lines changed: 14 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -27,26 +27,31 @@ Sometimes you need to execute inference having a higher control of what is being
2727
2828
In any of those cases, Batch Deployments let you take control of the output of the jobs by writing directly to the output of the batch deployment job. In this tutorial, we'll see how to deploy a model that performs batch inference and writes the outputs in `parquet` format by appending the predictions to the original input data.
2929

30-
## Prerequisites
31-
32-
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
33-
34-
* A model registered in the workspace. In this tutorial, we'll use an MLflow model. Particularly, we are using the *heart condition classifier* created in the tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md).
35-
* You must have an endpoint already created. If you don't, follow the instructions at [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md). This example assumes the endpoint is named `heart-classifier-batch`.
36-
* You must have a compute created where to deploy the deployment. If you don't, follow the instructions at [Create compute](how-to-use-batch-endpoint.md#create-compute). This example assumes the name of the compute is `cpu-cluster`.
37-
3830
## About this sample
3931

4032
This example shows how you can deploy a model to perform batch inference and customize how your predictions are written in the output. This example uses an MLflow model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. It is integer valued from 0 (no presence) to 1 (presence).
4133

4234
The model has been trained using an `XGBoost` classifier and all the required preprocessing has been packaged as a `scikit-learn` pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.
4335

44-
[!INCLUDE [clone repo & set defaults](../../../includes/machine-learning-cli-prepare.md)]
36+
The information in this article is based on code samples contained in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo and then change directories to `cli/endpoints/batch` if you are using the Azure CLI or `sdk/python/endpoints/batch` if you are using our SDK for Python.
37+
38+
```azurecli
39+
git clone https://github.com/Azure/azureml-examples --depth 1
40+
cd azureml-examples/cli/endpoints/batch
41+
```
4542

4643
### Follow along in Jupyter Notebooks
4744

4845
You can follow along with this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [custom-output-batch.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/custom-output-batch.ipynb).
4946

47+
## Prerequisites
48+
49+
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
50+
51+
* A model registered in the workspace. In this tutorial, we'll use an MLflow model. Specifically, we use the *heart condition classifier* created in the tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md).
52+
* You must have an endpoint already created. If you don't, follow the instructions at [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md). This example assumes the endpoint is named `heart-classifier-batch`.
53+
* You must have a compute cluster already created on which to run the deployment. If you don't have one, follow the instructions at [Create compute](how-to-use-batch-endpoint.md#create-compute). This example assumes the name of the compute is `cpu-cluster`.
54+
5055
## Creating a batch deployment with a custom output
5156

5257
In this example, we are going to create a deployment that can write directly to the output folder of the batch deployment job. The deployment will use this feature to write custom parquet files.
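
To make the idea of appending predictions to the original input data concrete, here is a minimal sketch in plain Python. The function name `append_predictions`, the column name `prediction`, and the sample records are assumptions made for illustration only. Writing the result as `parquet` needs an additional library, so the sketch stops at the in-memory rows.

```python
from typing import Any, Dict, List, Sequence


def append_predictions(
    rows: Sequence[Dict[str, Any]],
    predictions: Sequence[int],
    column: str = "prediction",  # hypothetical column name
) -> List[Dict[str, Any]]:
    """Return copies of the input rows with one prediction added to each."""
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    # Build new dictionaries so the original input rows are left untouched.
    return [{**row, column: pred} for row, pred in zip(rows, predictions)]


# Two made-up patient records and made-up model outputs (0 = no presence, 1 = presence).
patients = [{"age": 63, "chol": 233}, {"age": 41, "chol": 204}]
scored = append_predictions(patients, [1, 0])
```

Each output row keeps every input field and gains one extra field, which is the shape of data the custom output files are meant to hold.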

articles/machine-learning/batch-inference/how-to-image-processing-batch.md

Lines changed: 21 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -17,28 +17,33 @@ ms.custom: devplatv2
1717

1818
[!INCLUDE [ml v2](../../../includes/machine-learning-dev-v2.md)]
1919

20-
Batch Endpoints can be used for processing tabular data, but also any other file type like images. Those deployments are supported in both MLflow and custom models. In this tutorial we will learn how to deploy a model that classifies images according to the ImageNet taxonomy.
20+
Batch Endpoints can be used for processing tabular data, but also any other file type like images. Those deployments are supported in both MLflow and custom models. In this tutorial, we will learn how to deploy a model that classifies images according to the ImageNet taxonomy.
2121

22-
## Prerequisites
23-
24-
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
25-
26-
* You must have an endpoint already created. If you don't please follow the instructions at [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md). This example assumes the endpoint is named `imagenet-classifier-batch`.
27-
* You must have a compute created where to deploy the deployment. If you don't please follow the instructions at [Create compute](how-to-use-batch-endpoint.md#create-compute). This example assumes the name of the compute is `cpu-cluster`.
22+
## About this sample
2823

29-
## About the model used in the sample
30-
31-
The model we are going to work with was built using TensorFlow along with the RestNet architecture ([Identity Mappings in Deep Residual Networks](https://arxiv.org/abs/1603.05027)). This model has the following constrains that are important to keep in mind for deployment:
24+
The model we are going to work with was built using TensorFlow along with the ResNet architecture ([Identity Mappings in Deep Residual Networks](https://arxiv.org/abs/1603.05027)). A sample of this model can be downloaded from `https://azuremlexampledata.blob.core.windows.net/data/imagenet/model.zip`. The model has the following constraints that are important to keep in mind for deployment:
3225

3326
* It works with images of size 224x224 (tensors of `(224, 224, 3)`).
3427
* It requires inputs to be scaled to the range `[0,1]`.
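
The two constraints above can be sketched in plain Python as follows. This is only an illustration under the assumption that images arrive with 8-bit channel values (0 to 255); the helper names are hypothetical, and a real scoring script would normally do this with an array library rather than lists.

```python
from typing import List, Sequence, Tuple

EXPECTED_SHAPE: Tuple[int, int, int] = (224, 224, 3)  # height, width, channels


def scale_to_unit_range(pixels: Sequence[int]) -> List[float]:
    """Map 8-bit pixel values (0-255) to floats in the range [0, 1]."""
    return [value / 255.0 for value in pixels]


def has_expected_shape(shape: Sequence[int]) -> bool:
    """Check an image shape against the (224, 224, 3) input the model expects."""
    return tuple(shape) == EXPECTED_SHAPE


# A few sample channel values: black, a dark gray, and white.
scaled = scale_to_unit_range([0, 51, 255])
```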
3528

36-
A sample of this model can be downloaded from `https://azuremlexampledata.blob.core.windows.net/data/imagenet/model.zip`.
29+
The information in this article is based on code samples contained in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo, and then change directories to `cli/endpoints/batch` if you are using the Azure CLI or `sdk/python/endpoints/batch` if you are using our SDK for Python.
30+
31+
```azurecli
32+
git clone https://github.com/Azure/azureml-examples --depth 1
33+
cd azureml-examples/cli/endpoints/batch
34+
```
3735

3836
### Follow along in Jupyter Notebooks
3937

4038
You can follow along with this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [imagenet-classifier-batch.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/imagenet-classifier-batch.ipynb).
4139

40+
## Prerequisites
41+
42+
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
43+
44+
* You must have a batch endpoint already created. This example assumes the endpoint is named `imagenet-classifier-batch`. If you don't have one, follow the instructions at [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md).
45+
* You must have a compute cluster already created on which to run the deployment. This example assumes the name of the compute is `cpu-cluster`. If you don't have one, follow the instructions at [Create compute](how-to-use-batch-endpoint.md#create-compute).
46+
4247
## Image classification with batch deployments
4348

4449
In this example, we are going to learn how to deploy a deep learning model that can classify a given image according to the [taxonomy of ImageNet](https://image-net.org/).
@@ -61,8 +66,11 @@ Batch Endpoint can only deploy registered models so we need to register it. You
6166
6267
```python
6368
import os
69+
import requests
6470
from zipfile import ZipFile
6571
72+
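response = requests.get('https://azuremlexampledata.blob.core.windows.net/data/imagenet/model.zip', allow_redirects=True)
file = "model.zip"  # local file name chosen for this example
with open(file, "wb") as f:
    f.write(response.content)  # save the downloaded archive so it can be unzipped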
73+
6674
os.makedirs("imagenet-classifier", exist_ok=True)
6775
with ZipFile(file, 'r') as zip:
6876
    model_path = zip.extractall(path="imagenet-classifier")
@@ -88,7 +96,7 @@ Batch Endpoint can only deploy registered models so we need to register it. You
8896
8997
### Creating a scoring script
9098
91-
We need to create a scoring script that can read the images provided by the batch deployment and return the scores of the model. The following script does the following:
99+
We need to create a scoring script that can read the images provided by the batch deployment and return the scores of the model. The following script:
92100
93101
> [!div class="checklist"]
94102
> * Indicates an `init` function that loads the model using the `keras` module in `tensorflow`.
@@ -244,7 +252,7 @@ One the scoring script is created, it's time to create a batch deployment for it
244252
ml_client.batch_deployments.begin_create_or_update(deployment)
245253
```
246254

247-
1. Although you can invoke a specific deployment inside of an endpoint, you will usually want to invoke the endpoint itself and let the endpoint decide which deployment to use. Such deployment is named the "default" deployment. This gives you the possibility of changing the default deployment and hence changing the model serving the deployment without changing the contract with the user invoking the endpoint. Use the following instruction to update the default deployment:
255+
1. Although you can invoke a specific deployment inside of an endpoint, you will usually want to invoke the endpoint itself, and let the endpoint decide which deployment to use. Such a deployment is called the "default" deployment. This lets you change the default deployment - and hence the model serving the deployment - without changing the contract with the user invoking the endpoint. Use the following instruction to update the default deployment:
248256

249257
# [Azure ML CLI](#tab/cli)
250258

articles/machine-learning/batch-inference/how-to-mlflow-batch.md

Lines changed: 12 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -27,24 +27,29 @@ For no-code-deployment, Azure Machine Learning
2727
> [!NOTE]
2828
> For more information about the supported file types in batch endpoints with MLflow, view [Considerations when deploying to batch inference](#considerations-when-deploying-to-batch-inference).
2929
30-
## Prerequisites
31-
32-
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
33-
34-
* You must have a MLflow model. If your model is not in MLflow format and you want to use this feature, you can [convert your custom ML model to MLflow format](../how-to-convert-custom-model-to-mlflow.md).
35-
3630
## About this example
3731

3832
This example shows how you can deploy an MLflow model to a batch endpoint to perform batch predictions. This example uses an MLflow model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. It is integer valued from 0 (no presence) to 1 (presence).
3933

4034
The model has been trained using an `XGBoost` classifier and all the required preprocessing has been packaged as a `scikit-learn` pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.
4135

42-
[!INCLUDE [clone repo & set defaults](../../../includes/machine-learning-cli-prepare.md)]
36+
The information in this article is based on code samples contained in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo and then change directories to `cli/endpoints/batch` if you are using the Azure CLI or `sdk/python/endpoints/batch` if you are using our SDK for Python.
37+
38+
```azurecli
39+
git clone https://github.com/Azure/azureml-examples --depth 1
40+
cd azureml-examples/cli/endpoints/batch
41+
```
4342

4443
### Follow along in Jupyter Notebooks
4544

4645
You can follow along with this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [mlflow-for-batch-tabular.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/mlflow-for-batch-tabular.ipynb).
4746

47+
## Prerequisites
48+
49+
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
50+
51+
* You must have an MLflow model. If your model is not in MLflow format and you want to use this feature, you can [convert your custom ML model to MLflow format](../how-to-convert-custom-model-to-mlflow.md).
52+
4853
## Steps
4954

5055
Follow these steps to deploy an MLflow model to a batch endpoint for running batch inference over new data:

articles/machine-learning/batch-inference/how-to-nlp-processing-batch.md

Lines changed: 21 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -19,37 +19,41 @@ ms.custom: devplatv2
1919

2020
Batch Endpoints can be used for processing tabular data, but also any other file type like text. Those deployments are supported in both MLflow and custom models. In this tutorial, we will learn how to deploy a model that can perform text summarization of long sequences of text using a model from HuggingFace.
2121

22-
## Prerequisites
23-
24-
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
25-
26-
* You must have an endpoint already created. If you don't please follow the instructions at [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md). This example assumes the endpoint is named `text-summarization-batch`.
27-
* You must have a compute created where to deploy the deployment. If you don't please follow the instructions at [Create compute](how-to-use-batch-endpoint.md#create-compute). This example assumes the name of the compute is `cpu-cluster`.
28-
29-
## About the model used in the sample
22+
## About this sample
3023

3124
The model we are going to work with was built using the popular `transformers` library from HuggingFace along with [a pre-trained model from Facebook with the BART architecture](https://huggingface.co/facebook/bart-large-cnn). It was introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation](https://arxiv.org/abs/1910.13461). This model has the following constraints that are important to keep in mind for deployment:
3225

3326
* It can work with sequences up to 1024 tokens.
3427
* It is trained for summarization of text in English.
3528
* We are going to use TensorFlow as a backend.
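
The 1024-token limit above means longer inputs have to be shortened before they reach the model. The sketch below shows the idea only: splitting on whitespace is an assumed stand-in for the model's real tokenizer, which counts tokens differently, and `truncate_tokens` is a hypothetical helper name.

```python
from typing import List

MAX_TOKENS = 1024  # sequence limit stated for the model


def truncate_tokens(tokens: List[str], max_tokens: int = MAX_TOKENS) -> List[str]:
    """Keep at most `max_tokens` tokens from the start of the sequence."""
    return tokens[:max_tokens]


# Whitespace splitting is only a stand-in for the model's real tokenizer.
long_text = "word " * 2000
tokens = truncate_tokens(long_text.split())
```

In practice, the truncation would be applied with the model's own tokenizer so that the count matches what the model actually sees.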
3629

37-
Due to the size of the model, it hasn't been included in this repository. Instead, you can generate a local copy using:
38-
39-
```python
40-
from transformers import pipeline
30+
The information in this article is based on code samples contained in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo and then change directories to `cli/endpoints/batch` if you are using the Azure CLI or `sdk/python/endpoints/batch` if you are using our SDK for Python.
4131

42-
model = pipeline("summarization", model="facebook/bart-large-cnn")
43-
model_local_path = 'bart-text-summarization/model'
44-
summarizer.save_pretrained(model_local_path)
32+
```azurecli
33+
git clone https://github.com/Azure/azureml-examples --depth 1
34+
cd azureml-examples/cli/endpoints/batch
4535
```
4636

47-
A local copy of the model will be placed at `bart-text-summarization/model`. We will use it during the course of this tutorial.
48-
4937
### Follow along in Jupyter Notebooks
5038

5139
You can follow along with this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [text-summarization-batch.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/text-summarization-batch.ipynb).
5240

41+
## Prerequisites
42+
43+
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
44+
45+
* You must have an endpoint already created. If you don't have one, follow the instructions at [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md). This example assumes the endpoint is named `text-summarization-batch`.
46+
* You must have a compute cluster already created on which to run the deployment. If you don't have one, follow the instructions at [Create compute](how-to-use-batch-endpoint.md#create-compute). This example assumes the name of the compute is `cpu-cluster`.
47+
* Due to the size of the model, it hasn't been included in this repository. Instead, you can generate a local copy with the following code. A local copy of the model will be placed at `bart-text-summarization/model`. We will use it during the course of this tutorial.
48+
49+
```python
50+
from transformers import pipeline
51+
52+
model = pipeline("summarization", model="facebook/bart-large-cnn")
53+
model_local_path = 'bart-text-summarization/model'
54+
model.save_pretrained(model_local_path)
55+
```
56+
5357
## NLP tasks with batch deployments
5458

5559
In this example, we are going to learn how to deploy a deep learning model based on the BART architecture that can perform text summarization over text in English. The text will be placed in CSV files for convenience.

articles/machine-learning/batch-inference/how-to-use-batch-endpoint.md

Lines changed: 49 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -33,12 +33,7 @@ In this article, you will learn how to use batch endpoints to do batch scoring.
3333
> [!TIP]
3434
> We suggest reading the Scenarios sections (see the navigation bar at the left) to learn more about how to use Batch Endpoints in specific scenarios, including NLP and computer vision, or how to integrate them with other Azure services.
3535
36-
## Prerequisites
37-
38-
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
39-
40-
41-
### About this example
36+
## About this example
4237

4338
In this example, we are going to deploy a model to solve the classic MNIST ("Modified National Institute of Standards and Technology") digit recognition problem to perform batch inferencing over large amounts of data (image files). In the first section of this tutorial, we are going to create a batch deployment with a model created using Torch. This deployment will become our default one in the endpoint. In the second half, [we are going to see how we can create a second deployment](#adding-deployments-to-an-endpoint) using a model created with TensorFlow (Keras), test it out, and then switch the endpoint to start using the new deployment as default.
4439

@@ -49,6 +44,54 @@ git clone https://github.com/Azure/azureml-examples --depth 1
4944
cd azureml-examples/cli/endpoints/batch
5045
```
5146

47+
### Follow along in Jupyter Notebooks
48+
49+
You can follow along with this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [mnist-batch.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/mnist-batch.ipynb).
50+
51+
## Prerequisites
52+
53+
[!INCLUDE [basic cli prereqs](../../../includes/machine-learning-cli-prereqs.md)]
54+
55+
### Connect to your workspace
56+
57+
First, let's connect to the Azure Machine Learning workspace where we are going to work.
58+
59+
# [Azure ML CLI](#tab/cli)
60+
61+
```azurecli
62+
az account set --subscription <subscription>
63+
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>
64+
```
65+
66+
# [Azure ML SDK for Python](#tab/sdk)
67+
68+
The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, we'll connect to the workspace in which you'll perform deployment tasks.
69+
70+
1. Import the required libraries:
71+
72+
```python
73+
from azure.ai.ml import MLClient, Input
74+
from azure.ai.ml.entities import BatchEndpoint, BatchDeployment, Model, AmlCompute, Data, BatchRetrySettings
75+
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
76+
from azure.identity import DefaultAzureCredential
77+
```
78+
79+
2. Configure workspace details and get a handle to the workspace:
80+
81+
```python
82+
subscription_id = "<subscription>"
83+
resource_group = "<resource-group>"
84+
workspace = "<workspace>"
85+
86+
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
87+
```
88+
89+
# [studio](#tab/studio)
90+
91+
Open the [Azure ML studio portal](https://ml.azure.com) and log in using your credentials.
92+
93+
---
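
If you prefer not to hard-code the workspace details used in the SDK tab, one option is to read them from environment variables. The following is a minimal sketch of that approach; the variable names and the `workspace_settings` helper are assumptions for illustration and are not defined by Azure Machine Learning.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names; pick whatever fits your environment.
REQUIRED_VARS = ("AZURE_SUBSCRIPTION_ID", "AZURE_RESOURCE_GROUP", "AZUREML_WORKSPACE")


def workspace_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the three workspace identifiers, failing early if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "subscription_id": env["AZURE_SUBSCRIPTION_ID"],
        "resource_group": env["AZURE_RESOURCE_GROUP"],
        "workspace": env["AZUREML_WORKSPACE"],
    }


# Example with an explicit mapping so the sketch runs without real credentials.
settings = workspace_settings(
    {
        "AZURE_SUBSCRIPTION_ID": "<subscription>",
        "AZURE_RESOURCE_GROUP": "<resource-group>",
        "AZUREML_WORKSPACE": "<workspace>",
    }
)
```

The resulting values can then be passed to `MLClient` in the same order as in the SDK snippet: credential, subscription, resource group, workspace.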
94+
5295
### Create compute
5396

5497
Batch endpoints run on compute clusters. They support both [Azure Machine Learning Compute clusters (AmlCompute)](../how-to-create-attach-compute-cluster.md) and [Kubernetes clusters](../how-to-attach-kubernetes-anywhere.md). Clusters are a shared resource, so one cluster can host one or many batch deployments (along with other workloads if desired).
