Commit 09889b4

Merge pull request #100943 from likebupt/update-batch-prediction-article

update batch prediction article

2 parents 5e84618 + 8e9a2df

23 files changed (+84 −48 lines)

articles/machine-learning/how-to-run-batch-predictions-designer.md
84 additions & 48 deletions
@@ -5,101 +5,137 @@ description: Learn how to train a model and set up a batch prediction pipeline u
 services: machine-learning
 ms.service: machine-learning
 ms.subservice: core
-ms.topic: tutorial
-ms.reviewer: trbye
-ms.author: trbye
-author: trevorbye
-ms.date: 11/19/2019
+ms.topic: how-to
+ms.author: peterlu
+author: peterclu
+ms.date: 01/13/2020
 ms.custom: Ignite2019
 ---

 # Run batch predictions using Azure Machine Learning designer
 [!INCLUDE [applies-to-skus](../../includes/aml-applies-to-basic-enterprise-sku.md)]

-In this how-to, you learn how to use the designer to train a model and set up a batch prediction pipeline and web service. Batch prediction allows for continuous and on-demand scoring of trained models on large data sets, optionally configured as a web service that can be triggered from any HTTP library.
+In this article, you learn how to use the designer to create a batch prediction pipeline. Batch prediction lets you continuously score large datasets on demand using a web service that can be triggered from any HTTP library.

-For setting up batch scoring services using the SDK, see the accompanying [how-to](how-to-use-parallel-run-step.md).
-
-In this how-to, you learn the following tasks:
+In this how-to, you learn to do the following tasks:

 > [!div class="checklist"]
-> * Create a basic ML experiment in a pipeline
-> * Create a parameterized batch inference pipeline
-> * Manage and run pipelines manually or from a REST endpoint
+> * Create and publish a batch inference pipeline
+> * Consume a pipeline endpoint
+> * Manage endpoint versions
+
+To learn how to set up batch scoring services using the SDK, see the accompanying [how-to](how-to-run-batch-predictions.md).

 ## Prerequisites

-1. If you don’t have an Azure subscription, create a free account before you begin. Try the [free or paid version of the Azure Machine Learning](https://aka.ms/AMLFree).
+This how-to assumes you already have a training pipeline. For a guided introduction to the designer, complete [part one of the designer tutorial](tutorial-designer-automobile-price-train-score.md).

-1. Create a [workspace](tutorial-1st-experiment-sdk-setup.md).
+## Create a batch inference pipeline

-1. Sign in to [Azure Machine Learning studio](https://ml.azure.com/).
+Your training pipeline must be run at least once before you can create an inferencing pipeline.

-This how-to assumes basic knowledge of building a simple pipeline in the designer. For a guided introduction to the designer, complete the [tutorial](tutorial-designer-automobile-price-train-score.md).
+1. Go to the **Designer** tab in your workspace.

-## Create a pipeline
+1. Select the training pipeline that trains the model you want to use to make predictions.

-To create a batch inference pipeline, you first need a machine learning experiment. To create one, navigate to the **Designer** tab in your workspace and create a new pipeline by selecting the **Easy-to-use prebuilt modules** option.
+1. **Run** the pipeline.

-![Designer home](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-1.png)
+![Run the pipeline](./media/how-to-run-batch-predictions-designer/run-training-pipeline.png)

-The following is a simple machine learning model for demonstration purposes. The data is a registered Dataset created from the Azure Open Datasets diabetes data. See the [how-to section](how-to-create-register-datasets.md#create-datasets-with-azure-open-datasets) for registering Datasets from Azure Open Datasets. The data is split into training and validation sets, and a boosted decision tree is trained and scored. The pipeline must be run at least once to be able to create an inferencing pipeline. Click the **Run** button to run the pipeline.
+Now that the training pipeline has been run, you can create a batch inference pipeline.

-![Create simple experiment](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-2.png)
+1. Next to **Run**, select the new dropdown **Create inference pipeline**.

-## Create a batch inference pipeline
+1. Select **Batch inference pipeline**.
+
+![Create batch inference pipeline](./media/how-to-run-batch-predictions-designer/create-batch-inference.png)
+
+The result is a default batch inference pipeline.
+
+### Add a pipeline parameter
+
+To create predictions on new data, you can either manually connect a different dataset in this pipeline draft view or create a parameter for your dataset. Parameters let you change the behavior of the batch inferencing process at runtime.
+
+In this section, you create a dataset parameter to specify a different dataset to make predictions on.
+
+1. Select the dataset module.
+
+1. A pane will appear to the right of the canvas. At the bottom of the pane, select **Set as pipeline parameter**.
+
+   Enter a name for the parameter, or accept the default value.
+
+## Publish your batch inferencing pipeline
+
+Now you're ready to deploy the inferencing pipeline. This will deploy the pipeline and make it available for others to use.
+
+1. Select the **Publish** button.
+
+1. In the dialog that appears, expand the drop-down for **PipelineEndpoint**, and select **New PipelineEndpoint**.
+
+1. Provide an endpoint name and optional description.
+
+   Near the bottom of the dialog, you can see the parameter you configured with a default value of the dataset ID used during training.
+
+1. Select **Publish**.
+
+![Publish a pipeline](./media/how-to-run-batch-predictions-designer/publish-inference-pipeline.png)

-Now that the pipeline has been run, there is a new option available next to **Run** and **Publish** called **Create inference pipeline**. Click the dropdown and select **Batch inference pipeline**.

-![Create batch inference pipeline](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-5.png)
+## Consume an endpoint

-The result is a default batch inference pipeline. This includes a node for your original pipeline experiment setup, a node for raw data for scoring, and a node to score the raw data against your original pipeline.
+Now you have a published pipeline with a dataset parameter. The pipeline will use the trained model created in the training pipeline to score the dataset you provide as a parameter.

-![Default batch inference pipeline](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-6.png)
+### Submit a pipeline run

-You can add other nodes to change the behavior of the batch inferencing process. In this example, you add a node for randomly sampling from the input data before scoring. Create a **Partition and Sample** node and place it between the raw data and scoring nodes. Next, click on the **Partition and Sample** node to gain access to the settings and parameters.
+In this section, you will set up a manual pipeline run and alter the pipeline parameter to score new data.

-![New node](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-7.png)
+1. After the deployment is complete, go to the **Endpoints** section.

-The *Rate of sampling* parameter controls what percent of the original data set to take a random sample from. This is a parameter that will be useful to adjust frequently, so you enable it as a pipeline parameter. Pipeline parameters can be changed at runtime, and can be specified in a payload object when rerunning the pipeline from a REST endpoint.
+1. Select **Pipeline endpoints**.

-To enable this field as a pipeline parameter, click the ellipses above the field and then click **Add to pipeline parameter**.
+1. Select the name of the endpoint you created.

-![Sample settings](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-8.png)
+![Endpoint link](./media/how-to-run-batch-predictions-designer/manage-endpoints.png)

-Next, give the parameter a name and default value. The name will be used to identify the parameter, and specify it in a REST call.
+1. Select **Published pipelines**.

-![Pipeline parameter](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-9.png)
+   This screen shows all pipelines published under this endpoint.

-## Deploy batch inferencing pipeline
+1. Select the pipeline you published.

-Now you are ready to deploy the pipeline. Click the **Deploy** button, which opens the interface to set up an endpoint. Click the dropdown and select **New PipelineEndpoint**.
+   The pipeline details page shows you a detailed run history and connection string information for your pipeline.
+
+1. Select **Run** to create a manual run of the pipeline.

-![Pipeline deploy](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-10.png)
+![Pipeline details](./media/how-to-run-batch-predictions-designer/submit-manual-run.png)
+
+1. Change the parameter to use a different dataset.
+
+1. Select **Run** to run the pipeline.

-Give the endpoint a name and optional description. Near the bottom you see the `sample-rate` parameter you configured with a default value of 0.8. When you're ready, click **Deploy**.
+### Use the REST endpoint

-![Setup endpoint](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-11.png)
+You can find information on how to consume pipeline endpoints and published pipelines in the **Endpoints** section.

-## Manage endpoints
+You can find the REST endpoint of a pipeline endpoint in the run overview panel. By calling the endpoint, you are consuming its default published pipeline.

-After deployment is complete, go to the **Endpoints** tab and click the name of the endpoint you just created.
+You can also consume a published pipeline on the **Published pipelines** page. Select a published pipeline to find its REST endpoint.

-![Endpoint link](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-12.png)
+![REST endpoint details](./media/how-to-run-batch-predictions-designer/rest-endpoint-details.png)

-This screen shows all published pipelines under the specific endpoint. Click on your inferencing pipeline.
+To make a REST call, you need an OAuth 2.0 bearer-type authentication header. See the following [tutorial section](tutorial-pipeline-batch-scoring-classification.md#publish-and-run-from-a-rest-endpoint) for more detail on setting up authentication to your workspace and making a parameterized REST call.
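The parameterized REST call described above can be sketched in Python using only the standard library. This is a minimal sketch, not the article's own sample: the endpoint URL, token, experiment name, and the `input-dataset` parameter name are placeholders for the values shown on your endpoint's details page.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, auth_token, experiment_name, parameters):
    """Build a POST request that reruns a published pipeline.

    `parameters` maps pipeline parameter names (such as the dataset
    parameter created earlier) to the values to score with.
    """
    body = {
        "ExperimentName": experiment_name,
        "ParameterAssignments": parameters,
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",  # OAuth 2.0 bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values -- copy the real ones from the endpoint details page.
request = build_scoring_request(
    "https://example-region.api.azureml.ms/pipelines/v1.0/endpoint-id",
    "my-aad-token",
    "designer-batch-scoring",
    {"input-dataset": "my-dataset-id"},
)
# urllib.request.urlopen(request) would submit the run; it isn't executed here.
```

The `ParameterAssignments` object is where the dataset parameter override goes; the exact payload for your endpoint is shown on its consume page.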

-![Inference pipeline](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-13.png)
+## Versioning endpoints

-The pipeline details page shows you detailed run history and connection string information for your pipeline. Click the **Run** button to create a manual run of the pipeline.
+The designer assigns a version to each subsequent pipeline that you publish to an endpoint. You can specify the pipeline version that you want to execute as a parameter in your REST call. If you don't specify a version number, the designer will use the default pipeline.

-![Pipeline details](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-14.png)
+When you publish a pipeline, you can choose to make it the new default pipeline for that endpoint.

-In the run setup, you can provide a description for the run, and change the value for any pipeline parameters. This time, rerun the inferencing pipeline with a sample rate of 0.9. Click **Run** to run the pipeline.
+![Set default pipeline](./media/how-to-run-batch-predictions-designer/set-default-pipeline.png)

-![Pipeline run](./media/how-to-run-batch-predictions-designer/designer-batch-scoring-15.png)
+You can also set a new default pipeline in the **Published pipelines** tab of your endpoint.

-The **Consume** tab contains the REST endpoint for rerunning your pipeline. To make a rest call, you will need an OAuth 2.0 bearer-type authentication header. See the following [tutorial section](tutorial-pipeline-batch-scoring-classification.md#publish-and-run-from-a-rest-endpoint) for more detail on setting up authentication to your workspace and making a parameterized REST call.
+![Set new default pipeline](./media/how-to-run-batch-predictions-designer/set-new-default-pipeline.png)

 ## Next steps
