Commit 98dcd18

Author: Larry O'Brien
Pipeline batch scoring classification updates per Luis' review.
1 parent 7394339 commit 98dcd18

File tree

1 file changed: +10 −9 lines


articles/machine-learning/tutorial-pipeline-batch-scoring-classification.md

Lines changed: 10 additions & 9 deletions
@@ -1,7 +1,7 @@
 ---
 title: 'Tutorial: ML pipelines for batch scoring'
 titleSuffix: Azure Machine Learning
-description: In this tutorial, you build a machine learning pipeline for running batch scoring on an image classification model in Azure Machine Learning. Machine learning pipelines optimize your workflow with speed, portability, and reuse, so you can focus on your expertise - machine learning - instead of on infrastructure and automation.
+description: In this tutorial, you build a machine learning pipeline to perform batch scoring on an image classification model. Azure Machine Learning allows you to focus on machine learning instead of infrastructure and automation.
 services: machine-learning
 ms.service: machine-learning
 ms.subservice: core
@@ -16,15 +16,16 @@ ms.date: 02/10/2020
 
 [!INCLUDE [applies-to-skus](../../includes/aml-applies-to-basic-enterprise-sku.md)]
 
-In this tutorial, you use a pipeline in Azure Machine Learning to run a batch scoring job. The example uses the pretrained [Inception-V3](https://arxiv.org/abs/1512.00567) convolutional neural network Tensorflow model to classify unlabeled images. After you build and publish a pipeline, you configure a REST endpoint that you can use to trigger the pipeline from any HTTP library on any platform.
+Learn how to build a pipeline in Azure Machine Learning to run a batch scoring job. Machine learning pipelines optimize your workflow with speed, portability, and reuse, so you can focus on machine learning instead of infrastructure and automation. After you build and publish a pipeline, you configure a REST endpoint that you can use to trigger the pipeline from any HTTP library on any platform.
 
-Machine learning pipelines optimize your workflow with speed, portability, and reuse, so you can focus on your expertise - machine learning - instead of on infrastructure and automation. [Learn more about machine learning pipelines](concept-ml-pipelines.md).
+The example uses a pretrained [Inception-V3](https://arxiv.org/abs/1512.00567) convolutional neural network model implemented in TensorFlow to classify unlabeled images. [Learn more about machine learning pipelines](concept-ml-pipelines.md).
 
 In this tutorial, you complete the following tasks:
 
 > [!div class="checklist"]
-> * Configure workspace and download sample data
-> * Create data objects to fetch and output data
+> * Configure workspace
+> * Download and store sample data
+> * Create dataset objects to fetch and output data
 > * Download, prepare, and register the model in your workspace
 > * Provision compute targets and create a scoring script
 > * Use the `ParallelRunStep` class for async batch scoring
@@ -52,7 +53,7 @@ from azureml.core import Workspace
 ws = Workspace.from_config()
 ```
 
-### Create a datastore for sample images
+## Create a datastore for sample images
 
 On the `pipelinedata` account, get the ImageNet evaluation public data sample from the `sampledata` public blob container. Call `register_azure_blob_container()` to make the data available to the workspace under the name `images_datastore`. Then, set the workspace default datastore as the output datastore. Use the output datastore to score output in the pipeline.
 
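The registration step this paragraph describes can be sketched as follows; the datastore, container, and account names come from the article, while the exact keyword arguments are a sketch based on the azureml-core SDK and the snippet assumes a configured workspace:

```python
from azureml.core import Workspace
from azureml.core.datastore import Datastore

# Assumes a config.json for an existing workspace is available locally.
ws = Workspace.from_config()

# Register the public blob container holding the ImageNet sample
# under the name images_datastore (names taken from the article).
batchscore_blob = Datastore.register_azure_blob_container(
    ws,
    datastore_name="images_datastore",
    container_name="sampledata",
    account_name="pipelinedata",
    overwrite=True,
)

# The workspace default datastore will receive the scoring output.
def_data_store = ws.get_default_datastore()
```

Running this requires live Azure credentials, so treat it as a template rather than a copy-paste snippet.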
@@ -68,7 +69,7 @@ batchscore_blob = Datastore.register_azure_blob_container(ws,
 def_data_store = ws.get_default_datastore()
 ```
 
-## Create data objects
+## Create dataset objects
 
 When building pipelines, `Dataset` objects are used for reading data from workspace datastores, and `PipelineData` objects are used for transferring intermediate data between pipeline steps.
 
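A minimal sketch of the two object kinds this paragraph contrasts, assuming the `batchscore_blob` datastore and `def_data_store` variables from the previous step; the datastore-relative path is illustrative, not taken from the article:

```python
from azureml.core.dataset import Dataset
from azureml.pipeline.core import PipelineData

# Input: a file dataset that reads unlabeled images from the
# registered datastore (the relative path here is an assumption).
input_images = Dataset.File.from_files((batchscore_blob, "batchscoring/images"))

# Output: intermediate data the scoring step writes to the
# workspace default datastore for use by later steps or download.
output_dir = PipelineData(name="scores", datastore=def_data_store)
```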
@@ -254,7 +255,7 @@ def run(mini_batch):
 > [!TIP]
 > The pipeline in this tutorial has only one step, and it writes the output to a file. For multi-step pipelines, you also use `ArgumentParser` to define a directory to write output data for input to subsequent steps. For an example of passing data between multiple pipeline steps by using the `ArgumentParser` design pattern, see the [notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/nyc-taxi-data-regression-model-building/nyc-taxi-data-regression-model-building.ipynb).
 
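The `ArgumentParser` pattern the tip describes can be sketched with nothing but the standard library: the step script parses an output-directory argument and writes its results there, so a later step can point its input at the same location. The argument name and file contents here are illustrative:

```python
import argparse
import os

# Parse a hypothetical --output_dir argument, as a pipeline step script would.
parser = argparse.ArgumentParser()
parser.add_argument("--output_dir", type=str, help="where this step writes its output")
args = parser.parse_args(["--output_dir", "step1_output"])  # example args for illustration

# Write this step's results; a subsequent step would read from this directory.
os.makedirs(args.output_dir, exist_ok=True)
with open(os.path.join(args.output_dir, "results.txt"), "w") as f:
    f.write("label_0,label_1\n")
```

In a real pipeline the orchestrator supplies the directory value at runtime instead of the hard-coded example list.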
-## Build and run the pipeline
+## Build the pipeline
 
 Before you run the pipeline, create an object that defines the Python environment and creates the dependencies that your `batch_scoring.py` script requires. The main dependency required is Tensorflow, but you also install `azureml-defaults` for background processes. Create a `RunConfiguration` object by using the dependencies. Also, specify Docker and Docker-GPU support.
 
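The environment setup described above can be sketched like this with the azureml-core SDK; the TensorFlow version pin is an assumption and should match your compute target:

```python
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.runconfig import DEFAULT_GPU_IMAGE, RunConfiguration

# TensorFlow for scoring, plus azureml-defaults for background processes.
# The pinned version is illustrative, not from the article.
cd = CondaDependencies.create(
    pip_packages=["tensorflow-gpu==1.15.2", "azureml-defaults"]
)

# Build the run configuration with Docker and GPU support.
amlcompute_run_config = RunConfiguration(conda_dependencies=cd)
amlcompute_run_config.environment.docker.enabled = True
amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE
```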
@@ -319,7 +320,7 @@ batch_score_step = ParallelRunStep(
 
 For a list of all the classes you can use for different step types, see the [steps package](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps?view=azure-ml-py).
 
-### Run the pipeline
+## Run the pipeline
 
 Now, run the pipeline. First, create a `Pipeline` object by using your workspace reference and the pipeline step you created. The `steps` parameter is an array of steps. In this case, there's only one step for batch scoring. To build pipelines that have multiple steps, place the steps in order in this array.
 
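The construction and submission this paragraph describes can be sketched as follows, assuming the `ws` workspace and `batch_score_step` from the preceding steps; the experiment name is illustrative:

```python
from azureml.core import Experiment
from azureml.pipeline.core import Pipeline

# Assemble the single-step pipeline; to add steps later,
# append them to this array in execution order.
pipeline = Pipeline(workspace=ws, steps=[batch_score_step])

# Submit the pipeline as an experiment run and stream its logs.
pipeline_run = Experiment(ws, "batch_scoring").submit(pipeline)
pipeline_run.wait_for_completion(show_output=True)
```

This submits against a live workspace, so it is a template rather than a locally runnable snippet.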