
Commit 99f87bb

edits

1 parent a93d970 commit 99f87bb

File tree

1 file changed: +43 −43 lines changed


articles/machine-learning/how-to-create-component-pipeline-python.md

Lines changed: 43 additions & 43 deletions
@@ -1,5 +1,5 @@
 ---
-title: 'Create and run machine learning pipelines using components with the Azure Machine Learning SDK v2'
+title: 'Create and Run Machine Learning Pipelines Using Components with the Machine Learning SDK v2'
 titleSuffix: Azure Machine Learning
 description: Build a machine learning pipeline for image classification. Focus on machine learning instead of infrastructure and automation.
 ms.service: azure-machine-learning
@@ -15,38 +15,40 @@ ms.custom:
 - build-2023
 - ignite-2023
 - update-code
+
+#customer intent: As a machine learning engineer, I want to create a component-based machine learning pipeline so that I can take advantage of the flexibility and reuse provided by components.
 ---

-# Create and run machine learning pipelines using components with the Azure Machine Learning SDK v2
+# Create and run machine learning pipelines by using components with the Machine Learning SDK v2

 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]

-In this article, you learn how to build an [Azure Machine Learning pipeline](concept-ml-pipelines.md) using Python SDK v2 to complete an image classification task containing three steps: prepare data, train an image classification model, and score the model. Machine learning pipelines optimize your workflow with speed, portability, and reuse, so you can focus on machine learning instead of infrastructure and automation.
+In this article, you learn how to build an [Azure Machine Learning pipeline](concept-ml-pipelines.md) by using the Azure Machine Learning Python SDK v2 to complete an image classification task that contains three steps: prepare data, train an image classification model, and score the model. Machine Learning pipelines optimize your workflow with speed, portability, and reuse, so you can focus on machine learning instead of infrastructure and automation.

-The example trains a small [Keras](https://keras.io/) convolutional neural network to classify images in the [Fashion MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset. The pipeline looks like following.
+The example pipeline trains a small [Keras](https://keras.io/) convolutional neural network to classify images in the [Fashion MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset. The pipeline looks like this:

-:::image type="content" source="./media/how-to-create-component-pipeline-python/pipeline-graph.png" alt-text="Screenshot showing pipeline graph of the image classification Keras example." lightbox ="./media/how-to-create-component-pipeline-python/pipeline-graph.png":::
+:::image type="content" source="./media/how-to-create-component-pipeline-python/pipeline-graph.png" alt-text="Screenshot showing a pipeline graph of the image classification example." lightbox="./media/how-to-create-component-pipeline-python/pipeline-graph.png":::
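
For a sense of what the training step builds, here's a minimal sketch of a small Keras convolutional network for Fashion MNIST. The real architecture lives in the training component's script, so the layer choices below are illustrative assumptions:

```python
# Illustrative sketch only; the pipeline's training component defines the
# real architecture. Layer sizes here are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(28, 28, 1)),  # 28 x 28 grayscale Fashion-MNIST images
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),  # 10 clothing classes
    ]
)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```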

 In this article, you complete the following tasks:

 > [!div class="checklist"]
 > * Prepare input data for the pipeline job
-> * Create three components to prepare the data, train and score
-> * Compose a Pipeline from the components
-> * Get access to workspace with compute
+> * Create three components to prepare the data, train an image classification model, and score the model
+> * Build a pipeline from the components
+> * Get access to a workspace with compute
 > * Submit the pipeline job
 > * Review the output of the components and the trained neural network
-> * (Optional) Register the component for further reuse and sharing within workspace
+> * (Optional) Register the component for further reuse and sharing within the workspace

 If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/) today.

 ## Prerequisites

-* Azure Machine Learning workspace - if you don't have one, complete the [Create resources tutorial](quickstart-create-resources.md).
-* A Python environment in which you've installed Azure Machine Learning Python SDK v2 - [install instructions](https://github.com/Azure/azureml-examples/tree/sdk-preview/sdk#getting-started) - check the getting started section. This environment is for defining and controlling your Azure Machine Learning resources and is separate from the environment used at runtime for training.
-* Clone examples repository
+* An Azure Machine Learning workspace. If you don't have one, complete the [Create resources tutorial](quickstart-create-resources.md).
+* A Python environment in which you've installed Azure Machine Learning Python SDK v2. For installation instructions, see [Getting started](https://github.com/Azure/azureml-examples/tree/sdk-preview/sdk#getting-started). This environment is for defining and controlling your Azure Machine Learning resources and is separate from the environment that's used at runtime for training.
+* A clone of the examples repository.

-To run the training examples, first clone the examples repository and change into the `sdk` directory:
+To run the training examples, first clone the examples repository and go to the `sdk` directory:

 ```bash
 git clone --depth 1 https://github.com/Azure/azureml-examples
@@ -55,78 +57,76 @@ If you don't have an Azure subscription, create a free account before you begin.

 ## Start an interactive Python session

-This article uses the Python SDK for Azure Machine Learning to create and control an Azure Machine Learning pipeline. The article assumes that you'll be running the code snippets interactively in either a Python REPL environment or a Jupyter notebook.
+This article uses the Azure Machine Learning Python SDK to create and control an Azure Machine Learning pipeline. It assumes that you'll run the code snippets interactively in either a Python REPL environment or a Jupyter notebook.

-This article is based on the [image_classification_keras_minist_convnet.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet/image_classification_keras_minist_convnet.ipynb) notebook found in the `sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet` directory of the [Azure Machine Learning Examples](https://github.com/azure/azureml-examples) repository.
+This article is based on the [image_classification_keras_minist_convnet.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet/image_classification_keras_minist_convnet.ipynb) notebook, which you can find in the `sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet` directory of the [Azure Machine Learning examples](https://github.com/azure/azureml-examples) repository.

 ## Import required libraries

-Import all the Azure Machine Learning required libraries that you'll need for this article:
+Import all the Azure Machine Learning libraries that you need for this article:

 [!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet/image_classification_keras_minist_convnet.ipynb?name=required-library)]
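
If you're not running the notebook, the referenced cell pulls in roughly the following. Treat this as a sketch; the notebook's actual import list may differ slightly:

```python
# A sketch of typical SDK v2 imports for this walkthrough; the notebook's
# actual cell may differ slightly.
from azure.ai.ml import MLClient, Input
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential
```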

 ## Prepare input data for your pipeline job

-You need to prepare the input data for this image classification pipeline.
-
-Fashion-MNIST is a dataset of fashion images divided into 10 classes. Each image is a 28x28 grayscale image and there are 60,000 training and 10,000 test images. As an image classification problem, Fashion-MNIST is harder than the classic MNIST handwritten digit database. It's distributed in the same compressed binary form as the original [handwritten digit database](http://yann.lecun.com/exdb/mnist/).
+You need to prepare the input data for the image classification pipeline.

-Import all the Azure Machine Learning required libraries that you'll need.
+Fashion-MNIST is a dataset of fashion images divided into 10 classes. Each image is a 28 x 28 grayscale image. There are 60,000 training images and 10,000 test images. As an image classification problem, Fashion-MNIST is more challenging than the classic MNIST handwritten digit database. It's distributed in the same compressed binary form as the original [handwritten digit database](http://yann.lecun.com/exdb/mnist/).

 By defining an `Input`, you create a reference to the data source location. The data remains in its existing location, so no extra storage cost is incurred.
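
For example, here's a minimal sketch of such an `Input`, assuming a hypothetical web location for the Fashion MNIST files:

```python
# Sketch: reference the Fashion MNIST data where it already lives. The
# path below is a hypothetical placeholder, not the sample's actual URL.
from azure.ai.ml import Input

fashion_ds = Input(
    type="uri_folder",
    path="https://<storage-account>.blob.core.windows.net/<container>/mnist-fashion/",
)
```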

-## Create components for building pipeline
+## Create components for building the pipeline

-The image classification task can be split into three steps: prepare data, train model and score model.
+The image classification task can be split into three steps: prepare data, train the model, and score the model.

-[Azure Machine Learning component](concept-component.md) is a self-contained piece of code that does one step in a machine learning pipeline. In this article, you'll create three components for the image classification task:
+An [Azure Machine Learning component](concept-component.md) is a self-contained piece of code that completes one step in a machine learning pipeline. In this article, you create three components for the image classification task:

-* Prepare data for training and test
-* Train a neural network for image classification using training data
-* Score the model using test data
+* Prepare data for training and testing.
+* Train a neural network for image classification by using training data.
+* Score the model by using test data.

-For each component, you need to prepare the following:
+For each component, you need to complete these steps:

-1. Prepare the Python script containing the execution logic
+1. Prepare the Python script that contains the execution logic.

-1. Define the interface of the component
+1. Define the interface of the component.

-1. Add other metadata of the component, including run-time environment, command to run the component, and etc.
+1. Add other metadata of the component, including the runtime environment and the command to run the component.

-The next section will show the create components in two different ways: the first two components using Python function and the third component using YAML definition.
+The next section shows how to create the components in two ways. For the first two components, you use a Python function. For the third component, you use a YAML definition.

 ### Create the data-preparation component

-The first component in this pipeline will convert the compressed data files of `fashion_ds` into two csv files, one for training and the other for scoring. You'll use Python function to define this component.
+The first component in this pipeline converts the compressed data files of `fashion_ds` into two .csv files, one for training and the other for scoring. You use a Python function to define this component.

-If you're following along with the example in the [Azure Machine Learning examples repo](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet), the source files are already available in `prep/` folder. This folder contains two files to construct the component: `prep_component.py`, which defines the component and `conda.yaml`, which defines the run-time environment of the component.
+If you're following along with the example in the [Azure Machine Learning examples repo](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet), the source files are already available in the `prep/` folder. This folder contains two files to construct the component: `prep_component.py`, which defines the component, and `conda.yaml`, which defines the runtime environment of the component.

-#### Define component using Python function
+#### Define component by using a Python function

-By using `command_component()` function as a decorator, you can easily define the component's interface, metadata and code to execute from a Python function. Each decorated Python function will be transformed into a single static specification (YAML) that the pipeline service can process.
+By using the `command_component()` function as a decorator, you can easily define the component's interface, its metadata, and the code to run from a Python function. Each decorated Python function is transformed into a single static specification (YAML) that the pipeline service can process.
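
Before the full source (included next), here's the shape such a decorated function takes. The argument values below are illustrative assumptions; `prep_component.py` is the authoritative version:

```python
# Sketch of a decorated component function; values are illustrative. The
# authoritative definition is in prep_component.py.
from mldesigner import command_component, Input, Output

@command_component(
    name="prep_data",
    version="1",
    display_name="Prep Data",
    description="Convert data to CSV files and split into training and test data.",
    environment=dict(
        conda_file="./conda.yaml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",  # image tag is an assumption
    ),
)
def prepare_data_component(
    input_data: Input(type="uri_folder"),
    training_data: Output(type="uri_folder"),
    test_data: Output(type="uri_folder"),
):
    # Execution logic: read the compressed files from input_data and write
    # CSV files to training_data and test_data.
    ...
```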

 :::code language="python" source="~/azureml-examples-main/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet/prep/prep_component.py":::

-The code above define a component with display name `Prep Data` using `@command_component` decorator:
+The preceding code defines a component with the display name `Prep Data` by using the `@command_component` decorator:

 * `name` is the unique identifier of the component.
 * `version` is the current version of the component. A component can have multiple versions.
-* `display_name` is a friendly display name of the component in UI, which isn't unique.
-* `description` usually describes what task this component can complete.
-* `environment` specifies the run-time environment for this component. The environment of this component specifies a docker image and refers to the `conda.yaml` file.
+* `display_name` is a friendly display name of the component in the UI. It isn't unique.
+* `description` usually describes the task the component can complete.
+* `environment` specifies the runtime environment for the component. The environment of this component specifies a Docker image and refers to the `conda.yaml` file.

-The `conda.yaml` file contains all packages used for the component like following:
+The `conda.yaml` file contains all packages used for the component:

 :::code language="python" source="~/azureml-examples-v2samplesreorg/sdk/python/jobs/pipelines/2e_image_classification_keras_minist_convnet/prep/conda.yaml":::

 * The `prepare_data_component` function defines one input for `input_data` and two outputs for `training_data` and `test_data`.
 `input_data` is the input data path. `training_data` and `test_data` are the output paths for the training data and the test data.
-* This component converts the data from `input_data` into a training data csv to `training_data` and a test data csv to `test_data`.
+* The component converts the data from `input_data` into a training-data .csv file at `training_data` and a test-data .csv file at `test_data`, as shown in the sketch after this list.
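
Here's a minimal sketch of that conversion, assuming the Fashion-MNIST files keep the IDX binary layout of the original MNIST distribution; the function name and the exact CSV layout are assumptions:

```python
# Sketch: decompress IDX-format Fashion-MNIST files and write one
# "label,pixel-1,...,pixel-784" row per image. Offsets 16 and 8 skip the
# IDX headers for images and labels, respectively.
import gzip
import numpy as np

def convert(image_file: str, label_file: str, out_csv: str) -> None:
    with gzip.open(image_file, "rb") as f:
        images = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28 * 28)
    with gzip.open(label_file, "rb") as f:
        labels = np.frombuffer(f.read(), np.uint8, offset=8).reshape(-1, 1)
    np.savetxt(out_csv, np.hstack([labels, images]), fmt="%d", delimiter=",")
```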

-Following is what a component looks like in the studio UI.
+This is what a component looks like in the studio UI:

 * A component is a block in a pipeline graph.
-* The `input_data`, `training_data` and `test_data` are ports of the component, which connects to other components for data streaming.
+* `input_data`, `training_data`, and `test_data` are ports of the component, which connect to other components for data streaming. A sketch of how these ports are consumed appears after the following screenshot.

 :::image type="content" source="./media/how-to-create-component-pipeline-python/prep-data-component.png" alt-text="Screenshot of the Prep Data component in the UI and code." lightbox ="./media/how-to-create-component-pipeline-python/prep-data-component.png":::
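
As a sketch of how those ports are wired together when you compose the pipeline later in the article (the compute target name and the returned outputs are assumptions):

```python
# Sketch: calling a component inside a @pipeline function creates a node;
# its ports are exposed as .outputs and can feed downstream components.
# prepare_data_component is the component defined earlier; the compute
# target name is an assumption.
from azure.ai.ml import Input
from azure.ai.ml.dsl import pipeline

@pipeline(default_compute="cpu-cluster")
def prep_only_pipeline(pipeline_input_data: Input):
    prep_node = prepare_data_component(input_data=pipeline_input_data)
    return {
        "training_data": prep_node.outputs.training_data,
        "test_data": prep_node.outputs.test_data,
    }
```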
