Commit e94050e
fix reviewer suggestion
1 parent 3a87f2b

3 files changed, +19 -19 lines changed

articles/machine-learning/how-to-create-component-pipelines-cli.md

Lines changed: 18 additions & 18 deletions
@@ -19,7 +19,7 @@ ms.devlang: azurecli, cliv2
[!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]

- In this article, you learn how to create and run [machine learning pipelines](concept-ml-pipelines.md) by using the Azure CLI and components (for more, see [What is an Azure Machine Learning component?](concept-component.md)). You can create pipelines without using components, but components offer the greatest amount of flexibility and reuse. Azure Machine Learning Pipelines may be defined in YAML and run from the CLI, authored in Python, or composed in Azure Machine Learning Studio Designer with a drag-and-drop UI. This document focuses on the CLI.
+ In this article, you learn how to create and run [machine learning pipelines](concept-ml-pipelines.md) by using the Azure CLI and components (for more, see [What is an Azure Machine Learning component?](concept-component.md)). You can create pipelines without using components, but components offer the greatest amount of flexibility and reuse. Azure Machine Learning pipelines can be defined in YAML and run from the CLI, authored in Python, or composed in the Azure Machine Learning studio Designer with a drag-and-drop UI. This document focuses on the CLI.

## Prerequisites

@@ -36,7 +36,7 @@ In this article, you learn how to create and run [machine learning pipelines](co
cd azureml-examples/cli/jobs/pipelines-with-components/basics
```

- ### Suggested pre-reading
+ ### Suggested prereading

- [What is an Azure Machine Learning pipeline?](./concept-ml-pipelines.md)
- [What is an Azure Machine Learning component?](./concept-component.md)
@@ -45,18 +45,18 @@ In this article, you learn how to create and run [machine learning pipelines](co

Let's create your first pipeline with components using an example. This section gives you an initial impression of what a pipeline and its components look like in Azure Machine Learning, through a concrete example.

- From the `cli/jobs/pipelines-with-components/basics` directory of the [`azureml-examples` repository](https://github.com/Azure/azureml-examples), navigate to the `3b_pipeline_with_data` subdirector. There are three types of files in this directory. Those are the files you'll need to create when building your own pipeline.
+ From the `cli/jobs/pipelines-with-components/basics` directory of the [`azureml-examples` repository](https://github.com/Azure/azureml-examples), navigate to the `3b_pipeline_with_data` subdirectory. There are three types of files in this directory. Those are the files you need to create when building your own pipeline.

- **pipeline.yml**: This YAML file defines the machine learning pipeline. It describes how to break a full machine learning task into a multistep workflow. For example, consider the simple task of using historical data to train a sales forecasting model: you may want to build a sequential workflow with data processing, model training, and model evaluation steps. Each step is a component that has a well-defined interface and can be developed, tested, and optimized independently. The pipeline YAML also defines how the child steps connect to other steps in the pipeline; for example, the model training step generates a model file, which is passed to a model evaluation step.

- **component.yml**: This YAML file defines the component. It packages the following information (see the sketch after this list):
  - Metadata: name, display name, version, description, type, and so on. The metadata helps describe and manage the component.
-   - Interface: inputs and outputs. For example, a model training component will take training data and number of epochs as input, and generate a trained model file as output. Once the interface is defined, different teams can develop and test the component independently.
+   - Interface: inputs and outputs. For example, a model training component takes training data and a number of epochs as input, and generates a trained model file as output. Once the interface is defined, different teams can develop and test the component independently.
  - Command, code & environment: the command, code, and environment used to run the component. The command is the shell command that executes the component. Code usually refers to a source code directory. The environment could be an Azure Machine Learning environment (curated or customer created), a Docker image, or a conda environment.

- - **component_src**: This is the source code directory for a specific component. It contains the source code that will be executed in the component. You can use your preferred language(Python, R...). The code must be executed by a shell command. The source code can take a few inputs from shell command line to control how this step is going to be executed. For example, a training step may take training data, learning rate, number of epochs to control the training process. The argument of a shell command is used to pass inputs and outputs to the code.
+ - **component_src**: This is the source code directory for a specific component. It contains the source code that is executed in the component. You can use your preferred language (Python, R, and so on). The code must be executed by a shell command. The source code can take a few inputs from the shell command line to control how this step is executed. For example, a training step may take training data, a learning rate, and a number of epochs to control the training process. The arguments of the shell command are used to pass inputs and outputs to the code.
- Now let's create a pipeline using the `3b_pipeline_with_data` example. We'll explain the detailed meaning of each file in following sections.
+ Now let's create a pipeline using the `3b_pipeline_with_data` example. We explain the detailed meaning of each file in the following sections.

First list your available compute resources with the following command:
@@ -73,7 +73,7 @@ If you don't have it, create a cluster called `cpu-cluster` by running:
az ml compute create -n cpu-cluster --type amlcompute --min-instances 0 --max-instances 10
```

- Now, create a pipeline job defined in the pipeline.yml file with the following command. The compute target will be referenced in the pipeline.yml file as `azureml:cpu-cluster`. If your compute target uses a different name, remember to update it in the pipeline.yml file.
+ Now, create a pipeline job defined in the pipeline.yml file with the following command. The compute target is referenced in the pipeline.yml file as `azureml:cpu-cluster`. If your compute target uses a different name, remember to update it in the pipeline.yml file.

```azurecli
az ml job create --file pipeline.yml
@@ -88,7 +88,7 @@ You should receive a JSON dictionary with information about the pipeline job, in
| `services.Studio.endpoint` | A URL for monitoring and reviewing the pipeline job. |
| `status` | The status of the job. This will likely be `Preparing` at this point. |

- Open the `services.Studio.endpoint` URL you'll see a graph visualization of the pipeline looks like below.
+ Open the `services.Studio.endpoint` URL to see a graph visualization of the pipeline, as shown below.

:::image type="content" source="./media/how-to-create-component-pipelines-cli/pipeline-graph-dependencies.png" alt-text="Screenshot of a graph visualization of the pipeline.":::

@@ -116,7 +116,7 @@ In the *3b_pipeline_with_data* example, we've created a three steps pipeline.

- The three steps are defined under `jobs`. All three steps are command jobs. Each step's definition is in a corresponding component YAML file. You can see the component YAML files under the *3b_pipeline_with_data* directory. We'll explain componentA.yml in the next section.
- This pipeline has a data dependency, which is common in most real-world pipelines. Component_a takes data input from a local folder under `./data` (lines 17-20) and passes its output to componentB (line 29). Component_a's output can be referenced as `${{parent.jobs.component_a.outputs.component_a_output}}`.
- - The `compute` defines the default compute for this pipeline. If a component under `jobs` defines a different compute for this component, the system will respect component specific setting.
+ - The `compute` defines the default compute for this pipeline. If a component under `jobs` defines a different compute for this component, the system respects the component-specific setting.

:::image type="content" source="./media/how-to-create-component-pipelines-cli/pipeline-inputs-and-outputs.png" alt-text="Screenshot of the pipeline with data example above." lightbox ="./media/how-to-create-component-pipelines-cli/pipeline-inputs-and-outputs.png":::

@@ -140,13 +140,13 @@ The most common used schema of the component YAML is described in below table. S
|key|description|
|------|------|
|name|**Required**. Name of the component. Must be unique across the Azure Machine Learning workspace. Must start with a lowercase letter. Allowed characters are lowercase letters, numbers, and underscore (_). Maximum length is 255 characters.|
- |display_name|Display name of the component in the studio UI. Can be non-unique within the workspace.|
+ |display_name|Display name of the component in the studio UI. Can be nonunique within the workspace.|
|command|**Required**. The command to execute.|
|code|Local path to the source code directory to be uploaded and used for the component.|
- |environment|**Required**. The environment that will be used to execute the component.|
+ |environment|**Required**. The environment that is used to execute the component.|
|inputs|Dictionary of component inputs. The key is a name for the input within the context of the component and the value is the component input definition. Inputs can be referenced in the command using the ${{ inputs.<input_name> }} expression.|
|outputs|Dictionary of component outputs. The key is a name for the output within the context of the component and the value is the component output definition. Outputs can be referenced in the command using the ${{ outputs.<output_name> }} expression.|
- |is_deterministic|Whether to reuse the previous job's result if the component inputs did not change. Default value is `true`, also known as reuse by default. The common scenario when set as `false` is to force reload data from a cloud storage or URL.|
+ |is_deterministic|Whether to reuse the previous job's result if the component inputs didn't change. The default value is `true`, also known as reuse by default. The common scenario when set to `false` is to force reloading data from cloud storage or a URL.|

For the example in *3b_pipeline_with_data/componentA.yml*, componentA has one data input and one data output, which can be connected to other steps in the parent pipeline. All files under the `code` section in the component YAML are uploaded to Azure Machine Learning when the pipeline job is submitted. In this example, files under `./componentA_src` are uploaded (line 16 in *componentA.yml*). You can see the uploaded source code in the studio UI: double-select the ComponentA step and navigate to the Snapshot tab, as shown in the screenshot below. It's a hello-world script that does some simple printing and writes the current datetime to the `componentA_output` path. The component takes inputs and outputs through command-line arguments, which are handled in *hello.py* using `argparse`.

@@ -155,9 +155,9 @@ For the example in *3b_pipeline_with_data/componentA.yml*, componentA has one da
### Input and output

Input and output define the interface of a component. An input or output can be either a literal value (of type `string`, `number`, `integer`, or `boolean`) or an object containing an input schema.

- **Object input** (of type `uri_file`, `uri_folder`,`mltable`,`mlflow_model`,`custom_model`) can connect to other steps in the parent pipeline job and hence pass data/model to other steps. In pipeline graph, the object type input will render as a connection dot.
+ **Object inputs** (of type `uri_file`, `uri_folder`, `mltable`, `mlflow_model`, or `custom_model`) can connect to other steps in the parent pipeline job and hence pass data or models to other steps. In the pipeline graph, an object-type input renders as a connection dot.

- **Literal value inputs** (`string`,`number`,`integer`,`boolean`) are the parameters you can pass to the component at run time. You can add default value of literal inputs under `default` field. For `number` and `integer` type, you can also add minimum and maximum value of the accepted value using `min` and `max` fields. If the input value exceeds the min and max, pipeline will fail at validation. Validation happens before you submit a pipeline job to save your time. Validation works for CLI, Python SDK and designer UI. Below screenshot shows a validation example in designer UI. Similarly, you can define allowed values in `enum` field.
+ **Literal value inputs** (`string`, `number`, `integer`, `boolean`) are the parameters you can pass to the component at run time. You can add a default value for a literal input under the `default` field. For the `number` and `integer` types, you can also set the minimum and maximum accepted values using the `min` and `max` fields. If the input value falls outside the min and max, the pipeline fails at validation. Validation happens before you submit a pipeline job, to save you time. Validation works for the CLI, the Python SDK, and the designer UI. The screenshot below shows a validation example in the designer UI. Similarly, you can define allowed values in the `enum` field.

:::image type="content" source="./media/how-to-create-component-pipelines-cli/component-input-output.png" alt-text="Screenshot of the input and output of the train linear regression model component." lightbox= "./media/how-to-create-component-pipelines-cli/component-input-output.png":::
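
A hypothetical `inputs` section showing these fields in use; the names and bounds are made-up values, not taken from the example:

```yaml
inputs:
  learning_rate:
    type: number
    default: 0.01
    min: 0.001
    max: 0.1             # values outside [0.001, 0.1] fail validation before submission
  optimizer:
    type: string
    default: adam
    enum: [adam, sgd]    # only these values are accepted
  training_data:
    type: uri_folder     # object input; renders as a connection dot in the graph
```
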

@@ -175,7 +175,7 @@ Environment defines the environment to execute the component. It could be an Azu

## Register component for reuse and sharing

- While some components will be specific to a particular pipeline, the real benefit of components comes from reuse and sharing. Register a component in your Machine Learning workspace to make it available for reuse. Registered components support automatic versioning so you can update the component but assure that pipelines that require an older version will continue to work.
+ While some components are specific to a particular pipeline, the real benefit of components comes from reuse and sharing. Register a component in your Machine Learning workspace to make it available for reuse. Registered components support automatic versioning, so you can update the component while assuring that pipelines that require an older version continue to work.

In the azureml-examples repository, navigate to the `cli/jobs/pipelines-with-components/basics/1b_e2e_registered_components` directory.

@@ -191,11 +191,11 @@ After these commands run to completion, you can see the components in Studio, un

:::image type="content" source="./media/how-to-create-component-pipelines-cli/registered-components.png" alt-text="Screenshot of Studio showing the components that were just registered." lightbox ="./media/how-to-create-component-pipelines-cli/registered-components.png":::

- Select a component. You'll see detailed information for each version of the component.
+ Select a component. You see detailed information for each version of the component.

- Under **Details** tab, you'll see basic information of the component like name, created by, version etc. You'll see editable fields for Tags and Description. The tags can be used for adding rapidly searched keywords. The description field supports Markdown formatting and should be used to describe your component's functionality and basic use.
+ Under the **Details** tab, you see basic information about the component, such as name, created by, and version. You see editable fields for Tags and Description. Tags can be used to add searchable keywords. The description field supports Markdown formatting and should be used to describe your component's functionality and basic use.

- Under **Jobs** tab, you'll see the history of all jobs that use this component.
+ Under the **Jobs** tab, you see the history of all jobs that use this component.

:::image type="content" source="./media/how-to-create-component-pipelines-cli/registered-components.png" alt-text="Screenshot of the component tab showing 3 components." lightbox ="./media/how-to-create-component-pipelines-cli/registered-components.png":::

articles/machine-learning/toc.yml

Lines changed: 1 addition & 1 deletion
@@ -772,7 +772,7 @@
- name: How to do hyperparameter sweep in pipeline
  href: how-to-use-sweep-in-pipeline.md
- name: How to manage inputs and outputs in pipeline
-   href: how-to-manage-inputs-and-outputs-in-pipeline.md
+   href: how-to-manage-inputs-outputs-pipeline.md
- name: How to use pipeline components in pipeline
  href: how-to-use-pipeline-component.md
- name: Schedule a pipeline job
