Commit eff89ae

Merge pull request #232192 from lgayhardt/amlpipelinecomp0323
Pipeline Component
2 parents 1e1912c + c77995f commit eff89ae

5 files changed

+169
-0
lines changed

Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
---
2+
title: How to use pipeline component in pipeline
3+
titleSuffix: Azure Machine Learning
4+
description: Learn how to use a pipeline component to build a nested pipeline job in Azure Machine Learning by using CLI v2 and the Python SDK.
5+
services: machine-learning
6+
ms.service: machine-learning
7+
ms.subservice: mlops
8+
ms.topic: how-to
9+
author: cloga
10+
ms.author: lochen
11+
ms.reviewer: lagayhar
12+
ms.date: 04/12/2023
13+
ms.custom: sdkv2, cliv2
14+
---
15+
16+
# How to use pipeline component to build nested pipeline job (V2) (preview)
17+
18+
[!INCLUDE [dev v2](../../includes/machine-learning-dev-v2.md)]
19+
20+
When developing a complex machine learning pipeline, it's common to have sub-pipelines that use multiple steps to perform tasks such as data preprocessing and model training. These sub-pipelines can be developed and tested standalone. A pipeline component groups multiple steps into a single component that can be used as one step in a larger pipeline, which helps you share your work and collaborate with team members.
21+
22+
By using a pipeline component, the author can focus on developing sub-tasks and easily integrate them with the entire pipeline job. Furthermore, a pipeline component has a well-defined interface in terms of inputs and outputs, which means that users of the pipeline component don't need to know its implementation details.
23+
24+
In this article, you'll learn how to use a pipeline component in an Azure Machine Learning pipeline.
25+
26+
[!INCLUDE [machine-learning-preview-generic-disclaimer](../../includes/machine-learning-preview-generic-disclaimer.md)]
27+
28+
## Prerequisites
29+
30+
- Understand how to use Azure Machine Learning pipelines with [CLI v2](how-to-create-component-pipelines-cli.md) and [SDK v2](how-to-create-component-pipeline-python.md).
31+
- Understand what a [component](concept-component.md) is and how to use components in an Azure Machine Learning pipeline.
32+
- Understand what an [Azure Machine Learning pipeline](concept-ml-pipelines.md) is.
33+
34+
## The difference between pipeline job and pipeline component
35+
36+
In general, a pipeline component is similar to a pipeline job. Both consist of a group of jobs or components.
37+
38+
Here are the main differences you need to be aware of when defining a pipeline component:
39+
40+
- A pipeline component only defines the interface of its inputs and outputs. When you define a pipeline component, you explicitly declare the type of each input and output instead of directly assigning values to them.
41+
- A pipeline component can't have runtime settings, so you can't hard-code a compute or data node in it. Instead, promote them as pipeline-level inputs and assign values at runtime.
42+
- Pipeline-level settings such as default_datastore and default_compute are also runtime settings. They aren't part of the pipeline component definition.
43+
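The three rules above can be sketched as a small lint over a pipeline component that has already been loaded into a Python dictionary. This is only an illustration of the rules as this article states them, not how Azure Machine Learning validates components. The helper name `find_runtime_settings`, the sample dictionaries, and the input name `train_compute` are all hypothetical.

```python
import re

# Matches a binding to a pipeline-level input, e.g. ${{parent.inputs.train_compute}}
PARENT_INPUT = re.compile(r"^\$\{\{\s*parent\.inputs\.\w+\s*\}\}$")


def find_runtime_settings(component: dict) -> list:
    """Return the problems that conflict with the pipeline component rules above.

    Hypothetical helper for illustration; not part of any Azure ML package.
    """
    problems = []

    # Pipeline-level settings (default_compute, default_datastore) are runtime settings.
    if "settings" in component:
        problems.append("top-level 'settings' is a runtime setting")

    # Inputs declare a type; they don't carry a concrete value.
    for name, spec in component.get("inputs", {}).items():
        if not (isinstance(spec, dict) and "type" in spec):
            problems.append(f"input '{name}' must declare a type instead of a value")

    # A step may only take its compute from a promoted pipeline-level input.
    for step, job in component.get("jobs", {}).items():
        compute = job.get("compute")
        if compute is not None and not (
            isinstance(compute, str) and PARENT_INPUT.match(compute)
        ):
            problems.append(f"step '{step}' hard-codes compute '{compute}'")

    return problems


# Hypothetical component that follows the rules.
good = {
    "inputs": {"train_compute": {"type": "string"}},
    "jobs": {"train": {"compute": "${{parent.inputs.train_compute}}"}},
}

# Hypothetical component that breaks all three rules.
bad = {
    "settings": {"default_compute": "azureml:cpu-cluster"},
    "inputs": {"epochs": 10},
    "jobs": {"train": {"compute": "azureml:cpu-cluster"}},
}

for problem in find_runtime_settings(bad):
    print(problem)
```

Running the lint on `good` returns an empty list, while `bad` yields one problem per rule.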
44+
### CLI v2
45+
46+
The example used in this article can be found in the [azureml-examples repo](https://github.com/Azure/azureml-examples). Navigate to *azureml-examples/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component* to check the example.
47+
48+
You can use multiple components to build a pipeline component, similar to how you build a pipeline job with components. The following example is a two-step pipeline component.
49+
50+
:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/components/train_pipeline_component.yml" highlight="7-48":::
51+
52+
When you reference a pipeline component to define a child job in a pipeline job, you do it the same way you reference any other type of component. You can provide runtime settings such as default_datastore and default_compute at the pipeline job level. Any parameter you want to change at runtime needs to be promoted as a pipeline job input; otherwise, it's hard-coded in the pipeline component. Promoting compute as a pipeline component input is supported, which enables heterogeneous pipelines that need different compute targets in different steps.
53+
54+
:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline.yml" highlight="11-16,23-25,60":::
55+
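The YAML schema reference in this change documents the `${{ parent.inputs.<input_name> }}` expression for binding a step to a pipeline-level input. As a rough mental model of what promoting a parameter means, the following sketch substitutes such expressions with values supplied at the pipeline job level. It's a simplified illustration, not the service's resolution logic. The function `resolve_bindings` and the input names are hypothetical.

```python
import re

# Captures <input_name> from ${{ parent.inputs.<input_name> }}
BINDING = re.compile(r"\$\{\{\s*parent\.inputs\.(\w+)\s*\}\}")


def resolve_bindings(value: str, pipeline_inputs: dict) -> str:
    """Replace each parent.inputs binding in `value` with the pipeline-level value.

    Hypothetical helper for illustration only.
    """

    def lookup(match):
        name = match.group(1)
        if name not in pipeline_inputs:
            # A parameter that was never promoted can't be set at runtime.
            raise KeyError(f"'{name}' is not promoted as a pipeline input")
        return str(pipeline_inputs[name])

    return BINDING.sub(lookup, value)


# Values assigned at the pipeline job level (hypothetical names).
pipeline_inputs = {"train_compute": "cpu-cluster", "learning_rate": 0.01}

print(resolve_bindings("${{parent.inputs.train_compute}}", pipeline_inputs))
print(resolve_bindings("--lr ${{ parent.inputs.learning_rate }}", pipeline_inputs))
```

Because the compute target is just another promoted input here, two steps can bind to two different inputs and run on different compute.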
56+
### Python SDK
57+
58+
The Python SDK example can be found in the [azureml-examples repo](https://github.com/Azure/azureml-examples). Navigate to *azureml-examples/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component* to check the example.
59+
60+
You can define a pipeline component using a Python function, which is similar to defining a pipeline job using a function. You can also promote the compute of a step to an input of the pipeline component.
61+
62+
[!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb?name=pipeline-component)]
63+
64+
You can use a pipeline component as a step in a pipeline job, just like any other component.
65+
66+
[!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb?name=pipeline-component-pipeline-job)]
67+
68+
## Pipeline job with pipeline component in studio
69+
70+
You can use `az ml component create` or `ml_client.components.create_or_update` to register a pipeline component. After that, you can view the component in the asset library and on the component list page.
71+
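As a minimal sketch of the CLI route, the following snippet assembles an `az ml component create` invocation as an argument list that could be passed to `subprocess.run`. The YAML path, resource group, and workspace names are placeholders, and the helper `build_register_command` is hypothetical. The snippet only prints the command; it doesn't run it.

```python
import shlex


def build_register_command(yaml_path, resource_group, workspace_name):
    """Build the argument list for registering a component with the CLI (v2).

    Hypothetical helper; all three argument values are placeholders.
    """
    return [
        "az", "ml", "component", "create",
        "--file", yaml_path,
        "--resource-group", resource_group,
        "--workspace-name", workspace_name,
    ]


command = build_register_command(
    "./components/train_pipeline_component.yml",  # placeholder path
    "my-resource-group",                          # placeholder resource group
    "my-workspace",                               # placeholder workspace
)

# Print a shell-safe form of the command for review before running it.
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting issues if a path or name contains spaces.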
72+
### Using pipeline component to build pipeline job
73+
74+
After you register the pipeline component, you can drag and drop it onto the designer canvas and use the UI to build a pipeline job.
75+
76+
:::image type="content" source="./media/how-to-use-pipeline-component/pipeline-component-authoring.png" alt-text="Screenshot of the designer canvas page to build pipeline job with pipeline component." lightbox="./media/how-to-use-pipeline-component/pipeline-component-authoring.png":::
77+
78+
### View pipeline job using pipeline component
79+
80+
After you submit the pipeline job, you can go to the pipeline job detail page to check the pipeline component status. You can also drill down to a child component in the pipeline component to debug a specific component.
81+
82+
:::image type="content" source="./media/how-to-use-pipeline-component/pipeline-component-right-panel.png" alt-text="Screenshot of view pipeline component on the pipeline job detail page." lightbox="./media/how-to-use-pipeline-component/pipeline-component-right-panel.png":::
83+
84+
## Sample notebooks
85+
86+
- [nyc_taxi_data_regression_with_pipeline_component](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/nyc_taxi_data_regression_with_pipeline_component/nyc_taxi_data_regression_with_pipeline_component.ipynb)
87+
- [pipeline_with_train_eval_pipeline_component](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb)
88+
89+
## Next steps
90+
- [YAML reference for pipeline component](reference-yaml-component-pipeline.md)
91+
- [Track an experiment](how-to-log-view-metrics.md)
92+
- [Deploy a trained model](how-to-deploy-managed-online-endpoints.md)
Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,73 @@
1+
---
2+
title: 'CLI (v2) pipeline component YAML schema'
3+
titleSuffix: Azure Machine Learning
4+
description: Reference documentation for the CLI (v2) pipeline component YAML schema.
5+
services: machine-learning
6+
ms.service: machine-learning
7+
ms.subservice: core
8+
ms.topic: reference
9+
ms.custom: cliv2
10+
author: cloga
11+
ms.author: lochen
12+
ms.date: 04/12/2023
13+
ms.reviewer: lagayhar
14+
---
15+
16+
# CLI (v2) pipeline component YAML schema (preview)
17+
18+
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
19+
20+
The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/pipelineComponent.schema.json.
21+
22+
[!INCLUDE [schema note](../../includes/machine-learning-preview-old-json-schema-note.md)]
23+
24+
[!INCLUDE [machine-learning-preview-generic-disclaimer](../../includes/machine-learning-preview-generic-disclaimer.md)]
25+
26+
## YAML syntax
27+
28+
| Key | Type | Description | Allowed values | Default value |
29+
| --- | ---- | ----------- | -------------- | ------------- |
30+
| `$schema` | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of your file enables you to invoke schema and resource completions. | | |
31+
| `type` | const | The type of component. | `pipeline` | `pipeline` |
32+
| `name` | string | **Required.** Name of the component. Must start with a lowercase letter. Allowed characters are lowercase letters, numbers, and underscores (_). Maximum length is 255 characters.| | |
33+
| `version` | string | Version of the component. If omitted, Azure Machine Learning will autogenerate a version. | | |
34+
| `display_name` | string | Display name of the component in the studio UI. It can be non-unique within the workspace. | | |
35+
| `description` | string | Description of the component. | | |
36+
| `tags` | object | Dictionary of tags for the component. | | |
37+
| `jobs` | object | **Required.** Dictionary of the set of individual jobs to run as steps within the pipeline. These jobs are considered child jobs of the parent pipeline job. <br><br> The key is the name of the step within the context of the pipeline job. This name is different from the unique job name of the child job. The value is the job specification, which can follow the [command job schema](reference-yaml-job-command.md#yaml-syntax) or [sweep job schema](reference-yaml-job-sweep.md#yaml-syntax). Currently only command jobs and sweep jobs can be run in a pipeline. | | |
38+
| `inputs` | object | Dictionary of inputs to the pipeline job. The key is a name for the input within the context of the job and the value is the input value. <br><br> These pipeline inputs can be referenced by the inputs of an individual step job in the pipeline using the `${{ parent.inputs.<input_name> }}` expression. For more information on how to bind the inputs of a pipeline step to the inputs of the top-level pipeline job, see the [Expression syntax for binding inputs and outputs between steps in a pipeline job](reference-yaml-core-syntax.md#binding-inputs-and-outputs-between-steps-in-a-pipeline-job). | | |
39+
| `inputs.<input_name>` | number, integer, boolean, string or object | One of a literal value (of type number, integer, boolean, or string) or an object containing a [component input data specification](#component-input). | | |
40+
| `outputs` | object | Dictionary of output configurations of the pipeline job. The key is a name for the output within the context of the job and the value is the output configuration. <br><br> These pipeline outputs can be referenced by the outputs of an individual step job in the pipeline using the `${{ parent.outputs.<output_name> }}` expression. For more information on how to bind the outputs of a pipeline step to the outputs of the top-level pipeline job, see the [Expression syntax for binding inputs and outputs between steps in a pipeline job](reference-yaml-core-syntax.md#binding-inputs-and-outputs-between-steps-in-a-pipeline-job). | | |
41+
| `outputs.<output_name>` | object | You can leave the object empty, in which case by default the output will be of type `uri_folder` and Azure Machine Learning will system-generate an output location for the output based on the following template path: `{settings.datastore}/azureml/{job-name}/{output-name}/`. Files are written to the output directory via read-write mount. If you want to specify a different mode for the output, provide an object containing the [component output specification](#component-output). | | |
42+
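The `name` constraints in the table above (starts with a lowercase letter; only lowercase letters, numbers, and underscores; at most 255 characters) can be expressed as a single regular expression. The following check is a sketch derived from that table row, not the validation the service performs. The function name `is_valid_component_name` is hypothetical.

```python
import re

# One lowercase letter, then up to 254 lowercase letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]{0,254}")


def is_valid_component_name(name: str) -> bool:
    """Check a name against the rules listed for the `name` key (illustrative only)."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["train_pipeline_component", "Train", "1step", "train-eval"]:
    print(candidate, is_valid_component_name(candidate))
```

A check like this can run before registration to catch naming mistakes locally.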
43+
### Component input
44+
45+
| Key | Type | Description | Allowed values | Default value |
46+
| --- | ---- | ----------- | -------------- | ------------- |
47+
| `type` | string | **Required.** The type of component input. [Learn more about data access](concept-data.md) | `number`, `integer`, `boolean`, `string`, `uri_file`, `uri_folder`, `mltable`, `mlflow_model`, `custom_model`| |
48+
| `description` | string | Description of the input. | | |
49+
| `default` | number, integer, boolean, or string | The default value for the input. | | |
50+
| `optional` | boolean | Whether the input is optional. If set to `true`, the command must wrap the parts that use the optional input in `$[[]]`. | | `false` |
51+
| `min` | integer or number | The minimum accepted value for the input. This field can only be specified if `type` field is `number` or `integer`. | | |
52+
| `max` | integer or number | The maximum accepted value for the input. This field can only be specified if `type` field is `number` or `integer`. | | |
53+
| `enum` | array | The list of allowed values for the input. Only applicable if `type` field is `string`.| | |
54+
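The field constraints in the component input table can be checked mechanically. The following sketch validates one input specification, given as a dictionary, against the rules listed above: `type` is required and must be an allowed value, `min` and `max` apply only to `number` or `integer`, and `enum` applies only to `string`. The function `validate_input_spec` is a hypothetical helper, and the checks cover only what the table states.

```python
# Allowed values for `type`, copied from the component input table.
ALLOWED_TYPES = {
    "number", "integer", "boolean", "string",
    "uri_file", "uri_folder", "mltable", "mlflow_model", "custom_model",
}


def validate_input_spec(spec: dict) -> list:
    """Return the rule violations found in one component input specification."""
    errors = []
    input_type = spec.get("type")

    if input_type is None:
        errors.append("'type' is required")
    elif input_type not in ALLOWED_TYPES:
        errors.append(f"unknown type '{input_type}'")

    # min and max are only meaningful for numeric inputs.
    for key in ("min", "max"):
        if key in spec and input_type not in ("number", "integer"):
            errors.append(f"'{key}' requires type 'number' or 'integer'")

    # enum is only meaningful for string inputs.
    if "enum" in spec and input_type != "string":
        errors.append("'enum' requires type 'string'")

    return errors


print(validate_input_spec({"type": "integer", "min": 1, "max": 10}))
print(validate_input_spec({"type": "string", "min": 1}))
```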
55+
### Component output
56+
57+
| Key | Type | Description | Allowed values | Default value |
58+
| --- | ---- | ----------- | -------------- | ------------- |
59+
| `type` | string | **Required.** The type of component output. | `uri_file`, `uri_folder`, `mltable`, `mlflow_model`, `custom_model` | |
60+
| `description` | string | Description of the output. | | |
61+
62+
## Remarks
63+
64+
The `az ml component` commands can be used for managing Azure Machine Learning components.
65+
66+
## Examples
67+
68+
Examples are available in the [examples GitHub repository](https://github.com/Azure/azureml-examples/tree/lochen/pipeline-component-pup/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component).
69+
70+
## Next steps
71+
72+
- [Install and use the CLI (v2)](how-to-configure-cli.md)
73+
- [Create ML pipelines using components](how-to-create-component-pipelines-cli.md)

articles/machine-learning/toc.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1084,6 +1084,8 @@
10841084
href: how-to-use-pipeline-ui.md
10851085
- name: How to use parallel job in pipeline
10861086
href: how-to-use-parallel-job-in-pipeline.md
1087+
- name: How to use pipeline component in pipeline
1088+
href: how-to-use-pipeline-component.md
10871089
# v1
10881090
- name: Create ML pipelines (Python)
10891091
href: ./v1/how-to-create-machine-learning-pipelines.md
@@ -1231,6 +1233,8 @@
12311233
href: reference-yaml-model.md
12321234
- name: Schedule
12331235
href: reference-yaml-schedule.md
1236+
- name: Pipeline component
1237+
href: reference-yaml-component-pipeline.md
12341238
- name: Compute
12351239
items:
12361240
- name: Compute cluster (AmlCompute)
