Commit b2bdb63

Merge pull request #233 from v-thepet/components
Freshness 2 - Azure Machine Learning Pipelines and RAI
2 parents 63f71c7 + b139a69 commit b2bdb63

3 files changed

+80
-43
lines changed

Lines changed: 80 additions & 43 deletions
@@ -1,96 +1,133 @@
11
---
2-
title: How to use pipeline component in pipeline
2+
title: How to use pipeline components in pipeline jobs
33
titleSuffix: Azure Machine Learning
4-
description: How to use pipeline component to build nested pipeline job in Azure Machine Learning pipeline using CLI v2 and Python SDK
4+
description: Learn how to nest multistep pipeline components in Azure Machine Learning pipeline jobs by using CLI v2, Python SDK v2, or the studio UI.
55
services: machine-learning
66
ms.service: azure-machine-learning
77
ms.subservice: mlops
88
ms.topic: how-to
99
author: lgayhardt
1010
ms.author: lagayhar
1111
ms.reviewer: lochen
12-
ms.date: 04/12/2023
12+
ms.date: 09/13/2024
1313
ms.custom:
1414
- sdkv2
1515
- cliv2
1616
- devx-track-python
1717
- ignite-2023
1818
---
1919

20-
# How to use pipeline component to build nested pipeline job (V2)
20+
# Use multistep pipeline components in pipeline jobs
2121

2222
[!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]
2323

24-
When developing a complex machine learning pipeline, it's common to have sub-pipelines that use multi-step to perform tasks such as data preprocessing and model training. These sub-pipelines can be developed and tested standalone. Pipeline component groups multi-step as a component that can be used as a single step to create complex pipelines. Which will help you share your work and better collaborate with team members.
24+
It's common to use pipeline components to develop complex machine learning pipelines. You can group multiple steps into a pipeline component that you use as a single step to do tasks like data preprocessing or model training.
2525

26-
By using a pipeline component, the author can focus on developing sub-tasks and easily integrate them with the entire pipeline job. Furthermore, a pipeline component has a well-defined interface in terms of inputs and outputs, which means that user of the pipeline component doesn't need to know the implementation details of the component.
26+
This article shows you how to nest multiple steps in components that you use to build complex Azure Machine Learning pipeline jobs. You can develop and test these multistep components standalone, which helps you share your work and collaborate better with team members.
2727

28-
In this article, you'll learn how to use pipeline component in Azure Machine Learning pipeline.
28+
By using multistep pipeline components, you can focus on developing subtasks and easily integrate them with the entire pipeline job. A pipeline component has a well-defined input and output interface, so multistep pipeline component users don't need to know the implementation details of the component.
29+
30+
Both pipeline components and pipeline jobs contain groups of steps or components, but defining a pipeline component differs from defining a pipeline job in the following ways:
31+
32+
- Pipeline components define only the interfaces of inputs and outputs. In a pipeline component, you explicitly set the input and output types, but you don't directly assign values to them.
33+
- Pipeline components don't have runtime settings, so you can't hardcode a compute or data node in a pipeline component. Instead, you must promote these nodes as pipeline-level inputs and assign values during runtime.
34+
- Pipeline-level settings such as `default_datastore` and `default_compute` are also runtime settings that aren't part of pipeline component definitions.
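
The three differences in the preceding list can be illustrated with a minimal plain-Python sketch. It doesn't use the Azure Machine Learning SDK. It assumes a pipeline component definition is represented as a dictionary whose keys loosely mirror the YAML keys (`inputs`, `jobs`, `compute`, `settings`), and it assumes the `${{parent.inputs.<name>}}` binding expression for values promoted to inputs. The helper `find_runtime_settings` and the sample names are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from typing import Any

# Pipeline-level runtime settings named in the list above. They belong to a
# pipeline job, not to a pipeline component definition.
RUNTIME_SETTINGS = ("default_datastore", "default_compute")


def find_runtime_settings(component: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a pipeline component dictionary.

    Hypothetical helper for illustration; it is not part of any Azure SDK.
    """
    problems: list[str] = []

    # 1. Inputs declare an interface (a type), not a concrete value.
    for name, spec in component.get("inputs", {}).items():
        if not isinstance(spec, dict) or "type" not in spec:
            problems.append(f"input '{name}' must declare a type, not a value")

    # 2. A node's compute must be bound to a component input, not hardcoded.
    for node_name, node in component.get("jobs", {}).items():
        compute = node.get("compute")
        if compute is not None and not str(compute).startswith("${{parent.inputs."):
            problems.append(f"node '{node_name}' hardcodes compute '{compute}'")

    # 3. Pipeline-level runtime settings aren't part of the definition.
    for key in RUNTIME_SETTINGS:
        if key in component.get("settings", {}):
            problems.append(f"runtime setting '{key}' is not allowed")

    return problems


# Assumed sample data: a definition that breaks all three rules.
bad_component = {
    "inputs": {"training_input": "azureml:my_data:1"},
    "jobs": {"train_job": {"compute": "azureml:cpu-cluster"}},
    "settings": {"default_compute": "azureml:cpu-cluster"},
}

# Assumed sample data: typed inputs, compute promoted to an input.
good_component = {
    "inputs": {
        "training_input": {"type": "uri_folder"},
        "train_node_compute": {"type": "string"},
    },
    "jobs": {"train_job": {"compute": "${{parent.inputs.train_node_compute}}"}},
}

print(find_runtime_settings(bad_component))
print(find_runtime_settings(good_component))
```

The first call reports one problem per rule, and the second call returns an empty list because every value that varies at runtime is exposed as a typed input.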
2935

3036
## Prerequisites
3137

32-
- Understand how to use Azure Machine Learning pipeline with [CLI v2](how-to-create-component-pipelines-cli.md) and [SDK v2](how-to-create-component-pipeline-python.md).
33-
- Understand what is [component](concept-component.md) and how to use component in Azure Machine Learning pipeline.
34-
- Understand what is an [Azure Machine Learning pipeline](concept-ml-pipelines.md)
38+
- Have an Azure Machine Learning workspace. For more information, see [Create workspace resources](quickstart-create-resources.md).
39+
- Understand the concepts of Azure Machine Learning [pipelines](concept-ml-pipelines.md) and [components](concept-component.md), and know how to use components in Azure Machine Learning pipelines.
40+
41+
# [Azure CLI](#tab/cliv2)
3542

36-
## The difference between pipeline job and pipeline component
43+
- Install the Azure CLI and the `ml` extension. For more information, see [Install, set up, and use the CLI (v2)](how-to-configure-cli.md). The `ml` extension automatically installs the first time you run an `az ml` command.
44+
- Understand how to [create and run Azure Machine Learning pipelines and components with the CLI v2](how-to-create-component-pipelines-cli.md).
3745

38-
In general, pipeline components are similar to pipeline jobs because they both contain a group of jobs/components.
46+
# [Python SDK](#tab/python)
3947

40-
Here are some main differences you need to be aware of when defining pipeline components:
48+
- Install the [Azure Machine Learning SDK v2 for Python](/python/api/overview/azure/ai-ml-readme).
49+
- Understand how to [create and run Azure Machine Learning pipelines and components with the Python SDK v2](how-to-create-component-pipeline-python.md).
4150

42-
- Pipeline component only defines the interface of inputs/outputs, which means when defining a pipeline component you need to explicitly define the type of inputs/outputs instead of directly assigning values to them.
43-
- Pipeline component can't have runtime settings, you can't hard-code compute, or data node in the pipeline component. Instead you need to promote them as pipeline level inputs and assign values during runtime.
44-
- Pipeline level settings such as default_datastore and default_compute are also runtime settings. They aren't part of pipeline component definition.
51+
# [Studio UI](#tab/ui)
52+
53+
- Understand how to [create and run pipelines and components with the Azure Machine Learning studio UI](how-to-create-component-pipelines-ui.md).
54+
55+
---
4556

46-
### CLI v2
57+
## Build pipeline jobs with pipeline components
4758

48-
The example used in this article can be found in [azureml-example repo](https://github.com/Azure/azureml-examples). Navigate to *azureml-examples/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component* to check the example.
59+
You can define multiple steps as a pipeline component, and then use the multistep component like any other component to build a pipeline job.
4960

50-
You can use multi-components to build a pipeline component. Similar to how you built pipeline job with component. This is two step pipeline component.
61+
### Define pipeline components
5162

52-
:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/components/train_pipeline_component.yml" highlight="7-48":::
63+
# [Azure CLI](#tab/cliv2)
5364

54-
When reference pipeline component to define child job in a pipeline job, just like reference other type of component. You can provide runtime settings such as default_datastore, default_compute in pipeline job level, any parameter you want to change during run time need promote as pipeline job inputs, otherwise, they'll be hard-code in next pipeline component. We're support to promote compute as pipeline component input to support heterogenous pipeline, which may need different compute target in different steps.
65+
You can use multiple components to build a pipeline component, similar to how you build pipeline jobs with components.
5566

56-
:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline.yml" highlight="11-16,23-25,60":::
67+
The following example comes from the [pipeline_with_train_eval_pipeline_component](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component) example pipeline in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) GitHub repository.
5768

58-
### Python SDK
69+
The example component defines a three-node pipeline job. The three nodes in the example pipeline job use the locally defined components `train`, `score`, and `eval`, respectively. The following code defines the pipeline component:
5970

60-
The python SDK example can be found in [azureml-example repo](https://github.com/Azure/azureml-examples). Navigate to *azureml-examples/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component* to check the example.
71+
:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/components/train_pipeline_component.yml" highlight="8,20,23,30,43,53":::
6172

62-
You can define a pipeline component using a Python function, which is similar to defining a pipeline job using a function. You can also promote the compute of some step to be used as inputs for the pipeline component.
73+
# [Python SDK](#tab/python)
6374

64-
[!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb?name=pipeline-component)]
75+
You can define a pipeline component using a Python function, which is similar to defining a pipeline job using a function. You can also promote the compute of some steps to use as inputs for the pipeline component.
6576

66-
You can use pipeline component as a step like other components in pipeline job.
77+
The following Python SDK examples are from the [Build pipeline with subpipeline (pipeline component)](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb) Azure Machine Learning notebook. Run this notebook to build the example pipeline.
78+
79+
[!Notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb?name=pipeline-component)]
80+
81+
# [Studio UI](#tab/ui)
82+
83+
To access components in Azure Machine Learning studio, you need to register the components. To register pipeline components, follow the instructions at [Register component in your workspace](how-to-create-component-pipelines-ui.md#register-component-in-your-workspace). After that, you can view and use the components in the studio asset library and components list page.
84+
85+
---
6786

68-
[!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb?name=pipeline-component-pipeline-job)]
87+
### Use components in pipelines
6988

70-
## Pipeline job with pipeline component in studio
89+
# [Azure CLI](#tab/cliv2)
7190

72-
You can use `az ml component create` or `ml_client.components.create_or_update` to register pipeline component as a registered component. After that you can view the component in asset library and component list page.
91+
You reference pipeline components as child jobs in a pipeline job just like you reference other types of components. You can provide runtime settings like `default_datastore` and `default_compute` at the pipeline job level.
7392

74-
### Using pipeline component to build pipeline job
93+
You need to promote any parameters you want to change during runtime as pipeline job inputs. Otherwise, they're hardcoded in the pipeline component. Promoting the compute definition to a pipeline-level input supports heterogeneous pipelines that can use different compute targets in different steps.
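
The idea of promoting a hardcoded compute target to an input can be sketched in plain Python, without the CLI or the SDK. This sketch assumes the same dictionary representation of a component definition as the YAML keys suggest, assumes the `${{parent.inputs.<name>}}` binding expression, and assumes that a node without its own compute falls back to the job's `default_compute`. The helpers `promote_compute` and `resolve_compute` and the cluster name `gpu-cluster` are hypothetical.

```python
from __future__ import annotations

import copy
from typing import Any

PREFIX, SUFFIX = "${{parent.inputs.", "}}"


def promote_compute(component: dict[str, Any], node: str, input_name: str) -> dict[str, Any]:
    """Return a copy in which the node's compute is bound to a new string input."""
    promoted = copy.deepcopy(component)
    promoted.setdefault("inputs", {})[input_name] = {"type": "string"}
    promoted["jobs"][node]["compute"] = f"{PREFIX}{input_name}{SUFFIX}"
    return promoted


def resolve_compute(
    component: dict[str, Any], job_inputs: dict[str, str], default_compute: str
) -> dict[str, str]:
    """Work out which compute each node uses once the job supplies its inputs."""
    resolved: dict[str, str] = {}
    for name, node in component["jobs"].items():
        compute = node.get("compute")
        if compute is None:
            # Assumption: nodes without compute use the job-level default.
            compute = default_compute
        elif compute.startswith(PREFIX) and compute.endswith(SUFFIX):
            # Look up the value that the pipeline job assigned to the input.
            compute = job_inputs[compute[len(PREFIX):-len(SUFFIX)]]
        resolved[name] = compute
    return resolved


# Assumed sample definition: train_job hardcodes its compute.
component = {
    "inputs": {"training_input": {"type": "uri_folder"}},
    "jobs": {"train_job": {"compute": "cpu-cluster"}, "score_job": {}},
}

promoted = promote_compute(component, "train_job", "train_node_compute")
placement = resolve_compute(
    promoted, {"train_node_compute": "gpu-cluster"}, default_compute="cpu-cluster"
)
print(placement)
```

After promotion, the pipeline job decides where `train_job` runs, so the same component can train on one compute target while the remaining steps use the default.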
7594

76-
After you register the pipeline component, you can drag and drop the pipeline component into the designer canvas and use the UI to build pipeline job.
95+
To submit the pipeline job, replace the `cpu-cluster` value in the `default_compute` setting with the name of your own compute cluster, and then run the `az ml job create -f pipeline.yml` command.
7796

78-
:::image type="content" source="./media/how-to-use-pipeline-component/pipeline-component-authoring.png" alt-text="Screenshot of the designer canvas page to build pipeline job with pipeline component." lightbox= "./media/how-to-use-pipeline-component/pipeline-component-authoring.png":::
97+
:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline.yml" highlight="17,18,27,28,40,50,55":::
7998

80-
### View pipeline job using pipeline component
99+
>[!NOTE]
100+
>To share or reuse components across jobs in the workspace, you need to register the components. You can use [`az ml component create`](/cli/azure/ml/component#az-ml-component-create) to register pipeline components.
81101
82-
After submitted pipeline job, you can go to pipeline job detail page to change pipeline component status, you can also drill down to child component in pipeline component to debug specific component.
102+
You can find other Azure CLI pipeline component-related examples and information at [pipelines-with-components](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components) in the [Azure Machine Learning examples repository](https://github.com/Azure/azureml-examples).
83103

84-
:::image type="content" source="./media/how-to-use-pipeline-component/pipeline-component-right-panel.png" alt-text="Screenshot of view pipeline component on the pipeline job detail page." lightbox= "./media/how-to-use-pipeline-component/pipeline-component-right-panel.png":::
104+
# [Python SDK](#tab/python)
85105

86-
## Sample notebooks
106+
You can use the pipeline component as a step like other components in the pipeline job.
87107

88-
- [nyc_taxi_data_regression_with_pipeline_component](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/nyc_taxi_data_regression_with_pipeline_component/nyc_taxi_data_regression_with_pipeline_component.ipynb)
89-
- [pipeline_with_train_eval_pipeline_component](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb)
108+
[!Notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/pipeline_with_train_eval_pipeline_component/pipeline_with_train_eval_pipeline_component.ipynb?name=pipeline-component-pipeline-job)]
109+
110+
>[!NOTE]
111+
>To share or reuse components across jobs in the workspace, you need to register the components. You can use [`ml_client.components.create_or_update`](/python/api/azure-ai-ml/azure.ai.ml.mlclient#azure-ai-ml-mlclient-create-or-update) to register pipeline components.
112+
113+
You can find other Python SDK v2 pipeline component-related notebooks and information at [Pipeline component](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component) in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) GitHub repository.
114+
115+
# [Studio UI](#tab/ui)
116+
117+
After you register a pipeline component, you can drag and drop the component into the studio Designer canvas and use the UI to build a pipeline job. For detailed instructions, see [Create pipelines using registered components](how-to-create-component-pipelines-ui.md#create-pipeline-using-registered-component).
118+
119+
The following screenshots are from the [nyc_taxi_data_regression_with_pipeline_component](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1j_pipeline_with_pipeline_component/nyc_taxi_data_regression_with_pipeline_component/nyc_taxi_data_regression_with_pipeline_component.ipynb) notebook in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) GitHub repository.
120+
121+
:::image type="content" source="./media/how-to-use-pipeline-component/pipeline-component-authoring.png" alt-text="Screenshot of the Designer canvas page to build a pipeline job with a pipeline component." lightbox= "./media/how-to-use-pipeline-component/pipeline-component-authoring.png":::
122+
123+
After you submit a pipeline job, you can go to the pipeline job detail page to check the pipeline component status. You can also drill down to child components in the pipeline component to debug the components.
124+
125+
:::image type="content" source="./media/how-to-use-pipeline-component/pipeline-component-right-panel.png" alt-text="Screenshot of View pipeline component on the pipeline job detail page." lightbox= "./media/how-to-use-pipeline-component/pipeline-component-right-panel.png":::
126+
127+
---
90128

91-
## Next steps
129+
## Related content
92130

93131
- [YAML reference for pipeline component](reference-yaml-component-pipeline.md)
94-
- [Track an experiment](how-to-log-view-metrics.md)
95-
- [Deploy a trained model](how-to-deploy-managed-online-endpoints.md)
96-
- [Deploy a pipeline with batch endpoints](how-to-use-batch-pipeline-deployments.md)
132+
- [Manage inputs and outputs of components and pipelines](how-to-manage-inputs-outputs-pipeline.md)
133+
- [Deploy your pipeline as batch endpoint](how-to-deploy-pipeline-component-as-batch-endpoint.md)
-26.4 KB
-60.2 KB