# Manage inputs and outputs for components and pipelines
Azure Machine Learning pipelines support inputs and outputs at both the component and pipeline levels. This article describes pipeline and component inputs and outputs and how to manage them.
At the component level, the inputs and outputs define the component interface. You can use the output from one component as an input for another component in the same parent pipeline, allowing for data or models to be passed between components. This interconnectivity represents the data flow within the pipeline.
At the pipeline level, you can use inputs and outputs to submit pipeline jobs with varying data inputs or parameters, such as `learning_rate`. Inputs and outputs are especially useful when you invoke a pipeline via a REST endpoint. You can assign different values to the pipeline input or access the output of different pipeline jobs. For more information, see [Create jobs and input data for batch endpoints](how-to-access-data-batch-endpoints-jobs.md).
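As a minimal sketch (assuming the CLI v2 pipeline job YAML; the `train_job` step, component file, and data asset shown here are hypothetical), a pipeline-level input such as `learning_rate` is declared once and bound into a component:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
inputs:
  # Pipeline-level inputs; can be assigned different values per submission,
  # for example when invoking the pipeline via a REST endpoint.
  learning_rate: 0.01
  raw_data:
    type: uri_folder
    path: azureml:nyc-taxi-data@latest   # hypothetical registered data asset
jobs:
  train_job:
    component: ./train.yml               # hypothetical component file
    inputs:
      # Bind pipeline-level inputs to the component's inputs.
      learning_rate: ${{parent.inputs.learning_rate}}
      input_data: ${{parent.inputs.raw_data}}
```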
## Input and output types
These examples are from the [NYC Taxi Data Regression](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression) pipeline:
- The [prep component](https://github.com/Azure/azureml-examples/blob/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression/prep.yml) has a `uri_folder` type output. The component source code reads the CSV files from the input folder, processes the files, and writes the processed CSV files to the output folder.
- The [train component](https://github.com/Azure/azureml-examples/blob/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression/train.yml) has a `mlflow_model` type output. The component source code saves the trained model using the `mlflow.sklearn.save_model` method.
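As an interface sketch (excerpts only; the output names are assumptions, not necessarily the exact names used in the sample files), outputs like these are declared in the component YAML:

```yaml
# prep.yml (excerpt): folder output for the processed CSV files
outputs:
  prep_data:
    type: uri_folder
---
# train.yml (excerpt): model output saved with mlflow.sklearn.save_model
outputs:
  model_output:
    type: mlflow_model
```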
### Output serialization
Using data or model outputs serializes the outputs and saves them as files in a storage location. Later steps can access the files during job execution by mounting this storage location or by downloading or uploading the files to the compute file system.
The `ro_mount` or `rw_mount` modes are recommended for most cases. For more information, see [Modes](how-to-read-write-data-v2.md#modes).
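For example (a sketch in CLI v2 job YAML; the output name is hypothetical), the mode is set on the output entry:

```yaml
outputs:
  processed_data:
    type: uri_folder
    mode: rw_mount   # mount the output storage read/write rather than uploading after the job
```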
## Inputs and outputs in pipeline graphs
On the pipeline job page in Azure Machine Learning studio, component inputs and outputs appear as small circles called input/output ports. These ports represent the data flow in the pipeline. Pipeline-level outputs are displayed in purple boxes for easy identification.
The following screenshot from the [NYC Taxi Data Regression](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression) pipeline graph shows many component and pipeline inputs and outputs.
:::image type="content" source="./media/how-to-manage-pipeline-input-output/input-output-port.png" lightbox="./media/how-to-manage-pipeline-input-output/input-output-port.png" alt-text="Screenshot highlighting the pipeline input and output ports.":::
When you hover over an input/output port, the type is displayed.
:::image type="content" source="./media/how-to-manage-pipeline-input-output/hover-port.png" alt-text="Screenshot that highlights the port type when hovering over the port.":::
The pipeline graph doesn't display primitive type inputs. These inputs appear on the **Settings** tab of the pipeline **Job overview** panel for pipeline-level inputs, or the component panel for component-level inputs. To open the component panel, double-click the component in the graph.
By default, all inputs are required and must either have a default value or be assigned a value when you submit the pipeline job.
Setting optional inputs can be useful in two scenarios:
- If you define an optional data/model type input and don't assign a value to it when you submit the pipeline job, the pipeline component lacks that data dependency. If the component's input port isn't linked to any component or data/model node, the pipeline invokes the component directly instead of waiting for a preceding dependency.
- If you set `continue_on_step_failure = True` for the pipeline but `node2` uses required input from `node1`, `node2` doesn't execute if `node1` fails. If the `node1` input is optional, `node2` executes even if `node1` fails. The following graph demonstrates this scenario.
:::image type="content" source="./media/how-to-manage-pipeline-input-output/continue-on-failure-optional-input.png" alt-text="Screenshot showing the orchestration logic of optional input and continue on failure.":::
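In component YAML, an input is marked optional with `optional: true` (a sketch; the input names are hypothetical):

```yaml
inputs:
  training_data:
    type: uri_folder   # required: must be assigned or linked to an upstream output
  validation_data:
    type: uri_folder
    optional: true     # if not assigned, the step runs without this dependency
```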
You can register the output of a component or pipeline as a named asset by assigning a `name` and `version` to the output. The registered asset can be listed in your workspace through the studio UI, CLI, or SDK and referenced in future workspace jobs.
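For example (a pipeline job YAML sketch; the asset name and version are placeholders), assign `name` and `version` on a pipeline output to register it:

```yaml
outputs:
  trained_model:
    type: mlflow_model
    name: nyc_taxi_model   # placeholder name for the registered asset
    version: "1"           # placeholder version
```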