
Commit 9154b4c

Merge pull request #7139 from s-polly/stp_ml_freshness_9-17
ML Freshness - 9-17
2 parents abc2c4c + b85d2db commit 9154b4c

File tree

3 files changed

+145
-186
lines changed


articles/machine-learning/how-to-manage-inputs-outputs-pipeline.md

Lines changed: 65 additions & 45 deletions
@@ -8,7 +8,7 @@ ms.subservice: core
 ms.author: lagayhar
 author: lgayhardt
 ms.reviewer: zhanxia
-ms.date: 09/13/2024
+ms.date: 09/18/2025
 ms.topic: how-to
 ms.custom: devplatv2, pipeline, devx-track-azurecli, update-code6
 ---
@@ -45,7 +45,7 @@ Primitive type output isn't supported.

 ### Example inputs and outputs

-These examples are from the [NYC Taxi Data Regression](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression) pipeline in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) GitHub repository.
+These examples are from the [NYC Taxi Data Regression](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression) pipeline in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) GitHub repository:

 - The [train component](https://github.com/Azure/azureml-examples/blob/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression/train.yml) has a `number` input named `test_split_ratio`.
 - The [prep component](https://github.com/Azure/azureml-examples/blob/main/cli/jobs/pipelines-with-components/nyc_taxi_data_regression/prep.yml) has a `uri_folder` type output. The component source code reads the CSV files from the input folder, processes the files, and writes the processed CSV files to the output folder.
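As a rough sketch of what those two declarations look like in component YAML (the default value and output name here are hypothetical, not taken from the example repo):

```yaml
# Hypothetical component-spec fragment: a number input and a uri_folder output
inputs:
  test_split_ratio:
    type: number
    default: 0.2  # hypothetical default; see train.yml for the real value
outputs:
  prep_data:
    type: uri_folder
```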
@@ -59,7 +59,7 @@ The component source code must serialize the output object, which is usually sto

 ## Data type input and output paths

-For data asset inputs and outputs, you must specify a path parameter that points to the data location. The following table shows the supported data locations for Azure Machine Learning pipeline inputs and outputs, with `path` parameter examples.
+For data asset inputs and outputs, you must specify a path parameter that points to the data location. The following table shows the supported data locations for Azure Machine Learning pipeline inputs and outputs, with `path` parameter examples:

 |Location | Input | Output | Example |
 |---------|---------|---------|---------|
@@ -69,7 +69,8 @@ For data asset inputs and outputs, you must specify a path parameter that points
 |A path on an Azure Machine Learning datastore ||| `azureml://datastores/<data_store_name>/paths/<path>` |
 |A path to a data asset ||| `azureml:my_data:<version>` |

-\* Using Azure Storage directly isn't recommended for input, because it might need extra identity configuration to read the data. It's better to use Azure Machine Learning datastore paths, which are supported across various pipeline job types.
+> [!TIP]
+> Using Azure Storage directly isn't recommended for input, because it can need extra identity configuration to read the data. It's better to use Azure Machine Learning datastore paths, which are supported across various pipeline job types.
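To make the table concrete, a minimal sketch of inputs and outputs using the datastore and data-asset path forms (the datastore name `workspaceblobstore` and the folder names are hypothetical):

```yaml
# Hypothetical job fragment illustrating the supported path forms
inputs:
  raw_data:
    type: uri_folder
    # a path on an Azure Machine Learning datastore
    path: azureml://datastores/workspaceblobstore/paths/raw/taxi/
  training_table:
    type: mltable
    # a registered data asset, pinned to version 1
    path: azureml:my_data:1
outputs:
  cleaned_data:
    type: uri_folder
    # output written back to a datastore path
    path: azureml://datastores/workspaceblobstore/paths/cleaned/taxi/
```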
## Data type input and output modes

@@ -84,7 +85,7 @@ Type | `upload` | `download` | `ro_mount` | `rw_mount` | `direct` | `eval_downlo
 `uri_file` output | ✓ | | | ✓ | | |
 `mltable` output | ✓ | | | ✓ | ✓ | |

-The `ro_mount` or `rw_mount` modes are recommended for most cases. For more information, see [Modes](how-to-read-write-data-v2.md#modes).
+We recommend the `ro_mount` or `rw_mount` modes for most cases. For more information, see [Modes](how-to-read-write-data-v2.md#modes).
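A mode is attached next to the path in job YAML; a minimal sketch using the recommended defaults (datastore and folder names hypothetical):

```yaml
# Hypothetical job fragment attaching modes to an input and an output
inputs:
  training_data:
    type: uri_folder
    path: azureml://datastores/workspaceblobstore/paths/train/
    mode: ro_mount   # read-only mount, recommended default for inputs
outputs:
  model_output:
    type: uri_folder
    mode: rw_mount   # read-write mount, recommended default for outputs
```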
## Inputs and outputs in pipeline graphs

@@ -245,7 +246,7 @@ Job `{name}` is resolved at job execution time, and `{output_name}` is the name

 # [Azure CLI](#tab/cli)

-The [pipeline.yml](https://github.com/Azure/azureml-examples/blob/main/cli/jobs/pipelines-with-components/basics/1b_e2e_registered_components/pipeline.yml) file at [train-score-eval pipeline with registered components example](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/basics/1b_e2e_registered_components) defines a pipeline that has three pipeline level outputs. You can use the following command to set custom output paths for the `pipeline_job_trained_model` output.
+The [pipeline.yml](https://github.com/Azure/azureml-examples/blob/main/cli/jobs/pipelines-with-components/basics/1b_e2e_registered_components/pipeline.yml) file at [train-score-eval pipeline with registered components example](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/basics/1b_e2e_registered_components) defines a pipeline that has three pipeline level outputs. Use the following command to set custom output paths for the `pipeline_job_trained_model` output:

 ```azurecli
 # define the custom output path using datastore uri
@@ -258,9 +259,28 @@ az ml job create -f ./pipeline.yml --set outputs.pipeline_job_trained_model.path

 # [Python SDK](#tab/python)

-The following code that demonstrates how to customize output paths is from the [Build pipeline with command_component decorated python function](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1b_pipeline_with_python_function_components/pipeline_with_python_function_components.ipynb) notebook.
+The following code demonstrates how to customize output paths and is from the [Build pipeline with command_component decorated python function](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1b_pipeline_with_python_function_components/pipeline_with_python_function_components.ipynb) notebook:

-[!Notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1b_pipeline_with_python_function_components/pipeline_with_python_function_components.ipynb?name=custom-output-path)]
+```python
+from azure.ai.ml import Input, load_component
+from azure.ai.ml.dsl import pipeline
+
+# Load component functions
+components_dir = "./components/"
+helloworld_component = load_component(source=f"{components_dir}/helloworld_component.yml")
+
+@pipeline()
+def register_node_output():
+    # Call component obj as function: apply given inputs & parameters to create a node in pipeline
+    node = helloworld_component(component_in_path=Input(
+        type='uri_file', path='https://dprepdata.blob.core.windows.net/demo/Titanic.csv'))
+
+    # Define name and version to register node output
+    node.outputs.component_out_path.name = 'node_output'
+    node.outputs.component_out_path.version = '1'
+
+pipeline = register_node_output()
+pipeline.settings.default_compute = "azureml:cpu-cluster"
+```

 # [Studio UI](#tab/ui)

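The `--set` override shown for `az ml job create` can also be written directly into the pipeline YAML instead of being passed on the command line; a minimal sketch, assuming a hypothetical datastore named `my_datastore`:

```yaml
# Hypothetical pipeline.yml fragment overriding the default output location
outputs:
  pipeline_job_trained_model:
    type: uri_folder
    mode: rw_mount
    path: azureml://datastores/my_datastore/paths/custom/trained-model/
```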
@@ -423,64 +443,64 @@ def register_pipeline_output():
        'component_out_path': node.outputs.component_out_path
    }

pipeline = register_pipeline_output()
# Define name and version to register pipeline output
pipeline.settings.default_compute = "azureml:cpu-cluster"
pipeline.outputs.component_out_path.name = 'pipeline_output'
pipeline.outputs.component_out_path.version = '1'
```

# [Studio UI](#tab/ui)

On the **Overview** tab for a pipeline job, select a **Data asset** link under **Inputs** or **Outputs**. On the data asset page, select **Register**.

:::image type="content" source="./media/how-to-manage-pipeline-input-output/register-output.png" alt-text="Screenshot showing how to register output from a pipeline job.":::

---

### Register component output

# [Azure CLI](#tab/cli)

```yaml
display_name: register_node_output
type: pipeline
jobs:
  node:
    type: command
    component: ../components/helloworld_component.yml
    inputs:
      component_in_path:
        type: uri_file
        path: 'https://dprepdata.blob.core.windows.net/demo/Titanic.csv'
    outputs:
      component_out_path:
        type: uri_folder
        name: 'node_output' # Define name and version to register a child job's output
        version: '1'
settings:
  default_compute: azureml:cpu-cluster
```

# [Python SDK](#tab/python)

```python
from azure.ai.ml import Input, load_component
from azure.ai.ml.dsl import pipeline

# Load component functions
components_dir = "./components/"
helloworld_component = load_component(source=f"{components_dir}/helloworld_component.yml")

@pipeline()
def register_node_output():
    # Call component obj as function: apply given inputs & parameters to create a node in pipeline
    node = helloworld_component(component_in_path=Input(
        type='uri_file', path='https://dprepdata.blob.core.windows.net/demo/Titanic.csv'))

    # Define name and version to register node output
    node.outputs.component_out_path.name = 'node_output'
    node.outputs.component_out_path.version = '1'

pipeline = register_node_output()
pipeline.settings.default_compute = "azureml:cpu-cluster"
```

# [Studio UI](#tab/ui)

On the **Overview** tab for a component, select a **Data asset** link under **Inputs** or **Outputs**. On the data asset page, select **Register**.

---

## Related content

- [YAML reference for pipeline job](./reference-yaml-job-pipeline.md)
- [How to debug pipeline failure](./how-to-debug-pipeline-failure.md)
- [Schedule a pipeline job](./how-to-schedule-pipeline-job.md)
- [Deploy a pipeline with batch endpoints (preview)](./how-to-use-batch-pipeline-deployments.md)

0 commit comments
