Commit aa63ca4

Merge pull request #219699 from Blackmist/add-yaml-identity-field
updating to add identity information
2 parents e85aaf8 + a3cbcbf commit aa63ca4

3 files changed: 55 additions, 10 deletions

articles/machine-learning/reference-yaml-job-command.md

Lines changed: 16 additions, 1 deletion

@@ -10,7 +10,7 @@ ms.custom: cliv2, event-tier1-build-2022
 author: balapv
 ms.author: balapv
-ms.date: 08/08/2022
+ms.date: 11/28/2022
 ms.reviewer: larryfr
 ---

@@ -48,6 +48,7 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | `inputs.<input_name>` | number, integer, boolean, string or object | One of a literal value (of type number, integer, boolean, or string) or an object containing a [job input data specification](#job-inputs). | | |
 | `outputs` | object | Dictionary of output configurations of the job. The key is a name for the output within the context of the job and the value is the output configuration. <br><br> Outputs can be referenced in the `command` using the `${{ outputs.<output_name> }}` expression. | |
 | `outputs.<output_name>` | object | You can leave the object empty, in which case by default the output will be of type `uri_folder` and Azure ML will system-generate an output location for the output. File(s) to the output directory will be written via read-write mount. If you want to specify a different mode for the output, provide an object containing the [job output specification](#job-outputs). | |
+| `identity` | object | The identity used for data access. It can be [UserIdentityConfiguration](#useridentityconfiguration), [ManagedIdentityConfiguration](#managedidentityconfiguration), or None. For UserIdentityConfiguration, the identity of the job submitter is used to access input data and write results to the output folder; otherwise, the managed identity of the compute target is used. | |

 ### Distribution configurations

@@ -88,6 +89,20 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | `type` | string | The type of job output. For the default `uri_folder` type, the output will correspond to a folder. | `uri_folder`, `mlflow_model`, `custom_model` | `uri_folder` |
 | `mode` | string | Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode (`rw_mount`) the output directory will be a mounted directory. For upload mode the file(s) written will get uploaded at the end of the job. | `rw_mount`, `upload` | `rw_mount` |

+### Identity configurations
+
+#### UserIdentityConfiguration
+
+| Key | Type | Description | Allowed values |
+| --- | ---- | ----------- | -------------- |
+| `type` | const | **Required.** Identity type. | `user_identity` |
+
+#### ManagedIdentityConfiguration
+
+| Key | Type | Description | Allowed values |
+| --- | ---- | ----------- | -------------- |
+| `type` | const | **Required.** Identity type. | `managed` or `managed_identity` |
+
 ## Remarks

 The `az ml job` command can be used for managing Azure Machine Learning jobs.
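The `identity` field added to this schema can be set at the top level of a command job YAML. The following is a minimal sketch, not taken from this commit; the command, environment, compute, and data-asset names are placeholders:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: python train.py --data ${{inputs.training_data}}
code: src
environment: azureml:my-training-env@latest   # placeholder environment
compute: azureml:cpu-cluster                  # placeholder compute target
inputs:
  training_data:
    type: uri_folder
    path: azureml:my-dataset@latest           # placeholder data asset
identity:
  type: user_identity   # the job submitter's identity accesses inputs and outputs
```

With `type: user_identity`, input data is read and results are written as the submitting user; omitting `identity` falls back to the managed identity of the compute target, per the table above.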

articles/machine-learning/reference-yaml-job-pipeline.md

Lines changed: 16 additions, 1 deletion

@@ -9,7 +9,7 @@ ms.topic: reference
 ms.custom: cliv2, event-tier1-build-2022
 author: cloga
 ms.author: lochen
-ms.date: 08/08/2022
+ms.date: 11/28/2022
 ms.reviewer: scottpolly
 ---

@@ -42,6 +42,7 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | `inputs.<input_name>` | number, integer, boolean, string or object | One of a literal value (of type number, integer, boolean, or string) or an object containing a [job input data specification](#job-inputs). | | |
 | `outputs` | object | Dictionary of output configurations of the pipeline job. The key is a name for the output within the context of the job and the value is the output configuration. <br><br> These pipeline outputs can be referenced by the outputs of an individual step job in the pipeline using the `${{ parents.outputs.<output_name> }}` expression. For more information on how to bind the inputs of a pipeline step to the inputs of the top-level pipeline job, see the [Expression syntax for binding inputs and outputs between steps in a pipeline job](reference-yaml-core-syntax.md#binding-inputs-and-outputs-between-steps-in-a-pipeline-job). | |
 | `outputs.<output_name>` | object | You can leave the object empty, in which case by default the output will be of type `uri_folder` and Azure ML will system-generate an output location for the output based on the following templatized path: `{settings.datastore}/azureml/{job-name}/{output-name}/`. File(s) to the output directory will be written via read-write mount. If you want to specify a different mode for the output, provide an object containing the [job output specification](#job-outputs). | |
+| `identity` | object | The identity used for data access. It can be [UserIdentityConfiguration](#useridentityconfiguration), [ManagedIdentityConfiguration](#managedidentityconfiguration), or None. For UserIdentityConfiguration, the identity of the job submitter is used to access input data and write results to the output folder; otherwise, the managed identity of the compute target is used. | |

 ### Attributes of the `settings` key

@@ -66,6 +67,20 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | `type` | string | The type of job output. For the default `uri_folder` type, the output will correspond to a folder. | `uri_file`, `uri_folder`, `mltable`, `mlflow_model` | `uri_folder` |
 | `mode` | string | Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode (`rw_mount`) the output directory will be a mounted directory. For upload mode the file(s) written will get uploaded at the end of the job. | `rw_mount`, `upload` | `rw_mount` |

+### Identity configurations
+
+#### UserIdentityConfiguration
+
+| Key | Type | Description | Allowed values |
+| --- | ---- | ----------- | -------------- |
+| `type` | const | **Required.** Identity type. | `user_identity` |
+
+#### ManagedIdentityConfiguration
+
+| Key | Type | Description | Allowed values |
+| --- | ---- | ----------- | -------------- |
+| `type` | const | **Required.** Identity type. | `managed` or `managed_identity` |
+
 ## Remarks

 The `az ml job` commands can be used for managing Azure Machine Learning pipeline jobs.
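At the pipeline level the new field has the same shape. A minimal sketch (the compute, environment, and step names are placeholders, not values from this commit):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
settings:
  default_compute: azureml:cpu-cluster   # placeholder compute target
identity:
  type: managed   # use the compute target's managed identity for data access
jobs:
  train:
    type: command
    command: python train.py
    code: src
    environment: azureml:my-training-env@latest   # placeholder environment
```

Here `type: managed` makes data access run under the compute target's managed identity rather than the submitting user's.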

articles/machine-learning/reference-yaml-job-sweep.md

Lines changed: 23 additions, 8 deletions

@@ -9,8 +9,8 @@ ms.topic: reference
 ms.custom: cliv2, event-tier1-build-2022
 ms.author: amipatel
 author: amibp
-ms.date: 08/08/2022
-ms.reviewer: nibaccam
+ms.date: 11/28/2022
+ms.reviewer: larryfr
 ---

 # CLI (v2) sweep job YAML schema

@@ -47,6 +47,7 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | `inputs.<input_name>` | number, integer, boolean, string or object | One of a literal value (of type number, integer, boolean, or string) or an object containing a [job input data specification](#job-inputs). | | |
 | `outputs` | object | Dictionary of output configurations of the job. The key is a name for the output within the context of the job and the value is the output configuration. <br><br> Outputs can be referenced in the `command` using the `${{ outputs.<output_name> }}` expression. | |
 | `outputs.<output_name>` | object | You can leave the object empty, in which case by default the output will be of type `uri_folder` and Azure ML will system-generate an output location for the output. File(s) to the output directory will be written via read-write mount. If you want to specify a different mode for the output, provide an object containing the [job output specification](#job-outputs). | |
+| `identity` | object | The identity used for data access. It can be [UserIdentityConfiguration](#useridentityconfiguration), [ManagedIdentityConfiguration](#managedidentityconfiguration), or None. For UserIdentityConfiguration, the identity of the job submitter is used to access input data and write results to the output folder; otherwise, the managed identity of the compute target is used. | |

 ### Sampling algorithms

@@ -161,18 +162,18 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | Key | Type | Description | Default value |
 | --- | ---- | ----------- | ------------- |
-| `max_total_trials` | integer | The maximum time in seconds the job is allowed to run. Once this limit is reached the system will cancel the job. | `1000` |
+| `max_total_trials` | integer | The maximum time in seconds the job is allowed to run. Once this limit is reached, the system will cancel the job. | `1000` |
 | `max_concurrent_trials` | integer | | Defaults to `max_total_trials`. |
-| `timeout` | integer | The maximum time in seconds the entire sweep job is allowed to run. Once this limit is reached the system will cancel the sweep job, including all its trials. | `604800` |
-| `trial_timeout` | integer | The maximum time in seconds each trial job is allowed to run. Once this limit is reached the system will cancel the trial. | |
+| `timeout` | integer | The maximum time in seconds the entire sweep job is allowed to run. Once this limit is reached, the system will cancel the sweep job, including all its trials. | `604800` |
+| `trial_timeout` | integer | The maximum time in seconds each trial job is allowed to run. Once this limit is reached, the system will cancel the trial. | |

 ### Attributes of the `trial` key

 | Key | Type | Description | Default value |
 | --- | ---- | ----------- | ------------- |
 | `command` | string | **Required.** The command to execute. | |
 | `code` | string | Local path to the source code directory to be uploaded and used for the job. | |
-| `environment` | string or object | **Required.** The environment to use for the job. This can be either a reference to an existing versioned environment in the workspace or an inline environment specification. <br> <br> To reference an existing environment use the `azureml:<environment-name>:<environment-version>` syntax. <br><br> To define an environment inline please follow the [Environment schema](reference-yaml-environment.md#yaml-syntax). Exclude the `name` and `version` properties as they are not supported for inline environments. | |
+| `environment` | string or object | **Required.** The environment to use for the job. This can be either a reference to an existing versioned environment in the workspace or an inline environment specification. <br> <br> To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. <br><br> To define an environment inline, follow the [Environment schema](reference-yaml-environment.md#yaml-syntax). Exclude the `name` and `version` properties as they aren't supported for inline environments. | |
 | `environment_variables` | object | Dictionary of environment variable name-value pairs to set on the process where the command is executed. | |
 | `distribution` | object | The distribution configuration for distributed training scenarios. One of [MpiConfiguration](#mpiconfiguration), [PyTorchConfiguration](#pytorchconfiguration), or [TensorFlowConfiguration](#tensorflowconfiguration). | |
 | `resources.instance_count` | integer | The number of nodes to use for the job. | `1` |

@@ -206,8 +207,8 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | Key | Type | Description | Allowed values | Default value |
 | --- | ---- | ----------- | -------------- | ------------- |
 | `type` | string | The type of job input. Specify `uri_file` for input data that points to a single file source, or `uri_folder` for input data that points to a folder source. [Learn more about data access.](concept-data.md) | `uri_file`, `uri_folder`, `mltable`, `mlflow_model` | `uri_folder` |
-| `path` | string | The path to the data to use as input. This can be specified in a few ways: <br><br> - A local path to the data source file or folder, e.g. `path: ./iris.csv`. The data will get uploaded during job submission. <br><br> - A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, `adl`. See [Core yaml syntax](reference-yaml-core-syntax.md) for more information on how to use the `azureml://` URI format. <br><br> - An existing registered Azure ML data asset to use as the input. To reference a registered data asset use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of that data asset), e.g. `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`. | | |
-| `mode` | string | Mode of how the data should be delivered to the compute target. <br><br> For read-only mount (`ro_mount`), the data will be consumed as a mount path. A folder will be mounted as a folder and a file will be mounted as a file. Azure ML will resolve the input to the mount path. <br><br> For `download` mode the data will be downloaded to the compute target. Azure ML wil resolve the input to the downloaded path. <br><br> If you only want the URL of the storage location of the data artifact(s) rather than mounting or downloading the data itself, you can use the `direct` mode. This will pass in the URL of the storage location as the job input. Note that in this case you are fully responsible for handling credentials to access the storage. | `ro_mount`, `download`, `direct` | `ro_mount` |
+| `path` | string | The path to the data to use as input. This can be specified in a few ways: <br><br> - A local path to the data source file or folder, for example, `path: ./iris.csv`. The data will get uploaded during job submission. <br><br> - A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, `adl`. For more information on using the `azureml://` URI format, see [Core yaml syntax](reference-yaml-core-syntax.md). <br><br> - An existing registered Azure ML data asset to use as the input. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of that data asset), for example, `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`. | | |
+| `mode` | string | Mode of how the data should be delivered to the compute target. <br><br> For read-only mount (`ro_mount`), the data will be consumed as a mount path. A folder will be mounted as a folder and a file will be mounted as a file. Azure ML will resolve the input to the mount path. <br><br> For `download` mode the data will be downloaded to the compute target. Azure ML will resolve the input to the downloaded path. <br><br> If you only want the URL of the storage location of the data artifact(s) rather than mounting or downloading the data itself, you can use the `direct` mode. This will pass in the URL of the storage location as the job input. In this case you're fully responsible for handling credentials to access the storage. | `ro_mount`, `download`, `direct` | `ro_mount` |

 ### Job outputs

@@ -216,6 +217,20 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late
 | `type` | string | The type of job output. For the default `uri_folder` type, the output will correspond to a folder. | `uri_file`, `uri_folder`, `mltable`, `mlflow_model` | `uri_folder` |
 | `mode` | string | Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode (`rw_mount`) the output directory will be a mounted directory. For upload mode the file(s) written will get uploaded at the end of the job. | `rw_mount`, `upload` | `rw_mount` |

+### Identity configurations
+
+#### UserIdentityConfiguration
+
+| Key | Type | Description | Allowed values |
+| --- | ---- | ----------- | -------------- |
+| `type` | const | **Required.** Identity type. | `user_identity` |
+
+#### ManagedIdentityConfiguration
+
+| Key | Type | Description | Allowed values |
+| --- | ---- | ----------- | -------------- |
+| `type` | const | **Required.** Identity type. | `managed` or `managed_identity` |
+
 ## Remarks

 The `az ml job` command can be used for managing Azure Machine Learning jobs.
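For the sweep schema, the `identity` block sits alongside the `limits` and `trial` keys touched in this diff. A minimal sketch with placeholder names, not taken from this commit (the `${{search_space.<name>}}` expression and all resource names are assumptions for illustration):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json
type: sweep
sampling_algorithm:
  type: random
search_space:
  learning_rate:
    type: uniform
    min_value: 0.001
    max_value: 0.1
objective:
  goal: minimize
  primary_metric: validation_loss   # placeholder metric name
limits:
  max_total_trials: 20              # placeholder limits
  timeout: 3600
trial:
  command: python train.py --lr ${{search_space.learning_rate}}
  code: src
  environment: azureml:my-training-env@latest   # placeholder environment
compute: azureml:cpu-cluster                    # placeholder compute target
identity:
  type: managed_identity   # alias of `managed`; uses the compute's managed identity
```

Per the tables above, `managed_identity` and `managed` are interchangeable values for the `type` key.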
