Commit dc0e3fb

Merge pull request #219774 from rastala/main
Update how-to-identity-based-service-authentication.md
2 parents ccc4293 + 90ab0de commit dc0e3fb

1 file changed

+39
-12
lines changed

articles/machine-learning/how-to-identity-based-service-authentication.md

Lines changed: 39 additions & 12 deletions
@@ -189,7 +189,7 @@ Once the identity-based authentication is enabled, the compute managed identity
189189

190190
For information on configuring Azure RBAC for the storage, see [role-based access controls](../storage/blobs/assign-azure-role-data-access.md).
191191

192-
### Access data for training jobs on compute clusters using user identity (preview)
192+
### Access data for training jobs on compute clusters using user identity
193193

194194
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
195195

@@ -202,20 +202,13 @@ This authentication mode allows you to:
202202

203203
> [!IMPORTANT]
204204
> This functionality has the following limitations:
205-
> * Feature is only supported for experiments submitted via the [Azure Machine Learning CLI](how-to-configure-cli.md)
206-
> * Only CommandJobs, and PipelineJobs with CommandSteps and AutoMLSteps are supported
205+
> * This feature is supported for experiments submitted via the [Azure Machine Learning CLI and Python SDK v2](concept-v2.md), but not via Azure Machine Learning studio.
207206
> * User identity and compute managed identity can't be used for authentication within the same job.
207+
> * For pipeline jobs, the user identity must be configured at the top level of the job, not for individual pipeline steps.
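
The pipeline limitation above can be checked before submission. The following is a minimal sketch, not part of the Azure Machine Learning SDK: it assumes the pipeline job YAML has already been loaded into a plain dictionary, that steps live under a `jobs` key, and that identity is declared as `identity: {type: user_identity}`. The helper names and the sample specs are hypothetical.

```python
from typing import Any, Dict, List


def find_step_level_identities(pipeline_spec: Dict[str, Any]) -> List[str]:
    """Return the names of steps that declare their own `identity` block.

    Assumes steps are stored under the `jobs` key (an assumption about
    the shape of the loaded pipeline job YAML).
    """
    steps = pipeline_spec.get("jobs") or {}
    return [
        name
        for name, step in steps.items()
        if isinstance(step, dict) and "identity" in step
    ]


def uses_top_level_user_identity(pipeline_spec: Dict[str, Any]) -> bool:
    """Return True if the job declares user identity at the top level."""
    identity = pipeline_spec.get("identity") or {}
    return identity.get("type") == "user_identity"


# Hypothetical spec with identity configured once, at the top level of the job.
good_spec = {
    "type": "pipeline",
    "identity": {"type": "user_identity"},
    "jobs": {
        "prep": {"command": "python prep.py"},
        "train": {"command": "python train.py"},
    },
}

# Hypothetical spec that configures identity on an individual step instead.
bad_spec = {
    "type": "pipeline",
    "jobs": {
        "prep": {"command": "python prep.py"},
        "train": {
            "command": "python train.py",
            "identity": {"type": "user_identity"},
        },
    },
}

good_offenders = find_step_level_identities(good_spec)
bad_offenders = find_step_level_identities(bad_spec)
```

A check like this only inspects the local specification; the service remains the authority on whether a job is accepted.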
208208
209-
> [!WARNING]
210-
> This feature is __public preview__ and is __not secure for production workloads__. Ensure that only trusted users have permissions to access your workspace and storage accounts.
211-
>
212-
> Preview features are provided without a service-level agreement, and are not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
213-
>
214-
> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
215-
216-
The following steps outline how to set up identity-based data access for training jobs on compute clusters.
209+
The following steps outline how to set up data access with user identity for training jobs on compute clusters from the CLI.
217210

218-
1. Grant the user identity access to storage resources. For example, grant StorageBlobReader access to the specific storage account you want to use or grant ACL-based permission to specific folders or files in Azure Data Lake Gen 2 storage.
211+
1. Grant the user identity access to storage resources. For example, grant Storage Blob Data Reader access to the specific storage account you want to use, or grant ACL-based permission to specific folders or files in Azure Data Lake Storage Gen2.
219212

220213
1. Create an Azure Machine Learning datastore without cached credentials for the storage account. If a datastore has cached credentials, such as a storage account key, those credentials are used instead of the user identity.
221214

@@ -239,6 +232,40 @@ The following steps outline how to set up identity-based data access for trainin
239232
type: user_identity
240233
```
241234
235+
The following steps outline how to set up data access with user identity for training jobs on compute clusters from the Python SDK.
236+
237+
1. Grant data access and create a datastore as described above for the CLI.
238+
239+
1. Submit a training job with the identity parameter set to [azure.ai.ml.UserIdentity](https://learn.microsoft.com/python/api/azure-ai-ml/azure.ai.ml.useridentity). This setting enables the job to access data on behalf of the user submitting the job.
240+
241+
```python
242+
from azure.ai.ml import command
243+
from azure.ai.ml.entities import Data, UriReference
244+
from azure.ai.ml import Input
245+
from azure.ai.ml.constants import AssetTypes
246+
from azure.ai.ml import UserIdentity
247+
248+
# Specify the data location
249+
my_job_inputs = {
250+
"input_data": Input(type=AssetTypes.URI_FILE, path="<path-to-my-data>")
251+
}
252+
253+
# Define the job
254+
job = command(
255+
code="<my-local-code-location>",
256+
command="python <my-script>.py --input_data ${{inputs.input_data}}",
257+
inputs=my_job_inputs,
258+
environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:9",
259+
compute="<my-compute-cluster-name>",
260+
identity=UserIdentity()
261+
)
262+
# Submit the job (ml_client is assumed to be an authenticated azure.ai.ml.MLClient)
263+
returned_job = ml_client.jobs.create_or_update(job)
264+
```
265+
266+
> [!IMPORTANT]
267+
> When a job is submitted with user identity authentication enabled, code snapshots are protected against tampering by checksum validation. If you intend to use existing pipeline components with user identity authentication, you might need to re-upload them. Otherwise, the job might fail during checksum validation.
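
To illustrate the general idea behind checksum validation of a code snapshot, here is a minimal sketch using only the Python standard library. It is not the algorithm Azure Machine Learning uses, which this article doesn't describe; the `snapshot_checksum` helper and the file names are hypothetical. The sketch hashes every file in a folder in a stable order, so any later change to the contents produces a different digest.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot_checksum(root: Path) -> str:
    """Compute one SHA-256 digest over all files under `root`.

    Files are visited in sorted order, and both the relative path and
    the contents are hashed, so renames and edits both change the result.
    """
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    script = root / "train.py"
    script.write_text("print('training')\n")

    # Record the checksum at "upload" time.
    recorded = snapshot_checksum(root)

    # An untouched snapshot validates against the recorded checksum.
    unchanged_ok = snapshot_checksum(root) == recorded

    # Modifying the code afterwards makes validation fail.
    script.write_text("print('tampered')\n")
    tamper_detected = snapshot_checksum(root) != recorded
```

A component uploaded without a recorded checksum has nothing to validate against, which is one plausible reason the note above suggests re-uploading existing components.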
268+
242269
### Work with virtual networks
243270

244271
By default, Azure Machine Learning can't communicate with a storage account that's behind a firewall or in a virtual network.
