Commit 25051b2

Update how-to-identity-based-service-authentication.md
Added section for kubernetes cluster
1 parent 4be3b47 commit 25051b2

1 file changed

+49
-0
lines changed

articles/machine-learning/how-to-identity-based-service-authentication.md

Lines changed: 49 additions & 0 deletions
@@ -413,6 +413,55 @@ The following steps outline how to set up data access with user identity for tra
> [!IMPORTANT]
> During job submission with authentication with user identity enabled, the code snapshots are protected against tampering by checksum validation. If you have existing pipeline components and intend to use them with authentication with user identity enabled, you might need to re-upload them. Otherwise the job may fail during checksum validation.

### Access data for training jobs on AKS clusters using user identity

When training on Azure Kubernetes Service (AKS) clusters, authentication to dependent Azure resources works differently.

The following steps outline how to set up data access with a given managed identity for training jobs on AKS clusters:

1. Create the AKS cluster and [attach it to your Azure Machine Learning workspace](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=sdk#how-to-attach-a-kubernetes-cluster-to-azure-machine-learning-workspace).

1. Ensure that the Kubernetes cluster has an [assigned managed identity](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=sdk#assign-managed-identity) and that the identity has the necessary [Azure roles assigned to it](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=sdk#assign-azure-roles-to-managed-identity).

1. When submitting the job, provide the managed identity of the compute **without specifying the `client_id`** in the parameters:

```yaml
command: |
  echo "--census-csv: ${{inputs.census_csv}}"
  python hello-census.py --census-csv ${{inputs.census_csv}}
code: src
inputs:
  census_csv:
    type: uri_file
    path: azureml://datastores/mydata/paths/census.csv
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
compute: azureml:kubernetes-cluster
```

```python
from azure.ai.ml import Input, ManagedIdentityConfiguration, command
from azure.ai.ml.constants import AssetTypes

# Specify the data location
my_job_inputs = {
    "input_data": Input(type=AssetTypes.URI_FILE, path="<path-to-my-data>")
}

# Define the job
job = command(
    code="<my-local-code-location>",
    command="python <my-script>.py --input_data ${{inputs.input_data}}",
    inputs=my_job_inputs,
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:9",
    compute="<my-kubernetes-cluster-name>",
    # No client_id is passed, so the managed identity assigned to the cluster is used
    identity=ManagedIdentityConfiguration(),
)

# Submit the command job.
# ml_client is assumed to be an authenticated MLClient for your workspace.
returned_job = ml_client.jobs.create_or_update(job)
```

In the YAML job specification, you can leave the `identity` property unspecified; it defaults to the managed identity of the Kubernetes cluster.
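
As a rough illustration of the `client_id` rule in the steps above, the following sketch checks a job specification that has already been loaded into a Python dictionary before it is submitted. The helper `find_identity_issues` is hypothetical and not part of the Azure Machine Learning SDK or CLI. It assumes only that the specification may carry an optional `identity` mapping, and it uses the standard library alone.

```python
from typing import Any


def find_identity_issues(job_spec: dict[str, Any]) -> list[str]:
    """Hypothetical pre-submission check for the identity settings of a job spec."""
    issues: list[str] = []
    identity = job_spec.get("identity")

    # No identity block: the job falls back to the managed identity
    # of the Kubernetes cluster, so there is nothing to report.
    if identity is None:
        return issues

    if not isinstance(identity, dict):
        issues.append("identity must be a mapping")
        return issues

    # The steps above ask for the compute's managed identity without a client_id.
    if "client_id" in identity:
        issues.append("identity.client_id is set; remove it to use the cluster's managed identity")

    return issues


# Example: a spec shaped like the YAML above, with no identity block.
job_spec = {
    "command": "python hello-census.py --census-csv ${{inputs.census_csv}}",
    "code": "src",
    "compute": "azureml:kubernetes-cluster",
}
print(find_identity_issues(job_spec))  # prints []
```

A check like this could run in a CI step ahead of job submission. It only inspects the dictionary and does not contact Azure.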
### Work with virtual networks

By default, Azure Machine Learning can't communicate with a storage account that's behind a firewall or in a virtual network.
