Commit 694a752

Add Kubernetes compute cluster identity instructions
Edited the file according to Siyu Zhang's feedback.
1 parent 25051b2 commit 694a752

1 file changed

+30
-48
lines changed

articles/machine-learning/how-to-identity-based-service-authentication.md

Lines changed: 30 additions & 48 deletions
Original file line number, diff line number, diff line change
@@ -276,6 +276,36 @@ During cluster creation or when editing compute cluster details, in the **Advanc
276276

277277
---
278278

279+
### Kubernetes compute cluster
280+
281+
> [!NOTE]
282+
> Azure Machine Learning Kubernetes clusters support only **one system-assigned identity** or **one or more user-assigned identities**, not both concurrently.
283+
284+
The **default managed identity** is the system-assigned managed identity or the first user-assigned managed identity.
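
The selection rule can be sketched as a small helper. This is only an illustration of the rule as stated in this section: the function and its parameters are hypothetical and are not part of any Azure SDK.

```python
from typing import Optional, Sequence


def default_managed_identity(
    system_assigned: Optional[str],
    user_assigned: Sequence[str] = (),
) -> Optional[str]:
    """Return the identifier that would be treated as the default identity.

    Both parameters are hypothetical inputs that hold identity identifiers.
    """
    if system_assigned and user_assigned:
        # Per the note in this section, the two kinds cannot be combined.
        raise ValueError(
            "configure a system-assigned identity or user-assigned identities, not both"
        )
    if system_assigned:
        return system_assigned
    # Otherwise the first user-assigned identity is the default, if there is one.
    return user_assigned[0] if user_assigned else None
```

With two user-assigned identities and no system-assigned identity, the helper returns the first entry of the list.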
285+
286+
287+
During a run, an identity is used in two ways:
288+
289+
1. The system uses an identity to set up the user's storage mounts, container registry, and datastores.
290+
291+
    * In this case, the system uses the default managed identity.
292+
293+
1. You apply an identity to access resources from within the code for a submitted job:
294+
295+
    * For Kubernetes compute clusters, create the `ManagedIdentityCredential` object **without a `client_id`**.
296+
297+
For example, to retrieve a token for a datastore with the default managed identity:
298+
299+
```python
300+
from azure.identity import ManagedIdentityCredential
301+
credential = ManagedIdentityCredential()  # no client_id: the default managed identity is used
302+
token = credential.get_token('https://storage.azure.com/')
303+
```
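
As a possible next step, the token can be turned into a bearer header for a request to storage. The sketch below is not taken from the article: `storage_auth_headers` and `min_ttl_seconds` are hypothetical names, and the only assumption about the credential is that `get_token` returns an object with `token` and `expires_on` attributes, as the `azure.identity` credentials do.

```python
import time


def storage_auth_headers(credential, resource="https://storage.azure.com/", min_ttl_seconds=60):
    """Build an Authorization header from a token issued for `resource`.

    `credential` is any object with a get_token(*scopes) method, for example
    a ManagedIdentityCredential created without a client_id.
    """
    token = credential.get_token(resource)
    # Refuse tokens that are about to expire so that a request doesn't fail midway.
    if token.expires_on - time.time() < min_ttl_seconds:
        raise RuntimeError("access token expires too soon")
    return {"Authorization": f"Bearer {token.token}"}
```

Inside a job on a Kubernetes compute cluster, you would call it as `storage_auth_headers(ManagedIdentityCredential())`. Because the helper only relies on `get_token`, it can also be exercised locally with a stub credential.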
304+
305+
To configure a Kubernetes compute cluster, make sure that the [necessary Azure Machine Learning extension is deployed on it](https://learn.microsoft.com/azure/machine-learning/how-to-deploy-kubernetes-extension?view=azureml-api-2&tabs=deploy-extension-with-cli), then follow the documentation on [how to attach the Kubernetes compute cluster to your Azure Machine Learning workspace](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=cli).
306+
307+
---
308+
279309
### Data storage
280310

281311
When you create a datastore that uses **identity-based data access**, your Azure account ([Microsoft Entra token](/azure/active-directory/fundamentals/active-directory-whatis)) is used to confirm you have permission to access the storage service. In the **identity-based data access** scenario, no authentication credentials are saved. Only the storage account information is stored in the datastore.
@@ -413,54 +443,6 @@ The following steps outline how to set up data access with user identity for tra
413443
> [!IMPORTANT]
414444
> During job submission with authentication with user identity enabled, the code snapshots are protected against tampering by checksum validation. If you have existing pipeline components and intend to use them with authentication with user identity enabled, you might need to re-upload them. Otherwise the job may fail during checksum validation.
415445

416-
### Access data for training jobs on AKS clusters using user identity
417-
When you train on Azure Kubernetes Service (AKS) clusters, authentication to dependent Azure resources works differently.
418-
The following steps outline how to set up data access with a given managed identity for training jobs on AKS clusters:
419-
420-
1. Create the Azure Kubernetes Service cluster and [attach it to your Azure Machine Learning workspace](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=sdk#how-to-attach-a-kubernetes-cluster-to-azure-machine-learning-workspace).
421-
422-
1. Ensure that the Kubernetes cluster has an [assigned managed identity](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=sdk#assign-managed-identity) and that the identity has the necessary [Azure roles assigned to it](https://learn.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-to-workspace?view=azureml-api-2&tabs=sdk#assign-azure-roles-to-managed-identity).
423-
424-
1. When you submit the job, provide the managed identity of the compute **without specifying the `client_id`** in the parameters:
425-
426-
```yaml
427-
command: |
428-
  echo "--census-csv: ${{inputs.census_csv}}"
429-
  python hello-census.py --census-csv ${{inputs.census_csv}}
430-
code: src
431-
inputs:
432-
  census_csv:
433-
    type: uri_file
434-
    path: azureml://datastores/mydata/paths/census.csv
435-
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
436-
compute: azureml:kubernetes-cluster
437-
```
438-
439-
```python
440-
from azure.ai.ml import command
441-
from azure.ai.ml.entities import Data, UriReference
442-
from azure.ai.ml import Input
443-
from azure.ai.ml.constants import AssetTypes
444-
from azure.ai.ml import ManagedIdentityConfiguration
445-
446-
# Specify the data location
447-
my_job_inputs = {
448-
    "input_data": Input(type=AssetTypes.URI_FILE, path="<path-to-my-data>")
449-
}
450-
451-
# Define the job
452-
job = command(
453-
    code="<my-local-code-location>",
454-
    command="python <my-script>.py --input_data ${{inputs.input_data}}",
455-
    inputs=my_job_inputs,
456-
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:9",
457-
    compute="<my-kubernetes-cluster-name>",
458-
    identity=ManagedIdentityConfiguration(),
459-
)
460-
# submit the command
461-
returned_job = ml_client.jobs.create_or_update(job)
462-
```
463-
In this case, you can leave the `identity` property unspecified in the YAML, because it defaults to the managed identity of the Kubernetes cluster.
464446

465447
### Work with virtual networks
466448
