
Commit 19b6229 (1 parent 5d51750)

Update Azure CLI and Python code


articles/machine-learning/how-to-access-data-batch-endpoints-jobs.md

Lines changed: 56 additions & 43 deletions
@@ -28,7 +28,7 @@ This article describes how to specify parameter inputs for batch endpoints and c
 
 To successfully invoke a batch endpoint and create jobs, ensure you complete the following prerequisites:
 
-- A batch endpoint and deployment. If you don't have these resources, see [Deploy models for scoring in batch endpoints](how-to-use-batch-model-deployments.md) to create a deployment.
+- A batch endpoint and deployment. If you don't have these resources, see [Deploy MLflow models in batch deployments in Azure Machine Learning](how-to-mlflow-batch.md) to create a deployment.
 
 - Permissions to run a batch endpoint deployment. **AzureML Data Scientist**, **Contributor**, and **Owner** roles can be used to run a deployment. For custom role definitions, see [Authorization on batch endpoints](how-to-authenticate-batch-endpoint.md) to review the specific required permissions.
 
@@ -47,17 +47,20 @@ To successfully invoke a batch endpoint and create jobs, ensure you complete the
 Use the Azure Machine Learning SDK for Python to sign in:
 
 ```python
-from azure.ai.ml import MLClient
+from azure.ai.ml import MLClient, Input
 from azure.identity import DefaultAzureCredential
-
+from azure.ai.ml.constants import AssetTypes
+from azure.ai.ml.entities import Data
+
 ml_client = MLClient.from_config(DefaultAzureCredential())
 ```
 
 If your configuration runs outside an Azure Machine Learning compute, you need to specify the workspace where the endpoint is deployed:
 
 ```python
-from azure.ai.ml import MLClient
+from azure.ai.ml import MLClient, Input
 from azure.identity import DefaultAzureCredential
+from azure.ai.ml.constants import AssetTypes
 
 subscription_id = "<subscription>"
 resource_group = "<resource-group>"
@@ -102,13 +105,13 @@ az ml batch-endpoint invoke --name $ENDPOINT_NAME \
 
 # [Python](#tab/sdk)
 
-Use the `MLClient.batch_endpoints.invoke()` method to specify the name of the experiment:
+Use the `MLClient.batch_endpoints.invoke()` method to invoke a batch endpoint. In the following code, `endpoint` is an endpoint object.
 
 ```python
 job = ml_client.batch_endpoints.invoke(
     endpoint_name=endpoint.name,
     inputs={
-        "heart_dataset": Input("https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data")
+        "heart_dataset": Input(path="https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data")
     }
 )
 ```
@@ -159,14 +162,14 @@ az ml batch-endpoint invoke --name $ENDPOINT_NAME \
 
 # [Python](#tab/sdk)
 
-Use the parameter `deployment_name` to specify the name of the deployment:
+Use the parameter `deployment_name` to specify the name of the deployment. In the following code, `deployment` is a deployment object.
 
 ```python
 job = ml_client.batch_endpoints.invoke(
     endpoint_name=endpoint.name,
     deployment_name=deployment.name,
     inputs={
-        "heart_dataset": Input("https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data")
+        "heart_dataset": Input(path="https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data")
     }
 )
 ```
@@ -232,7 +235,7 @@ job = ml_client.batch_endpoints.invoke(
     endpoint_name=endpoint.name,
     experiment_name="my-batch-job-experiment",
     inputs={
-        "heart_dataset": Input("https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data"),
+        "heart_dataset": Input(path="https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data"),
     }
 )
 ```
@@ -350,7 +353,7 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
 name: heart-dataset-unlabeled
 description: An unlabeled dataset for heart classification.
 type: uri_folder
-path: heart-classifier-mlflow/data
+path: data
 ```
 
 Then, create the data asset:
@@ -398,7 +401,7 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
 # [Azure CLI](#tab/cli)
 
 ```azurecli
-DATASET_ID=$(az ml data show -n heart-dataset-unlabeled --label latest | jq -r .id)
+DATA_ASSET_ID=$(az ml data show -n heart-dataset-unlabeled --label latest | jq -r .id)
 ```
 
 # [Python](#tab/sdk)
@@ -423,24 +426,25 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
     }
 }
 ```
+---
 
-The data assets ID looks like `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/data/<data-asset>/versions/<version>`. You can also use the `azureml:<datasset_name>@latest` format to specify the input.
+The data asset ID has the format `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/data/<data-asset>/versions/<version>`. You can also use the `azureml:<dataset_name>@latest` format to specify the input.
 
 1. Run the endpoint:
 
 # [Azure CLI](#tab/cli)
 
-Use the `--set` argument to specify the input:
+Use the `--set` argument to specify the input. First replace any hyphens in the data asset name with underscores, because keys can contain only alphanumeric characters and underscores.
 
 ```azurecli
 az ml batch-endpoint invoke --name $ENDPOINT_NAME \
-    --set inputs.heart_dataset.type="uri_folder" inputs.heart_dataset.path=$DATASET_ID
+    --set inputs.heart_dataset_unlabeled.type="uri_folder" inputs.heart_dataset_unlabeled.path=$DATA_ASSET_ID
 ```
 
 For an endpoint that serves a model deployment, you can use the `--input` argument to specify the data input because a model deployment always requires only one data input.
 
 ```azurecli
-az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATASET_ID
+az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATA_ASSET_ID
 ```
 
 The argument `--set` tends to produce long commands when multiple inputs are specified. In such cases, place your inputs in a `YAML` file and use the `--file` argument to specify the inputs you need for your endpoint invocation.
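The data asset ID format described in this hunk can be assembled programmatically. The following is a minimal sketch with hypothetical subscription, resource group, and workspace names (not values from the article); it builds both the full resource ID and the `azureml:<name>@latest` short form:

```python
# Build an Azure ML data asset resource ID and its @latest short form.
# All identifier values below are hypothetical placeholders.
def data_asset_id(subscription: str, resource_group: str, workspace: str,
                  name: str, version: str) -> str:
    return (
        f"/subscriptions/{subscription}/resourcegroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices/workspaces/{workspace}"
        f"/data/{name}/versions/{version}"
    )

full_id = data_asset_id("0000-sub", "my-rg", "my-ws", "heart-dataset-unlabeled", "1")
short_form = "azureml:heart-dataset-unlabeled@latest"
```

The short form resolves to the latest registered version of the asset, so it is handy in docs and scripts that should not pin a version.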
@@ -449,7 +453,9 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
 
 ```yml
 inputs:
-  heart_dataset: azureml:/<datasset_name>@latest
+  heart_dataset_unlabeled:
+    type: uri_folder
+    path: /subscriptions/<subscription-ID>/resourceGroups/<resource-group-name>/providers/Microsoft.MachineLearningServices/workspaces/<workspace-name>/data/heart-dataset-unlabeled/versions/1
 ```
 
 Run the following command:
@@ -500,15 +506,15 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
 
 You can directly reference data from Azure Machine Learning registered data stores with batch deployments jobs. In this example, you first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store.
 
-1. Access the default data store in the Azure Machine Learning workspace. If your data is in a different store, you can use that store instead. You aren't required to use the default data store.
+1. Get access to the data store that you want to use. The examples in this section use the default blob data store in an Azure Machine Learning workspace, but you can also use data that's in a different store. In any Machine Learning workspace, the name of the default blob data store is **workspaceblobstore**. There's no need to update that name in the following command unless you want to use a different data store.
 
 # [Azure CLI](#tab/cli)
 
 ```azurecli
 DATASTORE_ID=$(az ml datastore show -n workspaceblobstore | jq -r '.id')
 ```
 
-The data stores ID looks like `/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.
+The data store ID has the format `/subscriptions/<subscription-ID>/resourceGroups/<resource-group-name>/providers/Microsoft.MachineLearningServices/workspaces/<workspace-name>/datastores/<data-store-name>`.
 
 # [Python](#tab/sdk)
 
@@ -521,18 +527,23 @@ You can directly reference data from Azure Machine Learning registered data stor
 Use the Azure Machine Learning CLI, Azure Machine Learning SDK for Python, or the studio to get the data store information.
 
 ---
-
-> [!TIP]
-> The default blob data store in a workspace is named __workspaceblobstore__. You can skip this step if you already know the resource ID of the default data store in your workspace.
-
-1. Upload some sample data to the data store.
 
-    This example assumes you already uploaded the sample data included in the repo in the folder `sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/data` in the folder `heart-disease-uci-unlabeled` in the Blob Storage account. Be sure to complete this step before you continue.
+1. Upload some sample data to the data store:
+    1. In the [azureml-examples](https://github.com/Azure/azureml-examples) repository, go to the [sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/data](https://github.com/Azure/azureml-examples/tree/main/sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/data) folder.
+    1. Download the files.
+    1. Use a tool like Azure Storage Explorer to connect to your Azure Blob Storage account.
+    1. Open the Blob Storage container that matches the name of your data store's blob container.
+    1. Upload the sample data files to a folder named `heart-disease-uci-unlabeled` in the container.
 
 1. Create the input or request:
 
 # [Azure CLI](#tab/cli)
 
+You can also use the format `azureml://datastores/<data-store>/paths/<data-path>` to specify the input:
+
+```azurecli
+az ml batch-endpoint invoke --name $ENDPOINT_NAME \
+    --set inputs.my_heart_blob_ds.type="uri_folder" inputs.my_heart_blob_ds.path=azureml://datastores/workspaceblobstore/paths/heart-disease-uci-unlabeled
+```
+
 Place the file path in the `INPUT_PATH` variable:
 
@@ -601,7 +612,7 @@ You can directly reference data from Azure Machine Learning registered data stor
 
 ```yml
 inputs:
-  heart_dataset:
+  heart_dataset_unlabeled:
     type: uri_folder
     path: azureml://datastores/<data-store>/paths/<data-path>
 ```
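The `azureml://datastores/<data-store>/paths/<data-path>` URI used in this hunk follows a fixed shape, so it can be composed from a data store name and a relative path. A minimal sketch (the helper function is hypothetical, not part of any Azure SDK):

```python
# Compose an azureml:// URI that points at a path inside a registered data store.
# workspaceblobstore is the default blob data store name in any workspace.
def datastore_uri(datastore: str, path: str) -> str:
    # Strip any leading slash so the path lands cleanly after /paths/.
    return f"azureml://datastores/{datastore}/paths/{path.lstrip('/')}"

uri = datastore_uri("workspaceblobstore", "heart-disease-uci-unlabeled")
```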
@@ -665,13 +676,13 @@ To learn more about extra required configuration for reading data from storage a
 Set the `INPUT_DATA` variable:
 
 ```azurecli
-INPUT_DATA = "https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data"
+INPUT_DATA="https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data"
 ```
 
 If your data is a file, set the variable with the following format:
 
 ```azurecli
-INPUT_DATA = "https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data/heart.csv"
+INPUT_DATA="https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data/heart.csv"
 ```
 
 # [Python](#tab/sdk)
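The folder and file URL variants in this hunk map to the `uri_folder` and `uri_file` input types. A rough heuristic sketch for telling the two apart (an assumption for illustration, not an Azure ML rule; Azure ML determines the type from the input declaration, not the URL):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def infer_input_type(url: str) -> str:
    """Heuristic: treat a URL whose last path segment has an extension as a file."""
    suffix = PurePosixPath(urlparse(url).path).suffix
    return "uri_file" if suffix else "uri_folder"

folder_type = infer_input_type(
    "https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data")
file_type = infer_input_type(
    "https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data/heart.csv")
```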
@@ -806,7 +817,9 @@ To learn more about extra required configuration for reading data from storage a
 
 ## Create jobs with literal inputs
 
-Pipeline component deployments can take literal inputs. The following example shows how to specify an input named `score_mode`, of type `string`, with a value of `append`:
+Pipeline component deployments can take literal inputs. For an example of a batch deployment that contains a basic pipeline, see [How to deploy pipelines with batch endpoints](how-to-use-batch-pipeline-deployments.md).
+
+The following example shows how to specify an input named `score_mode`, of type `string`, with a value of `append`:
 
 # [Azure CLI](#tab/cli)
 
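Literal inputs such as `score_mode` are passed as plain values rather than URIs, so the `--set` assignments for them are simple `inputs.<name>=<value>` strings. A sketch of composing those assignments (the helper is hypothetical, not an Azure CLI feature):

```python
# Compose the --set assignments for literal inputs to a batch endpoint invocation.
# Hypothetical helper for illustration only.
def set_arguments(inputs: dict) -> str:
    return " ".join(f"inputs.{name}={value}" for name, value in inputs.items())

flags = set_arguments({"score_mode": "append"})
```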
@@ -885,7 +898,7 @@ The following example shows how to change the location where an output named `sc
 DATASTORE_ID=$(az ml datastore show -n workspaceblobstore | jq -r '.id')
 ```
 
-The data stores ID looks like `/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.
+The data store ID has the format `/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.
 
 # [Python](#tab/sdk)
 
@@ -903,17 +916,19 @@ The following example shows how to change the location where an output named `sc
 
 # [Azure CLI](#tab/cli)
 
-Set the `OUTPUT_PATH` variable:
+Define the input and output values in a file. Use the data store ID in the output path. For completeness, also define the data input.
 
-```azurecli
-DATA_PATH="batch-jobs/my-unique-path"
-OUTPUT_PATH="$DATASTORE_ID/paths/$DATA_PATH"
-```
-
-For completeness, also create a data input:
-
-```azurecli
-INPUT_PATH="https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data"
+**inputs-and-outputs.yml**
+
+```yml
+inputs:
+  heart_dataset_unlabeled:
+    type: uri_folder
+    path: https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data
+outputs:
+  score:
+    type: uri_file
+    path: <data-store-ID>/paths/batch-jobs/my-unique-path
 ```
 
 # [Python](#tab/sdk)
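The inputs-and-outputs file in this hunk can be generated from the data store ID returned by `az ml datastore show`. A minimal sketch; the data store ID below is a hypothetical placeholder, not a real resource:

```python
# Compose the inputs-and-outputs.yml content from a data store ID.
# The ID is a hypothetical stand-in for the value that
# `az ml datastore show -n workspaceblobstore` returns.
datastore_id = (
    "/subscriptions/aaaa-bbbb/resourceGroups/my-rg"
    "/providers/Microsoft.MachineLearningServices/workspaces/my-ws"
    "/datastores/workspaceblobstore"
)
yaml_text = f"""\
inputs:
  heart_dataset_unlabeled:
    type: uri_folder
    path: https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data
outputs:
  score:
    type: uri_file
    path: {datastore_id}/paths/batch-jobs/my-unique-path
"""
```

Writing `yaml_text` to `inputs-and-outputs.yml` gives the file that the `--file` invocation below expects.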
@@ -963,12 +978,10 @@ The following example shows how to change the location where an output named `sc
 
 # [Azure CLI](#tab/cli)
 
-Use the `--set` argument to specify the input:
+Use the `--file` argument to specify the input and output values:
 
 ```azurecli
-az ml batch-endpoint invoke --name $ENDPOINT_NAME \
-    --set inputs.heart_dataset.path=$INPUT_PATH \
-    --set outputs.score.path=$OUTPUT_PATH
+az ml batch-endpoint invoke --name $ENDPOINT_NAME --file inputs-and-outputs.yml
 ```
 
 # [Python](#tab/sdk)
