articles/machine-learning/how-to-access-data-batch-endpoints-jobs.md (56 additions, 43 deletions)
@@ -28,7 +28,7 @@ This article describes how to specify parameter inputs for batch endpoints and c
To successfully invoke a batch endpoint and create jobs, ensure you complete the following prerequisites:
- - A batch endpoint and deployment. If you don't have these resources, see [Deploy models for scoring in batch endpoints](how-to-use-batch-model-deployments.md) to create a deployment.
+ - A batch endpoint and deployment. If you don't have these resources, see [Deploy MLflow models in batch deployments in Azure Machine Learning](how-to-mlflow-batch.md) to create a deployment.
- Permissions to run a batch endpoint deployment. **AzureML Data Scientist**, **Contributor**, and **Owner** roles can be used to run a deployment. For custom role definitions, see [Authorization on batch endpoints](how-to-authenticate-batch-endpoint.md) to review the specific required permissions.
@@ -47,17 +47,20 @@ To successfully invoke a batch endpoint and create jobs, ensure you complete the
Use the Azure Machine Learning SDK for Python to sign in:
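The sign-in code itself is elided from this hunk. A minimal sketch with the `azure-ai-ml` package, where the subscription, resource group, and workspace values are placeholders:

```python
# Minimal sign-in sketch; replace the placeholder workspace details.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-ID>",
    resource_group_name="<resource-group-name>",
    workspace_name="<workspace-name>",
)
```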
@@ -350,7 +353,7 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
name: heart-dataset-unlabeled
description: An unlabeled dataset for heart classification.
type: uri_folder
- path: heart-classifier-mlflow/data
+ path: data
```
Then, create the data asset:
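The creation command falls outside this hunk. A sketch of the equivalent Python SDK call, reusing the values from the YAML above (the local `data` folder is assumed to exist):

```python
# Register the data asset described by the YAML definition above.
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

heart_dataset_unlabeled = Data(
    name="heart-dataset-unlabeled",
    description="An unlabeled dataset for heart classification.",
    type=AssetTypes.URI_FOLDER,
    path="data",  # local folder uploaded at registration time
)
ml_client.data.create_or_update(heart_dataset_unlabeled)
```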
@@ -398,7 +401,7 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
# [Azure CLI](#tab/cli)
```azurecli
- DATASET_ID=$(az ml data show -n heart-dataset-unlabeled --label latest | jq -r .id)
+ DATA_ASSET_ID=$(az ml data show -n heart-dataset-unlabeled --label latest | jq -r .id)
```
# [Python](#tab/sdk)
@@ -423,24 +426,25 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
    }
}
```

+ ---

- The data assets ID looks like `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/data/<data-asset>/versions/<version>`. You can also use the `azureml:<datasset_name>@latest` format to specify the input.
+ The data asset ID has the format `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/data/<data-asset>/versions/<version>`. You can also use the `azureml:<dataset_name>@latest` format to specify the input.
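With the Python SDK, a sketch for retrieving that ID, assuming the `ml_client` created at sign-in:

```python
# Fetch the latest version of the data asset and read its full resource ID.
data_asset = ml_client.data.get(name="heart-dataset-unlabeled", label="latest")
print(data_asset.id)
```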
1. Run the endpoint:
# [Azure CLI](#tab/cli)
- Use the `--set` argument to specify the input:
+ Use the `--set` argument to specify the input. First replace any hyphens in the data asset name with underscore characters. Keys can contain only alphanumeric characters and underscore characters.

```azurecli
# The input key name below is assumed: the data asset name with hyphens
# replaced by underscores, per the note above.
az ml batch-endpoint invoke --name $ENDPOINT_NAME \
    --set inputs.heart_dataset_unlabeled.type="uri_folder" inputs.heart_dataset_unlabeled.path=$DATA_ASSET_ID
```

For an endpoint that serves a model deployment, you can use the `--input` argument to specify the data input because a model deployment always requires only one data input.
```azurecli
- az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATASET_ID
+ az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATA_ASSET_ID
```
The argument `--set` tends to produce long commands when multiple inputs are specified. In such cases, place your inputs in a `YAML` file and use the `--file` argument to specify the inputs you need for your endpoint invocation.
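A sketch of the Python SDK equivalent, which passes multiple inputs as a dictionary instead of chained `--set` arguments; the input names here are illustrative, not taken from the article:

```python
# Pass several inputs in one call; keys must match the deployment's input names.
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

job = ml_client.batch_endpoints.invoke(
    endpoint_name="<endpoint-name>",
    inputs={
        "heart_dataset": Input(type=AssetTypes.URI_FOLDER, path=data_asset.id),
        "score_mode": Input(type="string", default="append"),
    },
)
```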
@@ -449,7 +453,9 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
@@ -500,15 +506,15 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
You can directly reference data from Azure Machine Learning registered data stores with batch deployment jobs. In this example, you first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store.
- 1. Access the default data store in the Azure Machine Learning workspace. If your data is in a different store, you can use that store instead. You aren't required to use the default data store.
+ 1. Get access to the data store that you want to use. The examples in this section use the default blob data store in an Azure Machine Learning workspace, but you can also use data that's in a different store. In any Machine Learning workspace, the name of the default blob data store is **workspaceblobstore**. There's no need to change that name in the following command unless you want to use a different data store.
# [Azure CLI](#tab/cli)
```azurecli
DATASTORE_ID=$(az ml datastore show -n workspaceblobstore | jq -r '.id')
```
- The data stores ID looks like `/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.
+ The data store ID has the format `/subscriptions/<subscription-ID>/resourceGroups/<resource-group-name>/providers/Microsoft.MachineLearningServices/workspaces/<workspace-name>/datastores/<data-store-name>`.
# [Python](#tab/sdk)
@@ -521,18 +527,23 @@ You can directly reference data from Azure Machine Learning registered data stor
Use the Azure Machine Learning CLI, Azure Machine Learning SDK for Python, or the studio to get the data store information.
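For the Python SDK route, a minimal sketch, assuming the `ml_client` created at sign-in:

```python
# Look up the registered data store and read its ARM-style resource ID.
data_store = ml_client.datastores.get("workspaceblobstore")
print(data_store.id)
```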
---
- > [!TIP]
- > The default blob data store in a workspace is named __workspaceblobstore__. You can skip this step if you already know the resource ID of the default data store in your workspace.

- 1. Upload some sample data to the data store.

-    This example assumes you already uploaded the sample data included in the repo in the folder `sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/data` in the folder `heart-disease-uci-unlabeled` in the Blob Storage account. Be sure to complete this step before you continue.
+ 1. Upload some sample data to the data store (a scripted alternative is sketched after these steps):
+    1. In the [azureml-examples](https://github.com/Azure/azureml-examples) repository, go to the [sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/data](https://github.com/Azure/azureml-examples/tree/main/sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/data) folder.
+    1. Download the files.
+    1. Use a tool like Azure Storage Explorer to connect to your Azure Blob Storage account.
+    1. Open the Blob Storage container that matches the name of your data store's blob container.
+    1. Upload the sample data files to a folder named `heart-disease-uci-unlabeled` in the container.
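As a scripted alternative to Storage Explorer, a sketch using the `azure-storage-blob` package; the account URL, container name, and local `data` folder are placeholders and assumptions, not values from this article:

```python
# Upload the sample files to a heart-disease-uci-unlabeled folder in the
# blob container that backs the data store. Replace the placeholders.
from pathlib import Path

from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

container = ContainerClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    container_name="<datastore-container>",
    credential=DefaultAzureCredential(),
)
for file in Path("data").glob("*"):
    if file.is_file():
        with open(file, "rb") as stream:
            container.upload_blob(
                name=f"heart-disease-uci-unlabeled/{file.name}", data=stream
            )
```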
1. Create the input or request:
# [Azure CLI](#tab/cli)
+ You can also use the format `azureml://datastores/<data-store>/paths/<data-path>` to specify the input:
+
+ ```azurecli
+ az ml batch-endpoint invoke --name $ENDPOINT_NAME \
+     --set inputs.my_heart_blob_ds.type="uri_folder" inputs.my_heart_blob_ds.path=azureml://datastores/workspaceblobstore/paths/heart-disease-uci-unlabeled
+ ```
Place the file path in the `INPUT_PATH` variable:

```azurecli
# Assumed value: the data store folder created in the upload step above.
INPUT_PATH="azureml://datastores/workspaceblobstore/paths/heart-disease-uci-unlabeled"
```
@@ -601,7 +612,7 @@ You can directly reference data from Azure Machine Learning registered data stor
@@ -806,7 +817,9 @@ To learn more about extra required configuration for reading data from storage a
## Create jobs with literal inputs
- Pipeline component deployments can take literal inputs. The following example shows how to specify an input named `score_mode`, of type `string`, with a value of `append`:
+ Pipeline component deployments can take literal inputs. For an example of a batch deployment that contains a basic pipeline, see [How to deploy pipelines with batch endpoints](how-to-use-batch-pipeline-deployments.md).
+
+ The following example shows how to specify an input named `score_mode`, of type `string`, with a value of `append`:
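The tabbed code for this example is elided from the hunk; a Python SDK sketch of the same literal input, with the endpoint name as a placeholder:

```python
# Pass score_mode as a literal string input rather than a data input.
from azure.ai.ml import Input

job = ml_client.batch_endpoints.invoke(
    endpoint_name="<endpoint-name>",
    inputs={"score_mode": Input(type="string", default="append")},
)
```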
# [Azure CLI](#tab/cli)
@@ -885,7 +898,7 @@ The following example shows how to change the location where an output named `sc
DATASTORE_ID=$(az ml datastore show -n workspaceblobstore | jq -r '.id')
```
- The data stores ID looks like `/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.
+ The data store ID has the format `/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.
# [Python](#tab/sdk)
@@ -903,17 +916,19 @@ The following example shows how to change the location where an output named `sc
# [Azure CLI](#tab/cli)
- Set the `OUTPUT_PATH` variable:
+ Define the input and output values in a file. Use the data store ID in the output path. For completeness, also define the data input.
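A Python SDK sketch of the same idea; the `outputs` argument to `invoke` is an assumption based on recent `azure-ai-ml` versions, and the input, output, and path names follow this article's examples:

```python
# Route the "score" output to a chosen path on the workspace blob data store.
from azure.ai.ml import Input, Output
from azure.ai.ml.constants import AssetTypes

job = ml_client.batch_endpoints.invoke(
    endpoint_name="<endpoint-name>",
    inputs={
        "heart_dataset": Input(
            type=AssetTypes.URI_FOLDER,
            path="azureml://datastores/workspaceblobstore/paths/heart-disease-uci-unlabeled",
        )
    },
    outputs={  # assumption: invoke accepts an outputs mapping
        "score": Output(
            type=AssetTypes.URI_FILE,
            path="azureml://datastores/workspaceblobstore/paths/batch-jobs/predictions.csv",
        )
    },
)
```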