Commit 9bf9a25

Merge pull request #217104 from santiagxf/santiagxf/azureml-batch-aml-format
Santiagxf/azureml batch aml format
2 parents 9f73444 + 8b982c6 commit 9bf9a25

File tree

1 file changed (+27 −18)

articles/machine-learning/batch-inference/how-to-access-data-batch-endpoints-jobs.md

Lines changed: 27 additions & 18 deletions

````diff
@@ -23,7 +23,7 @@ Batch endpoints can be used to perform batch scoring on large amounts of data. S

 ## Supported data inputs

-Batch endpoints support reading files located in tje following storage options:
+Batch endpoints support reading files located in the following storage options:

 * Azure Machine Learning Data Stores. The following stores are supported:
   * Azure Blob Storage
````

````diff
@@ -54,8 +54,11 @@ Data from Azure Machine Learning registered data stores can be directly referenc
 # [Azure ML CLI](#tab/cli)

 ```azurecli
-az ml workspace show --query storage_account
+DATASTORE_ID=$(az ml datastore show -n workspaceblobstore | jq -r '.id')
 ```
+
+> [!NOTE]
+> The data store ID will look like `azureml:/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>`.

 # [Azure ML SDK for Python](#tab/sdk)

````

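The `jq` dependency in the snippet above isn't strictly required. As a minimal alternative sketch, the Azure CLI's built-in JMESPath query can extract the same `id` field:

```bash
# Same result without jq: query the id field directly and emit it as plain text.
DATASTORE_ID=$(az ml datastore show -n workspaceblobstore --query id -o tsv)
echo $DATASTORE_ID
```
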
````diff
@@ -66,21 +69,25 @@ Data from Azure Machine Learning registered data stores can be directly referenc
 # [REST](#tab/rest)

 Use the Azure ML CLI, Azure ML SDK for Python, or Studio to get the data store information.
+
+---
+
+> [!TIP]
+> The default blob data store in a workspace is called __workspaceblobstore__. You can skip this step if you already know the resource ID of the default data store in your workspace.

-1. We'll need to upload some sample data to it. This example assumes you've uploaded the sample data included in the repo in the folder `sdk/python/endpoints/batch/heart-classifier/data` in the folder `heart-classifier/data` in the blob storage account.
+1. We'll need to upload some sample data to it. This example assumes you've uploaded the sample data included in the repo under `sdk/python/endpoints/batch/heart-classifier/data` to the folder `heart-classifier/data` in the blob storage account. Ensure you have done that before moving forward.

 1. Create a data input:

 # [Azure ML CLI](#tab/cli)
+
+Let's place the file path in the following variable:

 ```azurecli
 DATA_PATH="heart-disease-uci-unlabeled"
-DATASTORE_ID=$(az ml workspace show | jq -r '.storage_account')
+INPUT_PATH="$DATASTORE_ID/paths/$DATA_PATH"
 ```

-> [!TIP]
-> You can skip this step if you already know the name of the data store you want to use. Here it is used only to know the name of the default data store of the workspace.
-
 # [Azure ML SDK for Python](#tab/sdk)

 ```python
````

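The upload step in this hunk is assumed rather than shown. A rough sketch of one way to do it with `az storage blob upload-batch`, reading the storage account and container that back __workspaceblobstore__; the destination path is an assumption and should match whatever `DATA_PATH` later points to:

```bash
# Read the account and container backing the default data store.
ACCOUNT_NAME=$(az ml datastore show -n workspaceblobstore --query account_name -o tsv)
CONTAINER_NAME=$(az ml datastore show -n workspaceblobstore --query container_name -o tsv)

# Upload the repo's sample folder; the destination path here is illustrative.
az storage blob upload-batch \
    --account-name $ACCOUNT_NAME \
    --destination $CONTAINER_NAME \
    --destination-path heart-classifier/data \
    --source sdk/python/endpoints/batch/heart-classifier/data \
    --auth-mode login
```
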
````diff
@@ -90,23 +97,23 @@ Data from Azure Machine Learning registered data stores can be directly referenc

 # [REST](#tab/rest)

-Use the Azure ML CLI, Azure ML SDK for Python, or Studio to get the data store information.
+Use the Azure ML CLI, Azure ML SDK for Python, or Studio to get the subscription ID, resource group, workspace, and name of the data store. You will need them later.
+
 ---
-
+
 > [!NOTE]
-> Data stores ID would look like `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/microsoft.storage/storageaccounts/<storage-account-name>`.
+> Notice how the segment `paths` is appended to the resource ID of the data store to indicate that what follows is a path inside of it.

+> [!TIP]
+> You can also use `azureml:/datastores/<data-store>/paths/<data-path>` as a way to indicate the input.

 1. Run the deployment:

 # [Azure ML CLI](#tab/cli)

 ```bash
-INVOKE_RESPONSE = $(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATASTORE_ID/paths/$DATA_PATH)
+INVOKE_RESPONSE=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $INPUT_PATH)
 ```
-
-> [!TIP]
-> You can also use `--input azureml:/datastores/<data_store_name>/paths/<data_path>` as a way to indicate the input.

 # [Azure ML SDK for Python](#tab/sdk)

````

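Putting the CLI pieces from the preceding hunks together, a hedged end-to-end sketch; it assumes the endpoint in `$ENDPOINT_NAME` already exists and the sample data sits under the path used for `DATA_PATH`:

```bash
# Resolve the default data store, compose the input path, and invoke the endpoint.
DATASTORE_ID=$(az ml datastore show -n workspaceblobstore | jq -r '.id')
DATA_PATH="heart-disease-uci-unlabeled"
INPUT_PATH="$DATASTORE_ID/paths/$DATA_PATH"
INVOKE_RESPONSE=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $INPUT_PATH)

# The shorthand from the tip works too:
# az ml batch-endpoint invoke --name $ENDPOINT_NAME --input azureml:/datastores/workspaceblobstore/paths/$DATA_PATH
```
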
````diff
@@ -127,7 +134,7 @@ Data from Azure Machine Learning registered data stores can be directly referenc
 "InputData": {
     "mnistinput": {
         "JobInputType" : "UriFolder",
-        "Uri": "azureml://subscriptions/<subscription>/resourcegroups/<resource-group>/providers/microsoft.storage/storageaccounts/<storage-account-name>/paths/<data_path>"
+        "Uri": "azureml:/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/datastores/<data-store>/paths/<data-path>"
     }
 }
 }
````

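For the REST path, a body like the one above has to be POSTed to the endpoint's scoring URI with a bearer token. A hedged sketch only; the way the scoring URI is retrieved and the exact request shape are assumptions to verify against the batch endpoints REST reference, and the placeholders in `body.json` must be replaced first:

```bash
# Assumes the request body above has been saved to body.json with real values filled in.
SCORING_URI=$(az ml batch-endpoint show --name $ENDPOINT_NAME --query scoring_uri -o tsv)
TOKEN=$(az account get-access-token --resource https://ml.azure.com --query accessToken -o tsv)

curl --request POST "$SCORING_URI" \
    --header "Authorization: Bearer $TOKEN" \
    --header "Content-Type: application/json" \
    --data @body.json
```
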
````diff
@@ -136,7 +143,7 @@ Data from Azure Machine Learning registered data stores can be directly referenc

 ## Reading data from a data asset

-Azure Machine Learning data assets (formaly known as datasets) are supported as inputs for jobs. Follow these steps to run a batch endpoint job using data stored in a registered data asset in Azure Machine Learning:
+Azure Machine Learning data assets (formerly known as datasets) are supported as inputs for jobs. Follow these steps to run a batch endpoint job using data stored in a registered data asset in Azure Machine Learning:

 > [!WARNING]
 > Data assets of type Table (`MLTable`) aren't currently supported.
````

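The steps below assume a registered data asset named `heart-dataset-unlabeled` already exists. As a rough sketch, one way to register the sample folder as a `uri_folder` asset; the name and local path are assumptions taken from the examples above:

```bash
# Register the local sample folder as a uri_folder data asset.
az ml data create \
    --name heart-dataset-unlabeled \
    --type uri_folder \
    --path sdk/python/endpoints/batch/heart-classifier/data
```
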
````diff
@@ -185,8 +192,10 @@ Azure Machine Learning data assets (formaly known as datasets) are supported as
 1. Create a data input:

 # [Azure ML CLI](#tab/cli)
-
+
+```azurecli
 DATASET_ID=$(az ml data show -n heart-dataset-unlabeled --label latest --query id)
+```

 # [Azure ML SDK for Python](#tab/sdk)

````

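The invocation step for the data-asset case falls outside these hunks. A hedged sketch of how the retrieved ID could be used; note that `--query id` alone returns a quoted string, so `-o tsv` is used here to strip the quotes (an assumption about your shell handling, not part of the original snippet):

```bash
# Retrieve the asset ID as plain text and pass it as the batch job input.
DATASET_ID=$(az ml data show -n heart-dataset-unlabeled --label latest --query id -o tsv)
INVOKE_RESPONSE=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input $DATASET_ID)
```
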
````diff
@@ -201,7 +210,7 @@ Azure Machine Learning data assets (formaly known as datasets) are supported as
 ---

 > [!NOTE]
-> Data stores ID would look like `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/microsoft.storage/storageaccounts/<storage-account-name>`.
+> The data asset ID will look like `/subscriptions/<subscription>/resourcegroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>/data/<data-asset>/versions/<version>`.


 1. Run the deployment:
````
