Commit 9ec3e4f

Merge pull request #216735 from santiagxf/santiagxf/azureml-batch-patch
Update how-to-access-data-batch-endpoints-jobs.md
2 parents 5a51caa + 13190ea commit 9ec3e4f

1 file changed: +16 -7 lines changed

articles/machine-learning/batch-inference/how-to-access-data-batch-endpoints-jobs.md

Lines changed: 16 additions & 7 deletions
@@ -15,15 +15,15 @@ ms.custom: devplatv2
1515

1616
# Accessing data from batch endpoints jobs
1717

18-
Batch endpoints can be used to perform batch scoring on large amounts of data. Such data can be placed in different places. In this tutorial we'll cover the different places where batch endpoints can read data from to.
18+
Batch endpoints can be used to perform batch scoring on large amounts of data. Such data can be placed in different places. In this tutorial we'll cover the different places where batch endpoints can read data from and how to reference it.
1919

2020
## Prerequisites
2121

2222
* This example assumes that you have a model correctly deployed as a batch endpoint. In particular, we're using the *heart condition classifier* created in the tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md).
2323

2424
## Supported data inputs
2525

26-
Batch endpoints support reading files or folders that are located in different locations:
26+
Batch endpoints support reading files located in the following storage options:
2727

2828
* Azure Machine Learning Data Stores. The following stores are supported:
2929
* Azure Blob Storage
@@ -47,7 +47,7 @@ Batch endpoints support reading files or folders that are located in different l
4747

4848
## Reading data from data stores
4949

50-
We're going to first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store:
50+
Data from Azure Machine Learning registered data stores can be directly referenced by batch deployment jobs. In this example, we're going to first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store:
5151

5252
1. Let's get access to the default data store in the Azure Machine Learning workspace. If your data is in a different store, you can use that store instead. There's no requirement of using the default data store.
5353

@@ -136,10 +136,10 @@ We're going to first upload some data to the default data store in the Azure Mac
136136
137137
## Reading data from a data asset
138138
139-
Follow these steps to run a batch endpoint job using data stored in a registered data asset in Azure Machine Learning:
139+
Azure Machine Learning data assets (formerly known as datasets) are supported as inputs for jobs. Follow these steps to run a batch endpoint job using data stored in a registered data asset in Azure Machine Learning:
140140
141141
> [!WARNING]
142-
> Data assets of type Table (`MLTable`) isn't currently supported.
142+
> Data assets of type Table (`MLTable`) aren't currently supported.
143143
144144
1. Let's create the data asset first. This data asset consists of a folder with multiple CSV files that we want to process in parallel using batch endpoints. You can skip this step if your data is already registered as a data asset.
145145
@@ -243,7 +243,10 @@ Follow these steps to run a batch endpoint job using data stored in a registered
243243

244244
## Reading data from Azure Storage Accounts
245245

246-
Azure Machine Learning batch endpoints can read data from cloud locations in Azure Storage Accounts. Both public and private cloud locations are supported. Use the following steps to run a batch endpoint job using data stored in a storage account:
246+
Azure Machine Learning batch endpoints can read data from cloud locations in Azure Storage Accounts, both public and private. Use the following steps to run a batch endpoint job using data stored in a storage account:
247+
248+
> [!NOTE]
249+
> Check the section [Security considerations when reading data](#security-considerations-when-reading-data) to learn more about the additional configuration required to successfully read data from storage accounts.
247250
248251
1. Create a data input:
249252

@@ -335,7 +338,7 @@ Batch endpoints ensure that only authorized users are able to invoke batch deplo
335338
| Data store | Yes | Data store's credentials in the workspace | Credentials |
336339
| Data store | No | Identity of the job | Depends on type |
337340
| Data asset | Yes | Data store's credentials in the workspace | Credentials |
338-
| Data asset | No | Identity of the job + Managed identity of the compute cluster | Depends on store |
341+
| Data asset | No | Identity of the job | Depends on store |
339342
| Azure Blob Storage | Not applicable | Identity of the job + Managed identity of the compute cluster | RBAC |
340343
| Azure Data Lake Storage Gen1 | Not applicable | Identity of the job + Managed identity of the compute cluster | POSIX |
341344
| Azure Data Lake Storage Gen2 | Not applicable | Identity of the job + Managed identity of the compute cluster | POSIX and RBAC |
@@ -344,3 +347,9 @@ The managed identity of the compute cluster is used for mounting and configuring
344347

345348
> [!NOTE]
346349
> To assign an identity to the compute used by a batch deployment, follow the instructions at [Set up authentication between Azure ML and other services](../how-to-identity-based-service-authentication.md#compute-cluster). Configure the identity on the compute cluster associated with the deployment. Note that all the jobs running on such compute are affected by this change. However, different deployments (even under the same endpoint) can be configured to run on different clusters, so you can administer the permissions according to your requirements.
350+
351+
## Next steps
352+
353+
* [Troubleshooting batch endpoints](how-to-troubleshoot-batch-endpoints.md).
354+
* [Customize outputs in batch deployments](how-to-deploy-model-custom-output.md).
355+
* [Invoking batch endpoints from Azure Data Factory](how-to-use-batch-azure-data-factory.md).
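
The code steps of the article are elided by the diff hunks above, so the following is only a minimal sketch of how the three kinds of input covered here (data stores, data assets, and storage accounts) are commonly written as path strings. The helper names and example values are hypothetical, and the string formats are assumptions based on common Azure Machine Learning and Azure Storage conventions; they are not taken from this commit.

```python
# Hypothetical helpers: function names and example values are illustrative only.
# The string formats are assumed conventions, not content from this commit.


def datastore_path(datastore: str, relative_path: str) -> str:
    """Build a short-form URI for a path inside a registered data store."""
    return f"azureml://datastores/{datastore}/paths/{relative_path.lstrip('/')}"


def data_asset_ref(name: str, version: str) -> str:
    """Build a name:version reference to a registered data asset."""
    return f"azureml:{name}:{version}"


def blob_url(account: str, container: str, relative_path: str) -> str:
    """Build an HTTPS URL for a path in an Azure Blob Storage container."""
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{relative_path.lstrip('/')}"
    )


if __name__ == "__main__":
    # Example values below are made up for illustration.
    print(datastore_path("workspaceblobstore", "heart-data/"))
    print(data_asset_ref("heart-dataset", "1"))
    print(blob_url("mystorageaccount", "data", "heart-data/"))
```

Keeping the three formats in separate helpers mirrors the article's sections: the same batch endpoint job can take any of them as its input, but the credentials used to read the data differ, as the security table above describes.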
