articles/machine-learning/batch-inference/how-to-access-data-batch-endpoints-jobs.md (16 additions, 7 deletions)
@@ -15,15 +15,15 @@ ms.custom: devplatv2
# Accessing data from batch endpoint jobs
-Batch endpoints can be used to perform batch scoring on large amounts of data. Such data can be placed in different places. In this tutorial we'll cover the different places where batch endpoints can read data from to.
+Batch endpoints can be used to perform batch scoring on large amounts of data. Such data can be placed in different locations. In this tutorial, we'll cover the different locations batch endpoints can read data from and how to reference them.
## Prerequisites
* This example assumes that you have a model correctly deployed as a batch endpoint. In particular, we're using the *heart condition classifier* created in the tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md).
## Supported data inputs
-Batch endpoints support reading files or folders that are located in different locations:
+Batch endpoints support reading files located in the following storage options:
* Azure Machine Learning Data Stores. The following stores are supported:
  * Azure Blob Storage
@@ -47,7 +47,7 @@ Batch endpoints support reading files or folders that are located in different l
## Reading data from data stores
-We're going to first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store:
+Data from Azure Machine Learning registered data stores can be directly referenced by batch deployment jobs. In this example, we first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store:
1. Let's get access to the default data store in the Azure Machine Learning workspace. If your data is in a different store, you can use that store instead. There's no requirement to use the default data store. A sketch of this step is shown below.
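
   A minimal sketch of this step, assuming the Azure Machine Learning Python SDK v2 (`azure-ai-ml`); the subscription, resource group, and workspace values are placeholder assumptions:

   ```python
   from azure.ai.ml import MLClient
   from azure.identity import DefaultAzureCredential

   # Connect to the workspace (subscription, resource group, and workspace
   # names are placeholder assumptions).
   ml_client = MLClient(
       DefaultAzureCredential(),
       subscription_id="<SUBSCRIPTION_ID>",
       resource_group_name="<RESOURCE_GROUP>",
       workspace_name="<WORKSPACE_NAME>",
   )

   # Retrieve the workspace's default data store; any registered data
   # store can be used instead.
   default_datastore = ml_client.datastores.get_default()
   print(default_datastore.name)
   ```

   Files in a data store are later referenced with URIs of the form `azureml://datastores/<data_store_name>/paths/<data_path>`.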
@@ -136,10 +136,10 @@ We're going to first upload some data to the default data store in the Azure Mac
## Reading data from a data asset
-Follow these steps to run a batch endpoint job using data stored in a registered data asset in Azure Machine Learning:
+Azure Machine Learning data assets (formerly known as datasets) are supported as inputs for jobs. Follow these steps to run a batch endpoint job using data stored in a registered data asset in Azure Machine Learning:
> [!WARNING]
-> Data assets of type Table (`MLTable`) isn't currently supported.
+> Data assets of type Table (`MLTable`) aren't currently supported.
1. Let's create the data asset first. This data asset consists of a folder with multiple CSV files that we want to process in parallel using batch endpoints. You can skip this step if your data is already registered as a data asset. A sketch of this step follows.
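
   A minimal sketch of registering such a folder as a `uri_folder` data asset, assuming the Python SDK v2; the asset name, local path, and description are illustrative assumptions, and `ml_client` is the workspace client from the earlier sketch:

   ```python
   from azure.ai.ml.entities import Data
   from azure.ai.ml.constants import AssetTypes

   # Register a folder of CSV files as a data asset (name, path, and
   # description are illustrative assumptions).
   heart_dataset = Data(
       name="heart-dataset-unlabeled",
       path="data/",
       type=AssetTypes.URI_FOLDER,
       description="An unlabeled folder of CSV files for batch scoring.",
   )
   ml_client.data.create_or_update(heart_dataset)
   ```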
@@ -243,7 +243,10 @@ Follow these steps to run a batch endpoint job using data stored in a registered
## Reading data from Azure Storage Accounts
-Azure Machine Learning batch endpoints can read data from cloud locations in Azure Storage Accounts. Both public and private cloud locations are supported. Use the following steps to run a batch endpoint job using data stored in a storage account:
+Azure Machine Learning batch endpoints can read data from cloud locations in Azure Storage Accounts, both public and private. Use the following steps to run a batch endpoint job using data stored in a storage account:
+
+> [!NOTE]
+> Check the section [Security considerations when reading data](#security-considerations-when-reading-data) to learn more about the additional configuration required to successfully read data from storage accounts.
1. Create a data input:
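
   A minimal sketch of this step, assuming the Python SDK v2; the storage account URL and endpoint name are placeholders, and passing a single unnamed `input` to `invoke` reflects the batch-endpoint calling pattern of this SDK generation (verify against your installed version):

   ```python
   from azure.ai.ml import Input
   from azure.ai.ml.constants import AssetTypes

   # Point the job at a folder in a storage account (URL is a placeholder).
   heart_dataset_input = Input(
       type=AssetTypes.URI_FOLDER,
       path="https://<ACCOUNT_NAME>.blob.core.windows.net/<CONTAINER>/<PATH>",
   )

   # Invoke the batch endpoint with this input (endpoint name is a placeholder).
   job = ml_client.batch_endpoints.invoke(
       endpoint_name="<ENDPOINT_NAME>",
       input=heart_dataset_input,
   )
   ```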
@@ -335,7 +338,7 @@ Batch endpoints ensure that only authorized users are able to invoke batch deplo
| Data store | Yes | Data store's credentials in the workspace | Credentials |
| Data store | No | Identity of the job | Depends on type |
| Data asset | Yes | Data store's credentials in the workspace | Credentials |
-| Data asset | No | Identity of the job + Managed identity of the compute cluster| Depends on store |
+| Data asset | No | Identity of the job | Depends on store |
| Azure Blob Storage | Not applicable | Identity of the job + Managed identity of the compute cluster | RBAC |
| Azure Data Lake Storage Gen1 | Not applicable | Identity of the job + Managed identity of the compute cluster | POSIX |
| Azure Data Lake Storage Gen2 | Not applicable | Identity of the job + Managed identity of the compute cluster | POSIX and RBAC |
@@ -344,3 +347,9 @@ The managed identity of the compute cluster is used for mounting and configuring
> [!NOTE]
> To assign an identity to the compute used by a batch deployment, follow the instructions at [Set up authentication between Azure ML and other services](../how-to-identity-based-service-authentication.md#compute-cluster). Configure the identity on the compute cluster associated with the deployment. Notice that all the jobs running on such compute are affected by this change. However, different deployments (even under the same endpoint) can be configured to run on different clusters, so you can administer the permissions accordingly depending on your requirements.
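
A minimal sketch of configuring a system-assigned managed identity on a compute cluster, assuming the Python SDK v2 and placeholder names; the linked article remains the authoritative reference, and the identity type string can vary across SDK versions:

```python
from azure.ai.ml.entities import AmlCompute, IdentityConfiguration

# Cluster with a system-assigned managed identity (name, VM size, and
# identity type string are assumptions; check your SDK version).
cluster = AmlCompute(
    name="batch-cluster",
    size="STANDARD_DS3_v2",
    min_instances=0,
    max_instances=2,
    identity=IdentityConfiguration(type="system_assigned"),
)
ml_client.compute.begin_create_or_update(cluster)
```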