articles/machine-learning/how-to-access-data-batch-endpoints-jobs.md (+24 −9)
@@ -3,7 +3,7 @@ title: Create jobs and input data for batch endpoints
 titleSuffix: Azure Machine Learning
 description: Learn how to access data from different sources in batch endpoints jobs for Azure Machine Learning deployments by using the Azure CLI, the Python SDK, or REST API calls.
 services: machine-learning
-ms.service: machine-learning
+ms.service: azure-machine-learning
 ms.subservice: inferencing
 ms.topic: how-to
 author: msakande
@@ -84,7 +84,7 @@ To successfully invoke a batch endpoint and create jobs, ensure you complete the
 - The **compute cluster** where the endpoint is deployed has access to read the input data.
 
 > [!TIP]
-> If you use a credential-less data store or external Azure Storage Account as data input, ensure you [configure compute clusters for data access](how-to-authenticate-batch-endpoint.md#configure-compute-clusters-for-data-access). **The managed identity of the compute cluster** is used **for mounting** the storage account. The identity of the job (invoker) is still used to read the underlying data, which allows you to achieve granular access control.
+> If you use a credential-less data store or external Azure Storage Account as data input, ensure you [configure compute clusters for data access](how-to-authenticate-batch-endpoint.md#configure-compute-clusters-for-data-access). The managed identity of the compute cluster is used for mounting the storage account. The identity of the job (invoker) is still used to read the underlying data, which allows you to achieve granular access control.
 You can configure some of the properties in the created job at invocation time.
 
 > [!NOTE]
-> The ability to configure job properties is available only in batch endpoints with Pipeline component deployments by the moment.
+> The ability to configure job properties is currently available only in batch endpoints with Pipeline component deployments.
 
 #### Configure experiment name
@@ -266,8 +268,11 @@ Host: <ENDPOINT_URI>
 Authorization: Bearer <TOKEN>
 Content-Type: application/json
 ```
+
 ---
 
+<a name="understanding-inputs-and-outputs"></a>
+
 ## Understand inputs and outputs
 
 Batch endpoints provide a durable API that consumers can use to create batch jobs. The same interface can be used to specify the inputs and outputs your deployment expects. Use inputs to pass any information your endpoint needs to perform the job.
@@ -291,15 +296,17 @@ The following table summarizes the inputs and outputs for batch deployments:
 > [!TIP]
 > Inputs and outputs are always named. The names serve as keys to identify the data and pass the actual value during invocation. Because model deployments always require one input and output, the name is ignored during invocation. You can assign the name that best describes your use case, such as "sales_estimation."
 
+<a name="data-inputs"></a>
+
 ### Explore data inputs
 
 Data inputs refer to inputs that point to a location where data is placed. Because batch endpoints usually consume large amounts of data, you can't pass the input data as part of the invocation request. Instead, you specify the location where the batch endpoint should go to look for the data. Input data is mounted and streamed on the target compute to improve performance.
 
 Batch endpoints support reading files located in the following storage options:
 
-- [Azure Machine Learning Data Assets](#use-input-data-from-data-asset), including Folder (`uri_folder`) and File (`uri_file`).
-- [Azure Machine Learning Data Stores](#use-input-data-from-data-stores), including Azure Blob Storage, Azure Data Lake Storage Gen1, and Azure Data Lake Storage Gen2.
-- [Azure Storage Accounts](#input-data-from-azure-storage-accounts), including Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, and Azure Blob Storage.
+- [Azure Machine Learning data assets](#use-input-data-from-data-asset), including Folder (`uri_folder`) and File (`uri_file`).
+- [Azure Machine Learning data stores](#use-input-data-from-data-stores), including Azure Blob Storage, Azure Data Lake Storage Gen1, and Azure Data Lake Storage Gen2.
+- [Azure Storage Accounts](#use-input-data-from-azure-storage-accounts), including Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, and Azure Blob Storage.
 - Local data folders/files (Azure Machine Learning CLI or Azure Machine Learning SDK for Python). However, this operation results in the local data being uploaded to the default Azure Machine Learning data store of the workspace you're working on.
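The supported input locations above can be told apart by their path scheme. The following minimal Python sketch of that mapping is illustrative only (the heuristic and function name are not part of the Azure Machine Learning SDK):

```python
def classify_input_path(path: str) -> str:
    """Illustrative heuristic: map an input path to the storage option it uses."""
    if path.startswith("azureml://datastores/"):
        return "data store"       # azureml://datastores/<name>/paths/<path>
    if path.startswith("azureml:"):
        return "data asset"       # azureml:<name>:<version> or azureml:<name>@latest
    if path.startswith(("https://", "wasbs://", "abfss://")):
        return "storage account"  # public or private cloud URL
    # Local folders/files are uploaded to the workspace's default data store.
    return "local"

print(classify_input_path("azureml:heart-dataset@latest"))  # data asset
```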
 
 > [!IMPORTANT]
@@ -325,7 +332,7 @@ Data outputs refer to the location where the results of a batch job should be pl
 
 ## Create jobs with data inputs
 
-The following examples show how to create jobs, taking data inputs from [data assets](#use-input-data-from-data-asset), [data stores](#use-input-data-from-data-stores), and [Azure Storage Accounts](#input-data-from-azure-storage-accounts).
+The following examples show how to create jobs, taking data inputs from [data assets](#use-input-data-from-data-asset), [data stores](#use-input-data-from-data-stores), and [Azure Storage Accounts](#use-input-data-from-azure-storage-accounts).
 
 ### Use input data from data asset
@@ -388,6 +395,8 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
 
 Use the Azure Machine Learning CLI, Azure Machine Learning SDK for Python, or the Azure Machine Learning studio to get the location (region), workspace, and data asset name and version. You need these items for later procedures.
 
+---
+
 1. Create the input or request:
 
 # [Azure CLI](#tab/cli)
@@ -491,6 +500,8 @@ Azure Machine Learning data assets (formerly known as datasets) are supported as
 Content-Type: application/json
 ```
 
+---
+
 ### Use input data from data stores
 
 You can directly reference data from Azure Machine Learning registered data stores with batch deployment jobs. In this example, you first upload some data to the default data store in the Azure Machine Learning workspace and then run a batch deployment on it. Follow these steps to run a batch endpoint job using data stored in a data store.
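Data store inputs use the `azureml://` path scheme. As a rough sketch (the data store and folder names here are hypothetical), such a path can be assembled like this:

```python
def datastore_uri(datastore_name: str, path_on_store: str) -> str:
    """Build an azureml:// URI that points at a path inside a registered data store."""
    return f"azureml://datastores/{datastore_name}/paths/{path_on_store.lstrip('/')}"

# Hypothetical example: a folder uploaded to the workspace's default blob data store.
print(datastore_uri("workspaceblobstore", "heart-disease-uci/data"))
# azureml://datastores/workspaceblobstore/paths/heart-disease-uci/data
```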
@@ -647,11 +658,11 @@ You can directly reference data from Azure Machine Learning registered data stor
 
 ---
 
-### Input data from Azure Storage Accounts
+### Use input data from Azure Storage Accounts
 
 Azure Machine Learning batch endpoints can read data from cloud locations in Azure Storage Accounts, both public and private. Use the following steps to run a batch endpoint job with data stored in a storage account.
 
-To learn more about extra required configuration for reading data from storage accounts, see [Configure compute clusters for data access](how-to-authenticate-batch-endpoint.md#configure-compute-clusters-for-data-access).
+To learn more about extra required configuration for reading data from storage accounts, see [Configure compute clusters for data access](how-to-authenticate-batch-endpoint.md#configure-compute-clusters-for-data-access).
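The storage-account paths used in the following steps are plain cloud URLs. A small sketch of the two common forms (the account, container, and path names are hypothetical):

```python
def blob_url(account: str, container: str, path: str) -> str:
    """Azure Blob Storage URL for a folder or file."""
    return f"https://{account}.blob.core.windows.net/{container}/{path.lstrip('/')}"

def adls_gen2_url(account: str, filesystem: str, path: str) -> str:
    """Azure Data Lake Storage Gen2 URL (abfss scheme)."""
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path.lstrip('/')}"

print(blob_url("mystorage", "data", "heart-disease-uci"))
# https://mystorage.blob.core.windows.net/data/heart-disease-uci
```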
 
 1. Create the input or request:
@@ -723,6 +734,8 @@ To learn more about extra required configuration for reading data from storage a
 }
 ```
 
+---
+
 1. Run the endpoint:
 
 # [Azure CLI](#tab/cli)
@@ -889,6 +902,8 @@ The following example shows how to change the location where an output named `sc
 # [REST](#tab/rest)
 
 Use the Azure Machine Learning CLI, Azure Machine Learning SDK for Python, or the studio to get the data store information.
articles/machine-learning/includes/batch-endpoint-invoke-inputs-sdk.md (+4 −4)
@@ -2,12 +2,12 @@
 author: msakande
 ms.service: machine-learning
 ms.topic: include
-ms.date: 12/18/2023
+ms.date: 07/31/2024
 ms.author: mopeakande
 ---
 
-__What's the difference between `inputs` and `input` when you invoke an endpoint?__
+__What's the difference between the `inputs` and `input` parameters when you invoke an endpoint?__
 
-In general, you can use a dictionary `inputs = {}` with the `invoke` method to provide an arbitrary number of required inputs to a batch endpoint that contains a _model deployment_ or a _pipeline deployment_.
+In general, you can use a dictionary `inputs = {}` parameter with the `invoke` method to provide an arbitrary number of required inputs to a batch endpoint that contains a _model deployment_ or a _pipeline deployment_.
 
-For a _model deployment_, you can use `input` as a shorter way to specify the input data location for the deployment, since a model deployment always takes only one [data input](../how-to-access-data-batch-endpoints-jobs.md#data-inputs).
+For a _model deployment_, you can use the `input` parameter as a shorter way to specify the input data location for the deployment. This approach works because a model deployment always takes only one [data input](../how-to-access-data-batch-endpoints-jobs.md#explore-data-inputs).
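The relationship between the two parameters can be sketched in plain Python. Note that `invoke` below is a simplified, hypothetical stand-in for `MLClient.batch_endpoints.invoke`, not the real implementation:

```python
def invoke(endpoint_name: str, *, input=None, inputs=None) -> dict:
    """Simplified stand-in: normalize the shorthand `input` into an `inputs` dictionary."""
    if (input is None) == (inputs is None):
        raise ValueError("pass exactly one of `input` or `inputs`")
    if input is not None:
        # A model deployment takes exactly one data input, so its name is
        # ignored and a single value can stand in for the whole dictionary.
        inputs = {"input": input}
    return {"endpoint": endpoint_name, "inputs": inputs}

# Model deployment: the shorthand form with a single data input.
job = invoke("heart-classifier-batch", input="azureml:heart-dataset@latest")

# Pipeline deployment: one or more named inputs passed as a dictionary.
job = invoke("training-pipeline-batch", inputs={"input_data": "azureml:heart-dataset@latest"})
```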