`articles/machine-learning/concept-endpoints.md` (2 additions, 2 deletions)
@@ -89,7 +89,7 @@ Traffic allocation can be used to do safe rollout blue/green deployments by bala
:::image type="content" source="media/concept-endpoints/endpoint-concept.png" alt-text="Diagram showing an endpoint splitting traffic to two deployments.":::
- Traffic to one deployment can also be mirrored (copied) to another deployment. Mirroring is useful when you want to test for things like response latency or error conditions without impacting live clients. For example, a blue/green deployment where 100% of the traffic is routed to blue and a 10% is mirrored to green. With mirroring, the results of the traffic to the green deployment aren't returned to the clients but metrics and logs are collected. Mirror traffic functionality is a __preview__ feature.
+ Traffic to one deployment can also be mirrored (copied) to another deployment. Mirroring is useful when you want to test for things like response latency or error conditions without impacting live clients. For example, in a blue/green deployment, 100% of the traffic is routed to blue and 10% is mirrored to the green deployment. With mirroring, the results of the traffic to the green deployment aren't returned to the clients, but metrics and logs are collected. Mirror traffic functionality is a __preview__ feature.
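Conceptually, mirroring behaves like a router that always answers from the live deployment and silently copies a fraction of requests elsewhere. The following Python sketch is purely illustrative of that behavior — the function, handler, and metrics names are hypothetical and not part of any Azure ML API:

```python
import random

def route_request(request, handlers, traffic, mirror_traffic, metrics):
    """Illustrative sketch of blue/green routing with traffic mirroring.

    handlers:       maps deployment name -> callable that scores a request
    traffic:        live allocation in percent, e.g. {"blue": 100}
    mirror_traffic: mirrored share in percent, e.g. {"green": 10}
    metrics:        list collecting (deployment, response) for mirrored calls
    """
    # Choose the live deployment by weighted traffic split.
    names = list(traffic)
    live = random.choices(names, weights=[traffic[n] for n in names])[0]
    response = handlers[live](request)

    # Copy a fraction of requests to mirrored deployments; their
    # responses are logged but never returned to the client.
    for name, percent in mirror_traffic.items():
        if random.uniform(0, 100) < percent:
            metrics.append((name, handlers[name](request)))

    return response
```

With `traffic={"blue": 100}` and `mirror_traffic={"green": 10}`, every client sees blue's response, while roughly one request in ten also reaches green for metrics collection only.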
:::image type="content" source="media/concept-endpoints/endpoint-concept-mirror.png" alt-text="Diagram showing an endpoint mirroring traffic to a deployment.":::
@@ -222,7 +222,7 @@ Specify the storage output location to any datastore and path. By default, batch
- Authentication: Azure Active Directory Tokens
- SSL: enabled by default for endpoint invocation
- - VNET support: Batch endpoints support ingress protection. A batch endpoint with ingress protection will accept scoring requests only from hosts inside a virtual network but not from the public internet. A batch endpoint that is created in a private-link enabled workspace will have ingress protection. To created a private-link enabled workspace, see [Create a secure workspace](tutorial-create-secure-workspace.md).
+ - VNET support: Batch endpoints support ingress protection. A batch endpoint with ingress protection will accept scoring requests only from hosts inside a virtual network, not from the public internet. A batch endpoint that is created in a private-link enabled workspace will have ingress protection. To create a private-link enabled workspace, see [Create a secure workspace](tutorial-create-secure-workspace.md).
`articles/machine-learning/how-to-use-batch-endpoint.md` (9 additions, 28 deletions)
@@ -9,15 +9,9 @@ ms.topic: conceptual
author: dem108
ms.author: sehan
ms.reviewer: larryfr
ms.date: 04/26/2022
- <<<<<<< HEAD
ms.custom: how-to, devplatv2
- # Customer intent: As an ML engineer or data scientist, I want to create an endpoint to host my models for batch scoring, so that I can use the same endpoint continuously for different large datasets on-demand or on-schedule.
#Customer intent: As an ML engineer or data scientist, I want to create an endpoint to host my models for batch scoring, so that I can use the same endpoint continuously for different large datasets on-demand or on-schedule.
- >>>>>>> 0129ef009e25c15aafd490699d1e4ceaec0f385b
---
# Use batch endpoints for batch scoring
@@ -136,7 +130,7 @@ For the full batch deployment YAML schema, see [CLI (v2) batch deployment YAML s
|`resources.instance_count`| The number of instances to be used for each batch scoring job. |
|`max_concurrency_per_instance`|[Optional] The maximum number of parallel `scoring_script` runs per instance. |
|`mini_batch_size`|[Optional] The number of files the `scoring_script` can process in one `run()` call. |
- |`output_action`|[Optional] How the output should be organized in the output file. `append_row` will merge all `run()` returned output results into one single file named `output_file_name`. `summary_only`will not merge the output results and only calculate `error_threshold`. |
+ |`output_action`|[Optional] How the output should be organized in the output file. `append_row` will merge all `run()` returned output results into a single file named `output_file_name`. `summary_only` won't merge the output results and will only calculate `error_threshold`. |
|`output_file_name`|[Optional] The name of the batch scoring output file for the `append_row` `output_action`. |
|`retry_settings.max_retries`|[Optional] The maximum number of retries for a failed `scoring_script` `run()`. |
|`retry_settings.timeout`|[Optional] The timeout in seconds for a `scoring_script` `run()` scoring a mini batch. |
@@ -150,7 +144,7 @@ As mentioned earlier, the `code_configuration.scoring_script` must contain two f
- `init()`: Use this function for any costly or common preparation. For example, use it to load the model into a global object. This function will be called once at the beginning of the process.
- `run(mini_batch)`: This function will be called for each `mini_batch` and do the actual scoring.
- `mini_batch`: The `mini_batch` value is a list of file paths.
- - `response`: The `run()` method should return a pandas DataFrame or an array. Each returned output element indicates one successful run of an input element in the input `mini_batch`. Make sure that enough data is included in your `run()` response to correlate the input with the output. The resulting DataFrame or array is populated according to this scoring script. It is up to you how much or how little information you’d like to output to correlate output values with the input value, e.g. the array can represent a list of tuples containing both the model's output and input. There is no requirement on the cardinality of the results. All elements in the result DataFrame or array will be written to the output file as-is (given that the `output_action`is not`summary_only`).
+ - `response`: The `run()` method should return a pandas DataFrame or an array. Each returned output element indicates one successful run of an input element in the input `mini_batch`. Make sure that enough data is included in your `run()` response to correlate the input with the output. The resulting DataFrame or array is populated according to this scoring script. It's up to you how much or how little information you'd like to output to correlate output values with the input value; for example, the array can represent a list of tuples containing both the model's output and input. There's no requirement on the cardinality of the results. All elements in the result DataFrame or array will be written to the output file as-is (given that the `output_action` isn't `summary_only`).
The example uses `/cli/endpoints/batch/mnist/code/digit_identification.py`. The model is loaded in `init()` from `AZUREML_MODEL_DIR`, which is the path to the model folder created during deployment. `run(mini_batch)` iterates over each file in `mini_batch`, does the actual model scoring, and then returns the output results.
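As a concrete illustration of the `init()`/`run()` contract described above, here is a minimal scoring-script sketch. It is not the MNIST example itself: the model loading is stubbed out (a real script would load, for example, a serialized model from `AZUREML_MODEL_DIR`), and the per-file "prediction" is a placeholder so the sketch stays self-contained:

```python
import os

import pandas as pd

model = None  # populated once per worker by init()

def init():
    # Called once at the beginning of the process: do costly,
    # shared preparation here. AZUREML_MODEL_DIR is the model folder
    # created during deployment; a real script would load the model
    # from it (e.g. with joblib or mlflow).
    global model
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    model = object()  # placeholder standing in for a real model

def run(mini_batch):
    # Called once per mini-batch; mini_batch is a list of file paths.
    # Return one row per successfully scored input, and include the
    # file name so outputs can be correlated back to inputs.
    rows = []
    for path in mini_batch:
        prediction = os.path.getsize(path)  # stand-in for model scoring
        rows.append({"file": os.path.basename(path), "prediction": prediction})
    return pd.DataFrame(rows)
```

Because each row carries the input file name alongside the prediction, `append_row` output can be traced back to its inputs even after all mini-batch results are merged into one file.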
@@ -194,15 +188,12 @@ Invoke a batch endpoint triggers a batch scoring job. A job `name` will be retur
#### Invoke the batch endpoint with different input options
You can either use the CLI or REST to `invoke` the endpoint. For the REST experience, see [Use batch endpoints with REST](how-to-deploy-batch-with-rest.md).
- <<<<<<< HEAD
There are several options to specify the data inputs in CLI `invoke`.
* __Option 1-1: Data in the cloud__
- =======
- >>>>>>> 0129ef009e25c15aafd490699d1e4ceaec0f385b
- Use `--input` and `--input-type` to specify a file or folder on an Azure Machine Learning registered datastore or a publicly accessible path. When you are specifying a single file, use `--input-type uri_file`, and when you are specifying a folder, use `--input-type uri_folder`).
+ Use `--input` and `--input-type` to specify a file or folder on an Azure Machine Learning registered datastore or a publicly accessible path. When you're specifying a single file, use `--input-type uri_file`, and when you're specifying a folder, use `--input-type uri_folder`.
When the file or folder is on an Azure ML registered datastore, the syntax for the URI is `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/` for a folder, and `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/<file-name>` for a specific file. When the file or folder is on a publicly accessible path, the syntax for the URI is `https://<public-path>/` for a folder, and `https://<public-path>/<file-name>` for a specific file.
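The two datastore URI shapes above can be captured in a small helper; this function is just an illustration of the URI format, not part of the Azure ML CLI or SDK:

```python
def datastore_uri(datastore_name, path_on_datastore, file_name=None):
    """Build the azureml:// URI for a registered-datastore input.

    Folder: azureml://datastores/<datastore-name>/paths/<path-on-datastore>/
    File:   azureml://datastores/<datastore-name>/paths/<path-on-datastore>/<file-name>
    """
    folder = f"azureml://datastores/{datastore_name}/paths/{path_on_datastore.strip('/')}/"
    return folder + file_name if file_name else folder
```

For example, `datastore_uri("workspaceblobstore", "mnist")` yields the folder form, while passing a third argument appends the file name for the `uri_file` case.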
@@ -214,25 +205,15 @@ There are several options to specify the data inputs in CLI `invoke`.
* __Option 1-2: Registered data asset__
- <<<<<<< HEAD
- Use `--input` to pass in an Azure Machine Learning registered V2 data asset (with the type of either `uri_file` or `url_folder`). You do not need to specify `--input-type` in this option. The syntax for this option is `azureml:<dataset-name>:<dataset-version>`.
+ Use `--input` to pass in an Azure Machine Learning registered V2 data asset (with the type of either `uri_file` or `uri_folder`). You don't need to specify `--input-type` in this option. The syntax for this option is `azureml:<dataset-name>:<dataset-version>`.
```azurecli
az ml batch-endpoint invoke --name $ENDPOINT_NAME --input azureml:<dataset-name>:<dataset-version>
- =======
- Use `--input-data` to pass in an Azure Machine Learning registered V1 `FileDataset`. While full backward compatibility is provided, if your intention with your V1 `FileDataset` assets was to have a single path to a file or folder with no loading transforms (sample, take, filter, etc.), then we recommend that you re-create them as a `uri_file`/`uri_folder` using the CLI v2 and use `--input-path` parameter to use with batch endpoint. V1 `TabularDataset` is not supported.
- > [!NOTE]
- > For more information on V2 data assets, see [Work with data using SDK v2 preview](how-to-use-data.md). As we enable the abstraction for tabular data called `mltable` for batch endpoint in the future, migration from V1 data assets (specifically `FileDataset`) to V2 data assets (`mltable`) will be required.
- ```azurecli
- az ml batch-endpoint invoke --name $ENDPOINT_NAME --input-data azureml:<dataset-name>:<dataset-version>
- >>>>>>> 0129ef009e25c15aafd490699d1e4ceaec0f385b
```
* __Option 2: Data stored locally__
- Use `--input` to pass in data files stored locally. You do not need to specify `--input-type` in this option. The data files will be automatically uploaded as a folder to Azure ML datastore, and passed to the batch scoring job.
+ Use `--input` to pass in data files stored locally. You don't need to specify `--input-type` in this option. The data files will be automatically uploaded as a folder to the Azure ML datastore and passed to the batch scoring job.
```azurecli
az ml batch-endpoint invoke --name $ENDPOINT_NAME --input <local-path>
@@ -245,7 +226,7 @@ There are several options to specify the data inputs in CLI `invoke`.
#### Configure the output location and overwrite settings
- The batch scoring results are by default stored in the workspace's default blob store within a folder named by job name (a system-generated GUID). You can configure where to store the scoring outputs when you invoke the batch endpoint. Use `--output-path` to configure any folder in an Azure Machine Learning registered datastore. The syntax for the `--output-path` is the same as `--input` when you are specifying a folder, i.e., `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/`. The prefix `folder:` is not required any more. Use `--set output_file_name=<your-file-name>` to configure a new output file name if you prefer having one output file containing all scoring results (specified `output_action=append_row` in your deployment YAML).
+ The batch scoring results are by default stored in the workspace's default blob store within a folder named after the job name (a system-generated GUID). You can configure where to store the scoring outputs when you invoke the batch endpoint. Use `--output-path` to configure any folder in an Azure Machine Learning registered datastore. The syntax for `--output-path` is the same as `--input` when you're specifying a folder, that is, `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/`. The prefix `folder:` isn't required anymore. Use `--set output_file_name=<your-file-name>` to configure a new output file name if you prefer having one output file containing all scoring results (when you've specified `output_action=append_row` in your deployment YAML).
> [!IMPORTANT]
> You must use a unique output location. If the output file exists, the batch scoring job will fail.
@@ -296,7 +277,7 @@ To create a new batch deployment under the existing batch endpoint but not set i
- Notice that `--set-default` is not used. If you `show` the batch endpoint again, you should see no change of the `defaults.deployment_name`.
+ Notice that `--set-default` isn't used. If you `show` the batch endpoint again, you should see no change to `defaults.deployment_name`.
The example uses a model (`/cli/endpoints/batch/autolog_nyc_taxi`) trained and tracked with MLflow. `scoring_script` and `environment` can be auto-generated from the model's metadata, so there's no need to specify them in the YAML file. For more about MLflow, see [Train and track ML models with MLflow and Azure Machine Learning](how-to-use-mlflow.md).
@@ -333,7 +314,7 @@ If you aren't going to use the old batch deployment, you should delete it by run