
Commit b053763

Author: Larry Franks
Commit message: fixing merge conflicts
1 parent bf8540b commit b053763

File tree

2 files changed: +11, -30 lines


articles/machine-learning/concept-endpoints.md

Lines changed: 2 additions & 2 deletions
@@ -89,7 +89,7 @@ Traffic allocation can be used to do safe rollout blue/green deployments by bala
 
 :::image type="content" source="media/concept-endpoints/endpoint-concept.png" alt-text="Diagram showing an endpoint splitting traffic to two deployments.":::
 
-Traffic to one deployment can also be mirrored (copied) to another deployment. Mirroring is useful when you want to test for things like response latency or error conditions without impacting live clients. For example, a blue/green deployment where 100% of the traffic is routed to blue and a 10% is mirrored to green. With mirroring, the results of the traffic to the green deployment aren't returned to the clients but metrics and logs are collected. Mirror traffic functionality is a __preview__ feature.
+Traffic to one deployment can also be mirrored (copied) to another deployment. Mirroring is useful when you want to test for things like response latency or error conditions without impacting live clients. For example, in a blue/green deployment, 100% of the traffic is routed to blue and 10% is mirrored to the green deployment. With mirroring, the results of the traffic to the green deployment aren't returned to the clients, but metrics and logs are collected. Mirror traffic functionality is a __preview__ feature.
 
 :::image type="content" source="media/concept-endpoints/endpoint-concept-mirror.png" alt-text="Diagram showing an endpoint mirroring traffic to a deployment.":::
 

@@ -222,7 +222,7 @@ Specify the storage output location to any datastore and path. By default, batch
 
 - Authentication: Azure Active Directory Tokens
 - SSL: enabled by default for endpoint invocation
-- VNET support: Batch endpoints support ingress protection. A batch endpoint with ingress protection will accept scoring requests only from hosts inside a virtual network but not from the public internet. A batch endpoint that is created in a private-link enabled workspace will have ingress protection. To created a private-link enabled workspace, see [Create a secure workspace](tutorial-create-secure-workspace.md).
+- VNET support: Batch endpoints support ingress protection. A batch endpoint with ingress protection accepts scoring requests only from hosts inside a virtual network, not from the public internet. A batch endpoint created in a private-link enabled workspace has ingress protection. To create a private-link enabled workspace, see [Create a secure workspace](tutorial-create-secure-workspace.md).
 
 ## Next steps
 

articles/machine-learning/how-to-use-batch-endpoint.md

Lines changed: 9 additions & 28 deletions
@@ -9,15 +9,9 @@ ms.topic: conceptual
 author: dem108
 ms.author: sehan
 ms.reviewer: larryfr
-ms.date: 04/26/2022
-<<<<<<< HEAD
-ms.custom: how-to, devplatv2
-
-# Customer intent: As an ML engineer or data scientist, I want to create an endpoint to host my models for batch scoring, so that I can use the same endpoint continuously for different large datasets on-demand or on-schedule.
-=======
+ms.date: 05/24/2022
 ms.custom: how-to, devplatv2, event-tier1-build-2022
 #Customer intent: As an ML engineer or data scientist, I want to create an endpoint to host my models for batch scoring, so that I can use the same endpoint continuously for different large datasets on-demand or on-schedule.
->>>>>>> 0129ef009e25c15aafd490699d1e4ceaec0f385b
 ---
 
 # Use batch endpoints for batch scoring
@@ -136,7 +130,7 @@ For the full batch deployment YAML schema, see [CLI (v2) batch deployment YAML s
 | `resources.instance_count` | The number of instances to be used for each batch scoring job. |
 | `max_concurrency_per_instance` | [Optional] The maximum number of parallel `scoring_script` runs per instance. |
 | `mini_batch_size` | [Optional] The number of files the `scoring_script` can process in one `run()` call. |
-| `output_action` | [Optional] How the output should be organized in the output file. `append_row` will merge all `run()` returned output results into one single file named `output_file_name`. `summary_only` will not merge the output results and only calculate `error_threshold`. |
+| `output_action` | [Optional] How the output should be organized in the output file. `append_row` merges all `run()` output results into a single file named `output_file_name`. `summary_only` doesn't merge the output results and only calculates `error_threshold`. |
 | `output_file_name` | [Optional] The name of the batch scoring output file for `append_row` `output_action`. |
 | `retry_settings.max_retries` | [Optional] The number of max tries for a failed `scoring_script` `run()`. |
 | `retry_settings.timeout` | [Optional] The timeout in seconds for a `scoring_script` `run()` for scoring a mini batch. |
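For orientation, a deployment YAML exercising the keys in the table above might look like the following sketch. Every name, path, and the compute target here are hypothetical placeholders, not values taken from this commit:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: my-batch-deployment        # hypothetical deployment name
endpoint_name: my-batch-endpoint # hypothetical endpoint name
model:
  path: ./model                  # hypothetical local model folder
code_configuration:
  code: ./code
  scoring_script: score.py       # must define init() and run()
compute: azureml:batch-cluster   # hypothetical AmlCompute cluster
resources:
  instance_count: 2
max_concurrency_per_instance: 2
mini_batch_size: 10
output_action: append_row
output_file_name: predictions.csv
retry_settings:
  max_retries: 3
  timeout: 30
```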
@@ -150,7 +144,7 @@ As mentioned earlier, the `code_configuration.scoring_script` must contain two f
 - `init()`: Use this function for any costly or common preparation. For example, use it to load the model into a global object. This function will be called once at the beginning of the process.
 - `run(mini_batch)`: This function will be called for each `mini_batch` and do the actual scoring.
 - `mini_batch`: The `mini_batch` value is a list of file paths.
-- `response`: The `run()` method should return a pandas DataFrame or an array. Each returned output element indicates one successful run of an input element in the input `mini_batch`. Make sure that enough data is included in your `run()` response to correlate the input with the output. The resulting DataFrame or array is populated according to this scoring script. It is up to you how much or how little information you’d like to output to correlate output values with the input value, e.g. the array can represent a list of tuples containing both the model's output and input. There is no requirement on the cardinality of the results. All elements in the result DataFrame or array will be written to the output file as-is (given that the `output_action` is not `summary_only`).
+- `response`: The `run()` method should return a pandas DataFrame or an array. Each returned output element indicates one successful run of an input element in the input `mini_batch`. Make sure that enough data is included in your `run()` response to correlate the input with the output; for example, the array can represent a list of tuples containing both the model's output and its input. There's no requirement on the cardinality of the results. All elements in the result DataFrame or array will be written to the output file as-is (given that the `output_action` isn't `summary_only`).
 
 The example uses `/cli/endpoints/batch/mnist/code/digit_identification.py`. The model is loaded in `init()` from `AZUREML_MODEL_DIR`, which is the path to the model folder created during deployment. `run(mini_batch)` iterates over each file in `mini_batch`, does the actual model scoring, and then returns the output results.
 
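A minimal scoring script satisfying this `init()`/`run()` contract might look like the following sketch. It isn't the repository's `digit_identification.py`; the "model" here is a hypothetical stand-in that scores each file by its size in bytes, purely to show the shape of the contract:

```python
import os

model = None  # populated once by init()

def init():
    """Called once per worker process; do costly setup (e.g. model loading) here."""
    global model
    # AZUREML_MODEL_DIR is set by the platform; the fallback is only for local testing.
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    # A real script would deserialize a model file found under model_dir.
    # Hypothetical stand-in: "score" a file by its size in bytes.
    model = lambda path: os.path.getsize(path)

def run(mini_batch):
    """Called once per mini-batch; mini_batch is a list of file paths."""
    results = []
    for path in mini_batch:
        # Include the file name so each output row can be correlated with its input.
        results.append((os.path.basename(path), model(path)))
    # One element per successfully scored input; with output_action append_row,
    # each element becomes one row of the combined output file.
    return results
```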
@@ -194,15 +188,12 @@ Invoke a batch endpoint triggers a batch scoring job. A job `name` will be retur
 #### Invoke the batch endpoint with different input options
 
 You can either use the CLI or REST to `invoke` the endpoint. For the REST experience, see [Use batch endpoints with REST](how-to-deploy-batch-with-rest.md).
-<<<<<<< HEAD
 
 There are several options to specify the data inputs in CLI `invoke`.
 
 * __Option 1-1: Data in the cloud__
-=======
->>>>>>> 0129ef009e25c15aafd490699d1e4ceaec0f385b
 
-Use `--input` and `--input-type` to specify a file or folder on an Azure Machine Learning registered datastore or a publicly accessible path. When you are specifying a single file, use `--input-type uri_file`, and when you are specifying a folder, use `--input-type uri_folder`).
+Use `--input` and `--input-type` to specify a file or folder on an Azure Machine Learning registered datastore or a publicly accessible path. When you're specifying a single file, use `--input-type uri_file`; when you're specifying a folder, use `--input-type uri_folder`.
 
 When the file or folder is on an Azure ML registered datastore, the syntax for the URI is `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/` for a folder, and `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/<file-name>` for a specific file. When the file or folder is on a publicly accessible path, the syntax for the URI is `https://<public-path>/` for a folder, and `https://<public-path>/<file-name>` for a specific file.
 
@@ -214,25 +205,15 @@ There are several options to specify the data inputs in CLI `invoke`.
 
 * __Option 1-2: Registered data asset__
 
-<<<<<<< HEAD
-Use `--input` to pass in an Azure Machine Learning registered V2 data asset (with the type of either `uri_file` or `url_folder`). You do not need to specify `--input-type` in this option. The syntax for this option is `azureml:<dataset-name>:<dataset-version>`.
+Use `--input` to pass in an Azure Machine Learning registered V2 data asset (with the type of either `uri_file` or `uri_folder`). You don't need to specify `--input-type` in this option. The syntax for this option is `azureml:<dataset-name>:<dataset-version>`.
 
 ```azurecli
 az ml batch-endpoint invoke --name $ENDPOINT_NAME --input azureml:<dataset-name>:<dataset-version>
-=======
-Use `--input-data` to pass in an Azure Machine Learning registered V1 `FileDataset`. While full backward compatibility is provided, if your intention with your V1 `FileDataset` assets was to have a single path to a file or folder with no loading transforms (sample, take, filter, etc.), then we recommend that you re-create them as a `uri_file`/`uri_folder` using the CLI v2 and use the `--input-path` parameter with batch endpoint. V1 `TabularDataset` is not supported.
-
-> [!NOTE]
-> For more information on V2 data assets, see [Work with data using SDK v2 preview](how-to-use-data.md). As we enable the abstraction for tabular data called `mltable` for batch endpoint in the future, migration from V1 data assets (specifically `FileDataset`) to V2 data assets (`mltable`) will be required.
-
-```azurecli
-az ml batch-endpoint invoke --name $ENDPOINT_NAME --input-data azureml:<dataset-name>:<dataset-version>
->>>>>>> 0129ef009e25c15aafd490699d1e4ceaec0f385b
 ```
 
 * __Option 2: Data stored locally__
 
-Use `--input` to pass in data files stored locally. You do not need to specify `--input-type` in this option. The data files will be automatically uploaded as a folder to Azure ML datastore, and passed to the batch scoring job.
+Use `--input` to pass in data files stored locally. You don't need to specify `--input-type` in this option. The data files are automatically uploaded as a folder to the Azure ML datastore and passed to the batch scoring job.
 
 ```azurecli
 az ml batch-endpoint invoke --name $ENDPOINT_NAME --input <local-path>
@@ -245,7 +226,7 @@ There are several options to specify the data inputs in CLI `invoke`.
 
 #### Configure the output location and overwrite settings
 
-The batch scoring results are by default stored in the workspace's default blob store within a folder named by job name (a system-generated GUID). You can configure where to store the scoring outputs when you invoke the batch endpoint. Use `--output-path` to configure any folder in an Azure Machine Learning registered datastore. The syntax for the `--output-path` is the same as `--input` when you are specifying a folder, i.e., `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/`. The prefix `folder:` is not required any more. Use `--set output_file_name=<your-file-name>` to configure a new output file name if you prefer having one output file containing all scoring results (specified `output_action=append_row` in your deployment YAML).
+The batch scoring results are stored by default in the workspace's default blob store, within a folder named after the job name (a system-generated GUID). You can configure where to store the scoring outputs when you invoke the batch endpoint. Use `--output-path` to configure any folder in an Azure Machine Learning registered datastore. The syntax for `--output-path` is the same as `--input` when you're specifying a folder, that is, `azureml://datastores/<datastore-name>/paths/<path-on-datastore>/`. The prefix `folder:` isn't required anymore. Use `--set output_file_name=<your-file-name>` to configure a new output file name if you prefer one output file containing all scoring results (that is, when your deployment YAML specifies `output_action: append_row`).
 
 > [!IMPORTANT]
 > You must use a unique output location. If the output file exists, the batch scoring job will fail.
@@ -296,7 +277,7 @@ To create a new batch deployment under the existing batch endpoint but not set i
 
 :::code language="azurecli" source="~/azureml-examples-main/cli/batch-score.sh" ID="create_new_deployment_not_default" :::
 
-Notice that `--set-default` is not used. If you `show` the batch endpoint again, you should see no change of the `defaults.deployment_name`.
+Notice that `--set-default` isn't used. If you `show` the batch endpoint again, you should see no change to `defaults.deployment_name`.
 
 The example uses a model (`/cli/endpoints/batch/autolog_nyc_taxi`) trained and tracked with MLflow. `scoring_script` and `environment` can be auto-generated from the model's metadata; there's no need to specify them in the YAML file. For more about MLflow, see [Train and track ML models with MLflow and Azure Machine Learning](how-to-use-mlflow.md).
 
@@ -333,7 +314,7 @@ If you aren't going to use the old batch deployment, you should delete it by run
 
 ::: code language="azurecli" source="~/azureml-examples-main/cli/batch-score.sh" ID="delete_deployment" :::
 
-Run the following code to delete the batch endpoint and all the underlying deployments. Batch scoring jobs will not be deleted.
+Run the following code to delete the batch endpoint and all the underlying deployments. Batch scoring jobs won't be deleted.
 
 ::: code language="azurecli" source="~/azureml-examples-main/cli/batch-score.sh" ID="delete_endpoint" :::
 