Commit d31ac9c

Merge pull request #228467 from santiagxf/santiagxf/azureml-batch-scoring
Update how-to-batch-scoring-script.md
2 parents 2f71de7 + a3b14e4 commit d31ac9c

1 file changed: +16 -13 lines changed

articles/machine-learning/how-to-batch-scoring-script.md

Lines changed: 16 additions & 13 deletions
@@ -17,10 +17,10 @@ ms.custom: how-to
1717

1818
[!INCLUDE [cli v2](../../includes/machine-learning-dev-v2.md)]
1919

20-
Batch endpoints allow you to deploy models to perform inference at scale. Because how inference should be executed varies from model's format, model's type and use case, batch endpoints require a scoring script (also known as batch driver script) to indicate the deployment how to use the model over the provided data. In this article you will learn how to use scoring scripts in different scenarios and their best practices.
20+
Batch endpoints allow you to deploy models to perform long-running inference at scale. To indicate how batch endpoints should use your model over the input data to create predictions, you need to create and specify a scoring script (also known as batch driver script). In this article, you will learn how to use scoring scripts in different scenarios and their best practices.
2121

2222
> [!TIP]
23-
> MLflow models don't require a scoring script as it is autogenerated for you. For more details about how batch endpoints work with MLflow models, see the dedicated tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md). Notice that this feature doesn't prevent you from writing an specific scoring script for MLflow models as explained at [Using MLflow models with a scoring script](how-to-mlflow-batch.md#customizing-mlflow-models-deployments-with-a-scoring-script).
23+
> MLflow models don't require a scoring script as it is autogenerated for you. For more details about how batch endpoints work with MLflow models, see the dedicated tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md). If you want to change the default inference routine, write a scoring script for your MLflow models as explained in [Using MLflow models with a scoring script](how-to-mlflow-batch.md#customizing-mlflow-models-deployments-with-a-scoring-script).
2424
2525
> [!WARNING]
2626
> If you are deploying an Automated ML model under a batch endpoint, notice that the scoring script that Automated ML provides only works for Online Endpoints and it is not designed for batch execution. Please follow this guideline to learn how to create one depending on what your model does.
@@ -33,7 +33,7 @@ The scoring script must contain two methods:
3333

3434
#### The `init` method
3535

36-
Use the `init()` method for any costly or common preparation. For example, use it to load the model into a global object. This function will be called once at the beginning of the process. You model's files will be available in an environment variable called `AZUREML_MODEL_DIR`. Use this variable to locate the files associated with the model. Notice that some models may be contained in a folder (in the following example, the model has several files in a folder named `model`). See [how you can find out what's the folder used by your model](#using-models-that-are-folders).
36+
Use the `init()` method for any costly or common preparation. For example, use it to load the model into memory. This function is called once at the beginning of the entire batch job. Your model's files are available in a path determined by the environment variable `AZUREML_MODEL_DIR`. Notice that depending on how your model was registered, its files may be contained in a folder (in the following example, the model has several files in a folder named `model`). See [how you can find out what's the folder used by your model](#using-models-that-are-folders).
3737

3838
```python
3939
def init():
@@ -51,10 +51,13 @@ Notice that in this example we are placing the model in a global variable `model
5151

5252
#### The `run` method
5353

54-
Use the `run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]` method to perform the scoring of each mini-batch generated by the batch deployment. Such method will be called once per each `mini_batch` generated for your input data. Batch deployments read data in batches accordingly to how the deployment is configured.
54+
Use the `run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]` method to perform the scoring of each mini-batch generated by the batch deployment. This method is called once for each `mini_batch` generated for your input data. Batch deployments read data in batches according to how the deployment is configured.
5555

5656
```python
57-
def run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]:
57+
import pandas as pd
58+
from typing import List, Any, Union
59+
60+
def run(mini_batch: List[str]) -> Union[List[Any], pd.DataFrame]:
5861
    results = []
5962

6063
    for file in mini_batch:
@@ -63,14 +66,14 @@ def run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]:
6366
    return pd.DataFrame(results)
6467
```
6568

66-
The method receives a list of file paths as a parameter (`mini_batch`). You can use this list to either iterate over each file and process it one by one, or to read the entire batch and process it at once. The best option will depend on your compute memory and the throughput you need to achieve. For an example of how to read entire batches of data at once see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments).
69+
The method receives a list of file paths as a parameter (`mini_batch`). You can use this list either to iterate over each file and process it one by one, or to read the entire batch and process it at once. The best option depends on your compute memory and the throughput you need to achieve. For an example of how to read entire batches of data at once, see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments).
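
As a rough illustration of the two strategies described above, the following sketch contrasts scoring file by file with scoring the whole mini-batch at once. It assumes the inputs are CSV files, and `predict` is a hypothetical stand-in for your model's prediction function. Neither assumption comes from this article, and the helper names are made up for the example.

```python
import os
from typing import Callable, List

import pandas as pd

# Hypothetical stand-in for a model: takes a DataFrame of features and
# returns one prediction per row.
Predict = Callable[[pd.DataFrame], list]


def score_file_by_file(mini_batch: List[str], predict: Predict) -> pd.DataFrame:
    """Read and score one CSV file at a time (lower peak memory)."""
    results = []
    for file_path in mini_batch:
        data = pd.read_csv(file_path)
        predictions = predict(data)
        file_name = os.path.basename(file_path)
        results.append(
            pd.DataFrame(
                {"file": [file_name] * len(predictions), "prediction": predictions}
            )
        )
    return pd.concat(results, ignore_index=True)


def score_whole_batch(mini_batch: List[str], predict: Predict) -> pd.DataFrame:
    """Load every file in the mini-batch first, then call the model once."""
    frames = [pd.read_csv(file_path) for file_path in mini_batch]
    # Remember which file each row came from so the output stays traceable.
    file_names = [
        os.path.basename(file_path)
        for file_path, frame in zip(mini_batch, frames)
        for _ in range(len(frame))
    ]
    data = pd.concat(frames, ignore_index=True)
    return pd.DataFrame({"file": file_names, "prediction": predict(data)})
```

Both functions return the same rows. The second one trades a higher peak memory use for fewer calls to the model, which tends to help when the model benefits from larger inputs.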
6770

6871
> [!NOTE]
6972
> __How is work distributed?__
7073
>
7174
> Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files will generate 10 batches of 10 files each. Notice that this will happen regardless of the size of the files involved. If your files are too big to be processed in large mini-batches, we suggest either splitting them into smaller files to achieve a higher level of parallelism or decreasing the number of files per mini-batch. At this moment, batch deployments can't account for skews in the file size distribution.
7275
73-
The `run()` method should return a Pandas `DataFrame` or an array/list. Each returned output element indicates one successful run of an input element in the input `mini_batch`. For file datasets, each row/element will represent a single file processed. For a tabular dataset, each row/element will represent a row in a processed file.
76+
The `run()` method should return a Pandas `DataFrame` or an array/list. Each returned output element indicates one successful run of an input element in the input `mini_batch`. For file datasets, each row/element represents a single file processed. For a tabular dataset, each row/element represents a row in a processed file.
7477

7578
> [!IMPORTANT]
7679
> __How to write predictions?__
@@ -82,11 +85,11 @@ The `run()` method should return a Pandas `DataFrame` or an array/list. Each ret
8285
> [!WARNING]
8386
> Do not output complex data types (or lists of complex data types) in the `run` function. Those outputs will be transformed to strings and will be hard to read.
8487
85-
The resulting DataFrame or array is appended to the output file indicated. There's no requirement on the cardinality of the results (1 file can generate 1 or many rows/elements in the output). All elements in the result DataFrame or array will be written to the output file as-is (considering the `output_action` isn't `summary_only`).
88+
The resulting DataFrame or array is appended to the output file indicated. There's no requirement on the cardinality of the results (1 file can generate 1 or many rows/elements in the output). All elements in the result DataFrame or array are written to the output file as-is (provided that `output_action` isn't `summary_only`).
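
To make the cardinality point concrete, here is a minimal sketch of a `run` function where one input file produces several output elements. It assumes plain-text input files, and the "prediction" is only a placeholder (the character count of each line) rather than a real model call. Each element is a simple string, in line with the warning above about complex data types.

```python
import os
from typing import List


def run(mini_batch: List[str]) -> List[str]:
    """Return one plain-string element per input line, so a file may yield many rows."""
    results = []
    for file_path in mini_batch:
        file_name = os.path.basename(file_path)
        with open(file_path, encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                # Placeholder "prediction": number of characters in the line.
                placeholder_prediction = len(line.rstrip("\n"))
                results.append(f"{file_name},{line_number},{placeholder_prediction}")
    return results
```

Including the file name in each element is a design choice for this sketch: it keeps every output row traceable to its input when many files share one output file.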
8689

8790
#### Python packages for scoring
8891

89-
Any library that your scoring script requires to run needs to be indicated in the environment where your batch deployment runs. As for scoring scripts, environments are indicated per deployment. Usually, you will indicate your requirements using a `conda.yml` dependencies file which may look as follows:
92+
Any library that your scoring script requires to run needs to be indicated in the environment where your batch deployment runs. As with scoring scripts, environments are indicated per deployment. Usually, you indicate your requirements using a `conda.yml` dependencies file, which may look as follows:
9093

9194
__mnist/environment/conda.yml__
9295

@@ -96,7 +99,7 @@ Refer to [Create a batch deployment](how-to-use-batch-endpoint.md#create-a-batch
9699

97100
## Writing predictions in a different way
98101

99-
By default, the batch deployment will write the model's predictions in a single file as indicated in the deployment. However, there are some cases where you need to write the predictions in multiple files. For instance, if the input data is partitioned, you typically would want to generate your output partitioned too. On those cases you can [Customize outputs in batch deployments](how-to-deploy-model-custom-output.md) to indicate:
102+
By default, the batch deployment writes the model's predictions in a single file as indicated in the deployment. However, there are some cases where you need to write the predictions in multiple files. For instance, if the input data is partitioned, you typically would want to generate your output partitioned too. In those cases you can [Customize outputs in batch deployments](how-to-deploy-model-custom-output.md) to indicate:
100103

101104
> [!div class="checklist"]
102105
> * The file format used (CSV, parquet, json, etc).
@@ -120,13 +123,13 @@ When writing scoring scripts that work with big amounts of data, you need to tak
120123
* The memory footprint of the model when running over the input data.
121124
* The available memory in your compute.
122125

123-
Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files will generate 10 batches of 10 files each. Notice that this will happen regardless of the size of the files involved. If your files are too big to be processed in large mini-batches we suggest to either split the files in smaller files to achieve a higher level of parallelism or to decrease the number of files per mini-batch. At this moment, batch deployment can't account for skews in the file's size distribution.
126+
Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files will generate 10 batches of 10 files each (regardless of the size of the files involved). If your files are too big to be processed in large mini-batches, we suggest either splitting them into smaller files to achieve a higher level of parallelism or decreasing the number of files per mini-batch. At this moment, batch deployments can't account for skews in the file size distribution.
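
The arithmetic in the paragraph above can be sketched as follows. This is only an illustration of file-level chunking, not the service's actual implementation: the function name, the file names, and the order in which files are grouped are all assumptions made for the example.

```python
from typing import List


def split_into_mini_batches(files: List[str], mini_batch_size: int) -> List[List[str]]:
    """Group files into consecutive chunks of at most `mini_batch_size` files.

    File sizes are deliberately ignored, mirroring the file-level distribution
    described above.
    """
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    return [
        files[start:start + mini_batch_size]
        for start in range(0, len(files), mini_batch_size)
    ]


# Hypothetical folder listing with 100 files.
files = [f"file_{index:03d}.csv" for index in range(100)]
batches = split_into_mini_batches(files, mini_batch_size=10)
print(len(batches), len(batches[0]))  # 10 10
```

Because only the file count matters, a mini-batch holding ten very large files takes far longer, and far more memory, than one holding ten small files.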
124127

125128
### Relationship between the degree of parallelism and the scoring script
126129

127130
Your deployment configuration controls the size of each mini-batch and the number of workers on each node. Take them into account when deciding if you want to read the entire mini-batch to perform inference, or if you want to run inference file by file, or row by row (for tabular). See [Running inference at the mini-batch, file or the row level](#running-inference-at-the-mini-batch-file-or-the-row-level) to see the different approaches.
128131

129-
When running multiple workers on the same instance, take into account that memory will be shared across all the workers. Usually, increasing the number of workers per node should be accompanied by a decrease in the mini-batch size or by a change in the scoring strategy (if data size and compute SKU remains the same).
132+
When running multiple workers on the same instance, take into account that memory is shared across all the workers. Usually, increasing the number of workers per node should be accompanied by a decrease in the mini-batch size or by a change in the scoring strategy (if data size and compute SKU remain the same).
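
A back-of-the-envelope check of this trade-off might look like the sketch below. It is a simplification under stated assumptions: every worker holds its own copy of the model and loads its whole mini-batch into memory at once, and all the numbers are made up. Real memory use also depends on the framework, the data format, and the runtime overhead, so treat it as a starting point for sizing and not as a rule from the service.

```python
def fits_in_node_memory(
    node_memory_gb: float,
    workers_per_node: int,
    model_memory_gb: float,
    mini_batch_size: int,
    avg_file_memory_gb: float,
) -> bool:
    """Rough estimate: can all workers on a node score a full mini-batch at once?"""
    # Assumption: each worker keeps one model copy plus its whole mini-batch in memory.
    per_worker_gb = model_memory_gb + mini_batch_size * avg_file_memory_gb
    return workers_per_node * per_worker_gb <= node_memory_gb


# Hypothetical node with 32 GB, a 2 GB model, and files of about 0.5 GB in memory.
print(fits_in_node_memory(32, 4, 2, 10, 0.5))
print(fits_in_node_memory(32, 8, 2, 10, 0.5))
print(fits_in_node_memory(32, 8, 2, 4, 0.5))
```

In this made-up scenario, doubling the workers from 4 to 8 no longer fits unless the mini-batch size drops from 10 to 4 files, which is the adjustment the paragraph above recommends.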
130133

131134
### Running inference at the mini-batch, file or the row level
132135

@@ -139,7 +142,7 @@ You will typically want to run inference over the batch all at once when you wan
139142
> [!WARNING]
140143
> Running inference at the batch level may require tight control over the input data size to correctly account for the memory requirements and avoid out-of-memory exceptions. Whether you can load the entire mini-batch in memory depends on the size of the mini-batch, the size of the instances in the cluster, and the number of workers on each node.
141144
142-
For an example about how to achieve it see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments). This example processes an entire batch of files at a time.
145+
For an example of how to achieve this, see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments). This example processes an entire batch of files at a time.
143146

144147
#### File level
145148
