Commit 89a3a40

Merge pull request #228235 from santiagxf/santiagxf/azureml-batch-hint
Update how-to-batch-scoring-script.md
2 parents 7ba51ce + 92a89ea commit 89a3a40

File tree

1 file changed
+36 −7 lines changed

articles/machine-learning/how-to-batch-scoring-script.md

Lines changed: 36 additions & 7 deletions
@@ -33,7 +33,7 @@ The scoring script must contain two methods:
 
 #### The `init` method
 
-Use the `init()` method for any costly or common preparation. For example, use it to load the model into a global object. This function is called once at the beginning of the process. Your model's files are available in an environment variable called `AZUREML_MODEL_DIR`. Use this variable to locate the files associated with the model.
+Use the `init()` method for any costly or common preparation. For example, use it to load the model into a global object. This function is called once at the beginning of the process. Your model's files are available in an environment variable called `AZUREML_MODEL_DIR`. Use this variable to locate the files associated with the model. Notice that some models may be contained in a folder (in the following example, the model has several files in a folder named `model`). See [how to find out which folder your model uses](#using-models-that-are-folders).
 
 ```python
 def init():
@@ -54,7 +54,7 @@ Notice that in this example we are placing the model in a global variable `model
 
 Use the `run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]` method to perform the scoring of each mini-batch generated by the batch deployment. This method is called once for each `mini_batch` generated from your input data. Batch deployments read data in batches according to how the deployment is configured.
 
 ```python
-def run(mini_batch):
+def run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]:
     results = []
 
     for file in mini_batch:
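A fleshed-out sketch of the signature above, assuming each element of `mini_batch` is a path to a text file; the per-file "prediction" here (a line count) is a stand-in for a real call to the globally loaded model:

```python
import os
from typing import Any, List

def run(mini_batch: List[str]) -> List[Any]:
    """Called once per mini-batch; returns one result row per file."""
    results = []
    for file_path in mini_batch:
        # Stand-in scoring: count the lines where a real script would
        # load the file's content and call model.predict on it.
        with open(file_path) as f:
            n_lines = sum(1 for _ in f)
        results.append(f"{os.path.basename(file_path)}\t{n_lines}")
    return results
```

Returning a `pandas.DataFrame` instead of a list is also allowed by the documented signature and lets the deployment assemble a tabular output file.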
@@ -122,6 +122,12 @@ When writing scoring scripts that work with big amounts of data, you need to tak
 
 Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files will generate 10 batches of 10 files each. Notice that this happens regardless of the size of the files involved. If your files are too big to be processed in large mini-batches, we suggest either splitting the files into smaller ones to achieve a higher level of parallelism or decreasing the number of files per mini-batch. At this moment, batch deployment can't account for skews in the file size distribution.
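The file-level distribution described above can be sketched as a simple partition by count; this illustrates the behavior, it is not the actual scheduler:

```python
from typing import List

def make_mini_batches(files: List[str], mini_batch_size: int) -> List[List[str]]:
    """Group files into mini-batches purely by count, regardless of size."""
    return [files[i:i + mini_batch_size]
            for i in range(0, len(files), mini_batch_size)]
```

With 100 files and a mini-batch size of 10 this yields 10 batches of 10 files each, matching the example above.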

+### Relationship between the degree of parallelism and the scoring script
+
+Your deployment configuration controls both the size of each mini-batch and the number of workers on each node. Take them into account when deciding whether to read the entire mini-batch to perform inference, to run inference file by file, or to run it row by row (for tabular data). See [Running inference at the mini-batch, file or the row level](#running-inference-at-the-mini-batch-file-or-the-row-level) for the different approaches.
+
+When running multiple workers on the same instance, take into account that memory is shared across all the workers. Usually, increasing the number of workers per node should be accompanied by a decrease in the mini-batch size or by a change in the scoring strategy (if data size and compute SKU remain the same).
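As a back-of-envelope illustration of this trade-off (all numbers and the 2x in-memory overhead factor are assumptions, not Azure ML defaults):

```python
def fits_in_memory(instance_gb: float, workers_per_node: int,
                   mini_batch_size: int, avg_file_gb: float,
                   overhead: float = 2.0) -> bool:
    """Rough check: can one worker hold a whole mini-batch in memory?

    Node memory is shared across workers; `overhead` models the in-memory
    representation being larger than the file on disk.
    """
    per_worker_gb = instance_gb / workers_per_node
    needed_gb = mini_batch_size * avg_file_gb * overhead
    return needed_gb <= per_worker_gb
```

For example, going from 2 to 8 workers on a 16 GB instance shrinks each worker's budget from 8 GB to 2 GB, so a mini-batch that previously fit may need to shrink or the scoring strategy may need to change.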
 ### Running inference at the mini-batch, file or the row level
 
 Batch endpoints call the `run()` function in your scoring script once per mini-batch. However, you decide whether to run inference over the entire batch, over one file at a time, or over one row at a time (if your data happens to be tabular).
@@ -133,7 +139,7 @@ You will typically want to run inference over the batch all at once when you wan
 > [!WARNING]
 > Running inference at the batch level may require having high control over the input data size to correctly account for the memory requirements and avoid out-of-memory exceptions. Whether you can load the entire mini-batch in memory depends on the size of the mini-batch, the size of the instances in the cluster, and the number of workers on each node.
 
-For an example about how to achieve it see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments).
+For an example of how to achieve it, see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments). This example processes an entire batch of files at a time.
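A sketch of the batch-level pattern: read every file first, then score everything in a single call. `score_all` is a hypothetical vectorized scorer standing in for a real model, not an Azure ML API:

```python
from typing import List

def score_all(rows: List[str]) -> List[int]:
    # Stand-in vectorized scorer: "predicts" the length of each row.
    return [len(r) for r in rows]

def run(mini_batch: List[str]) -> List[str]:
    """Load the entire mini-batch into memory, then score it at once."""
    rows: List[str] = []
    for file_path in mini_batch:
        with open(file_path) as f:
            rows.extend(line.rstrip("\n") for line in f)
    predictions = score_all(rows)  # one call over the whole mini-batch
    return [f"{row}\t{pred}" for row, pred in zip(rows, predictions)]
```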

 
 #### File level

@@ -142,17 +148,40 @@ One of the easiest ways to perform inference is by iterating over all the files
 > [!TIP]
 > If file sizes are too big to be read all at once, consider breaking the files into multiple smaller files to achieve better parallelization.
 
-For an example about how to achieve it see [Image processing with batch deployments](how-to-image-processing-batch.md).
+For an example of how to achieve it, see [Image processing with batch deployments](how-to-image-processing-batch.md). This example processes one file at a time.
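The file-by-file pattern can be sketched as follows; `score_file` is a hypothetical per-file scorer standing in for whatever the model does with one file's content:

```python
import os
from typing import List

def score_file(path: str) -> str:
    # Stand-in per-file scorer: reports the file's size in bytes.
    return f"{os.path.basename(path)}\t{os.path.getsize(path)}"

def run(mini_batch: List[str]) -> List[str]:
    """Iterate over the files, scoring one at a time; only the current
    file needs to fit in memory."""
    return [score_file(path) for path in mini_batch]
```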

 #### Row level (tabular)

 For models that present challenges in the size of their inputs, you may want to consider running inference at the row level. Your batch deployment still provides your scoring script with a mini-batch of files; however, you read one file, one row at a time. This may look inefficient, but for some deep learning models it may be the only way to perform inference without scaling up your hardware requirements.
 
-For an example about how to achieve it see [Text processing with batch deployments](how-to-nlp-processing-batch.md).
+For an example of how to achieve it, see [Text processing with batch deployments](how-to-nlp-processing-batch.md). This example processes one row at a time.
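The row-at-a-time pattern for tabular data might be sketched like this; `score_row` is a hypothetical scorer for one record, so only a single row is in flight at any moment:

```python
import csv
from typing import List

def score_row(row: List[str]) -> str:
    # Stand-in per-row scorer: sums the numeric fields of the record.
    return str(sum(float(v) for v in row))

def run(mini_batch: List[str]) -> List[str]:
    """Read each CSV file one row at a time and score row by row."""
    results = []
    for file_path in mini_batch:
        with open(file_path, newline="") as f:
            for row in csv.reader(f):
                results.append(score_row(row))
    return results
```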

-### Relationship between the degree of parallelism and the scoring script
+### Using models that are folders
+
+When authoring scoring scripts, the environment variable `AZUREML_MODEL_DIR` is typically used in the `init()` function to load the model. However, some models may contain their files inside a folder. When reading the files from this variable, you may need to account for that. You can identify the folder where your MLflow model is placed as follows:
+
+1. Go to the [Azure Machine Learning portal](https://ml.azure.com).
+
+1. Go to the section __Models__.
+
+1. Select the model you are trying to deploy and click on the tab __Artifacts__.
+
+1. Take note of the folder that is displayed. This folder was indicated when the model was registered.
+
+:::image type="content" source="media/how-to-deploy-mlflow-models-online-endpoints/mlflow-model-folder-name.png" lightbox="media/how-to-deploy-mlflow-models-online-endpoints/mlflow-model-folder-name.png" alt-text="Screenshot showing the folder where the model artifacts are placed.":::

-Your deployment configuration controls the size of each mini-batch and the number of workers on each node. Take into account them when deciding if you want to read the entire mini-batch to perform inference. When running multiple workers on the same instance, take into account that memory will be shared across all the workers. Usually, increasing the number of workers per node should be accompanied by a decrease in the mini-batch size or by a change in the scoring strategy (if data size remains the same).
+Then you can use this path to load the model:
+
+```python
+def init():
+    global model
+
+    # AZUREML_MODEL_DIR is an environment variable created during deployment
+    # The path "model" is the name of the registered model's folder
+    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")
+
+    model = load_model(model_path)
+```
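If you prefer not to hard-code the folder name shown in the snippet above, a defensive variant can probe for it; the fallback behavior here is an assumption for illustration, not documented Azure ML behavior:

```python
import os

def resolve_model_path(model_dir: str, folder: str = "model") -> str:
    """Return the model folder under AZUREML_MODEL_DIR if it exists,
    otherwise fall back to the directory itself."""
    candidate = os.path.join(model_dir, folder)
    return candidate if os.path.isdir(candidate) else model_dir
```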

 ## Next steps
