articles/machine-learning/how-to-batch-scoring-script.md

The scoring script must contain two methods:

#### The `init` method

Use the `init()` method for any costly or common preparation. For example, use it to load the model into a global object. This function is called once at the beginning of the process. Your model's files are available in an environment variable called `AZUREML_MODEL_DIR`. Use this variable to locate the files associated with the model. Notice that some models may be contained in a folder (in the following example, the model has several files in a folder named `model`). See [how to identify the folder used by your model](#using-models-that-are-folders).

```python
import os
import mlflow

def init():
    global model

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    # The path "model" is the name of the registered model's folder
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # Loading with MLflow here as an illustration; use your framework's loader
    model = mlflow.pyfunc.load_model(model_path)
```

Notice that in this example we are placing the model in a global variable `model`.

#### The `run` method

Use the `run(mini_batch: List[str]) -> Union[List[Any], pandas.DataFrame]` method to perform the scoring of each mini-batch generated by the batch deployment. This method is called once for each `mini_batch` generated from your input data. Batch deployments read data in batches according to how the deployment is configured.
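
As a sketch of what this contract looks like in practice (the stubbed prediction stands in for a real `model.predict(...)` call; this snippet is illustrative, not from the original article):

```python
from typing import Any, List

def run(mini_batch: List[str]) -> List[Any]:
    # Each element of mini_batch is a path to one input file.
    results = []
    for file_path in mini_batch:
        # A real script would read the file and call model.predict(...);
        # a stub value is used here so the sketch is self-contained.
        results.append({"file": file_path, "prediction": None})
    # Return one element per processed input so the deployment
    # can track progress and write outputs.
    return results
```

Returning a list (or a `pandas.DataFrame`) with one entry per processed input lets the deployment correlate outputs with inputs.
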

When writing scoring scripts that work with big amounts of data, you need to take several factors into account.

Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files generates 10 batches of 10 files each. Notice that this happens regardless of the size of the files involved. If your files are too big to be processed in large mini-batches, we suggest either splitting the files into smaller ones to achieve a higher level of parallelism or decreasing the number of files per mini-batch. At this moment, batch deployment can't account for skews in the file size distribution.

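
The arithmetic above can be sketched in plain Python (an illustration, not deployment code): a folder of 100 files with a mini-batch size of 10 always yields 10 mini-batches, whatever the individual file sizes are:

```python
# Hypothetical input folder with 100 files
files = [f"part_{i:03d}.csv" for i in range(100)]

mini_batch_size = 10  # files per mini-batch, as configured in the deployment
mini_batches = [files[i:i + mini_batch_size]
                for i in range(0, len(files), mini_batch_size)]

print(len(mini_batches))  # 10 mini-batches of 10 files each
```
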
### Relationship between the degree of parallelism and the scoring script
Your deployment configuration controls the size of each mini-batch and the number of workers on each node. Take them into account when deciding whether to read the entire mini-batch to perform inference, to run inference file by file, or to run it row by row (for tabular data). See [Running inference at the mini-batch, file or the row level](#running-inference-at-the-mini-batch-file-or-the-row-level) for the different approaches.

When running multiple workers on the same instance, take into account that memory is shared across all the workers. Usually, increasing the number of workers per node should be accompanied by a decrease in the mini-batch size or by a change in the scoring strategy (if the data size and compute SKU remain the same).

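
To make that trade-off concrete, here is a back-of-the-envelope sketch with hypothetical numbers (not an Azure ML API): doubling the workers per node halves the memory available to each one, so the number of files each worker can hold in memory shrinks accordingly:

```python
def max_files_per_worker(node_memory_gb: float, workers_per_node: int,
                         gb_per_file: float, safety: float = 0.8) -> int:
    # Memory on a node is shared by all its workers, so each worker
    # gets an equal slice of the (safety-discounted) total.
    per_worker_gb = node_memory_gb * safety / workers_per_node
    return int(per_worker_gb // gb_per_file)

print(max_files_per_worker(16, 4, 0.5))  # 6 files fit per worker
print(max_files_per_worker(16, 8, 0.5))  # 3 files once workers double
```
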
### Running inference at the mini-batch, file or the row level
Batch endpoints call the `run()` function in your scoring script once per mini-batch. However, you can decide whether to run inference over the entire batch, over one file at a time, or over one row at a time (if your data happens to be tabular).

#### Mini-batch level

You typically want to run inference over the batch all at once when you want to achieve high throughput in your scoring process.

> [!WARNING]
> Running inference at the batch level may require close control over the input data size, so that you can correctly account for memory requirements and avoid out-of-memory exceptions. Whether you can load the entire mini-batch in memory depends on the size of the mini-batch, the size of the instances in the cluster, and the number of workers on each node.

For an example of how to do this, see [High throughput deployments](how-to-image-processing-batch.md#high-throughput-deployments). This example processes an entire batch of files at a time.

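
A sketch of the mini-batch-level strategy (the reading logic and the stand-in scoring call are illustrative, not from the original article): load every file in the mini-batch first, then score all rows in one call:

```python
from typing import Any, List

def run(mini_batch: List[str]) -> List[Any]:
    # Load the whole mini-batch into memory first...
    rows = []
    for file_path in mini_batch:
        with open(file_path) as f:
            rows.extend(line.rstrip("\n") for line in f)

    # ...then make one vectorized call over everything at once.
    # A real script would call model.predict(rows); a stub is used here.
    predictions = [len(row) for row in rows]
    return predictions
```
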
#### File level

One of the easiest ways to perform inference is by iterating over all the files in the mini-batch and running your model over each of them, one at a time.

> [!TIP]
> If file sizes are too big to be read at once, consider breaking the files down into multiple smaller files to achieve better parallelization.

For an example of how to do this, see [Image processing with batch deployments](how-to-image-processing-batch.md). This example processes one file at a time.

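
A sketch of the file-level strategy (with a stand-in for the model call, not from the original article): only one file's content is held in memory at any time:

```python
from typing import Any, List

def run(mini_batch: List[str]) -> List[Any]:
    results = []
    for file_path in mini_batch:
        # Read and score one file at a time to bound memory usage.
        with open(file_path) as f:
            content = f.read()
        # model.predict(content) would go here; stubbed as a length.
        results.append({"file": file_path, "prediction": len(content)})
    return results
```
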
#### Row level (tabular)
For models that present challenges with the size of their inputs, you may want to run inference at the row level. Your batch deployment still provides your scoring script with a mini-batch of files; however, you read each file one row at a time. This may look inefficient, but for some deep learning models it may be the only way to perform inference without scaling up your hardware requirements.

For an example of how to do this, see [Text processing with batch deployments](how-to-nlp-processing-batch.md). This example processes one row at a time.

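
A sketch of the row-level strategy (again with a stand-in for the model call): each file is streamed line by line, so only a single row is materialized at a time:

```python
from typing import Any, List

def run(mini_batch: List[str]) -> List[Any]:
    results = []
    for file_path in mini_batch:
        with open(file_path) as f:
            # Iterating the file handle streams one row at a time.
            for line in f:
                row = line.rstrip("\n")
                # model.predict([row]) would go here; stubbed per row.
                results.append(len(row))
    return results
```
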
### Using models that are folders
When authoring scoring scripts, the environment variable `AZUREML_MODEL_DIR` is typically used in the `init()` function to load the model. However, some models contain their files inside a folder, and you may need to account for that when reading the files. You can identify the folder where your MLflow model is placed as follows:
1. Go to the [Azure Machine Learning portal](https://ml.azure.com).
1. Go to the __Models__ section.
1. Select the model you want to deploy and go to the __Artifacts__ tab.
1. Take note of the folder that is displayed. This folder was specified when the model was registered.
:::image type="content" source="media/how-to-deploy-mlflow-models-online-endpoints/mlflow-model-folder-name.png" lightbox="media/how-to-deploy-mlflow-models-online-endpoints/mlflow-model-folder-name.png" alt-text="Screenshot showing the folder where the model artifacts are placed.":::
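
As a complement to the portal steps above, you can also inspect the model folder programmatically at runtime; `list_model_files` is a hypothetical helper written for this sketch, not part of the Azure ML SDK:

```python
import os
from typing import List

def list_model_files(model_dir: str = None) -> List[str]:
    # Fall back to the current directory when AZUREML_MODEL_DIR
    # isn't set (for example, when testing the script locally).
    model_dir = model_dir or os.environ.get("AZUREML_MODEL_DIR", ".")
    found = []
    for root, _, files in os.walk(model_dir):
        for name in files:
            # Record paths relative to the model directory
            found.append(os.path.relpath(os.path.join(root, name), model_dir))
    return sorted(found)
```
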
Then you can use this path to load the model:
```python
import os
import mlflow

def init():
    global model

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    # The path "model" is the name of the registered model's folder
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # Loading with MLflow here as an illustration; use your framework's loader
    model = mlflow.pyfunc.load_model(model_path)
```