Commit c644cf1

Merge pull request #244305 from santiagxf/santiagxf/azureml-batch-mlflow
Update how-to-mlflow-batch.md
2 parents 67bcb65 + 7f35cc6 commit c644cf1

File tree

1 file changed (+2, -9 lines)


articles/machine-learning/how-to-mlflow-batch.md

Lines changed: 2 additions & 9 deletions
@@ -271,12 +271,11 @@ Work is distributed at the file level, for both structured and unstructured data
> [!WARNING]
> Nested folder structures are not explored during inference. If you are partitioning your data using folders, make sure to flatten the structure beforehand.

- > [!WARNING]
- > Batch deployments will call the `predict` function of the MLflow model once per file. For CSV files containing multiple rows, this may impose a memory pressure in the underlying compute. When sizing your compute, take into account not only the memory consumption of the data being read but also the memory footprint of the model itself. This is specially true for models that processes text, like transformer-based models where the memory consumption is not linear with the size of the input. If you encouter several out-of-memory exceptions, consider splitting the data in smaller files with less rows or implement batching at the row level inside of the model/scoring script.
+ Batch deployments will call the `predict` function of the MLflow model once per file. For CSV files containing multiple rows, this may impose a memory pressure in the underlying compute. When sizing your compute, take into account not only the memory consumption of the data being read but also the memory footprint of the model itself. This is specially true for models that processes text, like transformer-based models where the memory consumption is not linear with the size of the input. If you encounter several out-of-memory exceptions, consider splitting the data in smaller files with less rows or implement batching at the row level inside of the model/scoring script.

### File's types support

- The following data types are supported for batch inference when deploying MLflow models without an environment and a scoring script:
+ The following data types are supported for batch inference when deploying MLflow models without an environment and a scoring script. If you like to process a different file type, or execute inference in a different way that batch endpoints do by default you can always create the deployment with a scoring script as explained in [Using MLflow models with a scoring script](#customizing-mlflow-models-deployments-with-a-scoring-script).

| File extension | Type returned as model's input | Signature requirement |
| :- | :- | :- |
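The paragraph kept by this hunk recommends batching at the row level inside the model/scoring script when large CSV files cause out-of-memory errors. Below is a minimal sketch of that idea in a custom batch scoring script, assuming the usual `init()`/`run(mini_batch)` contract; the chunk size, model lookup, and output format are illustrative assumptions and are not part of this commit.

```python
import glob
import os

import mlflow
import pandas as pd

CHUNK_SIZE = 500  # hypothetical: rows sent to predict() per call; tune to your memory budget


def init():
    global model
    # Batch deployments expose the registered model folder through AZUREML_MODEL_DIR
    model_path = glob.glob(os.path.join(os.environ["AZUREML_MODEL_DIR"], "*"))[0]
    model = mlflow.pyfunc.load_model(model_path)


def run(mini_batch):
    # mini_batch is a list of file paths; score each CSV in chunks instead of loading it whole
    results = []
    for file_path in mini_batch:
        for chunk in pd.read_csv(file_path, chunksize=CHUNK_SIZE):
            predictions = model.predict(chunk)
            results.extend(str(p) for p in predictions)
    return results
```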
@@ -286,9 +285,6 @@ The following data types are supported for batch inference when deploying MLflow
> [!WARNING]
> Be advised that any unsupported file that may be present in the input data will make the job to fail. You will see an error entry as follows: *"ERROR:azureml:Error processing input file: '/mnt/batch/tasks/.../a-given-file.avro'. File type 'avro' is not supported."*.

- > [!TIP]
- > If you like to process a different file type, or execute inference in a different way that batch endpoints do by default you can always create the deploymnet with a scoring script as explained in [Using MLflow models with a scoring script](#customizing-mlflow-models-deployments-with-a-scoring-script).
-
### Signature enforcement for MLflow models

Input's data types are enforced by batch deployment jobs while reading the data using the available MLflow model signature. This means that your data input should comply with the types indicated in the model signature. If the data can't be parsed as expected, the job will fail with an error message similar to the following one: *"ERROR:azureml:Error processing input file: '/mnt/batch/tasks/.../a-given-file.csv'. Exception: invalid literal for int() with base 10: 'value'"*.
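Because the job enforces the types declared in the model signature while reading files, it can help to check locally that your input parses as the signature expects before submitting it. A minimal sketch, assuming a locally downloaded MLflow model folder and hypothetical file and column names (not taken from this commit):

```python
import mlflow.models
import pandas as pd

# Inspect the signature packaged with the MLflow model ("model" is an assumed local path)
model_meta = mlflow.models.Model.load("model")
print(model_meta.signature.inputs)

# Make sure the CSV columns parse as the types the signature declares;
# the file name and the "age" column are hypothetical placeholders
df = pd.read_csv("data/a-given-file.csv")
df["age"] = pd.to_numeric(df["age"], errors="raise").astype("int64")
```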
@@ -317,9 +313,6 @@ You will typically select this workflow when:
> [!IMPORTANT]
> If you choose to indicate an scoring script for an MLflow model deployment, you will also have to specify the environment where the deployment will run.

- > [!WARNING]
- > Customizing the scoring script for MLflow deployments is only available from the Azure CLI or SDK for Python. If you are creating a deployment using [Azure Machine Learning studio UI](https://ml.azure.com), please switch to the CLI or the SDK.
-

### Steps
