Commit a97e3e9

Merge pull request #233686 from santiagxf/santiagxf/azureml-batch-scoring
2 parents aad2921 + 3fbb788

File tree

3 files changed: +35 −6 lines changed

articles/machine-learning/how-to-batch-scoring-script.md

Lines changed: 35 additions & 6 deletions
@@ -20,14 +20,43 @@ ms.custom: how-to
Batch endpoints allow you to deploy models to perform long-running inference at scale. To indicate how batch endpoints should use your model over the input data to create predictions, you need to create and specify a scoring script (also known as a batch driver script). In this article, you learn how to use scoring scripts in different scenarios, along with best practices.

> [!TIP]
-> MLflow models don't require a scoring script as it is autogenerated for you. For more details about how batch endpoints work with MLflow models, see the dedicated tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md). If you want to change the default inference routine, write an scoring script for your MLflow models as explained at [Using MLflow models with a scoring script](how-to-mlflow-batch.md#customizing-mlflow-models-deployments-with-a-scoring-script).
+> MLflow models don't require a scoring script because one is autogenerated for you. For more details about how batch endpoints work with MLflow models, see the dedicated tutorial [Using MLflow models in batch deployments](how-to-mlflow-batch.md).
> [!WARNING]
> If you are deploying an Automated ML model under a batch endpoint, notice that the scoring script that Automated ML provides only works for online endpoints and is not designed for batch execution. Follow this guideline to learn how to create one, depending on what your model does.
## Understanding the scoring script

-The scoring script is a Python file (`.py`) that contains the logic about how to run the model and read the input data submitted by the batch deployment executor driver. Each model deployment has to provide a scoring script, however, an endpoint may host multiple deployments using different scoring script versions.
+The scoring script is a Python file (`.py`) that contains the logic for how to run the model and read the input data submitted by the batch deployment executor. Each model deployment provides the scoring script (along with any other required dependencies) at creation time. It is usually indicated as follows:
+# [Azure CLI](#tab/cli)
+
+__deployment.yml__
+
+:::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/mnist-classifier/deployment-torch/deployment.yml" range="8-10":::
+
+# [Python](#tab/python)
+
+```python
+deployment = BatchDeployment(
+    ...
+    code_path="code",
+    scoring_script="batch_driver.py",
+    ...
+)
+```
+
+# [Studio](#tab/azure-studio)
+
+When creating a new deployment, you will be prompted for a scoring script and dependencies as follows:
+
+:::image type="content" source="./media/how-to-batch-scoring-script/configure-scoring-script.png" alt-text="Screenshot of the step where you can configure the scoring script in a new deployment.":::
+
+For MLflow models, scoring scripts are automatically generated, but you can indicate one by checking the following option:
+
+:::image type="content" source="./media/how-to-batch-scoring-script/configure-scoring-script-mlflow.png" alt-text="Screenshot of the step where you can configure the scoring script in a new deployment when the model has MLflow format.":::
+
+---

The scoring script must contain two methods:

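As a hedged sketch of what such a script can look like (not the article's exact code; the dummy model, CSV inputs, and prediction logic are illustrative assumptions):

```python
import pandas as pd

# Illustrative stand-in for a real model. A real deployment would load a
# serialized model from the AZUREML_MODEL_DIR folder inside init().
class DummyModel:
    def predict(self, df: pd.DataFrame):
        # One prediction per row: here, just the row sum (an assumption).
        return df.sum(axis=1).tolist()

model = None

def init():
    """Runs once per worker before any mini-batch: load the model into memory."""
    global model
    # Real scripts typically read os.environ["AZUREML_MODEL_DIR"] here.
    model = DummyModel()

def run(mini_batch):
    """Runs once per mini-batch: `mini_batch` is a list of input file paths."""
    results = []
    for file_path in mini_batch:
        data = pd.read_csv(file_path)
        data["prediction"] = model.predict(data)
        results.append(data)
    # Returning a DataFrame lets predictions carry extra columns,
    # such as the original record alongside the prediction.
    return pd.concat(results)
```

The exact signature of `run()` and the mini-batch contents depend on the deployment configuration; treat this as a shape to adapt, not a drop-in script.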
@@ -78,9 +107,9 @@ The `run()` method should return a Pandas `DataFrame` or an array/list. Each ret
> [!IMPORTANT]
> __How to write predictions?__
>
-> Use __arrays__ when you need to output a single prediction. Use __pandas DataFrames__ when you need to return multiple pieces of information. For instance, for tabular data, you may want to append your predictions to the original record. Use a pandas DataFrame for this case. For file datasets, __we still recommend to output a pandas DataFrame__ as they provide a more robust approach to read the results.
->
-> Although pandas DataFrame may contain column names, they are not included in the output file. If needed, please see [Customize outputs in batch deployments](how-to-deploy-model-custom-output.md).
+> Whatever you return from the `run()` function is appended to the output predictions file generated by the batch job, so it is important to return the right data type. Return __arrays__ when you need to output a single prediction. Return __pandas DataFrames__ when you need to return multiple pieces of information. For instance, for tabular data you may want to append your predictions to the original record; use a pandas DataFrame for this case. Although a pandas DataFrame may contain column names, they are not included in the output file.
+>
+> If you need to write predictions in a different way, you can [customize outputs in batch deployments](how-to-deploy-model-custom-output.md).
> [!WARNING]
> Do not output complex data types (or lists of complex data types) in the `run` function. Those outputs are transformed to strings and become hard to read.
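One way to honor this warning, shown as an illustrative sketch (the helper name is an assumption, not part of the article), is to serialize complex per-row outputs to JSON strings before returning them:

```python
import json
import pandas as pd

def to_serializable(predictions):
    """Flatten complex per-row outputs (dicts, nested lists) into JSON
    strings so the batch output file stays readable."""
    return pd.DataFrame({"prediction": [json.dumps(p) for p in predictions]})
```

Each row of the resulting DataFrame is then a plain string, which writes cleanly into the predictions file.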
@@ -161,7 +190,7 @@ For an example about how to achieve it see [Text processing with batch deploymen

### Using models that are folders

-When authoring scoring scripts, the environment variable `AZUREML_MODEL_DIR` is typically used in the `init()` function to load the model. However, some models may contain its files inside of a folder. When reading the files in this variable, you may need to account for that. You can identify the folder where your MLflow model is placed as follows:
+The environment variable `AZUREML_MODEL_DIR` contains the path to where the selected model is located, and it is typically used in the `init()` function to load the model into memory. However, some models may contain their files inside a folder, and you may need to account for that when reading files from this path. You can identify the folder where your MLflow model is placed as follows:

1. Go to [Azure Machine Learning portal](https://ml.azure.com).

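Programmatically, accounting for a model stored as a folder can be sketched like this (a hedged illustration; the helper name and single-subfolder heuristic are assumptions, not the article's code):

```python
import os

def resolve_model_dir(model_dir: str) -> str:
    """Return the folder that actually holds the model files.

    AZUREML_MODEL_DIR may point at a parent directory; when it contains a
    single subfolder (as MLflow models typically do), descend into it.
    """
    subdirs = [
        os.path.join(model_dir, entry)
        for entry in os.listdir(model_dir)
        if os.path.isdir(os.path.join(model_dir, entry))
    ]
    return subdirs[0] if len(subdirs) == 1 else model_dir
```

In `init()`, you would call this on `os.environ["AZUREML_MODEL_DIR"]` before loading the model files.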