Commit ffc3ca7

Merge pull request #232737 from santiagxf/santiagxf/azureml-batch-sample-fix
Update how-to-deploy-model-custom-output.md
2 parents 96893f4 + cca149e

File tree

1 file changed (+19 −74 lines)

articles/machine-learning/how-to-deploy-model-custom-output.md

Lines changed: 19 additions & 74 deletions
@@ -29,20 +29,20 @@ In any of those cases, Batch Deployments allow you to take control of the output
 
 ## About this sample
 
-This example shows how you can deploy a model to perform batch inference and customize how your predictions are written in the output. This example uses an MLflow model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. It is integer valued from 0 (no presence) to 1 (presence).
+This example shows how you can deploy a model to perform batch inference and customize how your predictions are written in the output. This example uses a model based on the [UCI Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. It is integer valued from 0 (no presence) to 1 (presence).
 
 The model has been trained using an `XGBBoost` classifier and all the required preprocessing has been packaged as a `scikit-learn` pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.
 
-The information in this article is based on code samples contained in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo and then change directories to the `cli/endpoints/batch` if you are using the Azure CLI or `sdk/endpoints/batch` if you are using our SDK for Python.
+The information in this article is based on code samples contained in the [azureml-examples](https://github.com/azure/azureml-examples) repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo and then change directories to the `cli/endpoints/batch/deploy-models/custom-outputs-parquet` if you are using the Azure CLI or `sdk/python/endpoints/batch/deploy-models/custom-outputs-parquet` if you are using our SDK for Python.
 
 ```azurecli
 git clone https://github.com/Azure/azureml-examples --depth 1
-cd azureml-examples/cli/endpoints/batch
+cd azureml-examples/cli/endpoints/batch/deploy-models/custom-outputs-parquet
 ```
 
 ### Follow along in Jupyter Notebooks
 
-You can follow along this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [custom-output-batch.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/custom-output-batch.ipynb).
+You can follow along this sample in a Jupyter Notebook. In the cloned repository, open the notebook: [custom-output-batch.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/endpoints/batch/deploy-models/custom-outputs-parquet/custom-output-batch.ipynb).
 
 ## Prerequisites
 
@@ -63,23 +63,20 @@ Batch Endpoint can only deploy registered models. In this case, we already have
 # [Azure CLI](#tab/cli)
 
 ```azurecli
-MODEL_NAME='heart-classifier'
-az ml model create --name $MODEL_NAME --type "mlflow_model" --path "heart-classifier-mlflow/model"
+MODEL_NAME='heart-classifier-sklpipe'
+az ml model create --name $MODEL_NAME --type "custom_model" --path "model"
 ```
 
 # [Python](#tab/sdk)
 
 ```python
 model_name = 'heart-classifier'
 model = ml_client.models.create_or_update(
-    Model(name=model_name, path='heart-classifier-mlflow/model', type=AssetTypes.MLFLOW_MODEL)
+    Model(name=model_name, path='model', type=AssetTypes.CUSTOM_MODEL)
 )
 ```
 ---
 
-> [!NOTE]
-> The model used in this tutorial is an MLflow model. However, the steps apply for both MLflow models and custom models.
-
 ### Creating a scoring script
 
 We need to create a scoring script that can read the input data provided by the batch deployment and return the scores of the model. We are also going to write directly to the output folder of the job. In summary, the proposed scoring script does as follows:
@@ -89,38 +86,9 @@ We need to create a scoring script that can read the input data provided by the
 3. Appends the predictions to a `pandas.DataFrame` along with the input data.
 4. Writes the data in a file named as the input file, but in `parquet` format.
 
-__batch_driver_parquet.py__
+__code/batch_driver.py__
 
-```python
-import os
-import mlflow
-import pandas as pd
-from pathlib import Path
-
-def init():
-    global model
-    global output_path
-
-    # AZUREML_MODEL_DIR is an environment variable created during deployment
-    # It is the path to the model folder
-    # Please provide your model's folder name if there's one:
-    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")
-    output_path = os.environ['AZUREML_BI_OUTPUT_PATH']
-    model = mlflow.pyfunc.load_model(model_path)
-
-def run(mini_batch):
-    for file_path in mini_batch:
-        data = pd.read_csv(file_path)
-        pred = model.predict(data)
-
-        data['prediction'] = pred
-
-        output_file_name = Path(file_path).stem
-        output_file_path = os.path.join(output_path, output_file_name + '.parquet')
-        data.to_parquet(output_file_path)
-
-    return mini_batch
-```
+:::code language="python" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/custom-outputs-parquet/code/batch_driver.py" :::
 
 __Remarks:__
 * Notice how the environment variable `AZUREML_BI_OUTPUT_PATH` is used to get access to the output path of the deployment job.
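Since the commit replaces the inline script with a `:::code` include, the new `code/batch_driver.py` is no longer visible in the diff. As a rough sketch of what such a driver looks like for a custom (non-MLflow) scikit-learn pipeline model — the `.pkl` file name, the joblib serialization format, and the helper function are assumptions of this sketch, not the repository's actual code (and `DataFrame.to_parquet` needs `pyarrow` or `fastparquet` installed):

```python
import glob
import os
from pathlib import Path


def init():
    # AZUREML_MODEL_DIR and AZUREML_BI_OUTPUT_PATH are set by Azure ML at job
    # start. The "*.pkl"/joblib assumption is illustrative only; check the
    # repository's code/batch_driver.py for the real loading logic.
    global model, output_path
    import joblib  # deferred import: only present in the deployment environment

    model_file = glob.glob(
        os.path.join(os.environ["AZUREML_MODEL_DIR"], "**", "*.pkl"), recursive=True
    )[0]
    model = joblib.load(model_file)
    output_path = os.environ["AZUREML_BI_OUTPUT_PATH"]


def parquet_destination(input_file: str, output_dir: str) -> str:
    # Map an input file path to an output path with the same base name
    # but a .parquet extension, inside the job's output folder.
    return os.path.join(output_dir, Path(input_file).stem + ".parquet")


def run(mini_batch):
    import pandas as pd  # present in the deployment environment

    for file_path in mini_batch:
        data = pd.read_csv(file_path)
        data["prediction"] = model.predict(data)
        # Write one parquet file per input file, directly to the output folder
        data.to_parquet(parquet_destination(file_path, output_path))
    return mini_batch
```

The key point, unchanged from the removed inline version, is that `run` writes its own files to `AZUREML_BI_OUTPUT_PATH` instead of returning predictions for Azure ML to serialize.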
@@ -140,18 +108,21 @@ Follow the next steps to create a deployment using the previous scoring script:
 
 No extra step is required for the Azure Machine Learning CLI. The environment definition will be included in the deployment file.
 
+:::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/custom-outputs-parquet/deployment.yml" range="8-11":::
+
 # [Python](#tab/sdk)
 
 Let's get a reference to the environment:
 
 ```python
 environment = Environment(
-    conda_file="./heart-classifier-mlflow/environment/conda.yaml",
-    image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest",
+    name="batch-mlflow-xgboost",
+    conda_file="environment/conda.yaml",
+    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
 )
 ```
 
-2. MLflow models don't require you to indicate an environment or a scoring script when creating the deployments as it is created for you. However, in this case we are going to indicate a scoring script and environment since we want to customize how inference is executed.
+2. Create the deployment
 
 > [!NOTE]
 > This example assumes you have an endpoint created with the name `heart-classifier-batch` and a compute cluster with name `cpu-cluster`. If you don't, please follow the steps in the doc [Use batch endpoints for batch scoring](how-to-use-batch-endpoint.md).
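The `environment/conda.yaml` referenced above ships with the samples repository and is not shown in the diff. A minimal sketch of what such a file typically contains — package names and pins here are illustrative assumptions, not the repository's actual contents, and Azure ML batch deployments additionally require the batch runtime packages (e.g. `azureml-core`) in the environment:

```yaml
name: batch-custom-output-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - scikit-learn
      - xgboost
      - pandas
      - pyarrow  # so pandas can write the parquet output files
```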
@@ -160,37 +131,11 @@ Follow the next steps to create a deployment using the previous scoring script:
 
 To create a new deployment under the created endpoint, create a `YAML` configuration like the following:
 
-```yaml
-$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
-endpoint_name: heart-classifier-batch
-name: classifier-xgboost-parquet
-description: A heart condition classifier based on XGBoost
-model: azureml:heart-classifier@latest
-environment:
-  image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest
-  conda_file: ./heart-classifier-mlflow/environment/conda.yaml
-code_configuration:
-  code: ./heart-classifier-custom/code/
-  scoring_script: batch_driver_parquet.py
-compute: azureml:cpu-cluster
-resources:
-  instance_count: 2
-max_concurrency_per_instance: 2
-mini_batch_size: 2
-output_action: summary_only
-retry_settings:
-  max_retries: 3
-  timeout: 300
-error_threshold: -1
-logging_level: info
-```
+:::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/custom-outputs-parquet/deployment.yml":::
 
 Then, create the deployment with the following command:
 
-```azurecli
-DEPLOYMENT_NAME="classifier-xgboost-parquet"
-az ml batch-deployment create -f endpoint.yml
-```
+:::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/custom-outputs-parquet/deploy-and-run.sh" ID="create_batch_deployment_set_default" :::
 
 # [Python](#tab/sdk)
 
@@ -204,8 +149,8 @@ Follow the next steps to create a deployment using the previous scoring script:
     model=model,
     environment=environment,
     code_configuration=CodeConfiguration(
-        code="./heart-classifier-mlflow/code/",
-        scoring_script="batch_driver_parquet.py",
+        code="code/",
+        scoring_script="batch_driver.py",
     ),
     compute=compute_name,
     instance_count=2,
