Commit c9e597e

Merge pull request #284685 from TimShererWithAquent/us295242d

Freshness - Azure Machine Learning V1

2 parents a93d1ff + 6bfffdb commit c9e597e

1 file changed: +133 additions, -125 deletions
---
title: "Use low priority VMs in batch deployments"
titleSuffix: Azure Machine Learning
description: Learn how to use low priority virtual machines in Azure Machine Learning to save costs when you run batch inference jobs.
services: machine-learning
ms.service: azure-machine-learning
ms.subservice: inferencing
ms.topic: how-to
author: msakande
ms.author: mopeakande
ms.date: 08/15/2024
ms.reviewer: cacrest
ms.custom: devplatv2
#customer intent: As an analyst, I want to run batch inference workloads in the most cost efficient way possible.
---

# Use low priority VMs for batch deployments

[!INCLUDE [cli v2](includes/machine-learning-dev-v2.md)]

Azure batch deployments support low priority virtual machines (VMs) to reduce the cost of batch inference workloads. Low priority VMs enable a large amount of compute power to be used for a low cost. Low priority VMs take advantage of surplus capacity in Azure. When you specify low priority VMs in your pools, Azure can use this surplus, when available.

> [!TIP]
> The tradeoff for using low priority VMs is that those virtual machines might not be available or they might be preempted at any time, depending on available capacity. For this reason, this approach is most suitable for batch and asynchronous processing workloads, where job completion time is flexible and the work is distributed across many virtual machines.

Low priority VMs are offered at a reduced price compared with dedicated virtual machines. For pricing details, see [Azure Machine Learning pricing](https://azure.microsoft.com/pricing/details/machine-learning/).
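To make the tradeoff concrete, here's a minimal sketch that estimates the savings. Both hourly rates are hypothetical placeholders, not real prices; take actual figures from the pricing page linked above.

```python
# Estimate savings from the low priority tier. Both rates below are
# hypothetical placeholders; real prices vary by region, VM size, and
# over time, so always check the Azure pricing page.
DEDICATED_RATE = 0.30     # assumed $/hour for a dedicated VM
LOW_PRIORITY_RATE = 0.06  # assumed $/hour for the same VM at low priority

def estimated_savings(vm_count: int, hours: float) -> float:
    """Difference in cost between dedicated and low priority VMs."""
    dedicated_cost = vm_count * hours * DEDICATED_RATE
    low_priority_cost = vm_count * hours * LOW_PRIORITY_RATE
    return round(dedicated_cost - low_priority_cost, 2)

# 2 VMs running for 100 hours: 2 * 100 * (0.30 - 0.06) = 48.0
print(estimated_savings(2, 100))
```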

## How batch deployment works with low priority VMs

Azure Machine Learning batch deployments provide several capabilities that make it easy to consume and benefit from low priority VMs:

- Batch deployment jobs consume low priority VMs by running on Azure Machine Learning compute clusters created with low priority VMs. After a deployment is associated with a low priority VM cluster, all the jobs produced by that deployment use low priority VMs. Per-job configuration isn't possible.
- Batch deployment jobs automatically seek the target number of VMs in the available compute cluster based on the number of tasks to submit. If VMs are preempted or unavailable, batch deployment jobs attempt to replace the lost capacity by queuing the failed tasks to the cluster.
- Low priority VMs have a separate vCPU quota that differs from the one for dedicated VMs. Low-priority cores per region have a default limit of 100 to 3,000, depending on your subscription. The number of low-priority cores per subscription can be increased and is a single value across VM families. For more information, see [Azure Machine Learning compute quotas](how-to-manage-quotas.md#azure-machine-learning-compute).
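The separate low-priority quota caps how many nodes a cluster can actually reach. The following sketch is illustrative only: the quota value is an assumed default, and the 4 vCPUs correspond to a `STANDARD_DS3_v2`; check your subscription's real limits in the quota page.

```python
# Illustrative only: cap a requested node count by the low-priority vCPU
# quota. The quota value is an assumed default; real limits depend on
# your subscription offer type.
LOW_PRIORITY_CORE_QUOTA = 100  # assumed default low-priority cores per region
CORES_PER_NODE = 4             # STANDARD_DS3_v2 has 4 vCPUs

def reachable_nodes(requested_max_instances: int) -> int:
    """Number of low priority nodes the cluster can actually scale to."""
    quota_limit = LOW_PRIORITY_CORE_QUOTA // CORES_PER_NODE
    return min(requested_max_instances, quota_limit)

print(reachable_nodes(30))  # requesting 30 nodes, but quota allows only 25
```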

### Considerations and use cases

Many batch workloads are a good fit for low priority VMs. Using low priority VMs can introduce execution delays when deallocation of VMs occurs. If you have flexibility in the time jobs have to finish, you might tolerate the potential drops in capacity.

When you deploy models under batch endpoints, rescheduling can be done at the minibatch level. That approach has the benefit that deallocation only impacts those minibatches that are currently being processed and not finished on the affected node. All completed progress is kept.
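The following sketch models that behavior. It's an illustration of the idea, not the service's actual scheduler: when a node is preempted, only its in-flight minibatches return to the queue, while completed minibatches are kept.

```python
from collections import deque

def requeue_after_preemption(in_flight: dict, preempted_node: str, queue: deque) -> None:
    """Illustrative scheduler step: requeue only the minibatches that were
    running on the preempted node; completed work is never redone."""
    for minibatch, node in list(in_flight.items()):
        if node == preempted_node:
            del in_flight[minibatch]
            queue.append(minibatch)  # goes back for the remaining nodes

queue = deque()
completed = {"mb-0", "mb-1"}                      # progress that is kept
in_flight = {"mb-2": "node-a", "mb-3": "node-b"}  # currently processing
requeue_after_preemption(in_flight, "node-a", queue)
print(sorted(queue), sorted(completed))  # ['mb-2'] ['mb-0', 'mb-1']
```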

### Limitations

- After a deployment is associated with a low priority VM cluster, all the jobs produced by that deployment use low priority VMs. Per-job configuration isn't possible.
- Rescheduling is done at the minibatch level, regardless of the progress. No checkpointing capability is provided.

> [!WARNING]
> In cases where the entire cluster is preempted, or the job runs on a single-node cluster that is preempted, the job is canceled because there's no capacity available for it to run. In this case, you need to resubmit the job.

## Create batch deployments that use low priority VMs

Batch deployment jobs consume low priority VMs by running on Azure Machine Learning compute clusters created with low priority VMs.

> [!NOTE]
> After a deployment is associated with a low priority VM cluster, all the jobs produced by that deployment use low priority VMs. Per-job configuration isn't possible.

You can create a low priority Azure Machine Learning compute cluster as follows:
# [Azure CLI](#tab/cli)

Create a compute definition YAML file like the following one, *low-pri-cluster.yml*:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/amlCompute.schema.json
name: low-pri-cluster
type: amlcompute
size: STANDARD_DS3_v2
min_instances: 0
max_instances: 2
idle_time_before_scale_down: 120
tier: low_priority
```

Create the compute by using the following command:

```azurecli
az ml compute create -f low-pri-cluster.yml
```

# [Python](#tab/sdk)

To create a new compute cluster with low priority VMs to use for the deployment, use the following script:

```python
from azure.ai.ml.entities import AmlCompute

compute_name = "low-pri-cluster"
compute_cluster = AmlCompute(
    name=compute_name,
    description="Low priority compute cluster",
    min_instances=0,
    max_instances=2,
    tier="LowPriority",
)

ml_client.begin_create_or_update(compute_cluster)
```

---

After you create the new compute, you can create or update your deployment to use the new cluster:

# [Azure CLI](#tab/cli)

To create or update a deployment under the new compute cluster, create a YAML configuration file, *endpoint.yml*:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost
description: A heart condition classifier based on XGBoost
type: model
model: azureml:heart-classifier@latest
compute: azureml:low-pri-cluster
resources:
  instance_count: 2
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 2
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 300
```

Then, create the deployment with the following command:

```azurecli
az ml batch-endpoint create -f endpoint.yml
```

# [Python](#tab/sdk)

To create or update a deployment under the new compute cluster, use the following script:

```python
from azure.ai.ml.constants import BatchDeploymentOutputAction
from azure.ai.ml.entities import (
    BatchRetrySettings,
    ModelBatchDeployment,
    ModelBatchDeploymentSettings,
)

deployment = ModelBatchDeployment(
    name="classifier-xgboost",
    description="A heart condition classifier based on XGBoost",
    endpoint_name=endpoint.name,
    model=model,
    compute=compute_name,
    settings=ModelBatchDeploymentSettings(
        instance_count=2,
        max_concurrency_per_instance=2,
        mini_batch_size=2,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    ),
)

ml_client.batch_deployments.begin_create_or_update(deployment)
```

---
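Both tabs configure `retry_settings` with `max_retries: 3`, which matters on low priority VMs because preemption can fail a minibatch mid-run. As an illustrative sketch of those semantics (not the service's implementation), a failing minibatch is attempted up to that many times before being marked failed:

```python
def run_with_retries(process, minibatch, max_retries=3):
    """Illustrative retry loop for a single minibatch."""
    for attempt in range(1, max_retries + 1):
        try:
            return process(minibatch)
        except RuntimeError:
            if attempt == max_retries:
                return None  # give up after max_retries attempts

attempts = []
def flaky(mb):
    """Fails twice (for example, due to preemption), then succeeds."""
    attempts.append(mb)
    if len(attempts) < 3:
        raise RuntimeError("node preempted")
    return f"{mb}: scored"

print(run_with_retries(flaky, "mb-7"))  # succeeds on the third attempt
```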

## View and monitor node deallocation

New metrics are available in the [Azure portal](https://portal.azure.com) to monitor low priority VMs:

- Preempted nodes
- Preempted cores

To view these metrics in the Azure portal:

1. Navigate to your Azure Machine Learning workspace in the [Azure portal](https://portal.azure.com).
1. Select **Metrics** from the **Monitoring** section.
1. Select the metrics you desire from the **Metric** list.

:::image type="content" source="./media/how-to-use-low-priority-batch/metrics.png" lightbox="./media/how-to-use-low-priority-batch/metrics.png" alt-text="Screenshot of the metrics section in the resource monitoring pane that shows the relevant metrics for low priority VMs.":::

## Related content

- [Create an Azure Machine Learning compute cluster](how-to-create-attach-compute-cluster.md)
- [Deploy MLflow models in batch deployments](how-to-mlflow-batch.md)
- [Manage compute resources for model training](how-to-create-attach-compute-studio.md)
