Commit 7d1ebb8

Merge pull request #277535 from Padmalathas/autoscale-cleanup
Cleaning retired service-defined autoscale variables table
2 parents 0ec3bf6 + 468e0bf commit 7d1ebb8

9 files changed

+27
-82
lines changed

articles/batch/batch-automatic-scaling.md

Lines changed: 2 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22
title: Autoscale compute nodes in an Azure Batch pool
33
description: Enable automatic scaling on an Azure Batch cloud pool to dynamically adjust the number of compute nodes in the pool.
44
ms.topic: how-to
5-
ms.date: 04/02/2024
5+
ms.date: 06/11/2024
66
ms.custom: H1Hack27Feb2017, fasttrack-edit, devx-track-csharp
77
---
88

@@ -113,16 +113,6 @@ You can get the value of these service-defined variables to make adjustments tha
113113
| Variable | Description |
114114
| --- | --- |
115115
| $CPUPercent |The average percentage of CPU usage. |
116-
| $WallClockSeconds |The number of seconds consumed. Retiring after 2024-Mar-31. |
117-
| $MemoryBytes |The average number of megabytes used. Retiring after 2024-Mar-31. |
118-
| $DiskBytes |The average number of gigabytes used on the local disks. Retiring after 2024-Mar-31. |
119-
| $DiskReadBytes |The number of bytes read. Retiring after 2024-Mar-31. |
120-
| $DiskWriteBytes |The number of bytes written. Retiring after 2024-Mar-31. |
121-
| $DiskReadOps |The count of read disk operations performed. Retiring after 2024-Mar-31. |
122-
| $DiskWriteOps |The count of write disk operations performed. Retiring after 2024-Mar-31. |
123-
| $NetworkInBytes |The number of inbound bytes. Retiring after 2024-Mar-31. |
124-
| $NetworkOutBytes |The number of outbound bytes. Retiring after 2024-Mar-31. |
125-
| $SampleNodeCount |The count of compute nodes. Retiring after 2024-Mar-31. |
126116
| $ActiveTasks |The number of tasks that are ready to execute but aren't yet executing. This includes all tasks that are in the active state and whose dependencies have been satisfied. Any tasks that are in the active state but whose dependencies haven't been satisfied are excluded from the `$ActiveTasks` count. For a multi-instance task, `$ActiveTasks` includes the number of instances set on the task.|
127117
| $RunningTasks |The number of tasks in a running state. |
128118
| $PendingTasks |The sum of `$ActiveTasks` and `$RunningTasks`. |
@@ -239,7 +229,7 @@ You can use both resource and task metrics when you define a formula. You adjust
239229

240230
| Metric | Description |
241231
|----------|--------------|
242-
| Resource | Resource metrics are based on the CPU, the bandwidth, the memory usage of compute nodes, and the number of nodes.<br><br>These service-defined variables are useful for making adjustments based on node count:<br>- $TargetDedicatedNodes <br>- $TargetLowPriorityNodes <br>- $CurrentDedicatedNodes <br>- $CurrentLowPriorityNodes <br>- $PreemptedNodeCount <br>- $UsableNodeCount <br><br>These service-defined variables are useful for making adjustments based on node resource usage: <br>- $CPUPercent <br>- $WallClockSeconds <br>- $MemoryBytes <br>- $DiskBytes <br>- $DiskReadBytes <br>- $DiskWriteBytes <br>- $DiskReadOps <br>- $DiskWriteOps <br>- $NetworkInBytes <br>- $NetworkOutBytes |
232+
| Resource | Resource metrics are based on the CPU, the bandwidth, the memory usage of compute nodes, and the number of nodes.<br><br>These service-defined variables are useful for making adjustments based on node count:<br>- $TargetDedicatedNodes <br>- $TargetLowPriorityNodes <br>- $CurrentDedicatedNodes <br>- $CurrentLowPriorityNodes <br>- $PreemptedNodeCount <br>- $UsableNodeCount <br><br>These service-defined variables are useful for making adjustments based on node resource usage: <br>- $CPUPercent |
243233
| Task | Task metrics are based on the status of tasks, such as Active, Pending, and Completed. The following service-defined variables are useful for making pool-size adjustments based on task metrics: <br>- $ActiveTasks <br>- $RunningTasks <br>- $PendingTasks <br>- $SucceededTasks <br>- $FailedTasks |
244234

245235
## Obtain sample data

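Because this change removes the retired service-defined variables from the table, existing autoscale formulas that still reference them may need review. The following is a minimal sketch, not part of the commit, of a hypothetical helper (`find_retired_variables`) that scans a formula string for the variable names deleted in the diff above. The regular expression is an assumption about how variable tokens appear in a formula; it is not taken from the Batch formula grammar.

```python
import re

# Names copied from the rows removed in this diff (retired after 2024-Mar-31).
RETIRED_VARIABLES = frozenset({
    "$WallClockSeconds", "$MemoryBytes", "$DiskBytes", "$DiskReadBytes",
    "$DiskWriteBytes", "$DiskReadOps", "$DiskWriteOps",
    "$NetworkInBytes", "$NetworkOutBytes", "$SampleNodeCount",
})

# Assumption: a variable token is "$" followed by letters, digits, or underscores.
_VARIABLE_TOKEN = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")


def find_retired_variables(formula: str) -> list[str]:
    """Return the retired variable names referenced in the formula, sorted."""
    # Matching whole tokens keeps "$DiskBytes" from matching inside "$DiskReadBytes".
    found = set(_VARIABLE_TOKEN.findall(formula)) & RETIRED_VARIABLES
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical formula used only to exercise the helper.
    sample = "$TargetDedicatedNodes = $MemoryBytes.GetSample(1) > 0 ? 2 : 1;"
    print(find_retired_variables(sample))
```

A formula that uses only `$CPUPercent` or the task variables yields an empty list, so the helper could run as a pre-deployment check over stored formulas.
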
articles/batch/batch-docker-container-workloads.md

Lines changed: 1 addition & 7 deletions
@@ -2,16 +2,13 @@
22
title: Container workloads on Azure Batch
33
description: Learn how to run and scale apps from container images on Azure Batch. Create a pool of compute nodes that support running container tasks.
44
ms.topic: how-to
5-
ms.date: 06/04/2024
5+
ms.date: 06/10/2024
66
ms.devlang: csharp
77
ms.custom: devx-track-csharp, linux-related-content
88
---
99

1010
# Use Azure Batch to run container workloads
1111

12-
> [!CAUTION]
13-
> This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and planning accordingly. For more information, see the [CentOS End Of Life guidance](~/articles/virtual-machines/workloads/centos/centos-end-of-life.md).
14-
1512
Azure Batch lets you run and scale large numbers of batch computing jobs on Azure. Batch tasks can run directly on virtual machines (nodes) in a Batch pool, but you can also set up a Batch pool to run tasks in Docker-compatible containers on the nodes. This article shows you how to create a pool of compute nodes that support running container tasks, and then run container tasks on the pool.
1613

1714
The code examples here use the Batch .NET and Python SDKs. You can also use other Batch SDKs and tools, including the Azure portal, to create container-enabled Batch pools and to run container tasks.
@@ -86,8 +83,6 @@ without the need for a custom image.
8683
Currently there are other images published by `microsoft-azure-batch` that support container workloads:
8784

8885
- Publisher: `microsoft-azure-batch`
89-
- Offer: `centos-container`
90-
- Offer: `centos-container-rdma` (For use exclusively on VM SKUs with Infiniband)
9186
- Offer: `ubuntu-server-container`
9287
- Offer: `ubuntu-server-container-rdma` (For use exclusively on VM SKUs with Infiniband)
9388

@@ -97,7 +92,6 @@ Currently there are other images published by `microsoft-azure-batch` that suppo
9792
9893
#### Notes
9994
The docker data root of the above images lies in different places:
100-
- For the Azure Batch published `microsoft-azure-batch` images (Offer: `centos-container-rdma`, etc.), the docker data root is mapped to _/mnt/batch/docker_, which is located on the temporary disk.
10195
- For the HPC image, or `microsoft-dsvm` (Offer: `ubuntu-hpc`, etc.), the docker data root is unchanged from the Docker default, which is _/var/lib/docker_ on Linux and _C:\ProgramData\Docker_ on Windows. These folders are located on the OS disk.
10296

10397
For non-Batch published images, the OS disk has the potential risk of being filled up quickly as container images are downloaded.

articles/batch/batch-pool-compute-intensive-sizes.md

Lines changed: 11 additions & 33 deletions
@@ -3,16 +3,14 @@ title: Use compute-intensive Azure VMs with Batch
33
description: How to take advantage of HPC and GPU virtual machine sizes in Azure Batch pools. Learn about OS dependencies and see several scenario examples.
44
ms.topic: how-to
55
ms.custom: linux-related-content
6-
ms.date: 05/01/2023
6+
ms.date: 06/07/2024
77
---
88
# Use RDMA or GPU instances in Batch pools
99

10-
> [!CAUTION]
11-
> This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and planning accordingly. For more information, see the [CentOS End Of Life guidance](~/articles/virtual-machines/workloads/centos/centos-end-of-life.md).
1210

1311
To run certain Batch jobs, you can take advantage of Azure VM sizes designed for large-scale computation. For example:
1412

15-
* To run multi-instance [MPI workloads](batch-mpi.md), choose H-series or other sizes that have a network interface for Remote Direct Memory Access (RDMA). These sizes connect to an InfiniBand network for inter-node communication, which can accelerate MPI applications.
13+
* To run multi-instance [MPI workloads](batch-mpi.md), choose HB, HC, NC, or ND series or other sizes that have a network interface for Remote Direct Memory Access (RDMA). These sizes connect to an InfiniBand network for inter-node communication, which can accelerate MPI applications.
1614

1715
* For CUDA applications, choose N-series sizes that include NVIDIA Tesla graphics processing unit (GPU) cards.
1816

@@ -27,15 +25,15 @@ This article provides guidance and examples to use some of Azure's specialized s
2725
2826
## Dependencies
2927

30-
The RDMA or GPU capabilities of compute-intensive sizes in Batch are supported only in certain operating systems. (The list of supported operating systems is a subset of those supported for virtual machines created in these sizes.) Depending on how you create your Batch pool, you might need to install or configure additional driver or other software on the nodes. The following tables summarize these dependencies. See linked articles for details. For options to configure Batch pools, see later in this article.
28+
The RDMA or GPU capabilities of compute-intensive sizes in Batch are supported only in certain operating systems. The supported operating systems for these VM sizes include only a subset of those available for virtual machine creation. Depending on how you create your Batch pool, you might need to install or configure extra driver or other software on the nodes. The following tables summarize these dependencies. See linked articles for details. For options to configure Batch pools, see later in this article.
3129

3230
### Linux pools - Virtual machine configuration
3331

3432
| Size | Capability | Operating systems | Required software | Pool settings |
3533
| -------- | -------- | ----- | -------- | ----- |
36-
| [H16r, H16mr](../virtual-machines/sizes-hpc.md)<br/>[NC24r, NC24rs_v2, NC24rs_v3, ND24rs<sup>*</sup>](../virtual-machines/linux/n-series-driver-setup.md#rdma-network-connectivity) | RDMA | Ubuntu 22.04 LTS, or<br/>CentOS-based HPC<br/>(Azure Marketplace) | Intel MPI 5<br/><br/>Linux RDMA drivers | Enable inter-node communication, disable concurrent task execution |
37-
| [NC, NCv2, NCv3, NDv2 series](../virtual-machines/linux/n-series-driver-setup.md) | NVIDIA Tesla GPU (varies by series) | Ubuntu 22.04 LTS, or<br/>CentOS 8.1<br/>(Azure Marketplace) | NVIDIA CUDA or CUDA Toolkit drivers | N/A |
38-
| [NV, NVv2, NVv4 series](../virtual-machines/linux/n-series-driver-setup.md) | NVIDIA Tesla M60 GPU | Ubuntu 22.04 LTS, or<br/>CentOS 8.1<br/>(Azure Marketplace) | NVIDIA GRID drivers | N/A |
34+
| [H16r, H16mr](../virtual-machines/sizes-hpc.md)<br/>[NC24r, NC24rs_v2, NC24rs_v3, ND24rs<sup>*</sup>](../virtual-machines/linux/n-series-driver-setup.md#rdma-network-connectivity) | RDMA | Ubuntu 22.04 LTS <br/> (Azure Marketplace) | Intel MPI 5<br/><br/>Linux RDMA drivers | Enable inter-node communication, disable concurrent task execution |
35+
| [NCv3, NDv2, NDv4, NDv5 series](../virtual-machines/linux/n-series-driver-setup.md) | NVIDIA Tesla GPU (varies by series) | Ubuntu 22.04 LTS <br/> (Azure Marketplace) | NVIDIA CUDA or CUDA Toolkit drivers | N/A |
36+
| [NVv3, NVv4, NVv5 series](../virtual-machines/linux/n-series-driver-setup.md) | Accelerated Visualization GPU | Ubuntu 22.04 LTS <br/> (Azure Marketplace) | NVIDIA GRID drivers (if required) | N/A |
3937

4038
<sup>*</sup>RDMA-capable N-series sizes also include NVIDIA Tesla GPUs
4139

@@ -71,19 +69,15 @@ To configure a specialized VM size for your Batch pool, you have several options
7169

7270
* For pools in the virtual machine configuration, choose a preconfigured [Azure Marketplace](https://azuremarketplace.microsoft.com/marketplace/) VM image that has drivers and software preinstalled. Examples:
7371

74-
* [CentOS-based 8.1 HPC](https://azuremarketplace.microsoft.com/marketplace/apps/openlogic.centos-hpc?tab=Overview) - includes RDMA drivers and Intel MPI 5.1
72+
* [Data Science Virtual Machine](../machine-learning/data-science-virtual-machine/overview.md) for Linux or Windows - includes NVIDIA CUDA drivers
7573

76-
* [Data Science Virtual Machine](../machine-learning/data-science-virtual-machine/overview.md) for Linux or Windows - includes NVIDIA CUDA drivers
74+
* Linux images for Batch container workloads that also include GPU and RDMA drivers:
7775

78-
* Linux images for Batch container workloads that also include GPU and RDMA drivers:
76+
* [Ubuntu Server (with GPU and RDMA drivers) for Azure Batch container pools](https://azuremarketplace.microsoft.com/marketplace/apps/microsoft-azure-batch.ubuntu-server-container-rdma?tab=Overview)
7977

80-
* [CentOS (with GPU and RDMA drivers) for Azure Batch container pools](https://azuremarketplace.microsoft.com/marketplace/apps/microsoft-azure-batch.centos-container-rdma?tab=Overview)
78+
* Create a [custom Windows or Linux VM image](batch-sig-images.md) with installed drivers, software, or other settings required for the VM size.
8179

82-
* [Ubuntu Server (with GPU and RDMA drivers) for Azure Batch container pools](https://azuremarketplace.microsoft.com/marketplace/apps/microsoft-azure-batch.ubuntu-server-container-rdma?tab=Overview)
83-
84-
* Create a [custom Windows or Linux VM image](batch-sig-images.md) on which you have installed drivers, software, or other settings required for the VM size.
85-
86-
* Create a Batch [application package](batch-application-packages.md) from a zipped driver or application installer, and configure Batch to deploy the package to pool nodes and install once when each node is created. For example, if the application package is an installer, create a [start task](jobs-and-tasks.md#start-task) command line to silently install the app on all pool nodes. Consider using an application package and a pool start task if your workload depends on a particular driver version.
80+
* Create a Batch [application package](batch-application-packages.md) from a zipped driver or application installer. Then, configure Batch to deploy this package to pool nodes and install once when each node is created. For example, if the application package is an installer, create a [start task](jobs-and-tasks.md#start-task) command line to silently install the app on all pool nodes. Consider using an application package and a pool start task if your workload depends on a particular driver version.
8781

8882
> [!NOTE]
8983
> The start task must run with elevated (admin) permissions, and it must wait for success. Long-running tasks will increase the time to provision a Batch pool.
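
The start task requirements in the note above (elevated permissions, wait for success) can be sketched as a plain Python dictionary that mirrors the JSON shape of a Batch pool's `startTask` property. This is a hedged illustration, not part of the commit: the function name `build_start_task` and the install command are hypothetical, and the field names are assumed to follow the Batch REST API; check them against the API version you target.

```python
import json


def build_start_task(install_command: str) -> dict:
    """Build a startTask fragment that runs elevated and blocks node readiness."""
    return {
        # Command that silently installs the driver or application.
        "commandLine": install_command,
        # The node is not scheduled for tasks until the start task succeeds.
        "waitForSuccess": True,
        # Run as the pool-scoped auto-user with admin elevation.
        "userIdentity": {
            "autoUser": {"scope": "pool", "elevationLevel": "admin"}
        },
    }


if __name__ == "__main__":
    # Hypothetical installer command, for illustration only.
    fragment = build_start_task("/bin/bash -c './install-driver.sh --silent'")
    print(json.dumps(fragment, indent=2))
```

Keeping the fragment as data makes it easy to review the elevation and wait settings before they are merged into a full pool definition.
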
@@ -145,22 +139,6 @@ To run Windows MPI applications on a pool of Azure H16r VM nodes, you need to co
145139
| **Internode communication enabled** | True |
146140
| **Max tasks per node** | 1 |
147141

148-
## Example: Intel MPI on a Linux H16r VM pool
149-
150-
To run MPI applications on a pool of Linux HB-series nodes, one option is to use the [CentOS-based 8.1 HPC](https://azuremarketplace.microsoft.com/marketplace/apps/openlogic.centos-hpc?tab=Overview) image from the Azure Marketplace. Linux RDMA drivers and Intel MPI are preinstalled. This image also supports Docker container workloads.
151-
152-
Using the Batch APIs or Azure portal, create a pool using this image and with the desired number of nodes and scale. The following table shows sample pool settings:
153-
154-
| Setting | Value |
155-
| ---- | ---- |
156-
| **Image Type** | Marketplace (Linux/Windows) |
157-
| **Publisher** | OpenLogic |
158-
| **Offer** | CentOS-HPC |
159-
| **Sku** | 8.1 |
160-
| **Node size** | H16r Standard |
161-
| **Internode communication enabled** | True |
162-
| **Max tasks per node** | 1 |
163-
164142
## Next steps
165143

166144
* To run MPI jobs on an Azure Batch pool, see the [Windows](batch-mpi.md) or [Linux](/archive/blogs/windowshpc/introducing-mpi-support-for-linux-on-azure-batch) examples.

articles/batch/batch-pool-node-error-checking.md

Lines changed: 2 additions & 5 deletions
@@ -1,15 +1,12 @@
11
---
22
title: Pool and node errors
33
description: Learn about background operations, errors to check for, and how to avoid errors when you create Azure Batch pools and nodes.
4-
ms.date: 04/11/2023
4+
ms.date: 06/10/2024
55
ms.topic: how-to
66
---
77

88
# Azure Batch pool and node errors
99

10-
> [!CAUTION]
11-
> This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and planning accordingly. For more information, see the [CentOS End Of Life guidance](~/articles/virtual-machines/workloads/centos/centos-end-of-life.md).
12-
1310
Some Azure Batch pool creation and management operations happen immediately. Detecting failures for these operations is straightforward, because errors usually return immediately from the API, command line, or user interface. However, some operations are asynchronous, run in the background, and take several minutes to complete. This article describes ways to detect and avoid failures that can occur in the background operations for pools and nodes.
1411

1512
Make sure to set your applications to implement comprehensive error checking, especially for asynchronous operations. Comprehensive error checking can help you promptly identify and diagnose issues.
@@ -106,7 +103,7 @@ Other reasons for `unusable` nodes might include the following causes:
106103

107104
- A custom VM image is invalid. For example, the image isn't properly prepared.
108105
- A VM is moved because of an infrastructure failure or a low-level upgrade. Batch recovers the node.
109-
- A VM image has been deployed on hardware that doesn't support it. For example, a CentOS HPC image is deployed on a [Standard_D1_v2](/azure/virtual-machines/dv2-dsv2-series) VM.
106+
- A VM image has been deployed on hardware that doesn't support it.
110107
- The VMs are in an [Azure virtual network](batch-virtual-network.md), and traffic has been blocked to key ports.
111108
- The VMs are in a virtual network, but outbound traffic to Azure Storage is blocked.
112109
- The VMs are in a virtual network with a custom DNS configuration, and the DNS server can't resolve Azure storage.

articles/batch/batch-rendering-functionality.md

Lines changed: 1 addition & 4 deletions
@@ -1,15 +1,12 @@
11
---
22
title: Rendering capabilities
33
description: Standard Azure Batch capabilities are used to run rendering workloads and apps. Batch includes specific features to support rendering workloads.
4-
ms.date: 02/28/2024
4+
ms.date: 06/10/2024
55
ms.topic: how-to
66
---
77

88
# Azure Batch rendering capabilities
99

10-
> [!CAUTION]
11-
> This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and planning accordingly. For more information, see the [CentOS End Of Life guidance](~/articles/virtual-machines/workloads/centos/centos-end-of-life.md).
12-
1310
Standard Azure Batch capabilities are used to run rendering workloads and applications. Batch also includes specific features to support rendering workloads.
1411

1512
For an overview of Batch concepts, including pools, jobs, and tasks, see [this article](./batch-service-workflow-features.md).
