Commit 52b4eb9

Merge pull request #275673 from darkwhite29/patch-6
Create hpc-linux-vm-images.md
2 parents 6084496 + e096c3d commit 52b4eb9

2 files changed

+140
-0
lines changed

articles/virtual-machines/TOC.yml

Lines changed: 3 additions & 0 deletions
@@ -415,6 +415,9 @@
415415
- name: Configure and optimize VMs
416416
displayName: Configure and optimize Infiniband enabled VMs
417417
href: configure.md
418+
- name: Azure HPC VM images
419+
displayName: Azure HPC VM images
420+
href: azure-hpc-vm-images.md
418421
- name: HB-series
419422
href: hb-series.md
420423
items:
Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
1+
---
2+
title: Azure HPC VM images
3+
description: HPC VM images to be used on InfiniBand enabled H-series and GPU enabled N-series VMs.
4+
ms.service: virtual-machines
5+
ms.subservice: hpc
6+
ms.custom: linux-related-content
7+
ms.topic: article
8+
ms.date: 05/17/2024
9+
ms.reviewer: padmalathas
10+
ms.author: litan2
11+
author: litan2
12+
---
13+
14+
# Azure HPC VM images
15+
16+
**Applies to:** :heavy_check_mark: Linux VMs :heavy_check_mark: Flexible scale sets :heavy_check_mark: Uniform scale sets
17+
18+
This article describes the HPC VM images used to launch InfiniBand-enabled [H-series](sizes-hpc.md) and GPU-enabled [N-series](sizes-gpu.md) VMs.
19+
20+
The Azure HPC team is pleased to announce the availability of optimized and pre-configured Linux VM images for HPC and AI workloads. These VM images are:
21+
22+
- Based on the vanilla Ubuntu and AlmaLinux marketplace VM images.
23+
- Pre-configured with the NVIDIA Mellanox OFED driver for InfiniBand, NVIDIA GPU drivers, popular MPI libraries, vendor-tuned HPC libraries, and recommended performance optimizations.
24+
- Tuned with optimizations and recommended configurations to deliver optimal performance, consistency, and reliability.
25+
26+
## Availability on Azure
27+
28+
You can select the HPC images when creating a VM through either Azure Marketplace or the Azure CLI. For other deployment methods, see [Deploying HPC VM images](#deploying-hpc-vm-images).
29+
30+
### Azure Marketplace
31+
32+
Search for "Ubuntu HPC" by the publisher "Microsoft-DSVM", or "AlmaLinux HPC" by the publisher "AlmaLinux".
33+
34+
### Azure CLI
35+
36+
Run the following commands to find image URNs of the HPC images:
37+
38+
#### Ubuntu-HPC
39+
40+
```
41+
az vm image list --publisher microsoft-dsvm --offer ubuntu-hpc --output table --all
42+
```
43+
44+
All images support [Gen 2 VMs](generation-2.md).
45+
46+
#### AlmaLinux-HPC
47+
48+
```
49+
az vm image list --publisher almalinux --offer almalinux-hpc --output table --all
50+
```
51+
52+
All images support both Gen 1 and Gen 2 VMs.
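
The listing commands above return every matching URN. As a minimal sketch of narrowing that output to a single URN, the helper below filters by SKU and keeps the entry with the highest version string. The function name `latest_hpc_urn` is hypothetical, and using `sort -V` (GNU version sort) as a stand-in for "newest" is an assumption. The `--sku` filter accepts partial names, so it may match more than one SKU.

```shell
# Hypothetical helper (not part of the Azure CLI): print the URN with the
# highest version string for one publisher/offer/SKU combination.
# Assumes GNU sort, whose -V flag orders version strings numerically.
# Example call: latest_hpc_urn microsoft-dsvm ubuntu-hpc <sku>
latest_hpc_urn() {
  publisher="$1"
  offer="$2"
  sku="$3"
  az vm image list \
    --publisher "$publisher" \
    --offer "$offer" \
    --sku "$sku" \
    --all \
    --query "[].urn" \
    --output tsv | sort -V | tail -n 1
}
```

Replace `<sku>` with one of the SKU values shown in the table output of the listing commands.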
53+
54+
## Supported VM sizes
55+
56+
The HPC VM images support the following VM sizes:
57+
58+
- Standard_HB60rs
59+
- Standard_HB120rs_v2
60+
- Standard_HB120rs_v3
61+
- Standard_HB120rs_v4
62+
- Standard_HC44rs
63+
- Standard_ND40rs_v2
64+
- Standard_ND96asr_v4
65+
- Standard_ND96amsr_A100_v4
66+
- Standard_ND96isr_H100_v5
67+
68+
Refer to [Azure VM sizes](sizes.md) for the latest H- and N-series VM size support matrix.
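
A deployment script can guard against unsupported sizes before it creates anything. The sketch below copies the size list from this section into a variable and checks a candidate size against it. The variable and function names are hypothetical, and the list reflects only this article at the time of writing.

```shell
# VM sizes listed in this article as supported by the HPC images.
SUPPORTED_HPC_SIZES="Standard_HB60rs Standard_HB120rs_v2 Standard_HB120rs_v3 Standard_HB120rs_v4 Standard_HC44rs Standard_ND40rs_v2 Standard_ND96asr_v4 Standard_ND96amsr_A100_v4 Standard_ND96isr_H100_v5"

# Hypothetical helper: succeed only if the given size is in the list above.
is_supported_hpc_size() {
  case " $SUPPORTED_HPC_SIZES " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Whether a size is actually offered in a given region is a separate question. `az vm list-skus --location <region> --size Standard_HB --output table` is one way to check.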
69+
70+
## Installed software packages
71+
72+
- Mellanox OFED 24.01-0.3.3.1
73+
- Pre-configured IPoIB (IP-over-InfiniBand)
74+
- Popular InfiniBand based MPI Libraries
75+
- HPC-X v2.18 with/without PMIx-4
76+
- Intel MPI 2021.12.0
77+
- MVAPICH2 2.3.7-1
78+
- OpenMPI 5.0.2 with PMIx-4
79+
- Communication Runtimes
80+
- Libfabric
81+
- OpenUCX
82+
- NCCL 2.21.5-1
83+
- NCCL RDMA Sharp Plugin
84+
- Optimized libraries
85+
- AMD Optimizing C/C++ and Fortran Compilers 4.0.0-1
86+
- Intel MKL 2024.0.0.49673
87+
- GPU Drivers
88+
- NVIDIA GPU Driver 535.161.08
89+
- NVIDIA Peer Memory (GPU Direct RDMA)
90+
- NVIDIA Fabric Manager
91+
- CUDA 12.4
92+
- GDRCopy 2.3
93+
- Data Center GPU Manager 3.3.3
94+
- Azure HPC Diagnostics Tool
95+
- SKU based customizations
96+
- Topology files
97+
- NCCL configuration
98+
- Moby 24.0.7-ubuntu22.04u1
99+
- NVIDIA Docker container 24.0.7-1
100+
- Azure Managed Lustre 2.15.4-42-gd6d405d
101+
- Moneo v0.3.5
102+
- Azure HPC Health Checks v0.4.2
103+
104+
A list of the installed component versions is available inside the VM image at `/opt/azurehpc/component_versions.txt`.
105+
106+
MPI libraries and software packages are available as environment modules. To load an MPI library/package, run:
107+
108+
```
109+
module load <package-name>
110+
```
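
In a job script, it can help to fail early when the `module` command is not defined, for example in a non-login shell. The wrapper below is a minimal sketch of that check, and the function name `load_mpi_module` is hypothetical. Run `module avail` first to see the module names that a particular image provides.

```shell
# Hypothetical helper: load a module only after confirming that the
# environment-modules `module` command is available in this shell.
load_mpi_module() {
  if ! command -v module >/dev/null 2>&1; then
    echo "module command not found in this shell" >&2
    return 1
  fi
  module load "$1"
}
```

Call it as `load_mpi_module <package-name>`, using a name reported by `module avail`.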
111+
112+
## Configuration and optimization
113+
114+
Refer to the [azhpc-images](https://github.com/Azure/azhpc-images) repo on GitHub for the latest details on which packages and configurations are included in each VM image. The included configurations are based on optimization recommendations from vendors and partners, as well as lessons learned from common HPC workloads and usage practices in traditional HPC systems.
115+
116+
- Azure Linux Agent (WAAgent)
117+
- Limit the CPU and memory usage of waagent (the VM agent that runs on every Azure Linux VM).
118+
- Optionally, for CPU-sensitive workloads, stop waagent at the beginning of your job script and restart it at the end:
119+
120+
```
121+
sudo systemctl stop waagent
122+
<HPC job>
123+
sudo systemctl restart waagent
124+
```
125+
126+
- Higher Memory Limits
127+
- Set max-locked-memory limit to unlimited
128+
- Set number of open files limit to 65535
129+
130+
- Zone Reclaim mode
131+
- Set zone_reclaim_mode to 1
132+
133+
- Disable firewall daemon to help MPI job launchers
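
The settings in the list above can be checked from a job script, and the waagent stop/restart pattern can be made safer so that the agent comes back even when the job fails. The following is a sketch under those assumptions. Both function names are hypothetical and are not shipped with the images.

```shell
# Sketch only: these helpers are illustrative, not part of the HPC images.

# Print the limits and kernel setting that the list above says are configured.
report_hpc_settings() {
  echo "max locked memory: $(ulimit -l)"
  echo "open files: $(ulimit -n)"
  if [ -r /proc/sys/vm/zone_reclaim_mode ]; then
    echo "zone_reclaim_mode: $(cat /proc/sys/vm/zone_reclaim_mode)"
  fi
}

# Run a job with waagent stopped, and restart waagent even if the job fails.
# The job's exit status is passed back to the caller.
run_without_waagent() {
  sudo systemctl stop waagent
  job_status=0
  "$@" || job_status=$?
  sudo systemctl restart waagent
  return "$job_status"
}
```

On an HPC image, `report_hpc_settings` would be expected to show the values named in the list (unlimited, 65535, and 1). `run_without_waagent <HPC job command>` wraps a single job command.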
134+
135+
## Deploying HPC VM images
136+
137+
As shown above, the HPC VM images are available through Azure Marketplace and the Azure CLI. They can be deployed with a variety of Azure tools, such as Azure CycleCloud, Azure Batch, and ARM templates. The [AzureHPC scripts](https://github.com/Azure/azurehpc/) provide a quick way to deploy an HPC cluster using these images.
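
For a single VM, the Azure CLI alone is enough. The helper below is a minimal sketch that passes an image URN and a VM size to `az vm create`. The function name `create_hpc_vm` is hypothetical. It assumes that the resource group already exists, that the subscription has quota for the chosen size, and that any Marketplace terms the image requires have already been accepted.

```shell
# Hypothetical helper: create a single VM from an HPC image URN.
# Arguments: resource group, VM name, image URN, VM size.
create_hpc_vm() {
  az vm create \
    --resource-group "$1" \
    --name "$2" \
    --image "$3" \
    --size "$4" \
    --generate-ssh-keys
}
```

A call looks like `create_hpc_vm <resource-group> <vm-name> <image-urn> Standard_HB120rs_v3`, where `<image-urn>` comes from the `az vm image list` output.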
