Commit 75bd5dc

Merge pull request #212893 from fannyou/patch-7
Update hb-hc-known-issues.md
2 parents 2a52a6d + 2041cd5 commit 75bd5dc

2 files changed: +6 -3 lines changed

articles/virtual-machines/workloads/hpc/hb-hc-known-issues.md

Lines changed: 5 additions & 2 deletions
@@ -29,9 +29,12 @@ To prevent low-level hardware access that can result in security vulnerabilities
 On Ubuntu-18.04 based marketplace VM images with kernels version `5.4.0-1039-azure #42` and newer, some older Mellanox OFED are incompatible causing an increase in VM boot time up to 30 minutes in some cases. This has been reported for both Mellanox OFED versions 5.2-1.0.4.0 and 5.2-2.2.0.0. The issue is resolved with Mellanox OFED 5.3-1.0.0.1.
 If it is necessary to use the incompatible OFED, a solution is to use the **Canonical:UbuntuServer:18_04-lts-gen2:18.04.202101290** marketplace VM image, or older and not to update the kernel.

-## Accelerated Networking on HB, HC, HBv2, HBv3 and NDv2
+## Accelerated Networking on HB, HC, HBv2, HBv3, NDv2 and NDv4
+
+[Azure Accelerated Networking](https://azure.microsoft.com/blog/maximize-your-vm-s-performance-with-accelerated-networking-now-generally-available-for-both-windows-and-linux/) is now available on the RDMA and InfiniBand capable and SR-IOV enabled VM sizes [HB](../../hb-series.md), [HC](../../hc-series.md), [HBv2](../../hbv2-series.md), [HBv3](../../hbv3-series.md), [NDv2](../../ndv2-series.md) and [NDv4](../../nda100-v4-series.md). This capability now allows enhanced throughout (up to 30 Gbps) and latencies over the Azure Ethernet network. Though this is separate from the RDMA capabilities over the InfiniBand network, some platform changes for this capability may impact behavior of certain MPI implementations when running jobs over InfiniBand. Specifically the InfiniBand interface on some VMs may have a slightly different name (mlx5_1 as opposed to earlier mlx5_0). This may require tweaking of the MPI command lines especially when using the UCX interface (commonly with OpenMPI and HPC-X).
+
+The simplest solution currently is to use the latest HPC-X on the CentOS-HPC VM images where we rename the IB/AN interfaces accordingly or to run the [script](https://github.com/Azure/azhpc-images/blob/master/common/install_azure_persistent_rdma_naming.sh) to rename the InfiniBand interface.

-[Azure Accelerated Networking](https://azure.microsoft.com/blog/maximize-your-vm-s-performance-with-accelerated-networking-now-generally-available-for-both-windows-and-linux/) is now available on the RDMA and InfiniBand capable and SR-IOV enabled VM sizes [HB](../../hb-series.md), [HC](../../hc-series.md), [HBv2](../../hbv2-series.md), [HBv3](../../hbv3-series.md) and [NDv2](../../ndv2-series.md). This capability now allows enhanced throughout (up to 30 Gbps) and latencies over the Azure Ethernet network. Though this is separate from the RDMA capabilities over the InfiniBand network, some platform changes for this capability may impact behavior of certain MPI implementations when running jobs over InfiniBand. Specifically the InfiniBand interface on some VMs may have a slightly different name (mlx5_1 as opposed to earlier mlx5_0). This may require tweaking of the MPI command lines especially when using the UCX interface (commonly with OpenMPI and HPC-X). The simplest solution currently may be to use the latest HPC-X on the CentOS-HPC VM images or disable Accelerated Networking if not required.
 More details on this are available on this [TechCommunity article](https://techcommunity.microsoft.com/t5/azure-compute/accelerated-networking-on-hb-hc-and-hbv2/ba-p/2067965) with instructions on how to address any observed issues.

 ## InfiniBand driver installation on non-SR-IOV VMs
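
For the interface-name difference described in the changed paragraph (mlx5_1 versus mlx5_0), one way to avoid hard-coding the name in an MPI command line is to look it up at run time. The following is a minimal sketch and is not part of this commit: `pick_ib_device` is a hypothetical helper, and it assumes the standard Linux sysfs layout in which each device port exposes a `link_layer` file containing either `InfiniBand` or `Ethernet`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the commit): print "<device>:<port>" for the
# first port whose link layer is InfiniBand, for example "mlx5_1:1".
# The sysfs root is a parameter so the function can be tried on a fake tree.
pick_ib_device() {
    local root="${1:-/sys/class/infiniband}"
    local f dev port
    for f in "$root"/*/ports/*/link_layer; do
        # Skip the unexpanded glob pattern when no device is present.
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "InfiniBand" ]; then
            # Path layout: <root>/<device>/ports/<port>/link_layer
            port="$(basename "$(dirname "$f")")"
            dev="$(basename "$(dirname "$(dirname "$(dirname "$f")")")")"
            printf '%s:%s\n' "$dev" "$port"
            return 0
        fi
    done
    return 1
}

# Possible use, assuming Open MPI built with UCX (./app is a placeholder):
#   mpirun -x UCX_NET_DEVICES="$(pick_ib_device)" ./app
```

The result is in the `device:port` form that the UCX `UCX_NET_DEVICES` variable accepts. On images where the rename script linked above has been run, the device name reported here would be the renamed one.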

articles/virtual-machines/workloads/hpc/setup-mpi.md

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ make -j 8 && make install
 ```

 > [!NOTE]
-> Recent builds of UCX have fixed an [issue](https://github.com/openucx/ucx/pull/5965) whereby the right InfiniBand interface is chosen in the presence of multiple NIC interfaces. For more information, see [Troubleshooting known issues with HPC and GPU VMs](hb-hc-known-issues.md#accelerated-networking-on-hb-hc-hbv2-hbv3-and-ndv2) on running MPI over InfiniBand when Accelerated Networking is enabled on the VM.
+> Recent builds of UCX have fixed an [issue](https://github.com/openucx/ucx/pull/5965) whereby the right InfiniBand interface is chosen in the presence of multiple NIC interfaces. For more information, see [Troubleshooting known issues with HPC and GPU VMs](hb-hc-known-issues.md) on running MPI over InfiniBand when Accelerated Networking is enabled on the VM.

 ## HPC-X
