Commit 71d49c2

Update resource-known-issues.md
1 parent 722a901 commit 71d49c2

1 file changed

+6
-0
lines changed

articles/machine-learning/service/resource-known-issues.md

Lines changed: 6 additions & 0 deletions
@@ -17,6 +17,12 @@ ms.custom: seodec18

This article helps you find and correct errors or failures encountered when using Azure Machine Learning.

## Upcoming SR-IOV upgrade to NCv3 machines in AmlCompute

Azure Compute will be updating the NCv3 SKUs starting early November to support all MPI implementations and versions, and RDMA verbs for InfiniBand-equipped virtual machines. The upgrade requires a short downtime. For details, [read more about the SR-IOV upgrade](https://azure.microsoft.com/updates/sriov-availability-on-ncv3-virtual-machines-sku).

As a customer of Azure Machine Learning's managed compute offering (AmlCompute), you are not required to make any changes at this time. Based on the [update schedule](https://azure.microsoft.com/updates/sr-iov-availability-schedule-on-ncv3-virtual-machines-sku), plan for a short break in your training. The service updates the VM images on your cluster nodes and automatically scales up your cluster. Once the upgrade completes, you may be able to use all other MPI distributions (such as OpenMPI with PyTorch), in addition to getting higher InfiniBand bandwidth, lower latencies, and better distributed application performance.

## Visual interface issues

Visual interface for machine learning service issues.
