@@ -39,8 +39,9 @@ Along with managing quotas, you can learn how to [plan and manage costs for Azur
39
39
In this section, you learn about the default and maximum quota limits for the following resources:
40
40
41
41
+ Azure Machine Learning assets
42
-
+ Azure Machine Learning compute
43
-
+ Azure Machine Learning pipelines
42
+
+ Azure Machine Learning compute
43
+
+ Azure Machine Learning managed online endpoints
44
+
+ Azure Machine Learning pipelines
44
45
+ Virtual machines
45
46
+ Azure Container Instances
46
47
+ Azure Storage
@@ -63,7 +64,7 @@ The following limits on assets apply on a per-workspace basis.
63
64
In addition, the maximum **run time** is 30 days and the maximum number of **metrics logged per run** is 1 million.
64
65
65
66
### Azure Machine Learning Compute
66
-
[Azure Machine Learning Compute](concept-compute-target.md#azure-machine-learning-compute-managed) has a default quota limit on both the number of cores (split by each VM Family and cumulative total cores) as well as the number of unique compute resources allowed per region in a subscription. This quota is separate from the VM core quota listed in the previous section as it applies only to the managed compute resources of Azure Machine Learning.
67
+
[Azure Machine Learning Compute](concept-compute-target.md#azure-machine-learning-compute-managed) has a default quota limit on both the number of cores (split by each VM Family and cumulative total cores) and the number of unique compute resources allowed per region in a subscription. This quota is separate from the VM core quota listed in the previous section as it applies only to the managed compute resources of Azure Machine Learning.
67
68
68
69
[Request a quota increase](#request-quota-increases) to raise the limits for various VM family core quotas, total subscription core quotas, cluster quota and resources in this section.
69
70
@@ -72,26 +73,26 @@ Available resources:
72
73
73
74
+**Low-priority cores per region** have a default limit of 100 to 3,000, depending on your subscription offer type. The number of low-priority cores per subscription can be increased and is a single value across VM families.
74
75
75
-
+**Clusters per region** have a default limit of 200. These are shared between training clusters, compute instances and MIR endpoint deployments. (A compute instance is considered a single-node cluster for quota purposes.) Cluster quota can be increased up to a value of 500 per region within a given subscription.
76
+
+**Clusters per region** have a default limit of 200. This limit is shared between training clusters, compute instances and MIR endpoint deployments. (A compute instance is considered a single-node cluster for quota purposes.) Cluster quota can be increased up to a value of 500 per region within a given subscription.
76
77
77
78
> [!TIP]
78
79
> To learn more about which VM family to request a quota increase for, check out [virtual machine sizes in Azure](../virtual-machines/sizes.md). For instance, GPU VM families start with an "N" in their family name (such as the NCv3 series).
79
80
80
-
The following table shows additional limits in the platform. Please reach out to the AzureML product team through a **technical** support ticket to request an exception.
81
+
The following table shows more limits in the platform. Reach out to the AzureML product team through a **technical** support ticket to request an exception.
81
82
82
83
|**Resource or Action**|**Maximum limit**|
83
84
| --- | --- |
84
85
| Workspaces per resource group | 800 |
85
-
| Nodes in a single Azure Machine Learning Compute (AmlCompute) **cluster** set up as a non communication-enabled pool (i.e. cannot run MPI jobs) | 100 nodes but configurable up to 65000 nodes |
86
-
| Nodes in a single Parallel Run Step **run** on an Azure Machine Learning Compute (AmlCompute) cluster | 100 nodes but configurable up to 65000 nodes if your cluster is set up to scale per above |
86
+
| Nodes in a single Azure Machine Learning Compute (AmlCompute) **cluster** set up as a non communication-enabled pool (that is, can't run MPI jobs) | 100 nodes but configurable up to 65,000 nodes |
87
+
| Nodes in a single Parallel Run Step **run** on an Azure Machine Learning Compute (AmlCompute) cluster | 100 nodes but configurable up to 65,000 nodes if your cluster is set up to scale per above |
87
88
| Nodes in a single Azure Machine Learning Compute (AmlCompute) **cluster** set up as a communication-enabled pool | 300 nodes but configurable up to 4,000 nodes |
88
89
| Nodes in a single Azure Machine Learning Compute (AmlCompute) **cluster** set up as a communication-enabled pool on an RDMA enabled VM Family | 100 nodes |
89
90
| Nodes in a single MPI **run** on an Azure Machine Learning Compute (AmlCompute) cluster | 100 nodes but can be increased to 300 nodes |
90
91
| Job lifetime | 21 days<sup>1</sup> |
91
92
| Job lifetime on a low-priority node | 7 days<sup>2</sup> |
92
93
| Parameter servers per node | 1 |
93
94
94
-
<sup>1</sup> Maximum lifetime is the duration between when a job starts and when it finishes. Completed jobs persist indefinitely. Data for jobs not completed within the maximum lifetime is not accessible.
95
+
<sup>1</sup> Maximum lifetime is the duration between when a job starts and when it finishes. Completed jobs persist indefinitely. Data for jobs not completed within the maximum lifetime isn't accessible.
95
96
96
97
<sup>2</sup> Jobs on a low-priority node can be preempted whenever there's a capacity constraint. We recommend that you implement checkpoints in your job.
97
98
@@ -107,37 +108,22 @@ Azure Machine Learning managed online endpoints have limits described in the fol
107
108
| Number of deployments per subscription | 200 |
108
109
| Number of deployments per endpoint | 20 |
109
110
| Number of instances per deployment | 20 <sup>2</sup> |
110
-
| Max request timeout at endpoint level | 90 seconds |
111
+
| Max request time-out at endpoint level | 90 seconds |
111
112
| Total requests per second at endpoint level for all deployments | 500 <sup>3</sup> |
112
113
| Total connections per second at endpoint level for all deployments | 500 <sup>3</sup> |
113
114
| Total connections active at endpoint level for all deployments | 500 <sup>3</sup> |
114
115
| Total bandwidth at endpoint level for all deployments | 5 MBPS <sup>3</sup> |
115
116
116
117
<sup>1</sup> Single dashes, as in `my-endpoint-name`, are accepted in endpoint and deployment names.
117
118
118
-
<sup>2</sup> We reserve 20% extra compute resources for performing upgrades. For example, if you request 10 instances in a deployment, you must have a quota for 12. Otherwise, you will receive an error.
119
+
<sup>2</sup> We reserve 20% extra compute resources for performing upgrades. For example, if you request 10 instances in a deployment, you must have a quota for 12. Otherwise, you'll receive an error.
119
120
120
121
<sup>3</sup> If you request a limit increase, be sure to calculate related limit increases you might need. For example, if you request a limit increase for requests per second, you might also want to compute the required connections and bandwidth limits and include these limit increases in the same request.
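The 20% upgrade reserve described in footnote 2 can be sketched as a small calculation. This is only an illustration: the function name `required_instance_quota` is hypothetical, and rounding the reserve up to a whole instance is an assumption, since the text gives only the 10-to-12 example.

```python
def required_instance_quota(requested_instances: int) -> int:
    """Return the instance quota needed for a deployment, including the
    20% extra reserved for upgrades.

    Assumption: the reserve is rounded up to a whole instance. The source
    only states that 10 requested instances need a quota of 12.
    """
    if requested_instances < 0:
        raise ValueError("requested_instances must be non-negative")
    # 20% of n is n / 5; -(-n // 5) is integer ceiling division.
    reserve = -(-requested_instances // 5)
    return requested_instances + reserve


# The example from footnote 2: 10 requested instances need a quota of 12.
print(required_instance_quota(10))
```

Integer arithmetic is used instead of multiplying by 1.2 so that floating-point rounding can't affect the result.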
121
122
122
123
To determine the current usage for an endpoint, [view the metrics](how-to-monitor-online-endpoints.md#metrics).
123
124
124
-
To request an exception from the Azure Machine Learning product team, use the steps in the [Request quota increases](#request-quota-increases) section and provide the following information:
125
+
To request an exception from the Azure Machine Learning product team, use the steps in the [Request quota increases](#request-quota-increases) section.
125
126
126
-
1. When opening the support request, __do not select Service and subscription limits (quotas)__. Instead, select __Technical__ as the issue type.
127
-
1. Provide the Azure __subscriptions__ and __regions__ where you want to increase the quota.
128
-
1. Provide the __tenant ID__ and __customer name__.
129
-
1. Provide the __quota type__ and __new limit__. Use the following table as a guide:
130
-
131
-
| Quota Type | New Limit |
132
-
| ----- | ----- |
133
-
| MaxEndpointsPerSub (Number of endpoints per subscription) | ? |
134
-
| MaxDeploymentsPerSub (Number of deployments per subscription) | ? |
135
-
| MaxDeploymentsPerEndpoint (Number of deployments per endpoint) | ? |
136
-
| MaxInstancesPerDeployment (Number of instances per deployment) | ? |
137
-
| EndpointRequestRateLimitPerSec (Total requests per second at endpoint level for all deployments) | ? |
138
-
| EndpointConnectionRateLimitPerSec (Total connections per second at endpoint level for all deployments) | ? |
139
-
| EndpointConnectionLimit (Total connections active at endpoint level for all deployments) | ? |
140
-
| EndpointBandwidthLimitKBps (Total bandwidth at endpoint level for all deployments (MBPS)) | ? |
141
127
142
128
### Azure Machine Learning pipelines
143
129
[Azure Machine Learning pipelines](concept-ml-pipelines.md) have the following limits.
@@ -150,7 +136,7 @@ To request an exception from the Azure Machine Learning product team, use the st
150
136
### Virtual machines
151
137
Each Azure subscription has a limit on the number of virtual machines across all services. Virtual machine cores have a regional total limit and a regional limit per size series. Both limits are separately enforced.
152
138
153
-
For example, consider a subscription with a US East total VM core limit of 30, an A series core limit of 30, and a D series core limit of 30. This subscription would be allowed to deploy 30 A1 VMs, or 30 D1 VMs, or a combination of the two that does not exceed a total of 30 cores.
139
+
For example, consider a subscription with a US East total VM core limit of 30, an A series core limit of 30, and a D series core limit of 30. This subscription would be allowed to deploy 30 A1 VMs, or 30 D1 VMs, or a combination of the two that doesn't exceed a total of 30 cores.
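The way the two limits combine in this example can be sketched as a simple check. This is a hypothetical helper, not an Azure API: the name `can_deploy` and the dictionary layout are assumptions for illustration, and it assumes A1 and D1 VMs use one core each, as the example implies.

```python
def can_deploy(requested_cores: dict, family_limits: dict, regional_total_limit: int) -> bool:
    """Check a request against both separately enforced core limits.

    requested_cores maps a VM size series (for example "A") to the number
    of cores requested in that series for one region.
    """
    # Limit 1: each size series has its own regional core limit.
    for family, cores in requested_cores.items():
        if cores > family_limits.get(family, 0):
            return False
    # Limit 2: the regional total across all series.
    return sum(requested_cores.values()) <= regional_total_limit


# Limits from the example: total 30, A series 30, D series 30.
limits = {"A": 30, "D": 30}
print(can_deploy({"A": 30}, limits, 30))           # 30 A1 VMs
print(can_deploy({"A": 15, "D": 15}, limits, 30))  # a mix totaling 30 cores
print(can_deploy({"A": 20, "D": 20}, limits, 30))  # each series is within its limit, but the total is 40
```

The last request passes each per-series check yet exceeds the regional total, which is why both limits have to be tested.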
154
140
155
141
You can't raise limits for virtual machines above the values shown in the following table.
156
142
@@ -191,7 +177,7 @@ You can't set a negative value or a value higher than the subscription-level quo
191
177
192
178
:::image type="content" source="media/how-to-manage-quotas/select-all-options.png" alt-text="Screenshot shows select all options to see compute resources that need more quota":::
193
179
194
-
1. Scroll down until you see the list of VM sizes you do not have quota for.
180
+
1. Scroll down until you see the list of VM sizes you don't have quota for.
195
181
196
182
:::image type="content" source="media/how-to-manage-quotas/scroll-to-zero-quota.png" alt-text="Screenshot shows list of zero quota":::
197
183
@@ -228,6 +214,15 @@ When you're requesting a quota increase, select the service that you have in min
228
214
> [!NOTE]
229
215
> [Free trial subscriptions](https://azure.microsoft.com/offers/ms-azr-0044p) are not eligible for limit or quota increases. If you have a free trial subscription, you can upgrade to a [pay-as-you-go](https://azure.microsoft.com/offers/ms-azr-0003p/) subscription. For more information, see [Upgrade Azure free trial to pay-as-you-go](../cost-management-billing/manage/upgrade-azure-subscription.md) and [Azure free account FAQ](https://azure.microsoft.com/free/free-account-faq).
230
216
217
+
### Endpoint quota increases
218
+
219
+
When requesting the quota increase, provide the following information:
220
+
221
+
1. When opening the support request, select __Machine Learning Service: Endpoint Limits__ as the __Quota type__.
222
+
1. On the __Additional details__ tab, select __Enter details__ and then provide the quota you'd like to increase and the new value, the reason for the quota increase request, and __location(s)__ where you need the quota increase. Finally, select __Save and continue__ to continue.
223
+
224
+
:::image type="content" source="./media/how-to-manage-quotas/quota-details.png" lightbox="./media/how-to-manage-quotas/quota-details.png" alt-text="Screenshot of the quota details form.":::
225
+
231
226
## Next steps
232
227
233
228
+[Plan and manage costs for Azure Machine Learning](concept-plan-manage-cost.md)