articles/machine-learning/how-to-manage-quotas.md
Lines changed: 8 additions & 6 deletions
@@ -137,12 +137,12 @@ To request an exception from the Azure Machine Learning product team, use the st
 | Number of deployments per endpoint | 20 | Yes | All types of endpoints <sup>3</sup> |
 | Number of deployments per cluster | 100 | - | Kubernetes online endpoint |
 | Number of instances per deployment | 50 <sup>4</sup> | Yes | Managed online endpoint |
-| Max request time-out at endpoint level | 180 seconds | - | Managed online endpoint |
+| Max request time-out at endpoint level | 180 seconds <sup>5</sup> | - | Managed online endpoint |
 | Max request time-out at endpoint level | 300 seconds | - | Kubernetes online endpoint |
-| Total requests per second at endpoint level for all deployments | 500 <sup>5</sup> | Yes | Managed online endpoint |
-| Total connections per second at endpoint level for all deployments | 500 <sup>5</sup> | Yes | Managed online endpoint |
-| Total connections active at endpoint level for all deployments | 500 <sup>5</sup> | Yes | Managed online endpoint |
-| Total bandwidth at endpoint level for all deployments | 5 MBPS <sup>5</sup> | Yes | Managed online endpoint |
+| Total requests per second at endpoint level for all deployments | 500 <sup>6</sup> | Yes | Managed online endpoint |
+| Total connections per second at endpoint level for all deployments | 500 <sup>6</sup> | Yes | Managed online endpoint |
+| Total connections active at endpoint level for all deployments | 500 <sup>6</sup> | Yes | Managed online endpoint |
+| Total bandwidth at endpoint level for all deployments | 5 MBPS <sup>6</sup> | Yes | Managed online endpoint |

 <sup>1</sup> This is a regional limit. For example, if current limit on number of endpoints is 100, you can create 100 endpoints in the East US region, 100 endpoints in the West US region, and 100 endpoints in each of the other supported regions in a single subscription. Same principle applies to all the other limits.
@@ -152,7 +152,9 @@ To request an exception from the Azure Machine Learning product team, use the st
 <sup>4</sup> We reserve 20% extra compute resources for performing upgrades. For example, if you request 10 instances in a deployment, you must have a quota for 12. Otherwise, you receive an error. There are some VM SKUs that are exempt from extra quota. For more information on quota allocation, see [virtual machine quota allocation for deployment](#virtual-machine-quota-allocation-for-deployment).

-<sup>5</sup> Requests per second, connections, bandwidth, etc. are related. If you request to increase any of these limits, ensure that you estimate/calculate other related limits together.
+<sup>5</sup> The request timeout maximum is 180 seconds unless it is a flow (prompt flow) deployment. The maximum request timeout for a flow deployment is 300 seconds. For more information on the timeout with flow deployments, see [deploy a flow in prompt flow](./prompt-flow/how-to-deploy-to-code.md#upstream-request-timeout-issue-when-consuming-the-endpoint).
+
+<sup>6</sup> Requests per second, connections, bandwidth, etc. are related. If you request to increase any of these limits, ensure that you estimate/calculate other related limits together.

 #### Virtual machine quota allocation for deployment
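
Reviewer note on footnote 4 in the hunk above: the 20% upgrade reserve can be checked with a few lines of arithmetic. The sketch below is illustrative only. The function name `required_instance_quota` and the `exempt_sku` flag are hypothetical, and rounding fractional results up is an assumption; the footnote only gives the 10-to-12 example and does not say how fractions are handled or which VM SKUs are exempt.

```python
def required_instance_quota(requested_instances: int, exempt_sku: bool = False) -> int:
    """Estimate the quota needed for a deployment, per footnote 4.

    Hypothetical helper, not part of any Azure SDK.
    """
    if requested_instances < 0:
        raise ValueError("requested_instances must be non-negative")
    if exempt_sku:
        # Footnote 4 says some VM SKUs are exempt from the extra quota.
        return requested_instances
    # 20% reserve for upgrades, computed in integer arithmetic.
    # Rounding up is an assumption; the source only shows 10 -> 12.
    return (requested_instances * 120 + 99) // 100


print(required_instance_quota(10))  # 12, matching the example in footnote 4
```

For the footnote's own example, 10 requested instances yield a required quota of 12; with `exempt_sku=True` the result stays at 10.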