Commit 817995e

quota
1 parent 3a462d6 commit 817995e

2 files changed, +12 -8 lines changed

articles/machine-learning/how-to-manage-quotas.md

Lines changed: 8 additions & 6 deletions
@@ -137,12 +137,12 @@ To request an exception from the Azure Machine Learning product team, use the st
 | Number of deployments per endpoint | 20 | Yes | All types of endpoints <sup>3</sup> |
 | Number of deployments per cluster | 100 | - | Kubernetes online endpoint |
 | Number of instances per deployment | 50 <sup>4</sup> | Yes | Managed online endpoint |
-| Max request time-out at endpoint level | 180 seconds | - | Managed online endpoint |
+| Max request time-out at endpoint level | 180 seconds <sup>5</sup> | - | Managed online endpoint |
 | Max request time-out at endpoint level | 300 seconds | - | Kubernetes online endpoint |
-| Total requests per second at endpoint level for all deployments | 500 <sup>5</sup> | Yes | Managed online endpoint |
-| Total connections per second at endpoint level for all deployments | 500 <sup>5</sup> | Yes | Managed online endpoint |
-| Total connections active at endpoint level for all deployments | 500 <sup>5</sup> | Yes | Managed online endpoint |
-| Total bandwidth at endpoint level for all deployments | 5 MBPS <sup>5</sup> | Yes | Managed online endpoint |
+| Total requests per second at endpoint level for all deployments | 500 <sup>6</sup> | Yes | Managed online endpoint |
+| Total connections per second at endpoint level for all deployments | 500 <sup>6</sup> | Yes | Managed online endpoint |
+| Total connections active at endpoint level for all deployments | 500 <sup>6</sup> | Yes | Managed online endpoint |
+| Total bandwidth at endpoint level for all deployments | 5 MBPS <sup>6</sup> | Yes | Managed online endpoint |
 
 <sup>1</sup> This is a regional limit. For example, if current limit on number of endpoints is 100, you can create 100 endpoints in the East US region, 100 endpoints in the West US region, and 100 endpoints in each of the other supported regions in a single subscription. Same principle applies to all the other limits.
 
@@ -152,7 +152,9 @@ To request an exception from the Azure Machine Learning product team, use the st
 
 <sup>4</sup> We reserve 20% extra compute resources for performing upgrades. For example, if you request 10 instances in a deployment, you must have a quota for 12. Otherwise, you receive an error. There are some VM SKUs that are exempt from extra quota. For more information on quota allocation, see [virtual machine quota allocation for deployment](#virtual-machine-quota-allocation-for-deployment).
 
-<sup>5</sup> Requests per second, connections, bandwidth, etc. are related. If you request to increase any of these limits, ensure that you estimate/calculate other related limits together.
+<sup>5</sup> The request timeout maximum is 180 seconds unless it is a flow (prompt flow) deployment. The maximum request timeout for a flow deployment is 300 seconds. For more information on the timeout with flow deployments, see [deploy a flow in prompt flow](./prompt-flow/how-to-deploy-to-code.md#upstream-request-timeout-issue-when-consuming-the-endpoint).
+
+<sup>6</sup> Requests per second, connections, bandwidth, etc. are related. If you request to increase any of these limits, ensure that you estimate/calculate other related limits together.
 
 #### Virtual machine quota allocation for deployment
 
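Reviewer note: the timeout rules this change documents (180 seconds for managed online endpoints, 300 seconds for prompt flow deployments on managed endpoints, 300 seconds for Kubernetes online endpoints) can be sketched as a small pre-deployment check. This is only an illustrative sketch: the function names and the `"managed"` / `"kubernetes"` labels are hypothetical and are not part of any Azure SDK or CLI.

```python
# Limits in seconds, copied from the quota table and footnote 5 in this diff.
MANAGED_MAX_S = 180              # managed online endpoint (non-prompt flow)
MANAGED_PROMPT_FLOW_MAX_S = 300  # managed online deployment from prompt flow
KUBERNETES_MAX_S = 300           # Kubernetes online endpoint


def max_request_timeout_ms(endpoint_type: str, is_prompt_flow: bool = False) -> int:
    """Return the documented maximum request timeout in milliseconds.

    `endpoint_type` is a hypothetical label: "managed" or "kubernetes".
    """
    if endpoint_type == "kubernetes":
        return KUBERNETES_MAX_S * 1000
    if endpoint_type == "managed":
        seconds = MANAGED_PROMPT_FLOW_MAX_S if is_prompt_flow else MANAGED_MAX_S
        return seconds * 1000
    raise ValueError(f"unknown endpoint type: {endpoint_type!r}")


def check_request_timeout(timeout_ms: int, endpoint_type: str,
                          is_prompt_flow: bool = False) -> bool:
    """Return True if `timeout_ms` is positive and within the documented maximum."""
    return 0 < timeout_ms <= max_request_timeout_ms(endpoint_type, is_prompt_flow)
```

For example, `check_request_timeout(300000, "managed")` is rejected, while the same value passes with `is_prompt_flow=True`, which matches the `request_timeout_ms: 300000` setting shown in the second file of this commit.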
articles/machine-learning/prompt-flow/how-to-deploy-to-code.md

Lines changed: 4 additions & 2 deletions
@@ -493,9 +493,11 @@ request_settings:
 request_timeout_ms: 300000
 ```
 
-> [!NOTE]
+> [!IMPORTANT]
+>
+> The 300,000 ms timeout _only works for managed online deployments from prompt flow_. The maximum for a non-prompt flow managed online endpoint is 180 seconds.
 >
-> 300,000 ms timeout only works for maanged online deployments from prompt flow. You need to make sure that you have added properties for your model as below (either inline model specification in the deployment yaml or standalone model specification yaml) to indicate this is a deployment from prompt flow.
+> You need to make sure that you have added properties for your model as below (either inline model specification in the deployment yaml or standalone model specification yaml) to indicate this is a deployment from prompt flow.
 
 ```yaml
 properties:
