Commit 505596f

Merge pull request #2273 from MicrosoftDocs/main
1/14/2025 11:00 AM IST Publish
2 parents de17d62 + 0185afc commit 505596f

22 files changed

+496
-122
lines changed

articles/ai-services/document-intelligence/faq.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -242,7 +242,7 @@ sections:
242242
243243
- Increase the workload gradually. Avoid sharp changes.
244244
245-
- [Create a support request](service-limits.md#create-and-submit-support-request) to increase transactions per second (TPS) limit.
245+
- [Create a support request](service-limits.md#create-and-submit-support-request-for-tps-increase) to increase transactions per second (TPS) limit.
246246
247247
Learn more about Document Intelligence [service quotas and limits](service-limits.md).
248248

articles/ai-services/document-intelligence/service-limits.md

Lines changed: 29 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ author: laujan
77
manager: nitinme
88
ms.service: azure-ai-document-intelligence
99
ms.topic: conceptual
10-
ms.date: 11/19/2024
10+
ms.date: 01/13/2025
1111
ms.author: lajanuar
1212
monikerRange: '<=doc-intel-4.0.0'
1313
---
@@ -64,7 +64,7 @@ Document Intelligence billing is calculated monthly based on the model type and
6464

6565
- Container pricing is the same as cloud service pricing.
6666

67-
- Document Intelligence offers a free tier (F0) where you can test all the Document Intelligence features.
67+
- Document Intelligence offers a free tier (F0) where you can test all the Document Intelligence features. The free tier limits analyze response to only the first two pages in a request.
6868

6969
- Document Intelligence has a commitment-based pricing model for large workloads.
7070

@@ -93,7 +93,13 @@ Document Intelligence billing is calculated monthly based on the model type and
9393

9494
|Quota|Free (F0)<sup>1</sup>|Standard (S0)|
9595
|--|--|--|
96-
| **Transactions Per Second limit** | 1 | 15 (default value) |
96+
| **Analyze transactions Per Second limit** | 1 | 15 (default value) |
97+
| Adjustable | No | Yes <sup>2</sup> |
98+
| **Get operations Per Second limit** | 1 | 50 (default value) |
99+
| Adjustable | No | Yes <sup>2</sup> |
100+
| **Model management operations Per Second limit** | 1 | 5 (default value) |
101+
| Adjustable | No | Yes <sup>2</sup> |
102+
| **List operations Per Second limit** | 1 | 10 (default value) |
97103
| Adjustable | No | Yes <sup>2</sup> |
98104
| **Max document size** | 4 MB | 500 MB |
99105
| Adjustable | No | No |
@@ -131,7 +137,7 @@ Document Intelligence billing is calculated monthly based on the model type and
131137
| Adjustable | No | No |
132138
| **Max number of pages (Training) * Neural and Generative** | 50,000 | 50,000 (default value) |
133139
| Adjustable | No | No |
134-
| **Custom neural model train** | 10 hours per month <sup>5</sup> | no limit (pay by the hour) |
140+
| **Custom neural model train** | 10 hours per month <sup>5</sup> | no limit (pay by the hour), start with 10 free hours each month |
135141
| Adjustable | No |Yes <sup>3</sup>|
136142
| **Max number of pages (Training) * Classifier** | 10,000 | 10,000 (default value) |
137143
| Adjustable | No | No |
@@ -238,9 +244,9 @@ Document Intelligence billing is calculated monthly based on the model type and
238244

239245
::: moniker range=">=doc-intel-2.1.0"
240246

241-
> <sup>1</sup> For **Free (F0)** pricing tier see also monthly allowances at the [pricing page](https://azure.microsoft.com/pricing/details/form-recognizer/).</br>
242-
> <sup>2</sup> See [best practices](#example-of-a-workload-pattern-best-practice), and [adjustment instructions](#create-and-submit-support-request).</br>
243-
> <sup>3</sup> Neural models training count is reset every calendar month. Open a support request to increase the monthly training limit.
247+
> <sup>1</sup> For **Free (F0)** pricing tier see also monthly allowances at the [pricing page](https://azure.microsoft.com/pricing/details/ai-document-intelligence/).</br>
248+
> <sup>2</sup> See [best practices](#example-of-a-workload-pattern-best-practice), and [adjustment instructions](#create-and-submit-support-request-for-tps-increase).</br>
249+
> <sup>3</sup> Neural models training count is reset every calendar month. Open a support request to increase the monthly training limit. Starting with the v4.0 API, training requests over 20 requests in a calendar month are billed on the training tier. See [pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/) for details.
244250
::: moniker-end
245251
::: moniker range=">=doc-intel-3.0.0"
246252
> <sup>4</sup> This limit applies to all documents found in your training dataset folder prior to any labeling-related updates.
@@ -251,49 +257,40 @@ Document Intelligence billing is calculated monthly based on the model type and
251257

252258
## Detailed description, Quota adjustment, and best practices
253259

254-
Before requesting a quota increase (where applicable), ensure that it's necessary. Document Intelligence service uses autoscaling to bring the required computational resources `on-demand`, keep the customer costs low, and deprovision unused resources by not maintaining an excessive amount of hardware capacity.
260+
The default limits can be extended by requesting an increase via a support ticket. Before requesting a quota increase (where applicable), ensure that it's necessary. Document Intelligence service uses autoscaling to bring the required computational resources `on-demand`, keep the customer costs low, and deprovision unused resources by not maintaining an excessive amount of hardware capacity.
255261

256-
If your application returns Response Code 429 (*Too many requests*) and your workload is within the defined limits: most likely, the service is scaling up to your demand, but has yet to reach the required scale. Thus the service doesn't immediately have enough resources to serve the request. This state is transient and shouldn't last long.
262+
If your application returns Response Code 429 (*Too many requests*), you're over the threshold for one or more of the transactions per second (TPS) limits:
263+
* **Analyze transactions Per Second limit** The TPS for submitting analyze requests (POST)
264+
* **Get operations Per Second limit** The TPS for polling for results on analyze operations (GET)
265+
* **Model management operations Per Second limit** Operations related to model management like build/train and copy.
266+
* **List operations Per Second limit** Operations related to listing models and operations.
257267

258268
### General best practices to mitigate throttling during autoscaling
259269

260270
To minimize issues related to throttling (Response Code 429), we recommend using the following techniques:
261271

262272
* Implement retry logic in your application
263273
* Avoid sharp changes in the workload. Increase the workload gradually <br/>
264-
*Example.* Your application is using Document Intelligence and your current workload is 10 TPS (transactions per second). The next second you increase the load to 40 TPS (that is four times more). The Service immediately starts scaling up to fulfill the new load, but likely it can't do it within a second, so some of the requests get Response Code 429.
274+
*Example.* Your application is using Document Intelligence and your current workload is 10 TPS (transactions per second). The next second you increase the load to 40 TPS. The result is a 429 response code for some requests as you are over the 15 TPS limit for submitting analyze operations. You could either back off the processing to stay under the 15 TPS or request an increase on the TPS to support your higher volumes.
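The "increase the workload gradually" advice above can be sketched as a simple ramp schedule. This is a minimal illustration and not part of the service or any SDK: the function name `ramp_schedule`, the step size, and the idea of holding each rate for one interval are assumptions. Only the 10, 40, and 15 TPS figures come from the example.

```python
from typing import List


def ramp_schedule(current_tps: float, target_tps: float,
                  limit_tps: float, step: float = 1.0) -> List[float]:
    """Return the request rates to step through, one per interval.

    The rate rises by `step` each interval and never exceeds the
    smaller of the desired target and the resource's TPS limit.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    ceiling = min(target_tps, limit_tps)
    rates: List[float] = []
    rate = current_tps
    while rate < ceiling:
        rate = min(ceiling, rate + step)
        rates.append(rate)
    return rates


# Hypothetical usage: currently at 10 TPS, want 40 TPS, default limit is 15 TPS.
schedule = ramp_schedule(10, 40, 15, step=2)
```

With these inputs the schedule stops at the 15 TPS limit. Reaching 40 TPS would still require a quota increase.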
265275

266276
The next sections describe specific cases of adjusting quotas.
267-
Jump to [Document Intelligence: increasing concurrent request limit](#create-and-submit-support-request)
277+
Jump to [Document Intelligence: increasing concurrent request limit](#create-and-submit-support-request-for-tps-increase)
268278

269279
### Increasing transactions per second request limit
270280

271281
By default the number of transactions per second is limited to 15 transactions per second for a Document Intelligence resource. For the Standard pricing tier, this amount can be increased. Before submitting the request, ensure you're familiar with the material in [this section](#detailed-description-quota-adjustment-and-best-practices) and aware of these [best practices](#example-of-a-workload-pattern-best-practice).
272282

273-
Increasing the Concurrent Request limit does **not** directly affect your costs. Document Intelligence service uses "Pay only for what you use" model. The limit defines how high the Service can scale before it starts throttle your requests.
283+
The first step is to enable auto scaling. Follow [enable auto scaling](../../ai-services/autoscale.md) to turn it on for your resource. With auto scaling enabled, your resource can continue to accept requests over the configured TPS limits if there's capacity on the service. Requests can still be throttled.
274284

275-
Existing value of Concurrent Request limit parameter is **not** visible via Azure portal, Command-Line tools, or API requests. To verify the existing value, create an Azure Support Request.
276-
277-
If you would like to increase your transactions per second, you can enable auto scaling on your resource. Follow this document to enable auto scaling on your resource * [enable auto scaling](../../ai-services/autoscale.md). You can also submit an increase TPS support request.
278-
279-
#### Have the required information ready
285+
Increasing the Concurrent Request limit does **not** directly affect your costs. Document Intelligence service uses a "Pay only for what you use" model. The limit defines how high the Service can scale before it starts throttling your requests.
280286

281-
- Document Intelligence Resource ID
282-
- Region
287+
The existing value of different request limit categories is available via Azure portal, under the monitoring tab on the resource overview blade.
283288

284-
- Base model information:
285-
- Sign in to the [Azure portal](https://portal.azure.com)
286-
- Select the Document Intelligence Resource for which you would like to increase the transaction limit
287-
- Select -Properties- (-Resource Management- group)
288-
- Copy and save the values of the following fields:
289-
- Resource ID
290-
- Location (your endpoint Region)
291289

292-
#### Create and submit support request
290+
#### Create and submit support request for TPS increase
293291

294292
Initiate the increase of the transactions per second (TPS) limit for your resource by submitting a support request:
295293

296-
- Ensure you have the [required information](#have-the-required-information-ready)
297294
- Sign in to the [Azure portal](https://portal.azure.com)
298295
- Select the Document Intelligence Resource for which you would like to increase the TPS limit
299296
- Select -New support request- (-Support + troubleshooting- group). A new window appears with autopopulated information about your Azure Subscription and Azure Resource
@@ -303,16 +300,16 @@ Initiate the increase of transactions per second(TPS) limit for your resource by
303300
- Proceed further with the request creation
304301
- Enter the following information in the -Description- field, under the Details tab:
305302
- a note, that the request is about Document Intelligence quota.
306-
- Provide a TPS expectation you would like to scale to meet.
307-
- Azure resource information you [collected](#have-the-required-information-ready).
303+
- Provide a TPS expectation you would like to scale to meet. While TPS increases are free, you should only request a TPS that is reasonable for your workload.
304+
- Azure resource information
308305
- Complete entering the required information and select -Create- button in -Review + create- tab
309306
- Note the support request number in Azure portal notifications. Look for Support to contact you shortly for further processing.
310307

311308
## Example of a workload pattern best practice
312309

313310
This example presents the approach we recommend following to mitigate possible request throttling due to [Autoscaling being in progress](#detailed-description-quota-adjustment-and-best-practices). It isn't an *exact recipe*, but merely a template we invite you to follow and adjust as necessary.
314311

315-
Let us suppose that a Document Intelligence resource has the default limit set. Start the workload to submit your analyze requests. If you find that you're seeing frequent throttling with response code 429, start by implementing an exponential backoff on the GET analyze response request. By using a progressively longer wait time between retries for consecutive error responses, for example a 2-5-13-34 pattern of delays between requests. In general, we recommended not calling the get analyze response more than once every 2 seconds for a corresponding POST request.
312+
Let us suppose that a Document Intelligence resource has the default limit set. Start the workload to submit your analyze requests. If you find that you're seeing frequent throttling with response code 429 when checking for completion, start by implementing an exponential backoff on the GET analyze response request: use a progressively longer wait time between retries for consecutive error responses, for example a 2-5-13-34 pattern of delays between requests. In general, we recommend not calling the get analyze response more than once every 2 seconds for a corresponding POST request. The `analyze` response also contains a **retry-after** header that indicates how long you should wait in seconds before checking for completion of that request.
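The polling guidance above could be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's implementation: `get_status` is a hypothetical callable that performs the GET and returns the HTTP status code, the response headers, and the operation status. The status strings and the growth factor of 2.5 (which gives delays of about 2, 5, 12.5, and 31 seconds) are also assumptions.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical shape of one poll: (HTTP status code, headers, operation status).
PollResult = Tuple[int, Mapping[str, str], Optional[str]]


def next_delay(consecutive_errors: int, headers: Mapping[str, str],
               base: float = 2.0, factor: float = 2.5, cap: float = 60.0) -> float:
    """Seconds to wait before the next GET: backoff or Retry-After, whichever is longer."""
    backoff = min(cap, base * factor ** consecutive_errors)
    raw = headers.get("Retry-After") or headers.get("retry-after")
    try:
        hinted = float(raw) if raw is not None else 0.0
    except ValueError:
        hinted = 0.0  # ignore values that are not a number of seconds
    return max(backoff, hinted)


def poll_until_done(get_status: Callable[[], PollResult],
                    sleep: Callable[[float], None] = time.sleep,
                    max_polls: int = 20) -> str:
    """Poll until the operation reaches a terminal status (assumed names)."""
    errors = 0
    for _ in range(max_polls):
        code, headers, status = get_status()
        if code == 200 and status in ("succeeded", "failed"):
            return status
        if code == 429:
            delay = next_delay(errors, headers)  # grows with consecutive 429s
            errors += 1
        else:
            errors = 0
            delay = next_delay(0, headers)  # never poll faster than every 2 seconds
        sleep(delay)
    raise TimeoutError("operation did not finish within max_polls")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. In an application, `get_status` would wrap whatever HTTP client you already use.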
316313

317314
If you find that you're being throttled on the number of POST requests for documents being submitted, consider adding a delay between the requests. If your workload requires a higher degree of concurrent processing, you then need to create a support request to increase your service limits on transactions per second.
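Adding a delay between POST requests, as suggested above, can be done with a small client-side pacer. The sketch below is one possible approach and is not taken from the service documentation: the class name `Pacer` and its parameters are hypothetical, and the rate you pass in should match the limit that applies to your resource.

```python
import time
from typing import Callable, Optional


class Pacer:
    """Spaces calls so that at most `max_per_second` start each second."""

    def __init__(self, max_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_slot: Optional[float] = None

    def wait(self) -> float:
        """Block until the next send slot and return the seconds waited."""
        now = self._clock()
        if self._next_slot is None or now >= self._next_slot:
            # No request is pending in this slot, so send immediately.
            self._next_slot = now + self._interval
            return 0.0
        delay = self._next_slot - now
        self._sleep(delay)
        self._next_slot += self._interval
        return delay


# Hypothetical usage: call pacer.wait() before each analyze POST.
pacer = Pacer(max_per_second=15)
```

A pacer like this only smooths your own submission rate. If the workload needs more concurrency than the limit allows, a support request for a TPS increase is still required.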
318315

@@ -321,4 +318,4 @@ Generally, we recommended testing the workload and the workload patterns before
321318
## Next steps
322319

323320
> [!div class="nextstepaction"]
324-
> [Learn about error codes and troubleshooting](v3-error-guide.md)
321+
> [Learn about error codes and troubleshooting](v3-error-guide.md)
