Commit 319723b

clarified adjustable
1 parent fa228d3 commit 319723b

1 file changed: +14 -16 lines changed

articles/cognitive-services/Speech-Service/speech-services-quotas-and-limits.md

Lines changed: 14 additions & 16 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: conceptual
-ms.date: 04/22/2022
+ms.date: 02/17/2023
 ms.author: alexeyo
 ---
 
@@ -20,25 +20,25 @@ For the free (F0) pricing tier, see also the monthly allowances at the [pricing
 
 ## Quotas and limits reference
 
-The following sections provide you with a quick guide to the quotas and limits that apply to Speech service.
+The following sections provide you with a quick guide to the quotas and limits that apply to the Speech service.
 
 For information about adjustable quotas for Standard (S0) Speech resources, see [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#speech-to-text-increase-online-transcription-concurrent-request-limit). The quotas and limits for Free (F0) Speech resources aren't adjustable.
 
 ### Speech-to-text quotas and limits per resource
 
-This section describes speech-to-text quotas and limits per Speech resource.
+This section describes speech-to-text quotas and limits per Speech resource. Unless otherwise specified, the limits aren't adjustable.
 
 #### Online transcription
 
 You can use online transcription with the [Speech SDK](speech-sdk.md) or the [speech-to-text REST API for short audio](rest-speech-to-text-short.md).
 
 > [!IMPORTANT]
-> These limits apply to concurrent speech-to-text [online transcription](#online-transcription) requests and [speech translation](#speech-translation-quotas-and-limits-per-resource) requests combined. For example, if you have 50 concurrent speech-to-text requests and 50 concurrent speech translation requests, you'll reach the limit of 100 concurrent requests.
+> These limits apply to concurrent speech-to-text [online transcription](#online-transcription) requests and [speech translation](#speech-translation-quotas-and-limits-per-resource) requests combined. For example, if you have 60 concurrent speech-to-text requests and 40 concurrent speech translation requests, you'll reach the limit of 100 concurrent requests.
 
 | Quota | Free (F0) | Standard (S0) |
 |--|--|--|
-| Concurrent request limit - base model endpoint | 1 | 100 (default value) |
-| Concurrent request limit - custom endpoint | 1 | 100 (default value) |
+| Concurrent request limit - base model endpoint | 1 <br/><br/>This limit isn't adjustable. | 100 (default value)<br/><br/>The rate is adjustable for Standard (S0) resources. See [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#speech-to-text-increase-online-transcription-concurrent-request-limit). |
+| Concurrent request limit - custom endpoint | 1 <br/><br/>This limit isn't adjustable. | 100 (default value)<br/><br/>The rate is adjustable for Standard (S0) resources. See [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#speech-to-text-increase-online-transcription-concurrent-request-limit). |
 
 #### Batch transcription
 
@@ -66,13 +66,13 @@ The limits in this table apply per Speech resource when you create a Custom Spee
 
 ### Text-to-speech quotas and limits per resource
 
-This section describes text-to-speech quotas and limits per Speech resource.
+This section describes text-to-speech quotas and limits per Speech resource. Unless otherwise specified, the limits aren't adjustable.
 
 #### Common text-to-speech quotas and limits
 
 | Quota | Free (F0) | Standard (S0) |
 |--|--|--|
-| Maximum number of transactions per time period for prebuilt neural voices and custom neural voices. | 20 transactions per 60 seconds | 200 transactions per second (TPS) (default value)<br/><br/>The rate is adjustable up to 1000 TPS for Standard (S0) resources. See [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#text-to-speech-increase-concurrent-request-limit). |
+| Maximum number of transactions per time period for prebuilt neural voices and custom neural voices. | 20 transactions per 60 seconds<br/><br/>This limit isn't adjustable. | 200 transactions per second (TPS) (default value)<br/><br/>The rate is adjustable up to 1000 TPS for Standard (S0) resources. See [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#text-to-speech-increase-concurrent-request-limit). |
 | Max audio length produced per request | 10 min | 10 min |
 | Max total number of distinct `<voice>` and `<audio>` tags in SSML | 50 | 50 |
 | Max SSML message size per turn for websocket | 64 KB | 64 KB |
@@ -101,12 +101,12 @@ This section describes text-to-speech quotas and limits per Speech resource.
 This section describes speech translation quotas and limits per Speech resource.
 
 > [!IMPORTANT]
-> These limits apply to concurrent speech-to-text [online transcription](#online-transcription) requests and [speech translation](#speech-translation-quotas-and-limits-per-resource) requests combined. For example, if you have 50 concurrent speech-to-text requests and 50 concurrent speech translation requests, you'll reach the limit of 100 concurrent requests.
+> These limits apply to concurrent speech-to-text [online transcription](#online-transcription) requests and [speech translation](#speech-translation-quotas-and-limits-per-resource) requests combined. For example, if you have 60 concurrent speech-to-text requests and 40 concurrent speech translation requests, you'll reach the limit of 100 concurrent requests.
 
 | Quota | Free (F0) | Standard (S0) |
 |--|--|--|
-| Concurrent request limit - base model endpoint | 1 | 100 (default value) |
-| Concurrent request limit - custom endpoint | 1 | 100 (default value) |
+| Concurrent request limit - base model endpoint | 1 <br/><br/>This limit isn't adjustable. | 100 (default value)<br/><br/>The rate is adjustable for Standard (S0) resources. See [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#speech-to-text-increase-online-transcription-concurrent-request-limit). |
+| Concurrent request limit - custom endpoint | 1 <br/><br/>This limit isn't adjustable. | 100 (default value)<br/><br/>The rate is adjustable for Standard (S0) resources. See [additional explanations](#detailed-description-quota-adjustment-and-best-practices), [best practices](#general-best-practices-to-mitigate-throttling-during-autoscaling), and [adjustment instructions](#speech-to-text-increase-online-transcription-concurrent-request-limit). |
 
 ### Speaker recognition quotas and limits per resource
 
@@ -139,14 +139,12 @@ The next sections describe specific cases of adjusting quotas.
 
 ### Speech-to-text: increase online transcription concurrent request limit
 
-By default, the number of concurrent requests is limited to 100 per resource in the base model, and 100 per custom endpoint in the custom model. For the standard pricing tier, you can increase this amount. Before submitting the request, ensure that you're familiar with the material discussed earlier in this article, such as the best practices to mitigate throttling.
+By default, the number of concurrent speech-to-text [online transcription](#online-transcription) requests and [speech translation](#speech-translation-quotas-and-limits-per-resource) requests combined is limited to 100 per resource in the base model, and 100 per custom endpoint in the custom model. For the standard pricing tier, you can increase this amount. Before submitting the request, ensure that you're familiar with the material discussed earlier in this article, such as the best practices to mitigate throttling.
 
 >[!NOTE]
-> If you use custom models, be aware that one Speech service resource might be associated with many custom endpoints hosting many custom model deployments. Each custom endpoint has the default limit of concurrent requests (100) set by creation. If you need to adjust it, you need to make the adjustment of each custom endpoint *separately*. Note also that the value of the limit of concurrent requests for the base model of a resource has *no* effect to the custom endpoints associated with this resource.
+> Concurrent request limits for base and custom models need to be adjusted separately. You can have a Speech service resource that's associated with many custom endpoints hosting many custom model deployments. As needed, the limit adjustments per custom endpoint must be requested separately.
 
-Increasing the limit of concurrent requests doesn't directly affect your costs. Speech service uses a payment model that requires that you pay only for what you use. The limit defines how high the service can scale before it starts throttle your requests.
-
-Concurrent request limits for base and custom models need to be adjusted separately.
+Increasing the limit of concurrent requests doesn't directly affect your costs. The Speech service uses a payment model that requires that you pay only for what you use. The limit defines how high the service can scale before it starts throttle your requests.
 
 You aren't able to see the existing value of the concurrent request limit parameter in the Azure portal, the command-line tools, or API requests. To verify the existing value, create an Azure support request.
 