Commit ac6062e

Merge pull request #1458 from mattgotteiner/matt/fix-semantic-ranker-throttling-docs
Throttling doc fixes for semantic ranker
2 parents 8b0dfe9 + 4da150a commit ac6062e

1 file changed: +15 -2 lines changed

articles/search/search-limits-quotas-capacity.md

Lines changed: 15 additions & 2 deletions
@@ -197,9 +197,22 @@ Static rate request limits for operations related to a service:
 
 + Service Statistics (GET /servicestats): 4 per second per search unit
 
-L2 reranking using the semantic reranker has an expected volume:
+### Semantic Ranker Throttling limits
 
-+ Up to 10 concurrent queries per replica. If you anticipate consistent throughput requirements near, at, or higher than this level, please file a support ticket so that we can provision for your workload.
+[Semantic ranker](search-get-started-semantic.md) uses a queuing system to manage concurrent requests. This system allows search services to get the highest possible number of queries per second. When the limit of concurrent requests is reached, additional requests are placed in a queue. If the queue is full, further requests are rejected and must be retried.
+
+Total semantic ranker queries per second varies based on the following factors:
++ The SKU of the search service. Both queue capacity and concurrent request limits vary by SKU.
++ The number of search units in the search service. The simplest way to increase the maximum number of concurrent semantic ranker queries is to [add additional search units to your search service](search-capacity-planning.md#how-to-change-capacity).
++ The total available semantic ranker capacity in the region.
++ The amount of time it takes to serve a query using semantic ranker. This varies based on how busy the search service is.
+
+The following table describes the semantic ranker throttling limits by SKU. Subject to available capacity in the region, contact support to request a limit increase.
+
+| Resource | Basic | S1 | S2 | S3 | S3-HD | L1 | L2 |
+|----------|-------|----|----|----|-------|----|----|
+| Maximum Concurrent Requests (per Search Unit) | 2 | 3 | 4 | 4 | 4 | 4 | 4 |
+| Maximum Request Queue Size (per Search Unit) | 4 | 6 | 8 | 8 | 8 | 8 | 8 |
 
 ## API request limits
 

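As a rough illustration of the queuing behavior the added section describes, the sketch below models how a new semantic ranker request might be classified, using the per-search-unit limits from the table in the diff. It is not the service's implementation. The names `LIMITS_PER_SEARCH_UNIT`, `service_limits`, and `admit` are hypothetical, and the sketch assumes that the per-search-unit limits add up linearly across search units, which the documentation change implies but does not state outright.

```python
# Per-search-unit limits copied from the table in the diff:
# SKU -> (maximum concurrent requests, maximum request queue size).
LIMITS_PER_SEARCH_UNIT = {
    "Basic": (2, 4),
    "S1": (3, 6),
    "S2": (4, 8),
    "S3": (4, 8),
    "S3-HD": (4, 8),
    "L1": (4, 8),
    "L2": (4, 8),
}


def service_limits(sku: str, search_units: int) -> tuple[int, int]:
    """Return (max concurrent, max queued) for a whole search service.

    Assumption (not confirmed by the source): per-search-unit limits
    scale linearly with the number of search units.
    """
    if search_units < 1:
        raise ValueError("search_units must be at least 1")
    concurrent, queued = LIMITS_PER_SEARCH_UNIT[sku]
    return concurrent * search_units, queued * search_units


def admit(in_flight: int, queued: int, sku: str, search_units: int) -> str:
    """Classify a new request as 'run', 'queue', or 'reject'."""
    max_concurrent, max_queued = service_limits(sku, search_units)
    if in_flight < max_concurrent:
        return "run"      # below the concurrency limit
    if queued < max_queued:
        return "queue"    # concurrency limit reached, queue has room
    return "reject"       # queue is full; the caller must retry later
```

Under these assumptions, an S1 service with 3 search units would run up to 9 requests at once and queue up to 18 more, so `admit(9, 18, "S1", 3)` returns `"reject"`. A client receiving a rejection would typically wait and retry, as the added text requires.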