Throttling

HeidiSteen · HeidiSteen · commit 7a81fc2c3d66 · 2024-06-13T08:49:10.000-07:00
diff --git a/articles/search/search-limits-quotas-capacity.md b/articles/search/search-limits-quotas-capacity.md
@@ -8,7 +8,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 05/21/2024
+ms.date: 06/13/2024
 ms.custom:
   - references_regions
   - build-2024
@@ -230,6 +230,10 @@ Static rate request limits for operations related to a service:
 
 + Service Statistics (GET /servicestats): 4 per second per search unit
 
+L2 reranking using the semantic ranker has an expected query volume:
+
++ Up to 10 concurrent queries per replica. If you anticipate consistent throughput requirements near, at, or higher than this level, please file a support ticket so that we can provision for your workload.
+
 ## API request limits
 
 * Maximum of 16 MB per request <sup>1</sup>
diff --git a/articles/search/semantic-how-to-configure.md b/articles/search/semantic-how-to-configure.md
@@ -10,24 +10,26 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 02/08/2024
+ms.date: 06/13/2024
 ---
 
 # Configure semantic ranking and return captions in search results
 
-In this article, learn how to invoke a semantic ranking over a result set, promoting the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
+This article explains how to configure a search index for semantic reranking. 
+
+Semantic ranking iterates over an initial result set, applying an L2 ranking methodology that promotes the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
 
 ## Prerequisites
 
-+ A search service on Basic, Standard tier (S1, S2, S3), or Storage Optimized tier (L1, L2), subject to [region availability](https://azure.microsoft.com/global-infrastructure/services/?products=search).
++ A search service on the basic tier or higher, subject to [region availability](https://azure.microsoft.com/global-infrastructure/services/?products=search).
 
 + Semantic ranker [enabled on your search service](semantic-how-to-enable-disable.md).
 
-+ An existing search index with rich text content. Semantic ranking applies to text (nonvector) fields and works best on content that is informational or descriptive.
++ An existing search index with rich text content. Semantic ranking applies to string (nonvector) fields and works best on content that is informational or descriptive.
 
 ## Choose a client
 
-Choose a search client that supports semantic ranking. Here are some options:
+You can use any of the following tools and SDKs to add a semantic configuration:
 
 + [Azure portal](https://portal.azure.com), using the index designer to add a semantic configuration.
 + [Visual Studio Code](https://code.visualstudio.com/download) with the [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client)
diff --git a/articles/search/semantic-how-to-query-request.md b/articles/search/semantic-how-to-query-request.md
@@ -10,16 +10,18 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 02/08/2024
+ms.date: 06/13/2024
 ---
 
 # Create a semantic query in Azure AI Search
 
-In this article, learn how to invoke a semantic ranking over a result set, promoting the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
+This article explains how to invoke the semantic ranker on queries. You can apply semantic ranking to text queries, hybrid queries, and vector queries, provided that your search documents contain string fields and, for vector queries, that the [vector query has a text representation](vector-search-how-to-query.md#query-with-integrated-vectorization-preview).
+
+Semantic ranking iterates over an initial result set, applying an L2 ranking methodology that promotes the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
 
 ## Prerequisites
 
-+ A search service, Basic tier or higher, with [semantic ranking](semantic-how-to-enable-disable.md).
++ A search service, basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
 
 + An existing search index with a [semantic configuration](semantic-how-to-configure.md) and rich text content.
 
@@ -30,7 +32,7 @@ In this article, learn how to invoke a semantic ranking over a result set, promo
 
 ## Choose a client
 
-Choose a search client that supports semantic ranking. Here are some options:
+You can use any of the following tools and SDKs to build a query that uses semantic ranking:
 
 + [Azure portal](https://portal.azure.com), using the index designer to add a semantic configuration.
 + [Visual Studio Code](https://code.visualstudio.com/download) with a [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client)
@@ -86,7 +88,7 @@ In this step, add parameters to the query request. To be successful, your query
         "count": true
     }
    ```
-   
+
 ### [**REST API**](#tab/rest-query)
 
 Use [Search Documents](/rest/api/searchservice/documents/search-post) to formulate the request.
@@ -220,6 +222,14 @@ The response for the above example query returns the following match as the top
 ]
 ```
 
+## Expected workloads
+
+For semantic ranking, you should expect a search service to support up to 10 concurrent queries per replica. 
+
+The service throttles semantic ranking requests if volumes are too high. An error message that includes `Error in search query: Operation returned an invalid status 'Partial Content'`, together with `@search.semanticPartialResponseReason` set to `CapacityOverloaded`, indicates that the service is at capacity for semantic ranking.
+
+If you anticipate consistent throughput requirements near, at, or higher than 10 concurrent queries per replica, please file a support ticket so that we can provision for your workload.
+
 ## Next steps
 
 Semantic ranking can be used in hybrid queries that combine keyword search and vector search into a single request and a unified response.
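
Reviewer note: the "Expected workloads" text added in this commit describes throttling but not how a client might react to it. The following is a minimal sketch, not part of the commit, of detecting the capacity signal and retrying with exponential backoff. It assumes the response has already been parsed into a dictionary and that the throttle reason appears under the `@search.semanticPartialResponseReason` key. The names `send_query`, `query_with_backoff`, and `is_semantic_capacity_overloaded`, and the retry defaults, are hypothetical and chosen only for illustration.

```python
import time
from typing import Any, Callable, Dict

# Key named in the "Expected workloads" section of this commit.
PARTIAL_REASON_KEY = "@search.semanticPartialResponseReason"


def is_semantic_capacity_overloaded(response: Dict[str, Any]) -> bool:
    """Return True if the parsed response reports semantic ranker overload."""
    reason = response.get(PARTIAL_REASON_KEY)
    # Compare case-insensitively; the exact casing of the value is an assumption.
    return isinstance(reason, str) and reason.lower() == "capacityoverloaded"


def query_with_backoff(
    send_query: Callable[[], Dict[str, Any]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send_query (a hypothetical callable that issues one semantic query
    and returns the parsed JSON body), retrying while the service reports
    that semantic ranking is at capacity."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response: Dict[str, Any] = {}
    for attempt in range(max_attempts):
        response = send_query()
        if not is_semantic_capacity_overloaded(response):
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    # Still overloaded after the last attempt; return the partial response as-is.
    return response
```

Backoff only smooths short bursts. For sustained volume near the documented limit, the commit's guidance to file a support ticket still applies.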