Merge pull request #278136 from HeidiSteen/heidist-semantic

prmerger-automator[bot] · web-flow · commit dbeb4d6e54bc · 2024-06-13T16:32:05.000Z
[azure search] Semantic scoring distribution + throttling
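
The throttling guidance this change adds to semantic-how-to-query-request.md could be handled on the client roughly as follows. This is a minimal sketch, not the documented approach. It assumes a throttled response arrives as HTTP 206 (Partial Content) with `@search.semanticPartialResponseReason` set to `CapacityOverloaded`, matched case-insensitively. `send_query`, `is_semantic_throttled`, and `query_with_backoff` are hypothetical names introduced for illustration, and the retry counts and delays are arbitrary.

```python
import time
from typing import Any, Callable, Dict, Tuple

# Response property named in the article's error text.
PARTIAL_REASON_KEY = "@search.semanticPartialResponseReason"

Response = Tuple[int, Dict[str, Any]]


def is_semantic_throttled(status_code: int, body: Dict[str, Any]) -> bool:
    """Return True when a response looks like a semantic ranker capacity error.

    Assumption: throttling shows up as 206 Partial Content plus a
    'CapacityOverloaded' reason; casing is not relied upon.
    """
    reason = str(body.get(PARTIAL_REASON_KEY, ""))
    return status_code == 206 and reason.lower() == "capacityoverloaded"


def query_with_backoff(
    send_query: Callable[[], Response],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send_query, retrying with exponential backoff while throttled.

    send_query is a hypothetical callable that issues the search request and
    returns (status_code, parsed_json_body). The last response is returned
    even if it is still throttled, so the caller can fall back to the
    partially ranked results.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status_code, body = send_query()
        if not is_semantic_throttled(status_code, body):
            return status_code, body
        if attempt < max_attempts - 1:
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return status_code, body
```

Backoff only smooths over short bursts. For sustained volume near 10 concurrent queries per replica, the article's guidance to file a support ticket still applies.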
diff --git a/articles/search/search-limits-quotas-capacity.md b/articles/search/search-limits-quotas-capacity.md
@@ -8,7 +8,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 05/21/2024
+ms.date: 06/13/2024
 ms.custom:
   - references_regions
   - build-2024
@@ -230,6 +230,10 @@ Static rate request limits for operations related to a service:
 
 + Service Statistics (GET /servicestats): 4 per second per search unit
 
+The semantic ranker (L2 reranking) has an expected query volume:
+
++ Up to 10 concurrent queries per replica. If you anticipate consistent throughput requirements near, at, or higher than this level, please file a support ticket so that we can provision for your workload.
+
 ## API request limits
 
 * Maximum of 16 MB per request <sup>1</sup>
diff --git a/articles/search/semantic-how-to-configure.md b/articles/search/semantic-how-to-configure.md
@@ -10,24 +10,26 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 02/08/2024
+ms.date: 06/13/2024
 ---
 
 # Configure semantic ranking and return captions in search results
 
-In this article, learn how to invoke a semantic ranking over a result set, promoting the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
+This article explains how to configure a search index for semantic reranking.
+
+Semantic ranking iterates over an initial result set, applying an L2 ranking methodology that promotes the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
 
 ## Prerequisites
 
-+ A search service on Basic, Standard tier (S1, S2, S3), or Storage Optimized tier (L1, L2), subject to [region availability](https://azure.microsoft.com/global-infrastructure/services/?products=search).
++ A search service on a basic tier or higher, subject to [region availability](https://azure.microsoft.com/global-infrastructure/services/?products=search).
 
 + Semantic ranker [enabled on your search service](semantic-how-to-enable-disable.md).
 
-+ An existing search index with rich text content. Semantic ranking applies to text (nonvector) fields and works best on content that is informational or descriptive.
++ An existing search index with rich text content. Semantic ranking applies to string (nonvector) fields and works best on content that is informational or descriptive.
 
 ## Choose a client
 
-Choose a search client that supports semantic ranking. Here are some options:
+You can use any of the following tools and SDKs to add a semantic configuration:
 
 + [Azure portal](https://portal.azure.com), using the index designer to add a semantic configuration.
 + [Visual Studio Code](https://code.visualstudio.com/download) with the [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client)
diff --git a/articles/search/semantic-how-to-query-request.md b/articles/search/semantic-how-to-query-request.md
@@ -10,16 +10,18 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 02/08/2024
+ms.date: 06/13/2024
 ---
 
 # Create a semantic query in Azure AI Search
 
-In this article, learn how to invoke a semantic ranking over a result set, promoting the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
+This article explains how to invoke the semantic ranker on queries. You can apply semantic ranking to text queries, hybrid queries, and vector queries if your search documents contain string fields and the [vector query has a text representation](vector-search-how-to-query.md#query-with-integrated-vectorization-preview).
+
+Semantic ranking iterates over an initial result set, applying an L2 ranking methodology that promotes the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and [semantic answers](semantic-answers.md).
 
 ## Prerequisites
 
-+ A search service, Basic tier or higher, with [semantic ranking](semantic-how-to-enable-disable.md).
++ A search service, basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
 
 + An existing search index with a [semantic configuration](semantic-how-to-configure.md) and rich text content.
 
@@ -30,7 +32,7 @@ In this article, learn how to invoke a semantic ranking over a result set, promo
 
 ## Choose a client
 
-Choose a search client that supports semantic ranking. Here are some options:
+You can use any of the following tools and SDKs to build a query that uses semantic ranking:
 
 + [Azure portal](https://portal.azure.com), using the index designer to add a semantic configuration.
 + [Visual Studio Code](https://code.visualstudio.com/download) with a [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client)
@@ -86,7 +88,7 @@ In this step, add parameters to the query request. To be successful, your query
         "count": true
     }
    ```
-   
+
 ### [**REST API**](#tab/rest-query)
 
 Use [Search Documents](/rest/api/searchservice/documents/search-post) to formulate the request.
@@ -220,6 +222,20 @@ The response for the above example query returns the following match as the top
 ]
 ```
 
+## Expected workloads
+
+For semantic ranking, you should expect a search service to support up to 10 concurrent queries per replica.
+
+The service throttles semantic ranking requests if volumes are too high. An error message that includes these phrases indicates the service is at capacity for semantic ranking:
+
+```json
+Error in search query: Operation returned an invalid status 'Partial Content'
+@search.semanticPartialResponseReason
+CapacityOverloaded
+```
+
+If you anticipate consistent throughput requirements near, at, or higher than this level, please file a support ticket so that we can provision for your workload.
+
 ## Next steps
 
 Semantic ranking can be used in hybrid queries that combine keyword search and vector search into a single request and a unified response.
diff --git a/articles/search/semantic-search-overview.md b/articles/search/semantic-search-overview.md
@@ -10,12 +10,12 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: conceptual
-ms.date: 02/08/2024
+ms.date: 06/12/2024
 ---
 
 # Semantic ranking in Azure AI Search
 
-In Azure AI Search, *semantic ranking* measurably improves search relevance by using language understanding to rerank search results. This article is a high-level introduction. The section at the end covers [availability and pricing](#availability-and-pricing).
+In Azure AI Search, *semantic ranking* is a feature that measurably improves search relevance by using Microsoft's language understanding models to rerank search results. This article is a high-level introduction. The section at the end covers [availability and pricing](#availability-and-pricing).
 
 Semantic ranker is a premium feature, billed by usage. We recommend this article for background, but if you'd rather get started, follow these steps:
 
@@ -32,7 +32,7 @@ Semantic ranker is a premium feature, billed by usage. We recommend this article
 
 ## What is semantic ranking?
 
-Semantic ranker is a collection of query-related capabilities that improve the quality of an initial [BM25-ranked](index-similarity-and-scoring.md) or [RRF-ranked](hybrid-search-ranking.md) search result for text-based queries. When you enable it on your search service, semantic ranking extends the query execution pipeline in two ways: 
+Semantic ranker is a collection of query-side capabilities that improve the quality of an initial [BM25-ranked](index-similarity-and-scoring.md) or [RRF-ranked](hybrid-search-ranking.md) search result for text-based queries. When you enable it on your search service, semantic ranking extends the query execution pipeline in two ways: 
 
 * First, it adds secondary ranking over an initial result set that was scored using BM25 or RRF. This secondary ranking uses multi-lingual, deep learning models adapted from Microsoft Bing to promote the most semantically relevant results. 
 
@@ -101,7 +101,7 @@ Scoring is done over the caption, and any other content from the summary string
 1. Matches are listed in descending order by score and included in the query response payload. The payload includes answers, plain text and highlighted captions, and any fields that you marked as retrievable or specified in a select clause.
 
 > [!NOTE]
-> Beginning on July 14, 2023, the **@search.rerankerScore** distribution is changing. The effect on scores can't be determined except through testing. If you have a hard threshold dependency on this response property, rerun your tests to understand what the new values should be for your threshold.
+> For any given query, the distribution of **@search.rerankerScore** can vary slightly due to conditions at the infrastructure level. Ranking model updates have also been known to affect the distribution. For these reasons, if you're writing custom code for minimum thresholds, or [setting the threshold property](vector-search-how-to-query.md#set-thresholds-to-exclude-low-scoring-results-preview) for vector and hybrid queries, don't make the limits too granular.
 
 ## Semantic capabilities and limitations