Merge pull request #7344 from HeidiSteen/heidist-0901

JamesJBarnett · web-flow · commit c703a19b4bf2 · 2025-09-29T13:53:42.000-07:00
vector rescoring updates
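
The generalized rescoring process touched by the `vector-search-how-to-quantization.md` change (query the compressed vectors, oversample, rescore, reorder) can be pictured with a small, self-contained sketch. This is illustrative only and is not how Azure AI Search implements rescoring: the function names, the one-bit-per-dimension quantizer, the matching-bits candidate score, and the dot-product rescoring metric are all assumptions chosen to keep the example short.

```python
import random


def binary_quantize(vec):
    """Hypothetical quantizer: keep one bit per dimension (1 if positive)."""
    return [1 if x > 0 else 0 for x in vec]


def dot(a, b):
    """Full-precision dot product, used here as the rescoring metric (assumption)."""
    return sum(x * y for x, y in zip(a, b))


def matching_bits(a, b):
    """Count of equal bits; a cheap similarity over compressed vectors."""
    return sum(1 for x, y in zip(a, b) if x == y)


def search_with_rescoring(query, originals, compressed, k, oversampling):
    # Steps 1-2: search the compressed vectors and keep k * oversampling candidates.
    n_candidates = min(len(compressed), int(k * oversampling))
    query_bits = binary_quantize(query)
    candidates = sorted(
        range(len(compressed)),
        key=lambda i: matching_bits(query_bits, compressed[i]),
        reverse=True,
    )[:n_candidates]
    # Step 3: rescore only the candidates against the full-precision originals.
    rescored = sorted(candidates, key=lambda i: dot(query, originals[i]), reverse=True)
    # Step 4: return the top k in rescored order.
    return rescored[:k]


random.seed(0)
originals = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(200)]
compressed = [binary_quantize(v) for v in originals]
query = [random.uniform(-1, 1) for _ in range(16)]

top = search_with_rescoring(query, originals, compressed, k=5, oversampling=4.0)
```

The sketch rescores with the full-precision originals, which loosely corresponds to the `preserveOriginals` case. If the originals were discarded, step 3 would have to score the compressed candidates against the query vector instead.
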
diff --git a/articles/search/index-add-scoring-profiles.md b/articles/search/index-add-scoring-profiles.md
@@ -9,7 +9,7 @@ ms.service: azure-ai-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 07/25/2025
+ms.date: 09/29/2025
 ms.update-cycle: 365-days
 ---
 
@@ -31,11 +31,11 @@ You can add a scoring profile to an index by editing its JSON definition in the
 
 ## Rules for scoring profiles
 
-You can use scoring profiles in [keyword search](search-lucene-query-architecture.md), [vector search](vector-search-overview.md), [hybrid search](hybrid-search-overview.md), and [semantic search (reranking)](semantic-search-overview.md). However, scoring profiles only apply to nonvector fields, so make sure your index has text or numeric fields that can be boosted or weighted. 
+You can use scoring profiles in [keyword search](search-lucene-query-architecture.md), [vector search](vector-search-overview.md), [hybrid search](hybrid-search-overview.md), and [semantic reranking](semantic-search-overview.md). However, scoring profiles only apply to nonvector fields, so make sure your index has text or numeric fields that can be boosted or weighted.
 
 You can have up to 100 scoring profiles within an index (see [service limits](search-limits-quotas-capacity.md)), but you can only specify one profile at a time in any given query.
 
-You can use [semantic ranker](semantic-how-to-query-request.md) with scoring profiles. Currently in preview, you can apply a [scoring profile after semantic ranking](semantic-how-to-enable-scoring-profiles.md). Otherwise, when multiple ranking or relevance features are in play, semantic ranking is the last step. [How search scoring works](search-relevance-overview.md#diagram-of-ranking-algorithms) provides an illustration.
+You can use [semantic ranker](semantic-how-to-query-request.md) with scoring profiles and apply a [scoring profile after semantic ranking](semantic-how-to-enable-scoring-profiles.md). Otherwise, when multiple ranking or relevance features are in play, semantic ranking is the last step. [How search scoring works](search-relevance-overview.md#diagram-of-ranking-algorithms) provides an illustration.
 
 [Extra rules](#rules-for-using-functions) apply specifically to functions.
 
diff --git a/articles/search/vector-search-how-to-quantization.md b/articles/search/vector-search-how-to-quantization.md
@@ -64,14 +64,14 @@ It's particularly effective for embeddings with dimensions greater than 1024. Fo
 
 Rescoring is an optional technique used to offset information loss due to vector quantization. During query execution, it uses oversampling to pick up extra vectors, and supplemental information to rescore initial results found by the query. Supplemental information is either uncompressed original full-precision vectors - or for binary quantization only - you have the option of rescoring using the binary quantized document candidates against the query vector. 
 
-Only HNSW graphs allow rescoring. Exhaustive KNN doesn't support rescoring because by definition, all vectors are scanned at query time, which makes oversampling irrelevant.
+Only HNSW graphs allow rescoring. Exhaustive KNN doesn't support rescoring because by definition, all vectors are scanned at query time, which makes rescoring and oversampling irrelevant.
 
 Rescoring options are specified in the index, but you can invoke rescoring at query time by adding the oversampling query parameter.
 
 | Object | Properties |
 |--------|------------|
-| Index | Add [`RescoringOptions`](/rest/api/searchservice/indexes/create-or-update#rescoringoptions) to the vector compressions section: `rescoringOptions.enableRescoring` (true or false), `rescoringOptions.defaultOversampling` (an integer), `rescoringOptions.rescoreStorageMethod` (preserveOriginals or discardOriginals). We recommend preserveOriginals for scalar quantization and discardOriginals for binary quantization. |
-| Query | Add `oversampling` on [`RawVectorQuery`](/rest/api/searchservice/documents/search-post#rawvectorquery) or [`VectorizableTextQuery`](/rest/api/searchservice/documents/search-post#vectorizabletextquery) definitions. |
+| Index | Add [`RescoringOptions`](/rest/api/searchservice/indexes/create-or-update#rescoringoptions) to the vector compressions section. The examples in this article use `RescoringOptions`. |
+| Query | Add `oversampling` on [`RawVectorQuery`](/rest/api/searchservice/documents/search-post#rawvectorquery) or [`VectorizableTextQuery`](/rest/api/searchservice/documents/search-post#vectorizabletextquery) definitions. Adding `oversampling` invokes rescoring at query time. |
 
 > [!NOTE]
 > Rescoring parameter names have changed over the last several releases. If you're using an older preview API, review the [upgrade instructions](search-api-migration.md#upgrade-to-2024-11-01-preview) for addressing breaking changes.
@@ -80,9 +80,11 @@ The generalized process for rescoring is:
 
 1. The vector query executes over compressed vector fields.
 1. The vector query returns the top k oversampled candidates.
-1. Oversampled k candidates are rescored using either the uncompressed original vectors, or the dot product of binary quantization.
+1. Oversampled k candidates are rescored using either the uncompressed original vectors for scalar quantization, or the dot product of binary quantization.
 1. After rescoring, results are adjusted so that more relevant matches appear first.
 
+Oversampling for scalar quantized vectors requires the original full-precision vectors. Oversampling for binary quantized vectors can use either the full-precision vectors (`preserveOriginals`) or the dot product of the binary vector (`discardOriginals`). If you're optimizing vector storage, keep the full-precision vectors in the index if you need them for rescoring. For more information, see [Eliminate optional vector instances from storage](vector-search-how-to-storage-options.md).
+
 ## Add "compressions" to a search index
 
 This section explains how to specify a `vectorSearch.compressions` section in the index. The following example shows a partial index definition with a fields collection that includes a vector field.
diff --git a/articles/search/vector-search-how-to-query.md b/articles/search/vector-search-how-to-query.md
@@ -133,9 +133,9 @@ api-key: {{admin-api-key}}
 }
 ```
 
-### [**2024-05-01-preview**](#tab/query-2024-05-01-preview)
+### [**2025-08-01-preview**](#tab/query-2025-08-01-preview)
 
-[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) is the latest preview API version of [Search - POST](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&tabs=HTTP&preserve-view=true). It supports the same vector query syntax as **2025-09-01**, but it has extra parameters for hybrid search and minimum thresholds for excluding weaker results.
+[**2025-08-01-preview**](/rest/api/searchservice/search-service-api-versions#2025-08-01-preview) is the latest preview API version of [Search - POST](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-08-01-preview&tabs=HTTP&preserve-view=true). It supports the same vector query syntax as **2025-09-01**, but it has extra parameters for hybrid search and minimum thresholds for excluding weaker results.
 
 This preview adds:
 
@@ -145,7 +145,7 @@ This preview adds:
 In the following example, the vector is a representation of this string: `"what Azure services support full text search"`. The query targets the `contentVector` field and returns `k` results. The actual vector has 1,536 embeddings, which are trimmed in this example for readability.
 
 ```http
-POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2024-05-01-preview
+POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2025-08-01-preview
 Content-Type: application/json
 api-key: {{admin-api-key}}
 {
diff --git a/articles/search/vector-search-how-to-storage-options.md b/articles/search/vector-search-how-to-storage-options.md