Commit c0a9270

Addressed Robert's feedback
1 parent ff281b5 commit c0a9270

3 files changed: +70 -57 lines changed

articles/search/search-api-migration.md

Lines changed: 2 additions & 2 deletions
@@ -77,7 +77,7 @@ See [Migrate from preview version](semantic-how-to-configure.md#migrate-from-pre
7777

7878
[`2024-11-01-preview`](/rest/api/searchservice/search-service-api-versions#2024-11-01-preview) query rewrite, Document Layout skill, keyless billing for skills processing, Markdown parsing mode, and rescoring options for compressed vectors.
7979

80-
If you're upgrading from `2024-09-01-preview`, you can use the new preview APIs with no change to existing code. However, the new version introduces change to vectorSearch.compressions, replacing `rerankWithOriginalVectors` with `enableRescoring`, and moving `defaultOversampling` to a new `rescoringOptions` property object. For a comparison of the syntax, see [Compress vectors using scalar or binary quantization](vector-search-how-to-quantization.md#add-compressions-to-a-search-index).
80+
If you're upgrading from `2024-09-01-preview`, you can use the new preview APIs with no change to existing code. However, the new version introduces changes to `vectorSearch.compressions`, replacing `rerankWithOriginalVectors` with `enableRescoring`, and moving `defaultOversampling` to a new `rescoringOptions` property object. For a comparison of the syntax, see [Compress vectors using scalar or binary quantization](vector-search-how-to-quantization.md#add-compressions-to-a-search-index).
8181

8282
## Upgrade to 2024-09-01-preview
8383
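The `vectorSearch.compressions` change described in the hunk above can be sketched as a small dictionary transform. This is a minimal illustration, not an official migration tool: the helper name `migrate_compression` and the sample entry are hypothetical, and it assumes that `enableRescoring` sits inside the new `rescoringOptions` object alongside `defaultOversampling`.

```python
import copy


def migrate_compression(old: dict) -> dict:
    """Return a copy of a compression entry reshaped for the newer preview.

    Hypothetical helper. Assumes `enableRescoring` and `defaultOversampling`
    both live under `rescoringOptions` in the newer API version.
    """
    new = copy.deepcopy(old)
    rescoring = {}
    # `rerankWithOriginalVectors` is replaced by `enableRescoring`.
    if "rerankWithOriginalVectors" in new:
        rescoring["enableRescoring"] = new.pop("rerankWithOriginalVectors")
    # `defaultOversampling` moves into the `rescoringOptions` object.
    if "defaultOversampling" in new:
        rescoring["defaultOversampling"] = new.pop("defaultOversampling")
    if rescoring:
        new["rescoringOptions"] = rescoring
    return new


# Illustrative entry only; the name and values are made up.
old_entry = {
    "name": "my-scalar-quantization",
    "kind": "scalarQuantization",
    "rerankWithOriginalVectors": True,
    "defaultOversampling": 10.0,
}
new_entry = migrate_compression(old_entry)
print(new_entry)
```

The helper copies its input so the original definition stays available for comparison while you test the new shape.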

@@ -129,7 +129,7 @@ If you're upgrading from the previous version, the next section has the steps.
129129

130130
## Upgrade from 2023-07-01-preview
131131

132-
Do not use this API version. It implements a vector query syntax that's incompatible with any newer API version.
132+
Don't use this API version. It implements a vector query syntax that's incompatible with any newer API version.
133133

134134
`2023-07-01-preview` is now deprecated, so you shouldn't base new code on this version, nor should you upgrade *to* this version under any circumstances. This section explains the migration path from `2023-07-01-preview` to any newer API version.
135135

articles/search/vector-search-how-to-quantization.md

Lines changed: 62 additions & 49 deletions
@@ -28,7 +28,7 @@ To use built-in quantization, follow these steps:
2828
2929
## Prerequisites
3030

31-
- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the HNSW algorithm and a new vector profile.
31+
- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the Hierarchical Navigable Small Worlds (HNSW) algorithm and a new vector profile.
3232

3333
## Supported quantization techniques
3434

@@ -152,7 +152,7 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-11-01-pre
152152

153153
- `rescoringOptions` are a collection of properties used to offset lossy compression by rescoring query results using the original full-precision vectors that exist prior to quantization. For rescoring to work, you must have the vector instance that provides this content. Setting `rescoreStorageMethod` to `discardOriginals` prevents you from using `enableRescoring` or `defaultOversampling`. For more information about vector storage, see [Eliminate optional vector instances from storage](vector-search-how-to-storage-options.md).
154154

155-
- `"enableRescoring": "preserveOriginals"` is the API equivalent of `"rerankWithOriginalVectors": true`. Rescoring vector search results with the original full-precision vectors can result in adjustments to search score and rankings, promoting the more relevant matches as determined by the rescoring step.
155+
- `"rescoreStorageMethod": "preserveOriginals"` is the API equivalent of `"rerankWithOriginalVectors": true`. Rescoring vector search results with the original full-precision vectors can result in adjustments to search score and rankings, promoting the more relevant matches as determined by the rescoring step.
156156

157157
- `defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
158158
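The oversampling arithmetic in the `defaultOversampling` bullet above can be checked with a short sketch. The function name `candidates_for_rescoring` is hypothetical, and rounding up a fractional product is an assumption made here for illustration; the source only states the multiplication.

```python
import math


def candidates_for_rescoring(k: int, oversampling: float) -> int:
    """Number of candidates considered before rescoring: k times the multiplier.

    Hypothetical helper. Rounding up fractional products is an assumption,
    not documented service behavior.
    """
    if k < 1 or oversampling <= 0:
        raise ValueError("k must be at least 1 and oversampling must be positive")
    return math.ceil(k * oversampling)


# The example from the text: k of 5 with an oversampling of 20.
print(candidates_for_rescoring(5, 20))
```

Only the top `k` results are returned after rescoring, so a larger multiplier widens the candidate pool without changing the size of the response.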

@@ -162,9 +162,9 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-11-01-pre
162162

163163
---
164164

165-
## Add the HNSW algorithm
165+
## Add the vector search algorithm
166166

167-
Make sure your index has the Hierarchical Navigable Small Worlds (HNSW) algorithm. Built-in quantization isn't supported with exhaustive KNN.
167+
You can use the HNSW algorithm or exhaustive KNN in the 2024-11-01-preview REST API. For the stable version, use HNSW only.
168168

169169
```json
170170
"vectorSearch": {
@@ -243,61 +243,44 @@ It's particularly effective for embeddings with dimensions greater than 1024. Fo
243243

244244
## Example: vector compression techniques
245245

246-
Here's Python code that demonstrates quantization, [narrow data types](vector-search-how-to-assign-narrow-data-types.md), and use of the [stored property](vector-search-how-to-storage-options.md).
247-
248-
This code is borrowed from [Code sample: Vector quantization and storage options using Python](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/vector-quantization-and-storage/README.md).
249-
250-
This code creates and compares storage and vector index size for each option.
251-
252-
```bash
253-
****************************************
254-
Index Name: compressiontest-baseline
255-
Storage Size: 21.3613MB
256-
Vector Size: 4.8277MB
257-
****************************************
258-
Index Name: compressiontest-compression
259-
Storage Size: 17.7604MB
260-
Vector Size: 1.2242MB
261-
****************************************
262-
Index Name: compressiontest-narrow
263-
Storage Size: 16.5567MB
264-
Vector Size: 2.4254MB
265-
****************************************
266-
Index Name: compressiontest-no-stored
267-
Storage Size: 10.9224MB
268-
Vector Size: 4.8277MB
269-
****************************************
270-
Index Name: compressiontest-all-options
271-
Storage Size: 4.9192MB
272-
Vector Size: 1.2242MB
273-
```
246+
[Code sample: Vector quantization and storage options using Python](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/vector-quantization-and-storage/README.md) provides Python code that demonstrates quantization, [narrow data types](vector-search-how-to-assign-narrow-data-types.md), and use of the [stored property](vector-search-how-to-storage-options.md).
247+
248+
This code creates and compares storage and vector index size for each vector storage optimization option. From these results, you can see that quantization reduces vector size the most, but the greatest storage savings are achieved if you use multiple options.
249+
250+
| Index name | Storage size | Vector size |
251+
|------------|--------------|-------------|
252+
| compressiontest-baseline | 21.3613MB | 4.8277MB |
253+
| compressiontest-scalar-compression | 17.7604MB | 1.2242MB |
254+
| compressiontest-narrow | 16.5567MB | 2.4254MB |
255+
| compressiontest-no-stored | 10.9224MB | 4.8277MB |
256+
| compressiontest-all-options | 4.9192MB | 1.2242MB |
274257

275258
Search APIs report storage and vector size at the index level, so indexes and not fields must be the basis of comparison. Use the [GET Index Statistics](/rest/api/searchservice/indexes/get-statistics) or an equivalent API in the Azure SDKs to obtain vector size.
276259
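To make the comparison in the results table easier to read, the following sketch computes each index's reduction relative to the baseline. The figures are copied from the table; the helper `reduction_pct` is a hypothetical name, and the script does not call the search service.

```python
# (storage size in MB, vector size in MB), copied from the results table.
sizes_mb = {
    "compressiontest-baseline": (21.3613, 4.8277),
    "compressiontest-scalar-compression": (17.7604, 1.2242),
    "compressiontest-narrow": (16.5567, 2.4254),
    "compressiontest-no-stored": (10.9224, 4.8277),
    "compressiontest-all-options": (4.9192, 1.2242),
}


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage reduction of `value` relative to `baseline`."""
    return 100.0 * (1.0 - value / baseline)


base_storage, base_vector = sizes_mb["compressiontest-baseline"]
for name, (storage, vector) in sizes_mb.items():
    print(
        f"{name}: storage -{reduction_pct(base_storage, storage):.1f}%, "
        f"vector -{reduction_pct(base_vector, vector):.1f}%"
    )
```

With these figures, scalar quantization alone shrinks the vector index by roughly three quarters, while combining all options gives the largest drop in total storage.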

277260
## Query a quantized vector field using oversampling
278261

279-
Query syntax for a compressed or quantized vector field is the same as for noncompressed vector fields, unless you want to override parameters associated with oversampling or reranking with original vectors.
262+
Query syntax for a compressed or quantized vector field is the same as for noncompressed vector fields, unless you want to override parameters associated with oversampling or rescoring with original vectors.
280263

281-
Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `rerankWithOriginalVectors` and `defaultOversampling` to mitigate the effects of a smaller vector index. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
264+
### [**2024-07-01**](#tab/query-2024-07-01)
265+
266+
Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `rerankWithOriginalVectors` and `defaultOversampling` to mitigate the effects of lossy compression. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
282267

283268
You can set the oversampling parameter even if the index doesn't explicitly have a `rerankWithOriginalVectors` or `defaultOversampling` definition. Providing `oversampling` at query time overrides the index settings for that query and executes the query with an effective `rerankWithOriginalVectors` as true.
284269

285270
```http
286-
POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-07-01  
287-
  Content-Type: application/json  
288-
  api-key: [admin key]  
289-
290-
{   
291-
"vectorQueries": [
292-
{   
293-
    "kind": "vector",   
294-
    "vector": [8, 2, 3, 4, 3, 5, 2, 1],   
295-
    "fields": "myvector",
296-
    "oversampling": 12.0,
297-
    "k": 5  
298-
}
299-
]   
300-
}
271+
POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-07-01
272+
273+
{   
274+
"vectorQueries": [
275+
{   
276+
    "kind": "vector",   
277+
    "vector": [8, 2, 3, 4, 3, 5, 2, 1],   
278+
    "fields": "myvector",
279+
    "oversampling": 12.0,
280+
    "k": 5  
281+
}
282+
]   
283+
}
301284
```
302285

303286
**Key points**:
@@ -306,6 +289,36 @@ POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?ap
306289

307290
- Overrides the `defaultOversampling` value or introduces oversampling at query time, even if the index's compression configuration didn't specify oversampling or reranking options.
308291

292+
### [**2024-11-01-preview**](#tab/query-2024-11-01-preview)
293+
294+
Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `enableRescoring`, `rescoreStorageMethod`, and `defaultOversampling` to mitigate the effects of lossy compression. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
295+
296+
You can set the oversampling parameter even if the index doesn't explicitly have rescoring options or a `defaultOversampling` definition. Providing `oversampling` at query time overrides the index settings for that query and executes the query with an effective `enableRescoring` as true.
297+
298+
```http
299+
POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-11-01-preview
300+
301+
{   
302+
"vectorQueries": [
303+
{   
304+
    "kind": "vector",   
305+
    "vector": [8, 2, 3, 4, 3, 5, 2, 1],   
306+
    "fields": "myvector",
307+
    "oversampling": 12.0,
308+
    "k": 5  
309+
}
310+
]   
311+
}
312+
```
313+
314+
**Key points**:
315+
316+
- Applies to vector fields that undergo vector compression, per the vector profile assignment.
317+
318+
- Overrides the `defaultOversampling` value or introduces oversampling at query time, even if the index's compression configuration didn't specify oversampling or rescoring options.
319+
320+
---
321+
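The query bodies in both tabs share the same shape, so a small builder can produce them. This is a sketch under stated assumptions: `build_vector_query` is a hypothetical helper, it only constructs the JSON payload shown in the requests above, and it doesn't send anything to the service.

```python
import json


def build_vector_query(vector, field, k, oversampling=None):
    """Build a search request body containing a single vector query.

    Hypothetical helper. `oversampling` is included only when provided, so
    the index-level `defaultOversampling` applies otherwise.
    """
    query = {"kind": "vector", "vector": list(vector), "fields": field, "k": k}
    if oversampling is not None:
        query["oversampling"] = oversampling
    return {"vectorQueries": [query]}


# Same values as the request examples above.
body = build_vector_query([8, 2, 3, 4, 3, 5, 2, 1], "myvector", k=5, oversampling=12.0)
print(json.dumps(body, indent=2))
```

Leaving `oversampling` out of the payload keeps the query on the index defaults, which is useful when you want to compare results with and without a query-time override.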
309322
<!--
310323
RESCORE WITH ORIGINAL VECTORS -- NEEDS AN H2 or H3
311324
It's used to rescore search results obtained using compressed vectors.

articles/search/vector-search-how-to-storage-options.md

Lines changed: 6 additions & 6 deletions
@@ -16,19 +16,19 @@ Azure AI Search stores multiple copies of vector fields that are used in specifi
1616

1717
## Prerequisites
1818

19-
- [Vvector fields](vector-search-how-to-create-index.md) in a search index.
19+
- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the Hierarchical Navigable Small Worlds (HNSW) algorithm and a new vector profile.
2020

2121
## How vector fields are stored
2222

23-
For every vector field, there are three copies of vectors:
23+
For every vector field, there are three copies of the vectors:
2424

2525
| Instance | Usage |
2626
|----------|-------|
27-
| source vectors (in JSON) as received from an embedding model or push request to the index | Used if you want "retrievable" vectors returned in the query response. |
28-
| original full-precision vectors | Used if you want to rescore the query results obtained over compressed vectors. Applies only to vector fields subject to [scalar or binary quantization](vector-search-how-to-quantization.md). |
29-
| vectors and graph information created by the HNSW library | Used for query execution. |
27+
| Source vectors (in JSON) as received from an embedding model or push request to the index | Used for incremental data refresh, and if you want "retrievable" vectors returned in the query response. |
28+
| Original full-precision vectors | Used for scoring if vectors are uncompressed, or for optional rescoring if query results are obtained over compressed vectors. Rescoring applies only if vector fields undergo [scalar or binary quantization](vector-search-how-to-quantization.md). |
29+
| Vectors in the [HNSW graph for Approximate Nearest Neighbors (ANN) search](vector-search-overview.md) | Used for query execution. |
3030

31-
The last instance (vectors and graph) is required for vector query execution. The first two instances can be discarded if you don't need them. Compression techniques like scalar or binary quantization are applied to the vectors used during query execution.
31+
The last instance (vectors and graph) is required for ANN vector query execution. The first two instances can be discarded if you don't need them. Compression techniques like scalar or binary quantization are applied to the vectors used during query execution.
3232

3333
## Set the `stored` property
3434
