articles/search/search-api-migration.md
Lines changed: 2 additions & 2 deletions
@@ -77,7 +77,7 @@ See [Migrate from preview version](semantic-how-to-configure.md#migrate-from-pre

 [`2024-11-01-preview`](/rest/api/searchservice/search-service-api-versions#2024-11-01-preview) query rewrite, Document Layout skill, keyless billing for skills processing, Markdown parsing mode, and rescoring options for compressed vectors.

-If you're upgrading from `2024-09-01-preview`, you can use the new preview APIs with no change to existing code. However, the new version introduces change to vectorSearch.compressions, replacing `rerankWithOriginalVectors` with `enableRescoring`, and moving `defaultOversampling` to a new `rescoringOptions` property object. For a comparison of the syntax, see [Compress vectors using scalar or binary quantization](vector-search-how-to-quantization.md#add-compressions-to-a-search-index).
+If you're upgrading from `2024-09-01-preview`, you can use the new preview APIs with no change to existing code. However, the new version introduces changes to `vectorSearch.compressions`, replacing `rerankWithOriginalVectors` with `enableRescoring`, and moving `defaultOversampling` to a new `rescoringOptions` property object. For a comparison of the syntax, see [Compress vectors using scalar or binary quantization](vector-search-how-to-quantization.md#add-compressions-to-a-search-index).

 ## Upgrade to 2024-09-01-preview
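The property move described in the paragraph about upgrading from `2024-09-01-preview` can be sketched in Python, treating a compression entry as a plain dictionary. This is a minimal sketch under stated assumptions: the helper name `migrate_compression` and the sample entry name are hypothetical, and placing `enableRescoring` inside `rescoringOptions` is inferred from the paragraph's wording rather than from a confirmed schema.

```python
import copy


def migrate_compression(old: dict) -> dict:
    """Return a copy of a compression entry with the newer property names.

    Hypothetical helper: renames `rerankWithOriginalVectors` to
    `enableRescoring` and moves it, together with `defaultOversampling`,
    into a `rescoringOptions` object (assumed layout).
    """
    new = copy.deepcopy(old)
    rescoring = {}
    if "rerankWithOriginalVectors" in new:
        rescoring["enableRescoring"] = new.pop("rerankWithOriginalVectors")
    if "defaultOversampling" in new:
        rescoring["defaultOversampling"] = new.pop("defaultOversampling")
    if rescoring:
        new["rescoringOptions"] = rescoring
    return new


# Illustrative entry only; the name is made up for this sketch.
old_entry = {
    "name": "my-scalar-quantization",
    "kind": "scalarQuantization",
    "rerankWithOriginalVectors": True,
    "defaultOversampling": 10.0,
}
new_entry = migrate_compression(old_entry)
print(new_entry)
```

The helper copies the input rather than mutating it, so the old definition stays available for comparison while both API versions are in use.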
@@ -129,7 +129,7 @@ If you're upgrading from the previous version, the next section has the steps.

 ## Upgrade from 2023-07-01-preview

-Do not use this API version. It implements a vector query syntax that's incompatible with any newer API version.
+Don't use this API version. It implements a vector query syntax that's incompatible with any newer API version.

 `2023-07-01-preview` is now deprecated, so you shouldn't base new code on this version, nor should you upgrade *to* this version under any circumstances. This section explains the migration path from `2023-07-01-preview` to any newer API version.
articles/search/vector-search-how-to-quantization.md
Lines changed: 62 additions & 49 deletions
@@ -28,7 +28,7 @@ To use built-in quantization, follow these steps:

 ## Prerequisites

-- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the HNSW algorithm and a new vector profile.
+- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the Hierarchical Navigable Small Worlds (HNSW) algorithm and a new vector profile.

 ## Supported quantization techniques
@@ -152,7 +152,7 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-11-01-pre

 - `rescoringOptions` are a collection of properties used to offset lossy compression by rescoring query results using the original full-precision vectors that exist prior to quantization. For rescoring to work, you must have the vector instance that provides this content. Setting `rescoreStorageMethod` to `discardOriginals` prevents you from using `enableRescoring` or `defaultOversampling`. For more information about vector storage, see [Eliminate optional vector instances from storage](vector-search-how-to-storage-options.md).

-- `"enableRescoring": "preserveOriginals"` is the API equivalent of `"rerankWithOriginalVectors": true`. Rescoring vector search results with the original full-precision vectors can result in adjustments to search score and rankings, promoting the more relevant matches as determined by the rescoring step.
+- `"rescoreStorageMethod": "preserveOriginals"` is the API equivalent of `"rerankWithOriginalVectors": true`. Rescoring vector search results with the original full-precision vectors can result in adjustments to search score and rankings, promoting the more relevant matches as determined by the rescoring step.

 - `defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
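The oversampling arithmetic in the `defaultOversampling` bullet can be checked with a short Python sketch. The function name `rescoring_candidates` is hypothetical, and rounding up for fractional products is an assumption of this sketch, since the text only gives a whole-number example.

```python
import math


def rescoring_candidates(k: int, oversampling: float) -> int:
    """Number of candidates considered before rescoring: k times the multiplier.

    Rounding up is an assumption; the text only shows a whole-number case.
    """
    return math.ceil(k * oversampling)


# The example from the text: k of 5 with oversampling of 20.
print(rescoring_candidates(5, 20))  # 100
# The stated default multiplier of 4.
print(rescoring_candidates(5, 4))   # 20
```

Only the top `k` results survive rescoring, so a larger multiplier widens the candidate pool without changing the number of results returned.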
@@ -162,9 +162,9 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-11-01-pre

 ---

-## Add the HNSW algorithm
+## Add the vector search algorithm

-Make sure your index has the Hierarchical Navigable Small Worlds (HNSW) algorithm. Built-in quantization isn't supported with exhaustive KNN.
+You can use the HNSW algorithm or exhaustive KNN in the 2024-11-01-preview REST API. For the stable version, use HNSW only.

 ```json
 "vectorSearch": {
@@ -243,61 +243,44 @@ It's particularly effective for embeddings with dimensions greater than 1024. Fo

 ## Example: vector compression techniques

-Here's Python code that demonstrates quantization, [narrow data types](vector-search-how-to-assign-narrow-data-types.md), and use of the [stored property](vector-search-how-to-storage-options.md).
-
-This code is borrowed from [Code sample: Vector quantization and storage options using Python](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/vector-quantization-and-storage/README.md).
-
-This code creates and compares storage and vector index size for each option.
-
-```bash
-****************************************
-Index Name: compressiontest-baseline
-Storage Size: 21.3613MB
-Vector Size: 4.8277MB
-****************************************
-Index Name: compressiontest-compression
-Storage Size: 17.7604MB
-Vector Size: 1.2242MB
-****************************************
-Index Name: compressiontest-narrow
-Storage Size: 16.5567MB
-Vector Size: 2.4254MB
-****************************************
-Index Name: compressiontest-no-stored
-Storage Size: 10.9224MB
-Vector Size: 4.8277MB
-****************************************
-Index Name: compressiontest-all-options
-Storage Size: 4.9192MB
-Vector Size: 1.2242MB
-```
+[Code sample: Vector quantization and storage options using Python](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/vector-quantization-and-storage/README.md) provides Python code that demonstrates quantization, [narrow data types](vector-search-how-to-assign-narrow-data-types.md), and use of the [stored property](vector-search-how-to-storage-options.md).
+
+This code creates and compares storage and vector index size for each vector storage optimization option. From these results, you can see that quantization reduces vector size the most, but the greatest storage savings are achieved if you use multiple options.
 Search APIs report storage and vector size at the index level, so indexes and not fields must be the basis of comparison. Use the [GET Index Statistics](/rest/api/searchservice/indexes/get-statistics) or an equivalent API in the Azure SDKs to obtain vector size.
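Because sizes are reported per index, a comparison comes down to reading one statistics payload per index. The Python sketch below only builds the request URL and formats a payload, without calling the service. The helper names are hypothetical, the `/stats` path and the `storageSize` and `vectorIndexSize` field names (in bytes) are assumptions about the GET Index Statistics response, and the sample numbers are illustrative rather than measured.

```python
def stats_url(service_name: str, index_name: str, api_version: str) -> str:
    """Build the statistics URL for one index (assumed path layout)."""
    return (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/stats?api-version={api_version}"
    )


def summarize(index_name: str, stats: dict) -> str:
    """Format index-level sizes in MB, assuming byte counts in the payload."""
    mb = 1024 * 1024  # assumption: binary megabytes
    return (
        f"Index Name: {index_name}\n"
        f"Storage Size: {stats['storageSize'] / mb:.4f}MB\n"
        f"Vector Size: {stats['vectorIndexSize'] / mb:.4f}MB"
    )


# Illustrative payload only, not real measurements.
sample = {"storageSize": 2097152, "vectorIndexSize": 1048576}
print(stats_url("my-service", "demo-index", "2024-07-01"))
print(summarize("demo-index", sample))
```

Running the same summary against each index variant gives a side-by-side view of storage and vector size, which is the comparison the paragraph above describes.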

 ## Query a quantized vector field using oversampling

-Query syntax for a compressed or quantized vector field is the same as for noncompressed vector fields, unless you want to override parameters associated with oversampling or reranking with original vectors.
+Query syntax for a compressed or quantized vector field is the same as for noncompressed vector fields, unless you want to override parameters associated with oversampling or rescoring with original vectors.

-Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `rerankWithOriginalVectors` and `defaultOversampling` to mitigate the effects of a smaller vector index. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
+### [**2024-07-01**](#tab/query-2024-07-01)
+
+Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `rerankWithOriginalVectors` and `defaultOversampling` to mitigate the effects of lossy compression. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.

 You can set the oversampling parameter even if the index doesn't explicitly have a `rerankWithOriginalVectors` or `defaultOversampling` definition. Providing `oversampling` at query time overrides the index settings for that query and executes the query with an effective `rerankWithOriginalVectors` as true.

 ```http
-POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-07-01
-Content-Type: application/json
-api-key: [admin key]
-
-{
-  "vectorQueries": [
-    {
-      "kind": "vector",
-      "vector": [8, 2, 3, 4, 3, 5, 2, 1],
-      "fields": "myvector",
-      "oversampling": 12.0,
-      "k": 5
-    }
-  ]
-}
+POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-07-01
+
+{
+  "vectorQueries": [
+    {
+      "kind": "vector",
+      "vector": [8, 2, 3, 4, 3, 5, 2, 1],
+      "fields": "myvector",
+      "oversampling": 12.0,
+      "k": 5
+    }
+  ]
+}
 ```

 **Key points**:
@@ -306,6 +289,36 @@ POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?ap

 - Overrides the `defaultOversampling` value or introduces oversampling at query time, even if the index's compression configuration didn't specify oversampling or reranking options.
+Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `enableRescoring`, `rescoreStorageMethod`, and `defaultOversampling` to mitigate the effects of lossy compression. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
+
+You can set the oversampling parameter even if the index doesn't explicitly have rescoring options or a `defaultOversampling` definition. Providing `oversampling` at query time overrides the index settings for that query and executes the query with `enableRescoring` effectively set to true.
+
+```http
+POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-11-01-preview
+
+{
+  "vectorQueries": [
+    {
+      "kind": "vector",
+      "vector": [8, 2, 3, 4, 3, 5, 2, 1],
+      "fields": "myvector",
+      "oversampling": 12.0,
+      "k": 5
+    }
+  ]
+}
+```
+
+**Key points**:
+
+- Applies to vector fields that undergo vector compression, per the vector profile assignment.
+
+- Overrides the `defaultOversampling` value or introduces oversampling at query time, even if the index's compression configuration didn't specify oversampling or rescoring options.
+
+---
+
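The query-time override in the requests above can also be assembled programmatically. The Python sketch below builds the same request body and includes `oversampling` only when a value is supplied, so the index default applies otherwise. The helper name `build_vector_query` is hypothetical, and nothing is sent to a service.

```python
import json
from typing import Optional, Sequence


def build_vector_query(
    vector: Sequence[float],
    field: str,
    k: int,
    oversampling: Optional[float] = None,
) -> dict:
    """Build a search request body with one vector query.

    `oversampling` is added only when provided, leaving the index-level
    default in effect for queries that omit it.
    """
    query = {"kind": "vector", "vector": list(vector), "fields": field, "k": k}
    if oversampling is not None:
        query["oversampling"] = float(oversampling)
    return {"vectorQueries": [query]}


# Same values as the HTTP examples above.
body = build_vector_query([8, 2, 3, 4, 3, 5, 2, 1], "myvector", k=5, oversampling=12.0)
print(json.dumps(body, indent=2))
```

Keeping the override optional makes it easy to compare results with and without query-time oversampling against the same index.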
 <!--
 RESCORE WITH ORIGINAL VECTORS -- NEEDS AN H2 or H3
 It's used to rescore search results obtained using compressed vectors.
articles/search/vector-search-how-to-storage-options.md
Lines changed: 6 additions & 6 deletions
@@ -16,19 +16,19 @@ Azure AI Search stores multiple copies of vector fields that are used in specifi

 ## Prerequisites

-- [Vvector fields](vector-search-how-to-create-index.md) in a search index.
+- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the Hierarchical Navigable Small Worlds (HNSW) algorithm and a new vector profile.

 ## How vector fields are stored

-For every vector field, there are three copies of vectors:
+For every vector field, there are three copies of the vectors:

 | Instance | Usage |
 |----------|-------|
-|source vectors (in JSON) as received from an embedding model or push request to the index | Used if you want "retrievable" vectors returned in the query response. |
-|original full-precision vectors | Used if you want to rescore the query results obtained over compressed vectors. Applies only to vector fields subject to [scalar or binary quantization](vector-search-how-to-quantization.md). |
-|vectors and graph information created by the HNSW library| Used for query execution. |
+|Source vectors (in JSON) as received from an embedding model or push request to the index | Used for incremental data refresh, and if you want "retrievable" vectors returned in the query response. |
+|Original full-precision vectors | Used for scoring if vectors are uncompressed, or for optional rescoring of query results obtained over compressed vectors. Rescoring applies only if vector fields undergo [scalar or binary quantization](vector-search-how-to-quantization.md). |
+|Vectors in the [HNSW graph for Approximate Nearest Neighbors (ANN) search](vector-search-overview.md) | Used for query execution. |

-The last instance (vectors and graph) is required for vector query execution. The first two instances can be discarded if you don't need them. Compression techniques like scalar or binary quantization are applied to the vectors used during query execution.
+The last instance (vectors and graph) is required for ANN vector query execution. The first two instances can be discarded if you don't need them. Compression techniques like scalar or binary quantization are applied to the vectors used during query execution.