docs/reference/mapping/types/dense-vector.asciidoc (2 additions & 2 deletions)
@@ -121,12 +121,12 @@ The three following quantization strategies are supported:
* `bbq` - experimental:[] Better binary quantization which reduces each dimension to a single bit precision. This reduces the memory footprint by 96% (or 32x) at a larger cost of accuracy. Generally, oversampling during query time and reranking can help mitigate the accuracy loss.

- When using a quantized format, you may want to oversample and rescore the results to improve accuracy. See <<dense-vector-knn-search-reranking, oversampling and rescoring>> for more information.
+ When using a quantized format, you may want to oversample and rescore the results to improve accuracy. See <<dense-vector-knn-search-rescoring, oversampling and rescoring>> for more information.

To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default
index type is `int8_hnsw`.

- Quantized vectors can use <<dense-vector-knn-search-reranking,oversampling and rescoring>> to improve accuracy on approximate kNN search results.
+ Quantized vectors can use <<dense-vector-knn-search-rescoring,oversampling and rescoring>> to improve accuracy on approximate kNN search results.

NOTE: Quantization will continue to keep the raw float vector values on disk for reranking, reindexing, and quantization improvements over the lifetime of the data.
This means disk usage will increase by ~25% for `int8`, ~12.5% for `int4`, and ~3.1% for `bbq` due to the overhead of storing the quantized and raw vectors.
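To make the `int8_hnsw`/`int4_hnsw`/`bbq_hnsw` options mentioned in this hunk concrete, here is a minimal mapping sketch (illustrative only, not part of this change); the index name, field name, and `dims` value are assumptions:

[source,console]
----
// Hypothetical names and dims; bbq_hnsw is the experimental quantized type described above.
PUT my-image-index
{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "index_options": {
          "type": "bbq_hnsw"
        }
      }
    }
  }
}
----

Omitting `index_options` would fall back to the current default of `int8_hnsw` when indexing `float` vectors, as noted above.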
docs/reference/rest-api/common-parms.asciidoc (8 additions & 4 deletions)
@@ -1367,12 +1367,16 @@ tag::knn-rescore-vector[]
NOTE: Rescoring only makes sense for quantized vectors; when <<dense-vector-quantization,quantization>> is not used, the original vectors are used for scoring.
Rescore option will be ignored for non-quantized `dense_vector` fields.

- `num_candidates_factor`::
+ `oversample`::
(Required, float)

- Applies the specified oversample factor to the number of candidates on the approximate kNN search.
- The approximate kNN search will retrieve `num_candidates * num_candidates_factor` candidates per shard, and then use the original vectors for rescoring them.
+ Applies the specified oversample factor to `k` on the approximate kNN search.
+ The approximate kNN search will:
+
+ * Retrieve `num_candidates` candidates per shard.
+ * From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors.
+ * The top `k` rescored candidates will be returned.

- See <<dense-vector-knn-search-reranking,oversampling and rescoring quantized vectors>> for details.
+ See <<dense-vector-knn-search-rescoring,oversampling and rescoring quantized vectors>> for details.
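As a quick illustration of the renamed parameter (not part of this change), here is a sketch of a kNN search that sets `rescore_vector.oversample`; the index name, field, and query vector are placeholders:

[source,console]
----
// Hypothetical request: with k=10 and oversample=2.0, the top 20 candidates per shard
// are rescored with the original vectors, and the top 10 are returned.
POST my-image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [0.12, -0.45, 0.91],
    "k": 10,
    "num_candidates": 100,
    "rescore_vector": {
      "oversample": 2.0
    }
  }
}
----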
docs/reference/search/search-your-data/knn-search.asciidoc (14 additions & 10 deletions)
@@ -1070,7 +1070,7 @@ the global top `k` matches across shards. You cannot set the
[discrete]
- [[dense-vector-knn-search-reranking]]
+ [[dense-vector-knn-search-rescoring]]
==== Oversampling and rescoring for quantized vectors

When using <<dense-vector-quantization,quantized vectors>> for kNN search, you can optionally rescore results to balance performance and accuracy, by doing:
@@ -1091,10 +1091,13 @@ Generally, we have found that:
* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required.

You can use the `rescore_vector` preview:[] option to automatically perform reranking.
- When a rescore `num_candidates_factor` parameter is specified, the approximate kNN search will retrieve the top `num_candidates * oversample` candidates per shard.
- It will then use the original vectors to rescore them, and return the top `k` results.
+ When a rescore `oversample` parameter is specified, the approximate kNN search will:
+
+ * Retrieve `num_candidates` candidates per shard.
+ * From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors.
+ * The top `k` rescored candidates will be returned.

- Here is an example of using the `rescore_vector` option with the `num_candidates_factor` parameter:
+ Here is an example of using the `rescore_vector` option with the `oversample` parameter:

[source,console]
----
@@ -1106,7 +1109,7 @@ POST image-index/_search
    "k": 10,
    "num_candidates": 100,
    "rescore_vector": {
-       "num_candidates_factor": 2.0
+       "oversample": 2.0
    }
  },
  "fields": [ "title", "file-type" ]
@@ -1118,18 +1121,19 @@ POST image-index/_search
This example will:

- * Search using approximate kNN with `num_candidates` set to 200 (`num_candidates` * `num_candidates_factor`).
- * Rescore the top 200 candidates per shard using the original, non quantized vectors.
+ * Search using approximate kNN for the top 100 candidates.
+ * Rescore the top 20 candidates (`oversample` * `k`) per shard using the original, non quantized vectors.
+ * Return the top 10 (`k`) rescored candidates.
* Merge the rescored candidates from all shards, and return the top 10 (`k`) results.
0 commit comments