Commit fb366e0

Add docs
1 parent 8720d28 commit fb366e0

5 files changed

+84
-3
lines changed

docs/reference/mapping/types/dense-vector.asciidoc

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -127,6 +127,8 @@ When using a quantized format, you may want to oversample and rescore the result
127127
To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default
128128
index type is `int8_hnsw`.
129129

130+
Quantized vectors can use <<knn-quantized-vector-rescoring,rescoring>> to improve accuracy on approximate kNN search results.
131+
130132
NOTE: Quantization will continue to keep the raw float vector values on disk for reranking, reindexing, and quantization improvements over the lifetime of the data.
131133
This means disk usage will increase by ~25% for `int8`, ~12.5% for `int4`, and ~3.1% for `bbq` due to the overhead of storing the quantized and raw vectors.
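
As a rough check on these percentages, the following sketch computes the size of the quantized copy relative to the raw vector. It assumes the raw values are 32-bit floats and counts only the bits per dimension of each format (8 for `int8`, 4 for `int4`, 1 for `bbq`), ignoring any per-vector metadata. The function name is illustrative and not part of any Elasticsearch API.

```python
RAW_BITS_PER_DIM = 32  # assumption: raw vectors are stored as 32-bit floats


def quantized_overhead(bits_per_dim):
    """Extra disk used by the quantized copy, as a fraction of the raw vector size."""
    return bits_per_dim / RAW_BITS_PER_DIM


# Bits per dimension for each quantized format (metadata is ignored here).
for name, bits in [("int8", 8), ("int4", 4), ("bbq", 1)]:
    print(f"{name}: ~{quantized_overhead(bits):.1%}")
```

The ratios 8/32, 4/32, and 1/32 correspond to the ~25%, ~12.5%, and ~3.1% figures quoted above.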
132134

docs/reference/query-dsl/knn-query.asciidoc

Lines changed: 7 additions & 0 deletions
@@ -134,6 +134,13 @@ documents are then scored according to <<dense-vector-similarity, `similarity`>>
134134
and the provided `boost` is applied.
135135
--
136136

137+
`rescore`::
138+
+
139+
--
140+
(Optional, object) Rescoring to apply to quantized vectors.
141+
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-rescore]
142+
--
143+
137144
`boost`::
138145
+
139146
--

docs/reference/rest-api/common-parms.asciidoc

Lines changed: 16 additions & 0 deletions
@@ -1346,3 +1346,19 @@ tag::rrf-filter[]
13461346
Applies the specified <<query-dsl-bool-query, boolean query filter>> to all of the specified sub-retrievers,
13471347
according to each retriever's specifications.
13481348
end::rrf-filter[]
1349+
1350+
tag::knn-rescore[]
1351+
1352+
NOTE: Rescoring only makes sense for quantized vectors; when <<dense-vector-quantization,quantization>> is not used, the original vectors are used for scoring.
1353+
The `rescore` option is ignored for non-quantized `dense_vector` fields.
1354+
1355+
`oversample`::
1356+
(Required, float)
1357+
+
1358+
Applies the specified oversample factor to the approximate kNN search.
1359+
The approximate kNN search will retrieve the top `k * oversample` candidates per shard,
1360+
and then use the original vectors for rescoring.
1361+
The top `k` rescored candidates will be returned as results.
1362+
1363+
See <<knn-quantized-vector-rescoring,rescoring quantized vectors>> for details.
1364+
end::knn-rescore[]

docs/reference/search/retriever.asciidoc

Lines changed: 10 additions & 3 deletions
@@ -224,6 +224,13 @@ include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-filter]
224224
+
225225
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-similarity]
226226

227+
`rescore`::
228+
+
229+
--
230+
(Optional, object) Rescoring to apply to quantized vectors.
231+
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-rescore]
232+
--
233+
227234
===== Restrictions
228235

229236
The parameters `query_vector` and `query_vector_builder` cannot be used together.
@@ -446,15 +453,15 @@ This examples demonstrates how to deploy the Elastic Rerank model and use it to
446453

447454
Follow these steps:
448455

449-
. Create an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
456+
. Create an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
450457
+
451458
[source,console]
452459
----
453460
PUT _inference/rerank/my-elastic-rerank
454461
{
455462
"service": "elasticsearch",
456463
"service_settings": {
457-
"model_id": ".rerank-v1",
464+
"model_id": ".rerank-v1",
458465
"num_threads": 1,
459466
"adaptive_allocations": { <1>
460467
"enabled": true,
@@ -465,7 +472,7 @@ PUT _inference/rerank/my-elastic-rerank
465472
}
466473
----
467474
// TEST[skip:uses ML]
468-
<1> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with the minimum of 1 and the maximum of 10 allocations.
475+
<1> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with the minimum of 1 and the maximum of 10 allocations.
469476
+
470477
. Define a `text_similarity_rerank` retriever:
471478
+

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 49 additions & 0 deletions
@@ -1012,6 +1012,55 @@ Now the result will contain the nearest found paragraph when searching.
10121012
// TESTRESPONSE[s/"took": 4/"took" : "$body.took"/]
10131013

10141014

1015+
[discrete]
1016+
[[knn-quantized-vector-rescoring]]
1017+
==== Rescoring results for quantized vectors
1018+
1019+
When using <<dense-vector-quantization,quantized vectors>> for kNN search, you can optionally rescore results to balance performance and accuracy.
1020+
Rescoring works by retrieving more results per shard using approximate kNN, and then using the original vector values to rescore these results.
1021+
As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines:
1022+
- The performance and memory gains of approximate retrieval, which uses quantized vectors to find the top candidates.
1023+
- The accuracy of using the original vectors for rescoring the top candidates.
1024+
1025+
Rescoring won't be as accurate as an <<exact-knn,exact kNN search>>, as some of the top results may not be retrieved using approximate kNN search.
1026+
However, the top candidates that are rescored will have the same score and relative ordering as they would have in an exact kNN search.
1027+
1028+
You can use the `rescore` option to specify an `oversample` parameter.
1029+
When `oversample` is specified, the approximate kNN search will retrieve the top `k * oversample` candidates per shard.
1030+
It will then use the original vectors to rescore them, and return the top `k` results.
1031+
1032+
`num_candidates` is not affected by `oversample`, other than to ensure that there are at least `k * oversample` candidates per shard.
1033+
1034+
Here is an example of using the `rescore` option with the `oversample` parameter:
1035+
1036+
[source,console]
1037+
----
1038+
POST image-index/_search
1039+
{
1040+
"knn": {
1041+
"field": "image-vector",
1042+
"query_vector": [-5, 9, -12],
1043+
"k": 10,
1044+
"num_candidates": 100,
1045+
"rescore": {
1046+
"oversample": 2.0
1047+
}
1048+
},
1049+
"fields": [ "title", "file-type" ]
1050+
}
1051+
----
1052+
// TEST[continued]
1053+
// TEST[s/"k": 10/"k": 3/]
1054+
// TEST[s/"num_candidates": 100/"num_candidates": 3/]
1055+
1056+
This example will effectively:
1057+
- Search using approximate kNN with `num_candidates` set to 100.
1058+
- Rescore the top 20 (`k * oversample`) candidates per shard using the original vectors.
1059+
- Return the top 10 (`k`) results from the rescored candidates.
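
The three steps above can be sketched in a few lines of Python. This is a conceptual illustration for a single shard, not the Elasticsearch implementation: `quantize` is a toy stand-in for real quantization, dot product is assumed as the similarity, rounding `k * oversample` up is an assumption, and all names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quantize(vec, step=4):
    # Toy stand-in for quantization: snap each component to a multiple of `step`.
    return [step * round(x / step) for x in vec]


def knn_with_rescore(docs, query, k, oversample):
    # 1. Approximate pass: rank documents by their quantized vectors.
    approx = sorted(docs, key=lambda d: dot(quantize(docs[d]), query), reverse=True)
    # 2. Keep the top k * oversample candidates (rounding up is an assumption).
    window = math.ceil(k * oversample)
    candidates = approx[:window]
    # 3. Rescore those candidates with the original vectors and return the top k.
    rescored = sorted(candidates, key=lambda d: dot(docs[d], query), reverse=True)
    return rescored[:k]


docs = {"a": [2.1, 2.1], "b": [1.9, 5.9], "c": [8.0, 8.0], "d": [0.0, 0.0]}
query = [1, 1]

print(knn_with_rescore(docs, query, k=2, oversample=1.0))
print(knn_with_rescore(docs, query, k=2, oversample=2.0))
```

In this toy data set, document `b` scores higher than `a` on the original vectors but lower on the quantized ones. With `oversample` set to 1.0 the candidate window holds only `c` and `a`, so `b` is never rescored. With 2.0 the window is wide enough to include `b`, and rescoring returns `c` and `b`.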
1060+
1061+
NOTE: Rescoring only makes sense for quantized vectors; when <<dense-vector-quantization,quantization>> is not used, the original vectors are used for scoring.
1062+
The `rescore` option is ignored for non-quantized `dense_vector` fields.
1063+
10151064
[discrete]
10161065
[[knn-indexing-considerations]]
10171066
==== Indexing considerations
