Commit 009a86a

Allow zero for rescore_vector.oversample to indicate by-passing oversample and rescoring (#125599)
This allows a `rescore_vector: {oversample: 0}` to indicate bypassing oversampling and rescoring. This is useful for:
- Updating a quantized mapping to turn off automatic rescoring
- Bypassing oversampling at query time in an ad-hoc manner if it's on by default in the mapping

closes: #125157
1 parent 80125a4 commit 009a86a

14 files changed: +832 -74 lines changed

docs/changelog/125599.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 125599
+summary: Allow zero for `rescore_vector.oversample` to indicate by-passing oversample
+  and rescoring
+area: Vector Search
+type: enhancement
+issues: []

docs/reference/elasticsearch/mapping-reference/dense-vector.md

Lines changed: 1 addition & 1 deletion
@@ -291,7 +291,7 @@ $$$dense-vector-index-options$$$
 : (Optional, object) Functionality in [preview]. An optional section that configures automatic vector rescoring on knn queries for the given field. Only applicable to quantized index types.
 :::::{dropdown} Properties of `rescore_vector`
 `oversample`
-: (required, float) The amount to oversample the search results by. This value should be greater than `1.0` and less than `10.0`. The higher the value, the more vectors will be gathered and rescored with the raw values per shard.
+: (required, float) The amount to oversample the search results by. This value should be greater than `1.0` and less than `10.0` or exactly `0` to indicate no oversampling & rescoring should occur. The higher the value, the more vectors will be gathered and rescored with the raw values per shard.
 : In case a knn query specifies a `rescore_vector` parameter, the query `rescore_vector` parameter will be used instead.
 : See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
 :::::
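
The rule this docs change describes can be sketched in a few lines of Python. This is an illustration only, not the Elasticsearch implementation: the function and constant names are hypothetical, and the bounds are assumptions read from this diff (`0` is the bypass sentinel, `1` is accepted because the YAML tests in this commit set `oversample: 1`, and `10.0` is treated as an exclusive upper bound per the docs text). The second helper shows the stated precedence, where a query-level `rescore_vector` replaces the mapping-level one.

```python
from typing import Optional

# Hypothetical names; values are assumptions taken from the docs text in this diff.
NO_OVERSAMPLE = 0.0    # sentinel: skip oversampling and rescoring entirely
MIN_OVERSAMPLE = 1.0   # assumed inclusive (the tests in this commit use oversample: 1)
MAX_OVERSAMPLE = 10.0  # assumed exclusive ("less than 10.0")


def validate_oversample(value: float) -> float:
    """Accept exactly 0, or a factor in [1.0, 10.0); reject anything else."""
    if value == NO_OVERSAMPLE:
        return value
    if value < MIN_OVERSAMPLE or value >= MAX_OVERSAMPLE:
        raise ValueError(
            f"oversample must be 0 or in [{MIN_OVERSAMPLE}, {MAX_OVERSAMPLE}), got {value}"
        )
    return value


def effective_oversample(
    mapping_value: Optional[float], query_value: Optional[float]
) -> Optional[float]:
    """Pick the oversample that applies to a search.

    The query-level value wins when present; None means "not configured".
    """
    chosen = query_value if query_value is not None else mapping_value
    return None if chosen is None else validate_oversample(chosen)
```

Under these assumptions, `effective_oversample(2, 0)` yields `0`, which is the ad-hoc query-time bypass the commit message mentions.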

docs/reference/query-languages/query-dsl/query-dsl-knn-query.md

Lines changed: 1 addition & 0 deletions
@@ -113,6 +113,7 @@ Rescoring only makes sense for quantized vectors; when [quantization](/reference
 * Retrieve `num_candidates` candidates per shard.
 * From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors.
 * The top `k` rescored candidates will be returned.
+Must be >= 1f to indicate oversample factor, or exactly `0` to indicate that no oversampling and rescoring should occur.
 
 
 See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
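
The three documented steps, plus the new `0` bypass, can be sketched as a per-shard function. This is a minimal illustration under stated assumptions, not the actual query implementation: `knn_with_rescore` and `exact_score` are hypothetical names, the candidate list is assumed to already hold the `num_candidates` approximate hits for one shard, and rounding `k * oversample` up is an assumption.

```python
import math
from typing import Callable, List, Tuple

Hit = Tuple[str, float]  # (doc_id, score)


def knn_with_rescore(
    candidates: List[Hit],
    k: int,
    oversample: float,
    exact_score: Callable[[str], float],
) -> List[Hit]:
    """Return the top k hits for one shard.

    candidates: approximate (quantized) scores, already limited to num_candidates.
    exact_score: hypothetical callback that scores a doc with its original vector.
    """
    ranked = sorted(candidates, key=lambda hit: hit[1], reverse=True)
    if oversample == 0:
        # Bypass: keep the approximate scores, no rescoring pass at all.
        return ranked[:k]
    # Rescore the top k * oversample candidates with the original vectors.
    window = math.ceil(k * oversample)
    rescored = [(doc_id, exact_score(doc_id)) for doc_id, _ in ranked[:window]]
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored[:k]
```

With `oversample` set to `0` the returned scores are the quantized ones, which is why the test added in this commit expects the final knn scores to match the raw scores captured before any rescoring was configured.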

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/41_knn_search_bbq_hnsw.yml

Lines changed: 257 additions & 0 deletions
@@ -337,3 +337,260 @@ setup:
   - match: { hits.hits.0._score: $rescore_score0 }
   - match: { hits.hits.1._score: $rescore_score1 }
   - match: { hits.hits.2._score: $rescore_score2 }
+---
+"Test index configured rescore vector updateable and settable to 0":
+  - requires:
+      cluster_features: ["mapper.dense_vector.rescore_zero_vector"]
+      reason: Needs rescore_zero_vector feature
+
+  - do:
+      indices.create:
+        index: bbq_rescore_0_hnsw
+        body:
+          settings:
+            index:
+              number_of_shards: 1
+          mappings:
+            properties:
+              vector:
+                type: dense_vector
+                index_options:
+                  type: bbq_hnsw
+                  rescore_vector:
+                    oversample: 0
+
+  - do:
+      indices.create:
+        index: bbq_rescore_update_hnsw
+        body:
+          settings:
+            index:
+              number_of_shards: 1
+          mappings:
+            properties:
+              vector:
+                type: dense_vector
+                index_options:
+                  type: bbq_hnsw
+                  rescore_vector:
+                    oversample: 1
+
+  - do:
+      indices.put_mapping:
+        index: bbq_rescore_update_hnsw
+        body:
+          properties:
+            vector:
+              type: dense_vector
+              index_options:
+                type: bbq_hnsw
+                rescore_vector:
+                  oversample: 0
+
+  - do:
+      indices.get_mapping:
+        index: bbq_rescore_update_hnsw
+
+  - match: { .bbq_rescore_update_hnsw.mappings.properties.vector.index_options.rescore_vector.oversample: 0 }
+---
+"Test index configured rescore vector score consistency":
+  - requires:
+      cluster_features: ["mapper.dense_vector.rescore_zero_vector"]
+      reason: Needs rescore_zero_vector feature
+  - skip:
+      features: "headers"
+  - do:
+      indices.create:
+        index: bbq_rescore_zero_hnsw
+        body:
+          settings:
+            index:
+              number_of_shards: 1
+          mappings:
+            properties:
+              vector:
+                type: dense_vector
+                dims: 64
+                index: true
+                similarity: max_inner_product
+                index_options:
+                  type: bbq_hnsw
+                  rescore_vector:
+                    oversample: 0
+
+  - do:
+      bulk:
+        index: bbq_rescore_zero_hnsw
+        refresh: true
+        body: |
+          { "index": {"_id": "1"}}
+          { "vector": [0.077, 0.32 , -0.205, 0.63 , 0.032, 0.201, 0.167, -0.313, 0.176, 0.531, -0.375, 0.334, -0.046, 0.078, -0.349, 0.272, 0.307, -0.083, 0.504, 0.255, -0.404, 0.289, -0.226, -0.132, -0.216, 0.49 , 0.039, 0.507, -0.307, 0.107, 0.09 , -0.265, -0.285, 0.336, -0.272, 0.369, -0.282, 0.086, -0.132, 0.475, -0.224, 0.203, 0.439, 0.064, 0.246, -0.396, 0.297, 0.242, -0.028, 0.321, -0.022, -0.009, -0.001 , 0.031, -0.533, 0.45, -0.683, 1.331, 0.194, -0.157, -0.1 , -0.279, -0.098, -0.176] }
+          { "index": {"_id": "2"}}
+          { "vector": [0.196, 0.514, 0.039, 0.555, -0.042, 0.242, 0.463, -0.348, -0.08 , 0.442, -0.067, -0.05 , -0.001, 0.298, -0.377, 0.048, 0.307, 0.159, 0.278, 0.119, -0.057, 0.333, -0.289, -0.438, -0.014, 0.361, -0.169, 0.292, -0.229, 0.123, 0.031, -0.138, -0.139, 0.315, -0.216, 0.322, -0.445, -0.059, 0.071, 0.429, -0.602, -0.142, 0.11 , 0.192, 0.259, -0.241, 0.181, -0.166, 0.082, 0.107, -0.05 , 0.155, 0.011, 0.161, -0.486, 0.569, -0.489, 0.901, 0.208, 0.011, -0.209, -0.153, -0.27 , -0.013] }
+          { "index": {"_id": "3"}}
+          { "vector": [0.196, 0.514, 0.039, 0.555, -0.042, 0.242, 0.463, -0.348, -0.08 , 0.442, -0.067, -0.05 , -0.001, 0.298, -0.377, 0.048, 0.307, 0.159, 0.278, 0.119, -0.057, 0.333, -0.289, -0.438, -0.014, 0.361, -0.169, 0.292, -0.229, 0.123, 0.031, -0.138, -0.139, 0.315, -0.216, 0.322, -0.445, -0.059, 0.071, 0.429, -0.602, -0.142, 0.11 , 0.192, 0.259, -0.241, 0.181, -0.166, 0.082, 0.107, -0.05 , 0.155, 0.011, 0.161, -0.486, 0.569, -0.489, 0.901, 0.208, 0.011, -0.209, -0.153, -0.27 , -0.013] }
+
+  - do:
+      headers:
+        Content-Type: application/json
+      search:
+        rest_total_hits_as_int: true
+        index: bbq_rescore_zero_hnsw
+        body:
+          knn:
+            field: vector
+            query_vector: [0.128, 0.067, -0.08 , 0.395, -0.11 , -0.259, 0.473, -0.393,
+                           0.292, 0.571, -0.491, 0.444, -0.288, 0.198, -0.343, 0.015,
+                           0.232, 0.088, 0.228, 0.151, -0.136, 0.236, -0.273, -0.259,
+                           -0.217, 0.359, -0.207, 0.352, -0.142, 0.192, -0.061, -0.17 ,
+                           -0.343, 0.189, -0.221, 0.32 , -0.301, -0.1 , 0.005, 0.232,
+                           -0.344, 0.136, 0.252, 0.157, -0.13 , -0.244, 0.193, -0.034,
+                           -0.12 , -0.193, -0.102, 0.252, -0.185, -0.167, -0.575, 0.582,
+                           -0.426, 0.983, 0.212, 0.204, 0.03 , -0.276, -0.425, -0.158]
+            k: 3
+            num_candidates: 3
+
+  - match: { hits.total: 3 }
+  - set: { hits.hits.0._score: raw_score0 }
+  - set: { hits.hits.1._score: raw_score1 }
+  - set: { hits.hits.2._score: raw_score2 }
+
+
+  - do:
+      headers:
+        Content-Type: application/json
+      search:
+        rest_total_hits_as_int: true
+        index: bbq_rescore_zero_hnsw
+        body:
+          knn:
+            field: vector
+            query_vector: [0.128, 0.067, -0.08 , 0.395, -0.11 , -0.259, 0.473, -0.393,
+                           0.292, 0.571, -0.491, 0.444, -0.288, 0.198, -0.343, 0.015,
+                           0.232, 0.088, 0.228, 0.151, -0.136, 0.236, -0.273, -0.259,
+                           -0.217, 0.359, -0.207, 0.352, -0.142, 0.192, -0.061, -0.17 ,
+                           -0.343, 0.189, -0.221, 0.32 , -0.301, -0.1 , 0.005, 0.232,
+                           -0.344, 0.136, 0.252, 0.157, -0.13 , -0.244, 0.193, -0.034,
+                           -0.12 , -0.193, -0.102, 0.252, -0.185, -0.167, -0.575, 0.582,
+                           -0.426, 0.983, 0.212, 0.204, 0.03 , -0.276, -0.425, -0.158]
+            k: 3
+            num_candidates: 3
+            rescore_vector:
+              oversample: 2
+
+  - match: { hits.total: 3 }
+  - set: { hits.hits.0._score: override_score0 }
+  - set: { hits.hits.1._score: override_score1 }
+  - set: { hits.hits.2._score: override_score2 }
+
+  - do:
+      indices.put_mapping:
+        index: bbq_rescore_zero_hnsw
+        body:
+          properties:
+            vector:
+              type: dense_vector
+              dims: 64
+              index: true
+              similarity: max_inner_product
+              index_options:
+                type: bbq_hnsw
+                rescore_vector:
+                  oversample: 2
+
+  - do:
+      headers:
+        Content-Type: application/json
+      search:
+        rest_total_hits_as_int: true
+        index: bbq_rescore_zero_hnsw
+        body:
+          knn:
+            field: vector
+            query_vector: [0.128, 0.067, -0.08 , 0.395, -0.11 , -0.259, 0.473, -0.393,
+                           0.292, 0.571, -0.491, 0.444, -0.288, 0.198, -0.343, 0.015,
+                           0.232, 0.088, 0.228, 0.151, -0.136, 0.236, -0.273, -0.259,
+                           -0.217, 0.359, -0.207, 0.352, -0.142, 0.192, -0.061, -0.17 ,
+                           -0.343, 0.189, -0.221, 0.32 , -0.301, -0.1 , 0.005, 0.232,
+                           -0.344, 0.136, 0.252, 0.157, -0.13 , -0.244, 0.193, -0.034,
+                           -0.12 , -0.193, -0.102, 0.252, -0.185, -0.167, -0.575, 0.582,
+                           -0.426, 0.983, 0.212, 0.204, 0.03 , -0.276, -0.425, -0.158]
+            k: 3
+            num_candidates: 3
+
+  - match: { hits.total: 3 }
+  - set: { hits.hits.0._score: default_rescore0 }
+  - set: { hits.hits.1._score: default_rescore1 }
+  - set: { hits.hits.2._score: default_rescore2 }
+
+  - do:
+      indices.put_mapping:
+        index: bbq_rescore_zero_hnsw
+        body:
+          properties:
+            vector:
+              type: dense_vector
+              dims: 64
+              index: true
+              similarity: max_inner_product
+              index_options:
+                type: bbq_hnsw
+                rescore_vector:
+                  oversample: 0
+
+  - do:
+      headers:
+        Content-Type: application/json
+      search:
+        rest_total_hits_as_int: true
+        index: bbq_rescore_zero_hnsw
+        body:
+          query:
+            script_score:
+              query: {match_all: {} }
+              script:
+                source: "double similarity = dotProduct(params.query_vector, 'vector'); return similarity < 0 ? 1 / (1 + -1 * similarity) : similarity + 1"
+                params:
+                  query_vector: [0.128, 0.067, -0.08 , 0.395, -0.11 , -0.259, 0.473, -0.393,
+                                 0.292, 0.571, -0.491, 0.444, -0.288, 0.198, -0.343, 0.015,
+                                 0.232, 0.088, 0.228, 0.151, -0.136, 0.236, -0.273, -0.259,
+                                 -0.217, 0.359, -0.207, 0.352, -0.142, 0.192, -0.061, -0.17 ,
+                                 -0.343, 0.189, -0.221, 0.32 , -0.301, -0.1 , 0.005, 0.232,
+                                 -0.344, 0.136, 0.252, 0.157, -0.13 , -0.244, 0.193, -0.034,
+                                 -0.12 , -0.193, -0.102, 0.252, -0.185, -0.167, -0.575, 0.582,
+                                 -0.426, 0.983, 0.212, 0.204, 0.03 , -0.276, -0.425, -0.158]
+
+  # Compare scores as hit IDs may change depending on how things are distributed
+  - match: { hits.total: 3 }
+  - match: { hits.hits.0._score: $override_score0 }
+  - match: { hits.hits.0._score: $default_rescore0 }
+  - match: { hits.hits.1._score: $override_score1 }
+  - match: { hits.hits.1._score: $default_rescore1 }
+  - match: { hits.hits.2._score: $override_score2 }
+  - match: { hits.hits.2._score: $default_rescore2 }
+
+  - do:
+      headers:
+        Content-Type: application/json
+      search:
+        rest_total_hits_as_int: true
+        index: bbq_rescore_zero_hnsw
+        body:
+          knn:
+            field: vector
+            query_vector: [0.128, 0.067, -0.08 , 0.395, -0.11 , -0.259, 0.473, -0.393,
+                           0.292, 0.571, -0.491, 0.444, -0.288, 0.198, -0.343, 0.015,
+                           0.232, 0.088, 0.228, 0.151, -0.136, 0.236, -0.273, -0.259,
+                           -0.217, 0.359, -0.207, 0.352, -0.142, 0.192, -0.061, -0.17 ,
+                           -0.343, 0.189, -0.221, 0.32 , -0.301, -0.1 , 0.005, 0.232,
+                           -0.344, 0.136, 0.252, 0.157, -0.13 , -0.244, 0.193, -0.034,
+                           -0.12 , -0.193, -0.102, 0.252, -0.185, -0.167, -0.575, 0.582,
+                           -0.426, 0.983, 0.212, 0.204, 0.03 , -0.276, -0.425, -0.158]
+            k: 3
+            num_candidates: 3
+
+  # Compare scores as hit IDs may change depending on how things are distributed
+  - match: { hits.total: 3 }
+  - match: { hits.hits.0._score: $raw_score0 }
+  - match: { hits.hits.1._score: $raw_score1 }
+  - match: { hits.hits.2._score: $raw_score2 }
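
The score-consistency test uses a `script_score` query as its exact reference: the script takes the raw dot product and maps it to a score, and the rescored knn results are expected to equal it. The Python below mirrors only the arithmetic of that script source as a reading aid. The function names are hypothetical, and nothing here calls Elasticsearch.

```python
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def script_reference_score(similarity: float) -> float:
    """Same expression as the script source in the test:
    similarity < 0 ? 1 / (1 + -1 * similarity) : similarity + 1
    """
    if similarity < 0:
        return 1.0 / (1.0 + -1.0 * similarity)
    return similarity + 1.0


def exact_score(query: Sequence[float], doc: Sequence[float]) -> float:
    """Reference score for one document, as the script would compute it."""
    return script_reference_score(dot_product(query, doc))
```

Negative dot products land in the interval (0, 1) and non-negative ones at 1 or above, so the ordering of documents by dot product is preserved while every score stays positive.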
