Adding new experimental bbq index types #114439
Merged
benwtrent merged 8 commits into elastic:main from benwtrent:feature/add-bbq-index-types on Oct 15, 2024
7649ff8 Adding new bbq index types behind a feature flag (benwtrent)
1f857c9 adding tests, correcting scoring (benwtrent)
0c311aa adding docs (benwtrent)
2c66653 fixing docs (benwtrent)
9ecd815 Merge remote-tracking branch 'upstream/main' into feature/add-bbq-ind… (benwtrent)
b33ee9b addressing PR comments, adding experimental tags (benwtrent)
70f3c44 Update docs/changelog/114439.yaml (benwtrent)
13f15f3 Merge branch 'main' into feature/add-bbq-index-types (elasticmachine)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 114439 | ||
summary: Adding new bbq index types behind a feature flag | ||
area: Vector Search | ||
type: feature | ||
issues: [] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1149,3 +1149,95 @@ POST product-index/_search | |
---- | ||
//TEST[continued] | ||
|
||
[discrete] | ||
[[dense-vector-knn-search-reranking]] | ||
==== Oversampling and rescoring for quantized vectors | ||
|
||
All forms of quantization result in some accuracy loss, and the loss grows as the quantization level increases. | ||
Generally, we have found that: | ||
- `int8` requires minimal if any rescoring | ||
- `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss. | ||
- `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that 3x-5x oversampling is generally sufficient, but for fewer dimensions or vectors that do not quantize well, higher oversampling may be required. | ||
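To make the oversample-then-rescore idea concrete, here is a small offline sketch in Python. It is only an illustration under loud assumptions: `sign_quantize` is a crude 1-bit stand-in and is not the `bbq`, `int4`, or `int8` algorithm, and all function names are hypothetical. It gathers `size * oversample` candidates by quantized score, then reranks only those candidates with the original float vectors.

```python
import math
import random


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sign_quantize(vec):
    # Crude 1-bit stand-in for a quantizer (assumption: NOT the real bbq algorithm).
    return [1.0 if x >= 0 else -1.0 for x in vec]


def search_with_rescoring(docs, query, size, oversample):
    """Return `size` doc ids: oversample on quantized scores, rescore with floats."""
    k = min(len(docs), math.ceil(size * oversample))
    quantized = [sign_quantize(d) for d in docs]
    # Step 1: cheap, lossy candidate gathering on the quantized vectors.
    candidates = sorted(
        range(len(docs)), key=lambda i: dot(query, quantized[i]), reverse=True
    )[:k]
    # Step 2: exact rescoring of only those candidates with the float vectors.
    reranked = sorted(candidates, key=lambda i: dot(query, docs[i]), reverse=True)
    return reranked[:size]


def recall(found, truth):
    """Fraction of the exact top results that were found."""
    return len(set(found) & set(truth)) / len(truth)


if __name__ == "__main__":
    random.seed(0)
    docs = [[random.gauss(0, 1) for _ in range(16)] for _ in range(500)]
    query = [random.gauss(0, 1) for _ in range(16)]
    truth = sorted(range(len(docs)), key=lambda i: dot(query, docs[i]), reverse=True)[:10]
    for factor in (1, 2, 3, 5):
        hits = search_with_rescoring(docs, query, size=10, oversample=factor)
        print(f"oversample {factor}x: recall@10 = {recall(hits, truth):.2f}")
```

Because the candidate set only grows with the oversampling factor, recall against the exact top results cannot decrease in this sketch; how much it improves depends on the data and the quantizer.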
|
||
There are two main ways to oversample and rescore. The first is to utilize the <<rescore, rescore section>> in the `_search` request. | ||
|
||
Here is an example using the top level `knn` search with oversampling and using `rescore` to rerank the results: | ||
|
||
[source,console] | ||
-------------------------------------------------- | ||
POST /my-index/_search | ||
{ | ||
"size": 10, <1> | ||
"knn": { | ||
"query_vector": [0.04283529, 0.85670587, -0.51402352, 0], | ||
"field": "my_int4_vector", | ||
"k": 20, <2> | ||
"num_candidates": 50 | ||
}, | ||
"rescore": { | ||
"window_size": 20, <3> | ||
"query": { | ||
"rescore_query": { | ||
"script_score": { | ||
"query": { | ||
"match_all": {} | ||
}, | ||
"script": { | ||
"source": "(dotProduct(params.queryVector, 'my_int4_vector') + 1.0)", <4> | ||
"params": { | ||
"queryVector": [0.04283529, 0.85670587, -0.51402352, 0] | ||
} | ||
} | ||
} | ||
}, | ||
"query_weight": 0, <5> | ||
"rescore_query_weight": 1 <6> | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
// TEST[skip: setup not provided] | ||
<1> The number of results to return. Note that it is only 10; we oversample by 2x, gathering 20 nearest neighbors. | ||
<2> The number of results to return from the KNN search. This will do an approximate KNN search with 50 candidates | ||
per HNSW graph and use the quantized vectors, returning the 20 most similar vectors | ||
according to the quantized score. Additionally, since this is the top-level `knn` object, the global top 20 results | ||
from all shards will be gathered before rescoring. Combined with `rescore`, this is oversampling by `2x`, meaning | ||
gathering 20 nearest neighbors according to quantized scoring and rescoring with higher fidelity float vectors. | ||
<3> The number of results to rescore. To rescore all results, set this to the same value as `k`. | ||
<4> The script to rescore the results. Script score will interact directly with the originally provided float32 vector. | ||
<5> The weight of the original query. Here we simply throw away the original score. | ||
<6> The weight of the rescore query. Here we only use the rescore query. | ||
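The relationship between `size`, `k`, and `window_size` in the request above can be sketched as a small Python helper that derives `k` from an oversampling factor. This is a hypothetical convenience function, not part of any Elasticsearch client; the function name, its defaults, and the `num_candidates >= k` sanity check are assumptions made for illustration.

```python
import json
import math


def build_knn_rescore_request(field, query_vector, size=10, oversample=2.0,
                              num_candidates=50):
    """Build a `_search` body: top-level `knn` oversampling plus `rescore`."""
    k = math.ceil(size * oversample)  # e.g. size=10, oversample=2.0 -> k=20
    if num_candidates < k:
        # Sanity check (assumption): the candidate pool should cover k.
        raise ValueError("num_candidates must be at least k")
    return {
        "size": size,
        "knn": {
            "query_vector": query_vector,
            "field": field,
            "k": k,
            "num_candidates": num_candidates,
        },
        "rescore": {
            "window_size": k,  # rescore every oversampled hit
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {
                            "source": f"(dotProduct(params.queryVector, '{field}') + 1.0)",
                            "params": {"queryVector": query_vector},
                        },
                    }
                },
                "query_weight": 0,          # discard the quantized score
                "rescore_query_weight": 1,  # keep only the float score
            },
        },
    }


if __name__ == "__main__":
    body = build_knn_rescore_request(
        "my_int4_vector", [0.04283529, 0.85670587, -0.51402352, 0]
    )
    print(json.dumps(body, indent=2))
```

Setting `window_size` equal to `k` follows callout <3>: every oversampled hit is rescored before the final `size` results are returned.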
|
||
The second way is to score per shard with the <<query-dsl-knn-query, knn query>> and <<query-dsl-script-score-query, script_score query>>. Generally, this means that there will be more rescoring per shard, but this | ||
can increase overall recall at the cost of compute. | ||
|
||
[source,console] | ||
-------------------------------------------------- | ||
POST /my-index/_search | ||
{ | ||
"size": 10, <1> | ||
"query": { | ||
"script_score": { | ||
"query": { | ||
"knn": { <2> | ||
"query_vector": [0.04283529, 0.85670587, -0.51402352, 0], | ||
"field": "my_int4_vector", | ||
"num_candidates": 20 <3> | ||
Review comment: We can use "k" instead of
Reply: I can, but I would just set it to |
||
} | ||
}, | ||
"script": { | ||
"source": "(dotProduct(params.queryVector, 'my_int4_vector') + 1.0)", <4> | ||
"params": { | ||
"queryVector": [0.04283529, 0.85670587, -0.51402352, 0] | ||
} | ||
} | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
// TEST[skip: setup not provided] | ||
<1> The number of results to return | ||
<2> The `knn` query that performs the initial search; it is executed per shard | ||
<3> The number of candidates to use for the initial approximate `knn` search. This will search using the quantized vectors | ||
and return the top 20 candidates per shard to then be scored | ||
<4> The script to score the results. Script score will interact directly with the originally provided float32 vector. |
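For the per-shard variant, a similar hypothetical Python helper can derive `num_candidates` from the desired oversampling factor. As before, the function name and defaults are assumptions for illustration, not an Elasticsearch client API. Note that each shard scores its own candidates, so the total rescoring work grows with the shard count.

```python
import json
import math


def build_per_shard_rescore_request(field, query_vector, size=10, oversample=2.0):
    """Build a `_search` body that wraps a `knn` query in `script_score`."""
    # Each shard gathers this many candidates on quantized vectors,
    # then scores them with the script below.
    num_candidates = math.ceil(size * oversample)
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {
                    "knn": {
                        "query_vector": query_vector,
                        "field": field,
                        "num_candidates": num_candidates,
                    }
                },
                "script": {
                    "source": f"(dotProduct(params.queryVector, '{field}') + 1.0)",
                    "params": {"queryVector": query_vector},
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_per_shard_rescore_request(
        "my_int4_vector", [0.04283529, 0.85670587, -0.51402352, 0]
    )
    print(json.dumps(body, indent=2))
```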
160 changes: 160 additions & 0 deletions
...c/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/41_knn_search_bbq_hnsw.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,160 @@ | ||
setup: | ||
- requires: | ||
cluster_features: "mapper.vectors.bbq" | ||
reason: 'kNN float to better-binary quantization is required' | ||
- do: | ||
indices.create: | ||
index: bbq_hnsw | ||
body: | ||
settings: | ||
index: | ||
number_of_shards: 1 | ||
mappings: | ||
properties: | ||
name: | ||
type: keyword | ||
vector: | ||
type: dense_vector | ||
dims: 64 | ||
index: true | ||
similarity: l2_norm | ||
index_options: | ||
type: bbq_hnsw | ||
another_vector: | ||
type: dense_vector | ||
dims: 64 | ||
index: true | ||
similarity: l2_norm | ||
index_options: | ||
type: bbq_hnsw | ||
|
||
- do: | ||
index: | ||
index: bbq_hnsw | ||
id: "1" | ||
body: | ||
name: cow.jpg | ||
vector: [300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0, 230.0, 300.33, -34.8988, 15.555, -200.0] | ||
another_vector: [115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0, 130.0, 115.0, -1.02, 15.555, -100.0] | ||
# Flush in order to provoke a merge later | ||
- do: | ||
indices.flush: | ||
index: bbq_hnsw | ||
|
||
- do: | ||
index: | ||
index: bbq_hnsw | ||
id: "2" | ||
body: | ||
name: moose.jpg | ||
vector: [100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0, -0.5, 100.0, -13, 14.8, -156.0] | ||
another_vector: [50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120, -0.5, 50.0, -1, 1, 120] | ||
# Flush in order to provoke a merge later | ||
- do: | ||
indices.flush: | ||
index: bbq_hnsw | ||
|
||
- do: | ||
index: | ||
index: bbq_hnsw | ||
id: "3" | ||
body: | ||
name: rabbit.jpg | ||
vector: [111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0, 0.5, 111.3, -13.0, 14.8, -156.0] | ||
another_vector: [11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0, -0.5, 11.0, 0, 12, 111.0] | ||
# Flush in order to provoke a merge later | ||
- do: | ||
indices.flush: | ||
index: bbq_hnsw | ||
|
||
- do: | ||
indices.forcemerge: | ||
index: bbq_hnsw | ||
max_num_segments: 1 | ||
--- | ||
"Test knn search": | ||
- do: | ||
search: | ||
index: bbq_hnsw | ||
body: | ||
knn: | ||
field: vector | ||
query_vector: [ 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0, -0.5, 90.0, -10, 14.8, -156.0] | ||
k: 3 | ||
num_candidates: 3 | ||
|
||
# Depending on how things are distributed, docs 2 and 3 might be swapped | ||
# here we verify that our last hit is always the worst one | ||
- match: { hits.hits.2._id: "1" } | ||
|
||
--- | ||
"Test bad quantization parameters": | ||
- do: | ||
catch: bad_request | ||
indices.create: | ||
index: bad_bbq_hnsw | ||
body: | ||
mappings: | ||
properties: | ||
vector: | ||
type: dense_vector | ||
dims: 64 | ||
element_type: byte | ||
index: true | ||
index_options: | ||
type: bbq_hnsw | ||
|
||
- do: | ||
catch: bad_request | ||
indices.create: | ||
index: bad_bbq_hnsw | ||
body: | ||
mappings: | ||
properties: | ||
vector: | ||
type: dense_vector | ||
dims: 64 | ||
index: false | ||
index_options: | ||
type: bbq_hnsw | ||
--- | ||
"Test few dimensions fail indexing": | ||
- do: | ||
catch: bad_request | ||
indices.create: | ||
index: bad_bbq_hnsw | ||
body: | ||
mappings: | ||
properties: | ||
vector: | ||
type: dense_vector | ||
dims: 42 | ||
index: true | ||
index_options: | ||
type: bbq_hnsw | ||
|
||
- do: | ||
indices.create: | ||
index: dynamic_dim_bbq_hnsw | ||
body: | ||
mappings: | ||
properties: | ||
vector: | ||
type: dense_vector | ||
index: true | ||
similarity: l2_norm | ||
index_options: | ||
type: bbq_hnsw | ||
|
||
- do: | ||
catch: bad_request | ||
index: | ||
index: dynamic_dim_bbq_hnsw | ||
body: | ||
vector: [1.0, 2.0, 3.0, 4.0, 5.0] | ||
|
||
- do: | ||
index: | ||
index: dynamic_dim_bbq_hnsw | ||
body: | ||
vector: [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0] |
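The rejection cases exercised by this test file can be summarized as a client-side pre-check. The Python sketch below is hypothetical and does not reproduce the server's validation code. In particular, the 64-dimension minimum is an assumption inferred from the tests (42 dimensions and a 5-element vector are rejected, 64 is accepted), and the function and constant names are invented for illustration.

```python
MIN_BBQ_DIMS = 64  # assumption inferred from the tests above


def check_bbq_mapping(field_mapping):
    """Return a list of problems for a `bbq_hnsw` dense_vector mapping."""
    problems = []
    # The tests reject `element_type: byte` combined with bbq_hnsw.
    if field_mapping.get("element_type", "float") != "float":
        problems.append("bbq_hnsw requires float vectors")
    # The tests reject `index: false` combined with index_options.
    if field_mapping.get("index", True) is False:
        problems.append("index_options require index: true")
    # The tests reject too few dimensions; dims may be absent (dynamic).
    dims = field_mapping.get("dims")
    if dims is not None and dims < MIN_BBQ_DIMS:
        problems.append(f"dims {dims} is below the assumed minimum {MIN_BBQ_DIMS}")
    return problems


if __name__ == "__main__":
    mapping = {
        "type": "dense_vector",
        "dims": 42,
        "index": True,
        "index_options": {"type": "bbq_hnsw"},
    }
    print(check_bbq_mapping(mapping))
```

When `dims` is omitted, as in the `dynamic_dim_bbq_hnsw` case, this sketch reports nothing; in the test, the check then happens when the first vector is indexed.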