benwtrent
Member

@benwtrent benwtrent commented Oct 9, 2024

Adds new index types, bbq_hnsw and bbq_flat, which use the better binary quantization (BBQ) formats. They give a 32x reduction in memory, with nice recall properties.
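
For illustration, a mapping that opts a dense_vector field into one of the new types might look like the sketch below. The field name (my_bbq_vector), the dims value, and the helper name are assumptions made up for this example; only the bbq_hnsw and bbq_flat type names come from the description above.

```python
import json


def bbq_mapping(field: str, dims: int, index_type: str = "bbq_hnsw") -> dict:
    """Build a create-index body for a dense_vector field using a BBQ index type.

    The field name and dims are caller-supplied placeholders, not values
    taken from the pull request.
    """
    if index_type not in ("bbq_hnsw", "bbq_flat"):
        raise ValueError(f"unexpected index type: {index_type}")
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "index_options": {"type": index_type},
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(bbq_mapping("my_bbq_vector", 1024), indent=2))
```

Switching the third argument to "bbq_flat" gives the brute-force variant with the same quantization.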

@benwtrent benwtrent added >non-issue cloud-deploy Publish cloud docker image for Cloud-First-Testing :Search Relevance/Vectors Vector search v8.16.0 v9.0.0 labels Oct 9, 2024
"knn": { <2>
"query_vector": [0.04283529, 0.85670587, -0.51402352, 0],
"field": "my_int4_vector",
"num_candidates": 20 <3>
Contributor

We can use "k" instead of num_candidates to retrieve 20 results.

Member Author

I can, but I would just set it to k. I don't see the value.
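
To make the two options in this thread concrete, here is a small sketch of the knn search body with an explicit k alongside num_candidates. The helper name is hypothetical; the field name and query vector are the ones from the snippet under review. It assumes that num_candidates may be omitted so the server applies its own default, and it enforces the rule that num_candidates cannot be smaller than k.

```python
from typing import Optional, Sequence


def knn_body(
    field: str,
    query_vector: Sequence[float],
    k: int,
    num_candidates: Optional[int] = None,
) -> dict:
    """Build a knn search section with k and an optional num_candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates is not None and num_candidates < k:
        raise ValueError("num_candidates cannot be less than k")
    knn = {"field": field, "query_vector": list(query_vector), "k": k}
    if num_candidates is not None:
        # Only sent when the caller wants to control the per-shard candidate pool.
        knn["num_candidates"] = num_candidates
    return {"knn": knn}


if __name__ == "__main__":
    vector = [0.04283529, 0.85670587, -0.51402352, 0]
    # Variant from the docs snippet: a candidate pool of 20.
    print(knn_body("my_int4_vector", vector, k=10, num_candidates=20))
    # Variant suggested in review: ask for 20 results directly via k.
    print(knn_body("my_int4_vector", vector, k=20))
```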

Contributor

@mayya-sharipova mayya-sharipova left a comment

Except for the math in the scorers, which I don't completely follow, the rest LGTM.

Thanks @benwtrent. Great addition!

@benwtrent benwtrent marked this pull request as ready for review October 14, 2024 16:41
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Oct 14, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

@benwtrent
Member Author

@elasticmachine update branch

@benwtrent benwtrent added the auto-backport Automatically create backport pull requests when merged label Oct 15, 2024
@benwtrent benwtrent merged commit 6c752ab into elastic:main Oct 15, 2024
17 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 114439

@benwtrent
Member Author

I ran a larger-scale benchmark over 134,705,698 1024-dimensional Cohere v3 vectors on a single 64 GB node (about 32 GB allocated off-heap).
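
As a rough sanity check on the scale involved, the sketch below estimates raw float32 storage versus 1-bit-per-dimension storage for this corpus. It assumes 4 bytes per float32 dimension and deliberately ignores per-vector corrective terms, HNSW graph links, and any other index overhead, so the real quantized footprint will be somewhat larger than the figure it prints. The helper name is made up for this example.

```python
NUM_VECTORS = 134_705_698  # corpus size reported above
DIMS = 1024                # vector dimensions reported above
GIB = 1024 ** 3


def vector_bytes(num_vectors: int, dims: int, bits_per_dim: int) -> int:
    """Bytes needed to store the vector values alone at a given bit width."""
    total_bits = num_vectors * dims * bits_per_dim
    # Round up to whole bytes in case the bit count is not a multiple of 8.
    return (total_bits + 7) // 8


raw = vector_bytes(NUM_VECTORS, DIMS, 32)    # float32 values
binary = vector_bytes(NUM_VECTORS, DIMS, 1)  # one bit per dimension

if __name__ == "__main__":
    print(f"float32: {raw / GIB:.1f} GiB")
    print(f"1-bit:   {binary / GIB:.1f} GiB")
    print(f"ratio:   {raw / binary:.0f}x")
```

Under these assumptions the raw vectors come to roughly 514 GiB and the 1-bit values to roughly 16 GiB, which is why the quantized data can stay in memory on a 64 GB node.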

This was done with the Rally track: https://github.com/elastic/rally-tracks/tree/master/msmarco-v2-vector

I would expect rescoring per shard to have a slightly bigger latency hit, but better recall.

Ingest took about 12 hours with aggressive merging enabled.

Run name format: knn-recall-<k>-<num_candidates>-<number_rescored>
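
For anyone post-processing these results, the naming scheme can be decoded with a few lines of Python. This is only a sketch: the KnnRun type and parse_run_name helper are hypothetical, and the handling of the _no_rescore suffix is inferred from the run names used in this comment.

```python
from typing import NamedTuple, Optional


class KnnRun(NamedTuple):
    k: int
    num_candidates: int
    num_rescored: Optional[int]  # None when the run did not rescore


def parse_run_name(name: str) -> KnnRun:
    """Decode names like 'knn-recall-10-100-50' or 'knn-recall-10-20_no_rescore'."""
    prefix = "knn-recall-"
    suffix = "_no_rescore"
    if not name.startswith(prefix):
        raise ValueError(f"not a knn-recall run name: {name!r}")
    rest = name[len(prefix):]
    no_rescore = rest.endswith(suffix)
    if no_rescore:
        rest = rest[: -len(suffix)]
    parts = [int(p) for p in rest.split("-")]
    if no_rescore and len(parts) == 2:
        return KnnRun(parts[0], parts[1], None)
    if not no_rescore and len(parts) == 3:
        return KnnRun(parts[0], parts[1], parts[2])
    raise ValueError(f"unexpected run name layout: {name!r}")


if __name__ == "__main__":
    for run in ("knn-recall-10-20_no_rescore", "knn-recall-10-100-50"):
        print(run, "->", parse_run_name(run))
```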

Here are some of the results. They show that oversampling and rescoring help recall, but can hurt QPS.

knn-recall-10-20_no_rescore
Recall: 0.45
Avg nodes visited: 15,915.211
78% of best NDCG
Single-client latency: 9ms
Multi-client QPS: 1,134.649

knn-recall-10-100-50
It's interesting that even at 5x oversampling, latency isn't much worse, and at 10x num_candidates it only visits about 2x more vectors.
Recall: 0.704
Avg nodes visited: 36,079.801
90% of best NDCG
Single-client latency: 18ms
Multi-client QPS: 451.596 (30ms latency)

knn-recall-10-1000-200
It's pretty neat to see that even visiting >100k vectors over multiple segments has 42ms latency (we ain't even done optimizing this stuff yet).
Recall: 0.895
Avg nodes visited: 115,598.117
97% of best NDCG
Single-client latency: 42.534ms
Multi-client QPS: 167.806 (93ms latency)
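
A quick way to compare these runs is to normalize each one against the no-rescore baseline. The sketch below only does that arithmetic on the numbers reported above; the dictionary layout and the helper name are made up for illustration.

```python
# Figures copied from the benchmark results above.
RUNS = {
    "knn-recall-10-20_no_rescore": {
        "recall": 0.45, "nodes_visited": 15915.211, "qps": 1134.649,
    },
    "knn-recall-10-100-50": {
        "recall": 0.704, "nodes_visited": 36079.801, "qps": 451.596,
    },
    "knn-recall-10-1000-200": {
        "recall": 0.895, "nodes_visited": 115598.117, "qps": 167.806,
    },
}
BASELINE = "knn-recall-10-20_no_rescore"


def relative_to_baseline(runs: dict, baseline: str) -> dict:
    """Express each run's recall, nodes visited, and QPS relative to the baseline run."""
    base = runs[baseline]
    return {
        name: {
            "recall_gain": m["recall"] - base["recall"],
            "nodes_ratio": m["nodes_visited"] / base["nodes_visited"],
            "qps_ratio": m["qps"] / base["qps"],
        }
        for name, m in runs.items()
    }


if __name__ == "__main__":
    for name, rel in relative_to_baseline(RUNS, BASELINE).items():
        print(
            f"{name}: recall {rel['recall_gain']:+.3f}, "
            f"nodes x{rel['nodes_ratio']:.2f}, qps x{rel['qps_ratio']:.2f}"
        )
```

Read this way, the middle configuration visits a little over twice as many nodes as the baseline for a sizable recall gain, while the most aggressive one trades most of the multi-client throughput for the last stretch of recall.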

@benwtrent benwtrent deleted the feature/add-bbq-index-types branch October 15, 2024 00:26
@benwtrent
Member Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Oct 15, 2024
new index types of bbq_hnsw and bbq_flat which utilize the better binary quantization formats. A 32x reduction in memory, with nice recall properties.

(cherry picked from commit 6c752ab)
davidkyle pushed a commit that referenced this pull request Oct 15, 2024
new index types of bbq_hnsw and bbq_flat which utilize the better binary quantization formats. A 32x reduction in memory, with nice recall properties.
benwtrent added a commit that referenced this pull request Oct 15, 2024
…4783)

* Adding new bbq index types behind a feature flag (#114439)

new index types of bbq_hnsw and bbq_flat which utilize the better binary quantization formats. A 32x reduction in memory, with nice recall properties.

(cherry picked from commit 6c752ab)

* spotless
@benwtrent benwtrent changed the title from "Adding new bbq index types behind a feature flag" to "Adding new experimental bbq index types" Oct 15, 2024
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
new index types of bbq_hnsw and bbq_flat which utilize the better binary quantization formats. A 32x reduction in memory, with nice recall properties.
@flobernd
Member

flobernd commented Feb 3, 2025

Hi @benwtrent, would it be possible to add these types to the specification?

The types are currently missing in the client and there is no documentation available for them either.

Happy to add the new enum members myself, if somebody could provide a proper documentation string 🙂


Labels

auto-backport (Automatically create backport pull requests when merged), cloud-deploy (Publish cloud docker image for Cloud-First-Testing), >feature, :Search Relevance/Vectors (Vector search), Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), v8.16.0, v9.0.0

5 participants