DiskBBQ Updates #4037
Conversation
…etter basis for diskbbq in the docs at all; added that
> HNSW is a graph-based algorithm which only works efficiently when most vector data is held in memory. You should ensure that data nodes have at least enough RAM to hold the vector data and index structures.
> DiskBBQ is a clustering algorithm which can scale efficiently on a fraction of the total memory. You can start with enough RAM to hold the vector data and index structures, but it's likely you will be able to use significantly less than this and still maintain good performance. In testing we find this will be between 1% and 5% of the index structure size (centroids and quantized vector data) per unique query, where unique queries access non-overlapping clusters.
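For concreteness, here is a rough sketch of what that working-set figure implies. The helper name, parameters, and example figures are hypothetical, not from the docs or the codebase; the 1-5% band is just the number quoted above:

```python
# Hypothetical sketch: rough DiskBBQ working-set estimate based on the
# 1-5% per-unique-query figure above. Not an official formula.

def diskbbq_working_set_bytes(index_structure_bytes: int,
                              unique_queries: int,
                              fraction_per_query: float = 0.05) -> float:
    """Upper-bound RAM estimate: each unique query hitting
    non-overlapping clusters touches roughly 1-5% of the index
    structure (centroids + quantized vector data)."""
    return index_structure_bytes * fraction_per_query * unique_queries

# e.g. a 10 GiB index structure and 4 distinct query working sets:
print(diskbbq_working_set_bytes(10 * 2**30, 4) / 2**30)  # ~2.0 GiB at the 5% bound
```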
Let's be careful here: obviously, going to disk is always slower than just reading things in memory. We need to clarify that the performance degrades more linearly than with HNSW, which degrades exponentially.
I think calling out these percentages is OK.
Makes sense, I'll reword to clarify.
Reworded this a bit; let me know what you think now.
> If utilizing HNSW, the graph must also be in memory. To estimate the required bytes, use `num_vectors * 4 * HNSW.m`. The default value for `HNSW.m` is 16, so by default `num_vectors * 4 * 16`.
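As a quick worked example of that estimate (the helper name and example figures are mine, purely illustrative):

```python
# Illustrative only: HNSW graph memory per the formula quoted above.

def hnsw_graph_bytes(num_vectors: int, m: int = 16) -> int:
    # num_vectors * 4 * HNSW.m, with m defaulting to 16
    return num_vectors * 4 * m

print(hnsw_graph_bytes(1_000_000))  # 64,000,000 bytes (~61 MiB) for 1M vectors
```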
> If utilizing DiskBBQ, a fraction of the clusters and centroids will need to be in memory. When doing this estimation, it makes more sense to include both the index structure and the quantized vectors together, as the structures are dependent. To estimate the total bytes, we compute the cost of the centroids as `num_clusters * num_dimensions * 2 * 4 + num_clusters * (num_dimensions + 14)`, plus the cost of the quantized vectors within the clusters as `num_vectors * ((num_dimensions/8 + 14 + 2) * 2)`, where `num_clusters` is defined as `num_vectors / vectors_per_cluster`, which by default will be `num_vectors / 384`.
This seems pretty complicated...
In `num_clusters * num_dimensions * 2 * 4`, do you need two floating-point representations of the clusters?
That 2 is definitely not right. Removing that.
Went back and double-checked what bytes are there.
`posting list w/o vectors = num_clusters * (centroid_bytes + dotProd(centroid,centroid) + clusterSize + encoding)`
`posting list w/o vectors = num_clusters * ((dims * 4) + 4 + 4 + 1)`
I don't think it's worth it to describe the additional bytes; that would just make this even more complicated.
Suggestions for simplifying it are welcome. It probably seems more complex because it includes both the "index structure" for the centroids and the vectors, because of SOAR. Maybe I can reference the BBQ quantized calculation from above and somewhat simplify the centroid figures. So something like this instead. Thoughts?
the cost of the centroids as `num_clusters * (num_dimensions * 4 + (num_dimensions + 14))`
plus the cost of the quantized vectors within the clusters as `bbq_quantized_vectors * 2`
The downside is that this ignores the doc id cost (2 bytes) per vector. I almost feel like mentioning it makes this more complicated rather than less, if we break it up like this, for instance:
plus the cost of the quantized vectors within the clusters as `bbq_quantized_vectors * 2 + num_vectors * 2 * 2`
I'll think about it some more.
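In the meantime, here is a hedged sketch of the simplified version, with the erroneous factor of 2 dropped from the centroid term as discussed above. The function name, defaults, and example figures are illustrative, not final doc text:

```python
# Sketch of the simplified DiskBBQ sizing from this thread. The per-vector
# and per-cluster overhead bytes (14, 2, etc.) come from the discussion
# above and are approximations, not a spec.

def diskbbq_index_bytes(num_vectors: int, num_dims: int,
                        vectors_per_cluster: int = 384) -> int:
    num_clusters = num_vectors // vectors_per_cluster
    # Centroids: float32 centroid (num_dims * 4) plus the (num_dims + 14)
    # overhead term, with the stray factor of 2 removed.
    centroids = num_clusters * (num_dims * 4 + (num_dims + 14))
    # Quantized vectors: ~(num_dims/8 + 14) bytes each, doubled for SOAR...
    quantized = num_vectors * (num_dims // 8 + 14) * 2
    # ...plus a 2-byte doc id per stored copy (the term debated above).
    doc_ids = num_vectors * 2 * 2
    return centroids + quantized + doc_ids

# e.g. 1M vectors at 768 dims:
print(diskbbq_index_bytes(1_000_000, 768))
```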
… in the overhead calcs
…o diskbbq_updates
Added content around DiskBBQ to the ANN docs. Specifically, we needed this so we could address sizing for DiskBBQ.
I have a somewhat related PR here: elastic/elasticsearch#138433, which should probably go in first.