From f6c4058c93db11e32baa4c4cbcb9d68f8ab97ca7 Mon Sep 17 00:00:00 2001 From: John Wagster Date: Wed, 3 Sep 2025 14:51:32 -0500 Subject: [PATCH 1/4] updating docs to reflect new bbq_disk format and changes in the mapping and query apis --- .../mapping-reference/dense-vector.md | 17 ++++++++++++----- .../rest-apis/retrievers/knn-retriever.md | 9 +++++++++ .../query-dsl/query-dsl-knn-query.md | 14 ++++++++++++-- 3 files changed, 33 insertions(+), 7 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/dense-vector.md b/docs/reference/elasticsearch/mapping-reference/dense-vector.md index de6d82104c005..db93d9221296c 100644 --- a/docs/reference/elasticsearch/mapping-reference/dense-vector.md +++ b/docs/reference/elasticsearch/mapping-reference/dense-vector.md @@ -341,18 +341,19 @@ $$$dense-vector-index-options$$$ `type` : (Required, string) The type of kNN algorithm to use. Can be either any of: * `hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) for scalable approximate kNN search. This supports all `element_type` values. - * `int8_hnsw` - The default index type for some float vectors: + * `int8_hnsw` - The default index type for some float vectors: * {applies_to}`stack: ga 9.1` Default for float vectors with less than 384 dimensions. * {applies_to}`stack: ga 9.0` Default for float all vectors. This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization). * `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 8x at the cost of some accuracy. 
See [Automatically quantize vectors for kNN search](#dense-vector-quantization). * `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization). - + {applies_to}`stack: ga 9.1` `bbq_hnsw` is the default index type for float vectors with greater than or equal to 384 dimensions. * `flat` - This utilizes a brute-force search algorithm for exact kNN search. This supports all `element_type` values. - * `int8_flat` - This utilizes a brute-force search algorithm in addition to automatically scalar quantization. Only supports `element_type` of `float`. - * `int4_flat` - This utilizes a brute-force search algorithm in addition to automatically half-byte scalar quantization. Only supports `element_type` of `float`. - * `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatically binary quantization. Only supports `element_type` of `float`. + * `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float`. + * `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float`. + * `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float`. + * {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a clustering search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float`. `m` : (Optional, integer) The number of neighbors each node will be connected to in the HNSW graph. Defaults to `16`. Only applicable to `hnsw`, `int8_hnsw`, `int4_hnsw` and `bbq_hnsw` index types. 
@@ -363,6 +364,12 @@ $$$dense-vector-index-options$$$ `confidence_interval` : (Optional, float) Only applicable to `int8_hnsw`, `int4_hnsw`, `int8_flat`, and `int4_flat` index types. The confidence interval to use when quantizing the vectors. Can be any value between and including `0.90` and `1.0` or exactly `0`. When the value is `0`, this indicates that dynamic quantiles should be calculated for optimized quantization. When between `0.90` and `1.0`, this value restricts the values used when calculating the quantization thresholds. For example, a value of `0.95` will only use the middle 95% of the values when calculating the quantization thresholds (e.g. the highest and lowest 2.5% of values will be ignored). Defaults to `1/(dims + 1)` for `int8` quantized vectors and `0` for `int4` for dynamic quantile calculation. +`default_visit_percentage` {applies_to}`stack: ga 9.2` +: (Optional, integer) Only applicable to `bbq_disk`. Must be between 0 and 100. 0 will default to using `num_candidates` for calculating the percent visited. Increasing `default_visit_percentage` tends to improve the accuracy of the final results. Defaults to ~1% per shard for every 1 million vectors. + +`cluster_size` {applies_to}`stack: ga 9.2` +: (Optional, integer) Only applicable to `bbq_disk`. The number of vectors per cluster. Smaller cluster sizes increase accuracy at the cost of performance. Defaults to `384`. Must be a value between `64` and `65536`. + `rescore_vector` {applies_to}`stack: preview 9.0, ga 9.1` : (Optional, object) An optional section that configures automatic vector rescoring on knn queries for the given field. Only applicable to quantized index types. 
:::::{dropdown} Properties of rescore_vector diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md b/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md index 12da522214383..2b7f162255035 100644 --- a/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md +++ b/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md @@ -41,6 +41,15 @@ A kNN retriever returns top documents from a [k-nearest neighbor search (kNN)](d The number of nearest neighbor candidates to consider per shard. Needs to be greater than `k`, or `size` if `k` is omitted, and cannot exceed 10,000. {{es}} collects `num_candidates` results from each shard, then merges them to find the top `k` results. Increasing `num_candidates` tends to improve the accuracy of the final `k` results. Defaults to `Math.min(1.5 * k, 10_000)`. +```{applies_to} +stack: ga 9.2 +``` +`visit_percentage` +: (Optional, float) + + The percentage of vectors to explore per shard while doing knn search with `bbq_disk`. Must be between 0 and 100. 0 will default to using `num_candidates` for calculating the percent visited. Increasing `visit_percentage` tends to improve the accuracy of the final results. If `visit_percentage` is set for `bbq_disk`, `num_candidates` is ignored. Defaults to ~1% per shard for every 1 million vectors. 
+ + `filter` : (Optional, [query object or list of query objects](/reference/query-languages/querydsl.md)) diff --git a/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md b/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md index 4af581e466a7a..36e3ae810921f 100644 --- a/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md +++ b/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md @@ -2,6 +2,9 @@ navigation_title: "Knn" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html +applies_to: + stack: all + serverless: all --- # Knn query [query-dsl-knn-query] @@ -87,6 +90,13 @@ If all queried fields are of type [semantic_text](/reference/elasticsearch/mappi : (Optional, integer) The number of nearest neighbor candidates to consider per shard while doing knn search. Cannot exceed 10,000. Increasing `num_candidates` tends to improve the accuracy of the final results. Defaults to `1.5 * k` if `k` is set, or `1.5 * size` if `k` is not set. +```{applies_to} +stack: ga 9.2 +``` +`visit_percentage` +: (Optional, float) The percentage of vectors to explore per shard while doing knn search with `bbq_disk`. Must be between 0 and 100. 0 will default to using `num_candidates` for calculating the percent visited. Increasing `visit_percentage` tends to improve the accuracy of the final results. If `visit_percentage` is set for `bbq_disk`, `num_candidates` is ignored. Defaults to ~1% per shard for every 1 million vectors. + + `filter` : (Optional, query object) Query to filter the documents that can match. The kNN search will return the top documents that also match this filter. The value can be a single query or a list of queries. If `filter` is not provided, all documents are allowed to match. @@ -108,7 +118,7 @@ The filter is a pre-filter, meaning that it is applied **during** the approximat : (Optional, object) Apply oversampling and rescoring to quantized vectors. 
**Parameters for `rescore_vector`**: - + `oversample` : (Required, float) @@ -116,7 +126,7 @@ The filter is a pre-filter, meaning that it is applied **during** the approximat * Retrieve `num_candidates` candidates per shard. * From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors. - * The top `k` rescored candidates will be returned. Must be one of the following values: + * The top `k` rescored candidates will be returned. Must be one of the following values: * \>= 1f to indicate the oversample factor * Exactly `0` to indicate that no oversampling and rescoring should occur. {applies_to}`stack: ga 9.1` From 3d3275a3ccfe1cb043911cca8bbcdc9fbb84ab6d Mon Sep 17 00:00:00 2001 From: John Wagster Date: Wed, 3 Sep 2025 16:16:50 -0500 Subject: [PATCH 2/4] added a bit of 'when to use this' which we should iterate on --- docs/reference/elasticsearch/mapping-reference/dense-vector.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/dense-vector.md b/docs/reference/elasticsearch/mapping-reference/dense-vector.md index db93d9221296c..eeac60325befc 100644 --- a/docs/reference/elasticsearch/mapping-reference/dense-vector.md +++ b/docs/reference/elasticsearch/mapping-reference/dense-vector.md @@ -353,7 +353,7 @@ $$$dense-vector-index-options$$$ * `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float`. * `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float`. * `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float`. - * {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a clustering search algorithm in addition to automatic binary quantization. 
Only supports `element_type` of `float`. + * {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of the [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as with HNSW. Only supports `element_type` of `float`. This combines the benefits of BBQ quantization with partitioning to further reduce the required memory overhead when compared with HNSW and can effectively be run at the smallest possible RAM and heap sizes when HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with the total RAM, and search performance is enhanced at scale because only a subset of the total vector space is loaded. `m` : (Optional, integer) The number of neighbors each node will be connected to in the HNSW graph. Defaults to `16`. Only applicable to `hnsw`, `int8_hnsw`, `int4_hnsw` and `bbq_hnsw` index types. From 424a05ee83da3b8a838a38906c381eb262de9fc3 Mon Sep 17 00:00:00 2001 From: John Wagster Date: Thu, 4 Sep 2025 09:31:52 -0500 Subject: [PATCH 3/4] Update docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md Co-authored-by: Liam Thompson --- .../elasticsearch/rest-apis/retrievers/knn-retriever.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md b/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md index 2b7f162255035..8133b631ae807 100644 --- a/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md +++ b/docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md @@ -41,10 +41,7 @@ A kNN retriever returns top documents from a [k-nearest neighbor search (kNN)](d The number of nearest neighbor candidates to consider per shard. Needs to be greater than `k`, or `size` if `k` is omitted, and cannot exceed 10,000. 
{{es}} collects `num_candidates` results from each shard, then merges them to find the top `k` results. Increasing `num_candidates` tends to improve the accuracy of the final `k` results. Defaults to `Math.min(1.5 * k, 10_000)`. -```{applies_to} -stack: ga 9.2 -``` -`visit_percentage` +`visit_percentage` {applies_to}`stack: ga 9.2` : (Optional, float) The percentage of vectors to explore per shard while doing knn search with `bbq_disk`. Must be between 0 and 100. 0 will default to using `num_candidates` for calculating the percent visited. Increasing `visit_percentage` tends to improve the accuracy of the final results. If `visit_percentage` is set for `bbq_disk`, `num_candidates` is ignored. Defaults to ~1% per shard for every 1 million vectors. From e3978c776ebbcf84326512d6f73fc26c08ba3d0f Mon Sep 17 00:00:00 2001 From: John Wagster Date: Thu, 4 Sep 2025 09:32:04 -0500 Subject: [PATCH 4/4] Update docs/reference/query-languages/query-dsl/query-dsl-knn-query.md Co-authored-by: Liam Thompson --- .../query-languages/query-dsl/query-dsl-knn-query.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md b/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md index 36e3ae810921f..f94b310b233ac 100644 --- a/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md +++ b/docs/reference/query-languages/query-dsl/query-dsl-knn-query.md @@ -90,10 +90,7 @@ If all queried fields are of type [semantic_text](/reference/elasticsearch/mappi : (Optional, integer) The number of nearest neighbor candidates to consider per shard while doing knn search. Cannot exceed 10,000. Increasing `num_candidates` tends to improve the accuracy of the final results. Defaults to `1.5 * k` if `k` is set, or `1.5 * size` if `k` is not set. 
-```{applies_to} -stack: ga 9.2 -``` -`visit_percentage` +`visit_percentage` {applies_to}`stack: ga 9.2` : (Optional, float) The percentage of vectors to explore per shard while doing knn search with `bbq_disk`. Must be between 0 and 100. 0 will default to using `num_candidates` for calculating the percent visited. Increasing `visit_percentage` tends to improve the accuracy of the final results. If `visit_percentage` is set for `bbq_disk`, `num_candidates` is ignored. Defaults to ~1% per shard for every 1 million vectors.
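
Reviewer note: the series documents `bbq_disk`, `cluster_size`, `default_visit_percentage`, and `visit_percentage` but adds no request example. The following is a minimal sketch of how the request bodies might be assembled, not text from the patch. It only builds plain dictionaries and enforces the ranges stated in the docs above (`cluster_size` between `64` and `65536`, percentages between 0 and 100). The field name `my_vector`, the helper names `bbq_disk_mapping` and `knn_query`, and the overall body layout are assumptions made for illustration; nothing here was run against a cluster.

```python
import json


def bbq_disk_mapping(field, dims, cluster_size=384, default_visit_percentage=None):
    """Build an index-creation body for a dense_vector field using bbq_disk.

    Hypothetical helper: the ranges mirror the documented limits in this patch.
    """
    if not 64 <= cluster_size <= 65536:
        raise ValueError("cluster_size must be between 64 and 65536")
    index_options = {"type": "bbq_disk", "cluster_size": cluster_size}
    if default_visit_percentage is not None:
        if not 0 <= default_visit_percentage <= 100:
            raise ValueError("default_visit_percentage must be between 0 and 100")
        index_options["default_visit_percentage"] = default_visit_percentage
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index_options": index_options,
                }
            }
        }
    }


def knn_query(field, query_vector, k, visit_percentage=None):
    """Build a knn query body; visit_percentage is only added when given."""
    knn = {"field": field, "query_vector": list(query_vector), "k": k}
    if visit_percentage is not None:
        if not 0 <= visit_percentage <= 100:
            raise ValueError("visit_percentage must be between 0 and 100")
        # Per the docs in this patch, num_candidates is ignored for bbq_disk
        # when visit_percentage is set, so it is left out here.
        knn["visit_percentage"] = visit_percentage
    return {"query": {"knn": knn}}


if __name__ == "__main__":
    print(json.dumps(bbq_disk_mapping("my_vector", dims=3), indent=2))
    print(json.dumps(knn_query("my_vector", [0.1, 0.2, 0.3], k=10, visit_percentage=2.0), indent=2))
```

If an example along these lines is wanted in the docs, it would presumably be rewritten as a `console` snippet to match the surrounding pages.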