
Commit c1e2ded

Update docs on dense_vector for new types
1 parent bcb861a commit c1e2ded

File tree

1 file changed (+9 -3)

docs/reference/elasticsearch/mapping-reference/dense-vector.md

Lines changed: 9 additions & 3 deletions
@@ -156,7 +156,7 @@ This setting is compatible with synthetic `_source`, where the entire `_source`

### Rehydration and precision

-When vector values are rehydrated (e.g., for reindex, recovery, or explicit `_source` requests), they are restored from their internal format. Internally, vectors are stored at float precision, so if they were originally indexed as higher-precision types (e.g., `double` or `long`), the rehydrated values will have reduced precision. This lossy representation is intended to save space while preserving search quality.
+When vector values are rehydrated (e.g., for reindex, recovery, or explicit `_source` requests), they are restored from their internal format. By default, vectors are stored at float precision, so if they were originally indexed as higher-precision types (e.g., `double` or `long`), the rehydrated values will have reduced precision. This lossy representation is intended to save space while preserving search quality. Additionally, using an `element_type` of `bfloat16` causes a further loss of precision in restored vectors.

### Storing original vectors in `_source`

@@ -283,12 +283,15 @@ The following mapping parameters are accepted:
$$$dense-vector-element-type$$$

`element_type`
-: (Optional, string) The data type used to encode vectors. The supported data types are `float` (default), `byte`, and `bit`.
+: (Optional, string) The data type used to encode vectors. The supported data types are `float` (default), `bfloat16`, `byte`, and `bit`.

::::{dropdown} Valid values for element_type
`float`
: indexes a 4-byte floating-point value per dimension. This is the default value.

+`bfloat16` {applies_to}`stack: ga 9.3`
+: indexes a 2-byte floating-point value per dimension. This uses the bfloat16 encoding, _not_ IEEE-754 float16, to maintain the same value range as 4-byte floats. Using `bfloat16` is likely to cause a loss of precision in the stored values compared to `float`.
+
`byte`
: indexes a 1-byte integer value per dimension.

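To put the new element type in context, here is a minimal mapping sketch. The index name, field name, and dimension count are hypothetical, and it assumes a cluster at or above the 9.3 version noted in the `{applies_to}` annotation:

```console
PUT my-bfloat16-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "element_type": "bfloat16"
      }
    }
  }
}
```

Each dimension is then stored in 2 bytes instead of 4, at the cost of the precision loss described in the rehydration section above.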
@@ -362,7 +365,7 @@ $$$dense-vector-index-options$$$
* `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float`.
* `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float`.
* `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float`.
-* {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as in with HNSW. Only supports `element_type` of `float`. This combines the benefits of BBQ quantization with partitioning to further reduces the required memory overhead when compared with HNSW and can effectively be run at the smallest possible RAM and heap sizes when HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with the total RAM. And search performance is enhanced at scale as a subset of the total vector space is loaded.
+* {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of the [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as with HNSW. Only supports `element_type` of `float` or `bfloat16`. This combines the benefits of BBQ quantization with partitioning to further reduce the required memory overhead compared with HNSW, and can effectively run at the smallest possible RAM and heap sizes where HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with total RAM, and search performance is enhanced at scale because only a subset of the total vector space is loaded.

`m`
: (Optional, integer) The number of neighbors each node will be connected to in the HNSW graph. Defaults to `16`. Only applicable to `hnsw`, `int8_hnsw`, `int4_hnsw` and `bbq_hnsw` index types.
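A rough sketch of a mapping that opts into the `bbq_disk` option described above. Names and dimension count are hypothetical; it assumes a 9.2+ stack per the `{applies_to}` annotation, and uses `bfloat16` only because the updated text says that element type is now supported for this index type:

```console
PUT my-diskbbq-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "element_type": "bfloat16",
        "index_options": {
          "type": "bbq_disk"
        }
      }
    }
  }
}
```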
@@ -390,6 +393,9 @@ $$$dense-vector-index-options$$$
: In case a knn query specifies a `rescore_vector` parameter, the query `rescore_vector` parameter will be used instead.
: See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
:::::
+
+`on_disk_rescore` {applies_to}`stack: ga 9.3` {applies_to}`serverless: unavailable`
+: (Optional, boolean) Only applicable to quantized index types. When `true`, vector rescoring reads the raw vector data directly from disk instead of copying it into memory. This can improve performance when the vector data is larger than the available RAM. The setting only applies to newly indexed vectors; after changing it, existing vectors must be reindexed or force-merged for the change to take effect across the whole index. Defaults to `false`.
::::

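A minimal sketch of the new parameter. Names and dimension count are hypothetical; it assumes `on_disk_rescore` is set inside `index_options` alongside the other parameters in this section, and that a quantized index type such as `bbq_hnsw` is in use, per the description above:

```console
PUT my-rescore-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index_options": {
          "type": "bbq_hnsw",
          "on_disk_rescore": true
        }
      }
    }
  }
}
```

Because the setting only affects newly indexed vectors, an existing index would need a reindex or force-merge after changing it.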
0 commit comments
