Commit 403ed56

authored
Updating knn tuning guide and size estimates (elastic#115691) (elastic#115752)
1 parent 9c6eef2 commit 403ed56

1 file changed: +21 -9 lines changed


docs/reference/how-to/knn-search.asciidoc

Lines changed: 21 additions & 9 deletions
@@ -16,10 +16,11 @@ structures. So these same recommendations also help with indexing speed.
 The default <<dense-vector-element-type,`element_type`>> is `float`. But this
 can be automatically quantized during index time through
 <<dense-vector-quantization,`quantization`>>. Quantization will reduce the
-required memory by 4x, but it will also reduce the precision of the vectors and
-increase disk usage for the field (by up to 25%). Increased disk usage is a
+required memory by 4x, 8x, or as much as 32x, but it will also reduce the precision of the vectors and
+increase disk usage for the field (by up to 25%, 12.5%, or 3.125%, respectively). Increased disk usage is a
 result of {es} storing both the quantized and the unquantized vectors.
-For example, when quantizing 40GB of floating point vectors an extra 10GB of data will be stored for the quantized vectors. The total disk usage amounts to 50GB, but the memory usage for fast search will be reduced to 10GB.
+For example, when int8 quantizing 40GB of floating point vectors an extra 10GB of data will be stored for the quantized vectors.
+The total disk usage amounts to 50GB, but the memory usage for fast search will be reduced to 10GB.
 
 For `float` vectors with `dim` greater than or equal to `384`, using a
 <<dense-vector-quantization,`quantized`>> index is highly recommended.
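
The arithmetic in the added lines of this hunk can be checked with a small sketch. The helper below is hypothetical and not part of {es}; the mapping of `int8`, `int4`, and `bbq` to the 4x, 8x, and 32x reduction factors is an assumption inferred from the order in which the hunk lists them and from the `int8` example.

```python
# Hypothetical helper for illustration only; not an Elasticsearch API.
# Assumed mapping of quantization level to memory reduction factor,
# inferred from the "4x, 8x, or as much as 32x" wording in the diff.
QUANTIZATION_REDUCTION = {"int8": 4, "int4": 8, "bbq": 32}


def quantized_footprint(raw_gb, quantization):
    """Return quantized size, total disk usage, and disk overhead for a field.

    raw_gb is the size of the unquantized float vectors. Both the raw and
    the quantized copies are stored on disk, so disk usage grows while the
    memory needed for fast search shrinks to the quantized size.
    """
    factor = QUANTIZATION_REDUCTION[quantization]
    quantized_gb = raw_gb / factor
    return {
        "quantized_gb": quantized_gb,
        "total_disk_gb": raw_gb + quantized_gb,
        "disk_overhead_pct": 100.0 / factor,
    }


if __name__ == "__main__":
    # The 40GB int8 example from the diff.
    print(quantized_footprint(40, "int8"))
```

For the 40GB `int8` case this reproduces the figures in the diff: 10GB of quantized data, 50GB total on disk, and a 25% disk overhead.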
@@ -68,12 +69,23 @@ Another option is to use <<synthetic-source,synthetic `_source`>>.
 kNN search. HNSW is a graph-based algorithm which only works efficiently when
 most vector data is held in memory. You should ensure that data nodes have at
 least enough RAM to hold the vector data and index structures. To check the
-size of the vector data, you can use the <<indices-disk-usage>> API. As a
-loose rule of thumb, and assuming the default HNSW options, the bytes used will
-be `num_vectors * 4 * (num_dimensions + 12)`. When using the `byte` <<dense-vector-element-type,`element_type`>>
-the space required will be closer to `num_vectors * (num_dimensions + 12)`. Note that
-the required RAM is for the filesystem cache, which is separate from the Java
-heap.
+size of the vector data, you can use the <<indices-disk-usage>> API.
+
+Here are estimates for different element types and quantization levels:
++
+--
+`element_type: float`: `num_vectors * num_dimensions * 4`
+`element_type: float` with `quantization: int8`: `num_vectors * (num_dimensions + 4)`
+`element_type: float` with `quantization: int4`: `num_vectors * (num_dimensions/2 + 4)`
+`element_type: float` with `quantization: bbq`: `num_vectors * (num_dimensions/8 + 12)`
+`element_type: byte`: `num_vectors * num_dimensions`
+`element_type: bit`: `num_vectors * (num_dimensions/8)`
+--
+
+If utilizing HNSW, the graph must also be in memory, to estimate the required bytes use `num_vectors * 4 * HNSW.m`. The
+default value for `HNSW.m` is 16, so by default `num_vectors * 4 * 16`.
+
+Note that the required RAM is for the filesystem cache, which is separate from the Java heap.
 
 The data nodes should also leave a buffer for other ways that RAM is needed.
 For example your index might also include text fields and numerics, which also
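
As a rough aid for reviewing the new size estimates, the formulas added in this hunk can be transcribed into a short Python sketch. The function and parameter names are hypothetical and chosen for illustration; only the formulas and the default `HNSW.m` of 16 come from the diff. The results are estimates of filesystem cache needs, not exact measurements.

```python
# Hypothetical calculator that transcribes the estimates from the diff.
# Names are illustrative; this is not an Elasticsearch API.

def vector_data_bytes(num_vectors, num_dimensions,
                      element_type="float", quantization=None):
    """Estimate bytes needed to hold the vector data in memory."""
    if element_type == "float":
        if quantization is None:
            return num_vectors * num_dimensions * 4
        if quantization == "int8":
            return num_vectors * (num_dimensions + 4)
        if quantization == "int4":
            return num_vectors * (num_dimensions / 2 + 4)
        if quantization == "bbq":
            return num_vectors * (num_dimensions / 8 + 12)
    elif quantization is None:
        if element_type == "byte":
            return num_vectors * num_dimensions
        if element_type == "bit":
            return num_vectors * (num_dimensions / 8)
    raise ValueError(
        f"no estimate for element_type={element_type!r}, "
        f"quantization={quantization!r}"
    )


def hnsw_graph_bytes(num_vectors, m=16):
    """Estimate bytes for the HNSW graph: num_vectors * 4 * HNSW.m."""
    return num_vectors * 4 * m


def total_ram_bytes(num_vectors, num_dimensions,
                    element_type="float", quantization=None, m=16):
    """Vector data plus HNSW graph, both of which should fit in RAM."""
    return (vector_data_bytes(num_vectors, num_dimensions,
                              element_type, quantization)
            + hnsw_graph_bytes(num_vectors, m))


if __name__ == "__main__":
    # Example: one million 768-dimensional float vectors with int8 quantization.
    estimate = total_ram_bytes(1_000_000, 768, quantization="int8")
    print(f"{estimate / 1e9:.3f} GB")
```

Under these formulas, one million 768-dimensional vectors need about 3.07 GB unquantized and about 0.77 GB with `int8` quantization, plus roughly 0.06 GB for the graph at the default `m`.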
