**docs/reference/elasticsearch/mapping-reference/dense-vector.md** (+69 −19)
@@ -29,10 +29,13 @@ PUT my-index
```console
"properties": {
  "my_vector": {
    "type": "dense_vector",
    "dims": 3,
    "index_options": {
      "type": "bbq_disk" <1>
    }
  },
  "my_text": {
    "type": "keyword"
  }
}
}
```
@@ -50,6 +53,8 @@ PUT my-index/_doc/2
```console
  "my_vector" : [-0.5, 10, 10]
}
```

1. (Optional) Controls how vectors are indexed internally for kNN search. In this example, `bbq_disk` enables [disk-based binary quantization](/reference/elasticsearch/mapping-reference/bbq.md#bbq-disk), which can significantly reduce memory usage for large vector datasets. If you don’t specify `index_options`, {{es}} automatically selects a default indexing strategy based on the vector type and dimensions. To learn more about the available index options and how they affect vector quantization, refer to [Automatically quantize vectors for kNN search](#dense-vector-quantization).

:::
:::{tab-item} Base64-encoded string
@@ -338,34 +343,60 @@ This configuration is appropriate when full source fidelity is required, such as
## Automatically quantize vectors for kNN search [dense-vector-quantization]

The `dense_vector` field type supports quantization to reduce the memory footprint required when [searching](docs-content://solutions/search/vector/knn.md#approximate-knn) `float` vectors. The supported vector quantization strategies for `dense_vector` kNN indexing are:

* [`int8`](#dense-vector-quantization-int8)
* [`int4`](#dense-vector-quantization-int4)
* [`bbq`](#dense-vector-quantization-bbq)

Here is an example of configuring disk-based binary quantization using `bbq_disk`:

```console
PUT my-bbq-disk-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "index_options": {
          "type": "bbq_disk"
        }
      }
    }
  }
}
```

Quantized vectors can use [oversampling and rescoring](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) to improve accuracy on approximate kNN search results.

::::{note}
Quantization will continue to keep the raw float vector values on disk for reranking, reindexing, and quantization improvements over the lifetime of the data. This means disk usage will increase by ~25% for `int8`, ~12.5% for `int4`, and ~3.1% for `bbq` due to the overhead of storing the quantized and raw vectors.
::::
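The percentages quoted above follow directly from per-dimension storage cost: `float32` spends 32 bits per dimension, `int8` spends 8, `int4` spends 4, and `bbq` a single bit. A quick back-of-the-envelope sketch in plain Python (illustrative arithmetic only, not an Elasticsearch API):

```python
# In-memory cost per vector at each precision, versus raw float32.
# Illustrative arithmetic only; real index sizes also include HNSW
# graph overhead and the raw float vectors kept on disk.

def vector_bytes(dims: int, bits_per_dim: int) -> float:
    """Bytes needed to hold one vector at the given per-dimension precision."""
    return dims * bits_per_dim / 8

def memory_saved(bits_per_dim: int) -> float:
    """Fraction of memory saved relative to float32 (32 bits per dimension)."""
    return 1 - bits_per_dim / 32

for name, bits in [("int8", 8), ("int4", 4), ("bbq", 1)]:
    # int8 -> 75% saved (4x), int4 -> 87.5% (8x), bbq -> ~96.9% (32x)
    print(f"{name}: {vector_bytes(384, bits):.0f} bytes per 384-dim vector, "
          f"{memory_saved(bits):.1%} saved")
```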
### Default quantization types

::::{applies-switch}

:::{applies-item} stack: ga 9.0
When indexing `float` vectors, the default index type is `int8_hnsw`.
:::

:::{applies-item} stack: ga 9.1+
When indexing `float` vectors, the default index type is:

* `bbq_hnsw` for vectors with greater than or equal to 384 dimensions
* `int8_hnsw` for vectors with less than 384 dimensions
:::

::::
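The dimension cutoff above is easy to express in code. A minimal sketch (the helper name is ours, mirroring the documented defaults; the real selection happens inside Elasticsearch when `index_options` is omitted):

```python
# Which default index type applies to float vectors, per the rules above.
# Hypothetical helper for illustration only.

def default_index_type(dims: int, stack_version: tuple[int, int]) -> str:
    if stack_version >= (9, 1) and dims >= 384:
        return "bbq_hnsw"   # large vectors default to binary quantization
    return "int8_hnsw"      # smaller vectors, and all of 9.0, use int8

print(default_index_type(768, (9, 1)))   # bbq_hnsw
print(default_index_type(128, (9, 1)))   # int8_hnsw
print(default_index_type(768, (9, 0)))   # int8_hnsw
```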
### int8 [dense-vector-quantization-int8]

Quantizes each dimension of the vector to 1-byte integers. This reduces the memory footprint by 75% (or 4x) at the cost of some accuracy.

Here is an example of how to create a byte-quantized index:

```console
@@ -386,6 +417,10 @@ PUT my-byte-quantized-index
}
```

### int4 [dense-vector-quantization-int4]

Quantizes each dimension of the vector to half-byte integers. This reduces the memory footprint by 87% (or 8x) at the cost of accuracy.

Here is an example of how to create a half-byte-quantized index:

```console
@@ -406,6 +441,17 @@ PUT my-byte-quantized-index
}
```

::::{note}
`int4` quantization requires an even number of vector dimensions.
::::

### bbq [dense-vector-quantization-bbq]

`bbq` or [Better binary quantization](/reference/elasticsearch/mapping-reference/bbq.md) reduces each dimension to a single bit precision. This reduces the memory footprint by 96% (or 32x) at a larger cost of accuracy. Generally, [oversampling](/reference/elasticsearch/mapping-reference/bbq.md#bbq-oversampling) during query time and reranking can help mitigate the accuracy loss. You can choose one of the following BBQ index types:
**docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md** (+71 −1)
@@ -214,4 +214,74 @@ PUT my-index-000003
```console
}
}
```

## Set `index_options` for `sparse_vectors` [index-options-sparse_vectors]

```{applies_to}
stack: ga 9.2
```

Configuring `index_options` for [sparse vector fields](/reference/elasticsearch/mapping-reference/sparse-vector.md) lets you configure [token pruning](/reference/elasticsearch/mapping-reference/sparse-vector.md#token-pruning), which controls whether non-significant or overly frequent tokens are omitted to improve query performance.

The following example enables token pruning and sets pruning thresholds for a `sparse_vector` field:

```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "index_options": {
          "sparse_vector": {
            "prune": true, <1>
            "pruning_config": {
              "tokens_freq_ratio_threshold": 10, <2>
              "tokens_weight_threshold": 0.5 <3>
            }
          }
        }
      }
    }
  }
}
```

1. (Optional) Enables pruning. Default is `true`.
2. (Optional) Prunes tokens whose frequency is more than 10 times the average token frequency in the field. Default is `5`.
3. (Optional) Prunes tokens whose weight is lower than 0.5. Default is `0.4`.

Learn more about [sparse_vector index options](/reference/elasticsearch/mapping-reference/sparse-vector.md#sparse-vector-index-options) settings and [token pruning](/reference/elasticsearch/mapping-reference/sparse-vector.md#token-pruning).
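To make the two thresholds concrete, here is a rough sketch of the pruning rule in plain Python. This is our simplification of the behavior the callouts describe (a token is dropped when it is overly frequent relative to the field average, or when its weight falls below the significance threshold), not Elasticsearch's implementation:

```python
# Rough, illustrative sketch of sparse-vector token pruning.
# Not Elasticsearch's implementation.

def prune_tokens(
    weights: dict[str, float],                 # token -> weight
    freqs: dict[str, int],                     # token -> frequency in the field
    tokens_freq_ratio_threshold: float = 5,    # documented default
    tokens_weight_threshold: float = 0.4,      # documented default
) -> dict[str, float]:
    avg_freq = sum(freqs.values()) / len(freqs)
    return {
        tok: w
        for tok, w in weights.items()
        if freqs[tok] <= tokens_freq_ratio_threshold * avg_freq
        and w >= tokens_weight_threshold
    }

weights = {"the": 0.05, "comfortable": 0.9, "shoes": 0.7}
freqs = {"the": 10_000, "comfortable": 40, "shoes": 120}
kept = prune_tokens(weights, freqs)
# "the" is dropped: its weight (0.05) is below the 0.4 threshold.
```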
## Set `index_options` for `dense_vectors` [index-options-dense_vectors]

Configuring `index_options` for [dense vector fields](/reference/elasticsearch/mapping-reference/dense-vector.md) lets you control how dense vectors are indexed for kNN search. You can select the indexing algorithm, such as `int8_hnsw`, `int4_hnsw`, or `bbq_disk`, among [other available index options](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).

The following example shows how to configure `index_options` for a dense vector field using the `int8_hnsw` indexing algorithm:

```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "index_options": {
          "dense_vector": {
            "type": "int8_hnsw", <1>
            "m": 15, <2>
            "ef_construction": 90, <3>
            "confidence_interval": 0.95 <4>
          }
        }
      }
    }
  }
}
```

1. (Optional) Selects the `int8_hnsw` vector quantization strategy. Learn about [default quantization types](/reference/elasticsearch/mapping-reference/dense-vector.md#default-quantization-types).
2. (Optional) Sets `m` to 15 to control how many neighbors each node connects to in the HNSW graph. Default is `16`.
3. (Optional) Sets `ef_construction` to 90 to control how many candidate neighbors are considered during graph construction. Default is `100`.
4. (Optional) Sets `confidence_interval` to 0.95 to limit the value range used during quantization and balance accuracy with memory efficiency.
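To illustrate what `confidence_interval` controls, here is a rough standalone sketch of int8-style scalar quantization (our simplification, not Elasticsearch's implementation): the 256 integer buckets are fit to the central fraction of observed values, so extreme outliers are clipped rather than stretching the range and wasting buckets.

```python
# Illustrative sketch of scalar quantization with a confidence interval.
# Not Elasticsearch's implementation: it only shows the core idea that
# the quantization range covers the central fraction of observed values.

def quantize_int8(values: list[float], confidence_interval: float = 0.95) -> list[int]:
    tail = (1 - confidence_interval) / 2
    ordered = sorted(values)
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int((1 - tail) * (len(ordered) - 1))]
    scale = (hi - lo) / 255 or 1.0
    # Clip to [lo, hi], then map linearly onto the 256 buckets.
    return [round((min(max(v, lo), hi) - lo) / scale) for v in values]

# The outlier (100.0) is clipped to the top bucket instead of forcing the
# other values into a narrow band near zero.
q = quantize_int8([0.0, 0.1, 0.2, 0.3, 100.0])
```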
**docs/reference/elasticsearch/mapping-reference/semantic-text.md** (+39 −8)
@@ -9,35 +9,66 @@ applies_to:
# Semantic text field type [semantic-text]

:::::{warning}
The `semantic_text` field mapping can be added regardless of license state. However, it typically calls the [{{infer-cap}} API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference), which requires an [appropriate license](https://www.elastic.co/subscriptions). In these cases, using `semantic_text` in a cluster without the appropriate license causes operations such as indexing and reindexing to fail.
:::::

The `semantic_text` field type simplifies [semantic search](docs-content://solutions/search/semantic-search.md) by providing sensible defaults that automate most of the manual work typically required for vector search. Using `semantic_text`, you don't have to manually configure mappings, set up ingestion pipelines, or handle chunking. The field type automatically:

- Configures index mappings: Chooses the correct field type (`sparse_vector` or `dense_vector`), dimensions, similarity functions, and storage optimizations based on the {{infer}} endpoint.
- Generates embeddings during indexing: Automatically generates embeddings when you index documents, without requiring ingestion pipelines or {{infer}} processors.
- Handles chunking: Automatically chunks long text documents during indexing.

## Basic `semantic_text` mapping example

The following example creates an index mapping with a `semantic_text` field, using default values:

```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text"
      }
    }
  }
}
```

## Extended `semantic_text` mapping example

The following example creates an index mapping with a `semantic_text` field that uses dense vectors:
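The example body is collapsed in this diff view. Based on the numbered callouts that follow, a mapping along these lines would exercise all four options (the index name matches the basic example; the endpoint IDs are placeholders, and the exact parameter layout is our reconstruction, not the PR's verbatim example):

```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-dense-endpoint", <1>
        "search_inference_id": "my-search-endpoint", <2>
        "index_options": {
          "dense_vector": {
            "type": "bbq_disk" <3>
          }
        },
        "chunking_settings": {
          "strategy": "word", <4>
          "max_chunk_size": 120,
          "overlap": 40
        }
      }
    }
  }
}
```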
1. (Optional) Specifies the [{{infer}} endpoint](/reference/elasticsearch/mapping-reference/semantic-text-reference.md#configuring-inference-endpoints) used to generate embeddings at index time. If you don’t specify an `inference_id`, the `semantic_text` field uses a [default {{infer}} endpoint](/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md#default-and-preconfigured-endpoints).
2. (Optional) The {{infer}} endpoint used to generate embeddings at query time. If not specified, the endpoint defined by `inference_id` is used at both index and query time.
3. (Optional) Configures how the underlying vector representation is indexed. In this example, [`bbq_disk`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-disk) is selected for dense vectors. You can configure different index options depending on whether the field uses dense or sparse vectors. Learn how to [set `index_options` for `sparse_vectors`](/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md#index-options-sparse_vectors) and how to [set `index_options` for `dense_vectors`](/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md#index-options-dense_vectors).
4. (Optional) Overrides the [chunking settings](/reference/elasticsearch/mapping-reference/semantic-text-reference.md#chunking-behavior) from the {{infer}} endpoint. In this example, the `word` strategy splits text on individual words with a maximum of 120 words per chunk and an overlap of 40 words between chunks. The default chunking strategy is `sentence`.

:::{tip}
For a complete example, refer to the [Semantic search with `semantic_text`](docs-content://solutions/search/semantic-search/semantic-search-semantic-text.md) tutorial.
**docs/reference/elasticsearch/mapping-reference/sparse-vector.md** (+2 −0)
@@ -62,6 +62,8 @@ See [semantic search with ELSER](docs-content://solutions/search/semantic-search
The following parameters are accepted by `sparse_vector` fields:

$$$sparse-vector-index-options$$$

index_options {applies_to}`stack: ga 9.1`
:   (Optional, object) You can set index options for your `sparse_vector` field to determine if you should prune tokens, and the parameter configurations for the token pruning. If pruning options are not set in your [`sparse_vector` query](/reference/query-languages/query-dsl/query-dsl-sparse-vector-query.md), Elasticsearch will use the default options configured for the field, if any.