Commit e86a8a1

Authored by kosabogi, Mikep86, and szabosteve

Improves visibility of vector index options and inference configuration (elastic#141653)

* Improve visibility of vector index options and inference configuration
* Fixes link
* Removes incorrect note
* Update docs/reference/elasticsearch/mapping-reference/semantic-text.md (Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>)
* Addresses suggestions
* Syntax fix
* Fixes syntax, adds list
* Applies suggestions
* Update docs/reference/elasticsearch/mapping-reference/dense-vector.md (Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>)
* Update docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md (Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>)
* Update docs/reference/elasticsearch/mapping-reference/semantic-text.md (Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>)
* Update docs/reference/elasticsearch/mapping-reference/semantic-text.md (Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>)
* Fixes syntax and code example

Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
1 parent 1c17a0f commit e86a8a1

File tree

4 files changed: +181 additions, −28 deletions


docs/reference/elasticsearch/mapping-reference/dense-vector.md

Lines changed: 69 additions & 19 deletions
````diff
@@ -29,10 +29,13 @@ PUT my-index
     "properties": {
       "my_vector": {
         "type": "dense_vector",
-        "dims": 3
+        "dims": 3,
+        "index_options": {
+          "type": "bbq_disk" <1>
+        }
       },
-      "my_text" : {
-        "type" : "keyword"
+      "my_text": {
+        "type": "keyword"
       }
     }
   }
````
````diff
@@ -50,6 +53,8 @@ PUT my-index/_doc/2
   "my_vector" : [-0.5, 10, 10]
 }
 ```
+1. (Optional) Controls how vectors are indexed internally for kNN search. In this example, `bbq_disk` enables [disk-based binary quantization](/reference/elasticsearch/mapping-reference/bbq.md#bbq-disk), which can significantly reduce memory usage for large vector datasets. If you don't specify `index_options`, {{es}} automatically selects a default indexing strategy based on the vector type and dimensions. To learn more about the available index options and how they affect vector quantization, refer to [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+
 :::
 :::{tab-item} Base64-encoded string
 ```console
````
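The footnote above notes that {{es}} falls back to a default indexing strategy when `index_options` is omitted. As a rough sketch of the dimension-based rule documented later in this diff, here is a hypothetical helper (not Elasticsearch source code) for `float` vectors:

```python
# Hypothetical helper mirroring the documented default index type
# selection for float vectors; not Elasticsearch internals.
def default_index_type(dims: int, stack_version: tuple = (9, 1)) -> str:
    if stack_version < (9, 1):
        # In stack 9.0, float vectors always default to int8_hnsw.
        return "int8_hnsw"
    # From 9.1 on, high-dimensional vectors default to bbq_hnsw.
    return "bbq_hnsw" if dims >= 384 else "int8_hnsw"

print(default_index_type(3))    # int8_hnsw (the 3-dim example above)
print(default_index_type(768))  # bbq_hnsw
```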
````diff
@@ -338,34 +343,60 @@ This configuration is appropriate when full source fidelity is required, such as
 
 ## Automatically quantize vectors for kNN search [dense-vector-quantization]
 
-The `dense_vector` type supports quantization to reduce the memory footprint required when [searching](docs-content://solutions/search/vector/knn.md#approximate-knn) `float` vectors. The three following quantization strategies are supported:
+The `dense_vector` field type supports quantization to reduce the memory footprint required when [searching](docs-content://solutions/search/vector/knn.md#approximate-knn) `float` vectors. The supported vector quantization strategies for `dense_vector` kNN indexing are:
+- [`int8`](#dense-vector-quantization-int8)
+- [`int4`](#dense-vector-quantization-int4)
+- [`bbq`](#dense-vector-quantization-bbq), available as:
+  - [`bbq_hnsw`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-hnsw)
+  - [`bbq_flat`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-flat)
+  - [`bbq_disk`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-disk)
 
-* `int8` - Quantizes each dimension of the vector to 1-byte integers. This reduces the memory footprint by 75% (or 4x) at the cost of some accuracy.
-* `int4` - Quantizes each dimension of the vector to half-byte integers. This reduces the memory footprint by 87% (or 8x) at the cost of accuracy.
-* `bbq` - [Better binary quantization](/reference/elasticsearch/mapping-reference/bbq.md) which reduces each dimension to a single bit precision. This reduces the memory footprint by 96% (or 32x) at a larger cost of accuracy. Generally, oversampling during query time and reranking can help mitigate the accuracy loss.
+Here is an example of configuring disk-based binary quantization using `bbq_disk`:
 
-When using a quantized format, you may want to oversample and rescore the results to improve accuracy. See [oversampling and rescoring](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for more information.
-
-To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default index type is `bbq_hnsw` for vectors with greater than or equal to 384 dimensions, otherwise it's `int8_hnsw`.
-
-:::{note}
-In {{stack}} 9.0, dense vector fields are always indexed as `int8_hnsw`.
-:::
+```console
+PUT my-bbq-disk-index
+{
+  "mappings": {
+    "properties": {
+      "my_vector": {
+        "type": "dense_vector",
+        "dims": 384,
+        "index": true,
+        "index_options": {
+          "type": "bbq_disk"
+        }
+      }
+    }
+  }
+}
+```
 
 Quantized vectors can use [oversampling and rescoring](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) to improve accuracy on approximate kNN search results.
 
 ::::{note}
 Quantization will continue to keep the raw float vector values on disk for reranking, reindexing, and quantization improvements over the lifetime of the data. This means disk usage will increase by ~25% for `int8`, ~12.5% for `int4`, and ~3.1% for `bbq` due to the overhead of storing the quantized and raw vectors.
 ::::
 
-::::{note}
-`int4` quantization requires an even number of vector dimensions.
-::::
+### Default quantization types
+
+::::{applies-switch}
+
+:::{applies-item} stack: ga 9.0
+When indexing `float` vectors, the default index type is `int8_hnsw`.
+:::
+
+:::{applies-item} stack: ga 9.1+
+When indexing `float` vectors, the default index type is:
+- `bbq_hnsw` for vectors with greater than or equal to 384 dimensions
+- `int8_hnsw` for vectors with less than 384 dimensions
+:::
 
-::::{note}
-`bbq` quantization only supports vector dimensions that are greater than 64.
 ::::
 
+### int8 [dense-vector-quantization-int8]
+
+Quantizes each dimension of the vector to 1-byte integers. This reduces the memory footprint by 75% (or 4x) at the cost of some accuracy.
+
 Here is an example of how to create a byte-quantized index:
 
 ```console
````
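The savings quoted for each strategy follow directly from the per-dimension bit widths. A quick back-of-envelope check in Python (assuming raw `float32` vectors at 32 bits per dimension; real on-disk index sizes also include graph and metadata overhead, so treat these as rough lower bounds):

```python
# Bits needed per dimension for raw floats and each quantized format.
BITS_PER_DIM = {"float": 32, "int8": 8, "int4": 4, "bbq": 1}

def vector_bytes(dims: int, strategy: str) -> float:
    """Approximate in-memory size of one quantized vector, in bytes."""
    return dims * BITS_PER_DIM[strategy] / 8

dims = 384
raw = vector_bytes(dims, "float")  # 1536 bytes
for strategy in ("int8", "int4", "bbq"):
    q = vector_bytes(dims, strategy)
    print(f"{strategy}: {q:.0f} bytes/vector, "
          f"{100 * (1 - q / raw):.1f}% smaller ({raw / q:.0f}x)")
```

For 384 dimensions this reproduces the figures above: `int8` is 4x smaller (75%), `int4` 8x (87.5%), and `bbq` 32x (~96.9%).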
````diff
@@ -386,6 +417,10 @@ PUT my-byte-quantized-index
 }
 ```
 
+### int4 [dense-vector-quantization-int4]
+
+Quantizes each dimension of the vector to half-byte integers. This reduces the memory footprint by 87% (or 8x) at the cost of accuracy.
+
 Here is an example of how to create a half-byte-quantized index:
 
 ```console
````
````diff
@@ -406,6 +441,17 @@ PUT my-byte-quantized-index
 }
 ```
 
+::::{note}
+`int4` quantization requires an even number of vector dimensions.
+::::
+
+### bbq [dense-vector-quantization-bbq]
+
+`bbq` or [Better binary quantization](/reference/elasticsearch/mapping-reference/bbq.md) reduces each dimension to a single bit precision. This reduces the memory footprint by 96% (or 32x) at a larger cost of accuracy. Generally, [oversampling](/reference/elasticsearch/mapping-reference/bbq.md#bbq-oversampling) during query time and reranking can help mitigate the accuracy loss. You can choose one of the following BBQ index types:
+* [`bbq_hnsw`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-hnsw)
+* [`bbq_flat`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-flat)
+* [`bbq_disk`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-disk)
+
 Here is an example of how to create a binary quantized index:
 
 ```console
````
````diff
@@ -426,6 +472,10 @@ PUT my-byte-quantized-index
 }
 ```
 
+::::{note}
+`bbq` quantization only supports vector dimensions that are greater than 64.
+::::
+
 ## Parameters for dense vector fields [dense-vector-params]
 
 The following mapping parameters are accepted:
````
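To make the bit-level idea behind `bbq` concrete, here is a toy 1-bit quantizer in Python. This is emphatically *not* Elasticsearch's BBQ implementation (which adds corrective terms to preserve accuracy); it only illustrates how keeping a single sign bit per dimension yields the ~32x memory reduction:

```python
# Toy 1-bit quantization: keep only the sign of each dimension and
# pack 8 dimensions into each byte. Illustration only, not BBQ itself.
def binarize(vector: list) -> bytes:
    out = bytearray()
    for i in range(0, len(vector), 8):
        b = 0
        for j, v in enumerate(vector[i:i + 8]):
            if v > 0:
                b |= 1 << j  # set bit j if this dimension is positive
        out.append(b)
    return bytes(out)

vec = [0.4, -1.2, 3.0, -0.5] * 24  # 96 dims (> 64, as bbq requires)
packed = binarize(vec)
print(len(vec) * 4, "raw float32 bytes ->", len(packed), "packed bytes")
# 96 dims: 384 raw bytes -> 12 packed bytes, a 32x reduction
```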

docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md

Lines changed: 71 additions & 1 deletion
````diff
@@ -214,4 +214,74 @@ PUT my-index-000003
   }
 }
 ```
-% TEST[skip:Requires {{infer}} endpoint]
+
+## Set `index_options` for `sparse_vectors` [index-options-sparse_vectors]
+
+```{applies_to}
+stack: ga 9.2
+```
+
+Configuring `index_options` for [sparse vector fields](/reference/elasticsearch/mapping-reference/sparse-vector.md) lets you control [token pruning](/reference/elasticsearch/mapping-reference/sparse-vector.md#token-pruning), which determines whether non-significant or overly frequent tokens are omitted to improve query performance.
+
+The following example enables token pruning and sets pruning thresholds for a `sparse_vector` field:
+
+```console
+PUT semantic-embeddings
+{
+  "mappings": {
+    "properties": {
+      "content": {
+        "type": "semantic_text",
+        "index_options": {
+          "sparse_vector": {
+            "prune": true, <1>
+            "pruning_config": {
+              "tokens_freq_ratio_threshold": 10, <2>
+              "tokens_weight_threshold": 0.5 <3>
+            }
+          }
+        }
+      }
+    }
+  }
+}
+```
+1. (Optional) Enables pruning. Default is `true`.
+2. (Optional) Prunes tokens whose frequency is more than 10 times the average token frequency in the field. Default is `5`.
+3. (Optional) Prunes tokens whose weight is lower than 0.5. Default is `0.4`.
+
+Learn more about [sparse_vector index options](/reference/elasticsearch/mapping-reference/sparse-vector.md#sparse-vector-index-options) settings and [token pruning](/reference/elasticsearch/mapping-reference/sparse-vector.md#token-pruning).
+
+## Set `index_options` for `dense_vectors` [index-options-dense_vectors]
+
+Configuring `index_options` for [dense vector fields](/reference/elasticsearch/mapping-reference/dense-vector.md) lets you control how dense vectors are indexed for kNN search. You can select the indexing algorithm, such as `int8_hnsw`, `int4_hnsw`, or `bbq_disk`, among [other available index options](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).
+
+The following example shows how to configure `index_options` for a dense vector field using the `int8_hnsw` indexing algorithm:
+
+```console
+PUT semantic-embeddings
+{
+  "mappings": {
+    "properties": {
+      "content": {
+        "type": "semantic_text",
+        "index_options": {
+          "dense_vector": {
+            "type": "int8_hnsw", <1>
+            "m": 15, <2>
+            "ef_construction": 90, <3>
+            "confidence_interval": 0.95 <4>
+          }
+        }
+      }
+    }
+  }
+}
+```
+1. (Optional) Selects the `int8_hnsw` vector quantization strategy. Learn about [default quantization types](/reference/elasticsearch/mapping-reference/dense-vector.md#default-quantization-types).
+2. (Optional) Sets `m` to 15 to control how many neighbors each node connects to in the HNSW graph. Default is `16`.
+3. (Optional) Sets `ef_construction` to 90 to control how many candidate neighbors are considered during graph construction. Default is `100`.
+4. (Optional) Sets `confidence_interval` to 0.95 to limit the value range used during quantization and balance accuracy with memory efficiency.
````
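To illustrate the pruning thresholds described above, here is a small Python sketch. This is a hypothetical helper, not Elasticsearch internals; it assumes a token must be both a frequency outlier and below the weight threshold to be pruned — check the `sparse_vector` token pruning reference for the authoritative combination of the two conditions:

```python
# Hypothetical token-pruning sketch; not Elasticsearch code.
def prune_tokens(weights: dict, freqs: dict,
                 freq_ratio_threshold: float = 5.0,
                 weight_threshold: float = 0.4) -> dict:
    avg_freq = sum(freqs.values()) / len(freqs)
    kept = {}
    for token, weight in weights.items():
        outlier = freqs[token] > freq_ratio_threshold * avg_freq
        insignificant = weight < weight_threshold
        # Assumed rule: prune only tokens that trip BOTH thresholds.
        if not (outlier and insignificant):
            kept[token] = weight
    return kept

# "the" is very frequent and low-weight; the other tokens are kept.
weights = {"the": 0.1, **{f"tok{i}": 0.8 for i in range(9)}}
freqs = {"the": 100, **{f"tok{i}": 5 for i in range(9)}}
print(prune_tokens(weights, freqs))  # "the" is pruned
```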

docs/reference/elasticsearch/mapping-reference/semantic-text.md

Lines changed: 39 additions & 8 deletions
````diff
@@ -9,35 +9,66 @@ applies_to:
 
 # Semantic text field type [semantic-text]
 
+:::::{warning}
+The `semantic_text` field mapping can be added regardless of license state. However, it typically calls the [{{infer-cap}} API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference), which requires an [appropriate license](https://www.elastic.co/subscriptions). In these cases, using `semantic_text` in a cluster without the appropriate license causes operations such as indexing and reindexing to fail.
+:::::
+
 The `semantic_text` field type simplifies [semantic search](docs-content://solutions/search/semantic-search.md) by providing sensible defaults that automate most of the manual work typically required for vector search. Using `semantic_text`, you don't have to manually configure mappings, set up ingestion pipelines, or handle chunking. The field type automatically:
 
 - Configures index mappings: Chooses the correct field type (`sparse_vector` or `dense_vector`), dimensions, similarity functions, and storage optimizations based on the {{infer}} endpoint.
 - Generates embeddings during indexing: Automatically generates embeddings when you index documents, without requiring ingestion pipelines or {{infer}} processors.
 - Handles chunking: Automatically chunks long text documents during indexing.
 
-:::::{warning}
-The `semantic_text` field mapping can be added regardless of license state. However, it typically calls the [{{infer-cap}} API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference), which requires an [appropriate license](https://www.elastic.co/subscriptions). In these cases, using `semantic_text` in a cluster without the appropriate license causes operations such as indexing and reindexing to fail.
-:::::
-
 ## Basic `semantic_text` mapping example
 
-The following example creates an index mapping with a `semantic_text` field:
+The following example creates an index mapping with a `semantic_text` field, using default values:
 
 ```console
 PUT semantic-embeddings
 {
   "mappings": {
     "properties": {
       "content": {
-        "type": "semantic_text" <1>
+        "type": "semantic_text"
+      }
+    }
+  }
+}
+```
+
+## Extended `semantic_text` mapping example
+
+The following example creates an index mapping with a `semantic_text` field that uses dense vectors:
+
+```console
+PUT semantic-embeddings
+{
+  "mappings": {
+    "properties": {
+      "content": {
+        "type": "semantic_text",
+        "inference_id": "my-inference-endpoint", <1>
+        "search_inference_id": "my-search-inference-endpoint", <2>
+        "index_options": { <3>
+          "dense_vector": {
+            "type": "bbq_disk"
+          }
+        },
+        "chunking_settings": { <4>
+          "strategy": "word",
+          "max_chunk_size": 120,
+          "overlap": 40
+        }
       }
     }
   }
 }
 ```
-% TEST[skip:Requires {{infer}} endpoint]
 
-1. In this example, the `semantic_text` field uses a [default {{infer}} endpoint](./semantic-text-setup-configuration.md#default-and-preconfigured-endpoints) because the `inference_id` parameter isn't specified.
+1. (Optional) Specifies the [{{infer}} endpoint](/reference/elasticsearch/mapping-reference/semantic-text-reference.md#configuring-inference-endpoints) used to generate embeddings at index time. If you don't specify an `inference_id`, the `semantic_text` field uses a [default {{infer}} endpoint](/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md#default-and-preconfigured-endpoints).
+2. (Optional) The {{infer}} endpoint used to generate embeddings at query time. If not specified, the endpoint defined by `inference_id` is used at both index and query time.
+3. (Optional) Configures how the underlying vector representation is indexed. In this example, [`bbq_disk`](/reference/elasticsearch/mapping-reference/bbq.md#bbq-disk) is selected for dense vectors. You can configure different index options depending on whether the field uses dense or sparse vectors. Learn how to [set `index_options` for `sparse_vectors`](/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md#index-options-sparse_vectors) and how to [set `index_options` for `dense_vectors`](/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration.md#index-options-dense_vectors).
+4. (Optional) Overrides the [chunking settings](/reference/elasticsearch/mapping-reference/semantic-text-reference.md#chunking-behavior) from the {{infer}} endpoint. In this example, the `word` strategy splits text on individual words with a maximum of 120 words per chunk and an overlap of 40 words between chunks. The default chunking strategy is `sentence`.
 
 :::{tip}
 For a complete example, refer to the [Semantic search with `semantic_text`](docs-content://solutions/search/semantic-search/semantic-search-semantic-text.md) tutorial.
````
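The `word` chunking strategy in the extended example above can be pictured as a sliding window over the token stream. A minimal sketch, illustrative only — Elasticsearch's chunker also handles sentence boundaries and edge cases differently:

```python
# Sliding-window word chunking: windows of at most max_chunk_size words,
# with `overlap` words shared between consecutive chunks.
def word_chunks(text: str, max_chunk_size: int = 120, overlap: int = 40):
    words = text.split()
    step = max_chunk_size - overlap  # each new chunk starts `step` words later
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_chunk_size]))
        if start + max_chunk_size >= len(words):
            break  # the final window already reached the end of the text
    return chunks

doc = " ".join(f"w{i}" for i in range(200))
chunks = word_chunks(doc, max_chunk_size=120, overlap=40)
print(len(chunks))           # 2 chunks for 200 words
print(chunks[1].split()[0])  # second chunk starts at word 80: w80
```

With 200 words, the first chunk covers words 0-119 and the second covers words 80-199, so consecutive chunks share exactly 40 words.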

docs/reference/elasticsearch/mapping-reference/sparse-vector.md

Lines changed: 2 additions & 0 deletions
````diff
@@ -62,6 +62,8 @@ See [semantic search with ELSER](docs-content://solutions/search/semantic-search
 
 The following parameters are accepted by `sparse_vector` fields:
 
+$$$sparse-vector-index-options$$$
+
 index_options {applies_to}`stack: ga 9.1`
 : (Optional, object) You can set index options for your `sparse_vector` field to determine if you should prune tokens, and the parameter configurations for the token pruning. If pruning options are not set in your [`sparse_vector` query](/reference/query-languages/query-dsl/query-dsl-sparse-vector-query.md), Elasticsearch will use the default options configured for the field, if any.
````
