@@ -28,7 +28,7 @@ service.

Using `semantic_text`, you won’t need to specify how to generate embeddings for
your data, or how to index it. The {{infer}} endpoint automatically determines
- the embedding generation, indexing, and query to use.
+ the embedding generation, indexing, and query to use.
Newly created indices with `semantic_text` fields using dense embeddings will be
[quantized](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization)
to `bbq_hnsw` automatically.
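
For example, once such a field is mapped, it can be searched with a `semantic` query without referencing any model, embedding, or {{infer}} endpoint details at query time. The following is a minimal sketch, assuming an index named `my-index-000004` with the `inference_field` mapping shown later in this section and an illustrative query string:

```console
GET my-index-000004/_search
{
  "query": {
    "semantic": {
      "field": "inference_field",
      "query": "Best surfing places"
    }
  }
}
```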
@@ -111,6 +111,33 @@ the [Create {{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/ope
to create the endpoint. If not specified, the {{infer}} endpoint defined by
`inference_id` will be used at both index and query time.

+ `index_options`
+ : (Optional, object) Specifies the index options to override default values
+ for the field. Currently, `dense_vector` index options are supported.
+ For text embeddings, `index_options` may match any allowed
+ [dense_vector index options](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).
+
+ An example of how to set `index_options` for a `semantic_text` field:
+
+ ```console
+ PUT my-index-000004
+ {
+   "mappings": {
+     "properties": {
+       "inference_field": {
+         "type": "semantic_text",
+         "inference_id": "my-text-embedding-endpoint",
+         "index_options": {
+           "dense_vector": {
+             "type": "int4_flat"
+           }
+         }
+       }
+     }
+   }
+ }
+ ```
+
`chunking_settings`
: (Optional, object) Settings for chunking text into smaller passages.
If specified, these will override the chunking settings set in the {{infer-cap}}
@@ -138,8 +165,10 @@ To completely disable chunking, use the `none` chunking strategy.
or `1`. Required for `sentence` type chunking settings

::::{warning}
- If the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an
- error. In contrast, the `elastic` and `elasticsearch` services will automatically truncate the input to fit within the
+ If the input exceeds the maximum token limit of the underlying model, some
+ services (such as OpenAI) may return an
+ error. In contrast, the `elastic` and `elasticsearch` services will
+ automatically truncate the input to fit within the
model's limit.
::::

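To make the sentence-based settings above concrete, here is a minimal sketch of `chunking_settings` on a `semantic_text` field, assuming the sentence strategy with its `max_chunk_size` and `sentence_overlap` options. The index name, the `my-text-embedding-endpoint` endpoint, and the specific values are illustrative placeholders, not recommended defaults:

```console
PUT my-index-000005
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint",
        "chunking_settings": {
          "strategy": "sentence",
          "max_chunk_size": 250,
          "sentence_overlap": 1
        }
      }
    }
  }
}
```

With these settings, chunks are built from whole sentences of at most 250 words, with one sentence of overlap between consecutive chunks.
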
@@ -173,7 +202,8 @@ For more details on chunking and how to configure chunking settings,
see [Configuring chunking](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference)
in the Inference API documentation.

- You can pre-chunk the input by sending it to Elasticsearch as an array of strings.
+ You can pre-chunk the input by sending it to Elasticsearch as an array of
+ strings.
Example:

```console
@@ -203,15 +233,20 @@ PUT test-index/_doc/1
```

1. The text is pre-chunked and provided as an array of strings.
- Each element in the array represents a single chunk that will be sent directly to the inference service without further chunking.
+ Each element in the array represents a single chunk that will be sent
+ directly to the inference service without further chunking.

**Important considerations**:

- * When providing pre-chunked input, ensure that you set the chunking strategy to `none` to avoid additional processing.
- * Each chunk should be sized carefully, staying within the token limit of the inference service and the underlying model.
- * If a chunk exceeds the model's token limit, the behavior depends on the service:
-   * Some services (such as OpenAI) will return an error.
-   * Others (such as `elastic` and `elasticsearch`) will automatically truncate the input.
+ * When providing pre-chunked input, ensure that you set the chunking strategy to
+ `none` to avoid additional processing (see the example after this list).
+ * Each chunk should be sized carefully, staying within the token limit of the
+ inference service and the underlying model.
+ * If a chunk exceeds the model's token limit, the behavior depends on the
+ service:
+   * Some services (such as OpenAI) will return an error.
+   * Others (such as `elastic` and `elasticsearch`) will automatically truncate
+   the input.
215250
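The following sketch illustrates the first consideration: a mapping that disables additional chunking by setting the `none` strategy. The index name, field name, and `my-text-embedding-endpoint` endpoint are placeholders:

```console
PUT my-prechunked-index
{
  "mappings": {
    "properties": {
      "my_semantic_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint",
        "chunking_settings": {
          "strategy": "none"
        }
      }
    }
  }
}
```

With this mapping, each string in a pre-chunked array is embedded exactly as provided, without further splitting.
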
Refer
to [this tutorial](docs-content://solutions/search/semantic-search/semantic-search-semantic-text.md)