If specified, these will override the chunking settings set in the {{infer-cap}}
endpoint associated with `inference_id`.
If chunking settings are updated, they will not be applied to existing documents
until they are reindexed.
To completely disable chunking, use the `none` chunking strategy.

**Valid values for `chunking_settings`**:

`type`
: Indicates the type of chunking strategy to use. Valid values are `none`, `word` or
`sentence`. Required.

`max_chunk_size`
: The maximum number of words in a chunk. Required for `word` and `sentence` strategies.

`overlap`
: The number of overlapping words allowed in chunks. This cannot be defined as
more than half of `max_chunk_size`. Required for `word` type chunking settings.

`sentence_overlap`
: The number of overlapping sentences allowed in chunks. Valid values are `0`
or `1`. Required for `sentence` type chunking settings.
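
For example, a mapping that overrides the endpoint's chunking with sentence-based chunks could look like the following minimal sketch; the index name, field name, and setting values are illustrative, and the `strategy` key mirrors the pre-chunking example shown later on this page:

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "my_semantic_field": {
        "type": "semantic_text",
        "chunking_settings": {
          "strategy": "sentence",
          "max_chunk_size": 250,
          "sentence_overlap": 1
        }
      }
    }
  }
}
```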
::::{warning}
If the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an
error. In contrast, the `elastic` and `elasticsearch` services will automatically truncate the input to fit within the
model's limit.
::::

## {{infer-cap}} endpoint validation [infer-endpoint-validation]

The `inference_id` will not be validated when the mapping is created, but when
documents are ingested into the index.

For more details on chunking and how to configure chunking settings,
see [Configuring chunking](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference)
in the Inference API documentation.
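
Chunking can also be configured on the {{infer-cap}} endpoint itself when it is created. The following is a minimal sketch, assuming an `elasticsearch` service endpoint named `my-elser-endpoint`; the endpoint name, model, and chunking values are illustrative placeholders:

```console
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": ".elser_model_2"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
```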
You can pre-chunk the input by sending it to Elasticsearch as an array of strings.
Example:

```console
PUT test-index
{
  "mappings": {
    "properties": {
      "my_semantic_field": {
        "type": "semantic_text",
        "chunking_settings": {
          "strategy": "none" <1>
        }
      }
    }
  }
}
```

1. Disable chunking on `my_semantic_field`.

```console
PUT test-index/_doc/1
{
  "my_semantic_field": ["my first chunk", "my second chunk", ...] <1>
  ...
}
```

1. The text is pre-chunked and provided as an array of strings.
Each element in the array represents a single chunk that will be sent directly to the inference service without further chunking.

**Important considerations**:

* When providing pre-chunked input, ensure that you set the chunking strategy to `none` to avoid additional processing.
* Each chunk should be sized carefully, staying within the token limit of the inference service and the underlying model.
* If a chunk exceeds the model's token limit, the behavior depends on the service:
  * Some services (such as OpenAI) will return an error.
  * Others (such as `elastic` and `elasticsearch`) will automatically truncate the input.

Refer
to [this tutorial](docs-content://solutions/search/semantic-search/semantic-search-semantic-text.md)
to learn more about semantic search using `semantic_text`.

## Extracting Relevant Fragments from Semantic Text [semantic-text-highlighting]