Commit 6826371

[DOCS] Explains that chunks stored as offsets in semantic_text (#132809)
* Explains that chunks stored as offsets.
* Small changes.
* Refines applies_to placement.
1 parent 0b8c416 commit 6826371

File tree

1 file changed (+6, -3 lines changed)

docs/reference/elasticsearch/mapping-reference/semantic-text.md

Lines changed: 6 additions & 3 deletions
````diff
@@ -107,7 +107,6 @@ PUT my-index-000003
 ```
 
 ### Using ELSER on EIS
-
 ```{applies_to}
 stack: preview 9.1
 serverless: preview
````
````diff
@@ -223,6 +222,10 @@ generated from it. When querying, the individual passages will be automatically
 searched for each document, and the most relevant passage will be used to
 compute a score.
 
+Chunks are stored as start and end character offsets rather than as separate
+text strings. These offsets point to the exact location of each chunk within the
+original input text.
+
 For more details on chunking and how to configure chunking settings,
 see [Configuring chunking](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference)
 in the Inference API documentation.
````
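The added lines above describe offset-based chunk storage. As a minimal plain-Python sketch of that idea (a toy fixed-size chunker, not Elasticsearch internals): chunks are kept only as `(start, end)` character offsets into the original input, and each chunk's text is recovered by slicing rather than by storing duplicate strings.

```python
def chunk_offsets(text, chunk_size):
    """Return (start, end) character offsets for fixed-size chunks.

    Toy strategy: real chunkers split on sentences or words, but the
    storage idea is the same -- keep offsets, not copied strings.
    """
    return [(start, min(start + chunk_size, len(text)))
            for start in range(0, len(text), chunk_size)]

text = "Chunks are stored as character offsets into the original input text."
offsets = chunk_offsets(text, chunk_size=30)

# Each chunk is recovered by slicing the original input, so no text is duplicated.
chunks = [text[start:end] for start, end in offsets]
assert "".join(chunks) == text  # contiguous offsets cover the input exactly
```

Because the offsets are contiguous and end at `len(text)`, rejoining the slices always reproduces the original input.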
````diff
@@ -238,7 +241,8 @@ stack: ga 9.1
 
 You can pre-chunk the input by sending it to Elasticsearch as an array of
 strings.
-Example:
+
+For example:
 
 ```console
 PUT test-index
````
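The pre-chunking note in this hunk can be sketched as the shape of the request body: the field value is sent as an array of strings, one element per chunk. This is plain Python building JSON, not an Elasticsearch client call, and the field name `my_semantic_field` is hypothetical.

```python
import json

# Pre-chunked input: each list element is supplied as its own chunk,
# instead of letting the configured chunking strategy split the text.
doc = {
    "my_semantic_field": [
        "First pre-chunked passage.",
        "Second pre-chunked passage.",
    ]
}

body = json.dumps(doc, indent=2)
print(body)
```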
````diff
@@ -540,7 +544,6 @@ POST test-index/_search
 This will return verbose chunked embeddings content that is used to perform
 semantic search for `semantic_text` fields.
 
-
 ## Limitations [limitations]
 
 `semantic_text` field types have the following limitations:
````
