
Commit dbf4c5d

szabosteve and davidkyle committed
[DOCS] Documents configurable chunking (#115300)
Co-authored-by: David Kyle <[email protected]>
1 parent c9a9738 commit dbf4c5d

15 files changed

+354
-3
lines changed

docs/reference/inference/inference-apis.asciidoc

Lines changed: 61 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,6 @@ Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
3535
Now use <<semantic-search-semantic-text, semantic text>> to perform
3636
<<semantic-search, semantic search>> on your data.
3737

38-
3938
[discrete]
4039
[[default-enpoints]]
4140
=== Default {infer} endpoints
@@ -53,6 +52,67 @@ For these models, the minimum number of allocations is `0`.
5352
If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
5453

5554

55+
[discrete]
56+
[[infer-chunking-config]]
57+
=== Configuring chunking
58+
59+
{infer-cap} endpoints have a limit on the amount of text they can process at once, determined by the model's input capacity.
60+
Chunking is the process of splitting the input text into pieces that remain within these limits.
61+
It occurs when ingesting documents into <<semantic-text,`semantic_text` fields>>.
62+
Chunking also helps produce sections that are digestible for humans.
63+
Returning a long document in search results is less useful than providing the most relevant chunk of text.
64+
65+
Each chunk will include the text subpassage and the corresponding embedding generated from it.
66+
67+
By default, documents are split into sentences and grouped into sections of up to 250 words, with an overlap of one sentence so that each chunk shares a sentence with the previous chunk.
68+
Overlapping ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.
69+
70+
{es} uses the https://unicode-org.github.io/icu-docs/[ICU4J] library to detect word and sentence boundaries for chunking.
71+
https://unicode-org.github.io/icu/userguide/boundaryanalysis/#word-boundary[Word boundaries] are identified by following a series of rules, not just the presence of a whitespace character.
72+
For written languages that do not use whitespace, such as Chinese or Japanese, dictionary lookups are used to detect word boundaries.
73+
74+
75+
[discrete]
76+
==== Chunking strategies
77+
78+
Two strategies are available for chunking: `sentence` and `word`.
79+
80+
The `sentence` strategy splits the input text at sentence boundaries.
81+
Each chunk contains one or more complete sentences, which preserves sentence-level context, except when a single sentence would cause a chunk to exceed the `max_chunk_size` word count, in which case the sentence is split across chunks.
82+
The `sentence_overlap` option defines the number of sentences from the previous chunk to include in the current chunk. It can be either `0` or `1`.
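
The grouping described above can be sketched in Python. This is a minimal illustration only, not the {es} implementation: the function names are hypothetical, sentences are detected with a naive regular expression instead of ICU boundary rules, and a sentence longer than `max_chunk_size` is kept whole rather than split.

```python
import re


def split_sentences(text):
    # Naive stand-in for ICU sentence detection (assumption):
    # break after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_chunks(text, max_chunk_size=250, sentence_overlap=1):
    """Group sentences into chunks of at most max_chunk_size words.

    Simplifications: an over-long sentence is not split, and the
    carried-over sentence is not re-checked against the limit.
    """
    chunks = []
    current = []  # sentences in the chunk being built
    count = 0     # number of words in `current`
    for sentence in split_sentences(text):
        n = len(sentence.split())
        if current and count + n > max_chunk_size:
            chunks.append(" ".join(current))
            # Carry the last sentence over when overlap is 1.
            current = current[-sentence_overlap:] if sentence_overlap else []
            count = sum(len(s.split()) for s in current)
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

With `sentence_overlap=1`, the last sentence of each chunk is repeated as the first sentence of the next one. With `sentence_overlap=0`, chunks share no text.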
83+
84+
The `word` strategy splits the input text on individual words up to the `max_chunk_size` limit.
85+
The `overlap` option is the number of words from the previous chunk to include in the current chunk.
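
As a rough illustration of the `word` strategy, the following Python sketch slides a window of `max_chunk_size` words over the text and steps forward by `max_chunk_size - overlap` words. It is an assumption-laden sketch, not the {es} implementation: the function name is hypothetical, words are split on whitespace instead of ICU word boundaries, and the documented minimum and maximum sizes are not enforced.

```python
def word_chunks(text, max_chunk_size=250, overlap=100):
    """Split text into word windows that share `overlap` words."""
    if overlap * 2 > max_chunk_size:
        # The overlap may not exceed half of the chunk size.
        raise ValueError("overlap must not exceed half of max_chunk_size")
    words = text.split()  # whitespace split, a stand-in for ICU rules
    step = max_chunk_size - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_chunk_size]))
        if start + max_chunk_size >= len(words):
            break  # the window already reached the end of the text
        start += step
    return chunks
```

Each chunk after the first begins with the last `overlap` words of the previous chunk.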
86+
87+
The default chunking strategy is `sentence`.
88+
89+
NOTE: The default chunking strategy for {infer} endpoints created before 8.16 is `word`.
90+
91+
92+
[discrete]
93+
==== Example of configuring the chunking behavior
94+
95+
The following example creates an {infer} endpoint with the `elasticsearch` service that deploys the ELSER model by default and configures the chunking behavior.
96+
97+
[source,console]
98+
------------------------------------------------------------
99+
PUT _inference/sparse_embedding/small_chunk_size
100+
{
101+
"service": "elasticsearch",
102+
"service_settings": {
103+
"num_allocations": 1,
104+
"num_threads": 1
105+
},
106+
"chunking_settings": {
107+
"strategy": "sentence",
108+
"max_chunk_size": 100,
109+
"sentence_overlap": 0
110+
}
111+
}
112+
------------------------------------------------------------
113+
// TEST[skip:TBD]
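
For comparison, a request body that selects the `word` strategy could be assembled as follows. This is a sketch under stated assumptions: the helper name, the endpoint name `word_chunks`, and the values `120` and `40` are illustrative and do not come from the example above. Only the field names are taken from this page.

```python
import json


def build_word_chunking_request(max_chunk_size=120, overlap=40):
    # Field names follow the request shown above; the values are
    # illustrative and keep overlap below half of max_chunk_size.
    return {
        "service": "elasticsearch",
        "service_settings": {"num_allocations": 1, "num_threads": 1},
        "chunking_settings": {
            "strategy": "word",
            "max_chunk_size": max_chunk_size,
            "overlap": overlap,
        },
    }


# Print the request in console form; `word_chunks` is a hypothetical name.
print("PUT _inference/sparse_embedding/word_chunks")
print(json.dumps(build_word_chunking_request(), indent=2))
```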
114+
115+
56116
include::delete-inference.asciidoc[]
57117
include::get-inference.asciidoc[]
58118
include::post-inference.asciidoc[]

docs/reference/inference/inference-shared.asciidoc

Lines changed: 33 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,4 +31,36 @@ end::task-settings[]
3131

3232
tag::task-type[]
3333
The type of the {infer} task that the model will perform.
34-
end::task-type[]
34+
end::task-type[]
35+
36+
tag::chunking-settings[]
37+
Chunking configuration object.
38+
Refer to <<infer-chunking-config>> to learn more about chunking.
39+
end::chunking-settings[]
40+
41+
tag::chunking-settings-max-chunking-size[]
42+
Specifies the maximum size of a chunk in words.
43+
Defaults to `250`.
44+
This value cannot be higher than `300` or lower than `20` (for `sentence` strategy) or `10` (for `word` strategy).
45+
end::chunking-settings-max-chunking-size[]
46+
47+
tag::chunking-settings-overlap[]
48+
Only for `word` chunking strategy.
49+
Specifies the number of overlapping words for chunks.
50+
Defaults to `100`.
51+
This value cannot be higher than half of `max_chunk_size`.
52+
end::chunking-settings-overlap[]
53+
54+
tag::chunking-settings-sentence-overlap[]
55+
Only for `sentence` chunking strategy.
56+
Specifies the number of overlapping sentences for chunks.
57+
It can be either `1` or `0`.
58+
Defaults to `1`.
59+
end::chunking-settings-sentence-overlap[]
60+
61+
tag::chunking-settings-strategy[]
62+
Specifies the chunking strategy.
63+
It can be either `sentence` or `word`.
64+
end::chunking-settings-strategy[]
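
The limits described in the tags above can be summarized as a small client-side check. The Python sketch below is illustrative only: the function name is hypothetical, it uses the `max_chunk_size` key as it appears in the request example, and treating an option supplied for the wrong strategy as an error is an assumption, not documented server behavior.

```python
def validate_chunking_settings(settings):
    """Return a list of problems found in a chunking_settings dict."""
    strategy = settings.get("strategy", "sentence")  # documented default
    if strategy not in ("sentence", "word"):
        return [f"unknown strategy: {strategy!r}"]

    errors = []
    max_size = settings.get("max_chunk_size", 250)
    minimum = 20 if strategy == "sentence" else 10
    if not minimum <= max_size <= 300:
        errors.append(f"max_chunk_size must be between {minimum} and 300")

    if strategy == "word":
        overlap = settings.get("overlap", 100)
        if overlap * 2 > max_size:
            errors.append("overlap cannot exceed half of max_chunk_size")
        if "sentence_overlap" in settings:
            # Assumption: flag options meant for the other strategy.
            errors.append("sentence_overlap applies only to the sentence strategy")
    else:
        if settings.get("sentence_overlap", 1) not in (0, 1):
            errors.append("sentence_overlap must be 0 or 1")
        if "overlap" in settings:
            # Assumption: flag options meant for the other strategy.
            errors.append("overlap applies only to the word strategy")
    return errors
```

An empty list means the settings are consistent with the limits listed here. Note that for the `word` strategy the default `overlap` of `100` is checked against whatever `max_chunk_size` is supplied.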
65+
66+

docs/reference/inference/service-alibabacloud-ai-search.asciidoc

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,26 @@ Available task types:
3434
[[infer-service-alibabacloud-ai-search-api-request-body]]
3535
==== {api-request-body-title}
3636

37+
`chunking_settings`::
38+
(Optional, object)
39+
include::inference-shared.asciidoc[tag=chunking-settings]
40+
41+
`max_chunk_size`:::
42+
(Optional, integer)
43+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
44+
45+
`overlap`:::
46+
(Optional, integer)
47+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
48+
49+
`sentence_overlap`:::
50+
(Optional, integer)
51+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
52+
53+
`strategy`:::
54+
(Optional, string)
55+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
56+
3757
`service`::
3858
(Required, string) The type of service supported for the specified task type.
3959
In this case,
@@ -108,7 +128,6 @@ To modify this, set the `requests_per_minute` setting of this object in your ser
108128
include::inference-shared.asciidoc[tag=request-per-minute-example]
109129
--
110130

111-
112131
`task_settings`::
113132
(Optional, object)
114133
include::inference-shared.asciidoc[tag=task-settings]

docs/reference/inference/service-amazon-bedrock.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,26 @@ Available task types:
3232
[[infer-service-amazon-bedrock-api-request-body]]
3333
==== {api-request-body-title}
3434

35+
`chunking_settings`::
36+
(Optional, object)
37+
include::inference-shared.asciidoc[tag=chunking-settings]
38+
39+
`max_chunk_size`:::
40+
(Optional, integer)
41+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
42+
43+
`overlap`:::
44+
(Optional, integer)
45+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
46+
47+
`sentence_overlap`:::
48+
(Optional, integer)
49+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
50+
51+
`strategy`:::
52+
(Optional, string)
53+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
54+
3555
`service`::
3656
(Required, string) The type of service supported for the specified task type.
3757
In this case,

docs/reference/inference/service-anthropic.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,26 @@ Available task types:
3232
[[infer-service-anthropic-api-request-body]]
3333
==== {api-request-body-title}
3434

35+
`chunking_settings`::
36+
(Optional, object)
37+
include::inference-shared.asciidoc[tag=chunking-settings]
38+
39+
`max_chunk_size`:::
40+
(Optional, integer)
41+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
42+
43+
`overlap`:::
44+
(Optional, integer)
45+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
46+
47+
`sentence_overlap`:::
48+
(Optional, integer)
49+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
50+
51+
`strategy`:::
52+
(Optional, string)
53+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
54+
3555
`service`::
3656
(Required, string)
3757
The type of service supported for the specified task type. In this case,

docs/reference/inference/service-azure-ai-studio.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,26 @@ Available task types:
3333
[[infer-service-azure-ai-studio-api-request-body]]
3434
==== {api-request-body-title}
3535

36+
`chunking_settings`::
37+
(Optional, object)
38+
include::inference-shared.asciidoc[tag=chunking-settings]
39+
40+
`max_chunk_size`:::
41+
(Optional, integer)
42+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
43+
44+
`overlap`:::
45+
(Optional, integer)
46+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
47+
48+
`sentence_overlap`:::
49+
(Optional, integer)
50+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
51+
52+
`strategy`:::
53+
(Optional, string)
54+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
55+
3656
`service`::
3757
(Required, string)
3858
The type of service supported for the specified task type. In this case,

docs/reference/inference/service-azure-openai.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,26 @@ Available task types:
3333
[[infer-service-azure-openai-api-request-body]]
3434
==== {api-request-body-title}
3535

36+
`chunking_settings`::
37+
(Optional, object)
38+
include::inference-shared.asciidoc[tag=chunking-settings]
39+
40+
`max_chunk_size`:::
41+
(Optional, integer)
42+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
43+
44+
`overlap`:::
45+
(Optional, integer)
46+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
47+
48+
`sentence_overlap`:::
49+
(Optional, integer)
50+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
51+
52+
`strategy`:::
53+
(Optional, string)
54+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
55+
3656
`service`::
3757
(Required, string)
3858
The type of service supported for the specified task type. In this case,

docs/reference/inference/service-cohere.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,26 @@ Available task types:
3434
[[infer-service-cohere-api-request-body]]
3535
==== {api-request-body-title}
3636

37+
`chunking_settings`::
38+
(Optional, object)
39+
include::inference-shared.asciidoc[tag=chunking-settings]
40+
41+
`max_chunk_size`:::
42+
(Optional, integer)
43+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
44+
45+
`overlap`:::
46+
(Optional, integer)
47+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
48+
49+
`sentence_overlap`:::
50+
(Optional, integer)
51+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
52+
53+
`strategy`:::
54+
(Optional, string)
55+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
56+
3757
`service`::
3858
(Required, string)
3959
The type of service supported for the specified task type. In this case,

docs/reference/inference/service-elasticsearch.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,26 @@ Available task types:
3636
[[infer-service-elasticsearch-api-request-body]]
3737
==== {api-request-body-title}
3838

39+
`chunking_settings`::
40+
(Optional, object)
41+
include::inference-shared.asciidoc[tag=chunking-settings]
42+
43+
`max_chunk_size`:::
44+
(Optional, integer)
45+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
46+
47+
`overlap`:::
48+
(Optional, integer)
49+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
50+
51+
`sentence_overlap`:::
52+
(Optional, integer)
53+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
54+
55+
`strategy`:::
56+
(Optional, string)
57+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
58+
3959
`service`::
4060
(Required, string)
4161
The type of service supported for the specified task type. In this case,

docs/reference/inference/service-elser.asciidoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,26 @@ Available task types:
3636
[[infer-service-elser-api-request-body]]
3737
==== {api-request-body-title}
3838

39+
`chunking_settings`::
40+
(Optional, object)
41+
include::inference-shared.asciidoc[tag=chunking-settings]
42+
43+
`max_chunk_size`:::
44+
(Optional, integer)
45+
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]
46+
47+
`overlap`:::
48+
(Optional, integer)
49+
include::inference-shared.asciidoc[tag=chunking-settings-overlap]
50+
51+
`sentence_overlap`:::
52+
(Optional, integer)
53+
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]
54+
55+
`strategy`:::
56+
(Optional, string)
57+
include::inference-shared.asciidoc[tag=chunking-settings-strategy]
58+
3959
`service`::
4060
(Required, string)
4161
The type of service supported for the specified task type. In this case,
