55 changes: 55 additions & 0 deletions docs/reference/inference/inference-apis.asciidoc
@@ -34,6 +34,61 @@ Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
Now use <<semantic-search-semantic-text, semantic text>> to perform
<<semantic-search, semantic search>> on your data.


[discrete]
[[infer-chunking-config]]
=== Configuring chunking

{infer-cap} endpoints have a limit on the amount of text they can process at once.
To allow for large amounts of text to be processed, {infer} endpoints automatically split the input into smaller, more manageable passages called _chunks_ when needed.
The chunks are then processed by the {infer} process.

Each chunk includes the text subpassage and the corresponding embedding generated from it.

By default, documents are split into 250-word sections with 1 sentence overlap so that each chunk shares a sentence with the previous chunk.
Overlapping ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.
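
As a sketch of these defaults (using the `elasticsearch` service, as in the example later in this section, and the placeholder endpoint name `default_chunking`), the following request spells the default values out explicitly in `chunking_settings`; omitting the object gives the same behavior for newly created endpoints.

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/default_chunking
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
------------------------------------------------------------
// TEST[skip:TBD]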

[discrete]
==== Chunking strategies

Two strategies are available for chunking: `sentence` and `word`.

The `sentence` strategy splits the input text at sentence boundaries.
Each chunk contains one or more complete sentences, ensuring that sentence-level context is preserved, unless a single sentence exceeds `max_chunk_size`.

The `word` strategy splits the input text based on individual words.
For the `word` strategy, {es} uses the https://unicode-org.github.io/icu-docs/[ICU4J] library to detect word boundaries.
https://unicode-org.github.io/icu/userguide/boundaryanalysis/#word-boundary[Word boundaries] are identified by following a series of rules, not just the presence of a whitespace character.
For written languages that do not use whitespace, such as Chinese or Japanese, dictionary lookups are used to detect word boundaries.

The default chunking strategy is `sentence`.

NOTE: The default chunking strategy for {infer} endpoints created before 8.16 is `word`.

[discrete]
==== Example of configuring the chunking behavior

The following example creates an {infer} endpoint with the `elasticsearch` service, which deploys the ELSER model by default, and configures the chunking behavior.

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/small_chunk_size
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 100,
    "sentence_overlap": 0
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
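
As an additional sketch, the next request configures the `word` strategy with a custom overlap; the endpoint name `word_chunks` and the setting values are illustrative only.

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/word_chunks
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "chunking_settings": {
    "strategy": "word",
    "max_chunk_size": 120,
    "overlap": 40
  }
}
------------------------------------------------------------
// TEST[skip:TBD]

With these settings, each chunk contains at most 120 words and consecutive chunks share 40 overlapping words.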


include::delete-inference.asciidoc[]
include::get-inference.asciidoc[]
include::post-inference.asciidoc[]
34 changes: 33 additions & 1 deletion docs/reference/inference/inference-shared.asciidoc
@@ -31,4 +31,36 @@ end::task-settings[]

tag::task-type[]
The type of the {infer} task that the model will perform.
end::task-type[]

tag::chunking-settings[]
Chunking configuration object.
Refer to <<infer-chunking-config>> to learn more about chunking.
end::chunking-settings[]

tag::chunking-settings-max-chunking-size[]
Specifies the maximum size of a chunk in words.
Defaults to `250`.
This value cannot be higher than `300` or lower than `20` (for the `sentence` strategy) or `15` (for the `word` strategy).
end::chunking-settings-max-chunking-size[]

tag::chunking-settings-overlap[]
Only for the `word` chunking strategy.
Specifies the number of overlapping words for chunks.
Defaults to `100`.
This value cannot be higher than half of `max_chunk_size`.
end::chunking-settings-overlap[]

tag::chunking-settings-sentence-overlap[]
Only for the `sentence` chunking strategy.
Specifies the number of overlapping sentences for chunks.
It can be either `1` or `0`.
Defaults to `1`.
end::chunking-settings-sentence-overlap[]

tag::chunking-settings-strategy[]
Specifies the chunking strategy.
It can be either `sentence` or `word`.
end::chunking-settings-strategy[]


21 changes: 20 additions & 1 deletion docs/reference/inference/service-alibabacloud-ai-search.asciidoc
@@ -34,6 +34,26 @@ Available task types:
[[infer-service-alibabacloud-ai-search-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string) The type of service supported for the specified task type.
In this case,
@@ -108,7 +128,6 @@ To modify this, set the `requests_per_minute` setting of this object in your ser
include::inference-shared.asciidoc[tag=request-per-minute-example]
--


`task_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=task-settings]
20 changes: 20 additions & 0 deletions docs/reference/inference/service-amazon-bedrock.asciidoc
@@ -32,6 +32,26 @@ Available task types:
[[infer-service-amazon-bedrock-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string) The type of service supported for the specified task type.
In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-anthropic.asciidoc
@@ -32,6 +32,26 @@ Available task types:
[[infer-service-anthropic-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-azure-ai-studio.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-azure-ai-studio-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-azure-openai.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-azure-openai-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-cohere.asciidoc
@@ -34,6 +34,26 @@ Available task types:
[[infer-service-cohere-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-elasticsearch.asciidoc
@@ -39,6 +39,26 @@ Available task types:
[[infer-service-elasticsearch-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-elser.asciidoc
@@ -35,6 +35,26 @@ Available task types:
[[infer-service-elser-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-google-ai-studio.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-google-ai-studio-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-google-vertex-ai.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-google-vertex-ai-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,