diff --git a/docs/reference/mapping/types/semantic-text.asciidoc b/docs/reference/mapping/types/semantic-text.asciidoc index ac23c153e01a3..684ad7c369e7d 100644 --- a/docs/reference/mapping/types/semantic-text.asciidoc +++ b/docs/reference/mapping/types/semantic-text.asciidoc @@ -87,7 +87,7 @@ Trying to <> that is used on a [discrete] [[auto-text-chunking]] -==== Automatic text chunking +==== Text chunking {infer-cap} endpoints have a limit on the amount of text they can process. To allow for large amounts of text to be used in semantic search, `semantic_text` automatically generates smaller passages if needed, called _chunks_. @@ -95,8 +95,7 @@ To allow for large amounts of text to be used in semantic search, `semantic_text Each chunk will include the text subpassage and the corresponding embedding generated from it. When querying, the individual passages will be automatically searched for each document, and the most relevant passage will be used to compute a score. -Documents are split into 250-word sections with a 100-word overlap so that each section shares 100 words with the previous section. -This overlap ensures continuity and prevents vital contextual information in the input text from being lost by a hard break. +For more details on chunking and how to configure chunking settings, see <> in the Inference API documentation. [discrete]