Commit bb909bc

Updates chunk settings documentation (#116719)
(cherry picked from commit bada2a6)
1 parent fa541d2 commit bb909bc

1 file changed: +2 -3 lines changed

docs/reference/mapping/types/semantic-text.asciidoc

Lines changed: 2 additions & 3 deletions
@@ -87,16 +87,15 @@ Trying to <<delete-inference-api,delete an {infer} endpoint>> that is used on a
 
 [discrete]
 [[auto-text-chunking]]
-==== Automatic text chunking
+==== Text chunking
 
 {infer-cap} endpoints have a limit on the amount of text they can process.
 To allow for large amounts of text to be used in semantic search, `semantic_text` automatically generates smaller passages if needed, called _chunks_.
 
 Each chunk will include the text subpassage and the corresponding embedding generated from it.
 When querying, the individual passages will be automatically searched for each document, and the most relevant passage will be used to compute a score.
 
-Documents are split into 250-word sections with a 100-word overlap so that each section shares 100 words with the previous section.
-This overlap ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.
+For more details on chunking and how to configure chunking settings, see <<infer-chunking-config, Configuring chunking>> in the Inference API documentation.
 
 
 [discrete]
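
For readers of this diff, the behavior described in the two removed lines (250-word sections that share 100 words with the previous section) can be illustrated with a small sketch. This is only an illustration of overlapping word windows under stated assumptions, not the chunking code used by `semantic_text` or the Inference API: the function name `chunk_words`, its parameters, and the whitespace-based word splitting are all hypothetical.

```python
def chunk_words(text, max_words=250, overlap=100):
    """Split text into overlapping word windows (illustrative sketch only).

    Assumption: words are whitespace-separated tokens. The real chunker
    may tokenize differently; the defaults mirror the removed doc text.
    """
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")

    words = text.split()
    if not words:
        return []

    step = max_words - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the text, so no
        # trailing chunk consists solely of already-covered words.
        if start + max_words >= len(words):
            break
    return chunks
```

With the defaults, a 400-word input yields two chunks: words 1 to 250 and words 151 to 400, so the second chunk begins with the last 100 words of the first.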