10 changes: 6 additions & 4 deletions docs/reference/inference/inference-apis.asciidoc
@@ -20,6 +20,7 @@ the following APIs to manage {infer} models and perform {infer}:
* <<post-inference-api>>
* <<put-inference-api>>
* <<stream-inference-api>>
* <<unified-inference-api>>
* <<update-inference-api>>

[[inference-landscape]]
@@ -28,9 +29,9 @@ image::images/inference-landscape.jpg[A representation of the Elastic inference

An {infer} endpoint enables you to use the corresponding {ml} model without
manual deployment and apply it to your data at ingestion time through
<<semantic-search-semantic-text, semantic text>>.

Choose a model from your provider or use ELSER – a retrieval model trained by
Elastic – then create an {infer} endpoint with the <<put-inference-api>>.
Now use <<semantic-search-semantic-text, semantic text>> to perform
<<semantic-search, semantic search>> on your data.
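
The workflow above can be sketched as two calls – first creating the endpoint, then mapping a `semantic_text` field to it. The endpoint and index names below (`my-elser-endpoint`, `my-index`) are placeholders, not defaults:

[source,js]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": ".elser_model_2"
  }
}

PUT my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}
------------------------------------------------------------
// NOTCONSOLE

A <<semantic-search, semantic search>> query against the `content` field then uses the endpoint automatically at both ingest and query time.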
@@ -61,7 +62,7 @@ The following list contains the default {infer} endpoints listed by `inference_id`:
Use the `inference_id` of the endpoint in a <<semantic-text,`semantic_text`>> field definition or when creating an <<inference-processor,{infer} processor>>.
The API call will automatically download and deploy the model which might take a couple of minutes.
Default {infer} endpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
For these models, the minimum number of allocations is `0`.
If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
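
As a sketch of the same behavior on a custom endpoint, adaptive allocations with a minimum of `0` can be configured explicitly at creation time (the endpoint name here is a placeholder):

[source,js]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-scaling-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 0,
      "max_number_of_allocations": 4
    },
    "num_threads": 1,
    "model_id": ".elser_model_2"
  }
}
------------------------------------------------------------
// NOTCONSOLE

With `min_number_of_allocations` set to `0`, the deployment scales down to zero allocations after the idle period and consumes no resources while unused.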


@@ -78,7 +79,7 @@ Returning a long document in search results is less useful than providing the mo
Each chunk will include the text subpassage and the corresponding embedding generated from it.

By default, documents are split into sentences and grouped in sections up to 250 words with 1 sentence overlap so that each chunk shares a sentence with the previous chunk.
Overlapping ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.
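
The defaults described above correspond to the `sentence` strategy with a 250-word maximum and one sentence of overlap. A sketch of setting them explicitly via `chunking_settings` (the endpoint name is a placeholder):

[source,js]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-chunking-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": ".elser_model_2"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
------------------------------------------------------------
// NOTCONSOLE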

{es} uses the https://unicode-org.github.io/icu-docs/[ICU4J] library to detect word and sentence boundaries for chunking.
https://unicode-org.github.io/icu/userguide/boundaryanalysis/#word-boundary[Word boundaries] are identified by following a series of rules, not just the presence of a whitespace character.
@@ -129,6 +130,7 @@ PUT _inference/sparse_embedding/small_chunk_size
include::delete-inference.asciidoc[]
include::get-inference.asciidoc[]
include::post-inference.asciidoc[]
include::unified-inference.asciidoc[]
include::put-inference.asciidoc[]
include::stream-inference.asciidoc[]
include::update-inference.asciidoc[]
42 changes: 41 additions & 1 deletion docs/reference/inference/inference-shared.asciidoc
@@ -41,7 +41,7 @@ end::chunking-settings[]
tag::chunking-settings-max-chunking-size[]
Specifies the maximum size of a chunk in words.
Defaults to `250`.
This value cannot be higher than `300` or lower than `20` (for `sentence` strategy) or `10` (for `word` strategy).
end::chunking-settings-max-chunking-size[]

tag::chunking-settings-overlap[]
@@ -63,4 +63,44 @@ Specifies the chunking strategy.
It can be either `sentence` or `word`.
end::chunking-settings-strategy[]
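
As a sketch of how these settings combine, the `word` strategy pairs `max_chunk_size` with an `overlap` value of at most half the chunk size (the values shown are illustrative, not defaults):

[source,js]
------------------------------------------------------------
"chunking_settings": {
  "strategy": "word",
  "max_chunk_size": 120,
  "overlap": 60
}
------------------------------------------------------------
// NOTCONSOLE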

tag::unified-schema-content-with-examples[]
Contributor Author:

I import this a couple of times for each of the messages since they all have the same format (a string or an array of objects).

.Examples
[%collapsible%closed]
======
String example
[source,js]
------------------------------------------------------------
{
"content": "Some string"
}
------------------------------------------------------------
// NOTCONSOLE

Object example
[source,js]
------------------------------------------------------------
{
"content": [
{
"text": "Some text",
"type": "text"
}
]
}
------------------------------------------------------------
// NOTCONSOLE
======

String representation:::
(Required, string)
The text content.
+
Object representation:::
`text`::::
Contributor Author:

We get lucky that each of the messages has `content`::, so the colons work out here to nest it correctly.

(Required, string)
The text content.
+
`type`::::
(Required, string)
This must be set to the value `text`.
end::unified-schema-content-with-examples[]
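
As a hedged sketch of where these content formats appear, a unified request sends them inside the `messages` array (the endpoint name `my-chat-endpoint` is a placeholder):

[source,js]
------------------------------------------------------------
POST _inference/chat_completion/my-chat-endpoint/_stream
{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "What is Elastic?",
          "type": "text"
        }
      ]
    }
  ]
}
------------------------------------------------------------
// NOTCONSOLE

The string representation is interchangeable here: `"content": "What is Elastic?"` expresses the same message.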