
Commit 45b8fa4

Adding unified docs to main page
1 parent 16d0f62 commit 45b8fa4

File tree

1 file changed

docs/reference/inference/inference-apis.asciidoc

Lines changed: 6 additions & 4 deletions
@@ -20,6 +20,7 @@ the following APIs to manage {infer} models and perform {infer}:
 * <<post-inference-api>>
 * <<put-inference-api>>
 * <<stream-inference-api>>
+* <<unified-inference-api>>
 * <<update-inference-api>>

 [[inference-landscape]]
@@ -28,9 +29,9 @@ image::images/inference-landscape.jpg[A representation of the Elastic inference

 An {infer} endpoint enables you to use the corresponding {ml} model without
 manual deployment and apply it to your data at ingestion time through
-<<semantic-search-semantic-text, semantic text>>.
+<<semantic-search-semantic-text, semantic text>>.

-Choose a model from your provider or use ELSER – a retrieval model trained by
+Choose a model from your provider or use ELSER – a retrieval model trained by
 Elastic –, then create an {infer} endpoint with the <<put-inference-api>>.
 Now use <<semantic-search-semantic-text, semantic text>> to perform
 <<semantic-search, semantic search>> on your data.
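
For orientation (not part of the diff above): a minimal sketch of the flow this passage describes, assuming an ELSER endpoint and an index whose names (my-elser-endpoint, my-index) are illustrative, not taken from the docs:

PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

PUT my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}

Documents indexed into my-index then have their content field chunked and embedded through the endpoint automatically, so a semantic query needs no manual inference call.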
@@ -61,7 +62,7 @@ The following list contains the default {infer} endpoints listed by `inference_i
 Use the `inference_id` of the endpoint in a <<semantic-text,`semantic_text`>> field definition or when creating an <<inference-processor,{infer} processor>>.
 The API call will automatically download and deploy the model which might take a couple of minutes.
 Default {infer} endpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
-For these models, the minimum number of allocations is `0`.
+For these models, the minimum number of allocations is `0`.
 If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.


@@ -78,7 +79,7 @@ Returning a long document in search results is less useful than providing the mo
 Each chunk will include the text subpassage and the corresponding embedding generated from it.

 By default, documents are split into sentences and grouped in sections up to 250 words with 1 sentence overlap so that each chunk shares a sentence with the previous chunk.
-Overlapping ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.
+Overlapping ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.

 {es} uses the https://unicode-org.github.io/icu-docs/[ICU4J] library to detect word and sentence boundaries for chunking.
 https://unicode-org.github.io/icu/userguide/boundaryanalysis/#word-boundary[Word boundaries] are identified by following a series of rules, not just the presence of a whitespace character.
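
The small_chunk_size endpoint named in the next hunk's context line presumably overrides these defaults. A sketch of what such a request could look like, assuming the sentence chunking strategy (the service settings and the value 100 are illustrative, not taken from this diff):

PUT _inference/sparse_embedding/small_chunk_size
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 100,
    "sentence_overlap": 1
  }
}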
@@ -129,6 +130,7 @@ PUT _inference/sparse_embedding/small_chunk_size
 include::delete-inference.asciidoc[]
 include::get-inference.asciidoc[]
 include::post-inference.asciidoc[]
+include::unified-inference.asciidoc[]
 include::put-inference.asciidoc[]
 include::stream-inference.asciidoc[]
 include::update-inference.asciidoc[]
