Commit 39fde5b

kosabogi, szabosteve, and davidkyle authored
[DOCS] [8.17] Adds new default inference endpoint information (#117985) (#118239)
* Adds new default inference information
* Update docs/reference/mapping/types/semantic-text.asciidoc
* Update docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc
* Update docs/reference/mapping/types/semantic-text.asciidoc

---------

Co-authored-by: István Zoltán Szabó <[email protected]>
Co-authored-by: David Kyle <[email protected]>
1 parent c90bf0d commit 39fde5b

3 files changed: +21 -57 lines changed

docs/reference/mapping/types/semantic-text.asciidoc

Lines changed: 10 additions & 9 deletions
@@ -12,13 +12,14 @@ Long passages are <<auto-text-chunking, automatically chunked>> to smaller secti
 
 The `semantic_text` field type specifies an inference endpoint identifier that will be used to generate embeddings.
 You can create the inference endpoint by using the <<put-inference-api>>.
-This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
-If you don't specify an inference endpoint, the <<infer-service-elser,ELSER service>> is used by default.
+This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
+
+If you don’t specify an inference endpoint, the `inference_id` field defaults to `.elser-2-elasticsearch`, a preconfigured endpoint for the elasticsearch service.
 
 Using `semantic_text`, you won't need to specify how to generate embeddings for your data, or how to index it.
 The {infer} endpoint automatically determines the embedding generation, indexing, and query to use.
 
-If you use the ELSER service, you can set up `semantic_text` with the following API request:
+If you use the preconfigured `.elser-2-elasticsearch` endpoint, you can set up `semantic_text` with the following API request:
 
 [source,console]
 ------------------------------------------------------------
@@ -34,7 +35,7 @@ PUT my-index-000001
 }
 ------------------------------------------------------------
 
-If you use a service other than ELSER, you must create an {infer} endpoint using the <<put-inference-api>> and reference it when setting up `semantic_text` as the following example demonstrates:
+To use a custom {infer} endpoint instead of the default `.elser-2-elasticsearch`, you must <<put-inference-api>> and specify its `inference_id` when setting up the `semantic_text` field type.
 
 [source,console]
 ------------------------------------------------------------
@@ -53,8 +54,7 @@ PUT my-index-000002
 // TEST[skip:Requires inference endpoint]
 <1> The `inference_id` of the {infer} endpoint to use to generate embeddings.
 
-
-The recommended way to use semantic_text is by having dedicated {infer} endpoints for ingestion and search.
+The recommended way to use `semantic_text` is by having dedicated {infer} endpoints for ingestion and search.
 This ensures that search speed remains unaffected by ingestion workloads, and vice versa.
 After creating dedicated {infer} endpoints for both, you can reference them using the `inference_id` and `search_inference_id` parameters when setting up the index mapping for an index that uses the `semantic_text` field.
 
@@ -82,10 +82,11 @@ PUT my-index-000003
 
 `inference_id`::
 (Required, string)
-{infer-cap} endpoint that will be used to generate the embeddings for the field.
+{infer-cap} endpoint that will be used to generate embeddings for the field.
+By default, `.elser-2-elasticsearch` is used.
 This parameter cannot be updated.
 Use the <<put-inference-api>> to create the endpoint.
-If `search_inference_id` is specified, the {infer} endpoint defined by `inference_id` will only be used at index time.
+If `search_inference_id` is specified, the {infer} endpoint will only be used at index time.
 
 `search_inference_id`::
 (Optional, string)
@@ -208,7 +209,7 @@ PUT test-index
     "properties": {
       "infer_field": {
         "type": "semantic_text",
-        "inference_id": "my-elser-endpoint"
+        "inference_id": ".elser-2-elasticsearch"
       },
       "source_field": {
         "type": "text",
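
The defaulting rules described in this file's changes can be sketched in Python. This is only an illustration under stated assumptions, not part of the commit and not Elasticsearch code: `resolve_endpoints` and `DEFAULT_INFERENCE_ID` are hypothetical names, and the fallback of `search_inference_id` to `inference_id` is assumed from the parameter description rather than shown in these hunks.

```python
from typing import Dict, Tuple

# Default endpoint named in the updated docs.
DEFAULT_INFERENCE_ID = ".elser-2-elasticsearch"


def resolve_endpoints(field_mapping: Dict[str, str]) -> Tuple[str, str]:
    """Return (index_time_endpoint, query_time_endpoint) for a semantic_text field.

    Hypothetical helper: it mirrors the documented defaults only and does not
    reflect how Elasticsearch resolves endpoints internally.
    """
    if field_mapping.get("type") != "semantic_text":
        raise ValueError("not a semantic_text field")
    # With no inference_id, the docs say the preconfigured endpoint is used.
    index_endpoint = field_mapping.get("inference_id", DEFAULT_INFERENCE_ID)
    # Assumption: without search_inference_id, the same endpoint serves queries.
    search_endpoint = field_mapping.get("search_inference_id", index_endpoint)
    return index_endpoint, search_endpoint


if __name__ == "__main__":
    # Mapping that omits inference_id entirely.
    print(resolve_endpoints({"type": "semantic_text"}))
    # Mapping with dedicated ingestion and search endpoints (example names).
    print(resolve_endpoints({
        "type": "semantic_text",
        "inference_id": "my-ingest-endpoint",
        "search_inference_id": "my-search-endpoint",
    }))
```

When `search_inference_id` is present, the first element is used only at index time, which matches the wording of the updated `inference_id` description.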

docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

Lines changed: 4 additions & 4 deletions
@@ -14,15 +14,15 @@ You don't need to define model related settings and parameters, or create {infer
 The recommended way to use <<semantic-search,semantic search>> in the {stack} is following the `semantic_text` workflow.
 When you need more control over indexing and query settings, you can still use the complete {infer} workflow (refer to <<semantic-search-inference,this tutorial>> to review the process).
 
-This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and their supported models offered by the {infer-cap} API.
+This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, but you can use any service and their supported models offered by the {infer-cap} API.
 
 
 [discrete]
 [[semantic-text-requirements]]
 ==== Requirements
 
-This tutorial uses the <<infer-service-elser,ELSER service>> for demonstration, which is created automatically as needed.
-To use the `semantic_text` field type with an {infer} service other than ELSER, you must create an inference endpoint using the <<put-inference-api>>.
+This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, which is created automatically as needed.
+To use the `semantic_text` field type with an {infer} service other than `elasticsearch` service, you must create an inference endpoint using the <<put-inference-api>>.
 
 
 [discrete]
@@ -48,7 +48,7 @@ PUT semantic-embeddings
 // TEST[skip:TBD]
 <1> The name of the field to contain the generated embeddings.
 <2> The field to contain the embeddings is a `semantic_text` field.
-Since no `inference_id` is provided, the <<infer-service-elser,ELSER service>> is used by default.
+Since no `inference_id` is provided, the default endpoint `.elser-2-elasticsearch` for the <<infer-service-elasticsearch,`elasticsearch` service>> is used.
 To use a different {infer} service, you must create an {infer} endpoint first using the <<put-inference-api>> and then specify it in the `semantic_text` field mapping using the `inference_id` parameter.
 
 

docs/reference/search/search-your-data/semantic-text-hybrid-search

Lines changed: 7 additions & 44 deletions
@@ -8,47 +8,12 @@ This tutorial demonstrates how to perform hybrid search, combining semantic sear
 
 In hybrid search, semantic search retrieves results based on the meaning of the text, while full-text search focuses on exact word matches. By combining both methods, hybrid search delivers more relevant results, particularly in cases where relying on a single approach may not be sufficient.
 
-The recommended way to use hybrid search in the {stack} is following the `semantic_text` workflow. This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.
-
-[discrete]
-[[semantic-text-hybrid-infer-endpoint]]
-==== Create the {infer} endpoint
-
-Create an inference endpoint by using the <<put-inference-api>>:
-
-[source,console]
-------------------------------------------------------------
-PUT _inference/sparse_embedding/my-elser-endpoint <1>
-{
-  "service": "elser", <2>
-  "service_settings": {
-    "adaptive_allocations": { <3>
-      "enabled": true,
-      "min_number_of_allocations": 3,
-      "max_number_of_allocations": 10
-    },
-    "num_threads": 1
-  }
-}
-------------------------------------------------------------
-// TEST[skip:TBD]
-<1> The task type is `sparse_embedding` in the path as the `elser` service will
-be used and ELSER creates sparse vectors. The `inference_id` is
-`my-elser-endpoint`.
-<2> The `elser` service is used in this example.
-<3> This setting enables and configures adaptive allocations.
-Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
-
-[NOTE]
-====
-You might see a 502 bad gateway error in the response when using the {kib} Console.
-This error usually just reflects a timeout, while the model downloads in the background.
-You can check the download progress in the {ml-app} UI.
-====
+The recommended way to use hybrid search in the {stack} is following the `semantic_text` workflow.
+This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, but you can use any service and their supported models offered by the {infer-cap} API.
 
 [discrete]
 [[hybrid-search-create-index-mapping]]
-==== Create an index mapping for hybrid search
+==== Create an index mapping
 
 The destination index will contain both the embeddings for semantic search and the original text field for full-text search. This structure enables the combination of semantic search and full-text search.
 
@@ -60,21 +25,19 @@ PUT semantic-embeddings
     "properties": {
       "semantic_text": { <1>
         "type": "semantic_text",
-        "inference_id": "my-elser-endpoint" <2>
       },
-      "content": { <3>
+      "content": { <2>
         "type": "text",
-        "copy_to": "semantic_text" <4>
+        "copy_to": "semantic_text" <3>
       }
     }
   }
 }
 ------------------------------------------------------------
 // TEST[skip:TBD]
 <1> The name of the field to contain the generated embeddings for semantic search.
-<2> The identifier of the inference endpoint that generates the embeddings based on the input text.
-<3> The name of the field to contain the original text for lexical search.
-<4> The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {infer} endpoint.
+<2> The name of the field to contain the original text for lexical search.
+<3> The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {infer} endpoint.
 
 [NOTE]
 ====
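
For readers following the hybrid search hunk above, the mapping body after this change can be sketched programmatically. The sketch is illustrative and not part of the commit: `build_hybrid_mapping` and `is_strict_json` are hypothetical helpers, and the field names are taken from the hunk. One observation from the diff as shown: the retained `"type": "semantic_text",` line appears to keep its trailing comma after the `inference_id` line was removed, which a strict JSON parser would reject; building the body from a dictionary avoids that class of error.

```python
import json
from typing import Any, Dict, Optional


def build_hybrid_mapping(inference_id: Optional[str] = None) -> Dict[str, Any]:
    """Build the index-creation body for the hybrid search tutorial.

    Hypothetical helper. With inference_id=None the key is omitted, so the
    field relies on the default endpoint described in the updated docs.
    """
    semantic_field: Dict[str, Any] = {"type": "semantic_text"}
    if inference_id is not None:
        semantic_field["inference_id"] = inference_id
    return {
        "mappings": {
            "properties": {
                # Field holding the generated embeddings for semantic search.
                "semantic_text": semantic_field,
                # Original text for lexical search, copied into semantic_text.
                "content": {"type": "text", "copy_to": "semantic_text"},
            }
        }
    }


def is_strict_json(text: str) -> bool:
    """Return True if Python's strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    # Print the request in console form: method and index, then the body.
    print("PUT semantic-embeddings")
    print(json.dumps(build_hybrid_mapping(), indent=2))
```

Passing an explicit identifier, for example `build_hybrid_mapping("my-elser-endpoint")`, reproduces the pre-change mapping that referenced a manually created endpoint.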
