# Get started with semantic search in {{es-serverless}}
10
10
11
-
<!--
12
-
As you ramp up on Elastic, you'll use the Elasticsearch Relevance Engine (ESRE), designed to power AI search applications. With ESRE, you can take advantage of a suite of developer tools including Elastic's textual search, vector database, and our proprietary transformer model for semantic search.
13
-
-->
14
-
15
-
Elastic offers a variety of search techniques, starting with BM25, the industry standard for textual search.
16
-
It provides precise matching for specific searches, matching exact keywords, and it improves with tuning.
17
-
18
-
<!--
19
-
As you get started on vector search, keep in mind there are two forms of vector search: “dense” (aka, kNN vector search) and “sparse."
20
-
TBD: Which type is implemented when you use semantic_text field?
21
-
-->
11
+
_Semantic search_ is a type of AI-powered search that enables you to use intuitive language in your queries.
12
+
It returns results that match the meaning of a query, as opposed to literal keyword matches.
13
+
For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document.
22
14
23
-
Elastic also offers an out-of-the-box Learned Sparse Encoder model ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser)) for semantic search.
24
-
This model outperforms on a variety of data sets, such as financial data, weather records, and question-answer pairs, among others.
15
+
Elastic offers an out-of-the-box Learned Sparse Encoder model ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser)) that performs well on a variety of data sets, such as financial data, weather records, and question-answer pairs.
25
16
The model is built to provide great relevance across domains, without the need for additional fine tuning.
26
-
27
-
<!--
28
-
Check out this interactive demo to see how search results are more relevant when you test Elastic's Learned Sparse Encoder model against Elastic's textual BM25 algorithm.
29
-
30
-
In addition, Elastic also supports dense vectors to implement similarity search on unstructured data beyond text, such as videos, images, and audio.
31
-
-->
32
-
33
-
The advantage of [AI-powered search](/solutions/search/ai-search/ai-search.md) is that these technologies enable you to use intuitive language in your search queries.
34
-
For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document.
17
+
If you want to check out all the use cases and implementation paths, go to [](/solutions/search/ai-search/ai-search.md).
35
18
36
19
## Prerequisites
37
20
38
-
To try out semantic search, [create an {{es-serverless}} project](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-get-started-create-project) that is optimized for vectors.
21
+
To try out semantic search, log in to an [{{es-serverless}} project](/solutions/search/serverless-elasticsearch-get-started.md) that is optimized for vectors.
22
+
If you want to add sample data, you must have a `developer` or `admin` [predefined role](/deploy-manage/users-roles/cloud-organization/user-roles.md#general-assign-user-roles-table) or an equivalent custom role.
39
23
40
24
<!--
41
-
TBD: It seems like semantic search fields exist in all, so what is the value of this option?
42
-
TBD: Can all roles perform these steps?
25
+
TBD: It seems like semantic search fields exist in all, so what is the value of this "optimized for vectors" option?
43
26
-->
44
27
45
28
## Add data
46
29
47
30
% TBD: What type of data is ideal for semantic search?
48
31
49
-
There are some simple data sets that you can use for learning purposes.
50
-
For example, if you follow the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow), you can choose the semantic search option.
32
+
There are some small data sets available for learning purposes when you select the semantic search workflow in the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow).
51
33
Follow the instructions to install an {{es}} client and copy the code examples.
52
34
Alternatively, try out the API requests in the [Console](/explore-analyze/query-filter/tools/console.md):
53
35
54
36
:::::{stepper}
55
37
56
38
::::{step} Define a semantic text field
57
39
58
-
The following example creates a mapping for a single [semantic_text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field:
40
+
You can implement semantic search with varying levels of complexity and customization.
41
+
To get started, the recommended method is to use [semantic_text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) fields.
42
+
43
+
The following example creates a mapping for a single field:
59
44
60
45
```console
61
46
PUT /semantic-index/_mapping
@@ -72,13 +57,7 @@ PUT /semantic-index/_mapping
72
57
73
58
::::{step} Add documents
74
59
75
-
You can use the Elasticsearch bulk API to ingest an array of documents.
76
-
The initial bulk ingestion request could take longer than the default request timeout.
77
-
If the following request times out, allow time for the machine learning model loading to complete (typically 1-5 minutes) then retry it:
78
-
79
-
<!--
80
-
TBD: Describe where to look for the downloaded model in Trained Models?
81
-
-->
60
+
You can use the Elasticsearch bulk API to ingest an array of documents:
82
61
83
62
```console
84
63
POST /_bulk?pretty
@@ -89,45 +68,88 @@ POST /_bulk?pretty
89
68
{ "index": { "_index": "semantic-index" } }
90
69
{"text":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."}
91
70
```
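
If you prefer to assemble the bulk payload in code rather than paste it into Console, the following is a minimal sketch of the newline-delimited JSON (NDJSON) format that the bulk API expects. The helper name `build_bulk_body` and the second sample document are assumptions made for illustration; only the index name and the `text` field come from this guide.

```python
import json


def build_bulk_body(index_name, documents):
    """Return an NDJSON bulk payload: one action line plus one source line per document."""
    lines = []
    for doc in documents:
        # Action line: tells the bulk API to index the next line into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the payload to end with a newline character.
    return "\n".join(lines) + "\n"


docs = [
    {"text": "Rocky Mountain National Park is one of the most popular national parks in the United States."},
    # Hypothetical second document, included only to show the repeating pattern.
    {"text": "A second, hypothetical document about a different park."},
]

body = build_bulk_body("semantic-index", docs)
print(body)
```

The resulting string can be sent as the request body of `POST /_bulk` by any HTTP or {{es}} client; how you send it is left out of this sketch.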
92
-
::::
93
-
:::::
94
71
95
-
What just happened? The content was transformed into a sparse vector inside the `text` field.
96
-
This transformation involves two main steps.
72
+
The bulk ingestion request might take longer than the default request timeout.
73
+
If it times out, allow time for the machine learning model to finish loading (typically 1-5 minutes), then retry the request.
74
+
75
+
<!--
76
+
TBD: Describe where to look for the downloaded model in Trained Models?
77
+
-->
78
+
79
+
What just happened? The content was transformed into a sparse vector, which involves two main steps.
97
80
First, the content is divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched. Next, each chunk of text is transformed into a sparse vector representation using text expansion techniques.
98
81
By default, `semantic_text` fields leverage ELSER to convert the text into a format that captures the semantic meaning.
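
To make the two steps more concrete, here is a deliberately simplified Python sketch. It is not how ELSER or `semantic_text` works internally: the fixed-size word chunker and the term-count "vector" are stand-ins chosen for illustration, and every function name is hypothetical. A real text expansion model assigns learned weights and adds related terms that do not appear in the text.

```python
from collections import Counter


def chunk_words(text, max_words=5):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_sparse_vector(chunk):
    """Map each lowercase term to its count; a stand-in for learned term weights."""
    tokens = [w.strip(".,").lower() for w in chunk.split()]
    return dict(Counter(t for t in tokens if t))


def dot(a, b):
    """Score two sparse vectors by summing the products of their shared terms."""
    return sum(weight * b.get(term, 0) for term, weight in a.items())


text = "The park is a popular destination for hiking, camping, and wildlife viewing."
chunks = chunk_words(text)
query_vector = toy_sparse_vector("hiking in the park")

# Score every chunk against the query; higher means more shared terms.
scores = [dot(query_vector, toy_sparse_vector(chunk)) for chunk in chunks]
for chunk, score in zip(chunks, scores):
    print(score, chunk)
```

Because only the non-zero terms are stored, each vector stays small even when the vocabulary is large, which is the property that the word "sparse" refers to.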
99
82
83
+
% TBD: Confirm "Elser model" vs ".elser-2-elasticsearch" terminology.
To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or by using the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
90
+
91
+
In **Discover**, you can click the expand icon to show details about any document in the table.
For more tips, check out [](/explore-analyze/discover/discover-get-started.md).
99
+
::::
100
+
:::::
102
101
<!--
103
-
TBD: Confirm "Elser model" vs ".elser-2-elasticsearch, a preconfigured endpoint for the elasticsearch service".
104
-
TBD: Show how this data looks in Discover, do you see the text or just the vectors?
105
-
TBD: Include the Elastic Open Web Crawler variation too?
102
+
TBD: When you view these documents in Discover they're shown as having "text" field type instead of "semantic_text" is this right?
103
+
TBD: Should we call out that the KQL filters in Discover don't seem to work against semantic_text fields yet?
106
104
-->
107
105
108
-
## Test a semantic search query
106
+
## Test semantic search queries
107
+
108
+
Elasticsearch provides a variety of query languages for interacting with your data.
109
+
For an overview of their features and use cases, check out [](/explore-analyze/query-filter/languages.md).
110
+
111
+
You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`. Refer to [](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) for the complete list.
112
+
113
+
Let's try out two types of queries in two different languages.
114
+
115
+
:::::{stepper}
116
+
117
+
::::{step} Run a semantic query in Query DSL
118
+
119
+
Open the **{{index-manage-app}}** from the navigation menu or return to the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow) to find code examples for searching the sample data.
Try running some queries to check the accuracy and relevance of the search results.
111
-
For example, use some keywords that don't exist in the documents:
127
+
For example, click **Run in Console** and use some search terms that you did not see when you explored the documents:
112
128
113
129
```console
114
-
GET semantic-index/_search
130
+
POST /semantic-index/_search
115
131
{
116
-
"query": {
117
-
"semantic": {
118
-
"field": "text",
119
-
"query": "best parks for rappelling"
132
+
"retriever": {
133
+
"standard": {
134
+
"query": {
135
+
"semantic": {
136
+
"field": "text",
137
+
"query": "best park for rappelling"
138
+
}
139
+
}
120
140
}
121
141
}
122
142
}
123
143
```
124
144
145
+
This is a [semantic](/reference/query-languages/query-dsl/query-dsl-semantic-query.md) query that is expressed in [Query Domain Specific Language](/explore-analyze/query-filter/languages/querydsl.md) (DSL), which is the primary query language for {{es}}.
146
+
125
147
The query is translated automatically into a vector representation and runs against the contents of the semantic text field.
126
-
The search results are sorted by relevance score, which measures how well each document matches the query.
148
+
The search results are sorted by a relevance score, which measures how well each document matches the query.
127
149
128
150
```json
129
151
{
130
-
"took": 249,
152
+
"took": 22,
131
153
"timed_out": false,
132
154
"_shards": {
133
155
"total": 3,
@@ -137,33 +159,65 @@ The search results are sorted by relevance score, which measures how well each d
137
159
},
138
160
"hits": {
139
161
"total": {
140
-
"value": 6,
162
+
"value": 3,
141
163
"relation": "eq"
142
164
},
143
-
"max_score": 12.118624,
165
+
"max_score": 11.389743,
144
166
"hits": [
145
167
{
146
-
"_index": "search-0lxc",
147
-
"_id": "0lGtpJcB7hfWuB0FGC06",
148
-
"_score": 12.118624,
168
+
"_index": "semantic-index",
169
+
"_id": "Pp0MtJcBZjjo1YKoXkWH",
170
+
"_score": 11.389743,
149
171
"_source": {
150
-
"text": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."
151
-
}
152
-
},
153
-
...
172
+
"text": "Rocky Mountain National Park ...
154
173
```
155
174
175
+
In this example, the document related to Rocky Mountain National Park has the highest score.
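
If you run the same search from application code, you typically read the ranked hits out of the `hits.hits` array. The following sketch assumes a response that has already been parsed into a Python dictionary with the shape shown above. The first hit reuses the values from this guide; the second hit, including its ID and score, is a hypothetical placeholder, and `summarize_hits` is an illustrative helper name.

```python
def summarize_hits(response, preview_chars=30):
    """Return (score, id, text preview) tuples in the order the hits were returned."""
    summary = []
    for hit in response["hits"]["hits"]:
        text = hit["_source"]["text"]
        summary.append((hit["_score"], hit["_id"], text[:preview_chars]))
    return summary


response = {
    "hits": {
        "total": {"value": 3, "relation": "eq"},
        "max_score": 11.389743,
        "hits": [
            {
                "_index": "semantic-index",
                "_id": "Pp0MtJcBZjjo1YKoXkWH",
                "_score": 11.389743,
                "_source": {"text": "Rocky Mountain National Park ..."},
            },
            {
                # Hypothetical second hit, added only to show the ordering.
                "_index": "semantic-index",
                "_id": "hypothetical-id",
                "_score": 5.0,
                "_source": {"text": "A lower-scoring placeholder document."},
            },
        ],
    }
}

for score, doc_id, preview in summarize_hits(response):
    print(f"{score:.2f}  {doc_id}  {preview}")
```

The hits arrive already sorted by `_score`, so the first tuple corresponds to the top-ranked document and its score equals `max_score`.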
176
+
::::
177
+
::::{step} Run a match query in ES|QL
178
+
179
+
Another way to try out semantic search is by using the [match](/query-languages/esql/functions-operators/search-functions.md#esql-match) search function in the [Elasticsearch Query Language](/explore-analyze/query-filter/languages/esql.md) (ES|QL).
180
+
181
+
Go to **Discover** and select **Try ES|QL** from the application menu bar.
1. The `FROM` source command returns a table of data. Each row in the table represents a document. The `METADATA` clause provides access to the query relevance score, which is a [metadata field](/reference/query-languages/esql/esql-metadata-fields.md).
199
+
2. This command uses a simplified syntax for the `MATCH` search function to perform a semantic query on the specified field.
200
+
3. The `KEEP` processing command affects the columns and their order in the results table.
201
+
4. The results are sorted in descending order based on the `_score`.
202
+
5. The maximum number of rows to return.
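
The five parts described in the list can also be assembled programmatically. The sketch below builds an ES|QL query string and wraps it in a request body of the form accepted by the ES|QL query API (`POST /_query`). It is an illustration rather than the exact query from this guide: the helper name `build_esql_request`, the `:` operator as the simplified `MATCH` syntax, and the limit of 1000 rows are assumptions.

```python
import json


def build_esql_request(index_name, field, search_text, limit=1000):
    """Build a request body containing an ES|QL query with five commands."""
    # Escape backslashes and double quotes so the search text stays a valid string literal.
    escaped = search_text.replace("\\", "\\\\").replace('"', '\\"')
    query = (
        f"FROM {index_name} METADATA _score\n"   # 1. source command plus relevance score
        f'| WHERE {field}: "{escaped}"\n'        # 2. simplified MATCH syntax (assumed to be ":")
        f"| KEEP {field}, _score\n"              # 3. columns to show, in this order
        "| SORT _score DESC\n"                   # 4. highest relevance first
        f"| LIMIT {limit}"                       # 5. maximum number of rows
    )
    return {"query": query}


request_body = build_esql_request("semantic-index", "text", "best park for rappelling")
print(request_body["query"])
print(json.dumps(request_body))
```

Keeping the query in one string with one command per line mirrors how it appears in the **Discover** editor, which makes it easier to copy between the two.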
203
+
204
+
In this example, the first row in the table is the document that had the highest relevance score for the query.
205
+
206
+
To learn more, check out [](/explore-analyze/discover/try-esql.md) and [](/solutions/search/esql-for-search.md).
207
+
::::
208
+
:::::
156
209
<!--
157
210
TBD: Provide more information about how to interpret and filter the search results.
211
+
TBD: Include the Elastic Open Web Crawler variation too or point to it in another guide?
158
212
-->
159
213
160
214
## Next steps
161
215
162
216
Thanks for taking the time to try out semantic search in {{es-serverless}}.
163
-
For another semantic search example, check out [](/solutions/search/semantic-search/semantic-search-semantic-text.md).
217
+
For a deeper dive, check out [](/solutions/search/semantic-search.md).
164
218
165
219
If you want to extend this example, try an index with more fields.
166
220
For example, if you have both a `text` field and a `semantic_text` field, you can combine the strengths of traditional keyword search and advanced semantic search.
167
221
A [hybrid search](/solutions/search/hybrid-semantic-text.md) provides comprehensive search capabilities to find relevant information based on both the raw text and its underlying meaning.
168
222
169
-
To learn about more options, such as vector and keyword search, check out [](/solutions/search/search-approaches.md).
223
+
To learn about more options, such as vector and keyword search, go to [](/solutions/search/search-approaches.md).