
Commit 71355f5

Add ES|QL and Discover steps
1 parent 17e41b6 commit 71355f5

4 files changed

+116
-62
lines changed


solutions/search/serverless-elasticsearch-get-started-semantic.md

Lines changed: 116 additions & 62 deletions
@@ -8,54 +8,39 @@ products:
88
---
99
# Get started with semantic search in {{es-serverless}}
1010

11-
<!--
12-
As you ramp up on Elastic, you'll use the Elasticsearch Relevance Engine (ESRE), designed to power AI search applications. With ESRE, you can take advantage of a suite of developer tools including Elastic's textual search, vector database, and our proprietary transformer model for semantic search.
13-
-->
14-
15-
Elastic offers a variety of search techniques, starting with BM25, the industry standard for textual search.
16-
It provides precise matching for specific searches, matching exact keywords, and it improves with tuning.
17-
18-
<!--
19-
As you get started on vector search, keep in mind there are two forms of vector search: “dense” (aka, kNN vector search) and “sparse."
20-
TBD: Which type is implemented when you use semantic_text field?
21-
-->
11+
_Semantic search_ is a type of AI-powered search that enables you to use intuitive language in your queries.
12+
It returns results that match the meaning of a query, as opposed to literal keyword matches.
13+
For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document.
2214

23-
Elastic also offers an out-of-the-box Learned Sparse Encoder model ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser)) for semantic search.
24-
This model outperforms on a variety of data sets, such as financial data, weather records, and question-answer pairs, among others.
15+
Elastic offers an out-of-the-box Learned Sparse Encoder model ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser)) that outperforms on a variety of data sets, such as financial data, weather records, and question-answer pairs.
2516
The model is built to provide great relevance across domains, without the need for additional fine tuning.
26-
27-
<!--
28-
Check out this interactive demo to see how search results are more relevant when you test Elastic's Learned Sparse Encoder model against Elastic's textual BM25 algorithm.
29-
30-
In addition, Elastic also supports dense vectors to implement similarity search on unstructured data beyond text, such as videos, images, and audio.
31-
-->
32-
33-
The advantage of [AI-powered search](/solutions/search/ai-search/ai-search.md) is that these technologies enable you to use intuitive language in your search queries.
34-
For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document.
17+
If you want to check out all the use cases and implementation paths, go to [](/solutions/search/ai-search/ai-search.md).
3518

3619
## Prerequisites
3720

38-
To try out semantic search, [create an {{es-serverless}} project](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-get-started-create-project) that is optimized for vectors.
21+
To try out semantic search, log into an [{{es-serverless}} project](/solutions/search/serverless-elasticsearch-get-started.md) that is optimized for vectors.
22+
If you want to add sample data, you must have a `developer` or `admin` [predefined role](/deploy-manage/users-roles/cloud-organization/user-roles.md#general-assign-user-roles-table) or an equivalent custom role.
3923

4024
<!--
41-
TBD: It seems like semantic search fields exist in all, so what is the value of this option?
42-
TBD: Can all roles perform these steps?
25+
TBD: It seems like semantic search fields exist in all, so what is the value of this "optimized for vectors" option?
4326
-->
4427

4528
## Add data
4629

4730
% TBD: What type of data is ideal for semantic search?
4831

49-
There are some simple data sets that you can use for learning purposes.
50-
For example, if you follow the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow), you can choose the semantic search option.
32+
There are some small data sets available for learning purposes when you select the semantic search workflow in the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow).
5133
Follow the instructions to install an {{es}} client and copy the code examples.
5234
Alternatively, try out the API requests in the [Console](/explore-analyze/query-filter/tools/console.md):
5335

5436
:::::{stepper}
5537

5638
::::{step} Define a semantic text field
5739

58-
The following example creates a mapping for a single [semantic_text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field:
40+
You can implement semantic search with varying levels of complexity and customization.
41+
To get started, the recommended method is to use [semantic_text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) fields.
42+
43+
The following example creates a mapping for a single field:
5944

6045
```console
6146
PUT /semantic-index/_mapping
@@ -72,13 +57,7 @@ PUT /semantic-index/_mapping
7257

7358
::::{step} Add documents
7459

75-
You can use the Elasticsearch bulk API to ingest an array of documents.
76-
The initial bulk ingestion request could take longer than the default request timeout.
77-
If the following request times out, allow time for the machine learning model loading to complete (typically 1-5 minutes) then retry it:
78-
79-
<!--
80-
TBD: Describe where to look for the downloaded model in Trained Models?
81-
-->
60+
You can use the Elasticsearch bulk API to ingest an array of documents:
8261

8362
```console
8463
POST /_bulk?pretty
@@ -89,45 +68,88 @@ POST /_bulk?pretty
8968
{ "index": { "_index": "semantic-index" } }
9069
{"text":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."}
9170
```
92-
::::
93-
:::::
9471

95-
What just happened? The content was transformed into a sparse vector inside the `text` field.
96-
This transformation involves two main steps.
72+
The bulk ingestion request might take longer than the default request timeout.
73+
If it times out, allow time for the machine learning model to finish loading (typically 1-5 minutes), then retry the request.
74+
75+
<!--
76+
TBD: Describe where to look for the downloaded model in Trained Models?
77+
-->
78+
79+
What just happened? The content was transformed into a sparse vector, which involves two main steps.
9780
First, the content is divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched. Next, each chunk of text is transformed into a sparse vector representation using text expansion techniques.
9881
By default, `semantic_text` fields leverage ELSER to convert the text into a format that captures the semantic meaning.
9982

83+
% TBD: Confirm "Elser model" vs ".elser-2-elasticsearch" terminology.
84+
10085
![Semantic search chunking](/solutions/images/animated-gif-semantic-search-chunking.gif)
10186

87+
::::
88+
::::{step} Explore the data
89+
To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or by using the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
90+
91+
In **Discover**, you can click the expand icon ![double arrow icon to open a flyout with the document details](/explore-analyze/images/kibana-expand-icon-2.png "") to show details about any documents in the table.
92+
93+
:::{image} /solutions/images/serverless-discover-semantic.png
94+
:screenshot:
95+
:alt: Discover table view with document expanded
96+
:::
97+
98+
For more tips, check out [](/explore-analyze/discover/discover-get-started.md).
99+
::::
100+
:::::
102101
<!--
103-
TBD: Confirm "Elser model" vs ".elser-2-elasticsearch, a preconfigured endpoint for the elasticsearch service".
104-
TBD: Show how this data looks in Discover, do you see the text or just the vectors?
105-
TBD: Include the Elastic Open Web Crawler variation too?
102+
TBD: When you view these documents in Discover they're shown as having "text" field type instead of "semantic_text" is this right?
103+
TBD: Should we call out that the KQL filters in Discover don't seem to work against semantic_text fields yet?
106104
-->
107105

108-
## Test a semantic search query
106+
## Test semantic search queries
107+
108+
Elasticsearch provides a variety of query languages for interacting with your data.
109+
For an overview of their features and use cases, check out [](/explore-analyze/query-filter/languages.md).
110+
111+
You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`. Refer to [](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) for the complete list.
112+
113+
Let's try out two types of queries in two different languages.
114+
115+
:::::{stepper}
116+
117+
::::{step} Run a semantic query in Query DSL
118+
119+
Open the **{{index-manage-app}}** from the navigation menu or return to the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow) to find code examples for searching the sample data.
120+
121+
:::{image} /solutions/images/serverless-index-management-semantic.png
122+
:screenshot:
123+
:alt: Index management semantic search workflow
124+
:::
109125

110126
Try running some queries to check the accuracy and relevance of the search results.
111-
For example, use some keywords that don't exist in the documents:
127+
For example, click **Run in Console** and use some search terms that you did not see when you explored the documents:
112128

113129
```console
114-
GET semantic-index/_search
130+
POST /semantic-index/_search
115131
{
116-
"query": {
117-
"semantic": {
118-
"field": "text",
119-
"query": "best parks for rappelling"
132+
"retriever": {
133+
"standard": {
134+
"query": {
135+
"semantic": {
136+
"field": "text",
137+
"query": "best park for rappelling"
138+
}
139+
}
120140
}
121141
}
122142
}
123143
```
124144

145+
This is a [semantic](/reference/query-languages/query-dsl/query-dsl-semantic-query.md) query that is expressed in [Query Domain Specific Language](/explore-analyze/query-filter/languages/querydsl.md) (DSL), which is the primary query language for {{es}}.
146+
125147
The query is translated automatically into a vector representation and runs against the contents of the semantic text field.
126-
The search results are sorted by relevance score, which measures how well each document matches the query.
148+
The search results are sorted by a relevance score, which measures how well each document matches the query.
127149

128150
```json
129151
{
130-
"took": 249,
152+
"took": 22,
131153
"timed_out": false,
132154
"_shards": {
133155
"total": 3,
@@ -137,33 +159,65 @@ The search results are sorted by relevance score, which measures how well each d
137159
},
138160
"hits": {
139161
"total": {
140-
"value": 6,
162+
"value": 3,
141163
"relation": "eq"
142164
},
143-
"max_score": 12.118624,
165+
"max_score": 11.389743,
144166
"hits": [
145167
{
146-
"_index": "search-0lxc",
147-
"_id": "0lGtpJcB7hfWuB0FGC06",
148-
"_score": 12.118624,
168+
"_index": "semantic-index",
169+
"_id": "Pp0MtJcBZjjo1YKoXkWH",
170+
"_score": 11.389743,
149171
"_source": {
150-
"text": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."
151-
}
152-
},
153-
...
172+
"text": "Rocky Mountain National Park ...
154173
```
155174

175+
In this example, the document related to Rocky Mountain National Park has the highest score.
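
Because `match` is one of the query types that `semantic_text` fields support, the same search can likely be written as a `match` query. The following is a minimal sketch that reuses the `semantic-index` index and `text` field from the earlier steps. It assumes the documents were ingested as shown and has not been verified against this sample data:

```console
POST /semantic-index/_search
{
  "retriever": {
    "standard": {
      "query": {
        "match": {
          "text": "best park for rappelling"
        }
      }
    }
  }
}
```

If the field is mapped as `semantic_text`, the `match` query should run semantically rather than as a keyword match, so the ranking is expected to resemble the `semantic` query results.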
176+
::::
177+
::::{step} Run a match query in ES|QL
178+
179+
Another way to try out semantic search is by using the [match](/query-languages/esql/functions-operators/search-functions.md#esql-match) search function in the [Elasticsearch Query Language](/explore-analyze/query-filter/languages/esql.md) (ES|QL).
180+
181+
Go to **Discover** and select **Try ES|QL** from the application menu bar.
182+
183+
:::{image} /solutions/images/serverless-discover-esql.png
184+
:screenshot:
185+
:alt: Run an ES|QL semantic query in Discover
186+
:::
187+
188+
Copy the following query:
189+
190+
```esql
191+
FROM semantic-index METADATA _score <1>
192+
| WHERE text: "what's the biggest park?" <2>
193+
| KEEP text, _score <3>
194+
| SORT _score DESC <4>
195+
| LIMIT 1000 <5>
196+
```
197+
198+
1. The FROM source command returns a table of data. Each row in the table represents a document. The `METADATA` clause provides access to the query relevance score, which is a [metadata field](/reference/query-languages/esql/esql-metadata-fields.md).
199+
2. The `:` operator is a simplified syntax for the `MATCH` search function. It performs a semantic query on the specified field.
200+
3. The KEEP processing command affects the columns and their order in the results table.
201+
4. The results are sorted in descending order based on the `_score`.
202+
5. The maximum number of rows to return.
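
As a sketch of an alternative to Discover, a similar query can also be sent from the Console through the ES|QL query API. This version spells out the `MATCH` function instead of the `:` operator. The `format=txt` parameter and the smaller `LIMIT` are illustrative choices, not part of the original example:

```console
POST /_query?format=txt
{
  "query": """
    FROM semantic-index METADATA _score
    | WHERE MATCH(text, "what's the biggest park?")
    | KEEP text, _score
    | SORT _score DESC
    | LIMIT 10
  """
}
```

The triple-quoted string is a Console convenience for multi-line values. Other clients need the query as a regular JSON string.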
203+
204+
In this example, the first row in the table is the document that had the highest relevance score for the query.
205+
206+
To learn more, check out [](/explore-analyze/discover/try-esql.md) and [](/solutions/search/esql-for-search.md).
207+
::::
208+
:::::
156209
<!--
157210
TBD: Provide more information about how to interpret and filter the search results.
211+
TBD: Include the Elastic Open Web Crawler variation too or point to it in another guide?
158212
-->
159213

160214
## Next steps
161215

162216
Thanks for taking the time to try out semantic search in {{es-serverless}}.
163-
For another semantic search example, check out [](/solutions/search/semantic-search/semantic-search-semantic-text.md).
217+
For a deeper dive, check out [](/solutions/search/semantic-search.md).
164218

165219
If you want to extend this example, try an index with more fields.
166220
For example, if you have both a `text` field and a `semantic_text` field, you can combine the strengths of traditional keyword search and advanced semantic search.
167221
A [hybrid search](/solutions/search/hybrid-semantic-text.md) provides comprehensive search capabilities to find relevant information based on both the raw text and its underlying meaning.
168222

169-
To learn about more options, such as vector and keyword search, check out [](/solutions/search/search-approaches.md).
223+
To learn about more options, such as vector and keyword search, go to [](/solutions/search/search-approaches.md).
