
Commit 9467226

committed
Remove query DSL example
1 parent b9cb754 commit 9467226


1 file changed

+34
-113
lines changed

solutions/search/get-started/semantic-search.md

Lines changed: 34 additions & 113 deletions
@@ -13,14 +13,12 @@ _Semantic search_ is a type of AI-powered search that enables you to use natural
1313
It returns results that match the meaning of a query, as opposed to literal keyword matches.
1414
For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document.
1515

16-
Semantic search uses {{es}} vector database and vector search technology.
17-
Each _vector_ (or _vector embedding_) is an array of numbers that each represent a different characteristic of the text, such as sentiment, context, and syntactics.
18-
These numeric representations make comparison with other vectors very efficient.
16+
Semantic search uses {{es}} [vector database](https://www.elastic.co/what-is/vector-database) and [vector search](https://www.elastic.co/what-is/vector-search) technology.
17+
Each _vector_ (or _vector embedding_) is an array of numbers that represent different characteristics of the text, such as sentiment, context, and syntax.
18+
These numeric representations make vector comparisons very efficient.
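
To make the idea of comparing vectors concrete, here is a minimal Python sketch. It assumes cosine similarity as the comparison measure, which is one common choice and not necessarily what {{es}} uses for a given field. The vectors are made-up four-dimensional examples, not real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors for illustration; real embeddings have many more dimensions.
query = [0.9, 0.1, 0.0, 0.3]
doc_close = [0.8, 0.2, 0.1, 0.4]   # points in nearly the same direction as the query
doc_far = [0.0, 0.9, 0.8, 0.0]     # points in a very different direction

# The closer document gets the higher similarity score.
print(cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far))
```

Each comparison is a handful of multiplications and additions, which is why numeric representations are cheap to compare at scale.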
1919

20-
In this guide, you'll learn how to perform semantic search on a small set of sample data.
21-
You'll create vectors and store them in {{es}}.
22-
Then you'll run a query, which will be transformed into vectors and compared to the stored data.
23-
By playing with a simple use case, you'll take the first steps toward understanding whether this type of search is relevant to your own data.
20+
In this quickstart guide, you'll create vectors for a small set of sample data, store them in {{es}}, then run a semantic query.
21+
By playing with a simple use case, you'll take the first steps toward understanding whether semantic search is applicable to your own data.
2422

2523
## Prerequisites
2624

@@ -36,12 +34,12 @@ TBD: What is the impact of this "optimized for vectors" option?
3634
## Create a vector database
3735

3836
When you create vectors (or _vectorize_ your data), you convert complex and nuanced documents into multidimensional numerical representations.
39-
You can choose from many different vector embedding models. Some are extremely hardware efficient and can be run with less computational power. Others have a greater understanding of the context and can answer questions and lead a threaded conversation.
40-
These examples use the default Learned Sparse Encoder ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) model, which provides great relevance across domains without the need for additional fine tuning.
37+
You can choose from many different vector embedding models. Some are extremely hardware-efficient and can be run with less computational power. Others have a greater understanding of the context, can answer questions, and can lead a threaded conversation.
38+
The examples in this guide use the default Learned Sparse Encoder ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) model, which provides great relevance across domains without the need for additional fine-tuning.
4139

42-
The way that you store and index vectors has a significant impact on the performance and accuracy of search results.
40+
The way that you store vectors has a significant impact on the performance and accuracy of search results.
4341
They must be stored in specialized data structures designed to ensure efficient similarity search and speedy vector distance calculations.
44-
These examples store the vectors in `semantic_text` fields, which provide sensible defaults and automation.
42+
This guide uses the [semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md), which provides sensible defaults and automation.
4543

4644
Try vectorizing a small set of documents.
4745
You can follow the guided index workflow:
@@ -70,7 +68,6 @@ PUT /semantic-index/_mapping
7068
When you use `semantic_text` fields, the type of vector is determined by the vector embedding model.
7169
In this case, the default ELSER model will be used to create sparse vectors.
7270

73-
For more details about `semantic_text` fields, refer to [](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md).
7471
For a deeper dive, check out [Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector](https://www.elastic.co/search-labs/blog/mapping-embeddings-to-elasticsearch-field-types).
7572
::::
7673

@@ -88,22 +85,24 @@ POST /_bulk?pretty
8885
{"content":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."}
8986
```
9087

91-
The bulk ingestion request might take longer than the default request timeout.
92-
If it times out, wait for the machine learning model loading to complete (typically 1-5 minutes) then retry it.
88+
The bulk ingestion might take longer than the default request timeout.
89+
If it times out, wait for the ELSER model to load (typically 1-5 minutes), then retry the request.
9390
::::
9491
:::::
9592

96-
What just happened? The content was transformed into a sparse vector, which involves two main steps.
97-
First, the content is divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched. Then each chunk of text is transformed into a sparse vector representation using text expansion techniques.
93+
What just happened? The content was transformed into sparse vectors, which involves two main steps.
94+
First, the content was divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched.
95+
Then each chunk of text was transformed into a sparse vector representation using text expansion techniques.
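
The two steps described above (chunking, then text expansion into a sparse vector) can be sketched in Python. This is a toy illustration under stated assumptions, not how `semantic_text` or ELSER work internally: the word-count chunking rule, the `TOY_EXPANSIONS` table, and all weights are hypothetical.

```python
def chunk_words(text, max_words):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

# Hypothetical expansion table: each known word maps to weighted related terms.
# A real model learns these associations; these values are invented.
TOY_EXPANSIONS = {
    "hiking": {"hiking": 1.0, "trail": 0.6, "outdoor": 0.4},
    "wildlife": {"wildlife": 1.0, "animal": 0.7},
}

def to_sparse_vector(chunk):
    """Build a term -> weight mapping, keeping the largest weight per term."""
    vector = {}
    for word in chunk.lower().split():
        word = word.strip(".,")  # drop trailing punctuation before the lookup
        for term, weight in TOY_EXPANSIONS.get(word, {}).items():
            vector[term] = max(vector.get(term, 0.0), weight)
    return vector

text = "The park is a popular destination for hiking, camping, and wildlife viewing."
chunks = chunk_words(text, max_words=6)
vectors = [to_sparse_vector(chunk) for chunk in chunks]

for chunk, vector in zip(chunks, vectors):
    print(repr(chunk), "->", vector)
```

The vector is called sparse because only the few terms with a non-zero weight are stored. The expanded terms, such as `trail`, are what allow a query to match text that never uses that exact word.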
9896

9997
![Semantic search chunking](https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt9bbe5e260012b15d/67ffffc8165067d96124b586/animated-gif-semantic-search-chunking.gif)
10098

99+
Now that a few vectors are stored in {{es}}, you can run semantic searches.
101100

102101
## Explore the data
103102

104103
To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or by using the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
105104

106-
In **Discover**, you can click the expand icon ![double arrow icon to open a flyout with the document details](/explore-analyze/images/kibana-expand-icon-2.png "") to show details about any documents in the table.
105+
In **Discover**, you can click the expand icon ![double arrow icon to open a flyout with the document details](/explore-analyze/images/kibana-expand-icon-2.png "") to show details about documents in the table.
107106

108107
:::{image} /solutions/images/serverless-discover-semantic.png
109108
:screenshot:
@@ -112,101 +111,19 @@ In **Discover**, you can click the expand icon ![double arrow icon to open a fly
112111

113112
For more tips, check out [](/explore-analyze/discover/discover-get-started.md).
114113

115-
<!--
116-
TBD: When you view these documents in Discover they're shown as having "text" field type instead of "semantic_text" is this right?
117-
-->
118-
119114
## Test semantic search
120115

121-
<!--
122-
TO-DO: Talk about the pipeline where vectors are required for both the data and search query
123-
% encodes details of searchable information into vectors and then compares vectors to determine which are most similar.
124-
When you run a query, the search engine transforms the query into embeddings, which are numerical representations of data and related contexts. They are stored in vectors. The kNN algorithm, or k-nearest neighbor algorithm, then matches vectors of existing documents (a semantic search concerns text) to the query vectors. The semantic search then generates results and ranks them based on conceptual relevance.
125-
-->
126-
127116
{{es}} provides a variety of query languages for interacting with your data.
128117
For an overview of their features and use cases, check out [](/explore-analyze/query-filter/languages.md).
129118

130-
You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`. Refer to [Semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) for the complete list.
131-
132-
Let's try out two types of queries in two different languages.
133-
134-
:::::{stepper}
135-
136-
::::{step} Run a semantic query with Query DSL
137-
138-
Open the **{{index-manage-app}}** page from the navigation menu or return to the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow) to find code examples for searching the sample data.
139-
140-
:::{image} /solutions/images/serverless-index-management-semantic.png
141-
:screenshot:
142-
:alt: Index management semantic search workflow
143-
:::
144-
145-
Try running some queries to check the accuracy and relevance of the search results.
146-
For example, click **Run in Console** and use some search terms that you did not see when you explored the documents:
147-
148-
```console
149-
POST /semantic-index/_search
150-
{
151-
"retriever": {
152-
"standard": {
153-
"query": {
154-
"semantic": {
155-
"field": "content",
156-
"query": "best park for rappelling"
157-
}
158-
}
159-
}
160-
}
161-
}
162-
```
163-
164-
This is a [semantic](elasticsearch://reference/query-languages/query-dsl/query-dsl-semantic-query.md) query that is expressed in [Query Domain Specific Language](/explore-analyze/query-filter/languages/querydsl.md) (DSL), which is the primary query language for {{es}}.
165-
166-
The query is translated automatically into a vector representation and runs against the contents of the semantic text field.
167-
The search results are sorted by a relevance score, which measures how well each document matches the query.
168-
169-
```json
170-
{
171-
"took": 22,
172-
"timed_out": false,
173-
"_shards": {
174-
"total": 3,
175-
"successful": 3,
176-
"skipped": 0,
177-
"failed": 0
178-
},
179-
"hits": {
180-
"total": {
181-
"value": 3,
182-
"relation": "eq"
183-
},
184-
"max_score": 11.389743,
185-
"hits": [
186-
{
187-
"_index": "semantic-index",
188-
"_id": "Pp0MtJcBZjjo1YKoXkWH",
189-
"_score": 11.389743,
190-
"_source": {
191-
"content": "Rocky Mountain National Park ..."
192-
...
193-
}
194-
```
195-
196-
In this example, the document related to Rocky Mountain National Park has the highest score.
197-
::::
198-
::::{step} Run a match query in ES|QL
199-
200-
Another way to try out semantic search is by using the [match](elasticsearch://reference/query-languages/esql/functions-operators/search-functions.md#esql-match) search function in the [Elasticsearch Query Language](/explore-analyze/query-filter/languages/esql.md) (ES|QL).
119+
You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`.
120+
The query is translated automatically into the appropriate vector representation to run against the contents of the semantic text field.
121+
The search results include a relevance score, which measures how well each document matches the query.
201122

123+
Let's test a semantic search query in [Elasticsearch Query Language](/explore-analyze/query-filter/languages/esql.md) (ES|QL).
202124
Go to **Discover** and select **Try ES|QL** from the application menu bar.
203-
204-
:::{image} /solutions/images/serverless-discover-esql.png
205-
:screenshot:
206-
:alt: Run an ES|QL semantic query in Discover
207-
:::
208-
209-
Copy the following query:
125+
Think of some queries that are relevant to the documents you explored, such as finding the biggest park or the best park for rappelling.
126+
For example, copy the following query:
210127

211128
```esql
212129
FROM semantic-index METADATA _score <1>
@@ -217,21 +134,25 @@ FROM semantic-index METADATA _score <1>
217134
```
218135

219136
1. The FROM source command returns a table of data. Each row in the table represents a document. The `METADATA` clause provides access to the query relevance score, which is a [metadata field](elasticsearch://reference/query-languages/esql/esql-metadata-fields.md).
220-
2. A simplified syntax for the `MATCH` search function, this command performs a semantic query on the specified field.
137+
2. This command uses a simplified syntax for the [match](elasticsearch://reference/query-languages/esql/functions-operators/search-functions.md#esql-match) search function to perform a semantic query on the specified field.
221138
3. The KEEP processing command affects the columns and their order in the results table.
222139
4. The results are sorted in descending order based on the `_score`.
223-
5. The maximum number of rows to return.
140+
5. This optional command defines the maximum number of rows to return.
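
To see how the last three stages in the list shape the results table, here is a small Python sketch that mimics them on in-memory rows. It is an analogy only and does not reproduce the ES|QL query itself: the rows, their scores, and the helper names `keep`, `sort_descending`, and `limit` are hypothetical, and the semantic matching step is assumed to have already produced a `_score` for each row.

```python
# Hypothetical rows: each one stands in for a document that the search step
# has already scored. The contents and scores are invented for illustration.
rows = [
    {"_id": "1", "content": "Document A ...", "_score": 2.5},
    {"_id": "2", "content": "Document B ...", "_score": 7.1},
    {"_id": "3", "content": "Document C ...", "_score": 4.8},
]

def keep(rows, columns):
    """Keep only the named columns, in the given order (like KEEP)."""
    return [{column: row[column] for column in columns} for row in rows]

def sort_descending(rows, column):
    """Order rows from the highest to the lowest value (like SORT ... DESC)."""
    return sorted(rows, key=lambda row: row[column], reverse=True)

def limit(rows, count):
    """Return at most `count` rows (like LIMIT)."""
    return rows[:count]

table = limit(sort_descending(keep(rows, ["content", "_score"]), "_score"), 2)
for row in table:
    print(row)
```

Because the rows are sorted before the limit is applied, the table always contains the highest-scoring documents rather than an arbitrary subset.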
224141

225-
In this example, the first row in the table is the document that had the highest relevance score for the query.
142+
After you click **▶Run**, the results appear in a table.
143+
In this example, the first row in the table is the document related to Yellowstone National Park, which had the highest relevance score for the query.
144+
145+
:::{image} /solutions/images/serverless-discover-esql.png
146+
:screenshot:
147+
:alt: Run an ES|QL semantic query in Discover
148+
:::
226149

227-
To learn more, check out [](/explore-analyze/discover/try-esql.md) and [](/solutions/search/esql-for-search.md).
228-
::::
229-
:::::
230150
<!--
231-
TBD: Provide more information about how to interpret and filter the search results.
232-
TBD: Include the Elastic Open Web Crawler variation too or point to it in another guide?
151+
TBD: Run the same query in Console
233152
-->
234153

154+
To learn more, check out [](/explore-analyze/discover/try-esql.md) and [](/solutions/search/esql-for-search.md).
155+
235156
## Next steps
236157

237158
Thanks for taking the time to try out semantic search in {{es-serverless}}.
