Commit 17e41b6

Add gif and stepper
1 parent 0f12059 commit 17e41b6

2 files changed: +40 -17 lines changed


solutions/search/serverless-elasticsearch-get-started-semantic.md

Lines changed: 40 additions & 17 deletions
@@ -20,7 +20,7 @@ As you get started on vector search, keep in mind there are two forms of vector
2020
TBD: Which type is implemented when you use semantic_text field?
2121
-->
2222

23-
Elastic also offers an out-of-the-box Learned Sparse Encoder model for semantic search.
23+
Elastic also offers an out-of-the-box Learned Sparse Encoder model ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser)) for semantic search.
2424
This model performs well on a variety of data sets, such as financial data, weather records, and question-answer pairs.
2525
The model is built to provide great relevance across domains, without the need for additional fine tuning.
2626

@@ -33,21 +33,32 @@ In addition, Elastic also supports dense vectors to implement similarity search
3333
The advantage of [AI-powered search](/solutions/search/ai-search/ai-search.md) is that these technologies enable you to use intuitive language in your search queries.
3434
For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document.
3535

36-
To try it out, [create an {{es-serverless}} project](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-get-started-create-project) that is optimized for vectors.
36+
## Prerequisites
3737

38-
% TBD: It seems like semantic search fields exist in all, so what is the value of this option?
38+
To try out semantic search, [create an {{es-serverless}} project](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-get-started-create-project) that is optimized for vectors.
39+
40+
<!--
41+
TBD: It seems like semantic search fields exist in all, so what is the value of this option?
42+
TBD: Can all roles perform these steps?
43+
-->
3944

4045
## Add data
4146

4247
% TBD: What type of data is ideal for semantic search?
4348

4449
There are some simple data sets that you can use for learning purposes.
4550
For example, if you follow the [guided index flow](/solutions/search/serverless-elasticsearch-get-started.md#elasticsearch-follow-guided-index-flow), you can choose the semantic search option.
46-
Follow the instructions to install an {{es}} client and define field mappings.
51+
Follow the instructions to install an {{es}} client and copy the code examples.
4752
Alternatively, try out the API requests in the [Console](/explore-analyze/query-filter/tools/console.md):
4853

54+
:::::{stepper}
55+
56+
::::{step} Define a semantic text field
57+
58+
The following example creates a mapping for a single [semantic_text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field:
59+
4960
```console
50-
PUT /my-index/_mapping
61+
PUT /semantic-index/_mapping
5162
{
5263
"properties": {
5364
"text": {
@@ -57,10 +68,11 @@ PUT /my-index/_mapping
5768
}
5869
```
5970

60-
By default, the [semantic_text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field type provides vector search capabilities using the ELSER model.
61-
% TBD: Confirm "Elser model" vs ".elser-2-elasticsearch, a preconfigured endpoint for the elasticsearch service".
71+
::::
72+
73+
::::{step} Add documents
6274

63-
Next, use the Elasticsearch bulk API to ingest an array of documents into the index.
75+
You can use the Elasticsearch bulk API to ingest an array of documents.
6476
The initial bulk ingestion request could take longer than the default request timeout.
6577
If the following request times out, allow time for the machine learning model to finish loading (typically 1-5 minutes), then retry it:
6678

@@ -70,25 +82,36 @@ TBD: Describe where to look for the downloaded model in Trained Models?
7082

7183
```console
7284
POST /_bulk?pretty
73-
{ "index": { "_index": "my-index" } }
85+
{ "index": { "_index": "semantic-index" } }
7486
{"text":"Yellowstone National Park is one of the largest national parks in the United States. It ranges from Wyoming to Montana and Idaho, and contains an area of 2,219,791 acres across three different states. It's most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site."}
75-
{ "index": { "_index": "my-index" } }
87+
{ "index": { "_index": "semantic-index" } }
7688
{"text":"Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous cliff is El Capitan, a 3,000-foot monolith along its tallest face."}
77-
{ "index": { "_index": "my-index" } }
89+
{ "index": { "_index": "semantic-index" } }
7890
{"text":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."}
7991
```
92+
::::
93+
:::::
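
If you send the bulk request from a script instead of the Console, the body format and the retry advice above could be sketched as follows. This is a minimal illustration, not part of the commit: `build_bulk_body`, `send_with_retry`, and the `send` callable are hypothetical names, and `send` stands in for whatever HTTP client you use to POST the body to `/_bulk`. The retry count and wait time are assumptions based on the "1-5 minutes" guidance.

```python
import json
import time


def build_bulk_body(index_name, texts):
    """Return an NDJSON bulk body: one action line plus one document line per text."""
    lines = []
    for text in texts:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps({"text": text}))
    # The bulk API expects newline-delimited JSON that ends with a newline.
    return "\n".join(lines) + "\n"


def send_with_retry(send, body, attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call send(body); if it raises TimeoutError, pause and try again.

    `send` is a hypothetical callable supplied by the caller. The final
    TimeoutError is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send(body)
        except TimeoutError:
            if attempt == attempts:
                raise
            # Give the machine learning model time to finish loading.
            sleep(wait_seconds)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting, and keeping the body builder separate from the transport means the same NDJSON can be pasted into the Console.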
8094

8195
What just happened? The content was transformed into a sparse vector inside the `text` field.
8296
This transformation involves two main steps.
8397
First, the content is divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched. Next, each chunk of text is transformed into a sparse vector representation using text expansion techniques.
84-
This step leverages ELSER (Elastic Search Engine for Relevance) to convert the text into a format that captures the semantic meaning, enabling more accurate and relevant search results.
98+
By default, `semantic_text` fields leverage ELSER to convert the text into a format that captures the semantic meaning.
99+
100+
![Semantic search chunking](/solutions/images/animated-gif-semantic-search-chunking.gif)
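
To make the chunking step more concrete, here is a rough sketch of splitting text into overlapping word-based chunks. It only illustrates the general idea: the `chunk_words` function, the word-based strategy, and the `max_words` and `overlap` values are assumptions chosen for this example, and they do not describe how `semantic_text` chunks content internally.

```python
def chunk_words(text, max_words=250, overlap=50):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current chunk reaches the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

Each chunk would then be converted to its own sparse vector, so a long document can match a query on any one of its passages.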
101+
102+
<!--
103+
TBD: Confirm "Elser model" vs ".elser-2-elasticsearch, a preconfigured endpoint for the elasticsearch service".
104+
TBD: Show how this data looks in Discover, do you see the text or just the vectors?
105+
TBD: Include the Elastic Open Web Crawler variation too?
106+
-->
85107

86108
## Test a semantic search query
87109

88-
Now try a semantic query:
110+
Try running some queries to check the accuracy and relevance of the search results.
111+
For example, use some keywords that don't exist in the documents:
89112

90113
```console
91-
GET my-index/_search
114+
GET semantic-index/_search
92115
{
93116
"query": {
94117
"semantic": {
@@ -101,7 +124,6 @@ GET my-index/_search
101124

102125
The query is translated automatically into a vector representation and runs against the contents of the semantic text field.
103126
The search results are sorted by relevance score, which measures how well each document matches the query.
104-
For example:
105127

106128
```json
107129
{
@@ -131,7 +153,9 @@ For example:
131153
...
132154
```
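
When you process a response like this in a script, a small helper can pull out the scores and apply a threshold. The sketch below is an assumption-laden example rather than a recommended approach: `top_hits` is a hypothetical name, it relies only on the standard `hits.hits[]._score` layout of a search response, and it assumes `_source.text` holds the original string. The sample scores and text values are made up for illustration.

```python
def top_hits(response, min_score=0.0):
    """Return (score, text) pairs from a search response, highest score first."""
    hits = response.get("hits", {}).get("hits", [])
    pairs = [
        (hit["_score"], hit.get("_source", {}).get("text"))
        for hit in hits
        if hit.get("_score") is not None and hit["_score"] >= min_score
    ]
    # Sort on the score only, so missing text values never get compared.
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# Made-up response fragment, for illustration only.
sample = {
    "hits": {
        "hits": [
            {"_score": 4.2, "_source": {"text": "first document"}},
            {"_score": 9.1, "_source": {"text": "second document"}},
        ]
    }
}
best = top_hits(sample, min_score=5.0)
```

A fixed `min_score` is only a starting point, because relevance scores are not normalized across queries.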
133155

134-
% TBD: Provide more information about how to interpret and filter the search results.
156+
<!--
157+
TBD: Provide more information about how to interpret and filter the search results.
158+
-->
135159

136160
## Next steps
137161

@@ -143,4 +167,3 @@ For example, if you have both a `text` field and a `semantic_text` field, you ca
143167
A [hybrid search](/solutions/search/hybrid-semantic-text.md) provides comprehensive search capabilities to find relevant information based on both the raw text and its underlying meaning.
144168

145169
To learn about more options, such as vector and keyword search, check out [](/solutions/search/search-approaches.md).
146-
