Commit 5b2d861

[DOCS] Rework semantic search main page (#112452) (#112808)

Authored by szabosteve, leemthompo, and Mikep86
Co-authored-by: Liam Thompson <[email protected]>
Co-authored-by: Mike Pellegrini <[email protected]>

1 parent 3a36785 commit 5b2d861

2 files changed (+151, -70 lines)

docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+[[semantic-search-deployed-nlp-model]]
+=== Tutorial: semantic search with a deployed model
+
+++++
+<titleabbrev>Semantic search with deployed model</titleabbrev>
+++++
+
+[IMPORTANT]
+====
+* For the easiest way to perform semantic search in the {stack}, refer to the <<semantic-search-semantic-text, `semantic_text`>> end-to-end tutorial.
+* This tutorial was written before the <<inference-apis,{infer} endpoint>> and <<semantic-text,`semantic_text` field type>> were introduced.
+Today we have simpler options for performing semantic search.
+====
+
+This guide shows you how to implement semantic search with models deployed in {es}: from selecting an NLP model to writing queries.
+
+
+[discrete]
+[[deployed-select-nlp-model]]
+==== Select an NLP model
+
+{es} enables you to use a {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[wide range of NLP models], including both dense and sparse vector models.
+Your choice of language model is critical for implementing semantic search successfully.
+
+While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging.
+Selecting an appropriate model from our third-party model list is the first step.
+Training the model on your own data is essential to ensure better search results than using only BM25.
+However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming.
+
+To address this issue, Elastic provides a pre-trained representational model called {ml-docs}/ml-nlp-elser.html[Elastic Learned Sparse EncodeR (ELSER)].
+ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning.
+This adaptability makes it suitable for various NLP use cases out of the box.
+Unless you have a team of ML specialists, it is highly recommended to use the ELSER model.
+
+In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values.
+This representation is commonly used for textual data.
+In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors.
+Each non-zero element of the vector corresponds to a term in the model vocabulary.
+The ELSER vocabulary contains around 30,000 terms, so the sparse vectors created by ELSER contain about 30,000 values, the majority of which are zero.
+Effectively, the ELSER model replaces the terms in the original query with other terms that have been learned to exist in the documents that best match the original search terms in a training dataset, along with weights that control how important each one is.
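
To make the sparse representation concrete, a document embedded by ELSER might end up storing a handful of weighted terms alongside the original text, along these lines (the field name and weights below are purely illustrative, not real model output):

[source,js]
----
{
  "content": "How to avoid muscle soreness after running?",
  "content_embedding": {
    "muscle": 1.82,
    "soreness": 1.54,
    "ache": 0.97,
    "running": 1.41,
    "marathon": 0.38
  }
}
----
// NOTCONSOLE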
+
+
+[discrete]
+[[deployed-deploy-nlp-model]]
+==== Deploy the model
+
+After you decide which model you want to use for implementing semantic search, you need to deploy the model in {es}.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/deploy-nlp-model-widget.asciidoc[]
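
If you choose ELSER, for example, deploying it comes down to two calls: one to create the model reference (which triggers the download) and one to start a deployment. A minimal sketch, assuming the ELSER v2 model ID from the {ml} docs:

[source,console]
----
PUT _ml/trained_models/.elser_model_2
{
  "input": {
    "field_names": ["text_field"]
  }
}

POST _ml/trained_models/.elser_model_2/deployment/_start?wait_for=started
----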
+
+
+[discrete]
+[[deployed-field-mappings]]
+==== Map a field for the text embeddings
+
+Before you start using the deployed model to generate embeddings from your input text, you need to prepare your index mapping.
+The mapping of the index depends on the type of model.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/field-mappings-widget.asciidoc[]
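
For a sparse model such as ELSER, the target field is mapped as `sparse_vector`; for a dense model you would map a `dense_vector` field with the model's dimension count instead. A minimal sketch with illustrative index and field names:

[source,console]
----
PUT my-index
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "sparse_vector"
      },
      "content": {
        "type": "text"
      }
    }
  }
}
----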
+
+
+[discrete]
+[[deployed-generate-embeddings]]
+==== Generate text embeddings
+
+Once you have created the mappings for the index, you can generate text embeddings from your input text.
+This can be done by using an <<ingest,ingest pipeline>> with an <<inference-processor,inference processor>>.
+The ingest pipeline processes the input data and indexes it into the destination index.
+At index time, the inference ingest processor uses the trained model to infer against the data ingested through the pipeline.
+After you create the ingest pipeline with the inference processor, you can ingest your data through it to generate the model output.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/generate-embeddings-widget.asciidoc[]
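
As an illustration, a pipeline for the ELSER example above pairs an `inference` processor with the deployed model, and data can then be reindexed through it; the pipeline, index, and field names are placeholders:

[source,console]
----
PUT _ingest/pipeline/elser-embeddings
{
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_2",
        "input_output": [
          {
            "input_field": "content",
            "output_field": "content_embedding"
          }
        ]
      }
    }
  ]
}

POST _reindex
{
  "source": { "index": "raw-docs" },
  "dest": { "index": "my-index", "pipeline": "elser-embeddings" }
}
----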
+
+Now it is time to perform semantic search!
+
+
+[discrete]
+[[deployed-search]]
+==== Search the data
+
+Depending on the type of model you have deployed, you can query rank features with a <<query-dsl-sparse-vector-query, sparse vector>> query, or dense vectors with a kNN search.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/search-widget.asciidoc[]
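
Continuing the sparse example, a search might look like the sketch below; the `inference_id` assumes you have an {infer} endpoint wrapping the deployed model, and you can pass precomputed token-weight pairs via `query_vector` instead:

[source,console]
----
GET my-index/_search
{
  "query": {
    "sparse_vector": {
      "field": "content_embedding",
      "inference_id": "my-elser-endpoint",
      "query": "How to avoid muscle soreness after running?"
    }
  }
}
----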
+
+
+[discrete]
+[[deployed-hybrid-search]]
+==== Beyond semantic search with hybrid search
+
+In some situations, lexical search may perform better than semantic search, for example, when searching for single words or IDs, like product numbers.
+
+Combining semantic and lexical search into one hybrid search request using <<rrf,reciprocal rank fusion>> provides the best of both worlds.
+Not only that, but hybrid search using reciprocal rank fusion {blog-ref}improving-information-retrieval-elastic-stack-hybrid[has been shown to perform better in general].
+
+include::{es-ref-dir}/tab-widgets/semantic-search/hybrid-search-widget.asciidoc[]
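
One way to express such a hybrid request is an `rrf` retriever that fuses a lexical `match` query with the sparse vector query from above; all names remain illustrative:

[source,console]
----
GET my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "match": { "content": "muscle soreness running" }
            }
          }
        },
        {
          "standard": {
            "query": {
              "sparse_vector": {
                "field": "content_embedding",
                "inference_id": "my-elser-endpoint",
                "query": "muscle soreness running"
              }
            }
          }
        }
      ]
    }
  }
}
----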

docs/reference/search/search-your-data/semantic-search.asciidoc

Lines changed: 54 additions & 70 deletions
@@ -7,120 +7,104 @@ Semantic search is a search method that helps you find data based on the intent
 Using an NLP model enables you to extract text embeddings out of text.
 Embeddings are vectors that provide a numeric representation of a text.
 Pieces of content with similar meaning have similar representations.
-NLP models can be used in the {stack} various ways, you can:

-* deploy models in {es}
-* use the <<semantic-search-semantic-text, `semantic_text` workflow>> (recommended)
-* use the <<semantic-search-inference, {infer} API workflow>>
+You have several options for using NLP models in the {stack}:

+* use the `semantic_text` workflow (recommended)
+* use the {infer} API workflow
+* deploy models directly in {es}

-[[semantic-search-diagram]]
-.A simplified representation of encoding textual concepts as vectors
-image::images/search/vector-search-oversimplification.png[A simplified representation of encoding textual concepts as vectors,align="center"]
+Refer to <<using-nlp-models,this section>> to choose your workflow.

-At query time, {es} can use the same NLP model to convert a query into embeddings, enabling you to find documents with similar text embeddings.
+You can also store your own embeddings in {es} as vectors.
+Refer to <<using-query,this section>> for guidance on which query type to use for semantic search.

-This guide shows you how to implement semantic search with {es}: From selecting an NLP model, to writing queries.
+At query time, {es} can use the same NLP model to convert a query into embeddings, enabling you to find documents with similar text embeddings.

-IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <<semantic-search-semantic-text, `semantic_text`>> end-to-end tutorial.

 [discrete]
-[[semantic-search-select-nlp-model]]
-=== Select an NLP model
-
-{es} offers the usage of a
-{ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[wide range of NLP models], including both dense and sparse vector models.
-Your choice of the language model is critical for implementing semantic search successfully.
-
-While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging.
-Selecting an appropriate model from our third-party model list is the first step.
-Training the model on your own data is essential to ensure better search results than using only BM25.
-However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming.
-
-To address this issue, Elastic provides a pre-trained representational model called {ml-docs}/ml-nlp-elser.html[Elastic Learned Sparse EncodeR (ELSER)].
-ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning.
-This adaptability makes it suitable for various NLP use cases out of the box.
-Unless you have a team of ML specialists, it is highly recommended to use the ELSER model.
-
-In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values.
-This representation is commonly used for textual data.
-In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors.
-Each non-zero element of the vector corresponds to a term in the model vocabulary.
-The ELSER vocabulary contains around 30000 terms, so the sparse vectors created by ELSER contain about 30000 values, the majority of which are zero.
-Effectively the ELSER model is replacing the terms in the original query with other terms that have been learnt to exist in the documents that best match the original search terms in a training dataset, and weights to control how important each is.
+[[using-nlp-models]]
+=== Choose a semantic search workflow

 [discrete]
-[[semantic-search-deploy-nlp-model]]
-=== Deploy the model
+==== `semantic_text` workflow

-After you decide which model you want to use for implementing semantic search, you need to deploy the model in {es}.
+The simplest way to use NLP models in the {stack} is through the <<semantic-search-semantic-text, `semantic_text` workflow>>.
+We recommend using this approach because it abstracts away a lot of manual work.
+All you need to do is create an {infer} endpoint and an index mapping to start ingesting, embedding, and querying data.
+There is no need to define model-related settings and parameters, or to create {infer} ingest pipelines.
+Refer to the <<put-inference-api, Create an {infer} endpoint API>> documentation for a list of supported services.

-include::{es-ref-dir}/tab-widgets/semantic-search/deploy-nlp-model-widget.asciidoc[]
+The <<semantic-search-semantic-text, Semantic search with `semantic_text`>> tutorial shows you the process end-to-end.
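
As a sketch of how little setup this workflow needs: create an ELSER {infer} endpoint, then map a `semantic_text` field that points at it (endpoint, index, and field names below are illustrative):

[source,console]
----
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

PUT my-semantic-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}
----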

 [discrete]
-[[semantic-search-field-mappings]]
-=== Map a field for the text embeddings
+==== {infer} API workflow

-Before you start using the deployed model to generate embeddings based on your input text, you need to prepare your index mapping first.
-The mapping of the index depends on the type of model.
+The <<semantic-search-inference, {infer} API workflow>> is more complex but offers greater control over the {infer} endpoint configuration.
+You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings.

-include::{es-ref-dir}/tab-widgets/semantic-search/field-mappings-widget.asciidoc[]
+The <<semantic-search-inference, Semantic search with the {infer} API>> tutorial shows you the process end-to-end.
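
The extra control shows up when configuring the endpoint itself, for example against a third-party service. A hedged sketch using the Cohere service (the API key and model choice are placeholders):

[source,console]
----
PUT _inference/text_embedding/my-cohere-endpoint
{
  "service": "cohere",
  "service_settings": {
    "api_key": "<your-api-key>",
    "model_id": "embed-english-v3.0"
  }
}
----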

 [discrete]
-[[semantic-search-generate-embeddings]]
-=== Generate text embeddings
+==== Model deployment workflow

-Once you have created the mappings for the index, you can generate text embeddings from your input text.
-This can be done by using an
-<<ingest,ingest pipeline>> with an <<inference-processor,inference processor>>.
-The ingest pipeline processes the input data and indexes it into the destination index.
-At index time, the inference ingest processor uses the trained model to infer against the data ingested through the pipeline.
-After you created the ingest pipeline with the inference processor, you can ingest your data through it to generate the model output.
+You can also deploy NLP models in {es} manually, without using an {infer} endpoint.
+This is the most complex and labor-intensive workflow for performing semantic search in the {stack}.
+You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported dense and sparse vector models], deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data.

-include::{es-ref-dir}/tab-widgets/semantic-search/generate-embeddings-widget.asciidoc[]
+The <<semantic-search-deployed-nlp-model, Semantic search with a model deployed in {es}>> tutorial shows you the process end-to-end.
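
Deployment in this workflow typically goes through the Eland client's CLI. A sketch, assuming a local cluster and the sentence-transformer model referenced below (URL and credentials are placeholders):

[source,sh]
----
eland_import_hub_model \
  --url https://elastic:<password>@localhost:9200 \
  --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 \
  --task-type text_embedding \
  --start
----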

-Now it is time to perform semantic search!

 [discrete]
-[[semantic-search-search]]
-=== Search the data
+[[using-query]]
+=== Using the right query

-Depending on the type of model you have deployed, you can query rank features with a <<query-dsl-sparse-vector-query, sparse vector>> query, or dense vectors with a kNN search.
+Crafting the right query is crucial for semantic search.
+Which query you use and which field you target in your queries depends on your chosen workflow.
+If you're using the `semantic_text` workflow, it's quite simple.
+If not, it depends on which type of embeddings you're working with.

-include::{es-ref-dir}/tab-widgets/semantic-search/search-widget.asciidoc[]
+[cols="30%, 30%, 40%", options="header"]
+|===
+| Field type to query | Query to use | Notes
+| <<semantic-text,`semantic_text`>> | <<query-dsl-semantic-query,`semantic`>> | The `semantic_text` field handles generating embeddings for you at index time and query time.
+| <<sparse-vector,`sparse_vector`>> | <<query-dsl-sparse-vector-query,`sparse_vector`>> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time.
+| <<dense-vector,`dense_vector`>> | <<query-dsl-knn-query,`knn`>> | The `knn` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time.
+|===

-[discrete]
-[[semantic-search-hybrid-search]]
-=== Beyond semantic search with hybrid search
+If you want {es} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query.
+If you want to bring your own embeddings, use the `sparse_vector` or `dense_vector` field type and the associated query, depending on the NLP model you used to generate the embeddings.
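
For the first row of the table, a query can be as simple as the following sketch, which reuses the illustrative `semantic_text` index from earlier; embedding generation happens behind the scenes at both index and query time:

[source,console]
----
GET my-semantic-index/_search
{
  "query": {
    "semantic": {
      "field": "content",
      "query": "How to avoid muscle soreness after running?"
    }
  }
}
----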

-In some situations, lexical search may perform better than semantic search.
-For example, when searching for single words or IDs, like product numbers.
-
-Combining semantic and lexical search into one hybrid search request using
-<<rrf,reciprocal rank fusion>> provides the best of both worlds.
-Not only that, but hybrid search using reciprocal rank fusion {blog-ref}improving-information-retrieval-elastic-stack-hybrid[has been shown to perform better in general].
+IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <<semantic-search-semantic-text, `semantic_text`>> end-to-end tutorial.

-include::{es-ref-dir}/tab-widgets/semantic-search/hybrid-search-widget.asciidoc[]

 [discrete]
 [[semantic-search-read-more]]
 === Read more

 * Tutorials:
-** <<semantic-search-elser,Semantic search with ELSER>>
+** <<semantic-search-semantic-text, Semantic search with `semantic_text`>>
+** <<semantic-search-inference, Semantic search with the {infer} API>>
+** <<semantic-search-elser,Semantic search with ELSER>> using the model deployment workflow
+** <<semantic-search-deployed-nlp-model, Semantic search with a model deployed in {es}>>
 ** {ml-docs}/ml-nlp-text-emb-vector-search-example.html[Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model]
+* Interactive examples:
+** The https://github.com/elastic/elasticsearch-labs[`elasticsearch-labs`] repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {es} Python client
+** https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb[Semantic search with ELSER using the model deployment workflow]
+** https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[Semantic search with `semantic_text`]
 * Blogs:
+** https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text[{es} new semantic_text mapping: Simplifying semantic search]
 ** {blog-ref}may-2023-launch-sparse-encoder-ai-model[Introducing Elastic Learned Sparse Encoder: Elastic's AI model for semantic search]
 ** {blog-ref}lexical-ai-powered-search-elastic-vector-database[How to get the best of lexical and AI-powered search with Elastic's vector database]
 ** Information retrieval blog series:
 *** {blog-ref}improving-information-retrieval-elastic-stack-search-relevance[Part 1: Steps to improve search relevance]
 *** {blog-ref}improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval[Part 2: Benchmarking passage retrieval]
 *** {blog-ref}may-2023-launch-information-retrieval-elasticsearch-ai-model[Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model]
 *** {blog-ref}improving-information-retrieval-elastic-stack-hybrid[Part 4: Hybrid retrieval]
-* Interactive examples:
-** The https://github.com/elastic/elasticsearch-labs[`elasticsearch-labs`] repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {es} Python client

-include::semantic-search-elser.asciidoc[]
+
 include::semantic-search-semantic-text.asciidoc[]
 include::semantic-search-inference.asciidoc[]
+include::semantic-search-elser.asciidoc[]
 include::cohere-es.asciidoc[]
+include::semantic-search-deploy-model.asciidoc[]
