From fc0320a691f9a525a4e87767758f2c9ac9e24309 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Tue, 3 Sep 2024 13:19:59 +0200
Subject: [PATCH 01/10] [DOCS] Rework semantic search main page.

---
 .../semantic-search-deploy-model.asciidoc     |  92 +++++++++++++++
 .../search-your-data/semantic-search.asciidoc | 110 ++++++------------
 2 files changed, 129 insertions(+), 73 deletions(-)
 create mode 100644 docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc

diff --git a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc
new file mode 100644
index 0000000000000..11a7d3821fc73
--- /dev/null
+++ b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc
@@ -0,0 +1,92 @@
+[[semantic-search-deployed-nlp-model]]
+=== Tutorial: semantic search with a deployed model
+
+++++
+Semantic search with deployed model
+++++
+
+IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial.
+
+This guide shows you how to implement semantic search with models deployed in {es}: from selecting an NLP model, to writing queries.
+
+
+[discrete]
+[[deployed-select-nlp-model]]
+==== Select an NLP model
+
+{es} offers the use of a {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[wide range of NLP models], including both dense and sparse vector models.
+Your choice of language model is critical for implementing semantic search successfully.
+
+While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging.
+Selecting an appropriate model from our third-party model list is the first step.
+Training the model on your own data is essential to ensure better search results than using only BM25.
+However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming.
+
+To address this issue, Elastic provides a pre-trained representational model called {ml-docs}/ml-nlp-elser.html[Elastic Learned Sparse EncodeR (ELSER)].
+ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning.
+This makes it suitable for various NLP use cases out of the box.
+Unless you have a team of ML specialists, we highly recommend using the ELSER model.
+
+In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values.
+This representation is commonly used for textual data.
+In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors.
+Each non-zero element of the vector corresponds to a term in the model vocabulary.
+The ELSER vocabulary contains around 30,000 terms, so the sparse vectors created by ELSER contain about 30,000 values, the majority of which are zero.
+Effectively, the ELSER model replaces the terms of the original query with other terms that have been learned, from a training dataset, to occur in the documents that best match the original search terms, together with weights that control how important each term is.
+
+
+[discrete]
+[[deployed-deploy-nlp-model]]
+==== Deploy the model
+
+After you decide which model you want to use for implementing semantic search, you need to deploy the model in {es}.
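+
+The widget below walks through each deployment option in detail.
+As a quick illustration, a minimal sketch of downloading and starting ELSER with the trained models APIs looks like this (the model ID `.elser_model_2` and the deployment ID `for-search` are examples, not requirements):
+
+[source,console]
+----
+PUT _ml/trained_models/.elser_model_2
+{
+  "input": {
+    "field_names": ["text_field"]
+  }
+}
+
+POST _ml/trained_models/.elser_model_2/deployment/_start?deployment_id=for-search
+----
+
+The first request downloads the model into the cluster; the second starts a deployment that ingest pipelines and search requests can use.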
+
+include::{es-ref-dir}/tab-widgets/semantic-search/deploy-nlp-model-widget.asciidoc[]
+
+
+[discrete]
+[[deployed-field-mappings]]
+==== Map a field for the text embeddings
+
+Before you start using the deployed model to generate embeddings based on your input text, you need to prepare your index mapping.
+The mapping of the index depends on the type of model.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/field-mappings-widget.asciidoc[]
+
+
+[discrete]
+[[deployed-generate-embeddings]]
+==== Generate text embeddings
+
+Once you have created the mappings for the index, you can generate text embeddings from your input text.
+This can be done by using an
+<> with an <>.
+The ingest pipeline processes the input data and indexes it into the destination index.
+At index time, the inference ingest processor uses the trained model to infer against the data ingested through the pipeline.
+After you create the ingest pipeline with the inference processor, you can ingest your data through it to generate the model output.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/generate-embeddings-widget.asciidoc[]
+
+Now it is time to perform semantic search!
+
+
+[discrete]
+[[deployed-search]]
+==== Search the data
+
+Depending on the type of model you have deployed, you can query rank features with a <> query, or dense vectors with a kNN search.
+
+include::{es-ref-dir}/tab-widgets/semantic-search/search-widget.asciidoc[]
+
+
+[discrete]
+[[deployed-hybrid-search]]
+==== Beyond semantic search with hybrid search
+
+In some situations, lexical search may perform better than semantic search.
+For example, when searching for single words or IDs, like product numbers.
+
+Combining semantic and lexical search into one hybrid search request using <> provides the best of both worlds.
+Not only that, but hybrid search using reciprocal rank fusion {blog-ref}improving-information-retrieval-elastic-stack-hybrid[has been shown to perform better in general].
+
+include::{es-ref-dir}/tab-widgets/semantic-search/hybrid-search-widget.asciidoc[]
\ No newline at end of file
diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc
index fa84c3848b78c..d22c28b92e9ba 100644
--- a/docs/reference/search/search-your-data/semantic-search.asciidoc
+++ b/docs/reference/search/search-your-data/semantic-search.asciidoc
@@ -9,9 +9,9 @@ Embeddings are vectors that provide a numeric representation of a text.
Pieces of content with similar meaning have similar representations.

NLP models can be used in the {stack} various ways, you can:
-* deploy models in {es}
* use the <> (recommended)
* use the <>
+* deploy models directly in {es}


[[semantic-search-diagram]]
.A simplified representation of encoding textual concepts as vectors
image::images/search/vector-search-oversimplification.png[A simplified representation of encoding textual concepts as vectors,align="center"]

At query time, {es} can use the same NLP model to convert a query into embeddings, enabling you to find documents with similar text embeddings.

-This guide shows you how to implement semantic search with {es}: From selecting an NLP model, to writing queries.
-
-IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial.
-
-[discrete]
-[[semantic-search-select-nlp-model]]
-=== Select an NLP model
-
-{es} offers the usage of a
-{ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[wide range of NLP models], including both dense and sparse vector models.
-Your choice of the language model is critical for implementing semantic search successfully. - -While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging. -Selecting an appropriate model from our third-party model list is the first step. -Training the model on your own data is essential to ensure better search results than using only BM25. -However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming. - -To address this issue, Elastic provides a pre-trained representational model called {ml-docs}/ml-nlp-elser.html[Elastic Learned Sparse EncodeR (ELSER)]. -ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning. -This adaptability makes it suitable for various NLP use cases out of the box. -Unless you have a team of ML specialists, it is highly recommended to use the ELSER model. - -In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values. -This representation is commonly used for textual data. -In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors. -Each non-zero element of the vector corresponds to a term in the model vocabulary. -The ELSER vocabulary contains around 30000 terms, so the sparse vectors created by ELSER contain about 30000 values, the majority of which are zero. -Effectively the ELSER model is replacing the terms in the original query with other terms that have been learnt to exist in the documents that best match the original search terms in a training dataset, and weights to control how important each is. [discrete] -[[semantic-search-deploy-nlp-model]] -=== Deploy the model - -After you decide which model you want to use for implementing semantic search, you need to deploy the model in {es}. +[[using-nlp-models]] +=== Using NLP models -include::{es-ref-dir}/tab-widgets/semantic-search/deploy-nlp-model-widget.asciidoc[] +The easiest and recommended way to use NLP models in the {stack} is the <>. +If you want to use ELSER for semantic search or you already have a service you use, create an {infer} endpoint and an index mapping to start ingesting and querying data. +You don't need to define model-related settings and parameters, or create {infer} ingest pipelines. +Refer to the <> documentation for a list of supported services. -[discrete] -[[semantic-search-field-mappings]] -=== Map a field for the text embeddings +The <> more complex but enables you to have more control over the {infer} endpoint configuration. +You need to create an {infer} endpoint and provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the correct settings. -Before you start using the deployed model to generate embeddings based on your input text, you need to prepare your index mapping first. -The mapping of the index depends on the type of model. +You can also deploy NLP models directly in {es}. +This is the most complex way to perform semantic search in the {stack}. +You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported NLP models] that includes both dense and sparse vector models. 
+Then deploy the selected model by using the Eland client, create an index mapping and a sufficient ingest pipeline to start ingesting and querying data. -include::{es-ref-dir}/tab-widgets/semantic-search/field-mappings-widget.asciidoc[] [discrete] -[[semantic-search-generate-embeddings]] -=== Generate text embeddings +[[using-query]] +=== Using the right query -Once you have created the mappings for the index, you can generate text embeddings from your input text. -This can be done by using an -<> with an <>. -The ingest pipeline processes the input data and indexes it into the destination index. -At index time, the inference ingest processor uses the trained model to infer against the data ingested through the pipeline. -After you created the ingest pipeline with the inference processor, you can ingest your data through it to generate the model output. +Crafting the right query is crucial for semantic search. +The query type you should use depends first on whether you are using the recommended `semantic_text` workflow. +If not, it depends on the vector type in which your embeddings are stored. -include::{es-ref-dir}/tab-widgets/semantic-search/generate-embeddings-widget.asciidoc[] +[cols="3*", options="header"] +|======================================================================================================================================================================================================= +| Field to query | Query to use | Notes +| <> | <> . | The `semantic_text` field handles generating embeddings for you at index time and query time. +| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. +| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. +|======================================================================================================================================================================================================= -Now it is time to perform semantic search! +If you want {es} to generate embeddings for you both index and query time, use the `semantic_text` field and the `semantic` query. +If you want to bring your own embeddings, store them in {es} and use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used for generating the embeddings. -[discrete] -[[semantic-search-search]] -=== Search the data - -Depending on the type of model you have deployed, you can query rank features with a <> query, or dense vectors with a kNN search. - -include::{es-ref-dir}/tab-widgets/semantic-search/search-widget.asciidoc[] - -[discrete] -[[semantic-search-hybrid-search]] -=== Beyond semantic search with hybrid search - -In some situations, lexical search may perform better than semantic search. -For example, when searching for single words or IDs, like product numbers. - -Combining semantic and lexical search into one hybrid search request using -<> provides the best of both worlds. -Not only that, but hybrid search using reciprocal rank fusion {blog-ref}improving-information-retrieval-elastic-stack-hybrid[has been shown to perform better in general]. +IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. 
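+
+For example, assuming an index with a `semantic_text` field named `content` that is wired to an {infer} endpoint called `my-elser-endpoint` (the index, field, and endpoint names are placeholders), the mapping and query pair could look like this sketch:
+
+[source,console]
+----
+PUT my-index
+{
+  "mappings": {
+    "properties": {
+      "content": {
+        "type": "semantic_text",
+        "inference_id": "my-elser-endpoint"
+      }
+    }
+  }
+}
+
+GET my-index/_search
+{
+  "query": {
+    "semantic": {
+      "field": "content",
+      "query": "Best surfing places"
+    }
+  }
+}
+----
+
+The `semantic` query generates the query embedding with the same endpoint that produced the document embeddings, so the two vector representations always match.
+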
-include::{es-ref-dir}/tab-widgets/semantic-search/hybrid-search-widget.asciidoc[] [discrete] [[semantic-search-read-more]] === Read more * Tutorials: -** <> +** <> +** <> +** <> using the {infer} workflow +** <> ** {ml-docs}/ml-nlp-text-emb-vector-search-example.html[Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model] +* Interactive examples: +** The https://github.com/elastic/elasticsearch-labs[`elasticsearch-labs`] repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {es} Python client * Blogs: ** {blog-ref}may-2023-launch-sparse-encoder-ai-model[Introducing Elastic Learned Sparse Encoder: Elastic's AI model for semantic search] ** {blog-ref}lexical-ai-powered-search-elastic-vector-database[How to get the best of lexical and AI-powered search with Elastic's vector database] @@ -117,10 +81,10 @@ include::{es-ref-dir}/tab-widgets/semantic-search/hybrid-search-widget.asciidoc[ *** {blog-ref}improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval[Part 2: Benchmarking passage retrieval] *** {blog-ref}may-2023-launch-information-retrieval-elasticsearch-ai-model[Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model] *** {blog-ref}improving-information-retrieval-elastic-stack-hybrid[Part 4: Hybrid retrieval] -* Interactive examples: -** The https://github.com/elastic/elasticsearch-labs[`elasticsearch-labs`] repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {es} Python client -include::semantic-search-elser.asciidoc[] + include::semantic-search-semantic-text.asciidoc[] include::semantic-search-inference.asciidoc[] +include::semantic-search-elser.asciidoc[] include::cohere-es.asciidoc[] +include::semantic-search-deploy-model.asciidoc[] From 42acb0e3a813d94fd9b5e36ac0f1aa5bdf182511 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Tue, 3 Sep 2024 14:49:33 +0200 Subject: [PATCH 02/10] [DOCS] Improves prose. --- .../search-your-data/semantic-search.asciidoc | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index d22c28b92e9ba..63a7feafc6b9a 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -25,18 +25,16 @@ At query time, {es} can use the same NLP model to convert a query into embedding [[using-nlp-models]] === Using NLP models -The easiest and recommended way to use NLP models in the {stack} is the <>. -If you want to use ELSER for semantic search or you already have a service you use, create an {infer} endpoint and an index mapping to start ingesting and querying data. -You don't need to define model-related settings and parameters, or create {infer} ingest pipelines. +The easiest and recommended way to use NLP models in the {stack} is through the <>. +If you want to use ELSER for semantic search or already have a service you use, create an {infer} endpoint and an index mapping to start ingesting and querying data. +There is no need to define model-related settings and parameters, or to create {infer} ingest pipelines. Refer to the <> documentation for a list of supported services. -The <> more complex but enables you to have more control over the {infer} endpoint configuration. 
-You need to create an {infer} endpoint and provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the correct settings. +The <> more complex but offers greater control over the {infer} endpoint configuration. +You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings. -You can also deploy NLP models directly in {es}. -This is the most complex way to perform semantic search in the {stack}. -You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported NLP models] that includes both dense and sparse vector models. -Then deploy the selected model by using the Eland client, create an index mapping and a sufficient ingest pipeline to start ingesting and querying data. +You can also deploy NLP models directly in {es}, which is the most complex way to perform semantic search in the {stack}. +You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported dense and sparse vector models], deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. [discrete] From a33a930f2841ad09570e8924bb57e820f130e953 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 4 Sep 2024 11:44:09 +0200 Subject: [PATCH 03/10] Apply suggestions from code review Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- .../search-your-data/semantic-search.asciidoc | 22 +++++++++++-------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index 63a7feafc6b9a..bbf52a357b7c6 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -7,7 +7,8 @@ Semantic search is a search method that helps you find data based on the intent Using an NLP model enables you to extract text embeddings out of text. Embeddings are vectors that provide a numeric representation of a text. Pieces of content with similar meaning have similar representations. -NLP models can be used in the {stack} various ways, you can: + +You have several options for using NLP models in the {stack}: * use the <> (recommended) * use the <> @@ -23,17 +24,19 @@ At query time, {es} can use the same NLP model to convert a query into embedding [discrete] [[using-nlp-models]] -=== Using NLP models +=== Choose a semantic search workflow -The easiest and recommended way to use NLP models in the {stack} is through the <>. -If you want to use ELSER for semantic search or already have a service you use, create an {infer} endpoint and an index mapping to start ingesting and querying data. +The simplest way to use NLP models in the {stack} is through the <>. +We recommend using this approach because it abstracts away a lot of manual work. +All you need to do is create an {infer} endpoint and an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {infer} ingest pipelines. Refer to the <> documentation for a list of supported services. 
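+
+As an illustration, a minimal sketch of creating an ELSER {infer} endpoint for this workflow looks like the following (the endpoint name `my-elser-endpoint` and the allocation settings are placeholders you would adjust for your deployment):
+
+[source,console]
+----
+PUT _inference/sparse_embedding/my-elser-endpoint
+{
+  "service": "elser",
+  "service_settings": {
+    "num_allocations": 1,
+    "num_threads": 1
+  }
+}
+----
+
+Once the endpoint exists, the `semantic_text` mapping only needs to reference it by its `inference_id`.
+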
-The <> more complex but offers greater control over the {infer} endpoint configuration. +The <> is more complex but offers greater control over the {infer} endpoint configuration. You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings. -You can also deploy NLP models directly in {es}, which is the most complex way to perform semantic search in the {stack}. +You can also deploy NLP in {es} manually, without using an {infer} endpoint. +This is the most complex and labor intensive workflow for performing semantic search in the {stack}. You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported dense and sparse vector models], deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. @@ -42,12 +45,13 @@ You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp- === Using the right query Crafting the right query is crucial for semantic search. -The query type you should use depends first on whether you are using the recommended `semantic_text` workflow. -If not, it depends on the vector type in which your embeddings are stored. +Which query you use and which field you target in your queries depends on your chosen workflow. +If you're using the `semantic_text` workflow it's quite simple. +If not, it depends on which type of embeddings you're working with. [cols="3*", options="header"] |======================================================================================================================================================================================================= -| Field to query | Query to use | Notes +| Field type to query | Query to use | Notes | <> | <> . | The `semantic_text` field handles generating embeddings for you at index time and query time. | <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. | <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. From d633faaa1f507061b4f9d27334cc664d4f03fdfc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 4 Sep 2024 14:22:16 +0200 Subject: [PATCH 04/10] [DOCS] Addresses feedback. --- .../semantic-search-deploy-model.asciidoc | 6 +++++- .../search-your-data/semantic-search.asciidoc | 19 +++++++++++-------- 2 files changed, 16 insertions(+), 9 deletions(-) diff --git a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc index 11a7d3821fc73..413be561796c3 100644 --- a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc @@ -5,7 +5,11 @@ Semantic search with deployed model ++++ -IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. +[IMPORTANT] +==== +* For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. +* This tutorial was written before the {infer} endpoint and `semantic_text` options were introduced which makes this as the most complicated option. 
We recommend to use another method to perform semantic search. +==== This guide shows you how to implement semantic search with models deployed in {es}: from selecting an NLP model, to writing queries. diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index bbf52a357b7c6..2b004a700803a 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -10,14 +10,14 @@ Pieces of content with similar meaning have similar representations. You have several options for using NLP models in the {stack}: -* use the <> (recommended) -* use the <> +* use the `semantic_text` workflow (recommended) +* use the {infer} API workflow * deploy models directly in {es} +Refer to <> to choose your workflow. -[[semantic-search-diagram]] -.A simplified representation of encoding textual concepts as vectors -image::images/search/vector-search-oversimplification.png[A simplified representation of encoding textual concepts as vectors,align="center"] +You can also store your own embeddings in {es} as vectors. +Refer to <> for guidance on which query type to use for semantic search. At query time, {es} can use the same NLP model to convert a query into embeddings, enabling you to find documents with similar text embeddings. @@ -31,13 +31,16 @@ We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an {infer} endpoint and an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {infer} ingest pipelines. Refer to the <> documentation for a list of supported services. +The <> tutorial shows you the process end-to-end. The <> is more complex but offers greater control over the {infer} endpoint configuration. You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings. +The <> tutorial shows you the process end-to-end. You can also deploy NLP in {es} manually, without using an {infer} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {stack}. You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported dense and sparse vector models], deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. +the <> tutorial shows you the process end-to-end. [discrete] @@ -52,9 +55,9 @@ If not, it depends on which type of embeddings you're working with. [cols="3*", options="header"] |======================================================================================================================================================================================================= | Field type to query | Query to use | Notes -| <> | <> . | The `semantic_text` field handles generating embeddings for you at index time and query time. -| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. -| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. +| <> . | <> . 
| The `semantic_text` field handles generating embeddings for you at index time and query time. +| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. +| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. |======================================================================================================================================================================================================= If you want {es} to generate embeddings for you both index and query time, use the `semantic_text` field and the `semantic` query. From 32041ab43b3334ac1cb322ea17372446c910bf22 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 4 Sep 2024 14:27:23 +0200 Subject: [PATCH 05/10] Update docs/reference/search/search-your-data/semantic-search.asciidoc --- docs/reference/search/search-your-data/semantic-search.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index 2b004a700803a..21a6e00e86399 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -55,7 +55,7 @@ If not, it depends on which type of embeddings you're working with. [cols="3*", options="header"] |======================================================================================================================================================================================================= | Field type to query | Query to use | Notes -| <> . | <> . | The `semantic_text` field handles generating embeddings for you at index time and query time. +| <> | <> | The `semantic_text` field handles generating embeddings for you at index time and query time. | <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. | <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. |======================================================================================================================================================================================================= From 6d2e20de9900598bc0028a204ec6f4cee8884161 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 4 Sep 2024 14:29:48 +0200 Subject: [PATCH 06/10] [DOCS] Adds subheadings. --- .../search/search-your-data/semantic-search.asciidoc | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index 2b004a700803a..0bf748e5d1ef1 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -26,6 +26,9 @@ At query time, {es} can use the same NLP model to convert a query into embedding [[using-nlp-models]] === Choose a semantic search workflow +[discrete] +==== The `semantic_text` workflow + The simplest way to use NLP models in the {stack} is through the <>. 
We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an {infer} endpoint and an index mapping to start ingesting, embedding, and querying data. @@ -33,10 +36,16 @@ There is no need to define model-related settings and parameters, or to create { Refer to the <> documentation for a list of supported services. The <> tutorial shows you the process end-to-end. +[discrete] +==== The {infer} API workflow + The <> is more complex but offers greater control over the {infer} endpoint configuration. You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings. The <> tutorial shows you the process end-to-end. +[discrete] +==== Model deployment workflow + You can also deploy NLP in {es} manually, without using an {infer} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {stack}. You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported dense and sparse vector models], deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. From 102590df539717f9ecf1d88d3d7074d98c5954db Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 4 Sep 2024 16:18:20 +0200 Subject: [PATCH 07/10] Apply suggestions from code review Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- .../semantic-search-deploy-model.asciidoc | 2 +- .../search/search-your-data/semantic-search.asciidoc | 9 ++++++--- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc index 413be561796c3..e5e67f5cd1aca 100644 --- a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc @@ -8,7 +8,7 @@ [IMPORTANT] ==== * For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. -* This tutorial was written before the {infer} endpoint and `semantic_text` options were introduced which makes this as the most complicated option. We recommend to use another method to perform semantic search. +* This tutorial was written before the {infer} endpoint and `semantic_text` options were introduced. Today we have simpler options for performing semantic search. ==== This guide shows you how to implement semantic search with models deployed in {es}: from selecting an NLP model, to writing queries. diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index 2cd7140ff9585..04e9bd937337d 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -34,6 +34,7 @@ We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an {infer} endpoint and an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {infer} ingest pipelines. Refer to the <> documentation for a list of supported services. 
+ The <> tutorial shows you the process end-to-end. [discrete] @@ -41,6 +42,7 @@ The <> tuto The <> is more complex but offers greater control over the {infer} endpoint configuration. You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings. + The <> tutorial shows you the process end-to-end. [discrete] @@ -49,7 +51,8 @@ The <> tutorial You can also deploy NLP in {es} manually, without using an {infer} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {stack}. You need to select an NLP model from the {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[list of supported dense and sparse vector models], deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. -the <> tutorial shows you the process end-to-end. + +The <> tutorial shows you the process end-to-end. [discrete] @@ -65,11 +68,11 @@ If not, it depends on which type of embeddings you're working with. |======================================================================================================================================================================================================= | Field type to query | Query to use | Notes | <> | <> | The `semantic_text` field handles generating embeddings for you at index time and query time. -| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. +| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. | <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. |======================================================================================================================================================================================================= -If you want {es} to generate embeddings for you both index and query time, use the `semantic_text` field and the `semantic` query. +If you want {es} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query. If you want to bring your own embeddings, store them in {es} and use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used for generating the embeddings. IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. From 7503c4df377a44cb282f10d0b6d4605a9c402d2b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 4 Sep 2024 16:36:48 +0200 Subject: [PATCH 08/10] [DOCS] More edits. 
--- .../search/search-your-data/semantic-search.asciidoc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index 04e9bd937337d..622ce25e74328 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -64,12 +64,12 @@ Which query you use and which field you target in your queries depends on your c If you're using the `semantic_text` workflow it's quite simple. If not, it depends on which type of embeddings you're working with. -[cols="3*", options="header"] +[cols="30%, 30%, 40%", options="header"] |======================================================================================================================================================================================================= | Field type to query | Query to use | Notes -| <> | <> | The `semantic_text` field handles generating embeddings for you at index time and query time. -| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. -| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You are expected to provide embeddings at index time. +| <> | <> | The `semantic_text` field handles generating embeddings for you at index time and query time. +| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. Provide embeddings at index time. +| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. Provide embeddings at index time. |======================================================================================================================================================================================================= If you want {es} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query. From 4e0a49e49d6c457e1fbf337f5696d9c34a9860a3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Thu, 12 Sep 2024 08:56:31 +0200 Subject: [PATCH 09/10] Apply suggestions from code review Co-authored-by: Mike Pellegrini --- .../search/search-your-data/semantic-search.asciidoc | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc index 622ce25e74328..99dd9ea03a594 100644 --- a/docs/reference/search/search-your-data/semantic-search.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search.asciidoc @@ -27,7 +27,7 @@ At query time, {es} can use the same NLP model to convert a query into embedding === Choose a semantic search workflow [discrete] -==== The `semantic_text` workflow +==== `semantic_text` workflow The simplest way to use NLP models in the {stack} is through the <>. We recommend using this approach because it abstracts away a lot of manual work. @@ -38,7 +38,7 @@ Refer to the <> documentation The <> tutorial shows you the process end-to-end. [discrete] -==== The {infer} API workflow +==== {infer} API workflow The <> is more complex but offers greater control over the {infer} endpoint configuration. 
You need to create an {infer} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {infer} ingest pipeline with the appropriate settings. @@ -68,12 +68,12 @@ If not, it depends on which type of embeddings you're working with. |======================================================================================================================================================================================================= | Field type to query | Query to use | Notes | <> | <> | The `semantic_text` field handles generating embeddings for you at index time and query time. -| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. Provide embeddings at index time. -| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. Provide embeddings at index time. +| <> | <> | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time. +| <> | <> | The `knn` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time. |======================================================================================================================================================================================================= If you want {es} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query. -If you want to bring your own embeddings, store them in {es} and use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used for generating the embeddings. +If you want to bring your own embeddings, use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used to generate the embeddings. IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. @@ -85,7 +85,7 @@ IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer * Tutorials: ** <> ** <> -** <> using the {infer} workflow +** <> using the model deployment workflow ** <> ** {ml-docs}/ml-nlp-text-emb-vector-search-example.html[Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model] * Interactive examples: From 340926cdd6c6f6701a9ae18a9f7fd7e48e77a669 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Thu, 12 Sep 2024 10:42:05 +0200 Subject: [PATCH 10/10] [DOCS] Addresses feedback. --- .../search-your-data/semantic-search-deploy-model.asciidoc | 3 ++- .../reference/search/search-your-data/semantic-search.asciidoc | 3 +++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc index e5e67f5cd1aca..6c610159ae0b9 100644 --- a/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc +++ b/docs/reference/search/search-your-data/semantic-search-deploy-model.asciidoc @@ -8,7 +8,8 @@ [IMPORTANT] ==== * For the easiest way to perform semantic search in the {stack}, refer to the <> end-to-end tutorial. -* This tutorial was written before the {infer} endpoint and `semantic_text` options were introduced. Today we have simpler options for performing semantic search. 
+* This tutorial was written before the <> and <> were introduced.
+Today we have simpler options for performing semantic search.
====

This guide shows you how to implement semantic search with models deployed in {es}: from selecting an NLP model, to writing queries.

diff --git a/docs/reference/search/search-your-data/semantic-search.asciidoc b/docs/reference/search/search-your-data/semantic-search.asciidoc
index 99dd9ea03a594..62e41b3eef3de 100644
--- a/docs/reference/search/search-your-data/semantic-search.asciidoc
+++ b/docs/reference/search/search-your-data/semantic-search.asciidoc
@@ -90,7 +90,10 @@ IMPORTANT: For the easiest way to perform semantic search in the {stack}, refer
** {ml-docs}/ml-nlp-text-emb-vector-search-example.html[Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model]
* Interactive examples:
** The https://github.com/elastic/elasticsearch-labs[`elasticsearch-labs`] repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {es} Python client
+** https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb[Semantic search with ELSER using the model deployment workflow]
+** https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[Semantic search with `semantic_text`]
* Blogs:
+** https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text[{es} new semantic_text mapping: Simplifying semantic search]
** {blog-ref}may-2023-launch-sparse-encoder-ai-model[Introducing Elastic Learned Sparse Encoder: Elastic's AI model for semantic search]
** {blog-ref}lexical-ai-powered-search-elastic-vector-database[How to get the best of lexical and AI-powered search with Elastic's vector database]
** Information retrieval blog series: