From 854f963c189eeefc802266ca7c6e06f78b233482 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Thu, 6 Feb 2025 18:24:23 +0100 Subject: [PATCH 01/30] [Search] Vector & semantic search --- ...asticsearch-reference-semantic-options.svg | 5 +- .../semantic-search.md | 101 ------------------ solutions/search/full-text.md | 7 +- solutions/search/semantic-search.md | 84 ++++++++++++--- solutions/search/vector.md | 97 ++++++++++++++++- .../search/vector/sparse-vector-elser.md | 1 - solutions/toc.yml | 2 +- 7 files changed, 168 insertions(+), 129 deletions(-) delete mode 100644 raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search.md diff --git a/images/elasticsearch-reference-semantic-options.svg b/images/elasticsearch-reference-semantic-options.svg index 3bedf53073..c4d68b19b5 100644 --- a/images/elasticsearch-reference-semantic-options.svg +++ b/images/elasticsearch-reference-semantic-options.svg @@ -1,8 +1,5 @@ - - - Elasticsearch semantic search workflows - + diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search.md deleted file mode 100644 index c568363907..0000000000 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search.md +++ /dev/null @@ -1,101 +0,0 @@ -# Semantic search [semantic-search] - -Semantic search is a search method that helps you find data based on the intent and contextual meaning of a search query, instead of a match on query terms (lexical search). - -{{es}} provides various semantic search capabilities using [natural language processing (NLP)](../../../explore-analyze/machine-learning/nlp.md) and vector search. Using an NLP model enables you to extract text embeddings out of text. Embeddings are vectors that provide a numeric representation of a text. Pieces of content with similar meaning have similar representations. - -:::{image} ../../../images/elasticsearch-reference-semantic-options.svg -:alt: Overview of semantic search workflows in {es} -::: - -You have several options for using NLP models in the {{stack}}: - -* use the `semantic_text` workflow (recommended) -* use the {{infer}} API workflow -* deploy models directly in {es} - -Refer to [this section](../../../solutions/search/semantic-search.md#using-nlp-models) to choose your workflow. - -You can also store your own embeddings in {{es}} as vectors. Refer to [this section](../../../solutions/search/semantic-search.md#using-query) for guidance on which query type to use for semantic search. - -At query time, {{es}} can use the same NLP model to convert a query into embeddings, enabling you to find documents with similar text embeddings. - - -## Choose a semantic search workflow [using-nlp-models] - - -### `semantic_text` workflow [_semantic_text_workflow] - -The simplest way to use NLP models in the {{stack}} is through the [`semantic_text` workflow](../../../solutions/search/semantic-search/semantic-search-semantic-text.md). We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an {{infer}} endpoint and an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {{infer}} ingest pipelines. Refer to the [Create an {{infer}} endpoint API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html) documentation for a list of supported services. 
- -The [Semantic search with `semantic_text`](../../../solutions/search/semantic-search/semantic-search-semantic-text.md) tutorial shows you the process end-to-end. - - -### {{infer}} API workflow [_infer_api_workflow] - -The [{{infer}} API workflow](../../../solutions/search/inference-api.md) is more complex but offers greater control over the {{infer}} endpoint configuration. You need to create an {{infer}} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {{infer}} ingest pipeline with the appropriate settings. - -The [Semantic search with the {{infer}} API](../../../solutions/search/inference-api.md) tutorial shows you the process end-to-end. - - -### Model deployment workflow [_model_deployment_workflow] - -You can also deploy NLP in {{es}} manually, without using an {{infer}} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {{stack}}. You need to select an NLP model from the [list of supported dense and sparse vector models](../../../explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. - -The [Semantic search with a model deployed in {{es}}](../../../solutions/search/semantic-search/semantic-search-deployed-nlp-model.md) tutorial shows you the process end-to-end. - - -## Using the right query [using-query] - -Crafting the right query is crucial for semantic search. Which query you use and which field you target in your queries depends on your chosen workflow. If you’re using the `semantic_text` workflow it’s quite simple. If not, it depends on which type of embeddings you’re working with. - -| Field type to query | Query to use | Notes | -| --- | --- | --- | -| [`semantic_text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html) | [`semantic`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-semantic-query.html) | The `semantic_text` field handles generating embeddings for you at index time and query time. | -| [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) | [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-sparse-vector-query.html) | The `sparse_vector` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time. | -| [`dense_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html) | [`knn`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html) | The `knn` query can generate query embeddings for you, but you can also provide your own. You must provide embeddings at index time. | - -If you want {{es}} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query. If you want to bring your own embeddings, use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used to generate the embeddings. - -::::{important} -For the easiest way to perform semantic search in the {{stack}}, refer to the [`semantic_text`](../../../solutions/search/semantic-search/semantic-search-semantic-text.md) end-to-end tutorial. 
-:::: - - - -## Read more [semantic-search-read-more] - -* Tutorials: - - * [Semantic search with `semantic_text`](../../../solutions/search/semantic-search/semantic-search-semantic-text.md) - * [Semantic search with the {{infer}} API](../../../solutions/search/inference-api.md) - * [Semantic search with ELSER](../../../solutions/search/vector/sparse-vector-elser.md) using the model deployment workflow - * [Semantic search with a model deployed in {{es}}](../../../solutions/search/semantic-search/semantic-search-deployed-nlp-model.md) - * [Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md) - -* Interactive examples: - - * The [`elasticsearch-labs`](https://github.com/elastic/elasticsearch-labs) repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {{es}} Python client - * [Semantic search with ELSER using the model deployment workflow](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb) - * [Semantic search with `semantic_text`](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb) - -* Blogs: - - * [{{es}} new semantic_text mapping: Simplifying semantic search](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text) - * [Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search](https://www.elastic.co/blog/may-2023-launch-sparse-encoder-ai-model) - * [How to get the best of lexical and AI-powered search with Elastic’s vector database](https://www.elastic.co/blog/lexical-ai-powered-search-elastic-vector-database) - * Information retrieval blog series: - - * [Part 1: Steps to improve search relevance](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-search-relevance) - * [Part 2: Benchmarking passage retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval) - * [Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model](https://www.elastic.co/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model) - * [Part 4: Hybrid retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid) - - - - - - - - - diff --git a/solutions/search/full-text.md b/solutions/search/full-text.md index 5e403845c6..5dffd081d0 100644 --- a/solutions/search/full-text.md +++ b/solutions/search/full-text.md @@ -5,11 +5,8 @@ mapped_pages: # Full-text search [full-text-search] -::::{admonition} Hands-on introduction to full-text search -:class: tip - -Would you prefer to jump straight into a hands-on tutorial? Refer to our quick start [full-text search tutorial](get-started.md). - +::::{tip} +Would you prefer to start with a hands-on example? Refer to our [full-text search tutorial](querydsl-full-text-filter-tutorial.md). :::: Full-text search, also known as lexical search, is a technique for fast, efficient searching through text fields in documents. Documents and search queries are transformed to enable returning [relevant](https://www.elastic.co/what-is/search-relevance) results instead of simply exact term matches. Fields of type [`text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html#text-field-type) are analyzed and indexed for full-text search. 
diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index 536d9f33b2..855fb90864 100644 --- a/solutions/search/semantic-search.md +++ b/solutions/search/semantic-search.md @@ -1,26 +1,78 @@ ---- -mapped_urls: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html - - https://www.elastic.co/guide/en/serverless/current/elasticsearch-reference-semantic-search.html ---- +# Semantic search [semantic-search] -# Semantic search +:::{note} +This page focuses on the semantic search workflows available in {{es}}. For detailed information about vector search implementations, refer to [vector search](vector.md). +::: -% What needs to be done: Lift-and-shift +Sometimes [full-text search](full-text.md) isn't enough. -% Use migrated content from existing pages that map to this page: +Semantic search techniques help users find data based on intent and contextual meaning, going beyond traditional keyword matching. -% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search.md -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-reference-semantic-search.md +Semantic search has a wide range of use cases, including: -% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc): +- **Question answering**: Find the most relevant answers to user questions +- **Recommendation systems**: Suggest similar items based on user preferences +- **Information retrieval**: Retrieve relevant information from large datasets +- **Product discovery**: Help E-commerce users find relevant products +- **Workplace search**: Help employees find relevant information within an organization -$$$elasticsearch-reference-semantic-search-semantic-text$$$ +{{es}} provides various semantic search capabilities using [natural language processing (NLP)](../../raw-migrated-files/elasticsearch/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md). -$$$elasticsearch-reference-semantic-search-inference-api$$$ +## Overview of semantic search workflows [semantic-search-workflows-overview] -$$$elasticsearch-reference-semantic-search-model-deployment$$$ +You have several options for using NLP models for semantic search in the {{stack}}: -$$$using-nlp-models$$$ +* Option 1: Use the `semantic_text` workflow (recommended) +* Option 2: Use the {{infer}} API workflow +* Option 3: Deploy models directly in {{es}} -$$$using-query$$$ \ No newline at end of file +This diagram summarizes the relative complexity of each workflow: + +:::{image} ../../../images/elasticsearch-reference-semantic-options.svg +:alt: Overview of semantic search workflows in {{es}} +::: + +## Choose a semantic search workflow [using-nlp-models] + +### Option 1: `semantic_text` [_semantic_text_workflow] + +The simplest way to use NLP models in the {{stack}} is through the [`semantic_text` workflow](semantic-search/semantic-search-semantic-text.md). We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an {{infer}} endpoint and an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {{infer}} ingest pipelines. Refer to the [Create an {{infer}} endpoint API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html) documentation for a list of supported services. 
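As a sketch of how little setup this involves (the endpoint and index names here are placeholders, and `elser` is just one of the supported services):

```console
PUT _inference/sparse_embedding/my-inference-endpoint
{
  "service": "elser", <1>
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

PUT my-semantic-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-inference-endpoint" <2>
      }
    }
  }
}
```

1. Creates a sparse embedding {{infer}} endpoint backed by the `elser` service.
2. The `semantic_text` field uses the endpoint to generate embeddings at both index time and query time.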
+ +For an end-to-end tutorial, refer to [Semantic search with `semantic_text`](semantic-search/semantic-search-semantic-text.md). + + +### Option 2: Inference API [_infer_api_workflow] + +The [{{infer}} API workflow](inference-api.md) is more complex but offers greater control over the {{infer}} endpoint configuration. You need to create an {{infer}} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {{infer}} ingest pipeline with the appropriate settings. + +For an end-to-end tutorial, refer to [Semantic search with the {{infer}} API](inference-api.md). + + +### Option 3: Manual model deployment [_model_deployment_workflow] + +You can also deploy NLP in {{es}} manually, without using an {{infer}} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {{stack}}. You need to select an NLP model from the [list of supported dense and sparse vector models](../../explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. + +For an end-to-end tutorial, refer to [Semantic search with a model deployed in {{es}}](semantic-search/semantic-search-deployed-nlp-model.md). + +::::{tip} +Refer to [vector queries and field types](vector.md#choosing-vector-query) for a quick reference overview. +:::: + +## Learn more [semantic-search-read-more] + +### Interactive examples + +- The [`elasticsearch-labs`](https://github.com/elastic/elasticsearch-labs) repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {{es}} Python client +- [Semantic search with ELSER using the model deployment workflow](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb) +- [Semantic search with `semantic_text`](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb) + +### Blogs + +- [{{es}} new semantic_text mapping: Simplifying semantic search](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text) +- [Introducing Elastic Learned Sparse Encoder: Elastic's AI model for semantic search](https://www.elastic.co/blog/may-2023-launch-sparse-encoder-ai-model) +- [How to get the best of lexical and AI-powered search with Elastic's vector database](https://www.elastic.co/blog/lexical-ai-powered-search-elastic-vector-database) +- Information retrieval blog series: + - [Part 1: Steps to improve search relevance](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-search-relevance) + - [Part 2: Benchmarking passage retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval) + - [Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model](https://www.elastic.co/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model) + - [Part 4: Hybrid retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid) \ No newline at end of file diff --git a/solutions/search/vector.md b/solutions/search/vector.md index f70ac83850..4b1f1a16d5 100644 --- a/solutions/search/vector.md +++ b/solutions/search/vector.md @@ -1,3 +1,98 @@ # Vector search -% What needs to be done: Write from scratch \ No newline at end of file +:::{tip} +Looking for a minimal configuration approach? 
The `semantic_text` field type provides an abstraction over these vector search implementations with sensible defaults and automatic model management. It's the recommended approach for most users. [Learn more about semantic_text](semantic-search/semantic-search-semantic-text.md). +::: + +Elasticsearch's vector search capabilities enable finding content based on meaning and similarity, instead of keyword or exact term matches. Vector search is an important component of most modern [semantic search](semantic-search.md) implementations. +Vector search can also be used independently for various similarity matching use cases. + +This guide explores the more manual technical implementation of vector search approaches, that **do not** use the `semantic_text` workflow. + +## Vector queries and field types [choosing-vector-query] + +Which query you use and which field you target in your queries depends on your chosen workflow. If you’re using the `semantic_text` workflow it’s quite simple. If not, it depends on which type of embeddings you’re working with. + +If you want {{es}} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query. If you want to bring your own embeddings, use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used to generate the embeddings. + +| Vector type | Field type | Query type | Use case | +| ----------- | --------------- | --------------- | -------------------------------------------------- | +| Dense | `dense_vector` | `knn` | Semantic similarity matching via neural embeddings | +| Sparse | `sparse_vector` | `sparse_vector` | Enhanced text search with semantic term expansion | +|Dense or sparse | [`semantic_text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html) | [`semantic`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-semantic-query.html) | The `semantic_text` field handles generating embeddings for you at index time and query time. | + +## Dense vector vs. sparse vector + +### Dense vector + +Dense neural embeddings capture semantic meaning by translating content into a vector space. Fixed-length vectors of floating-point numbers represent the content's meaning, with similar content mapped to nearby points in the vector space. +When you need to find related items, these vectors work with distance metrics to identify semantic similarity. This is ideal for when you want to capture "what this content is about" rather than just what words it contains. + +Dense vectors are well-suited for: +- Finding semantically similar content ("more like this") +- Matching questions with answers +- Image similarity search +- Recommendations based on content similarity + +To implement dense vector search, you'll need to: +1. Generate document embeddings using a neural model (locally or in {{es}}) +2. Configure your vector similarity settings +3. Generate query embeddings at search time +4. Use the `knn` query for searches + +[Learn more about dense vector search implementation](vector/dense-vector.md) + +### Sparse vector (ELSER) + +While dense vectors capture overall meaning, sparse vectors excel at intelligent vocabulary expansion. The model enriches each key concept with closely related terms and ideas, creating semantic connections that go beyond simple word-level substitutions. 
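Conceptually, a sparse embedding is a set of terms with weights. A short passage about running injuries, for example, might expand into weighted terms along these lines (illustrative values only, not the output of any particular model):

```json
{
  "running": 1.18,
  "muscle": 1.05,
  "soreness": 0.97,
  "pain": 0.74,
  "recovery": 0.46
}
```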
+ +Using [Elastic's learned sparse encoder model (ELSER)](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md), both queries and documents get expanded into weighted terms. The result is powerful search that is also transparent - you can see exactly why matches occur. This contrasts with the "black-box" nature of dense vector matching, where similarity is based on abstract vector distances. + +Sparse vectors are ideal for: + +- Enhanced keyword search with semantic understanding +- Cases where you need explainable search results +- Scenarios requiring precise term matching with semantic awareness +- Domain-specific search where vocabulary relationships matter + +To implement sparse vector search with ELSER manually, you'll need to: +1. Configure ELSER to embed your content +2. Configure ELSER to embed your queries +3. Use the `sparse_vector` query for searches + +[Learn more about implementing sparse vector search](vector/sparse-vector-elser.md) + +### Key considerations + +If you've chosen to implement vector search manually rather than using the guided `semantic_text` workflow, consider: + +1. **Data characteristics** + - **Text length:** Dense vectors work best with shorter texts like product descriptions or chat messages. For longer content like articles or documentation, sparse vectors handle length better. + - **Domain specificity:** For specialized content (medical literature, legal documents, technical documentation), sparse vectors preserve domain-specific terminology. Dense vectors can miss nuanced technical terms. + - **Update frequency:** If your content changes frequently, dense vectors require recomputing embeddings for each update. Sparse vectors handle incremental updates more efficiently. + +2. **Performance requirements** + - **Query latency:** Dense vectors with HNSW offer fast search times for moderate-sized datasets. Sparse vectors have higher but consistent latency across dataset sizes. + - **Memory footprint:** Dense vectors' size is determined by their chosen dimensionality, while sparse vectors' size varies with content complexity. + - **Resource needs:** Dense vectors benefit significantly from GPU acceleration. Sparse vectors perform well on CPU-only setups. + +3. **Explainability needs** + - **Transparency is important:** For transparency and relevance debugging, sparse vectors show exactly which terms contributed. + - **Transparency isn't important:** For recommendations and "more like this" features, dense vectors work well. + +4. **Implementation complexity** + - **Dense vector setup:** More complex due to wide range of embedding models to choose from, configure, and manage. + - **Sparse vector setup:** Simpler since ELSER is Elastic's standard sparse encoding model and is available out-of-the-box. 
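To ground the two manual paths before moving on, here are minimal query sketches. The index, field, and endpoint names are placeholders, and the three-element query vector is a toy value; real embedding models produce vectors with hundreds of dimensions:

```console
GET my-dense-index/_search
{
  "query": {
    "knn": { <1>
      "field": "content_vector",
      "query_vector": [0.12, -0.45, 0.91],
      "num_candidates": 100
    }
  }
}

GET my-sparse-index/_search
{
  "query": {
    "sparse_vector": { <2>
      "field": "content_embedding",
      "inference_id": "my-elser-endpoint",
      "query": "How do I avoid muscle soreness after running?"
    }
  }
}
```

1. A `knn` query over a `dense_vector` field, using a query embedding you computed yourself.
2. A `sparse_vector` query that asks an ELSER {{infer}} endpoint to expand the query text into weighted terms at search time.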
+ +## Implementation tutorials + +TODO + +- [](semantic-search/bring-own-vectors.md) +- [Sparse vector search with ELSER](vector/sparse-vector-elser.md) +- [Semantic search with a model deployed in {{es}}](semantic-search/semantic-search-deployed-nlp-model.md) +- [Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model](../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md) + +## Additional resources + +T \ No newline at end of file diff --git a/solutions/search/vector/sparse-vector-elser.md b/solutions/search/vector/sparse-vector-elser.md index e523f751da..55c7d4f3c5 100644 --- a/solutions/search/vector/sparse-vector-elser.md +++ b/solutions/search/vector/sparse-vector-elser.md @@ -1,5 +1,4 @@ --- -navigation_title: "Semantic search with ELSER" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-elser.html --- diff --git a/solutions/toc.yml b/solutions/toc.yml index a3c7e84f57..f3378e39a4 100644 --- a/solutions/toc.yml +++ b/solutions/toc.yml @@ -634,6 +634,7 @@ toc: - file: search/vector/dense-vector.md children: - file: search/vector/knn.md + - file: search/semantic-search/bring-own-vectors.md - file: search/vector/sparse-vector-elser.md - file: search/hybrid-search.md - file: search/semantic-search.md @@ -644,7 +645,6 @@ toc: - file: search/semantic-search/semantic-search-elser.md - file: search/semantic-search/cohere-es.md - file: search/semantic-search/semantic-search-deployed-nlp-model.md - - file: search/semantic-search/bring-own-vectors.md - file: search/ranking.md children: - file: search/ranking/semantic-reranking.md From d2da35d037ed865315a139cf7374c7395a0e5357 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Thu, 6 Feb 2025 18:28:44 +0100 Subject: [PATCH 02/30] Restore mapped pages --- solutions/search/semantic-search.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index 855fb90864..a41c667fcd 100644 --- a/solutions/search/semantic-search.md +++ b/solutions/search/semantic-search.md @@ -1,3 +1,9 @@ +--- +mapped_urls: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html + - https://www.elastic.co/guide/en/serverless/current/elasticsearch-reference-semantic-search.html +--- + # Semantic search [semantic-search] :::{note} From d650e451101d6327ffdacf000f13c52f768ab200 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Thu, 6 Feb 2025 18:32:09 +0100 Subject: [PATCH 03/30] Hoist up the John B sail --- raw-migrated-files/toc.yml | 1 - solutions/search/semantic-search.md | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index 200c9814b1..f03abb7bb4 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -282,7 +282,6 @@ toc: - file: docs-content/serverless/elasticsearch-knn-search.md - file: docs-content/serverless/elasticsearch-manage-project.md - file: docs-content/serverless/elasticsearch-playground.md - - file: docs-content/serverless/elasticsearch-reference-semantic-search.md - file: docs-content/serverless/elasticsearch-search-your-data-the-search-api.md - file: docs-content/serverless/elasticsearch-search-your-data.md - file: docs-content/serverless/endpoint-protection-rules.md diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index a41c667fcd..350db2c8ed 100644 --- a/solutions/search/semantic-search.md +++ 
b/solutions/search/semantic-search.md @@ -22,7 +22,7 @@ Semantic search has a wide range of use cases, including: - **Product discovery**: Help E-commerce users find relevant products - **Workplace search**: Help employees find relevant information within an organization -{{es}} provides various semantic search capabilities using [natural language processing (NLP)](../../raw-migrated-files/elasticsearch/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md). +{{es}} provides various semantic search capabilities using [natural language processing (NLP)](/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md). ## Overview of semantic search workflows [semantic-search-workflows-overview] From 3b2cf00b983bd856baf0dff5dba9d4f0636d9aa5 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Thu, 6 Feb 2025 18:34:23 +0100 Subject: [PATCH 04/30] Hello darkness --- raw-migrated-files/toc.yml | 1 - 1 file changed, 1 deletion(-) diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index f03abb7bb4..62abda3cb7 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -646,7 +646,6 @@ toc: - file: elasticsearch/elasticsearch-reference/security-files.md - file: elasticsearch/elasticsearch-reference/security-limitations.md - file: elasticsearch/elasticsearch-reference/semantic-search-inference.md - - file: elasticsearch/elasticsearch-reference/semantic-search.md - file: elasticsearch/elasticsearch-reference/setting-up-authentication.md - file: elasticsearch/elasticsearch-reference/setup.md - file: elasticsearch/elasticsearch-reference/shard-allocation-filtering.md From 5d86c7540a2187eaae0cb6285392da5db49ee7cd Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Thu, 6 Feb 2025 18:38:17 +0100 Subject: [PATCH 05/30] wheres me image --- solutions/search/semantic-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index 350db2c8ed..e95a1ac9a7 100644 --- a/solutions/search/semantic-search.md +++ b/solutions/search/semantic-search.md @@ -34,7 +34,7 @@ You have several options for using NLP models for semantic search in the {{stack This diagram summarizes the relative complexity of each workflow: -:::{image} ../../../images/elasticsearch-reference-semantic-options.svg +:::{image} /images/elasticsearch-reference-semantic-options.svg :alt: Overview of semantic search workflows in {{es}} ::: From 7e441a5aadfe1cd1046ddaf67f99a790f5f29e61 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Thu, 6 Feb 2025 18:44:20 +0100 Subject: [PATCH 06/30] Delete pesky straggler --- ...elasticsearch-reference-semantic-search.md | 43 ------------------- 1 file changed, 43 deletions(-) delete mode 100644 raw-migrated-files/docs-content/serverless/elasticsearch-reference-semantic-search.md diff --git a/raw-migrated-files/docs-content/serverless/elasticsearch-reference-semantic-search.md b/raw-migrated-files/docs-content/serverless/elasticsearch-reference-semantic-search.md deleted file mode 100644 index 8a68e727ec..0000000000 --- a/raw-migrated-files/docs-content/serverless/elasticsearch-reference-semantic-search.md +++ /dev/null @@ -1,43 +0,0 @@ -# Semantic search [elasticsearch-reference-semantic-search] - -Semantic search is a search method that helps you find data based on the intent and contextual meaning of a search query, instead of a match on query terms (lexical search). 
- -Elasticsearch provides various semantic search capabilities using natural language processing (NLP) and vector search. Using an NLP model enables you to extract text embeddings out of text. Embeddings are vectors that provide a numeric representation of a text. Pieces of content with similar meaning have similar representations. - -There are three main workflows for implementing semantic search with {{es}}, arranged in order of increasing complexity: - -* [The `semantic text` workflow](../../../solutions/search/semantic-search.md#elasticsearch-reference-semantic-search-semantic-text) -* [Inference API workflow](../../../solutions/search/semantic-search.md#elasticsearch-reference-semantic-search-inference-api) -* [Model deployment workflow](../../../solutions/search/semantic-search.md#elasticsearch-reference-semantic-search-model-deployment) - -:::{image} ../../../images/serverless-semantic-options.svg -:alt: Overview of semantic search workflows in {es} -::: - -::::{note} -Semantic search is available on all Elastic deployment types: self-managed clusters, Elastic Cloud Hosted deployments, and {{es-serverless}} projects. The links on this page will take you to the [{{es}} core documentation](../../../solutions/search/semantic-search.md). - -:::: - - - -## Semantic search with `semantic text` [elasticsearch-reference-semantic-search-semantic-text] - -The `semantic_text` field simplifies semantic search by providing inference at ingestion time with sensible default values, eliminating the need for complex configurations. - -Learn how to implement semantic search with `semantic text` in the [Elasticsearch docs →](../../../solutions/search/semantic-search/semantic-search-semantic-text.md). - - -## Semantic search with the inference API [elasticsearch-reference-semantic-search-inference-api] - -The inference API workflow enables you to perform semantic search using models from a variety of services, such as Cohere, OpenAI, HuggingFace, Azure AI Studio, and more. - -Learn how to implement semantic search with the inference API in the [Elasticsearch docs →](../../../solutions/search/inference-api.md). - - -## Semantic search with the model deployment workflow [elasticsearch-reference-semantic-search-model-deployment] - -The model deployment workflow enables you to deploy custom NLP models in Elasticsearch, giving you full control over text embedding generation and vector search. While this workflow offers advanced flexibility, it requires expertise in NLP and machine learning. - -Learn how to implement semantic search with the model deployment workflow in the [Elasticsearch docs →](../../../solutions/search/semantic-search/semantic-search-deployed-nlp-model.md). - From 8ca22c6af4316f057a686bc67bd50c1ee9125169 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 11:37:26 +0100 Subject: [PATCH 07/30] Reorganize AI search content under dedicated section 1. New structure - New ai-search dir & entry point - Moved vector.md under ai-search - Updated TOC hierarchy 2. Content streamlining - Simplified hybrid-search.md to overview - Trimmed vector.md to essentials - Removed redundant semantic-search.md content 3. 
TOC & links cleanup
   - Reorganized parent/child relationships
   - Fixed cross-references
   - Split implementation details to subpages
---
 solutions/search/ai-search/ai-search.md  |  40 +++++
 solutions/search/hybrid-search.md        | 204 +---------------
 solutions/search/hybrid-semantic-text.md | 203 ++++++++++++++++++++++
 solutions/search/retrievers-overview.md  |   9 +-
 solutions/search/search-approaches.md    |   2 +-
 solutions/search/semantic-search.md      |  24 +--
 solutions/search/vector.md               | 101 +++--------
 solutions/toc.yml                        |  30 ++--
 8 files changed, 300 insertions(+), 313 deletions(-)
 create mode 100644 solutions/search/ai-search/ai-search.md
 create mode 100644 solutions/search/hybrid-semantic-text.md

diff --git a/solutions/search/ai-search/ai-search.md b/solutions/search/ai-search/ai-search.md
new file mode 100644
index 0000000000..a977350d7e
--- /dev/null
+++ b/solutions/search/ai-search/ai-search.md
@@ -0,0 +1,40 @@
# AI-powered search

Sometimes [full-text search](../full-text.md) alone isn't enough. Machine learning techniques are powerful tools for helping users find data based on intent and contextual meaning. Natural language understanding and information retrieval go hand in hand in modern search systems.

Depending on your team's technical expertise and requirements, you can choose from two main paths to implement AI-powered search in Elasticsearch. You can use managed workflows that abstract away much of the complexity, or you can work directly with the underlying vector search technology.

## Use cases

AI-powered search enables a wide range of applications:
- Natural language search
- Retrieval Augmented Generation (RAG)
- Question answering systems
- Content recommendation engines
- Information retrieval from large datasets
- Product discovery in e-commerce
- Workplace document search
- Similar item matching

## Overview

AI-powered search in Elasticsearch is built on vector search technology, which uses machine learning models to capture meaning in content. These vector representations come in two forms: dense vectors that capture overall meaning, and sparse vectors that focus on key terms and their relationships.

:::{tip}
New to AI-powered search? Start with the `semantic_text` field type, which provides an easy-to-use abstraction over these capabilities with sensible defaults. [Learn more about semantic_text](../semantic-search/semantic-search-semantic-text.md).
:::

## Implementation paths

Elasticsearch uses vector search as the foundation for AI-powered search capabilities. You can work with this technology in two ways:

1. [**Semantic search**](../semantic-search.md) provides managed workflows that use vector search under the hood:
    - The `semantic_text` field type offers the simplest path with automatic embedding generation and model management
    - Additional implementation options available for more complex needs

2. [**Vector search**](../vector.md) gives you direct access to the underlying technology:
    - Manual configuration of dense or sparse vectors
    - Flexibility to bring your own embeddings
    - Direct implementation of vector similarity matching

Once you've implemented either approach, you can combine it with traditional [full-text search](../full-text.md) to create hybrid search solutions that leverage both meaning-based and keyword-based matching.
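To preview where this leads, hybrid search typically merges the two result sets with a technique such as reciprocal rank fusion (RRF). A minimal sketch, assuming an index that has both a `text` field and a `semantic_text` field (all names are placeholders):

```console
GET my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": { <1>
            "query": { "match": { "content": "example user query" } }
          }
        },
        {
          "standard": { <2>
            "query": { "semantic": { "field": "semantic_content", "query": "example user query" } }
          }
        }
      ]
    }
  }
}
```

1. The lexical (full-text) leg of the hybrid query.
2. The semantic leg, targeting a `semantic_text` field.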
\ No newline at end of file diff --git a/solutions/search/hybrid-search.md b/solutions/search/hybrid-search.md index 8e373ead9d..4be5bf78d1 100644 --- a/solutions/search/hybrid-search.md +++ b/solutions/search/hybrid-search.md @@ -1,203 +1,7 @@ ---- -navigation_title: "Hybrid search with `semantic_text`" -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text-hybrid-search.html ---- +# Hybrid search +Hybrid search combines traditional [full-text search](full-text.md) with [AI-powered search](ai-search/ai-search.md) for more powerful search experiences that serve a wider range of user needs. +The recommended way to use hybrid search in the Elastic Stack is following the `semantic_text` workflow. Check out the [hands-on tutorial](hybrid-search/hybrid-semantic-text.md) for a step-by-step guide. -# Hybrid search [semantic-text-hybrid-search] - - -This tutorial demonstrates how to perform hybrid search, combining semantic search with traditional full-text search. - -In hybrid search, semantic search retrieves results based on the meaning of the text, while full-text search focuses on exact word matches. By combining both methods, hybrid search delivers more relevant results, particularly in cases where relying on a single approach may not be sufficient. - -The recommended way to use hybrid search in the {{stack}} is following the `semantic_text` workflow. This tutorial uses the [`elasticsearch` service](inference-api/elasticsearch-inference-integration.md) for demonstration, but you can use any service and their supported models offered by the {{infer-cap}} API. - - -## Create an index mapping [hybrid-search-create-index-mapping] - -The destination index will contain both the embeddings for semantic search and the original text field for full-text search. This structure enables the combination of semantic search and full-text search. - -```console -PUT semantic-embeddings -{ - "mappings": { - "properties": { - "semantic_text": { <1> - "type": "semantic_text", - }, - "content": { <2> - "type": "text", - "copy_to": "semantic_text" <3> - } - } - } -} -``` - -1. The name of the field to contain the generated embeddings for semantic search. -2. The name of the field to contain the original text for lexical search. -3. The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {{infer}} endpoint. - - -::::{note} -If you want to run a search on indices that were populated by web crawlers or connectors, you have to [update the index mappings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html) for these indices to include the `semantic_text` field. Once the mapping is updated, you’ll need to run a full web crawl or a full connector sync. This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling hybrid search on the updated data. - -:::: - - - -## Load data [semantic-text-hybrid-load-data] - -In this step, you load the data that you later use to create embeddings from. - -Use the `msmarco-passagetest2019-top1000` data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a [tsv file](https://github.com/elastic/stack-docs/blob/main/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv). 
- -Download the file and upload it to your cluster using the [Data Visualizer](../../manage-data/ingest.md#upload-data-kibana) in the {{ml-app}} UI. After your data is analyzed, click **Override settings**. Under **Edit field names***, assign `id` to the first column and `content` to the second. Click ***Apply***, then ***Import**. Name the index `test-data`, and click **Import**. After the upload is complete, you will see an index named `test-data` with 182,469 documents. - - -## Reindex the data for hybrid search [hybrid-search-reindex-data] - -Reindex the data from the `test-data` index into the `semantic-embeddings` index. The data in the `content` field of the source index is copied into the `content` field of the destination index. The `copy_to` parameter set in the index mapping creation ensures that the content is copied into the `semantic_text` field. The data is processed by the {{infer}} endpoint at ingest time to generate embeddings. - -::::{note} -This step uses the reindex API to simulate data ingestion. If you are working with data that has already been indexed, rather than using the `test-data` set, reindexing is still required to ensure that the data is processed by the {{infer}} endpoint and the necessary embeddings are generated. - -:::: - - -```console -POST _reindex?wait_for_completion=false -{ - "source": { - "index": "test-data", - "size": 10 <1> - }, - "dest": { - "index": "semantic-embeddings" - } -} -``` - -1. The default batch size for reindexing is 1000. Reducing size to a smaller number makes the update of the reindexing process quicker which enables you to follow the progress closely and detect errors early. - - -The call returns a task ID to monitor the progress: - -```console -GET _tasks/ -``` - -Reindexing large datasets can take a long time. You can test this workflow using only a subset of the dataset. - -To cancel the reindexing process and generate embeddings for the subset that was reindexed: - -```console -POST _tasks//_cancel -``` - - -## Perform hybrid search [hybrid-search-perform-search] - -After reindexing the data into the `semantic-embeddings` index, you can perform hybrid search by using [reciprocal rank fusion (RRF)](https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html). RRF is a technique that merges the rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant. - -```console -GET semantic-embeddings/_search -{ - "retriever": { - "rrf": { - "retrievers": [ - { - "standard": { <1> - "query": { - "match": { - "content": "How to avoid muscle soreness while running?" <2> - } - } - } - }, - { - "standard": { <3> - "query": { - "semantic": { - "field": "semantic_text", <4> - "query": "How to avoid muscle soreness while running?" - } - } - } - } - ] - } - } -} -``` - -1. The first `standard` retriever represents the traditional lexical search. -2. Lexical search is performed on the `content` field using the specified phrase. -3. The second `standard` retriever refers to the semantic search. -4. The `semantic_text` field is used to perform the semantic search. - - -After performing the hybrid search, the query will return the top 10 documents that match both semantic and lexical search criteria. 
The results include detailed information about each document: - -```console-result -{ - "took": 107, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 473, - "relation": "eq" - }, - "max_score": null, - "hits": [ - { - "_index": "semantic-embeddings", - "_id": "wv65epIBEMBRnhfTsOFM", - "_score": 0.032786883, - "_rank": 1, - "_source": { - "semantic_text": { - "inference": { - "inference_id": "my-elser-endpoint", - "model_settings": { - "task_type": "sparse_embedding" - }, - "chunks": [ - { - "text": "What so many out there do not realize is the importance of what you do after you work out. You may have done the majority of the work, but how you treat your body in the minutes and hours after you exercise has a direct effect on muscle soreness, muscle strength and growth, and staying hydrated. Cool Down. After your last exercise, your workout is not over. The first thing you need to do is cool down. Even if running was all that you did, you still should do light cardio for a few minutes. This brings your heart rate down at a slow and steady pace, which helps you avoid feeling sick after a workout.", - "embeddings": { - "exercise": 1.571044, - "after": 1.3603843, - "sick": 1.3281639, - "cool": 1.3227621, - "muscle": 1.2645415, - "sore": 1.2561599, - "cooling": 1.2335974, - "running": 1.1750668, - "hours": 1.1104802, - "out": 1.0991782, - "##io": 1.0794281, - "last": 1.0474665, - (...) - } - } - ] - } - }, - "id": 8408852, - "content": "What so many out there do not realize is the importance of (...)" - } - } - ] - } -} -``` +We recommend implementing hybrid search with the [reciprocal rank fusion (RRF)](https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html) algorithm. This approach merges rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant. \ No newline at end of file diff --git a/solutions/search/hybrid-semantic-text.md b/solutions/search/hybrid-semantic-text.md new file mode 100644 index 0000000000..8e373ead9d --- /dev/null +++ b/solutions/search/hybrid-semantic-text.md @@ -0,0 +1,203 @@ +--- +navigation_title: "Hybrid search with `semantic_text`" +mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text-hybrid-search.html +--- + + + +# Hybrid search [semantic-text-hybrid-search] + + +This tutorial demonstrates how to perform hybrid search, combining semantic search with traditional full-text search. + +In hybrid search, semantic search retrieves results based on the meaning of the text, while full-text search focuses on exact word matches. By combining both methods, hybrid search delivers more relevant results, particularly in cases where relying on a single approach may not be sufficient. + +The recommended way to use hybrid search in the {{stack}} is following the `semantic_text` workflow. This tutorial uses the [`elasticsearch` service](inference-api/elasticsearch-inference-integration.md) for demonstration, but you can use any service and their supported models offered by the {{infer-cap}} API. + + +## Create an index mapping [hybrid-search-create-index-mapping] + +The destination index will contain both the embeddings for semantic search and the original text field for full-text search. This structure enables the combination of semantic search and full-text search. 
```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "semantic_text": { <1>
        "type": "semantic_text"
      },
      "content": { <2>
        "type": "text",
        "copy_to": "semantic_text" <3>
      }
    }
  }
}
```

1. The name of the field to contain the generated embeddings for semantic search.
2. The name of the field to contain the original text for lexical search.
3. The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {{infer}} endpoint.


::::{note}
If you want to run a search on indices that were populated by web crawlers or connectors, you have to [update the index mappings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html) for these indices to include the `semantic_text` field. Once the mapping is updated, you’ll need to run a full web crawl or a full connector sync. This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling hybrid search on the updated data.

::::


## Load data [semantic-text-hybrid-load-data]

In this step, you load the data that you later use to create embeddings from.

Use the `msmarco-passagetest2019-top1000` data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a [tsv file](https://github.com/elastic/stack-docs/blob/main/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv).

Download the file and upload it to your cluster using the [Data Visualizer](../../manage-data/ingest.md#upload-data-kibana) in the {{ml-app}} UI. After your data is analyzed, click **Override settings**. Under **Edit field names**, assign `id` to the first column and `content` to the second. Click **Apply**, then **Import**. Name the index `test-data`, and click **Import**. After the upload is complete, you will see an index named `test-data` with 182,469 documents.


## Reindex the data for hybrid search [hybrid-search-reindex-data]

Reindex the data from the `test-data` index into the `semantic-embeddings` index. The data in the `content` field of the source index is copied into the `content` field of the destination index. The `copy_to` parameter set in the index mapping creation ensures that the content is copied into the `semantic_text` field. The data is processed by the {{infer}} endpoint at ingest time to generate embeddings.

::::{note}
This step uses the reindex API to simulate data ingestion. If you are working with data that has already been indexed, rather than using the `test-data` set, reindexing is still required to ensure that the data is processed by the {{infer}} endpoint and the necessary embeddings are generated.

::::


```console
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 10 <1>
  },
  "dest": {
    "index": "semantic-embeddings"
  }
}
```

1. The default batch size for reindexing is 1000. Reducing `size` makes each batch smaller, so the reindexing task reports progress more frequently, which lets you follow it closely and detect errors early.


The call returns a task ID to monitor the progress:

```console
GET _tasks/
```

Reindexing large datasets can take a long time. You can test this workflow using only a subset of the dataset.
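While the task runs, you can also spot-check how many documents have been reindexed and embedded so far (a simple sanity check against the tutorial's destination index):

```console
GET semantic-embeddings/_count
```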
+ +To cancel the reindexing process and generate embeddings for the subset that was reindexed: + +```console +POST _tasks//_cancel +``` + + +## Perform hybrid search [hybrid-search-perform-search] + +After reindexing the data into the `semantic-embeddings` index, you can perform hybrid search by using [reciprocal rank fusion (RRF)](https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html). RRF is a technique that merges the rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant. + +```console +GET semantic-embeddings/_search +{ + "retriever": { + "rrf": { + "retrievers": [ + { + "standard": { <1> + "query": { + "match": { + "content": "How to avoid muscle soreness while running?" <2> + } + } + } + }, + { + "standard": { <3> + "query": { + "semantic": { + "field": "semantic_text", <4> + "query": "How to avoid muscle soreness while running?" + } + } + } + } + ] + } + } +} +``` + +1. The first `standard` retriever represents the traditional lexical search. +2. Lexical search is performed on the `content` field using the specified phrase. +3. The second `standard` retriever refers to the semantic search. +4. The `semantic_text` field is used to perform the semantic search. + + +After performing the hybrid search, the query will return the top 10 documents that match both semantic and lexical search criteria. The results include detailed information about each document: + +```console-result +{ + "took": 107, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 473, + "relation": "eq" + }, + "max_score": null, + "hits": [ + { + "_index": "semantic-embeddings", + "_id": "wv65epIBEMBRnhfTsOFM", + "_score": 0.032786883, + "_rank": 1, + "_source": { + "semantic_text": { + "inference": { + "inference_id": "my-elser-endpoint", + "model_settings": { + "task_type": "sparse_embedding" + }, + "chunks": [ + { + "text": "What so many out there do not realize is the importance of what you do after you work out. You may have done the majority of the work, but how you treat your body in the minutes and hours after you exercise has a direct effect on muscle soreness, muscle strength and growth, and staying hydrated. Cool Down. After your last exercise, your workout is not over. The first thing you need to do is cool down. Even if running was all that you did, you still should do light cardio for a few minutes. This brings your heart rate down at a slow and steady pace, which helps you avoid feeling sick after a workout.", + "embeddings": { + "exercise": 1.571044, + "after": 1.3603843, + "sick": 1.3281639, + "cool": 1.3227621, + "muscle": 1.2645415, + "sore": 1.2561599, + "cooling": 1.2335974, + "running": 1.1750668, + "hours": 1.1104802, + "out": 1.0991782, + "##io": 1.0794281, + "last": 1.0474665, + (...) + } + } + ] + } + }, + "id": 8408852, + "content": "What so many out there do not realize is the importance of (...)" + } + } + ] + } +} +``` diff --git a/solutions/search/retrievers-overview.md b/solutions/search/retrievers-overview.md index 7f4fc7cf69..93dad4ff1a 100644 --- a/solutions/search/retrievers-overview.md +++ b/solutions/search/retrievers-overview.md @@ -4,6 +4,9 @@ A retriever is an abstraction that was added to the Search API in **8.14.0** and This document provides a general overview of the retriever abstraction. 
For implementation details, including notable restrictions, check out the [reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/retriever.html) in the `_search` API docs. +::::{tip} +Prefer to start with some sample code? Check out [](retrievers-examples.md) for a collection of retriever examples. +:::: ## Retriever types [retrievers-overview-types] @@ -71,7 +74,7 @@ This example demonstrates how you can combine different retrieval strategies int Compare to `RRF` with `sub_searches` approach (which is deprecated as of 8.16.0): -::::{dropdown} **Expand** for example +::::{dropdown} Expand for example ```js GET example-index/_search { @@ -101,11 +104,9 @@ GET example-index/_search } } ``` - :::: - -For more examples on how to use retrievers, please refer to [retriever examples](https://www.elastic.co/guide/en/elasticsearch/reference/current/retrievers-examples.html). +For more examples, refer to [retriever examples](retrievers-examples.md). ## Glossary [retrievers-overview-glossary] diff --git a/solutions/search/search-approaches.md b/solutions/search/search-approaches.md index 849efea1fa..43c850cf31 100644 --- a/solutions/search/search-approaches.md +++ b/solutions/search/search-approaches.md @@ -9,7 +9,7 @@ The following table provides an overview of the fundamental search techniques av | [**Full-text search**](full-text.md) | Traditional lexical search with analyzers and relevance tuning | Essential foundation for keyword matching, works out of the box | | [**Vector search**](vector.md) | Similarity search using numerical vectors | Requires extra setup/resources, ideal for finding similar documents | | [**Semantic search**](semantic-search.md) | Meaning-based search using natural language understanding | Requires ML models and vector infrastructure | -| [**Hybrid search**](hybrid-search.md) | Combines lexical and vector/semantic approaches | Best balance for both keyword precision and semantic relevance | +| [**Hybrid search**](hybrid-semantic-text.md) | Combines lexical and vector/semantic approaches | Best balance for both keyword precision and semantic relevance | | [**Re-ranking**](ranking/semantic-reranking.md) | Post-processing results to improve relevance | Optional ML-based enhancement for fine-tuned relevance | | [**Geospatial search**](/explore-analyze/geospatial-analysis.md) | Location-based search and spatial relationships | For maps, distance calculations, and shape queries | diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index e95a1ac9a7..da2f0bd066 100644 --- a/solutions/search/semantic-search.md +++ b/solutions/search/semantic-search.md @@ -10,27 +10,17 @@ mapped_urls: This page focuses on the semantic search workflows available in {{es}}. For detailed information about vector search implementations, refer to [vector search](vector.md). ::: -Sometimes [full-text search](full-text.md) isn't enough. - -Semantic search techniques help users find data based on intent and contextual meaning, going beyond traditional keyword matching. 
- -Semantic search has a wide range of use cases, including: - -- **Question answering**: Find the most relevant answers to user questions -- **Recommendation systems**: Suggest similar items based on user preferences -- **Information retrieval**: Retrieve relevant information from large datasets -- **Product discovery**: Help E-commerce users find relevant products -- **Workplace search**: Help employees find relevant information within an organization - {{es}} provides various semantic search capabilities using [natural language processing (NLP)](/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md). +Learn more about use cases for AI-powered search in the [overview](ai-search/ai-search.md) page. + ## Overview of semantic search workflows [semantic-search-workflows-overview] You have several options for using NLP models for semantic search in the {{stack}}: -* Option 1: Use the `semantic_text` workflow (recommended) -* Option 2: Use the {{infer}} API workflow -* Option 3: Deploy models directly in {{es}} +* [Option 1](#_semantic_text_workflow): Use the `semantic_text` workflow (recommended) +* [Option 2](#_infer_api_workflow): Use the {{infer}} API workflow +* [Option 3](#_model_deployment_workflow): Deploy models directly in {{es}} This diagram summarizes the relative complexity of each workflow: @@ -49,7 +39,7 @@ For an end-to-end tutorial, refer to [Semantic search with `semantic_text`](sema ### Option 2: Inference API [_infer_api_workflow] -The [{{infer}} API workflow](inference-api.md) is more complex but offers greater control over the {{infer}} endpoint configuration. You need to create an {{infer}} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {{infer}} ingest pipeline with the appropriate settings. +The {{infer}} API workflow is more complex but offers greater control over the {{infer}} endpoint configuration. You need to create an {{infer}} endpoint, provide various model-related settings and parameters, define an index mapping, and set up an {{infer}} ingest pipeline with the appropriate settings. For an end-to-end tutorial, refer to [Semantic search with the {{infer}} API](inference-api.md). @@ -61,7 +51,7 @@ You can also deploy NLP in {{es}} manually, without using an {{infer}} endpoint. For an end-to-end tutorial, refer to [Semantic search with a model deployed in {{es}}](semantic-search/semantic-search-deployed-nlp-model.md). ::::{tip} -Refer to [vector queries and field types](vector.md#choosing-vector-query) for a quick reference overview. +Refer to [vector queries and field types](vector.md#vector-queries-and-field-types) for a quick reference overview. :::: ## Learn more [semantic-search-read-more] diff --git a/solutions/search/vector.md b/solutions/search/vector.md index 4b1f1a16d5..68124d90e3 100644 --- a/solutions/search/vector.md +++ b/solutions/search/vector.md @@ -4,95 +4,40 @@ Looking for a minimal configuration approach? The `semantic_text` field type provides an abstraction over these vector search implementations with sensible defaults and automatic model management. It's the recommended approach for most users. [Learn more about semantic_text](semantic-search/semantic-search-semantic-text.md). ::: -Elasticsearch's vector search capabilities enable finding content based on meaning and similarity, instead of keyword or exact term matches. Vector search is an important component of most modern [semantic search](semantic-search.md) implementations. 
-Vector search can also be used independently for various similarity matching use cases.
+Vector embeddings are a core technology in modern search, enabling models to learn and represent complex relationships in input data. When your content is vectorized, Elasticsearch can help users find content based on meaning and similarity, instead of just keyword or exact term matches.
 
-This guide explores the more manual technical implementation of vector search approaches, that **do not** use the `semantic_text` workflow.
+Vector search is an important component of most modern [semantic search](semantic-search.md) implementations. It can also be used independently for various similarity matching use cases. Learn more about use cases for AI-powered search in the [overview](ai-search/ai-search.md) page.
 
-## Vector queries and field types [choosing-vector-query]
+This guide explores the more manual technical implementation of vector search approaches that do not use the higher-level `semantic_text` workflow.
 
-Which query you use and which field you target in your queries depends on your chosen workflow. If you’re using the `semantic_text` workflow it’s quite simple. If not, it depends on which type of embeddings you’re working with.
+Which approach you use depends on your specific requirements and use case.
 
-If you want {{es}} to generate embeddings at both index and query time, use the `semantic_text` field and the `semantic` query. If you want to bring your own embeddings, use the `sparse_vector` or `dense_vector` field type and the associated query depending on the NLP model you used to generate the embeddings.
+## Vector queries and field types
 
-| Vector type | Field type | Query type | Use case |
-| ----------- | --------------- | --------------- | -------------------------------------------------- |
-| Dense | `dense_vector` | `knn` | Semantic similarity matching via neural embeddings |
-| Sparse | `sparse_vector` | `sparse_vector` | Enhanced text search with semantic term expansion |
-|Dense or sparse | [`semantic_text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html) | [`semantic`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-semantic-query.html) | The `semantic_text` field handles generating embeddings for you at index time and query time. |
-
-## Dense vector vs. sparse vector
+Here's a quick reference overview of vector search field types and queries available in Elasticsearch:
 
-### Dense vector
+| Vector type | Field type | Query type | Primary use case |
+| ----------- | --------------- | --------------- | -------------------------------------------------- |
+| Dense | `dense_vector` | `knn` | Semantic similarity via neural embeddings |
+| Sparse | `sparse_vector` | `sparse_vector` | Semantic term expansion with ELSER |
+| Sparse or dense | `semantic_text` | `semantic` | Managed semantic search that is agnostic to implementation details |
 
-Dense neural embeddings capture semantic meaning by translating content into a vector space. Fixed-length vectors of floating-point numbers represent the content's meaning, with similar content mapped to nearby points in the vector space.
-When you need to find related items, these vectors work with distance metrics to identify semantic similarity. This is ideal for when you want to capture "what this content is about" rather than just what words it contains.
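+To make the table concrete, here is roughly what the two bring-your-own-embeddings queries look like. This is an illustrative sketch only: the index name (`my-index`), the field names (`my-dense-vector`, `my-sparse-vector`), and the inference endpoint (`my-elser-endpoint`) are hypothetical placeholders, not objects defined elsewhere in these docs.
+
+```console
+GET my-index/_search
+{
+  "query": {
+    "knn": {
+      "field": "my-dense-vector",
+      "query_vector": [0.12, -0.45, 0.91],
+      "num_candidates": 50
+    }
+  }
+}
+
+GET my-index/_search
+{
+  "query": {
+    "sparse_vector": {
+      "field": "my-sparse-vector",
+      "inference_id": "my-elser-endpoint",
+      "query": "how to avoid muscle soreness while running"
+    }
+  }
+}
+```
+
+The sections below link to the detailed implementation guides for each approach.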
+## Dense vector search
 
-Dense vectors are well-suited for:
-- Finding semantically similar content ("more like this")
+Dense neural embeddings capture semantic meaning by translating content into fixed-length vectors of floating-point numbers. Similar content maps to nearby points in the vector space, making them ideal for:
+- Finding semantically similar content
 - Matching questions with answers
 - Image similarity search
-- Recommendations based on content similarity
-
-To implement dense vector search, you'll need to:
-1. Generate document embeddings using a neural model (locally or in {{es}})
-2. Configure your vector similarity settings
-3. Generate query embeddings at search time
-4. Use the `knn` query for searches
-
-[Learn more about dense vector search implementation](vector/dense-vector.md)
-
-### Sparse vector (ELSER)
-
-While dense vectors capture overall meaning, sparse vectors excel at intelligent vocabulary expansion. The model enriches each key concept with closely related terms and ideas, creating semantic connections that go beyond simple word-level substitutions.
-
-Using [Elastic's learned sparse encoder model (ELSER)](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md), both queries and documents get expanded into weighted terms. The result is powerful search that is also transparent - you can see exactly why matches occur. This contrasts with the "black-box" nature of dense vector matching, where similarity is based on abstract vector distances.
-
-Sparse vectors are ideal for:
-
-- Enhanced keyword search with semantic understanding
-- Cases where you need explainable search results
-- Scenarios requiring precise term matching with semantic awareness
-- Domain-specific search where vocabulary relationships matter
-
-To implement sparse vector search with ELSER manually, you'll need to:
-1. Configure ELSER to embed your content
-2. Configure ELSER to embed your queries
-3. Use the `sparse_vector` query for searches
-
-[Learn more about implementing sparse vector search](vector/sparse-vector-elser.md)
-
-### Key considerations
-
-If you've chosen to implement vector search manually rather than using the guided `semantic_text` workflow, consider:
-
-1. **Data characteristics**
-   - **Text length:** Dense vectors work best with shorter texts like product descriptions or chat messages. For longer content like articles or documentation, sparse vectors handle length better.
-   - **Domain specificity:** For specialized content (medical literature, legal documents, technical documentation), sparse vectors preserve domain-specific terminology. Dense vectors can miss nuanced technical terms.
-   - **Update frequency:** If your content changes frequently, dense vectors require recomputing embeddings for each update. Sparse vectors handle incremental updates more efficiently.
-
-2. **Performance requirements**
-   - **Query latency:** Dense vectors with HNSW offer fast search times for moderate-sized datasets. Sparse vectors have higher but consistent latency across dataset sizes.
-   - **Memory footprint:** Dense vectors' size is determined by their chosen dimensionality, while sparse vectors' size varies with content complexity.
-   - **Resource needs:** Dense vectors benefit significantly from GPU acceleration. Sparse vectors perform well on CPU-only setups.
-
-3. **Explainability needs**
-   - **Transparency is important:** For transparency and relevance debugging, sparse vectors show exactly which terms contributed.
- - **Transparency isn't important:** For recommendations and "more like this" features, dense vectors work well. - -4. **Implementation complexity** - - **Dense vector setup:** More complex due to wide range of embedding models to choose from, configure, and manage. - - **Sparse vector setup:** Simpler since ELSER is Elastic's standard sparse encoding model and is available out-of-the-box. - -## Implementation tutorials +- Content-based recommendations -TODO +[Learn more about dense vector search](vector/dense-vector.md). -- [](semantic-search/bring-own-vectors.md) -- [Sparse vector search with ELSER](vector/sparse-vector-elser.md) -- [Semantic search with a model deployed in {{es}}](semantic-search/semantic-search-deployed-nlp-model.md) -- [Semantic search with the msmarco-MiniLM-L-12-v3 sentence-transformer model](../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md) +## Sparse vector search -## Additional resources +Sparse vectors use ELSER to expand content with semantically related terms. This approach preserves explainability while adding semantic understanding, making it well-suited for: +- Enhanced keyword search +- Cases requiring explainable results +- Domain-specific search +- Large-scale deployments -T \ No newline at end of file +[Learn more about sparse vector search with ELSER](vector/sparse-vector-elser.md). \ No newline at end of file diff --git a/solutions/toc.yml b/solutions/toc.yml index f3378e39a4..1dd51d4a3f 100644 --- a/solutions/toc.yml +++ b/solutions/toc.yml @@ -629,22 +629,26 @@ toc: children: - file: search/full-text/search-with-synonyms.md - file: search/full-text/text-analysis-during-search.md - - file: search/vector.md + - file: search/ai-search/ai-search.md children: - - file: search/vector/dense-vector.md + - file: search/vector.md children: - - file: search/vector/knn.md - - file: search/semantic-search/bring-own-vectors.md - - file: search/vector/sparse-vector-elser.md + - file: search/vector/dense-vector.md + children: + - file: search/vector/knn.md + - file: search/semantic-search/bring-own-vectors.md + - file: search/vector/sparse-vector-elser.md + - file: search/semantic-search.md + children: + - file: search/semantic-search/semantic-search-semantic-text.md + - file: search/semantic-search/semantic-text-hybrid-search.md + - file: search/semantic-search/semantic-search-inference.md + - file: search/semantic-search/semantic-search-elser.md + - file: search/semantic-search/cohere-es.md + - file: search/semantic-search/semantic-search-deployed-nlp-model.md - file: search/hybrid-search.md - - file: search/semantic-search.md - children: - - file: search/semantic-search/semantic-search-semantic-text.md - - file: search/semantic-search/semantic-text-hybrid-search.md - - file: search/semantic-search/semantic-search-inference.md - - file: search/semantic-search/semantic-search-elser.md - - file: search/semantic-search/cohere-es.md - - file: search/semantic-search/semantic-search-deployed-nlp-model.md + children: + - file: search/hybrid-semantic-text.md - file: search/ranking.md children: - file: search/ranking/semantic-reranking.md From 21c3ac0512dd1e57a36be643b6943bcf1687bf41 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 11:40:54 +0100 Subject: [PATCH 08/30] Fix link path --- solutions/search/hybrid-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solutions/search/hybrid-search.md b/solutions/search/hybrid-search.md index 4be5bf78d1..900254fe72 100644 --- 
a/solutions/search/hybrid-search.md +++ b/solutions/search/hybrid-search.md @@ -2,6 +2,6 @@ Hybrid search combines traditional [full-text search](full-text.md) with [AI-powered search](ai-search/ai-search.md) for more powerful search experiences that serve a wider range of user needs. -The recommended way to use hybrid search in the Elastic Stack is following the `semantic_text` workflow. Check out the [hands-on tutorial](hybrid-search/hybrid-semantic-text.md) for a step-by-step guide. +The recommended way to use hybrid search in the Elastic Stack is following the `semantic_text` workflow. Check out the [hands-on tutorial](hybrid-semantic-text.md) for a step-by-step guide. We recommend implementing hybrid search with the [reciprocal rank fusion (RRF)](https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html) algorithm. This approach merges rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant. \ No newline at end of file From a3db1b0d9bae00fe45c161305abe5756b36677a7 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 11:46:21 +0100 Subject: [PATCH 09/30] idem --- solutions/search/ai-search/ai-search.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/solutions/search/ai-search/ai-search.md b/solutions/search/ai-search/ai-search.md index a977350d7e..5d3336bb34 100644 --- a/solutions/search/ai-search/ai-search.md +++ b/solutions/search/ai-search/ai-search.md @@ -1,6 +1,6 @@ # AI-powered search -Sometimes [full-text search](full-text.md) alone isn't enough. Machine learning techniques are powerful tools for helping users find data based on intent and contextual meaning. Natural language understanding and information retrieval go hand in hand in modern search systems. +Sometimes [full-text search](../full-text.md) alone isn't enough. Machine learning techniques are powerful tools for helping users find data based on intent and contextual meaning. Natural language understanding and information retrieval go hand in hand in modern search systems. Depending on your team's technical expertise and requirements, you can choose from two main paths to implement AI-powered search in Elasticsearch. You can use managed workflows that abstract away much of the complexity, or you can work directly with the underlying vector search technology. @@ -21,18 +21,18 @@ AI-powered search enables a wide range of applications: AI-powered search in Elasticsearch is built on vector search technology, which uses machine learning models to capture meaning in content. These vector representations come in two forms: dense vectors that capture overall meaning, and sparse vectors that focus on key terms and their relationships. :::{tip} -New to AI-powered search? Start with the `semantic_text` field type, which provides an easy-to-use abstraction over these capabilities with sensible defaults. [Learn more about semantic_text](semantic-search/semantic-search-semantic-text.md). +New to AI-powered search? Start with the `semantic_text` workflow, which provides an easy-to-use abstraction over these capabilities with sensible defaults. Learn more [in this hands-on tutorial](../semantic-search/semantic-search-semantic-text.md). ::: ## Implementation paths Elasticsearch uses vector search as the foundation for AI-powered search capabilities. You can work with this technology in two ways: -1. 
[**Semantic search**](semantic-search.md) provides managed workflows that use vector search under the hood: +1. [**Semantic search**](../semantic-search.md) provides managed workflows that use vector search under the hood: - The `semantic_text` field type offers the simplest path with automatic embedding generation and model management - Additional implementation options available for more complex needs -2. [**Vector search**](vector.md) gives you direct access to the underlying technology: +2. [**Vector search**](../vector.md) gives you direct access to the underlying technology: - Manual configuration of dense or sparse vectors - Flexibility to bring your own embeddings - Direct implementation of vector similarity matching From 0e157ca26498c5a5cb6d220d0b46a1cbfac38537 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 11:53:23 +0100 Subject: [PATCH 10/30] reorder intro sentences for mellifluousness --- solutions/search/ai-search/ai-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solutions/search/ai-search/ai-search.md b/solutions/search/ai-search/ai-search.md index 5d3336bb34..c80bca6c94 100644 --- a/solutions/search/ai-search/ai-search.md +++ b/solutions/search/ai-search/ai-search.md @@ -1,6 +1,6 @@ # AI-powered search -Sometimes [full-text search](../full-text.md) alone isn't enough. Machine learning techniques are powerful tools for helping users find data based on intent and contextual meaning. Natural language understanding and information retrieval go hand in hand in modern search systems. +Natural language understanding and information retrieval go hand in hand in modern search systems. Sometimes [full-text search](../full-text.md) alone isn't enough. Machine learning techniques are powerful tools for helping users find data based on intent and contextual meaning. Depending on your team's technical expertise and requirements, you can choose from two main paths to implement AI-powered search in Elasticsearch. You can use managed workflows that abstract away much of the complexity, or you can work directly with the underlying vector search technology. From f81ca87699a62214ec895c225883e764629ac1bb Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 11:55:08 +0100 Subject: [PATCH 11/30] linkswap --- solutions/search/ai-search/ai-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solutions/search/ai-search/ai-search.md b/solutions/search/ai-search/ai-search.md index c80bca6c94..a002bca2f6 100644 --- a/solutions/search/ai-search/ai-search.md +++ b/solutions/search/ai-search/ai-search.md @@ -37,4 +37,4 @@ Elasticsearch uses vector search as the foundation for AI-powered search capabil - Flexibility to bring your own embeddings - Direct implementation of vector similarity matching -Once you've implemented either approach, you can combine it with traditional [full-text search](../full-text.md) to create hybrid search solutions that leverage both meaning-based and keyword-based matching. \ No newline at end of file +Once you've implemented either approach, you can combine it with full-text search to create [hybrid search](../hybrid-search.md) solutions that leverage both meaning-based and keyword-based matching. 
\ No newline at end of file From 37c3caedae55bce6058a1b6967d052be0765a760 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 14:01:54 +0100 Subject: [PATCH 12/30] Move/dust up knn --- .../serverless/elasticsearch-knn-search.md | 18 - raw-migrated-files/toc.yml | 1 - solutions/search/api-quickstarts.md | 2 +- .../bring-own-vectors.md | 5 +- solutions/search/vector/knn.md | 1114 ++++++++++++++++- solutions/toc.yml | 2 +- 6 files changed, 1108 insertions(+), 34 deletions(-) delete mode 100644 raw-migrated-files/docs-content/serverless/elasticsearch-knn-search.md rename solutions/search/{semantic-search => vector}/bring-own-vectors.md (97%) diff --git a/raw-migrated-files/docs-content/serverless/elasticsearch-knn-search.md b/raw-migrated-files/docs-content/serverless/elasticsearch-knn-search.md deleted file mode 100644 index c3d23b859c..0000000000 --- a/raw-migrated-files/docs-content/serverless/elasticsearch-knn-search.md +++ /dev/null @@ -1,18 +0,0 @@ -# k-nearest neighbor (kNN) search [elasticsearch-knn-search] - -A *k-nearest neighbor* (kNN) search finds the *k* nearest vectors to a query vector, as measured by a similarity metric. - -Common use cases for kNN include: - -* Relevance ranking based on natural language processing (NLP) algorithms -* Product recommendations and recommendation engines -* Similarity search for images or videos - -Learn more in the [{{es}} core documentation](../../../solutions/search/vector/knn.md). - -::::{tip} -Check out our [hands-on tutorial](../../../solutions/search/semantic-search/bring-own-vectors.md) to learn how to ingest dense vector embeddings into Elasticsearch. - -:::: - - diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index 62abda3cb7..110bc49660 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -279,7 +279,6 @@ toc: - file: docs-content/serverless/elasticsearch-ingest-data-file-upload.md - file: docs-content/serverless/elasticsearch-ingest-data-through-api.md - file: docs-content/serverless/elasticsearch-ingest-your-data.md - - file: docs-content/serverless/elasticsearch-knn-search.md - file: docs-content/serverless/elasticsearch-manage-project.md - file: docs-content/serverless/elasticsearch-playground.md - file: docs-content/serverless/elasticsearch-search-your-data-the-search-api.md diff --git a/solutions/search/api-quickstarts.md b/solutions/search/api-quickstarts.md index cdb431af9d..95df931323 100644 --- a/solutions/search/api-quickstarts.md +++ b/solutions/search/api-quickstarts.md @@ -8,7 +8,7 @@ Use the following quickstarts to get hands-on experience with Elasticsearch APIs % - [Getting started with ES|QL](esql-getting-started.md): Learn how to query and aggregate your data using ES|QL. - [Semantic search](semantic-search/semantic-search-semantic-text.md): Learn how to create embeddings for your data with `semantic_text` and query using the `semantic` query. - [Hybrid search](semantic-search/semantic-text-hybrid-search.md): Learn how to combine semantic search using`semantic_text` with full-text search. -- [Bring your own dense vector embeddings](semantic-search/bring-own-vectors.md): Learn how to ingest dense vector embeddings into Elasticsearch. +- [Bring your own dense vector embeddings](vector/bring-own-vectors.md): Learn how to ingest dense vector embeddings into Elasticsearch. :::{tip} To run the quickstarts, you need a running Elasticsearch cluster. 
Use [`start-local`](https://github.com/elastic/start-local) to set up a fast local dev environment in Docker, together with Kibana. Run the following command in your terminal: diff --git a/solutions/search/semantic-search/bring-own-vectors.md b/solutions/search/vector/bring-own-vectors.md similarity index 97% rename from solutions/search/semantic-search/bring-own-vectors.md rename to solutions/search/vector/bring-own-vectors.md index bea1464103..ecac5f1932 100644 --- a/solutions/search/semantic-search/bring-own-vectors.md +++ b/solutions/search/vector/bring-own-vectors.md @@ -4,8 +4,6 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/bring-your-own-vectors.html --- - - # Bring your own dense vectors [bring-your-own-vectors] @@ -19,7 +17,6 @@ This is an advanced use case. Refer to [Semantic search](../semantic-search.md) :::: - ## Step 1: Create an index with `dense_vector` mapping [bring-your-own-vectors-create-index] Each document in our simple dataset will have: @@ -129,7 +126,7 @@ In this simple example, we’re sending a raw vector for the query text. In a re For this you’ll need to deploy a text embedding model in {{es}} and use the [`query_vector_builder` parameter](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html#knn-query-top-level-parameters). Alternatively, you can generate vectors client-side and send them directly with the search request. -Learn how to [use a deployed text embedding model](semantic-search-deployed-nlp-model.md) for semantic search. +Learn how to [use a deployed text embedding model](../semantic-search/semantic-search-deployed-nlp-model.md) for semantic search. ::::{tip} If you’re just getting started with vector search in {{es}}, refer to [Semantic search](../semantic-search.md). diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md index bdfa14ae10..09eae8ddcf 100644 --- a/solutions/search/vector/knn.md +++ b/solutions/search/vector/knn.md @@ -4,19 +4,1115 @@ mapped_urls: - https://www.elastic.co/guide/en/serverless/current/elasticsearch-knn-search.html --- -# KNN +# kNN search [knn-search] -% What needs to be done: Refine -% Use migrated content from existing pages that map to this page: +A *k-nearest neighbor* (kNN) search finds the *k* nearest vectors to a query vector, as measured by a similarity metric. -% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/knn-search.md -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-knn-search.md +Common use cases for kNN include: -% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc): +* Relevance ranking based on natural language processing (NLP) algorithms +* Product recommendations and recommendation engines +* Similarity search for images or videos -$$$approximate-knn$$$ -$$$knn-semantic-search$$$ +## Prerequisites [knn-prereqs] -$$$exact-knn$$$ \ No newline at end of file +* To run a kNN search, your data must be transformed into vectors. You can [use an NLP model in {{es}}](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md), or generate them outside {{es}}. + - Dense vectors need to use the [`dense_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html) field type. + - Queries are represented as vectors with the same dimension. You should use the same model to generate the query vector as you used to generate the document vectors. 
+ - If you already have vectors, refer to the [Bring your own dense vectors](bring-own-vectors.md) guide. + +* To complete the steps in this guide, you must have the following [index privileges](../../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-indices): + + * `create_index` or `manage` to create an index with a `dense_vector` field + * `create`, `index`, or `write` to add data to the index you created + * `read` to search the index + + + +## kNN methods [knn-methods] + +{{es}} supports two methods for kNN search: + +* [Approximate kNN](../../../solutions/search/vector/knn.md#approximate-knn) using the `knn` search option or `knn` query +* [Exact, brute-force kNN](../../../solutions/search/vector/knn.md#exact-knn) using a `script_score` query with a vector function + +In most cases, you’ll want to use approximate kNN. Approximate kNN offers lower latency at the cost of slower indexing and imperfect accuracy. + +Exact, brute-force kNN guarantees accurate results but doesn’t scale well with large datasets. With this approach, a `script_score` query must scan each matching document to compute the vector function, which can result in slow search speeds. However, you can improve latency by using a [query](../../../explore-analyze/query-filter/languages/querydsl.md) to limit the number of matching documents passed to the function. If you filter your data to a small subset of documents, you can get good search performance using this approach. + + +## Approximate kNN [approximate-knn] + +::::{warning} +Compared to other types of search, approximate kNN search has specific resource requirements. In particular, all vector data must fit in the node’s page cache for it to be efficient. Please consult the [approximate kNN search tuning guide](../../../solutions/search/vector/knn.md) for important notes on configuration and sizing. +:::: + + +To run an approximate kNN search, use the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) to search one or more `dense_vector` fields with indexing enabled. + +1. Explicitly map one or more `dense_vector` fields. Approximate kNN search requires the following mapping options: + + * A `similarity` value. This value determines the similarity metric used to score documents based on similarity between the query and document vector. For a list of available metrics, see the [`similarity`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity) parameter documentation. The `similarity` setting defaults to `cosine`. + + ```console + PUT image-index + { + "mappings": { + "properties": { + "image-vector": { + "type": "dense_vector", + "dims": 3, + "similarity": "l2_norm" + }, + "title-vector": { + "type": "dense_vector", + "dims": 5, + "similarity": "l2_norm" + }, + "title": { + "type": "text" + }, + "file-type": { + "type": "keyword" + } + } + } + } + ``` + +2. Index your data. + + ```console + POST image-index/_bulk?refresh=true + { "index": { "_id": "1" } } + { "image-vector": [1, 5, -20], "title-vector": [12, 50, -10, 0, 1], "title": "moose family", "file-type": "jpg" } + { "index": { "_id": "2" } } + { "image-vector": [42, 8, -15], "title-vector": [25, 1, 4, -12, 2], "title": "alpine lake", "file-type": "png" } + { "index": { "_id": "3" } } + { "image-vector": [15, 11, 23], "title-vector": [1, 5, 25, 50, 20], "title": "full moon", "file-type": "jpg" } + ... + ``` + +3. 
Run the search using the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) or the [`knn` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html) (expert case). + + ```console + POST image-index/_search + { + "knn": { + "field": "image-vector", + "query_vector": [-5, 9, -12], + "k": 10, + "num_candidates": 100 + }, + "fields": [ "title", "file-type" ] + } + ``` + + +The [document `_score`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-response-body-score) is determined by the similarity between the query and document vector. See [`similarity`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity) for more information on how kNN search scores are computed. + +::::{note} +Support for approximate kNN search was added in version 8.0. Before this, `dense_vector` fields did not support enabling `index` in the mapping. If you created an index prior to 8.0 containing `dense_vector` fields, then to support approximate kNN search the data must be reindexed using a new field mapping that sets `index: true` which is the default option. +:::: + + + +### Tune approximate kNN for speed or accuracy [tune-approximate-knn-for-speed-accuracy] + +To gather results, the kNN search API finds a `num_candidates` number of approximate nearest neighbor candidates on each shard. The search computes the similarity of these candidate vectors to the query vector, selecting the `k` most similar results from each shard. The search then merges the results from each shard to return the global top `k` nearest neighbors. + +You can increase `num_candidates` for more accurate results at the cost of slower search speeds. A search with a high value for `num_candidates` considers more candidates from each shard. This takes more time, but the search has a higher probability of finding the true `k` top nearest neighbors. + +Similarly, you can decrease `num_candidates` for faster searches with potentially less accurate results. + + +### Approximate kNN using byte vectors [approximate-knn-using-byte-vectors] + +The approximate kNN search API supports `byte` value vectors in addition to `float` value vectors. Use the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) to search a `dense_vector` field with [`element_type`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params) set to `byte` and indexing enabled. + +1. Explicitly map one or more `dense_vector` fields with [`element_type`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params) set to `byte` and indexing enabled. + + ```console + PUT byte-image-index + { + "mappings": { + "properties": { + "byte-image-vector": { + "type": "dense_vector", + "element_type": "byte", + "dims": 2 + }, + "title": { + "type": "text" + } + } + } + } + ``` + +2. Index your data ensuring all vector values are integers within the range [-128, 127]. + + ```console + POST byte-image-index/_bulk?refresh=true + { "index": { "_id": "1" } } + { "byte-image-vector": [5, -20], "title": "moose family" } + { "index": { "_id": "2" } } + { "byte-image-vector": [8, -15], "title": "alpine lake" } + { "index": { "_id": "3" } } + { "byte-image-vector": [11, 23], "title": "full moon" } + ``` + +3. 
Run the search using the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) ensuring the `query_vector` values are integers within the range [-128, 127]. + + ```console + POST byte-image-index/_search + { + "knn": { + "field": "byte-image-vector", + "query_vector": [-5, 9], + "k": 10, + "num_candidates": 100 + }, + "fields": [ "title" ] + } + ``` + + +*Note*: In addition to the standard byte array, one can also provide a hex-encoded string value for the `query_vector` param. As an example, the search request above can also be expressed as follows, which would yield the same results + +```console +POST byte-image-index/_search +{ + "knn": { + "field": "byte-image-vector", + "query_vector": "fb09", + "k": 10, + "num_candidates": 100 + }, + "fields": [ "title" ] +} +``` + + +### Byte quantized kNN search [knn-search-quantized-example] + +If you want to provide `float` vectors, but want the memory savings of `byte` vectors, you can use the [quantization](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-quantization) feature. Quantization allows you to provide `float` vectors, but internally they are indexed as `byte` vectors. Additionally, the original `float` vectors are still retained in the index. + +::::{note} +The default index type for `dense_vector` is `int8_hnsw`. +:::: + + +To use quantization, you can use the index type `int8_hnsw` or `int4_hnsw` object in the `dense_vector` mapping. + +```console +PUT quantized-image-index +{ + "mappings": { + "properties": { + "image-vector": { + "type": "dense_vector", + "element_type": "float", + "dims": 2, + "index": true, + "index_options": { + "type": "int8_hnsw" + } + }, + "title": { + "type": "text" + } + } + } +} +``` + +1. Index your `float` vectors. + + ```console + POST quantized-image-index/_bulk?refresh=true + { "index": { "_id": "1" } } + { "image-vector": [0.1, -2], "title": "moose family" } + { "index": { "_id": "2" } } + { "image-vector": [0.75, -1], "title": "alpine lake" } + { "index": { "_id": "3" } } + { "image-vector": [1.2, 0.1], "title": "full moon" } + ``` + +2. Run the search using the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn). When searching, the `float` vector is automatically quantized to a `byte` vector. + + ```console + POST quantized-image-index/_search + { + "knn": { + "field": "image-vector", + "query_vector": [0.1, -2], + "k": 10, + "num_candidates": 100 + }, + "fields": [ "title" ] + } + ``` + + +Since the original `float` vectors are still retained in the index, you can optionally use them for re-scoring. Meaning, you can search over all the vectors quickly using the `int8_hnsw` index and then rescore only the top `k` results. This provides the best of both worlds, fast search and accurate scoring. + +```console +POST quantized-image-index/_search +{ + "knn": { + "field": "image-vector", + "query_vector": [0.1, -2], + "k": 15, + "num_candidates": 100 + }, + "fields": [ "title" ], + "rescore": { + "window_size": 10, + "query": { + "rescore_query": { + "script_score": { + "query": { + "match_all": {} + }, + "script": { + "source": "cosineSimilarity(params.query_vector, 'image-vector') + 1.0", + "params": { + "query_vector": [0.1, -2] + } + } + } + } + } + } +} +``` + + +### Filtered kNN search [knn-search-filter-example] + +The kNN search API supports restricting the search using a filter. 
The search will return the top `k` documents that also match the filter query. + +The following request performs an approximate kNN search filtered by the `file-type` field: + +```console +POST image-index/_search +{ + "knn": { + "field": "image-vector", + "query_vector": [54, 10, -2], + "k": 5, + "num_candidates": 50, + "filter": { + "term": { + "file-type": "png" + } + } + }, + "fields": ["title"], + "_source": false +} +``` + +::::{note} +The filter is applied **during** the approximate kNN search to ensure that `k` matching documents are returned. This contrasts with a post-filtering approach, where the filter is applied **after** the approximate kNN search completes. Post-filtering has the downside that it sometimes returns fewer than k results, even when there are enough matching documents. +:::: + + + +### Approximate kNN search and filtering [approximate-knn-search-and-filtering] + +Unlike conventional query filtering, where more restrictive filters typically lead to faster queries, applying filters in an approximate kNN search with an HNSW index can decrease performance. This is because searching the HNSW graph requires additional exploration to obtain the `num_candidates` that meet the filter criteria. + +To avoid significant performance drawbacks, Lucene implements the following strategies per segment: + +* If the filtered document count is less than or equal to num_candidates, the search bypasses the HNSW graph and uses a brute force search on the filtered documents. +* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter, the search will stop exploring the graph and switch to a brute force search over the filtered documents. + + +### Combine approximate kNN with other features [_combine_approximate_knn_with_other_features] + +You can perform *hybrid retrieval* by providing both the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) and a [`query`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#request-body-search-query): + +```console +POST image-index/_search +{ + "query": { + "match": { + "title": { + "query": "mountain lake", + "boost": 0.9 + } + } + }, + "knn": { + "field": "image-vector", + "query_vector": [54, 10, -2], + "k": 5, + "num_candidates": 50, + "boost": 0.1 + }, + "size": 10 +} +``` + +This search finds the global top `k = 5` vector matches, combines them with the matches from the `match` query, and finally returns the 10 top-scoring results. The `knn` and `query` matches are combined through a disjunction, as if you took a boolean *or* between them. The top `k` vector results represent the global nearest neighbors across all index shards. + +The score of each hit is the sum of the `knn` and `query` scores. You can specify a `boost` value to give a weight to each score in the sum. In the example above, the scores will be calculated as + +``` +score = 0.9 * match_score + 0.1 * knn_score +``` + +The `knn` option can also be used with [`aggregations`](../../../explore-analyze/aggregations.md). In general, {{es}} computes aggregations over all documents that match the search. So for approximate kNN search, aggregations are calculated on the top `k` nearest documents. If the search also includes a `query`, then aggregations are calculated on the combined set of `knn` and `query` matches. + + +### Perform semantic search [knn-semantic-search] + +:::{tip} +Looking for a minimal configuration approach? 
The `semantic_text` field type provides an abstraction over these vector search implementations with sensible defaults and automatic model management. It's the recommended approach for most users. [Learn more about semantic_text](../semantic-search/semantic-search-semantic-text.md). +::: + +kNN search enables you to perform semantic search by using a previously deployed [text embedding model](../../../explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md#ml-nlp-text-embedding). Instead of literal matching on search terms, semantic search retrieves results based on the intent and the contextual meaning of a search query. + +Under the hood, the text embedding NLP model generates a dense vector from the input query string called `model_text` you provide. Then, it is searched against an index containing dense vectors created with the same text embedding {{ml}} model. The search results are semantically similar as learned by the model. + +::::{important} +To perform semantic search: + +* you need an index that contains the dense vector representation of the input data to search against, +* you must use the same text embedding model for search that you used to create the dense vectors from the input data, +* the text embedding NLP model deployment must be started. + +:::: + + +Reference the deployed text embedding model or the model deployment in the `query_vector_builder` object and provide the search query as `model_text`: + +```js +(...) +{ + "knn": { + "field": "dense-vector-field", + "k": 10, + "num_candidates": 100, + "query_vector_builder": { + "text_embedding": { <1> + "model_id": "my-text-embedding-model", <2> + "model_text": "The opposite of blue" <3> + } + } + } +} +(...) +``` + +1. The {{nlp}} task to perform. It must be `text_embedding`. +2. The ID of the text embedding model to use to generate the dense vectors from the query string. Use the same model that generated the embeddings from the input text in the index you search against. You can use the value of the `deployment_id` instead in the `model_id` argument. +3. The query string from which the model generates the dense vector representation. + + +For more information on how to deploy a trained model and use it to create text embeddings, refer to this [end-to-end example](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md). + + +### Search multiple kNN fields [_search_multiple_knn_fields] + +In addition to *hybrid retrieval*, you can search more than one kNN vector field at a time: + +```console +POST image-index/_search +{ + "query": { + "match": { + "title": { + "query": "mountain lake", + "boost": 0.9 + } + } + }, + "knn": [ { + "field": "image-vector", + "query_vector": [54, 10, -2], + "k": 5, + "num_candidates": 50, + "boost": 0.1 + }, + { + "field": "title-vector", + "query_vector": [1, 20, -52, 23, 10], + "k": 10, + "num_candidates": 10, + "boost": 0.5 + }], + "size": 10 +} +``` + +This search finds the global top `k = 5` vector matches for `image-vector` and the global `k = 10` for the `title-vector`. These top values are then combined with the matches from the `match` query and the top-10 documents are returned. The multiple `knn` entries and the `query` matches are combined through a disjunction, as if you took a boolean *or* between them. The top `k` vector results represent the global nearest neighbors across all index shards. 

The scoring for a doc with the above configured boosts would be:

```
score = 0.9 * match_score + 0.1 * knn_score_image-vector + 0.5 * knn_score_title-vector
```

### Search kNN with expected similarity [knn-similarity-search]

While kNN is a powerful tool, it always tries to return `k` nearest neighbors. Consequently, when using `knn` with a `filter`, you could filter out all relevant documents and only have irrelevant ones left to search. In that situation, `knn` will still do its best to return `k` nearest neighbors, even though those neighbors could be far away in the vector space.

To alleviate this concern, there is a `similarity` parameter available in the `knn` clause. This value is the required minimum similarity for a vector to be considered a match. The `knn` search flow with this parameter is as follows:

* Apply any user provided `filter` queries
* Explore the vector space to get `k` vectors
* Do not return any vectors that are further away than the configured `similarity`

::::{note}
`similarity` is the true [similarity](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity) before it has been transformed into `_score` and boost applied.
::::


For each configured [similarity](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity), here is the corresponding inverted `_score` function. If you want to filter from a `_score` perspective, you can apply this minor transformation to correctly reject irrelevant results.

* `l2_norm`: `sqrt((1 / _score) - 1)`
* `cosine`: `(2 * _score) - 1`
* `dot_product`: `(2 * _score) - 1`
* `max_inner_product`:

    * `_score < 1`: `1 - (1 / _score)`
    * `_score >= 1`: `_score - 1`


Here is an example: we search for the `k` nearest neighbors to the given `query_vector`, with a `filter` applied, and require that the matched vectors have at least the provided `similarity`.

```console
POST image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [1, 5, -20],
    "k": 5,
    "num_candidates": 50,
    "similarity": 36,
    "filter": {
      "term": {
        "file-type": "png"
      }
    }
  },
  "fields": ["title"],
  "_source": false
}
```

In our data set, the only document with the file type of `png` has a vector of `[42, 8, -15]`. The `l2_norm` distance between `[42, 8, -15]` and `[1, 5, -20]` is `41.412`, which is greater than the configured similarity of `36`. Meaning, this search will return no hits.


### Nested kNN Search [nested-knn-search]

It is common for text to exceed a particular model's token limit, requiring chunking before building embeddings for the individual chunks. When using [`nested`](https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html) with [`dense_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html), you can achieve nearest passage retrieval without copying top-level document metadata.

Here is a simple passage vectors index that stores vectors and some top-level metadata for filtering. 
+ +```console +PUT passage_vectors +{ + "mappings": { + "properties": { + "full_text": { + "type": "text" + }, + "creation_time": { + "type": "date" + }, + "paragraph": { + "type": "nested", + "properties": { + "vector": { + "type": "dense_vector", + "dims": 2, + "index_options": { + "type": "hnsw" + } + }, + "text": { + "type": "text", + "index": false + } + } + } + } + } +} +``` + +With the above mapping, we can index multiple passage vectors along with storing the individual passage text. + +```console +POST passage_vectors/_bulk?refresh=true +{ "index": { "_id": "1" } } +{ "full_text": "first paragraph another paragraph", "creation_time": "2019-05-04", "paragraph": [ { "vector": [ 0.45, 45 ], "text": "first paragraph", "paragraph_id": "1" }, { "vector": [ 0.8, 0.6 ], "text": "another paragraph", "paragraph_id": "2" } ] } +{ "index": { "_id": "2" } } +{ "full_text": "number one paragraph number two paragraph", "creation_time": "2020-05-04", "paragraph": [ { "vector": [ 1.2, 4.5 ], "text": "number one paragraph", "paragraph_id": "1" }, { "vector": [ -1, 42 ], "text": "number two paragraph", "paragraph_id": "2" } ] } +``` + +The query will seem very similar to a typical kNN search: + +```console +POST passage_vectors/_search +{ + "fields": ["full_text", "creation_time"], + "_source": false, + "knn": { + "query_vector": [ + 0.45, + 45 + ], + "field": "paragraph.vector", + "k": 2, + "num_candidates": 2 + } +} +``` + +Note below that even though we have 4 total vectors, we still return two documents. kNN search over nested dense_vectors will always diversify the top results over the top-level document. Meaning, `"k"` top-level documents will be returned, scored by their nearest passage vector (e.g. `"paragraph.vector"`). + +```console-result +{ + "took": 4, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 2, + "relation": "eq" + }, + "max_score": 1.0, + "hits": [ + { + "_index": "passage_vectors", + "_id": "1", + "_score": 1.0, + "fields": { + "creation_time": [ + "2019-05-04T00:00:00.000Z" + ], + "full_text": [ + "first paragraph another paragraph" + ] + } + }, + { + "_index": "passage_vectors", + "_id": "2", + "_score": 0.9997144, + "fields": { + "creation_time": [ + "2020-05-04T00:00:00.000Z" + ], + "full_text": [ + "number one paragraph number two paragraph" + ] + } + } + ] + } +} +``` + +What if you wanted to filter by some top-level document metadata? You can do this by adding `filter` to your `knn` clause. + +::::{note} +`filter` will always be over the top-level document metadata. This means you cannot filter based on `nested` field metadata. +:::: + + +```console +POST passage_vectors/_search +{ + "fields": [ + "creation_time", + "full_text" + ], + "_source": false, + "knn": { + "query_vector": [ + 0.45, + 45 + ], + "field": "paragraph.vector", + "k": 2, + "num_candidates": 2, + "filter": { + "bool": { + "filter": [ + { + "range": { + "creation_time": { + "gte": "2019-05-01", + "lte": "2019-05-05" + } + } + } + ] + } + } + } +} +``` + +Now we have filtered based on the top level `"creation_time"` and only one document falls within that range. 
+ +```console-result +{ + "took": 4, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 1, + "relation": "eq" + }, + "max_score": 1.0, + "hits": [ + { + "_index": "passage_vectors", + "_id": "1", + "_score": 1.0, + "fields": { + "creation_time": [ + "2019-05-04T00:00:00.000Z" + ], + "full_text": [ + "first paragraph another paragraph" + ] + } + } + ] + } +} +``` + + +### Nested kNN Search with Inner hits [nested-knn-search-inner-hits] + +Additionally, if you wanted to extract the nearest passage for a matched document, you can supply [inner_hits](https://www.elastic.co/guide/en/elasticsearch/reference/current/inner-hits.html) to the `knn` clause. + +::::{note} +When using `inner_hits` and multiple `knn` clauses, be sure to specify the [`inner_hits.name`](https://www.elastic.co/guide/en/elasticsearch/reference/current/inner-hits.html#inner-hits-options) field. Otherwise, a naming clash can occur and fail the search request. +:::: + + +```console +POST passage_vectors/_search +{ + "fields": [ + "creation_time", + "full_text" + ], + "_source": false, + "knn": { + "query_vector": [ + 0.45, + 45 + ], + "field": "paragraph.vector", + "k": 2, + "num_candidates": 2, + "inner_hits": { + "_source": false, + "fields": [ + "paragraph.text" + ], + "size": 1 + } + } +} +``` + +Now the result will contain the nearest found paragraph when searching. + +```console-result +{ + "took": 4, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 2, + "relation": "eq" + }, + "max_score": 1.0, + "hits": [ + { + "_index": "passage_vectors", + "_id": "1", + "_score": 1.0, + "fields": { + "creation_time": [ + "2019-05-04T00:00:00.000Z" + ], + "full_text": [ + "first paragraph another paragraph" + ] + }, + "inner_hits": { + "paragraph": { + "hits": { + "total": { + "value": 2, + "relation": "eq" + }, + "max_score": 1.0, + "hits": [ + { + "_index": "passage_vectors", + "_id": "1", + "_nested": { + "field": "paragraph", + "offset": 0 + }, + "_score": 1.0, + "fields": { + "paragraph": [ + { + "text": [ + "first paragraph" + ] + } + ] + } + } + ] + } + } + } + }, + { + "_index": "passage_vectors", + "_id": "2", + "_score": 0.9997144, + "fields": { + "creation_time": [ + "2020-05-04T00:00:00.000Z" + ], + "full_text": [ + "number one paragraph number two paragraph" + ] + }, + "inner_hits": { + "paragraph": { + "hits": { + "total": { + "value": 2, + "relation": "eq" + }, + "max_score": 0.9997144, + "hits": [ + { + "_index": "passage_vectors", + "_id": "2", + "_nested": { + "field": "paragraph", + "offset": 1 + }, + "_score": 0.9997144, + "fields": { + "paragraph": [ + { + "text": [ + "number two paragraph" + ] + } + ] + } + } + ] + } + } + } + } + ] + } +} +``` + + +### Indexing considerations [knn-indexing-considerations] + +For approximate kNN search, {{es}} stores the dense vector values of each segment as an [HNSW graph](https://arxiv.org/abs/1603.09320). Indexing vectors for approximate kNN search can take substantial time because of how expensive it is to build these graphs. You may need to increase the client request timeout for index and bulk requests. The [approximate kNN tuning guide](../../../solutions/search/vector/knn.md) contains important guidance around indexing performance, and how the index configuration can affect search performance. 
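Because each segment maintains its own HNSW graph, the number of segments also affects search speed: fewer segments mean fewer graphs to traverse per query. As a minimal sketch of one optimization commonly recommended for indices that are no longer being written to, you can force-merge down to a single segment after ingest completes (`image-index` here is the example index from earlier on this page):

```console
POST image-index/_forcemerge?max_num_segments=1
```

Only do this on a read-only index; force-merging an actively written index leaves behind large segments that are expensive to manage later.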
+ +In addition to its search-time tuning parameters, the HNSW algorithm has index-time parameters that trade off between the cost of building the graph, search speed, and accuracy. When setting up the `dense_vector` mapping, you can use the [`index_options`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-index-options) argument to adjust these parameters: + +```console +PUT image-index +{ + "mappings": { + "properties": { + "image-vector": { + "type": "dense_vector", + "dims": 3, + "similarity": "l2_norm", + "index_options": { + "type": "hnsw", + "m": 32, + "ef_construction": 100 + } + } + } + } +} +``` + + +### Limitations for approximate kNN search [approximate-knn-limitations] + +* When using kNN search in [{{ccs}}](../../../solutions/search/cross-cluster-search.md), the [`ccs_minimize_roundtrips`](../../../solutions/search/cross-cluster-search.md#ccs-min-roundtrips) option is not supported. +* {{es}} uses the [HNSW algorithm](https://arxiv.org/abs/1603.09320) to support efficient kNN search. Like most kNN algorithms, HNSW is an approximate method that sacrifices result accuracy for improved search speed. This means the results returned are not always the true *k* closest neighbors. + +::::{note} +Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#dfs-query-then-fetch) search type in order to gather the global top `k` matches across shards. You cannot set the `search_type` explicitly when running kNN search. +:::: + + + +### Oversampling and rescoring for quantized vectors [dense-vector-knn-search-rescoring] + +When using [quantized vectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing: + +* **Oversampling**: Retrieve more candidates per shard. +* **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates. + +As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines: + +* The performance and memory gains of approximate retrieval using quantized vectors for retrieving the top candidates. +* The accuracy of using the original vectors for rescoring the top candidates. + +All forms of quantization will result in some accuracy loss and as the quantization level increases the accuracy loss will also increase. Generally, we have found that: + +* `int8` requires minimal if any rescoring +* `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss. +* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required. + +You can use the `rescore_vector` [preview] option to automatically perform reranking. When a rescore `oversample` parameter is specified, the approximate kNN search will: + +* Retrieve `num_candidates` candidates per shard. +* From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors. +* The top `k` rescored candidates will be returned. 
+

You can use the `rescore_vector` [preview] option to automatically perform reranking. When a rescore `oversample` parameter is specified, the approximate kNN search will:

* Retrieve `num_candidates` candidates per shard.
* From these candidates, rescore the top `k * oversample` candidates per shard using the original vectors.
* Return the top `k` rescored candidates.

Here is an example of using the `rescore_vector` option with the `oversample` parameter:

```console
POST image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [-5, 9, -12],
    "k": 10,
    "num_candidates": 100,
    "rescore_vector": {
      "oversample": 2.0
    }
  },
  "fields": [ "title", "file-type" ]
}
```

This example will:

* Search using approximate kNN for the top 100 candidates.
* Rescore the top 20 candidates (`oversample * k`) per shard using the original, non-quantized vectors.
* Merge the rescored candidates from all shards, and return the top 10 (`k`) results.


#### Additional rescoring techniques [dense-vector-knn-search-rescoring-rescore-additional]

The following sections provide additional ways of rescoring:


##### Use the `rescore` section for top-level kNN search [dense-vector-knn-search-rescoring-rescore-section]

You can use this option when you don’t want to rescore on each shard, but on the top results from all shards.

Use the [rescore section](https://www.elastic.co/guide/en/elasticsearch/reference/current/filter-search-results.html#rescore) in the `_search` request to rescore the top results from a kNN search.

Here is an example using the top-level `knn` search with oversampling, using `rescore` to rerank the results:

```console
POST /my-index/_search
{
  "size": 10, <1>
  "knn": {
    "query_vector": [0.04283529, 0.85670587, -0.51402352, 0],
    "field": "my_int4_vector",
    "k": 20, <2>
    "num_candidates": 50
  },
  "rescore": {
    "window_size": 20, <3>
    "query": {
      "rescore_query": {
        "script_score": {
          "query": {
            "match_all": {}
          },
          "script": {
            "source": "(dotProduct(params.queryVector, 'my_int4_vector') + 1.0)", <4>
            "params": {
              "queryVector": [0.04283529, 0.85670587, -0.51402352, 0]
            }
          }
        }
      },
      "query_weight": 0, <5>
      "rescore_query_weight": 1 <6>
    }
  }
}
```

1. The number of results to return. Note that it is only 10, and we oversample by 2x, gathering 20 nearest neighbors.
2. The number of results to return from the kNN search. This performs an approximate kNN search with 50 candidates per HNSW graph using the quantized vectors, returning the 20 most similar vectors according to the quantized score. Additionally, since this is the top-level `knn` object, the global top 20 results from all shards will be gathered before rescoring. Combined with `rescore`, this oversamples by `2x`, meaning 20 nearest neighbors are gathered according to quantized scoring and rescored with higher-fidelity float vectors.
3. The number of results to rescore. If you want to rescore all results, set this to the same value as `k`.
4. The script to rescore the results. Script score will interact directly with the originally provided `float32` vector.
5. The weight of the original query; here we simply discard the original score.
6. The weight of the rescore query; here we use only the rescore query.


##### Use a `script_score` query to rescore per shard [dense-vector-knn-search-rescoring-script-score]

You can use this option when you want to rescore on each shard and want more fine-grained control over the rescoring than the `rescore_vector` option provides.

Use rescore per shard with the [knn query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html) and [script_score query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html). 
Generally, this means that there will be more rescoring per shard, but this can increase overall recall at the cost of compute.

```console
POST /my-index/_search
{
  "size": 10, <1>
  "query": {
    "script_score": {
      "query": {
        "knn": { <2>
          "query_vector": [0.04283529, 0.85670587, -0.51402352, 0],
          "field": "my_int4_vector",
          "num_candidates": 20 <3>
        }
      },
      "script": {
        "source": "(dotProduct(params.queryVector, 'my_int4_vector') + 1.0)", <4>
        "params": {
          "queryVector": [0.04283529, 0.85670587, -0.51402352, 0]
        }
      }
    }
  }
}
```

1. The number of results to return.
2. The `knn` query that performs the initial search; it is executed per shard.
3. The number of candidates to use for the initial approximate `knn` search. This will search using the quantized vectors and return the top 20 candidates per shard, which are then scored.
4. The script to score the results. Script score will interact directly with the originally provided `float32` vector.


## Exact kNN [exact-knn]

To run an exact kNN search, use a `script_score` query with a vector function.

1. Explicitly map one or more `dense_vector` fields. If you don’t intend to use the field for approximate kNN, set the `index` mapping option to `false`. This can significantly improve indexing speed.

    ```console
    PUT product-index
    {
      "mappings": {
        "properties": {
          "product-vector": {
            "type": "dense_vector",
            "dims": 5,
            "index": false
          },
          "price": {
            "type": "long"
          }
        }
      }
    }
    ```

2. Index your data.

    ```console
    POST product-index/_bulk?refresh=true
    { "index": { "_id": "1" } }
    { "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 }
    { "index": { "_id": "2" } }
    { "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 }
    { "index": { "_id": "3" } }
    { "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 }
    ...
    ```

3. Use the [search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html) to run a `script_score` query containing a [vector function](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html#vector-functions).

    ::::{tip}
    To limit the number of matched documents passed to the vector function, we recommend you specify a filter query in the `script_score.query` parameter. If needed, you can use a [`match_all` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-all-query.html) in this parameter to match all documents. However, matching all documents can significantly increase search latency.
    ::::


    ```console
    POST product-index/_search
    {
      "query": {
        "script_score": {
          "query" : {
            "bool" : {
              "filter" : {
                "range" : {
                  "price" : {
                    "gte": 1000
                  }
                }
              }
            }
          },
          "script": {
            "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0",
            "params": {
              "queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
            }
          }
        }
      }
    }
    ```

A *k-nearest neighbor* (kNN) search finds the *k* nearest vectors to a query vector, as measured by a similarity metric.

Common use cases for kNN include:

* Relevance ranking based on natural language processing (NLP) algorithms
* Product recommendations and recommendation engines
* Similarity search for images or videos

::::{tip}
Check out our [hands-on tutorial](bring-own-vectors.md) to learn how to ingest dense vector embeddings into Elasticsearch. 
+
::::
\ No newline at end of file
diff --git a/solutions/toc.yml b/solutions/toc.yml
index 1dd51d4a3f..9d8869e1ba 100644
--- a/solutions/toc.yml
+++ b/solutions/toc.yml
@@ -636,7 +636,7 @@ toc:
       - file: search/vector/dense-vector.md
         children:
           - file: search/vector/knn.md
-          - file: search/semantic-search/bring-own-vectors.md
+          - file: search/vector/bring-own-vectors.md
           - file: search/vector/sparse-vector-elser.md
   - file: search/semantic-search.md
     children:

From f506f77ac6685f9be606b662b32883e6a0d58d87 Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Fri, 7 Feb 2025 14:24:20 +0100
Subject: [PATCH 13/30] toc

---
 raw-migrated-files/toc.yml | 1 -
 .../search/semantic-search/semantic-search-semantic-text.md | 2 --
 2 files changed, 3 deletions(-)

diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml
index 110bc49660..1ae5646932 100644
--- a/raw-migrated-files/toc.yml
+++ b/raw-migrated-files/toc.yml
@@ -617,7 +617,6 @@ toc:
      - file: elasticsearch/elasticsearch-reference/ip-filtering.md
      - file: elasticsearch/elasticsearch-reference/jwt-auth-realm.md
      - file: elasticsearch/elasticsearch-reference/kerberos-realm.md
-      - file: elasticsearch/elasticsearch-reference/knn-search.md
      - file: elasticsearch/elasticsearch-reference/ldap-realm.md
      - file: elasticsearch/elasticsearch-reference/mapping-roles.md
      - file: elasticsearch/elasticsearch-reference/mapping.md
diff --git a/solutions/search/semantic-search/semantic-search-semantic-text.md b/solutions/search/semantic-search/semantic-search-semantic-text.md
index 72bb74a2e0..79aacbcd6c 100644
--- a/solutions/search/semantic-search/semantic-search-semantic-text.md
+++ b/solutions/search/semantic-search/semantic-search-semantic-text.md
@@ -4,8 +4,6 @@ mapped_pages:
   - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-semantic-text.html
---

-
-
# Semantic search with `semantic_text` [semantic-search-semantic-text]


From f3a04b6836e11f1aca5041003fa17ded29792e5a Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Fri, 7 Feb 2025 14:59:15 +0100
Subject: [PATCH 14/30] Cleanup knn

Fix relative links to use proper paths/anchors
Move "Indexing considerations" section earlier in the document
Improve link to kNN tuning guide
Minor text/formatting adjustments

---
 solutions/search/vector/knn.md | 59 ++++++++++++++++------------------
 1 file changed, 27 insertions(+), 32 deletions(-)

diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md
index 09eae8ddcf..287fd2a192 100644
--- a/solutions/search/vector/knn.md
+++ b/solutions/search/vector/knn.md
@@ -35,8 +35,8 @@ Common use cases for kNN include:
 
 {{es}} supports two methods for kNN search:
 
-* [Approximate kNN](../../../solutions/search/vector/knn.md#approximate-knn) using the `knn` search option or `knn` query
-* [Exact, brute-force kNN](../../../solutions/search/vector/knn.md#exact-knn) using a `script_score` query with a vector function
+* [Approximate kNN](#approximate-knn) using the `knn` search option or `knn` query
+* [Exact, brute-force kNN](#exact-knn) using a `script_score` query with a vector function
 
 In most cases, you’ll want to use approximate kNN. Approximate kNN offers lower latency at the cost of slower indexing and imperfect accuracy.
 
@@ -46,10 +46,9 @@ Exact, brute-force kNN guarantees accurate results but doesn’t scale well with
 
 ## Approximate kNN [approximate-knn]
 
 ::::{warning}
-Compared to other types of search, approximate kNN search has specific resource requirements. 
In particular, all vector data must fit in the node’s page cache for it to be efficient. Please consult the [approximate kNN search tuning guide](../../../solutions/search/vector/knn.md) for important notes on configuration and sizing. +Compared to other types of search, approximate kNN search has specific resource requirements. In particular, all vector data must fit in the node’s page cache for it to be efficient. Please consult the [approximate kNN search tuning guide](/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md) for important notes on configuration and sizing. :::: - To run an approximate kNN search, use the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) to search one or more `dense_vector` fields with indexing enabled. 1. Explicitly map one or more `dense_vector` fields. Approximate kNN search requires the following mapping options: @@ -117,7 +116,31 @@ The [document `_score`](https://www.elastic.co/guide/en/elasticsearch/reference/ Support for approximate kNN search was added in version 8.0. Before this, `dense_vector` fields did not support enabling `index` in the mapping. If you created an index prior to 8.0 containing `dense_vector` fields, then to support approximate kNN search the data must be reindexed using a new field mapping that sets `index: true` which is the default option. :::: +### Indexing considerations [knn-indexing-considerations] + +For approximate kNN search, {{es}} stores the dense vector values of each segment as an [HNSW graph](https://arxiv.org/abs/1603.09320). Indexing vectors for approximate kNN search can take substantial time because of how expensive it is to build these graphs. You may need to increase the client request timeout for index and bulk requests. The [approximate kNN tuning guide](/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md) contains important guidance around indexing performance, and how the index configuration can affect search performance. +In addition to its search-time tuning parameters, the HNSW algorithm has index-time parameters that trade off between the cost of building the graph, search speed, and accuracy. When setting up the `dense_vector` mapping, you can use the [`index_options`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-index-options) argument to adjust these parameters: + +```console +PUT image-index +{ + "mappings": { + "properties": { + "image-vector": { + "type": "dense_vector", + "dims": 3, + "similarity": "l2_norm", + "index_options": { + "type": "hnsw", + "m": 32, + "ef_construction": 100 + } + } + } + } +} +``` ### Tune approximate kNN for speed or accuracy [tune-approximate-knn-for-speed-accuracy] @@ -853,34 +876,6 @@ Now the result will contain the nearest found paragraph when searching. } ``` - -### Indexing considerations [knn-indexing-considerations] - -For approximate kNN search, {{es}} stores the dense vector values of each segment as an [HNSW graph](https://arxiv.org/abs/1603.09320). Indexing vectors for approximate kNN search can take substantial time because of how expensive it is to build these graphs. You may need to increase the client request timeout for index and bulk requests. The [approximate kNN tuning guide](../../../solutions/search/vector/knn.md) contains important guidance around indexing performance, and how the index configuration can affect search performance. 
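+
+For example, once a large initial load has finished (a sketch assuming the index receives no further writes), you can force-merge down to a single segment so that each shard has only one HNSW graph to search:
+
+```console
+POST image-index/_forcemerge?max_num_segments=1
+```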
- -In addition to its search-time tuning parameters, the HNSW algorithm has index-time parameters that trade off between the cost of building the graph, search speed, and accuracy. When setting up the `dense_vector` mapping, you can use the [`index_options`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-index-options) argument to adjust these parameters: - -```console -PUT image-index -{ - "mappings": { - "properties": { - "image-vector": { - "type": "dense_vector", - "dims": 3, - "similarity": "l2_norm", - "index_options": { - "type": "hnsw", - "m": 32, - "ef_construction": 100 - } - } - } - } -} -``` - - ### Limitations for approximate kNN search [approximate-knn-limitations] * When using kNN search in [{{ccs}}](../../../solutions/search/cross-cluster-search.md), the [`ccs_minimize_roundtrips`](../../../solutions/search/cross-cluster-search.md#ccs-min-roundtrips) option is not supported. From 69e01a5aebcb18b62c0f69506cdfd0b14608384b Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 15:17:29 +0100 Subject: [PATCH 15/30] Tighten knn use cases --- solutions/search/vector/knn.md | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md index 287fd2a192..669020984d 100644 --- a/solutions/search/vector/knn.md +++ b/solutions/search/vector/knn.md @@ -11,10 +11,18 @@ A *k-nearest neighbor* (kNN) search finds the *k* nearest vectors to a query vec Common use cases for kNN include: -* Relevance ranking based on natural language processing (NLP) algorithms -* Product recommendations and recommendation engines -* Similarity search for images or videos - +* Search + * Semantic text search + * Image/video similarity + +* Recommendations + * Product suggestions + * Collaborative filtering + * Content discovery + +* Analysis + * Anomaly detection + * Pattern matching ## Prerequisites [knn-prereqs] From c9b4ca93e05b005badedbb6ec7b9872710e6e095 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 15:40:34 +0100 Subject: [PATCH 16/30] tweak --- solutions/search/retrievers-overview.md | 8 ++++++-- solutions/search/vector/knn.md | 4 +--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/solutions/search/retrievers-overview.md b/solutions/search/retrievers-overview.md index 93dad4ff1a..aba58d2e70 100644 --- a/solutions/search/retrievers-overview.md +++ b/solutions/search/retrievers-overview.md @@ -1,8 +1,12 @@ # Retrievers [retrievers-overview] -A retriever is an abstraction that was added to the Search API in **8.14.0** and was made generally available in **8.16.0**. This abstraction enables the configuration of multi-stage retrieval pipelines within a single `_search` call. This simplifies your search application logic, because you no longer need to configure complex searches via multiple {{es}} calls or implement additional client-side logic to combine results from different queries. +A retriever is an abstraction that was added to the `_search` API in **8.14.0** and was made generally available in **8.16.0**. -This document provides a general overview of the retriever abstraction. For implementation details, including notable restrictions, check out the [reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/retriever.html) in the `_search` API docs. +This syntax enables the configuration of multi-stage retrieval pipelines within a single `_search` call. 
This simplifies your search application logic, because you no longer need to configure complex searches via multiple {{es}} calls or implement additional client-side logic to combine results from different queries. + +::::{note} +This document provides a general overview of the retriever abstraction. For a full syntax reference and implementation overview, check out the [reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/retriever.html) in the `_search` API docs. +:::: ::::{tip} Prefer to start with some sample code? Check out [](retrievers-examples.md) for a collection of retriever examples. diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md index 669020984d..b4825160c3 100644 --- a/solutions/search/vector/knn.md +++ b/solutions/search/vector/knn.md @@ -37,13 +37,11 @@ Common use cases for kNN include: * `create`, `index`, or `write` to add data to the index you created * `read` to search the index - - ## kNN methods [knn-methods] {{es}} supports two methods for kNN search: -* [Approximate kNN](#approximate-knn) using the `knn` search option or `knn` query +* [Approximate kNN](#approximate-knn) using the `knn` search option, `knn` query or a `knn` [retriever](../retrievers-overview.md) * [Exact, brute-force kNN](#exact-knn) using a `script_score` query with a vector function In most cases, you’ll want to use approximate kNN. Approximate kNN offers lower latency at the cost of slower indexing and imperfect accuracy. From 399328b33e35c3f0b8dd6b540ff8bcc37a914c4e Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 15:42:49 +0100 Subject: [PATCH 17/30] Delete --- .../elasticsearch-reference/knn-search.md | 1100 ----------------- 1 file changed, 1100 deletions(-) delete mode 100644 raw-migrated-files/elasticsearch/elasticsearch-reference/knn-search.md diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/knn-search.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/knn-search.md deleted file mode 100644 index 0c1f4171dd..0000000000 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/knn-search.md +++ /dev/null @@ -1,1100 +0,0 @@ ---- -navigation_title: "kNN search" ---- - -# k-nearest neighbor (kNN) search [knn-search] - - -A *k-nearest neighbor* (kNN) search finds the *k* nearest vectors to a query vector, as measured by a similarity metric. - -Common use cases for kNN include: - -* Relevance ranking based on natural language processing (NLP) algorithms -* Product recommendations and recommendation engines -* Similarity search for images or videos - - -## Prerequisites [knn-prereqs] - -* To run a kNN search, you must be able to convert your data into meaningful vector values. You can [create these vectors using a natural language processing (NLP) model in {{es}}](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md), or generate them outside {{es}}. Vectors can be added to documents as [`dense_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html) field values. Queries are represented as vectors with the same dimension. - - Design your vectors so that the closer a document’s vector is to a query vector, based on a similarity metric, the better its match. 
- -* To complete the steps in this guide, you must have the following [index privileges](../../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-indices): - - * `create_index` or `manage` to create an index with a `dense_vector` field - * `create`, `index`, or `write` to add data to the index you created - * `read` to search the index - - - -## kNN methods [knn-methods] - -{{es}} supports two methods for kNN search: - -* [Approximate kNN](../../../solutions/search/vector/knn.md#approximate-knn) using the `knn` search option or `knn` query -* [Exact, brute-force kNN](../../../solutions/search/vector/knn.md#exact-knn) using a `script_score` query with a vector function - -In most cases, you’ll want to use approximate kNN. Approximate kNN offers lower latency at the cost of slower indexing and imperfect accuracy. - -Exact, brute-force kNN guarantees accurate results but doesn’t scale well with large datasets. With this approach, a `script_score` query must scan each matching document to compute the vector function, which can result in slow search speeds. However, you can improve latency by using a [query](../../../explore-analyze/query-filter/languages/querydsl.md) to limit the number of matching documents passed to the function. If you filter your data to a small subset of documents, you can get good search performance using this approach. - - -## Approximate kNN [approximate-knn] - -::::{warning} -Compared to other types of search, approximate kNN search has specific resource requirements. In particular, all vector data must fit in the node’s page cache for it to be efficient. Please consult the [approximate kNN search tuning guide](../../../solutions/search/vector/knn.md) for important notes on configuration and sizing. -:::: - - -To run an approximate kNN search, use the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) to search one or more `dense_vector` fields with indexing enabled. - -1. Explicitly map one or more `dense_vector` fields. Approximate kNN search requires the following mapping options: - - * A `similarity` value. This value determines the similarity metric used to score documents based on similarity between the query and document vector. For a list of available metrics, see the [`similarity`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity) parameter documentation. The `similarity` setting defaults to `cosine`. - - ```console - PUT image-index - { - "mappings": { - "properties": { - "image-vector": { - "type": "dense_vector", - "dims": 3, - "similarity": "l2_norm" - }, - "title-vector": { - "type": "dense_vector", - "dims": 5, - "similarity": "l2_norm" - }, - "title": { - "type": "text" - }, - "file-type": { - "type": "keyword" - } - } - } - } - ``` - -2. Index your data. - - ```console - POST image-index/_bulk?refresh=true - { "index": { "_id": "1" } } - { "image-vector": [1, 5, -20], "title-vector": [12, 50, -10, 0, 1], "title": "moose family", "file-type": "jpg" } - { "index": { "_id": "2" } } - { "image-vector": [42, 8, -15], "title-vector": [25, 1, 4, -12, 2], "title": "alpine lake", "file-type": "png" } - { "index": { "_id": "3" } } - { "image-vector": [15, 11, 23], "title-vector": [1, 5, 25, 50, 20], "title": "full moon", "file-type": "jpg" } - ... - ``` - -3. 
Run the search using the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) or the [`knn` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html) (expert case). - - ```console - POST image-index/_search - { - "knn": { - "field": "image-vector", - "query_vector": [-5, 9, -12], - "k": 10, - "num_candidates": 100 - }, - "fields": [ "title", "file-type" ] - } - ``` - - -The [document `_score`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-response-body-score) is determined by the similarity between the query and document vector. See [`similarity`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity) for more information on how kNN search scores are computed. - -::::{note} -Support for approximate kNN search was added in version 8.0. Before this, `dense_vector` fields did not support enabling `index` in the mapping. If you created an index prior to 8.0 containing `dense_vector` fields, then to support approximate kNN search the data must be reindexed using a new field mapping that sets `index: true` which is the default option. -:::: - - - -### Tune approximate kNN for speed or accuracy [tune-approximate-knn-for-speed-accuracy] - -To gather results, the kNN search API finds a `num_candidates` number of approximate nearest neighbor candidates on each shard. The search computes the similarity of these candidate vectors to the query vector, selecting the `k` most similar results from each shard. The search then merges the results from each shard to return the global top `k` nearest neighbors. - -You can increase `num_candidates` for more accurate results at the cost of slower search speeds. A search with a high value for `num_candidates` considers more candidates from each shard. This takes more time, but the search has a higher probability of finding the true `k` top nearest neighbors. - -Similarly, you can decrease `num_candidates` for faster searches with potentially less accurate results. - - -### Approximate kNN using byte vectors [approximate-knn-using-byte-vectors] - -The approximate kNN search API supports `byte` value vectors in addition to `float` value vectors. Use the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) to search a `dense_vector` field with [`element_type`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params) set to `byte` and indexing enabled. - -1. Explicitly map one or more `dense_vector` fields with [`element_type`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params) set to `byte` and indexing enabled. - - ```console - PUT byte-image-index - { - "mappings": { - "properties": { - "byte-image-vector": { - "type": "dense_vector", - "element_type": "byte", - "dims": 2 - }, - "title": { - "type": "text" - } - } - } - } - ``` - -2. Index your data ensuring all vector values are integers within the range [-128, 127]. - - ```console - POST byte-image-index/_bulk?refresh=true - { "index": { "_id": "1" } } - { "byte-image-vector": [5, -20], "title": "moose family" } - { "index": { "_id": "2" } } - { "byte-image-vector": [8, -15], "title": "alpine lake" } - { "index": { "_id": "3" } } - { "byte-image-vector": [11, 23], "title": "full moon" } - ``` - -3. 
Run the search using the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) ensuring the `query_vector` values are integers within the range [-128, 127]. - - ```console - POST byte-image-index/_search - { - "knn": { - "field": "byte-image-vector", - "query_vector": [-5, 9], - "k": 10, - "num_candidates": 100 - }, - "fields": [ "title" ] - } - ``` - - -*Note*: In addition to the standard byte array, one can also provide a hex-encoded string value for the `query_vector` param. As an example, the search request above can also be expressed as follows, which would yield the same results - -```console -POST byte-image-index/_search -{ - "knn": { - "field": "byte-image-vector", - "query_vector": "fb09", - "k": 10, - "num_candidates": 100 - }, - "fields": [ "title" ] -} -``` - - -### Byte quantized kNN search [knn-search-quantized-example] - -If you want to provide `float` vectors, but want the memory savings of `byte` vectors, you can use the [quantization](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-quantization) feature. Quantization allows you to provide `float` vectors, but internally they are indexed as `byte` vectors. Additionally, the original `float` vectors are still retained in the index. - -::::{note} -The default index type for `dense_vector` is `int8_hnsw`. -:::: - - -To use quantization, you can use the index type `int8_hnsw` or `int4_hnsw` object in the `dense_vector` mapping. - -```console -PUT quantized-image-index -{ - "mappings": { - "properties": { - "image-vector": { - "type": "dense_vector", - "element_type": "float", - "dims": 2, - "index": true, - "index_options": { - "type": "int8_hnsw" - } - }, - "title": { - "type": "text" - } - } - } -} -``` - -1. Index your `float` vectors. - - ```console - POST quantized-image-index/_bulk?refresh=true - { "index": { "_id": "1" } } - { "image-vector": [0.1, -2], "title": "moose family" } - { "index": { "_id": "2" } } - { "image-vector": [0.75, -1], "title": "alpine lake" } - { "index": { "_id": "3" } } - { "image-vector": [1.2, 0.1], "title": "full moon" } - ``` - -2. Run the search using the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn). When searching, the `float` vector is automatically quantized to a `byte` vector. - - ```console - POST quantized-image-index/_search - { - "knn": { - "field": "image-vector", - "query_vector": [0.1, -2], - "k": 10, - "num_candidates": 100 - }, - "fields": [ "title" ] - } - ``` - - -Since the original `float` vectors are still retained in the index, you can optionally use them for re-scoring. Meaning, you can search over all the vectors quickly using the `int8_hnsw` index and then rescore only the top `k` results. This provides the best of both worlds, fast search and accurate scoring. - -```console -POST quantized-image-index/_search -{ - "knn": { - "field": "image-vector", - "query_vector": [0.1, -2], - "k": 15, - "num_candidates": 100 - }, - "fields": [ "title" ], - "rescore": { - "window_size": 10, - "query": { - "rescore_query": { - "script_score": { - "query": { - "match_all": {} - }, - "script": { - "source": "cosineSimilarity(params.query_vector, 'image-vector') + 1.0", - "params": { - "query_vector": [0.1, -2] - } - } - } - } - } - } -} -``` - - -### Filtered kNN search [knn-search-filter-example] - -The kNN search API supports restricting the search using a filter. 
The search will return the top `k` documents that also match the filter query. - -The following request performs an approximate kNN search filtered by the `file-type` field: - -```console -POST image-index/_search -{ - "knn": { - "field": "image-vector", - "query_vector": [54, 10, -2], - "k": 5, - "num_candidates": 50, - "filter": { - "term": { - "file-type": "png" - } - } - }, - "fields": ["title"], - "_source": false -} -``` - -::::{note} -The filter is applied **during** the approximate kNN search to ensure that `k` matching documents are returned. This contrasts with a post-filtering approach, where the filter is applied **after** the approximate kNN search completes. Post-filtering has the downside that it sometimes returns fewer than k results, even when there are enough matching documents. -:::: - - - -### Approximate kNN search and filtering [approximate-knn-search-and-filtering] - -Unlike conventional query filtering, where more restrictive filters typically lead to faster queries, applying filters in an approximate kNN search with an HNSW index can decrease performance. This is because searching the HNSW graph requires additional exploration to obtain the `num_candidates` that meet the filter criteria. - -To avoid significant performance drawbacks, Lucene implements the following strategies per segment: - -* If the filtered document count is less than or equal to num_candidates, the search bypasses the HNSW graph and uses a brute force search on the filtered documents. -* While exploring the HNSW graph, if the number of nodes explored exceeds the number of documents that satisfy the filter, the search will stop exploring the graph and switch to a brute force search over the filtered documents. - - -### Combine approximate kNN with other features [_combine_approximate_knn_with_other_features] - -You can perform *hybrid retrieval* by providing both the [`knn` option](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-api-knn) and a [`query`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#request-body-search-query): - -```console -POST image-index/_search -{ - "query": { - "match": { - "title": { - "query": "mountain lake", - "boost": 0.9 - } - } - }, - "knn": { - "field": "image-vector", - "query_vector": [54, 10, -2], - "k": 5, - "num_candidates": 50, - "boost": 0.1 - }, - "size": 10 -} -``` - -This search finds the global top `k = 5` vector matches, combines them with the matches from the `match` query, and finally returns the 10 top-scoring results. The `knn` and `query` matches are combined through a disjunction, as if you took a boolean *or* between them. The top `k` vector results represent the global nearest neighbors across all index shards. - -The score of each hit is the sum of the `knn` and `query` scores. You can specify a `boost` value to give a weight to each score in the sum. In the example above, the scores will be calculated as - -``` -score = 0.9 * match_score + 0.1 * knn_score -``` - -The `knn` option can also be used with [`aggregations`](../../../explore-analyze/aggregations.md). In general, {{es}} computes aggregations over all documents that match the search. So for approximate kNN search, aggregations are calculated on the top `k` nearest documents. If the search also includes a `query`, then aggregations are calculated on the combined set of `knn` and `query` matches. 
- - -### Perform semantic search [knn-semantic-search] - -kNN search enables you to perform semantic search by using a previously deployed [text embedding model](../../../explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md#ml-nlp-text-embedding). Instead of literal matching on search terms, semantic search retrieves results based on the intent and the contextual meaning of a search query. - -Under the hood, the text embedding NLP model generates a dense vector from the input query string called `model_text` you provide. Then, it is searched against an index containing dense vectors created with the same text embedding {{ml}} model. The search results are semantically similar as learned by the model. - -::::{important} -To perform semantic search: - -* you need an index that contains the dense vector representation of the input data to search against, -* you must use the same text embedding model for search that you used to create the dense vectors from the input data, -* the text embedding NLP model deployment must be started. - -:::: - - -Reference the deployed text embedding model or the model deployment in the `query_vector_builder` object and provide the search query as `model_text`: - -```js -(...) -{ - "knn": { - "field": "dense-vector-field", - "k": 10, - "num_candidates": 100, - "query_vector_builder": { - "text_embedding": { <1> - "model_id": "my-text-embedding-model", <2> - "model_text": "The opposite of blue" <3> - } - } - } -} -(...) -``` - -1. The {{nlp}} task to perform. It must be `text_embedding`. -2. The ID of the text embedding model to use to generate the dense vectors from the query string. Use the same model that generated the embeddings from the input text in the index you search against. You can use the value of the `deployment_id` instead in the `model_id` argument. -3. The query string from which the model generates the dense vector representation. - - -For more information on how to deploy a trained model and use it to create text embeddings, refer to this [end-to-end example](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md). - - -### Search multiple kNN fields [_search_multiple_knn_fields] - -In addition to *hybrid retrieval*, you can search more than one kNN vector field at a time: - -```console -POST image-index/_search -{ - "query": { - "match": { - "title": { - "query": "mountain lake", - "boost": 0.9 - } - } - }, - "knn": [ { - "field": "image-vector", - "query_vector": [54, 10, -2], - "k": 5, - "num_candidates": 50, - "boost": 0.1 - }, - { - "field": "title-vector", - "query_vector": [1, 20, -52, 23, 10], - "k": 10, - "num_candidates": 10, - "boost": 0.5 - }], - "size": 10 -} -``` - -This search finds the global top `k = 5` vector matches for `image-vector` and the global `k = 10` for the `title-vector`. These top values are then combined with the matches from the `match` query and the top-10 documents are returned. The multiple `knn` entries and the `query` matches are combined through a disjunction, as if you took a boolean *or* between them. The top `k` vector results represent the global nearest neighbors across all index shards. - -The scoring for a doc with the above configured boosts would be: - -``` -score = 0.9 * match_score + 0.1 * knn_score_image-vector + 0.5 * knn_score_title-vector -``` - - -### Search kNN with expected similarity [knn-similarity-search] - -While kNN is a powerful tool, it always tries to return `k` nearest neighbors. 
Consequently, when using `knn` with a `filter`, you could filter out all relevant documents and only have irrelevant ones left to search. In that situation, `knn` will still do its best to return `k` nearest neighbors, even though those neighbors could be far away in the vector space. - -To alleviate this worry, there is a `similarity` parameter available in the `knn` clause. This value is the required minimum similarity for a vector to be considered a match. The `knn` search flow with this parameter is as follows: - -* Apply any user provided `filter` queries -* Explore the vector space to get `k` vectors -* Do not return any vectors that are further away than the configured `similarity` - -::::{note} -`similarity` is the true [similarity](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity) before it has been transformed into `_score` and boost applied. -:::: - - -For each configured [similarity](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-similarity), here is the corresponding inverted `_score` function. This is so if you are wanting to filter from a `_score` perspective, you can do this minor transformation to correctly reject irrelevant results. - -* `l2_norm`: `sqrt((1 / _score) - 1)` -* `cosine`: `(2 * _score) - 1` -* `dot_product`: `(2 * _score) - 1` -* `max_inner_product`: - - * `_score < 1`: `1 - (1 / _score)` - * `_score >= 1`: `_score - 1` - - -Here is an example. In this example we search for the given `query_vector` for `k` nearest neighbors. However, with `filter` applied and requiring that the found vectors have at least the provided `similarity` between them. - -```console -POST image-index/_search -{ - "knn": { - "field": "image-vector", - "query_vector": [1, 5, -20], - "k": 5, - "num_candidates": 50, - "similarity": 36, - "filter": { - "term": { - "file-type": "png" - } - } - }, - "fields": ["title"], - "_source": false -} -``` - -In our data set, the only document with the file type of `png` has a vector of `[42, 8, -15]`. The `l2_norm` distance between `[42, 8, -15]` and `[1, 5, -20]` is `41.412`, which is greater than the configured similarity of `36`. Meaning, this search will return no hits. - - -### Nested kNN Search [nested-knn-search] - -It is common for text to exceed a particular model’s token limit and requires chunking before building the embeddings for individual chunks. When using [`nested`](https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html) with [`dense_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html), you can achieve nearest passage retrieval without copying top-level document metadata. - -Here is a simple passage vectors index that stores vectors and some top-level metadata for filtering. - -```console -PUT passage_vectors -{ - "mappings": { - "properties": { - "full_text": { - "type": "text" - }, - "creation_time": { - "type": "date" - }, - "paragraph": { - "type": "nested", - "properties": { - "vector": { - "type": "dense_vector", - "dims": 2, - "index_options": { - "type": "hnsw" - } - }, - "text": { - "type": "text", - "index": false - } - } - } - } - } -} -``` - -With the above mapping, we can index multiple passage vectors along with storing the individual passage text. 
- -```console -POST passage_vectors/_bulk?refresh=true -{ "index": { "_id": "1" } } -{ "full_text": "first paragraph another paragraph", "creation_time": "2019-05-04", "paragraph": [ { "vector": [ 0.45, 45 ], "text": "first paragraph", "paragraph_id": "1" }, { "vector": [ 0.8, 0.6 ], "text": "another paragraph", "paragraph_id": "2" } ] } -{ "index": { "_id": "2" } } -{ "full_text": "number one paragraph number two paragraph", "creation_time": "2020-05-04", "paragraph": [ { "vector": [ 1.2, 4.5 ], "text": "number one paragraph", "paragraph_id": "1" }, { "vector": [ -1, 42 ], "text": "number two paragraph", "paragraph_id": "2" } ] } -``` - -The query will seem very similar to a typical kNN search: - -```console -POST passage_vectors/_search -{ - "fields": ["full_text", "creation_time"], - "_source": false, - "knn": { - "query_vector": [ - 0.45, - 45 - ], - "field": "paragraph.vector", - "k": 2, - "num_candidates": 2 - } -} -``` - -Note below that even though we have 4 total vectors, we still return two documents. kNN search over nested dense_vectors will always diversify the top results over the top-level document. Meaning, `"k"` top-level documents will be returned, scored by their nearest passage vector (e.g. `"paragraph.vector"`). - -```console-result -{ - "took": 4, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 2, - "relation": "eq" - }, - "max_score": 1.0, - "hits": [ - { - "_index": "passage_vectors", - "_id": "1", - "_score": 1.0, - "fields": { - "creation_time": [ - "2019-05-04T00:00:00.000Z" - ], - "full_text": [ - "first paragraph another paragraph" - ] - } - }, - { - "_index": "passage_vectors", - "_id": "2", - "_score": 0.9997144, - "fields": { - "creation_time": [ - "2020-05-04T00:00:00.000Z" - ], - "full_text": [ - "number one paragraph number two paragraph" - ] - } - } - ] - } -} -``` - -What if you wanted to filter by some top-level document metadata? You can do this by adding `filter` to your `knn` clause. - -::::{note} -`filter` will always be over the top-level document metadata. This means you cannot filter based on `nested` field metadata. -:::: - - -```console -POST passage_vectors/_search -{ - "fields": [ - "creation_time", - "full_text" - ], - "_source": false, - "knn": { - "query_vector": [ - 0.45, - 45 - ], - "field": "paragraph.vector", - "k": 2, - "num_candidates": 2, - "filter": { - "bool": { - "filter": [ - { - "range": { - "creation_time": { - "gte": "2019-05-01", - "lte": "2019-05-05" - } - } - } - ] - } - } - } -} -``` - -Now we have filtered based on the top level `"creation_time"` and only one document falls within that range. - -```console-result -{ - "took": 4, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 1.0, - "hits": [ - { - "_index": "passage_vectors", - "_id": "1", - "_score": 1.0, - "fields": { - "creation_time": [ - "2019-05-04T00:00:00.000Z" - ], - "full_text": [ - "first paragraph another paragraph" - ] - } - } - ] - } -} -``` - - -### Nested kNN Search with Inner hits [nested-knn-search-inner-hits] - -Additionally, if you wanted to extract the nearest passage for a matched document, you can supply [inner_hits](https://www.elastic.co/guide/en/elasticsearch/reference/current/inner-hits.html) to the `knn` clause. 
- -::::{note} -When using `inner_hits` and multiple `knn` clauses, be sure to specify the [`inner_hits.name`](https://www.elastic.co/guide/en/elasticsearch/reference/current/inner-hits.html#inner-hits-options) field. Otherwise, a naming clash can occur and fail the search request. -:::: - - -```console -POST passage_vectors/_search -{ - "fields": [ - "creation_time", - "full_text" - ], - "_source": false, - "knn": { - "query_vector": [ - 0.45, - 45 - ], - "field": "paragraph.vector", - "k": 2, - "num_candidates": 2, - "inner_hits": { - "_source": false, - "fields": [ - "paragraph.text" - ], - "size": 1 - } - } -} -``` - -Now the result will contain the nearest found paragraph when searching. - -```console-result -{ - "took": 4, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 2, - "relation": "eq" - }, - "max_score": 1.0, - "hits": [ - { - "_index": "passage_vectors", - "_id": "1", - "_score": 1.0, - "fields": { - "creation_time": [ - "2019-05-04T00:00:00.000Z" - ], - "full_text": [ - "first paragraph another paragraph" - ] - }, - "inner_hits": { - "paragraph": { - "hits": { - "total": { - "value": 2, - "relation": "eq" - }, - "max_score": 1.0, - "hits": [ - { - "_index": "passage_vectors", - "_id": "1", - "_nested": { - "field": "paragraph", - "offset": 0 - }, - "_score": 1.0, - "fields": { - "paragraph": [ - { - "text": [ - "first paragraph" - ] - } - ] - } - } - ] - } - } - } - }, - { - "_index": "passage_vectors", - "_id": "2", - "_score": 0.9997144, - "fields": { - "creation_time": [ - "2020-05-04T00:00:00.000Z" - ], - "full_text": [ - "number one paragraph number two paragraph" - ] - }, - "inner_hits": { - "paragraph": { - "hits": { - "total": { - "value": 2, - "relation": "eq" - }, - "max_score": 0.9997144, - "hits": [ - { - "_index": "passage_vectors", - "_id": "2", - "_nested": { - "field": "paragraph", - "offset": 1 - }, - "_score": 0.9997144, - "fields": { - "paragraph": [ - { - "text": [ - "number two paragraph" - ] - } - ] - } - } - ] - } - } - } - } - ] - } -} -``` - - -### Indexing considerations [knn-indexing-considerations] - -For approximate kNN search, {{es}} stores the dense vector values of each segment as an [HNSW graph](https://arxiv.org/abs/1603.09320). Indexing vectors for approximate kNN search can take substantial time because of how expensive it is to build these graphs. You may need to increase the client request timeout for index and bulk requests. The [approximate kNN tuning guide](../../../solutions/search/vector/knn.md) contains important guidance around indexing performance, and how the index configuration can affect search performance. - -In addition to its search-time tuning parameters, the HNSW algorithm has index-time parameters that trade off between the cost of building the graph, search speed, and accuracy. 
When setting up the `dense_vector` mapping, you can use the [`index_options`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-index-options) argument to adjust these parameters: - -```console -PUT image-index -{ - "mappings": { - "properties": { - "image-vector": { - "type": "dense_vector", - "dims": 3, - "similarity": "l2_norm", - "index_options": { - "type": "hnsw", - "m": 32, - "ef_construction": 100 - } - } - } - } -} -``` - - -### Limitations for approximate kNN search [approximate-knn-limitations] - -* When using kNN search in [{{ccs}}](../../../solutions/search/cross-cluster-search.md), the [`ccs_minimize_roundtrips`](../../../solutions/search/cross-cluster-search.md#ccs-min-roundtrips) option is not supported. -* {{es}} uses the [HNSW algorithm](https://arxiv.org/abs/1603.09320) to support efficient kNN search. Like most kNN algorithms, HNSW is an approximate method that sacrifices result accuracy for improved search speed. This means the results returned are not always the true *k* closest neighbors. - -::::{note} -Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#dfs-query-then-fetch) search type in order to gather the global top `k` matches across shards. You cannot set the `search_type` explicitly when running kNN search. -:::: - - - -### Oversampling and rescoring for quantized vectors [dense-vector-knn-search-rescoring] - -When using [quantized vectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing: - -* **Oversampling**: Retrieve more candidates per shard. -* **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates. - -As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines: - -* The performance and memory gains of approximate retrieval using quantized vectors for retrieving the top candidates. -* The accuracy of using the original vectors for rescoring the top candidates. - -All forms of quantization will result in some accuracy loss and as the quantization level increases the accuracy loss will also increase. Generally, we have found that: - -* `int8` requires minimal if any rescoring -* `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss. -* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required. - -You can use the `rescore_vector` [preview] option to automatically perform reranking. When a rescore `oversample` parameter is specified, the approximate kNN search will: - -* Retrieve `num_candidates` candidates per shard. -* From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors. -* The top `k` rescored candidates will be returned. 
- -Here is an example of using the `rescore_vector` option with the `oversample` parameter: - -```console -POST image-index/_search -{ - "knn": { - "field": "image-vector", - "query_vector": [-5, 9, -12], - "k": 10, - "num_candidates": 100, - "rescore_vector": { - "oversample": 2.0 - } - }, - "fields": [ "title", "file-type" ] -} -``` - -This example will: - -* Search using approximate kNN for the top 100 candidates. -* Rescore the top 20 candidates (`oversample * k`) per shard using the original, non quantized vectors. -* Return the top 10 (`k`) rescored candidates. -* Merge the rescored canddidates from all shards, and return the top 10 (`k`) results. - - -#### Additional rescoring techniques [dense-vector-knn-search-rescoring-rescore-additional] - -The following sections provide additional ways of rescoring: - - -##### Use the `rescore` section for top-level kNN search [dense-vector-knn-search-rescoring-rescore-section] - -You can use this option when you don’t want to rescore on each shard, but on the top results from all shards. - -Use the [rescore section](https://www.elastic.co/guide/en/elasticsearch/reference/current/filter-search-results.html#rescore) in the `_search` request to rescore the top results from a kNN search. - -Here is an example using the top level `knn` search with oversampling and using `rescore` to rerank the results: - -```console -POST /my-index/_search -{ - "size": 10, <1> - "knn": { - "query_vector": [0.04283529, 0.85670587, -0.51402352, 0], - "field": "my_int4_vector", - "k": 20, <2> - "num_candidates": 50 - }, - "rescore": { - "window_size": 20, <3> - "query": { - "rescore_query": { - "script_score": { - "query": { - "match_all": {} - }, - "script": { - "source": "(dotProduct(params.queryVector, 'my_int4_vector') + 1.0)", <4> - "params": { - "queryVector": [0.04283529, 0.85670587, -0.51402352, 0] - } - } - } - }, - "query_weight": 0, <5> - "rescore_query_weight": 1 <6> - } - } -} -``` - -1. The number of results to return, note its only 10 and we will oversample by 2x, gathering 20 nearest neighbors. -2. The number of results to return from the KNN search. This will do an approximate KNN search with 50 candidates per HNSW graph and use the quantized vectors, returning the 20 most similar vectors according to the quantized score. Additionally, since this is the top-level `knn` object, the global top 20 results will from all shards will be gathered before rescoring. Combining with `rescore`, this is oversampling by `2x`, meaning gathering 20 nearest neighbors according to quantized scoring and rescoring with higher fidelity float vectors. -3. The number of results to rescore, if you want to rescore all results, set this to the same value as `k` -4. The script to rescore the results. Script score will interact directly with the originally provided float32 vector. -5. The weight of the original query, here we simply throw away the original score -6. The weight of the rescore query, here we only use the rescore query - - - -##### Use a `script_score` query to rescore per shard [dense-vector-knn-search-rescoring-script-score] - -You can use this option when you want to rescore on each shard and want more fine-grained control on the rescoring than the `rescore_vector` option provides. - -Use rescore per shard with the [knn query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html) and [script_score query ](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html). 
Generally, this means that there will be more rescoring per shard, but this can increase overall recall at the cost of compute. - -```console -POST /my-index/_search -{ - "size": 10, <1> - "query": { - "script_score": { - "query": { - "knn": { <2> - "query_vector": [0.04283529, 0.85670587, -0.51402352, 0], - "field": "my_int4_vector", - "num_candidates": 20 <3> - } - }, - "script": { - "source": "(dotProduct(params.queryVector, 'my_int4_vector') + 1.0)", <4> - "params": { - "queryVector": [0.04283529, 0.85670587, -0.51402352, 0] - } - } - } - } -} -``` - -1. The number of results to return -2. The `knn` query to perform the initial search, this is executed per-shard -3. The number of candidates to use for the initial approximate `knn` search. This will search using the quantized vectors and return the top 20 candidates per shard to then be scored -4. The script to score the results. Script score will interact directly with the originally provided float32 vector. - - - -## Exact kNN [exact-knn] - -To run an exact kNN search, use a `script_score` query with a vector function. - -1. Explicitly map one or more `dense_vector` fields. If you don’t intend to use the field for approximate kNN, set the `index` mapping option to `false`. This can significantly improve indexing speed. - - ```console - PUT product-index - { - "mappings": { - "properties": { - "product-vector": { - "type": "dense_vector", - "dims": 5, - "index": false - }, - "price": { - "type": "long" - } - } - } - } - ``` - -2. Index your data. - - ```console - POST product-index/_bulk?refresh=true - { "index": { "_id": "1" } } - { "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 } - { "index": { "_id": "2" } } - { "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 } - { "index": { "_id": "3" } } - { "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 } - ... - ``` - -3. Use the [search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html) to run a `script_score` query containing a [vector function](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html#vector-functions). - - ::::{tip} - To limit the number of matched documents passed to the vector function, we recommend you specify a filter query in the `script_score.query` parameter. If needed, you can use a [`match_all` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-all-query.html) in this parameter to match all documents. However, matching all documents can significantly increase search latency. 
- :::: - - - ```console - POST product-index/_search - { - "query": { - "script_score": { - "query" : { - "bool" : { - "filter" : { - "range" : { - "price" : { - "gte": 1000 - } - } - } - } - }, - "script": { - "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0", - "params": { - "queryVector": [-0.5, 90.0, -10, 14.8, -156.0] - } - } - } - } - } - ``` From caa50e164d5f8caf903cf80dc58bd30c035ebcaf Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 17:17:20 +0100 Subject: [PATCH 18/30] wip cleanup, del duplicates, start arranging tutorials --- solutions/search/hybrid-semantic-text.md | 4 +- solutions/search/semantic-search.md | 2 +- .../semantic-search/semantic-search-elser.md | 14 +- .../semantic-search-semantic-text.md | 4 +- .../semantic-text-hybrid-search.md | 203 ------------ solutions/search/vector.md | 2 +- solutions/search/vector/bring-own-vectors.md | 2 +- solutions/search/vector/dense-vector.md | 18 +- .../dense-versus-sparse-ingest-pipelines.md} | 14 +- .../search/vector/sparse-vector-elser.md | 296 ------------------ solutions/search/vector/sparse-vector.md | 22 ++ solutions/toc.yml | 6 +- 12 files changed, 61 insertions(+), 526 deletions(-) delete mode 100644 solutions/search/semantic-search/semantic-text-hybrid-search.md rename solutions/search/{semantic-search/semantic-search-deployed-nlp-model.md => vector/dense-versus-sparse-ingest-pipelines.md} (90%) delete mode 100644 solutions/search/vector/sparse-vector-elser.md create mode 100644 solutions/search/vector/sparse-vector.md diff --git a/solutions/search/hybrid-semantic-text.md b/solutions/search/hybrid-semantic-text.md index 8e373ead9d..3fe9fb0cee 100644 --- a/solutions/search/hybrid-semantic-text.md +++ b/solutions/search/hybrid-semantic-text.md @@ -4,9 +4,7 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text-hybrid-search.html --- - - -# Hybrid search [semantic-text-hybrid-search] +# Hybrid search with `semantic_text` [semantic-text-hybrid-search] This tutorial demonstrates how to perform hybrid search, combining semantic search with traditional full-text search. diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index da2f0bd066..983cfb0480 100644 --- a/solutions/search/semantic-search.md +++ b/solutions/search/semantic-search.md @@ -48,7 +48,7 @@ For an end-to-end tutorial, refer to [Semantic search with the {{infer}} API](in You can also deploy NLP in {{es}} manually, without using an {{infer}} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {{stack}}. You need to select an NLP model from the [list of supported dense and sparse vector models](../../explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. -For an end-to-end tutorial, refer to [Semantic search with a model deployed in {{es}}](semantic-search/semantic-search-deployed-nlp-model.md). +For an end-to-end tutorial, refer to [Semantic search with a model deployed in {{es}}](vector/dense-versus-sparse-ingest-pipelines.md). ::::{tip} Refer to [vector queries and field types](vector.md#vector-queries-and-field-types) for a quick reference overview. 
diff --git a/solutions/search/semantic-search/semantic-search-elser.md b/solutions/search/semantic-search/semantic-search-elser.md index 565ee0c107..83f1f4de7d 100644 --- a/solutions/search/semantic-search/semantic-search-elser.md +++ b/solutions/search/semantic-search/semantic-search-elser.md @@ -1,5 +1,5 @@ --- -navigation_title: "Semantic search with ELSER" +navigation_title: "Semantic search with ELSER (ingest pipelines)" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-elser.html --- @@ -19,21 +19,19 @@ For the easiest way to perform semantic search in the {{stack}}, refer to the [` ::::{note} -Only the first 512 extracted tokens per field are considered during semantic search with ELSER. Refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-limitations.md#ml-nlp-elser-v1-limit-512) for more information. +Only the first 512 extracted tokens per field are considered during semantic search with ELSER. Refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-limitations.md#ml-nlp-elser-v1-limit-512) for more information. :::: ### Requirements [requirements] -To perform semantic search by using ELSER, you must have the NLP model deployed in your cluster. Refer to the [ELSER documentation](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to learn how to download and deploy the model. +To perform semantic search by using ELSER, you must have the NLP model deployed in your cluster. Refer to the [ELSER documentation](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to learn how to download and deploy the model. ::::{note} The minimum dedicated ML node size for deploying and using the ELSER model is 4 GB in Elasticsearch Service if [deployment autoscaling](../../../deploy-manage/autoscaling.md) is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself. :::: - - ### Create the index mapping [elser-mappings] First, the mapping of the destination index - the index that contains the tokens that the model created based on your text - must be created. The destination index must have a field with the [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) or [`rank_features`](https://www.elastic.co/guide/en/elasticsearch/reference/current/rank-features.html) field type to index the ELSER output. @@ -162,7 +160,7 @@ GET my-index/_search } ``` -The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](../vector/sparse-vector-elser.md#save-space) to learn more. +The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. 
The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](../vector/sparse-vector-elser.md#save-space) to learn more. ```console-result "hits": { @@ -287,8 +285,8 @@ Depending on your data, the `sparse_vector` query may be faster with `track_tota ### Further reading [further-reading] -* [How to download and deploy ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md) -* [ELSER limitation](../../../explore-analyze/machine-learning/nlp/ml-nlp-limitations.md#ml-nlp-elser-v1-limit-512) +* [How to download and deploy ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) +* [ELSER limitation](/explore-analyze/machine-learning/nlp/ml-nlp-limitations.md#ml-nlp-elser-v1-limit-512) * [Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model](https://www.elastic.co/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model) diff --git a/solutions/search/semantic-search/semantic-search-semantic-text.md b/solutions/search/semantic-search/semantic-search-semantic-text.md index 79aacbcd6c..1cc21ad3ee 100644 --- a/solutions/search/semantic-search/semantic-search-semantic-text.md +++ b/solutions/search/semantic-search/semantic-search-semantic-text.md @@ -128,5 +128,5 @@ As a result, you receive the top 10 documents that are closest in meaning to the ## Further examples and reading [semantic-text-further-examples] * If you want to use `semantic_text` in hybrid search, refer to [this notebook](https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb) for a step-by-step guide. -* For more information on how to optimize your ELSER endpoints, refer to [the ELSER recommendations](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-recommendations) section in the model documentation. -* To learn more about model autoscaling, refer to the [trained model autoscaling](../../../explore-analyze/machine-learning/nlp/ml-nlp-auto-scale.md) page. +* For more information on how to optimize your ELSER endpoints, refer to [the ELSER recommendations](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-recommendations) section in the model documentation. +* To learn more about model autoscaling, refer to the [trained model autoscaling](/explore-analyze/machine-learning/nlp/ml-nlp-auto-scale.md) page. diff --git a/solutions/search/semantic-search/semantic-text-hybrid-search.md b/solutions/search/semantic-search/semantic-text-hybrid-search.md deleted file mode 100644 index 4406417eee..0000000000 --- a/solutions/search/semantic-search/semantic-text-hybrid-search.md +++ /dev/null @@ -1,203 +0,0 @@ ---- -navigation_title: "Hybrid search with `semantic_text`" -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text-hybrid-search.html ---- - - - -# Hybrid search with `semantic_text` [semantic-text-hybrid-search] - - -This tutorial demonstrates how to perform hybrid search, combining semantic search with traditional full-text search. - -In hybrid search, semantic search retrieves results based on the meaning of the text, while full-text search focuses on exact word matches. 
By combining both methods, hybrid search delivers more relevant results, particularly in cases where relying on a single approach may not be sufficient.
-
-The recommended way to use hybrid search in the {{stack}} is to follow the `semantic_text` workflow. This tutorial uses the [`elasticsearch` service](../inference-api/elasticsearch-inference-integration.md) for demonstration, but you can use any service and its supported models offered by the {{infer-cap}} API.
-
-
-## Create an index mapping [hybrid-search-create-index-mapping]
-
-The destination index will contain both the embeddings for semantic search and the original text field for full-text search. This structure enables the combination of semantic search and full-text search.
-
-```console
-PUT semantic-embeddings
-{
-  "mappings": {
-    "properties": {
-      "semantic_text": { <1>
-        "type": "semantic_text"
-      },
-      "content": { <2>
-        "type": "text",
-        "copy_to": "semantic_text" <3>
-      }
-    }
-  }
-}
-```
-
-1. The name of the field to contain the generated embeddings for semantic search.
-2. The name of the field to contain the original text for lexical search.
-3. The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {{infer}} endpoint.
-
-
-::::{note}
-If you want to run a search on indices that were populated by web crawlers or connectors, you have to [update the index mappings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html) for these indices to include the `semantic_text` field. Once the mapping is updated, you’ll need to run a full web crawl or a full connector sync. This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling hybrid search on the updated data.
-
-::::
-
-
-
-## Load data [semantic-text-hybrid-load-data]
-
-In this step, you load the data that you will later use to create embeddings.
-
-Use the `msmarco-passagetest2019-top1000` data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a [tsv file](https://github.com/elastic/stack-docs/blob/main/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv).
-
-Download the file and upload it to your cluster using the [Data Visualizer](../../../manage-data/ingest.md#upload-data-kibana) in the {{ml-app}} UI. After your data is analyzed, click **Override settings**. Under **Edit field names**, assign `id` to the first column and `content` to the second. Click **Apply**, then **Import**. Name the index `test-data`, and click **Import**. After the upload is complete, you will see an index named `test-data` with 182,469 documents.
-
-
-## Reindex the data for hybrid search [hybrid-search-reindex-data]
-
-Reindex the data from the `test-data` index into the `semantic-embeddings` index. The data in the `content` field of the source index is copied into the `content` field of the destination index. The `copy_to` parameter set in the index mapping creation ensures that the content is copied into the `semantic_text` field. The data is processed by the {{infer}} endpoint at ingest time to generate embeddings.
-
-::::{note}
-This step uses the reindex API to simulate data ingestion.
If you are working with data that has already been indexed, rather than using the `test-data` set, reindexing is still required to ensure that the data is processed by the {{infer}} endpoint and the necessary embeddings are generated. - -:::: - - -```console -POST _reindex?wait_for_completion=false -{ - "source": { - "index": "test-data", - "size": 10 <1> - }, - "dest": { - "index": "semantic-embeddings" - } -} -``` - -1. The default batch size for reindexing is 1000. Reducing size to a smaller number makes the update of the reindexing process quicker which enables you to follow the progress closely and detect errors early. - - -The call returns a task ID to monitor the progress: - -```console -GET _tasks/ -``` - -Reindexing large datasets can take a long time. You can test this workflow using only a subset of the dataset. - -To cancel the reindexing process and generate embeddings for the subset that was reindexed: - -```console -POST _tasks//_cancel -``` - - -## Perform hybrid search [hybrid-search-perform-search] - -After reindexing the data into the `semantic-embeddings` index, you can perform hybrid search by using [reciprocal rank fusion (RRF)](https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html). RRF is a technique that merges the rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant. - -```console -GET semantic-embeddings/_search -{ - "retriever": { - "rrf": { - "retrievers": [ - { - "standard": { <1> - "query": { - "match": { - "content": "How to avoid muscle soreness while running?" <2> - } - } - } - }, - { - "standard": { <3> - "query": { - "semantic": { - "field": "semantic_text", <4> - "query": "How to avoid muscle soreness while running?" - } - } - } - } - ] - } - } -} -``` - -1. The first `standard` retriever represents the traditional lexical search. -2. Lexical search is performed on the `content` field using the specified phrase. -3. The second `standard` retriever refers to the semantic search. -4. The `semantic_text` field is used to perform the semantic search. - - -After performing the hybrid search, the query will return the top 10 documents that match both semantic and lexical search criteria. The results include detailed information about each document: - -```console-result -{ - "took": 107, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 473, - "relation": "eq" - }, - "max_score": null, - "hits": [ - { - "_index": "semantic-embeddings", - "_id": "wv65epIBEMBRnhfTsOFM", - "_score": 0.032786883, - "_rank": 1, - "_source": { - "semantic_text": { - "inference": { - "inference_id": "my-elser-endpoint", - "model_settings": { - "task_type": "sparse_embedding" - }, - "chunks": [ - { - "text": "What so many out there do not realize is the importance of what you do after you work out. You may have done the majority of the work, but how you treat your body in the minutes and hours after you exercise has a direct effect on muscle soreness, muscle strength and growth, and staying hydrated. Cool Down. After your last exercise, your workout is not over. The first thing you need to do is cool down. Even if running was all that you did, you still should do light cardio for a few minutes. 
This brings your heart rate down at a slow and steady pace, which helps you avoid feeling sick after a workout.", - "embeddings": { - "exercise": 1.571044, - "after": 1.3603843, - "sick": 1.3281639, - "cool": 1.3227621, - "muscle": 1.2645415, - "sore": 1.2561599, - "cooling": 1.2335974, - "running": 1.1750668, - "hours": 1.1104802, - "out": 1.0991782, - "##io": 1.0794281, - "last": 1.0474665, - (...) - } - } - ] - } - }, - "id": 8408852, - "content": "What so many out there do not realize is the importance of (...)" - } - } - ] - } -} -``` diff --git a/solutions/search/vector.md b/solutions/search/vector.md index 68124d90e3..3580a8ae0d 100644 --- a/solutions/search/vector.md +++ b/solutions/search/vector.md @@ -40,4 +40,4 @@ Sparse vectors use ELSER to expand content with semantically related terms. This - Domain-specific search - Large-scale deployments -[Learn more about sparse vector search with ELSER](vector/sparse-vector-elser.md). \ No newline at end of file +[Learn more about sparse vector search with ELSER](vector/sparse-vector.md). \ No newline at end of file diff --git a/solutions/search/vector/bring-own-vectors.md b/solutions/search/vector/bring-own-vectors.md index ecac5f1932..fd943f8c14 100644 --- a/solutions/search/vector/bring-own-vectors.md +++ b/solutions/search/vector/bring-own-vectors.md @@ -126,7 +126,7 @@ In this simple example, we’re sending a raw vector for the query text. In a re For this you’ll need to deploy a text embedding model in {{es}} and use the [`query_vector_builder` parameter](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-knn-query.html#knn-query-top-level-parameters). Alternatively, you can generate vectors client-side and send them directly with the search request. -Learn how to [use a deployed text embedding model](../semantic-search/semantic-search-deployed-nlp-model.md) for semantic search. +Learn how to [use a deployed text embedding model](dense-versus-sparse-ingest-pipelines.md) for semantic search. ::::{tip} If you’re just getting started with vector search in {{es}}, refer to [Semantic search](../semantic-search.md). diff --git a/solutions/search/vector/dense-vector.md b/solutions/search/vector/dense-vector.md index 2970ebcdca..bd1776856c 100644 --- a/solutions/search/vector/dense-vector.md +++ b/solutions/search/vector/dense-vector.md @@ -1,3 +1,19 @@ # Dense vector -% What needs to be done: Write from scratch \ No newline at end of file +Dense neural embeddings capture semantic meaning by translating content into fixed-length vectors of floating-point numbers. Similar content maps to nearby points in the vector space, making them ideal for: + +- Finding semantically similar content +- Matching questions with answers +- Image similarity search +- Content-based recommendations + +## Working with dense vectors in Elasticsearch + +Dense vector search requires both index configuration and a strategy for generating embeddings. To use dense vectors in {{es}}: + +1. Index documents with embeddings + - You can generate embeddings within {{es}} + - Refer to [this overview](../semantic-search.md#using-nlp-models) of the main options + - You can also [bring your own embeddings](bring-own-vectors.md) + - Use the `dense_vector` field type +2. 
Query the index using the [`knn` search](knn.md) \ No newline at end of file diff --git a/solutions/search/semantic-search/semantic-search-deployed-nlp-model.md b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md similarity index 90% rename from solutions/search/semantic-search/semantic-search-deployed-nlp-model.md rename to solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index b654244f10..63b7c5277d 100644 --- a/solutions/search/semantic-search/semantic-search-deployed-nlp-model.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -1,16 +1,16 @@ --- -navigation_title: "Semantic search with deployed model" +navigation_title: "Tutorial: Dense and sparse workflows with ingest pipelines" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-deployed-nlp-model.html --- -# Semantic search with deployed model [semantic-search-deployed-nlp-model] +# Semantic search with deployed model (dense and sparse tabs) [semantic-search-deployed-nlp-model] ::::{important} -* For the easiest way to perform semantic search in the {{stack}}, refer to the [`semantic_text`](semantic-search-semantic-text.md) end-to-end tutorial. +* For the easiest way to perform semantic search in the {{stack}}, refer to the [`semantic_text`](../semantic-search/semantic-search-semantic-text.md) end-to-end tutorial. * This tutorial was written before the [{{infer}} endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html) and [`semantic_text` field type](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html) was introduced. Today we have simpler options for performing semantic search. :::: @@ -25,7 +25,7 @@ This guide shows you how to implement semantic search with models deployed in {{ While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging. Selecting an appropriate model from our third-party model list is the first step. Training the model on your own data is essential to ensure better search results than using only BM25. However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming. -To address this issue, Elastic provides a pre-trained representational model called [Elastic Learned Sparse EncodeR (ELSER)](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md). ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning. This adaptability makes it suitable for various NLP use cases out of the box. Unless you have a team of ML specialists, it is highly recommended to use the ELSER model. +To address this issue, Elastic provides a pre-trained representational model called [Elastic Learned Sparse EncodeR (ELSER)](../explore-analyze/machine-learning/nlp/ml-nlp-elser.md). ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning. This adaptability makes it suitable for various NLP use cases out of the box. Unless you have a team of ML specialists, it is highly recommended to use the ELSER model. In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values. This representation is commonly used for textual data. In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors. 
Each non-zero element of the vector corresponds to a term in the model vocabulary. The ELSER vocabulary contains around 30000 terms, so the sparse vectors created by ELSER contain about 30000 values, the majority of which are zero. Effectively the ELSER model is replacing the terms in the original query with other terms that have been learnt to exist in the documents that best match the original search terms in a training dataset, and weights to control how important each is. @@ -55,7 +55,7 @@ Before you start using the deployed model to generate embeddings based on your i ::::::{tab-item} ELSER ELSER produces token-weight pairs as output from the input text and the query. The {{es}} [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) field type can store these token-weight pairs as numeric feature vectors. The index must have a field with the `sparse_vector` field type to index the tokens that ELSER generates. -To create a mapping for your ELSER index, refer to the [Create the index mapping section](../vector/sparse-vector-elser.md#elser-mappings) of the tutorial. The example shows how to create an index mapping for `my-index` that defines the `my_embeddings.tokens` field - which will contain the ELSER output - as a `sparse_vector` field. +To create a mapping for your ELSER index, refer to the [Create the index mapping section](../semantic-search/semantic-search-elser.md#elser-mappings) of the tutorial. The example shows how to create an index mapping for `my-index` that defines the `my_embeddings.tokens` field - which will contain the ELSER output - as a `sparse_vector` field. ```console PUT my-index @@ -142,7 +142,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results. -To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](../vector/sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). +To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). :::::: ::::::{tab-item} Dense vector models @@ -201,7 +201,7 @@ GET my-index/_search :::::: ::::::{tab-item} Dense vector models -Text embeddings produced by dense vector models can be queried using a [kNN search](../vector/knn.md#knn-semantic-search). In the `knn` clause, provide the name of the dense vector field, and a `query_vector_builder` clause with the model ID and the query text. +Text embeddings produced by dense vector models can be queried using a [kNN search](knn.md#knn-semantic-search). 
In the `knn` clause, provide the name of the dense vector field, and a `query_vector_builder` clause with the model ID and the query text. ```console GET my-index/_search diff --git a/solutions/search/vector/sparse-vector-elser.md b/solutions/search/vector/sparse-vector-elser.md deleted file mode 100644 index 55c7d4f3c5..0000000000 --- a/solutions/search/vector/sparse-vector-elser.md +++ /dev/null @@ -1,296 +0,0 @@ ---- -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-elser.html ---- - - - -# Sparse vector (ELSER) [semantic-search-elser] - - -Elastic Learned Sparse EncodeR - or ELSER - is an NLP model trained by Elastic that enables you to perform semantic search by using sparse vector representation. Instead of literal matching on search terms, semantic search retrieves results based on the intent and the contextual meaning of a search query. - -The instructions in this tutorial shows you how to use ELSER to perform semantic search on your data. - -::::{important} -For the easiest way to perform semantic search in the {{stack}}, refer to the [`semantic_text`](../semantic-search/semantic-search-semantic-text.md) end-to-end tutorial. -:::: - - -::::{note} -Only the first 512 extracted tokens per field are considered during semantic search with ELSER. Refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-limitations.md#ml-nlp-elser-v1-limit-512) for more information. -:::: - - - -### Requirements [requirements] - -To perform semantic search by using ELSER, you must have the NLP model deployed in your cluster. Refer to the [ELSER documentation](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to learn how to download and deploy the model. - -::::{note} -The minimum dedicated ML node size for deploying and using the ELSER model is 4 GB in Elasticsearch Service if [deployment autoscaling](../../../deploy-manage/autoscaling.md) is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself. -:::: - - - -### Create the index mapping [elser-mappings] - -First, the mapping of the destination index - the index that contains the tokens that the model created based on your text - must be created. The destination index must have a field with the [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) or [`rank_features`](https://www.elastic.co/guide/en/elasticsearch/reference/current/rank-features.html) field type to index the ELSER output. - -::::{note} -ELSER output must be ingested into a field with the `sparse_vector` or `rank_features` field type. Otherwise, {{es}} interprets the token-weight pairs as a massive amount of fields in a document. If you get an error similar to this: `"Limit of total fields [1000] has been exceeded while adding new fields"` then the ELSER output field is not mapped properly and it has a field type different than `sparse_vector` or `rank_features`. -:::: - - -```console -PUT my-index -{ - "mappings": { - "properties": { - "content_embedding": { <1> - "type": "sparse_vector" <2> - }, - "content": { <3> - "type": "text" <4> - } - } - } -} -``` - -1. The name of the field to contain the generated tokens. 
It must be referenced in the {{infer}} pipeline configuration in the next step.
-2. The field to contain the tokens is a `sparse_vector` field.
-3. The name of the field from which to create the sparse vector representation. In this example, the name of the field is `content`. It must be referenced in the {{infer}} pipeline configuration in the next step.
-4. The field type, which is text in this example.
-
-
-To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](#save-space) section.
-
-
-### Create an ingest pipeline with an inference processor [inference-ingest-pipeline]
-
-Create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) to use ELSER to infer against the data that is being ingested in the pipeline.
-
-```console
-PUT _ingest/pipeline/elser-v2-test
-{
-  "processors": [
-    {
-      "inference": {
-        "model_id": ".elser_model_2",
-        "input_output": [ <1>
-          {
-            "input_field": "content",
-            "output_field": "content_embedding"
-          }
-        ]
-      }
-    }
-  ]
-}
-```
-
-1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results.
-
-
-
-### Load data [load-data]
-
-In this step, you load the data that you later use in the {{infer}} ingest pipeline to extract tokens from it.
-
-Use the `msmarco-passagetest2019-top1000` data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a [tsv file](https://github.com/elastic/stack-docs/blob/main/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv).
-
-::::{important}
-The `msmarco-passagetest2019-top1000` dataset was not utilized to train the model. We use this sample dataset in the tutorial because it is easily accessible for demonstration purposes. You can use a different data set to test the workflow and become familiar with it.
-::::
-
-
-Download the file and upload it to your cluster using the [File Uploader](../../../manage-data/ingest.md#upload-data-kibana) in the UI. After your data is analyzed, click **Override settings**. Under **Edit field names**, assign `id` to the first column and `content` to the second. Click **Apply**, then **Import**. Name the index `test-data`, and click **Import**. After the upload is complete, you will see an index named `test-data` with 182,469 documents.
-
-
-### Ingest the data through the {{infer}} ingest pipeline [reindexing-data-elser]
-
-Create the tokens from the text by reindexing the data through the {{infer}} pipeline that uses ELSER as the inference model.
-
-```console
-POST _reindex?wait_for_completion=false
-{
-  "source": {
-    "index": "test-data",
-    "size": 50 <1>
-  },
-  "dest": {
-    "index": "my-index",
-    "pipeline": "elser-v2-test"
-  }
-}
-```
-
-1. The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the update of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.
-
-
-The call returns a task ID to monitor the progress:
-
-```console
-GET _tasks/<task_id>
-```
-
-You can also open the Trained Models UI and select the Pipelines tab under ELSER to follow the progress.
-
-Reindexing large datasets can take a long time.
You can test this workflow using only a subset of the dataset. Do this by cancelling the reindexing process, and only generating embeddings for the subset that was reindexed. The following API request will cancel the reindexing task: - -```console -POST _tasks//_cancel -``` - - -### Semantic search by using the `sparse_vector` query [text-expansion-query] - -To perform semantic search, use the [`sparse_vector` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-sparse-vector-query.html), and provide the query text and the inference ID associated with your ELSER model. The example below uses the query text "How to avoid muscle soreness after running?", the `content_embedding` field contains the generated ELSER output: - -```console -GET my-index/_search -{ - "query":{ - "sparse_vector":{ - "field": "content_embedding", - "inference_id": "my-elser-endpoint", - "query": "How to avoid muscle soreness after running?" - } - } -} -``` - -The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](#save-space) to learn more. - -```console-result -"hits": { - "total": { - "value": 10000, - "relation": "gte" - }, - "max_score": 26.199875, - "hits": [ - { - "_index": "my-index", - "_id": "FPr9HYsBag9jXmT8lEpI", - "_score": 26.199875, - "_source": { - "content_embedding": { - "muscular": 0.2821541, - "bleeding": 0.37929374, - "foods": 1.1718726, - "delayed": 1.2112266, - "cure": 0.6848574, - "during": 0.5886185, - "fighting": 0.35022718, - "rid": 0.2752442, - "soon": 0.2967024, - "leg": 0.37649947, - "preparation": 0.32974035, - "advance": 0.09652356, - (...) - }, - "id": 1713868, - "model_id": ".elser_model_2", - "content": "For example, if you go for a run, you will mostly use the muscles in your lower body. Give yourself 2 days to rest those muscles so they have a chance to heal before you exercise them again. Not giving your muscles enough time to rest can cause muscle damage, rather than muscle development." - } - }, - (...) - ] -} -``` - - -### Combining semantic search with other queries [text-expansion-compound-query] - -You can combine [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-sparse-vector-query.html) with other queries in a [compound query](https://www.elastic.co/guide/en/elasticsearch/reference/current/compound-queries.html). For example, use a filter clause in a [Boolean](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html) or a full text query with the same (or different) query text as the `sparse_vector` query. This enables you to combine the search results from both queries. - -The search hits from the `sparse_vector` query tend to score higher than other {{es}} queries. Those scores can be regularized by increasing or decreasing the relevance scores of each query by using the `boost` parameter. Recall on the `sparse_vector` query can be high where there is a long tail of less relevant results. Use the `min_score` parameter to prune those less relevant documents. 
- -```console -GET my-index/_search -{ - "query": { - "bool": { <1> - "should": [ - { - "sparse_vector": { - "field": "content_embedding", - "inference_id": "my-elser-endpoint", - "query": "How to avoid muscle soreness after running?", - "boost": 1 <2> - } - }, - { - "query_string": { - "query": "toxins", - "boost": 4 <3> - } - } - ] - } - }, - "min_score": 10 <4> -} -``` - -1. Both the `sparse_vector` and the `query_string` queries are in a `should` clause of a `bool` query. -2. The `boost` value is `1` for the `sparse_vector` query which is the default value. This means that the relevance score of the results of this query are not boosted. -3. The `boost` value is `4` for the `query_string` query. The relevance score of the results of this query is increased causing them to rank higher in the search results. -4. Only the results with a score equal to or higher than `10` are displayed. - - - -## Optimizing performance [optimization] - - -### Saving disk space by excluding the ELSER tokens from document source [save-space] - -The tokens generated by ELSER must be indexed for use in the [sparse_vector query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-sparse-vector-query.html). However, it is not necessary to retain those terms in the document source. You can save disk space by using the [source exclude](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#include-exclude) mapping to remove the ELSER terms from the document source. - -::::{warning} -Reindex uses the document source to populate the destination index. **Once the ELSER terms have been excluded from the source, they cannot be recovered through reindexing.** Excluding the tokens from the source is a space-saving optimization that should only be applied if you are certain that reindexing will not be required in the future! It’s important to carefully consider this trade-off and make sure that excluding the ELSER terms from the source aligns with your specific requirements and use case. Review the [Disabling the `_source` field](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#disable-source-field) and [Including / Excluding fields from `_source`](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#include-exclude) sections carefully to learn more about the possible consequences of excluding the tokens from the `_source`. -:::: - - -The mapping that excludes `content_embedding` from the `_source` field can be created by the following API call: - -```console -PUT my-index -{ - "mappings": { - "_source": { - "excludes": [ - "content_embedding" - ] - }, - "properties": { - "content_embedding": { - "type": "sparse_vector" - }, - "content": { - "type": "text" - } - } - } -} -``` - -::::{note} -Depending on your data, the `sparse_vector` query may be faster with `track_total_hits: false`. 
-
-::::
-
-
-
-### Further reading [further-reading]
-
-* [How to download and deploy ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)
-* [ELSER limitation](../../../explore-analyze/machine-learning/nlp/ml-nlp-limitations.md#ml-nlp-elser-v1-limit-512)
-* [Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model](https://www.elastic.co/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model)
-
-
-### Interactive example [interactive-example]
-
-* The `elasticsearch-labs` repo has an interactive example of running [ELSER-powered semantic search](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb) using the {{es}} Python client.
diff --git a/solutions/search/vector/sparse-vector.md b/solutions/search/vector/sparse-vector.md
new file mode 100644
index 0000000000..ea02c4af92
--- /dev/null
+++ b/solutions/search/vector/sparse-vector.md
@@ -0,0 +1,22 @@
+# Sparse vector [sparse-vector]
+
+When working with sparse vectors in {{es}}, you'll be using the Elastic Learned Sparse EncodeR (ELSER) model at index and query time to expand content with semantically related terms.
+
+This approach preserves explainability while adding semantic understanding, with each document or query expanded into a set of weighted terms.
+
+Sparse vector search with ELSER is ideal for:
+
+- Enhanced keyword search
+- Cases requiring explainable results
+- Domain-specific search
+- Large-scale deployments
+
+## Working with sparse vectors in Elasticsearch
+
+Sparse vector search with ELSER expands both documents and queries into weighted terms. To use sparse vectors in {{es}}:
+
+1. Index documents with ELSER
+   - Deploy and configure the ELSER model
+   - Use the `sparse_vector` field type
+   - See [this overview](../semantic-search.md#using-nlp-models) for implementation options
+2.
Query the index using the [`sparse_vector` search](sparse-vector-elser.md#querying) \ No newline at end of file diff --git a/solutions/toc.yml b/solutions/toc.yml index 9d8869e1ba..ca4bc38295 100644 --- a/solutions/toc.yml +++ b/solutions/toc.yml @@ -637,15 +637,15 @@ toc: children: - file: search/vector/knn.md - file: search/vector/bring-own-vectors.md - - file: search/vector/sparse-vector-elser.md + - file: search/vector/sparse-vector.md + - file: search/vector/dense-versus-sparse-ingest-pipelines.md + - file: search/semantic-search.md children: - file: search/semantic-search/semantic-search-semantic-text.md - - file: search/semantic-search/semantic-text-hybrid-search.md - file: search/semantic-search/semantic-search-inference.md - file: search/semantic-search/semantic-search-elser.md - file: search/semantic-search/cohere-es.md - - file: search/semantic-search/semantic-search-deployed-nlp-model.md - file: search/hybrid-search.md children: - file: search/hybrid-semantic-text.md From 115ccaddab53fbebe221498127a8b792801008f8 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 18:13:03 +0100 Subject: [PATCH 19/30] links --- .../self-managed/install-elasticsearch-with-docker.md | 2 +- explore-analyze/machine-learning/nlp/ml-nlp-elser.md | 6 +++--- solutions/search/api-quickstarts.md | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md index 3bde5e1a06..f6f042d6dd 100644 --- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md +++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md @@ -90,7 +90,7 @@ docker pull docker.elastic.co/elasticsearch/elasticsearch-wolfi:9.0.0-beta1 :::: - {{ml-cap}} features such as [semantic search with ELSER](../../../solutions/search/vector/sparse-vector-elser.md) require a larger container with more than 1GB of memory. If you intend to use the {{ml}} capabilities, then start the container with this command: + {{ml-cap}} features such as [semantic search with ELSER](/solutions/search/vector/semantic-search-elser.md) require a larger container with more than 1GB of memory. If you intend to use the {{ml}} capabilities, then start the container with this command: ```sh docker run --name es01 --net elastic -p 9200:9200 -it -m 6GB -e "xpack.ml.use_auto_machine_memory_percent=true" docker.elastic.co/elasticsearch/elasticsearch:9.0.0-beta1 diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index 8f8cb44c7a..d217f36627 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -6,7 +6,7 @@ mapped_pages: # ELSER [ml-nlp-elser] -Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](../../../solutions/search/vector/sparse-vector-elser.md) to retrieve more relevant search results. This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. +Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](/solutions/search/vector/sparse-vector-elser.md) to retrieve more relevant search results. This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. 
ELSER is an out-of-domain model which means it does not require fine-tuning on your own data, making it adaptable for various use cases out of the box.
@@ -42,7 +42,7 @@ If you want to learn more about the ELSER V2 improvements, refer to [this blog p
 ### Upgrading to ELSER v2 [upgrade-elser-v2]
 
-ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](../../../solutions/search/vector/sparse-vector-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline.
+ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/vector/sparse-vector-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline.
 
 Additionally, the `elasticearch-labs` GitHub repository contains an interactive [Python notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/model-upgrades/upgrading-index-to-use-elser.ipynb) that walks through upgrading an index to ELSER V2.
@@ -298,7 +298,7 @@ To gain the biggest value out of ELSER trained models, consider to follow this l
 ## Further reading [further-readings]
 
 * [Perform semantic search with `semantic_text` using the ELSER endpoint](../../../solutions/search/semantic-search/semantic-search-semantic-text.md)
-* [Perform semantic search with ELSER](../../../solutions/search/vector/sparse-vector-elser.md)
+* [Perform semantic search with ELSER](/solutions/search/vector/sparse-vector-elser.md)
 
 
 ## Benchmark information [elser-benchmarks]
diff --git a/solutions/search/api-quickstarts.md b/solutions/search/api-quickstarts.md
index 95df931323..9770686587 100644
--- a/solutions/search/api-quickstarts.md
+++ b/solutions/search/api-quickstarts.md
@@ -7,7 +7,7 @@ Use the following quickstarts to get hands-on experience with Elasticsearch APIs
 
 - [Analyze eCommerce data with aggregations using Query DSL](/explore-analyze/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md): Learn how to analyze data using different types of aggregations, including metrics, buckets, and pipelines.
 % - [Getting started with ES|QL](esql-getting-started.md): Learn how to query and aggregate your data using ES|QL.
 - [Semantic search](semantic-search/semantic-search-semantic-text.md): Learn how to create embeddings for your data with `semantic_text` and query using the `semantic` query.
- - [Hybrid search](semantic-search/semantic-text-hybrid-search.md): Learn how to combine semantic search using`semantic_text` with full-text search.
+ - [Hybrid search](hybrid-semantic-text.md): Learn how to combine semantic search using `semantic_text` with full-text search.
 - [Bring your own dense vector embeddings](vector/bring-own-vectors.md): Learn how to ingest dense vector embeddings into Elasticsearch.
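As a rough sketch of the bring-your-own-vectors flow listed above, the following example maps a small `dense_vector` field and indexes one precomputed embedding. The index name, field names, and the toy 3-dimensional vector are illustrative placeholders; real embedding models typically produce hundreds of dimensions:

```console
PUT my-byo-vectors
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      },
      "my_text": {
        "type": "text"
      }
    }
  }
}

POST my-byo-vectors/_doc
{
  "my_text": "A sample passage",
  "my_vector": [0.12, -0.53, 0.88]
}
```

Queries against `my_vector` would then use the `knn` query, passing a query vector produced by the same external model that generated the document embeddings.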
:::{tip} From f9c7b8a8a8ce1d8e8c8db742ae739b469ce12738 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 18:15:14 +0100 Subject: [PATCH 20/30] links --- solutions/search/applications/search-application-api.md | 2 +- solutions/search/semantic-search/semantic-search-elser.md | 4 ++-- .../search/vector/dense-versus-sparse-ingest-pipelines.md | 6 +++--- solutions/search/vector/sparse-vector.md | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/solutions/search/applications/search-application-api.md b/solutions/search/applications/search-application-api.md index e3919b6a63..d4068dc73f 100644 --- a/solutions/search/applications/search-application-api.md +++ b/solutions/search/applications/search-application-api.md @@ -367,7 +367,7 @@ POST _application/search_application/my-search-app/_search ### Text search + ELSER [search-application-api-catchall-template] -The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](../vector/sparse-vector-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. +The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](/solutions/search/vector/sparse-vector-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. This example provides a single template that you can use for various search application scenarios: text search, ELSER, or all of the above. It also provides a simple default `query_string` query if no parameters are specified. diff --git a/solutions/search/semantic-search/semantic-search-elser.md b/solutions/search/semantic-search/semantic-search-elser.md index 83f1f4de7d..4a95ce0753 100644 --- a/solutions/search/semantic-search/semantic-search-elser.md +++ b/solutions/search/semantic-search/semantic-search-elser.md @@ -63,7 +63,7 @@ PUT my-index 4. The field type which is text in this example. -To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](../vector/sparse-vector-elser.md#save-space) section. +To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](/solutions/search/vector/sparse-vector-elser.md#save-space) section. ### Create an ingest pipeline with an inference processor [inference-ingest-pipeline] @@ -160,7 +160,7 @@ GET my-index/_search } ``` -The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](../vector/sparse-vector-elser.md#save-space) to learn more. 
+The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](/solutions/search/vector/sparse-vector-elser.md#save-space) to learn more. ```console-result "hits": { diff --git a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index 63b7c5277d..409a887556 100644 --- a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -25,7 +25,7 @@ This guide shows you how to implement semantic search with models deployed in {{ While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging. Selecting an appropriate model from our third-party model list is the first step. Training the model on your own data is essential to ensure better search results than using only BM25. However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming. -To address this issue, Elastic provides a pre-trained representational model called [Elastic Learned Sparse EncodeR (ELSER)](../explore-analyze/machine-learning/nlp/ml-nlp-elser.md). ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning. This adaptability makes it suitable for various NLP use cases out of the box. Unless you have a team of ML specialists, it is highly recommended to use the ELSER model. +To address this issue, Elastic provides a pre-trained representational model called [Elastic Learned Sparse EncodeR (ELSER)](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md). ELSER, currently available only for English, is an out-of-domain sparse vector model that does not require fine-tuning. This adaptability makes it suitable for various NLP use cases out of the box. Unless you have a team of ML specialists, it is highly recommended to use the ELSER model. In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values. This representation is commonly used for textual data. In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors. Each non-zero element of the vector corresponds to a term in the model vocabulary. The ELSER vocabulary contains around 30000 terms, so the sparse vectors created by ELSER contain about 30000 values, the majority of which are zero. Effectively the ELSER model is replacing the terms in the original query with other terms that have been learnt to exist in the documents that best match the original search terms in a training dataset, and weights to control how important each is. @@ -37,7 +37,7 @@ After you decide which model you want to use for implementing semantic search, y :::::::{tab-set} ::::::{tab-item} ELSER -To deploy ELSER, refer to [Download and deploy ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#download-deploy-elser). 
+To deploy ELSER, refer to [Download and deploy ELSER](../..//explore-analyze/machine-learning/nlp/ml-nlp-elser.md#download-deploy-elser). :::::: ::::::{tab-item} Dense vector models @@ -142,7 +142,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results. -To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). +To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/vector/sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../..//explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). :::::: ::::::{tab-item} Dense vector models diff --git a/solutions/search/vector/sparse-vector.md b/solutions/search/vector/sparse-vector.md index 72f750e05d..ea02c4af92 100644 --- a/solutions/search/vector/sparse-vector.md +++ b/solutions/search/vector/sparse-vector.md @@ -19,4 +19,4 @@ Sparse vector search with ELSER expands both documents and queries into weighted - Deploy and configure the ELSER model - Use the `sparse_vector` field type - See [this overview](../semantic-search.md#using-nlp-models) for implementation options -2. Query the index using the [`sparse_vector` search](sparse-vector-elser.md#querying) \ No newline at end of file +2. Query the index using the [`sparse_vector` search](/solutions/search/vector/sparse-vector-elser.md#querying) \ No newline at end of file From 640dd0986563d4fd20982abdf2d62ce36de7e591 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 7 Feb 2025 18:17:52 +0100 Subject: [PATCH 21/30] idem --- .../vector/dense-versus-sparse-ingest-pipelines.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index 409a887556..9d121d2716 100644 --- a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -21,7 +21,7 @@ This guide shows you how to implement semantic search with models deployed in {{ ## Select an NLP model [deployed-select-nlp-model] -{{es}} offers the usage of a [wide range of NLP models](../../../explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), including both dense and sparse vector models. Your choice of the language model is critical for implementing semantic search successfully. +{{es}} offers the usage of a [wide range of NLP models](/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), including both dense and sparse vector models. 
Your choice of the language model is critical for implementing semantic search successfully. While it is possible to bring your own text embedding model, achieving good search results through model tuning is challenging. Selecting an appropriate model from our third-party model list is the first step. Training the model on your own data is essential to ensure better search results than using only BM25. However, the model training process requires a team of data scientists and ML experts, making it expensive and time-consuming. @@ -37,11 +37,11 @@ After you decide which model you want to use for implementing semantic search, y :::::::{tab-set} ::::::{tab-item} ELSER -To deploy ELSER, refer to [Download and deploy ELSER](../..//explore-analyze/machine-learning/nlp/ml-nlp-elser.md#download-deploy-elser). +To deploy ELSER, refer to [Download and deploy ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#download-deploy-elser). :::::: ::::::{tab-item} Dense vector models -To deploy a third-party text embedding model, refer to [Deploy a text embedding model](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-te-vs-deploy). +To deploy a third-party text embedding model, refer to [Deploy a text embedding model](/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-te-vs-deploy). :::::: ::::::: @@ -82,7 +82,7 @@ PUT my-index ::::::{tab-item} Dense vector models The models compatible with {{es}} NLP generate dense vectors as output. The [`dense_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html) field type is suitable for storing dense vectors of numeric values. The index must have a field with the `dense_vector` field type to index the embeddings that the supported third-party model that you selected generates. Keep in mind that the model produces embeddings with a certain number of dimensions. The `dense_vector` field must be configured with the same number of dimensions using the `dims` option. Refer to the respective model documentation to get information about the number of dimensions of the embeddings. -To review a mapping of an index for an NLP model, refer to the mapping code snippet in the [Add the text embedding model to an ingest inference pipeline](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-text-emb-ingest) section of the tutorial. The example shows how to create an index mapping that defines the `my_embeddings.predicted_value` field - which will contain the model output - as a `dense_vector` field. +To review a mapping of an index for an NLP model, refer to the mapping code snippet in the [Add the text embedding model to an ingest inference pipeline](/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-text-emb-ingest) section of the tutorial. The example shows how to create an index mapping that defines the `my_embeddings.predicted_value` field - which will contain the model output - as a `dense_vector` field. ```console PUT my-index @@ -142,7 +142,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results. -To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/vector/sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. 
After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../..//explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). +To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/vector/sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). :::::: ::::::{tab-item} Dense vector models @@ -170,7 +170,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 2. The `field_map` object maps the input document field name (which is `my_text_field` in this example) to the name of the field that the model expects (which is always `text_field`). -To ingest data through the pipeline to generate text embeddings with your chosen model, refer to the [Add the text embedding model to an inference ingest pipeline](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-text-emb-ingest) section. The example shows how to create the pipeline with the inference processor and reindex your data through the pipeline. After you successfully ingested documents by using the pipeline, your index will contain the text embeddings generated by the model. +To ingest data through the pipeline to generate text embeddings with your chosen model, refer to the [Add the text embedding model to an inference ingest pipeline](/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-text-emb-ingest) section. The example shows how to create the pipeline with the inference processor and reindex your data through the pipeline. After you successfully ingested documents by using the pipeline, your index will contain the text embeddings generated by the model. :::::: ::::::: From 36345a2a8013c57b6def61cd06bdd023c8f2edf5 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 09:14:27 +0100 Subject: [PATCH 22/30] Fix links, add tips, cleanups --- explore-analyze/machine-learning/nlp/ml-nlp-elser.md | 6 +++--- solutions/search/applications/search-application-api.md | 2 +- solutions/search/semantic-search.md | 2 +- solutions/search/semantic-search/semantic-search-elser.md | 5 ++--- solutions/search/vector.md | 6 +++--- solutions/search/vector/dense-vector.md | 4 ++++ .../search/vector/dense-versus-sparse-ingest-pipelines.md | 6 +++--- solutions/search/vector/sparse-vector.md | 8 ++++++-- 8 files changed, 23 insertions(+), 16 deletions(-) diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index d217f36627..3a7a1bb68e 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -6,7 +6,7 @@ mapped_pages: # ELSER [ml-nlp-elser] -Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](/solutions/search/vector/sparse-vector-elser.md) to retrieve more relevant search results. 
This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. +Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](/solutions/search/vector/semantic-search-elser.md) to retrieve more relevant search results. This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. ELSER is an out-of-domain model which means it does not require fine-tuning on your own data, making it adaptable for various use cases out of the box. @@ -42,7 +42,7 @@ If you want to learn more about the ELSER V2 improvements, refer to [this blog p ### Upgrading to ELSER v2 [upgrade-elser-v2] -ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/vector/sparse-vector-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. +ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/vector/semantic-search-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. Additionally, the `elasticearch-labs` GitHub repository contains an interactive [Python notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/model-upgrades/upgrading-index-to-use-elser.ipynb) that walks through upgrading an index to ELSER V2. @@ -298,7 +298,7 @@ To gain the biggest value out of ELSER trained models, consider to follow this l ## Further reading [further-readings] * [Perform semantic search with `semantic_text` using the ELSER endpoint](../../../solutions/search/semantic-search/semantic-search-semantic-text.md) -* [Perform semantic search with ELSER](/solutions/search/vector/sparse-vector-elser.md) +* [Perform semantic search with ELSER](/solutions/search/vector/semantic-search-elser.md) ## Benchmark information [elser-benchmarks] diff --git a/solutions/search/applications/search-application-api.md b/solutions/search/applications/search-application-api.md index d4068dc73f..d086fd9ea9 100644 --- a/solutions/search/applications/search-application-api.md +++ b/solutions/search/applications/search-application-api.md @@ -367,7 +367,7 @@ POST _application/search_application/my-search-app/_search ### Text search + ELSER [search-application-api-catchall-template] -The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](/solutions/search/vector/sparse-vector-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. +The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. 
Refer to [Semantic search with ELSER](/solutions/search/vector/semantic-search-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields.
 
 This example provides a single template that you can use for various search application scenarios: text search, ELSER, or all of the above. It also provides a simple default `query_string` query if no parameters are specified.
 
diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md
index 983cfb0480..ece8262a30 100644
--- a/solutions/search/semantic-search.md
+++ b/solutions/search/semantic-search.md
@@ -7,7 +7,7 @@ mapped_urls:
 # Semantic search [semantic-search]
 
 :::{note}
-This page focuses on the semantic search workflows available in {{es}}. For detailed information about vector search implementations, refer to [vector search](vector.md).
+This page focuses on the semantic search workflows available in {{es}}. For detailed information about lower-level vector search implementations, refer to [vector search](vector.md).
 :::
 
 {{es}} provides various semantic search capabilities using [natural language processing (NLP)](/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md).
diff --git a/solutions/search/semantic-search/semantic-search-elser.md b/solutions/search/semantic-search/semantic-search-elser.md
index 4a95ce0753..fdbb42ebf6 100644
--- a/solutions/search/semantic-search/semantic-search-elser.md
+++ b/solutions/search/semantic-search/semantic-search-elser.md
@@ -63,7 +63,7 @@ PUT my-index
 4. The field type which is text in this example.
 
 
-To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](/solutions/search/vector/sparse-vector-elser.md#save-space) section.
+To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](#save-space.
 
 
 ### Create an ingest pipeline with an inference processor [inference-ingest-pipeline]
 
@@ -160,7 +160,7 @@ GET my-index/_search
 }
 ```
 
-The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](/solutions/search/vector/sparse-vector-elser.md#save-space) to learn more.
+The result is the top 10 documents from the `my-index` index that are closest in meaning to your query text, sorted by relevance. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance; they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from the document source; refer to [this section](#save-space) to learn more.
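+As an aside before the sample response shown below, the space optimization mentioned above can be sketched as a mapping that keeps the ELSER output searchable while excluding it from `_source`. The `content` and `content_embedding` field names here are hypothetical placeholders, not names defined elsewhere in this tutorial:
+
+```console
+PUT my-index-without-stored-tokens
+{
+  "mappings": {
+    "_source": {
+      "excludes": [
+        "content_embedding" <1>
+      ]
+    },
+    "properties": {
+      "content_embedding": {
+        "type": "sparse_vector"
+      },
+      "content": {
+        "type": "text"
+      }
+    }
+  }
+}
+```
+
+1. Hypothetical field that receives the ELSER token-weight pairs. Excluded fields remain indexed and searchable, but they are omitted from the stored `_source`, so they cannot be displayed or reindexed from source later.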
```console-result "hits": { @@ -200,7 +200,6 @@ The result is the top 10 documents that are closest in meaning to your query tex } ``` - ### Combining semantic search with other queries [text-expansion-compound-query] You can combine [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-sparse-vector-query.html) with other queries in a [compound query](https://www.elastic.co/guide/en/elasticsearch/reference/current/compound-queries.html). For example, use a filter clause in a [Boolean](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html) or a full text query with the same (or different) query text as the `sparse_vector` query. This enables you to combine the search results from both queries. diff --git a/solutions/search/vector.md b/solutions/search/vector.md index 3580a8ae0d..1c802e182a 100644 --- a/solutions/search/vector.md +++ b/solutions/search/vector.md @@ -18,8 +18,8 @@ Here's a quick reference overview of vector search field types and queries avail | Vector type | Field type | Query type | Primary use case | | ----------- | --------------- | --------------- | -------------------------------------------------- | -| Dense | `dense_vector` | `knn` | Semantic similarity via neural embeddings | -| Sparse | `sparse_vector` | `sparse_vector` | Semantic term expansion with (ELSER) | +| Dense | `dense_vector` | `knn` | Semantic similarity using your chosen embeddings model | +| Sparse | `sparse_vector` | `sparse_vector` | Semantic term expansion with the ELSER model | | Sparse or dense | `semantic_text` | `semantic` | Managed semantic search that is agnostic to implementation details | ## Dense vector search @@ -34,7 +34,7 @@ Dense neural embeddings capture semantic meaning by translating content into fix ## Sparse vector search -Sparse vectors use ELSER to expand content with semantically related terms. This approach preserves explainability while adding semantic understanding, making it well-suited for: +The sparse vector approach uses the ELSER model to expand content with semantically related terms. This approach preserves explainability while adding semantic understanding, making it well-suited for: - Enhanced keyword search - Cases requiring explainable results - Domain-specific search diff --git a/solutions/search/vector/dense-vector.md b/solutions/search/vector/dense-vector.md index bd1776856c..4588000400 100644 --- a/solutions/search/vector/dense-vector.md +++ b/solutions/search/vector/dense-vector.md @@ -9,6 +9,10 @@ Dense neural embeddings capture semantic meaning by translating content into fix ## Working with dense vectors in Elasticsearch +:::{tip} +Using the `semantic_text` field type provides automatic model management and sensible defaults. [Learn more](semantic-search/semantic-search-semantic-text.md). +::: + Dense vector search requires both index configuration and a strategy for generating embeddings. To use dense vectors in {{es}}: 1. 
Index documents with embeddings diff --git a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index 9d121d2716..a776c61fb4 100644 --- a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -1,12 +1,12 @@ --- -navigation_title: "Tutorial: Dense and sparse workflows with ingest pipelines" +navigation_title: "Tutorial: Dense and sparse workflows" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-deployed-nlp-model.html --- -# Semantic search with deployed model (dense and sparse tabs) [semantic-search-deployed-nlp-model] +# Tutorial: Dense and sparse workflows using ingest pipelines [semantic-search-deployed-nlp-model] ::::{important} @@ -142,7 +142,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results. -To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/vector/sparse-vector-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). +To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/vector/semantic-search-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). :::::: ::::::{tab-item} Dense vector models diff --git a/solutions/search/vector/sparse-vector.md b/solutions/search/vector/sparse-vector.md index ea02c4af92..9b52a7fac4 100644 --- a/solutions/search/vector/sparse-vector.md +++ b/solutions/search/vector/sparse-vector.md @@ -1,6 +1,6 @@ # Sparse vector [sparse-vector-foo] -When working with sparse vectors in {{ess}}, you'll be using the Elastic learned sparse encoder (ELSER) model at index and query time to expand content with semantically related terms. +When working with sparse vectors in {{es}}, you'll be using the Elastic learned sparse encoder (ELSER) model at index and query time to expand content with semantically related terms. This approach preserves explainability while adding semantic understanding, with each document or query expanded into a set of weighted terms. @@ -13,10 +13,14 @@ Sparse vector search with ELSER is ideal for: ## Working with sparse vectors in Elasticsearch +:::{tip} +Using the `semantic_text` field type provides automatic model management and sensible defaults. [Learn more](semantic-search/semantic-search-semantic-text.md). +::: + Sparse vector search with ELSER expands both documents and queries into weighted terms. To use sparse vectors in {{es}}: 1. 
Index documents with ELSER - Deploy and configure the ELSER model - Use the `sparse_vector` field type - See [this overview](../semantic-search.md#using-nlp-models) for implementation options -2. Query the index using the [`sparse_vector` search](/solutions/search/vector/sparse-vector-elser.md#querying) \ No newline at end of file +2. Query the index using [`sparse_vector` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-sparse-vector-query.html). \ No newline at end of file From 72343b44c189bcd6a1fde9d876be7e3c86902789 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 09:28:07 +0100 Subject: [PATCH 23/30] links --- .../self-managed/install-elasticsearch-with-docker.md | 2 +- explore-analyze/machine-learning/nlp/ml-nlp-elser.md | 6 +++--- solutions/search/applications/search-application-api.md | 2 +- solutions/search/vector/dense-vector.md | 2 +- .../search/vector/dense-versus-sparse-ingest-pipelines.md | 2 +- solutions/search/vector/sparse-vector.md | 2 +- 6 files changed, 8 insertions(+), 8 deletions(-) diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md index f6f042d6dd..18e88d7347 100644 --- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md +++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md @@ -90,7 +90,7 @@ docker pull docker.elastic.co/elasticsearch/elasticsearch-wolfi:9.0.0-beta1 :::: - {{ml-cap}} features such as [semantic search with ELSER](/solutions/search/vector/semantic-search-elser.md) require a larger container with more than 1GB of memory. If you intend to use the {{ml}} capabilities, then start the container with this command: + {{ml-cap}} features such as [semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser.md) require a larger container with more than 1GB of memory. If you intend to use the {{ml}} capabilities, then start the container with this command: ```sh docker run --name es01 --net elastic -p 9200:9200 -it -m 6GB -e "xpack.ml.use_auto_machine_memory_percent=true" docker.elastic.co/elasticsearch/elasticsearch:9.0.0-beta1 diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index 3a7a1bb68e..cff6b093cc 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -6,7 +6,7 @@ mapped_pages: # ELSER [ml-nlp-elser] -Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](/solutions/search/vector/semantic-search-elser.md) to retrieve more relevant search results. This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. +Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](/solutions/search/semantic-search.md) to retrieve more relevant search results. This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. ELSER is an out-of-domain model which means it does not require fine-tuning on your own data, making it adaptable for various use cases out of the box. 
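+For illustration, querying an ELSER-indexed field looks roughly like the following sketch. The index name, the `sparse_vector` field name, and the {{infer}} endpoint ID are hypothetical placeholders:
+
+```console
+GET my-index/_search
+{
+  "query": {
+    "sparse_vector": {
+      "field": "content_embedding",
+      "inference_id": "my-elser-endpoint",
+      "query": "How do I improve search relevance?"
+    }
+  }
+}
+```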
@@ -42,7 +42,7 @@ If you want to learn more about the ELSER V2 improvements, refer to [this blog p ### Upgrading to ELSER v2 [upgrade-elser-v2] -ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/vector/semantic-search-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. +ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/semantic-search/semantic-search-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. Additionally, the `elasticearch-labs` GitHub repository contains an interactive [Python notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/model-upgrades/upgrading-index-to-use-elser.ipynb) that walks through upgrading an index to ELSER V2. @@ -298,7 +298,7 @@ To gain the biggest value out of ELSER trained models, consider to follow this l ## Further reading [further-readings] * [Perform semantic search with `semantic_text` using the ELSER endpoint](../../../solutions/search/semantic-search/semantic-search-semantic-text.md) -* [Perform semantic search with ELSER](/solutions/search/vector/semantic-search-elser.md) +* [Perform semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser.md) ## Benchmark information [elser-benchmarks] diff --git a/solutions/search/applications/search-application-api.md b/solutions/search/applications/search-application-api.md index d086fd9ea9..320f5a0e81 100644 --- a/solutions/search/applications/search-application-api.md +++ b/solutions/search/applications/search-application-api.md @@ -367,7 +367,7 @@ POST _application/search_application/my-search-app/_search ### Text search + ELSER [search-application-api-catchall-template] -The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](/solutions/search/vector/semantic-search-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. +The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. This example provides a single template that you can use for various search application scenarios: text search, ELSER, or all of the above. It also provides a simple default `query_string` query if no parameters are specified. 
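+As a point of reference, outside of search application templates the same combination can be expressed directly as a compound query. The following is a simplified sketch, with hypothetical field and endpoint names, that pairs a lexical `match` clause with an ELSER `sparse_vector` clause in one `bool` query:
+
+```console
+GET my-index/_search
+{
+  "query": {
+    "bool": {
+      "should": [
+        {
+          "match": {
+            "title": "automatic text chunking"
+          }
+        },
+        {
+          "sparse_vector": {
+            "field": "title_embedding",
+            "inference_id": "my-elser-endpoint",
+            "query": "automatic text chunking"
+          }
+        }
+      ]
+    }
+  }
+}
+```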
diff --git a/solutions/search/vector/dense-vector.md b/solutions/search/vector/dense-vector.md index 4588000400..4f333bf189 100644 --- a/solutions/search/vector/dense-vector.md +++ b/solutions/search/vector/dense-vector.md @@ -10,7 +10,7 @@ Dense neural embeddings capture semantic meaning by translating content into fix ## Working with dense vectors in Elasticsearch :::{tip} -Using the `semantic_text` field type provides automatic model management and sensible defaults. [Learn more](semantic-search/semantic-search-semantic-text.md). +Using the `semantic_text` field type provides automatic model management and sensible defaults. [Learn more](../semantic-search/semantic-search-semantic-text.md). ::: Dense vector search requires both index configuration and a strategy for generating embeddings. To use dense vectors in {{es}}: diff --git a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index a776c61fb4..250ff8b417 100644 --- a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -142,7 +142,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results. -To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/vector/semantic-search-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). +To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/semantic-search/semantic-search-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). :::::: ::::::{tab-item} Dense vector models diff --git a/solutions/search/vector/sparse-vector.md b/solutions/search/vector/sparse-vector.md index 9b52a7fac4..7b121a123d 100644 --- a/solutions/search/vector/sparse-vector.md +++ b/solutions/search/vector/sparse-vector.md @@ -14,7 +14,7 @@ Sparse vector search with ELSER is ideal for: ## Working with sparse vectors in Elasticsearch :::{tip} -Using the `semantic_text` field type provides automatic model management and sensible defaults. [Learn more](semantic-search/semantic-search-semantic-text.md). +Using the `semantic_text` field type provides automatic model management and sensible defaults. [Learn more](../semantic-search/semantic-search-semantic-text.md). ::: Sparse vector search with ELSER expands both documents and queries into weighted terms. 
To use sparse vectors in {{es}}: From 9f8e3d5a3e221a058c2365b1d674cd1841ac7e72 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 09:43:19 +0100 Subject: [PATCH 24/30] rename file for clarity --- .../self-managed/install-elasticsearch-with-docker.md | 2 +- explore-analyze/machine-learning/nlp/ml-nlp-elser.md | 4 ++-- solutions/search/applications/search-application-api.md | 2 +- ...elser.md => semantic-search-elser-ingest-pipelines.md} | 7 +------ .../search/vector/dense-versus-sparse-ingest-pipelines.md | 8 +++----- solutions/toc.yml | 2 +- 6 files changed, 9 insertions(+), 16 deletions(-) rename solutions/search/semantic-search/{semantic-search-elser.md => semantic-search-elser-ingest-pipelines.md} (96%) diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md index 18e88d7347..896579f323 100644 --- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md +++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md @@ -90,7 +90,7 @@ docker pull docker.elastic.co/elasticsearch/elasticsearch-wolfi:9.0.0-beta1 :::: - {{ml-cap}} features such as [semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser.md) require a larger container with more than 1GB of memory. If you intend to use the {{ml}} capabilities, then start the container with this command: + {{ml-cap}} features such as [semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md) require a larger container with more than 1GB of memory. If you intend to use the {{ml}} capabilities, then start the container with this command: ```sh docker run --name es01 --net elastic -p 9200:9200 -it -m 6GB -e "xpack.ml.use_auto_machine_memory_percent=true" docker.elastic.co/elasticsearch/elasticsearch:9.0.0-beta1 diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index cff6b093cc..e15532e782 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -42,7 +42,7 @@ If you want to learn more about the ELSER V2 improvements, refer to [this blog p ### Upgrading to ELSER v2 [upgrade-elser-v2] -ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/semantic-search/semantic-search-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. +ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. Additionally, the `elasticearch-labs` GitHub repository contains an interactive [Python notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/model-upgrades/upgrading-index-to-use-elser.ipynb) that walks through upgrading an index to ELSER V2. 
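+A minimal sketch of that reindexing step is shown below; the index and pipeline names are hypothetical, and the pipeline is assumed to contain an {{infer}} processor that references the ELSER v2 model:
+
+```console
+POST _reindex?wait_for_completion=false
+{
+  "source": {
+    "index": "my-index-elser-v1",
+    "size": 50 <1>
+  },
+  "dest": {
+    "index": "my-index-elser-v2",
+    "pipeline": "elser-v2-pipeline"
+  }
+}
+```
+
+1. A smaller batch size keeps each {{infer}} call manageable; adjust it to your model deployment and data.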
@@ -298,7 +298,7 @@ To gain the biggest value out of ELSER trained models, consider to follow this l ## Further reading [further-readings] * [Perform semantic search with `semantic_text` using the ELSER endpoint](../../../solutions/search/semantic-search/semantic-search-semantic-text.md) -* [Perform semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser.md) +* [Perform semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md) ## Benchmark information [elser-benchmarks] diff --git a/solutions/search/applications/search-application-api.md b/solutions/search/applications/search-application-api.md index 320f5a0e81..6304e54bc6 100644 --- a/solutions/search/applications/search-application-api.md +++ b/solutions/search/applications/search-application-api.md @@ -367,7 +367,7 @@ POST _application/search_application/my-search-app/_search ### Text search + ELSER [search-application-api-catchall-template] -The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. +The Elastic Learned Sparse EncodeR ([ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) improves search relevance through text-expansion, which enables semantic search. This experimental template requires ELSER to be enabled for one or more fields. Refer to [Semantic search with ELSER](/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md) for more information on how to use ELSER. In this case, ELSER is enabled on the `title` and `description` fields. This example provides a single template that you can use for various search application scenarios: text search, ELSER, or all of the above. It also provides a simple default `query_string` query if no parameters are specified. diff --git a/solutions/search/semantic-search/semantic-search-elser.md b/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md similarity index 96% rename from solutions/search/semantic-search/semantic-search-elser.md rename to solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md index fdbb42ebf6..ecdeed5b54 100644 --- a/solutions/search/semantic-search/semantic-search-elser.md +++ b/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md @@ -63,12 +63,7 @@ PUT my-index 4. The field type which is text in this example. -To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](#save-space. - - -### Create an ingest pipeline with an inference processor [inference-ingest-pipeline] - -Create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) to use ELSER to infer against the data that is being ingested in the pipeline. 
+To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) to use ELSER to infer against the data that is being ingested in the pipeline. ```console PUT _ingest/pipeline/elser-v2-test diff --git a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index 250ff8b417..1afa69c36f 100644 --- a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -1,11 +1,9 @@ --- -navigation_title: "Tutorial: Dense and sparse workflows" +navigation_title: "Tutorial: Manual dense and sparse workflows" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-deployed-nlp-model.html --- - - # Tutorial: Dense and sparse workflows using ingest pipelines [semantic-search-deployed-nlp-model] @@ -55,7 +53,7 @@ Before you start using the deployed model to generate embeddings based on your i ::::::{tab-item} ELSER ELSER produces token-weight pairs as output from the input text and the query. The {{es}} [`sparse_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) field type can store these token-weight pairs as numeric feature vectors. The index must have a field with the `sparse_vector` field type to index the tokens that ELSER generates. -To create a mapping for your ELSER index, refer to the [Create the index mapping section](../semantic-search/semantic-search-elser.md#elser-mappings) of the tutorial. The example shows how to create an index mapping for `my-index` that defines the `my_embeddings.tokens` field - which will contain the ELSER output - as a `sparse_vector` field. +To create a mapping for your ELSER index, refer to the [Create the index mapping section](../semantic-search/semantic-search-elser-ingest-pipelines.md#elser-mappings) of the tutorial. The example shows how to create an index mapping for `my-index` that defines the `my_embeddings.tokens` field - which will contain the ELSER output - as a `sparse_vector` field. ```console PUT my-index @@ -142,7 +140,7 @@ PUT _ingest/pipeline/my-text-embeddings-pipeline 1. Configuration object that defines the `input_field` for the {{infer}} process and the `output_field` that will contain the {{infer}} results. -To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/semantic-search/semantic-search-elser.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). +To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md#reindexing-data-elser) section of the tutorial. After you successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. 
Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens).
+To ingest data through the pipeline to generate tokens with ELSER, refer to the [Ingest the data through the {{infer}} ingest pipeline](/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md#reindexing-data-elser) section of the tutorial. After you have successfully ingested documents by using the pipeline, your index will contain the tokens generated by ELSER. Tokens are learned associations capturing relevance; they are not synonyms. To learn more about what tokens are, refer to [this page](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens).
 ::::::
 
 ::::::{tab-item} Dense vector models
diff --git a/solutions/toc.yml b/solutions/toc.yml
index ca4bc38295..e1f2e5fc6f 100644
--- a/solutions/toc.yml
+++ b/solutions/toc.yml
@@ -644,7 +644,7 @@ toc:
       children:
         - file: search/semantic-search/semantic-search-semantic-text.md
         - file: search/semantic-search/semantic-search-inference.md
-        - file: search/semantic-search/semantic-search-elser.md
+        - file: search/semantic-search/semantic-search-elser-ingest-pipelines.md
         - file: search/semantic-search/cohere-es.md
 - file: search/hybrid-search.md
   children:

From 629485f1a0fbe195fda5ded9cceaaaf42f19de0d Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Mon, 10 Feb 2025 09:48:00 +0100
Subject: [PATCH 25/30] link

---
 .../semantic-search/semantic-search-elser-ingest-pipelines.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md b/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md
index ecdeed5b54..cc83ad1535 100644
--- a/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md
+++ b/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md
@@ -63,7 +63,11 @@ PUT my-index
 4. The field type which is text in this example.
 
 
-To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) to use ELSER to infer against the data that is being ingested in the pipeline.
+To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](#save-space) section.
+
+### Create an ingest pipeline with an inference processor [inference-ingest-pipeline]
+
+Create an [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) to use ELSER to infer against the data that is being ingested in the pipeline.

From 8686b287ac0298ca711d75191242d43d04434f4f Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Mon, 10 Feb 2025 10:01:02 +0100
Subject: [PATCH 26/30] fix image path?
--- solutions/search/semantic-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solutions/search/semantic-search.md b/solutions/search/semantic-search.md index ece8262a30..61ad2dd841 100644 --- a/solutions/search/semantic-search.md +++ b/solutions/search/semantic-search.md @@ -24,7 +24,7 @@ You have several options for using NLP models for semantic search in the {{stack This diagram summarizes the relative complexity of each workflow: -:::{image} /images/elasticsearch-reference-semantic-options.svg +:::{image} ../../images/elasticsearch-reference-semantic-options.svg :alt: Overview of semantic search workflows in {{es}} ::: From 4352c2b3b79118cfda774ad9dab9766044ea92fc Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 15:03:14 +0100 Subject: [PATCH 27/30] =?UTF-8?q?=F0=9F=9A=98=20playground=20consolidation?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../serverless/elasticsearch-playground.md | 10 - .../kibana/kibana/playground.md | 263 ----------------- raw-migrated-files/toc.yml | 2 - solutions/search/rag/playground.md | 265 +++++++++++++++++- 4 files changed, 257 insertions(+), 283 deletions(-) delete mode 100644 raw-migrated-files/docs-content/serverless/elasticsearch-playground.md delete mode 100644 raw-migrated-files/kibana/kibana/playground.md diff --git a/raw-migrated-files/docs-content/serverless/elasticsearch-playground.md b/raw-migrated-files/docs-content/serverless/elasticsearch-playground.md deleted file mode 100644 index 5ae1dd12a5..0000000000 --- a/raw-migrated-files/docs-content/serverless/elasticsearch-playground.md +++ /dev/null @@ -1,10 +0,0 @@ -# Playground [elasticsearch-playground] - -Use the Search Playground to test and edit {{es}} queries visually in the UI. Then use the Chat Playground to combine your {{es}} data with large language models (LLMs) for retrieval augmented generation (RAG). You can also view the underlying Python code that powers the chat interface, and use it in your own application. - -Find Playground in the {{es-serverless}} UI under **{{es}} > Build > Playground**. - -::::{note} -ℹ️ The Playground documentation currently lives in the [{{kib}} docs](../../../solutions/search/rag/playground.md). - -:::: diff --git a/raw-migrated-files/kibana/kibana/playground.md b/raw-migrated-files/kibana/kibana/playground.md deleted file mode 100644 index 68a7bc1c3c..0000000000 --- a/raw-migrated-files/kibana/kibana/playground.md +++ /dev/null @@ -1,263 +0,0 @@ -# Playground [playground] - -::::{warning} -This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. -:::: - - -Use Playground to combine your Elasticsearch data with the power of large language models (LLMs) for retrieval augmented generation (RAG). The chat interface translates your natural language questions into {{es}} queries, retrieves the most relevant results from your {{es}} documents, and passes those documents to the LLM to generate tailored responses. - -Once you start chatting, use the UI to view and modify the Elasticsearch queries that search your data. You can also view the underlying Python code that powers the chat interface, and download this code to integrate into your own application. - -Learn how to get started on this page. 
Refer to the following for more advanced topics: - -* [Optimize model context](../../../solutions/search/rag/playground-context.md) -* [View and modify queries](../../../solutions/search/rag/playground-query.md) -* [Troubleshooting](../../../solutions/search/rag/playground-troubleshooting.md) - -::::{admonition} 🍿 Getting started videos -Watch these video tutorials to help you get started: - -* [Getting Started](https://www.youtube.com/watch?v=zTHgJ3rhe10) -* [Using Playground with local LLMs](https://www.youtube.com/watch?v=ZtxoASFvkno) - -:::: - - - -## How Playground works [playground-how-it-works] - -Here’s a simpified overview of how Playground works: - -* User **creates a connection** to LLM provider -* User **selects a model** to use for generating responses -* User **define the model’s behavior and tone** with initial instructions - - * **Example**: "*You are a friendly assistant for question-answering tasks. Keep responses as clear and concise as possible.*" - -* User **selects {{es}} indices** to search -* User **enters a question** in the chat interface -* Playground **autogenerates an {{es}} query** to retrieve relevant documents - - * User can **view and modify underlying {{es}} query** in the UI - -* Playground **auto-selects relevant fields** from retrieved documents to pass to the LLM - - * User can **edit fields targeted** - -* Playground passes **filtered documents** to the LLM - - * The LLM generates a response based on the original query, initial instructions, chat history, and {{es}} context - -* User can **view the Python code** that powers the chat interface - - * User can also **Download the code** to integrate into application - - - -## Availability and prerequisites [playground-availability-prerequisites] - -For Elastic Cloud and self-managed deployments Playground is available in the **Search** space in {{kib}}, under **Content** > **Playground**. - -For Elastic Serverless, Playground is available in your {{es}} project UI. - -To use Playground, you’ll need the following: - -1. An Elastic **v8.14.0+** deployment or {{es}} **Serverless** project. (Start a [free trial](https://cloud.elastic.co/registration)). -2. At least one **{{es}} index** with documents to search. - - * See [ingest data](../../../solutions/search/rag/playground.md#playground-getting-started-ingest) if you’d like to ingest sample data. - -3. An account with a **supported LLM provider**. Playground supports the following: - - * **Amazon Bedrock** - - * Anthropic: Claude 3.5 Sonnet - * Anthropic: Claude 3 Haiku - - * **OpenAI** - - * GPT-3 turbo - * GPT-4 turbo - * GPT-4 omni - - * **Azure OpenAI** (note: Buffers responses in large chunks) - - * GPT-3 turbo - * GPT-4 turbo - - * **Google** - - * Google Gemini 1.5 Pro - * Google Gemini 1.5 Flash - - -::::{tip} -:name: playground-local-llms - -You can also use locally hosted LLMs that are compatible with the OpenAI SDK. Once you’ve set up your LLM, you can connect to it using the OpenAI connector. 
Refer to the following for examples: - -* [Using LM Studio](../../../solutions/security/ai/connect-to-own-local-llm.md) -* [LocalAI with `docker-compose`](https://www.elastic.co/search-labs/blog/localai-for-text-embeddings) - -:::: - - - -## Getting started [playground-getting-started] - -:::{image} ../../../images/kibana-get-started.png -:alt: get started -:class: screenshot -::: - - -### Connect to LLM provider [playground-getting-started-connect] - -To get started with Playground, you need to create a [connector](../../../deploy-manage/manage-connectors.md) for your LLM provider. You can also connect to [locally hosted LLMs](../../../solutions/search/rag/playground.md#playground-local-llms) which are compatible with the OpenAI API, by using the OpenAI connector. - -To connect to an LLM provider, follow these steps on the Playground landing page: - -1. Under **Connect to an LLM**, click **Create connector**. -2. Select your **LLM provider**. -3. **Name** your connector. -4. Select a **URL endpoint** (or use the default). -5. Enter **access credentials** for your LLM provider. (If you’re running a locally hosted LLM using the OpenAI connector, you must input a value in the API key form, but the specific value doesn’t matter.) - -::::{tip} -If you need to update a connector, or add a new one, click the 🔧 **Manage** button beside **Model settings**. - -:::: - - - -### Ingest data (optional) [playground-getting-started-ingest] - -*You can skip this step if you already have data in one or more {{es}} indices.* - -There are many options for ingesting data into {{es}}, including: - -* The [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) for web content (**NOTE**: Not yet available in *Serverless*) -* [Elastic connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html) for data synced from third-party sources -* The {{es}} [Bulk API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html) for JSON documents - - ::::{dropdown} **Expand** for example - To add a few documents to an index called `books` run the following in Dev Tools Console: - - ```console - POST /_bulk - { "index" : { "_index" : "books" } } - {"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470} - { "index" : { "_index" : "books" } } - {"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585} - { "index" : { "_index" : "books" } } - {"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328} - { "index" : { "_index" : "books" } } - {"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227} - { "index" : { "_index" : "books" } } - {"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268} - { "index" : { "_index" : "books" } } - {"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311} - ``` - - :::: - - -We’ve also provided some Jupyter notebooks to easily ingest sample data into {{es}}. Find these in the [elasticsearch-labs](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/ingestion-and-chunking) repository. These notebooks use the official {{es}} Python client. - - -### Select {{es}} indices [playground-getting-started-index] - -Once you’ve connected to your LLM provider, it’s time to choose the data you want to search. - -1. 
Click **Add data sources**. -2. Select one or more {{es}} indices. -3. Click **Save and continue** to launch the chat interface. - -:::::{tip} -You can always add or remove indices later by selecting the **Data** button from the main Playground UI. - -:::{image} ../../../images/kibana-data-button.png -:alt: data button -:class: screenshot -::: - -::::: - - - -### Chat and query modes [playground-getting-started-chat-query-modes] - -Since 8.15.0 (and earlier for {{es}} Serverless), the main Playground UI has two modes: - -* **Chat mode**: The default mode, where you can chat with your data via the LLM. -* **Query mode**: View and modify the {{es}} query generated by the chat interface. - -The **chat mode** is selected when you first set up your Playground instance. - -:::{image} ../../../images/kibana-chat-interface.png -:alt: chat interface -:class: screenshot -::: - -To switch to **query mode**, select **Query** from the main UI. - -:::{image} ../../../images/kibana-query-interface.png -:alt: query interface -:class: screenshot -::: - -::::{tip} -Learn more about the underlying {{es}} queries used to search your data in [View and modify queries](../../../solutions/search/rag/playground-query.md) - -:::: - - - -### Set up the chat interface [playground-getting-started-setup-chat] - -You can start chatting with your data immediately, but you might want to tweak some defaults first. - -You can adjust the following under **Model settings**: - -* **Model**. The model used for generating responses. -* **Instructions**. Also known as the *system prompt*, these initial instructions and guidelines define the behavior of the model throughout the conversation. Be **clear and specific** for best results. -* **Include citations**. A toggle to include citations from the relevant {{es}} documents in responses. - -Playground also uses another LLM under the hood, to encode all previous questions and responses, and make them available to the main model. This ensures the model has "conversational memory". - -Under **Indices**, you can edit which {{es}} indices will be searched. This will affect the underlying {{es}} query. - -::::{tip} -Click **✨ Regenerate** to resend the last query to the model for a fresh response. - -Click **⟳ Clear chat** to clear chat history and start a new conversation. - -:::: - - - -### View and download Python code [playground-getting-started-view-code] - -Use the **View code** button to see the Python code that powers the chat interface. You can integrate it into your own application, modifying as needed. 
We currently support two implementation options: - -* {{es}} Python Client + LLM provider -* LangChain + LLM provider - -:::{image} ../../../images/kibana-view-code-button.png -:alt: view code button -:class: screenshot -::: - - -### Next steps [playground-next-steps] - -Once you’ve got Playground up and running, and you’ve tested out the chat interface, you might want to explore some more advanced topics: - -* [Optimize model context](../../../solutions/search/rag/playground-context.md) -* [View and modify queries](../../../solutions/search/rag/playground-query.md) -* [Troubleshooting](../../../solutions/search/rag/playground-troubleshooting.md) - - - - diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index 1ae5646932..ad29c4a372 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -280,7 +280,6 @@ toc: - file: docs-content/serverless/elasticsearch-ingest-data-through-api.md - file: docs-content/serverless/elasticsearch-ingest-your-data.md - file: docs-content/serverless/elasticsearch-manage-project.md - - file: docs-content/serverless/elasticsearch-playground.md - file: docs-content/serverless/elasticsearch-search-your-data-the-search-api.md - file: docs-content/serverless/elasticsearch-search-your-data.md - file: docs-content/serverless/endpoint-protection-rules.md @@ -688,7 +687,6 @@ toc: - file: kibana/kibana/managing-tags.md - file: kibana/kibana/maps.md - file: kibana/kibana/osquery.md - - file: kibana/kibana/playground.md - file: kibana/kibana/reporting-getting-started.md - file: kibana/kibana/reporting-production-considerations.md - file: kibana/kibana/role-mappings.md diff --git a/solutions/search/rag/playground.md b/solutions/search/rag/playground.md index 871c292181..c2bf28b0bd 100644 --- a/solutions/search/rag/playground.md +++ b/solutions/search/rag/playground.md @@ -4,17 +4,266 @@ mapped_urls: - https://www.elastic.co/guide/en/kibana/current/playground.html --- -# Playground +# Playground [playground] -% What needs to be done: Lift-and-shift +::::{warning} +This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. +:::: -% Use migrated content from existing pages that map to this page: -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-playground.md -% - [ ] ./raw-migrated-files/kibana/kibana/playground.md +Use Playground to combine your Elasticsearch data with the power of large language models (LLMs) for retrieval augmented generation (RAG). The chat interface translates your natural language questions into {{es}} queries, retrieves the most relevant results from your {{es}} documents, and passes those documents to the LLM to generate tailored responses. + +Once you start chatting, use the UI to view and modify the Elasticsearch queries that search your data. You can also view the underlying Python code that powers the chat interface, and download this code to integrate into your own application. + +Learn how to get started on this page. 
+Refer to the following for more advanced topics:
+
+* [Optimize model context](playground-context.md)
+* [View and modify queries](playground-query.md)
+* [Troubleshooting](playground-troubleshooting.md)
+
+::::{admonition} 🍿 Getting started videos
+Watch these video tutorials to help you get started:
+
+* [Getting Started](https://www.youtube.com/watch?v=zTHgJ3rhe10)
+* [Using Playground with local LLMs](https://www.youtube.com/watch?v=ZtxoASFvkno)
+
+::::
+
+
+
+## How Playground works [playground-how-it-works]
+
+Here’s a simplified overview of how Playground works:
+
+* User **creates a connection** to an LLM provider
+* User **selects a model** to use for generating responses
+* User **defines the model’s behavior and tone** with initial instructions
+
+    * **Example**: "*You are a friendly assistant for question-answering tasks. Keep responses as clear and concise as possible.*"
+
+* User **selects {{es}} indices** to search
+* User **enters a question** in the chat interface
+* Playground **autogenerates an {{es}} query** to retrieve relevant documents
+
+    * User can **view and modify the underlying {{es}} query** in the UI
+
+* Playground **auto-selects relevant fields** from retrieved documents to pass to the LLM
+
+    * User can **edit which fields are targeted**
+
+* Playground passes **filtered documents** to the LLM
+
+    * The LLM generates a response based on the original query, initial instructions, chat history, and {{es}} context
+
+* User can **view the Python code** that powers the chat interface
+
+    * User can also **download the code** to integrate into their own application
+
+
+
+## Availability and prerequisites [playground-availability-prerequisites]
+
+For Elastic Cloud and self-managed deployments, Playground is available in the **Search** space in {{kib}}, under **Content** > **Playground**.
+
+For Elastic Serverless, Playground is available in your {{es}} project UI.
+
+To use Playground, you’ll need the following:
+
+1. An Elastic **v8.14.0+** deployment or {{es}} **Serverless** project. (Start a [free trial](https://cloud.elastic.co/registration)).
+2. At least one **{{es}} index** with documents to search.
+
+    * See [ingest data](playground.md#playground-getting-started-ingest) if you’d like to ingest sample data.
+
+3. An account with a **supported LLM provider**. Playground supports the following:
+
+    * **Amazon Bedrock**
+
+        * Anthropic: Claude 3.5 Sonnet
+        * Anthropic: Claude 3 Haiku
+
+    * **OpenAI**
+
+        * GPT-3.5 turbo
+        * GPT-4 turbo
+        * GPT-4 omni
+
+    * **Azure OpenAI** (note: buffers responses in large chunks)
+
+        * GPT-3.5 turbo
+        * GPT-4 turbo
+
+    * **Google**
+
+        * Google Gemini 1.5 Pro
+        * Google Gemini 1.5 Flash
+
+
+::::{tip}
+:name: playground-local-llms
+
+You can also use locally hosted LLMs that are compatible with the OpenAI SDK. Once you’ve set up your LLM, you can connect to it using the OpenAI connector. Refer to the following for examples:
+
+* [Using LM Studio](../../security/ai/connect-to-own-local-llm.md)
+* [LocalAI with `docker-compose`](https://www.elastic.co/search-labs/blog/localai-for-text-embeddings)
+
+::::
+
+
+
+## Getting started [playground-getting-started]
+
+:::{image} ../../../images/kibana-get-started.png
+:alt: get started
+:class: screenshot
+:::
+
+
+### Connect to LLM provider [playground-getting-started-connect]
+
+To get started with Playground, you need to create a [connector](../../../deploy-manage/manage-connectors.md) for your LLM provider. You can also connect to [locally hosted LLMs](playground.md#playground-local-llms) that are compatible with the OpenAI API, by using the OpenAI connector.
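+
+If you plan to point the OpenAI connector at a locally hosted model, it can help to first confirm that the endpoint responds. The sketch below is an assumption-laden example using the `openai` Python package: the base URL is LM Studio’s default local server address and the model name is a placeholder, so adjust both for your setup.
+
+```python
+# Sanity-check a locally hosted, OpenAI-compatible endpoint before
+# creating the connector. Base URL and model name are placeholders.
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="http://localhost:1234/v1",  # LM Studio's default; adjust as needed
+    api_key="any-value",  # local servers typically ignore the key, but the client requires one
+)
+
+reply = client.chat.completions.create(
+    model="local-model",  # placeholder; use the model name your server exposes
+    messages=[{"role": "user", "content": "Reply with one short sentence."}],
+)
+print(reply.choices[0].message.content)
+```
+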
+To connect to an LLM provider, follow these steps on the Playground landing page:
+
+1. Under **Connect to an LLM**, click **Create connector**.
+2. Select your **LLM provider**.
+3. **Name** your connector.
+4. Select a **URL endpoint** (or use the default).
+5. Enter **access credentials** for your LLM provider. (If you’re running a locally hosted LLM using the OpenAI connector, you must input a value in the API key form, but the specific value doesn’t matter.)
+
+::::{tip}
+If you need to update a connector, or add a new one, click the 🔧 **Manage** button beside **Model settings**.
+
+::::
+
+
+
+### Ingest data (optional) [playground-getting-started-ingest]
+
+*You can skip this step if you already have data in one or more {{es}} indices.*
+
+There are many options for ingesting data into {{es}}, including:
+
+* The [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) for web content (**NOTE**: Not yet available in *Serverless*)
+* [Elastic connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html) for data synced from third-party sources
+* The {{es}} [Bulk API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html) for JSON documents
+
+    ::::{dropdown} **Expand** for example
+    To add a few documents to an index called `books`, run the following in Dev Tools Console:
+
+    ```console
+    POST /_bulk
+    { "index" : { "_index" : "books" } }
+    {"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470}
+    { "index" : { "_index" : "books" } }
+    {"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585}
+    { "index" : { "_index" : "books" } }
+    {"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328}
+    { "index" : { "_index" : "books" } }
+    {"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227}
+    { "index" : { "_index" : "books" } }
+    {"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268}
+    { "index" : { "_index" : "books" } }
+    {"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311}
+    ```
+
+    ::::
+
+
+We’ve also provided some Jupyter notebooks to easily ingest sample data into {{es}}. Find these in the [elasticsearch-labs](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/ingestion-and-chunking) repository. These notebooks use the official {{es}} Python client.
+
+
+### Select {{es}} indices [playground-getting-started-index]
+
+Once you’ve connected to your LLM provider, it’s time to choose the data you want to search.
+
+1. Click **Add data sources**.
+2. Select one or more {{es}} indices.
+3. Click **Save and continue** to launch the chat interface.
+
+:::::{tip}
+You can always add or remove indices later by selecting the **Data** button from the main Playground UI.
+
+:::{image} ../../../images/kibana-data-button.png
+:alt: data button
+:class: screenshot
+:::
+
+:::::
+
+
+
+### Chat and query modes [playground-getting-started-chat-query-modes]
+
+Since 8.15.0 (and earlier for {{es}} Serverless), the main Playground UI has two modes:
+
+* **Chat mode**: The default mode, where you can chat with your data via the LLM.
+* **Query mode**: View and modify the {{es}} query generated by the chat interface.
+
+The **chat mode** is selected when you first set up your Playground instance.
+
+:::{image} ../../../images/kibana-chat-interface.png
+:alt: chat interface
+:class: screenshot
+:::
+
+To switch to **query mode**, select **Query** from the main UI.
+
+:::{image} ../../../images/kibana-query-interface.png
+:alt: query interface
+:class: screenshot
+:::
+
+::::{tip}
+Learn more about the underlying {{es}} queries used to search your data in [View and modify queries](playground-query.md).
+
+::::
+
+
+
+### Set up the chat interface [playground-getting-started-setup-chat]
+
+You can start chatting with your data immediately, but you might want to tweak some defaults first.
+
+You can adjust the following under **Model settings**:
+
+* **Model**. The model used for generating responses.
+* **Instructions**. Also known as the *system prompt*, these initial instructions and guidelines define the behavior of the model throughout the conversation. Be **clear and specific** for best results.
+* **Include citations**. A toggle to include citations from the relevant {{es}} documents in responses.
+
+Playground also uses another LLM under the hood to encode all previous questions and responses and make them available to the main model. This ensures the model has "conversational memory".
+
+Under **Indices**, you can edit which {{es}} indices will be searched. This will affect the underlying {{es}} query.
+
+::::{tip}
+Click **✨ Regenerate** to resend the last query to the model for a fresh response.
+
+Click **⟳ Clear chat** to clear the chat history and start a new conversation.
+
+::::
+
+
+
+### View and download Python code [playground-getting-started-view-code]
+
+Use the **View code** button to see the Python code that powers the chat interface. You can integrate it into your own application, modifying it as needed. We currently support two implementation options:
+
+* {{es}} Python Client + LLM provider
+* LangChain + LLM provider
+
+:::{image} ../../../images/kibana-view-code-button.png
+:alt: view code button
+:class: screenshot
+:::
+
+
+### Next steps [playground-next-steps]
+
+Once you’ve got Playground up and running, and you’ve tested out the chat interface, you might want to explore some more advanced topics:
+
+* [Optimize model context](playground-context.md)
+* [View and modify queries](playground-query.md)
+* [Troubleshooting](playground-troubleshooting.md)
+

-% Internal links rely on the following IDs being on this page (e.g. 
as a heading ID, paragraph ID, etc): -$$$playground-getting-started-ingest$$$ -$$$playground-local-llms$$$ \ No newline at end of file From 066694ba7d14607f5efcd5e2309479cc6e121421 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 15:14:03 +0100 Subject: [PATCH 28/30] image paths --- solutions/search/rag/playground-query.md | 2 +- solutions/search/rag/playground.md | 10 +++++----- solutions/search/ranking/learning-to-rank-ltr.md | 6 +++--- .../search-approaches/near-real-time-search.md | 4 ++-- solutions/search/search-connection-details.md | 12 ++++++------ .../search/serverless-elasticsearch-get-started.md | 2 +- .../search/site-or-app/behavioral-analytics-event.md | 8 ++++---- 7 files changed, 22 insertions(+), 22 deletions(-) diff --git a/solutions/search/rag/playground-query.md b/solutions/search/rag/playground-query.md index 65b70854ad..3bf44ff3ce 100644 --- a/solutions/search/rag/playground-query.md +++ b/solutions/search/rag/playground-query.md @@ -23,7 +23,7 @@ The `{{query}}` variable represents the user’s question, rewritten as an {{es} The following screenshot shows the query editor in the Playground UI. In this simple example, the `books` index has two fields: `author` and `name`. Selecting a field adds it to the `fields` array in the query. -:::{image} ../../../images/kibana-query-interface.png +:::{image} ../../images/kibana-query-interface.png :alt: View and modify queries :class: screenshot ::: diff --git a/solutions/search/rag/playground.md b/solutions/search/rag/playground.md index c2bf28b0bd..8c4c648541 100644 --- a/solutions/search/rag/playground.md +++ b/solutions/search/rag/playground.md @@ -112,7 +112,7 @@ You can also use locally hosted LLMs that are compatible with the OpenAI SDK. On ## Getting started [playground-getting-started] -:::{image} ../../../images/kibana-get-started.png +:::{image} ../../images/kibana-get-started.png :alt: get started :class: screenshot ::: @@ -183,7 +183,7 @@ Once you’ve connected to your LLM provider, it’s time to choose the data you :::::{tip} You can always add or remove indices later by selecting the **Data** button from the main Playground UI. -:::{image} ../../../images/kibana-data-button.png +:::{image} ../../images/kibana-data-button.png :alt: data button :class: screenshot ::: @@ -201,14 +201,14 @@ Since 8.15.0 (and earlier for {{es}} Serverless), the main Playground UI has two The **chat mode** is selected when you first set up your Playground instance. -:::{image} ../../../images/kibana-chat-interface.png +:::{image} ../../images/kibana-chat-interface.png :alt: chat interface :class: screenshot ::: To switch to **query mode**, select **Query** from the main UI. 
-:::{image} ../../../images/kibana-query-interface.png +:::{image} ../../images/kibana-query-interface.png :alt: query interface :class: screenshot ::: @@ -250,7 +250,7 @@ Use the **View code** button to see the Python code that powers the chat interfa * {{es}} Python Client + LLM provider * LangChain + LLM provider -:::{image} ../../../images/kibana-view-code-button.png +:::{image} ../../images/kibana-view-code-button.png :alt: view code button :class: screenshot ::: diff --git a/solutions/search/ranking/learning-to-rank-ltr.md b/solutions/search/ranking/learning-to-rank-ltr.md index 476ac4c06b..71eec97b54 100644 --- a/solutions/search/ranking/learning-to-rank-ltr.md +++ b/solutions/search/ranking/learning-to-rank-ltr.md @@ -12,7 +12,7 @@ This feature was introduced in version 8.12.0 and is only available to certain s Learning To Rank (LTR) uses a trained machine learning (ML) model to build a ranking function for your search engine. Typically, the model is used as a second stage re-ranker, to improve the relevance of search results returned by a simpler, first stage retrieval algorithm. The LTR function takes a list of documents and a search context and outputs ranked documents: -:::{image} ../../../images/elasticsearch-reference-learning-to-rank-overview.png +:::{image} ../../images/elasticsearch-reference-learning-to-rank-overview.png :alt: Learning To Rank overview :title: Learning To Rank overview :name: learning-to-rank-overview-diagram @@ -30,7 +30,7 @@ The LTR model is usually trained on a judgment list, which is a set of queries a The judgment list is the main input used to train the model. It consists of a dataset that contains pairs of queries and documents, along with their corresponding relevance labels. The relevance judgment is typically either a binary (relevant/irrelevant) or a more granular label, such as a grade between 0 (completely irrelevant) to 4 (highly relevant). The example below uses a graded relevance judgment. -:::{image} ../../../images/elasticsearch-reference-learning-to-rank-judgment-list.png +:::{image} ../../images/elasticsearch-reference-learning-to-rank-judgment-list.png :alt: Judgment list example :title: Judgment list example :name: learning-to-rank-judgment-list-example @@ -59,7 +59,7 @@ These features fall into one of three main categories: To prepare the dataset for training, the features are added to the judgment list: -:::{image} ../../../images/elasticsearch-reference-learning-to-rank-feature-extraction.png +:::{image} ../../images/elasticsearch-reference-learning-to-rank-feature-extraction.png :alt: Judgment list with features :title: Judgment list with features :name: learning-to-rank-judgement-feature-extraction diff --git a/solutions/search/search-approaches/near-real-time-search.md b/solutions/search/search-approaches/near-real-time-search.md index c3cb8a5c65..9b3c9c95ca 100644 --- a/solutions/search/search-approaches/near-real-time-search.md +++ b/solutions/search/search-approaches/near-real-time-search.md @@ -11,7 +11,7 @@ Lucene, the Java libraries on which {{es}} is based, introduced the concept of p Sitting between {{es}} and the disk is the filesystem cache. Documents in the in-memory indexing buffer ([Figure 1](#img-pre-refresh)) are written to a new segment ([Figure 2](#img-post-refresh)). The new segment is written to the filesystem cache first (which is cheap) and only later is it flushed to disk (which is expensive). However, after a file is in the cache, it can be opened and read just like any other file. 
-:::{image} ../../../images/elasticsearch-reference-lucene-in-memory-buffer.png +:::{image} ../../images/elasticsearch-reference-lucene-in-memory-buffer.png :alt: A Lucene index with new documents in the in-memory buffer :title: A Lucene index with new documents in the in-memory buffer :name: img-pre-refresh @@ -19,7 +19,7 @@ Sitting between {{es}} and the disk is the filesystem cache. Documents in the in Lucene allows new segments to be written and opened, making the documents they contain visible to search ​without performing a full commit. This is a much lighter process than a commit to disk, and can be done frequently without degrading performance. -:::{image} ../../../images/elasticsearch-reference-lucene-written-not-committed.png +:::{image} ../../images/elasticsearch-reference-lucene-written-not-committed.png :alt: The buffer contents are written to a segment, which is searchable, but is not yet committed :title: The buffer contents are written to a segment, which is searchable, but is not yet committed :name: img-post-refresh diff --git a/solutions/search/search-connection-details.md b/solutions/search/search-connection-details.md index 9824056a0a..48fc0bb424 100644 --- a/solutions/search/search-connection-details.md +++ b/solutions/search/search-connection-details.md @@ -21,14 +21,14 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. Navigate to the Elastic Cloud home page. 2. In the main menu, click **Manage this deployment**. - :::{image} ../../../images/kibana-manage-deployment.png + :::{image} ../../images/kibana-manage-deployment.png :alt: manage deployment :class: screenshot ::: 3. The Cloud ID is displayed on the right side of the page. - :::{image} ../../../images/kibana-cloud-id.png + :::{image} ../../images/kibana-cloud-id.png :alt: cloud id :class: screenshot ::: @@ -39,14 +39,14 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. To navigate to **API keys**, use the [**global search bar**](../../get-started/the-stack.md#kibana-navigation-search). - :::{image} ../../../images/kibana-api-keys-search-bar.png + :::{image} ../../images/kibana-api-keys-search-bar.png :alt: api keys search bar :class: screenshot ::: 2. Click **Create API key**. - :::{image} ../../../images/kibana-click-create-api-key.png + :::{image} ../../images/kibana-click-create-api-key.png :alt: click create api key :class: screenshot ::: @@ -62,7 +62,7 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. Navigate to the serverless project’s home page. 2. Scroll down to the **Copy your connection details** section, and copy the **Elasticsearch endpoint**. - :::{image} ../../../images/kibana-serverless-connection-details.png + :::{image} ../../images/kibana-serverless-connection-details.png :alt: serverless connection details :class: screenshot ::: @@ -79,7 +79,7 @@ The **Cloud ID** is also displayed in the Copy your connection details section, 1. Navigate to the serverless project’s home page. 2. Scroll down to the **Add an API Key** section, and click **New**. 
- :::{image} ../../../images/kibana-serverless-create-an-api-key.png + :::{image} ../../images/kibana-serverless-create-an-api-key.png :alt: serverless create an api key :class: screenshot ::: diff --git a/solutions/search/serverless-elasticsearch-get-started.md b/solutions/search/serverless-elasticsearch-get-started.md index f15f5ccbe2..19c7132c50 100644 --- a/solutions/search/serverless-elasticsearch-get-started.md +++ b/solutions/search/serverless-elasticsearch-get-started.md @@ -60,7 +60,7 @@ Once your project is set up, you’ll be directed to a page where you can create 1. Enter a name for your index. 2. Click **Create my index**. You can also create the index by clicking on **Code** to view and run code through the command line. - :::{image} ../../../images/serverless-get-started-create-an-index.png + :::{image} ../../images/serverless-get-started-create-an-index.png :alt: Create an index. ::: diff --git a/solutions/search/site-or-app/behavioral-analytics-event.md b/solutions/search/site-or-app/behavioral-analytics-event.md index c5a796678f..82947cab94 100644 --- a/solutions/search/site-or-app/behavioral-analytics-event.md +++ b/solutions/search/site-or-app/behavioral-analytics-event.md @@ -31,7 +31,7 @@ This allows you to quickly check both absolute numbers and trends (tracked in pe The following screenshot shows an example **Overview** dashboard: -:::{image} ../../../images/elasticsearch-reference-analytics-overview-dashboard.png +:::{image} ../../images/elasticsearch-reference-analytics-overview-dashboard.png :alt: Analytics Overview dashboard showing the number of searches :class: screenshot ::: @@ -53,7 +53,7 @@ You can also easily sort in ascending or descending order by clicking on the hea The following screenshot shows the **Locations** tab of an **Explorer** dashboard, with a list of top locations in descending order: -:::{image} ../../../images/elasticsearch-reference-analytics-explorer-dashboard.png +:::{image} ../../images/elasticsearch-reference-analytics-explorer-dashboard.png :alt: Explorer dashboard showing the top locations in descending order :class: screenshot ::: @@ -67,7 +67,7 @@ Discover works with [data views^](../../../explore-analyze/find-and-organize/dat The following screenshot shows you where to find the data view dropdown menu in Discover: -:::{image} ../../../images/elasticsearch-reference-discover-data-view-analytics.png +:::{image} ../../images/elasticsearch-reference-discover-data-view-analytics.png :alt: Analytics Discover app showing the data view dropdown menu :class: screenshot ::: @@ -84,7 +84,7 @@ Discover has a lot of options, but here’s a quick overview of how to get start The following screenshot shows a Lens visualization of an `event.action` distribution: -:::{image} ../../../images/elasticsearch-reference-discover-lens-analytics.png +:::{image} ../../images/elasticsearch-reference-discover-lens-analytics.png :alt: Analytics Discover app showing a Lens visualization of an event action distribution :class: screenshot ::: From 538ea6a6da0c09be659030d153c4f5d0001dc15d Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 15:18:02 +0100 Subject: [PATCH 29/30] Revert "image paths" This reverts commit 066694ba7d14607f5efcd5e2309479cc6e121421. 
--- solutions/search/rag/playground-query.md | 2 +- solutions/search/rag/playground.md | 10 +++++----- solutions/search/ranking/learning-to-rank-ltr.md | 6 +++--- .../search-approaches/near-real-time-search.md | 4 ++-- solutions/search/search-connection-details.md | 12 ++++++------ .../search/serverless-elasticsearch-get-started.md | 2 +- .../search/site-or-app/behavioral-analytics-event.md | 8 ++++---- 7 files changed, 22 insertions(+), 22 deletions(-) diff --git a/solutions/search/rag/playground-query.md b/solutions/search/rag/playground-query.md index 3bf44ff3ce..65b70854ad 100644 --- a/solutions/search/rag/playground-query.md +++ b/solutions/search/rag/playground-query.md @@ -23,7 +23,7 @@ The `{{query}}` variable represents the user’s question, rewritten as an {{es} The following screenshot shows the query editor in the Playground UI. In this simple example, the `books` index has two fields: `author` and `name`. Selecting a field adds it to the `fields` array in the query. -:::{image} ../../images/kibana-query-interface.png +:::{image} ../../../images/kibana-query-interface.png :alt: View and modify queries :class: screenshot ::: diff --git a/solutions/search/rag/playground.md b/solutions/search/rag/playground.md index 8c4c648541..c2bf28b0bd 100644 --- a/solutions/search/rag/playground.md +++ b/solutions/search/rag/playground.md @@ -112,7 +112,7 @@ You can also use locally hosted LLMs that are compatible with the OpenAI SDK. On ## Getting started [playground-getting-started] -:::{image} ../../images/kibana-get-started.png +:::{image} ../../../images/kibana-get-started.png :alt: get started :class: screenshot ::: @@ -183,7 +183,7 @@ Once you’ve connected to your LLM provider, it’s time to choose the data you :::::{tip} You can always add or remove indices later by selecting the **Data** button from the main Playground UI. -:::{image} ../../images/kibana-data-button.png +:::{image} ../../../images/kibana-data-button.png :alt: data button :class: screenshot ::: @@ -201,14 +201,14 @@ Since 8.15.0 (and earlier for {{es}} Serverless), the main Playground UI has two The **chat mode** is selected when you first set up your Playground instance. -:::{image} ../../images/kibana-chat-interface.png +:::{image} ../../../images/kibana-chat-interface.png :alt: chat interface :class: screenshot ::: To switch to **query mode**, select **Query** from the main UI. -:::{image} ../../images/kibana-query-interface.png +:::{image} ../../../images/kibana-query-interface.png :alt: query interface :class: screenshot ::: @@ -250,7 +250,7 @@ Use the **View code** button to see the Python code that powers the chat interfa * {{es}} Python Client + LLM provider * LangChain + LLM provider -:::{image} ../../images/kibana-view-code-button.png +:::{image} ../../../images/kibana-view-code-button.png :alt: view code button :class: screenshot ::: diff --git a/solutions/search/ranking/learning-to-rank-ltr.md b/solutions/search/ranking/learning-to-rank-ltr.md index 71eec97b54..476ac4c06b 100644 --- a/solutions/search/ranking/learning-to-rank-ltr.md +++ b/solutions/search/ranking/learning-to-rank-ltr.md @@ -12,7 +12,7 @@ This feature was introduced in version 8.12.0 and is only available to certain s Learning To Rank (LTR) uses a trained machine learning (ML) model to build a ranking function for your search engine. Typically, the model is used as a second stage re-ranker, to improve the relevance of search results returned by a simpler, first stage retrieval algorithm. 
The LTR function takes a list of documents and a search context and outputs ranked documents: -:::{image} ../../images/elasticsearch-reference-learning-to-rank-overview.png +:::{image} ../../../images/elasticsearch-reference-learning-to-rank-overview.png :alt: Learning To Rank overview :title: Learning To Rank overview :name: learning-to-rank-overview-diagram @@ -30,7 +30,7 @@ The LTR model is usually trained on a judgment list, which is a set of queries a The judgment list is the main input used to train the model. It consists of a dataset that contains pairs of queries and documents, along with their corresponding relevance labels. The relevance judgment is typically either a binary (relevant/irrelevant) or a more granular label, such as a grade between 0 (completely irrelevant) to 4 (highly relevant). The example below uses a graded relevance judgment. -:::{image} ../../images/elasticsearch-reference-learning-to-rank-judgment-list.png +:::{image} ../../../images/elasticsearch-reference-learning-to-rank-judgment-list.png :alt: Judgment list example :title: Judgment list example :name: learning-to-rank-judgment-list-example @@ -59,7 +59,7 @@ These features fall into one of three main categories: To prepare the dataset for training, the features are added to the judgment list: -:::{image} ../../images/elasticsearch-reference-learning-to-rank-feature-extraction.png +:::{image} ../../../images/elasticsearch-reference-learning-to-rank-feature-extraction.png :alt: Judgment list with features :title: Judgment list with features :name: learning-to-rank-judgement-feature-extraction diff --git a/solutions/search/search-approaches/near-real-time-search.md b/solutions/search/search-approaches/near-real-time-search.md index 9b3c9c95ca..c3cb8a5c65 100644 --- a/solutions/search/search-approaches/near-real-time-search.md +++ b/solutions/search/search-approaches/near-real-time-search.md @@ -11,7 +11,7 @@ Lucene, the Java libraries on which {{es}} is based, introduced the concept of p Sitting between {{es}} and the disk is the filesystem cache. Documents in the in-memory indexing buffer ([Figure 1](#img-pre-refresh)) are written to a new segment ([Figure 2](#img-post-refresh)). The new segment is written to the filesystem cache first (which is cheap) and only later is it flushed to disk (which is expensive). However, after a file is in the cache, it can be opened and read just like any other file. -:::{image} ../../images/elasticsearch-reference-lucene-in-memory-buffer.png +:::{image} ../../../images/elasticsearch-reference-lucene-in-memory-buffer.png :alt: A Lucene index with new documents in the in-memory buffer :title: A Lucene index with new documents in the in-memory buffer :name: img-pre-refresh @@ -19,7 +19,7 @@ Sitting between {{es}} and the disk is the filesystem cache. Documents in the in Lucene allows new segments to be written and opened, making the documents they contain visible to search ​without performing a full commit. This is a much lighter process than a commit to disk, and can be done frequently without degrading performance. 
-:::{image} ../../images/elasticsearch-reference-lucene-written-not-committed.png +:::{image} ../../../images/elasticsearch-reference-lucene-written-not-committed.png :alt: The buffer contents are written to a segment, which is searchable, but is not yet committed :title: The buffer contents are written to a segment, which is searchable, but is not yet committed :name: img-post-refresh diff --git a/solutions/search/search-connection-details.md b/solutions/search/search-connection-details.md index 48fc0bb424..9824056a0a 100644 --- a/solutions/search/search-connection-details.md +++ b/solutions/search/search-connection-details.md @@ -21,14 +21,14 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. Navigate to the Elastic Cloud home page. 2. In the main menu, click **Manage this deployment**. - :::{image} ../../images/kibana-manage-deployment.png + :::{image} ../../../images/kibana-manage-deployment.png :alt: manage deployment :class: screenshot ::: 3. The Cloud ID is displayed on the right side of the page. - :::{image} ../../images/kibana-cloud-id.png + :::{image} ../../../images/kibana-cloud-id.png :alt: cloud id :class: screenshot ::: @@ -39,14 +39,14 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. To navigate to **API keys**, use the [**global search bar**](../../get-started/the-stack.md#kibana-navigation-search). - :::{image} ../../images/kibana-api-keys-search-bar.png + :::{image} ../../../images/kibana-api-keys-search-bar.png :alt: api keys search bar :class: screenshot ::: 2. Click **Create API key**. - :::{image} ../../images/kibana-click-create-api-key.png + :::{image} ../../../images/kibana-click-create-api-key.png :alt: click create api key :class: screenshot ::: @@ -62,7 +62,7 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. Navigate to the serverless project’s home page. 2. Scroll down to the **Copy your connection details** section, and copy the **Elasticsearch endpoint**. - :::{image} ../../images/kibana-serverless-connection-details.png + :::{image} ../../../images/kibana-serverless-connection-details.png :alt: serverless connection details :class: screenshot ::: @@ -79,7 +79,7 @@ The **Cloud ID** is also displayed in the Copy your connection details section, 1. Navigate to the serverless project’s home page. 2. Scroll down to the **Add an API Key** section, and click **New**. - :::{image} ../../images/kibana-serverless-create-an-api-key.png + :::{image} ../../../images/kibana-serverless-create-an-api-key.png :alt: serverless create an api key :class: screenshot ::: diff --git a/solutions/search/serverless-elasticsearch-get-started.md b/solutions/search/serverless-elasticsearch-get-started.md index 19c7132c50..f15f5ccbe2 100644 --- a/solutions/search/serverless-elasticsearch-get-started.md +++ b/solutions/search/serverless-elasticsearch-get-started.md @@ -60,7 +60,7 @@ Once your project is set up, you’ll be directed to a page where you can create 1. Enter a name for your index. 2. Click **Create my index**. You can also create the index by clicking on **Code** to view and run code through the command line. - :::{image} ../../images/serverless-get-started-create-an-index.png + :::{image} ../../../images/serverless-get-started-create-an-index.png :alt: Create an index. 
::: diff --git a/solutions/search/site-or-app/behavioral-analytics-event.md b/solutions/search/site-or-app/behavioral-analytics-event.md index 82947cab94..c5a796678f 100644 --- a/solutions/search/site-or-app/behavioral-analytics-event.md +++ b/solutions/search/site-or-app/behavioral-analytics-event.md @@ -31,7 +31,7 @@ This allows you to quickly check both absolute numbers and trends (tracked in pe The following screenshot shows an example **Overview** dashboard: -:::{image} ../../images/elasticsearch-reference-analytics-overview-dashboard.png +:::{image} ../../../images/elasticsearch-reference-analytics-overview-dashboard.png :alt: Analytics Overview dashboard showing the number of searches :class: screenshot ::: @@ -53,7 +53,7 @@ You can also easily sort in ascending or descending order by clicking on the hea The following screenshot shows the **Locations** tab of an **Explorer** dashboard, with a list of top locations in descending order: -:::{image} ../../images/elasticsearch-reference-analytics-explorer-dashboard.png +:::{image} ../../../images/elasticsearch-reference-analytics-explorer-dashboard.png :alt: Explorer dashboard showing the top locations in descending order :class: screenshot ::: @@ -67,7 +67,7 @@ Discover works with [data views^](../../../explore-analyze/find-and-organize/dat The following screenshot shows you where to find the data view dropdown menu in Discover: -:::{image} ../../images/elasticsearch-reference-discover-data-view-analytics.png +:::{image} ../../../images/elasticsearch-reference-discover-data-view-analytics.png :alt: Analytics Discover app showing the data view dropdown menu :class: screenshot ::: @@ -84,7 +84,7 @@ Discover has a lot of options, but here’s a quick overview of how to get start The following screenshot shows a Lens visualization of an `event.action` distribution: -:::{image} ../../images/elasticsearch-reference-discover-lens-analytics.png +:::{image} ../../../images/elasticsearch-reference-discover-lens-analytics.png :alt: Analytics Discover app showing a Lens visualization of an event action distribution :class: screenshot ::: From b5080757262d30d6186c2ef526805a8cdde2d65e Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Mon, 10 Feb 2025 15:22:35 +0100 Subject: [PATCH 30/30] images --- solutions/search/search-connection-details.md | 12 ++++++------ .../search/serverless-elasticsearch-get-started.md | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/solutions/search/search-connection-details.md b/solutions/search/search-connection-details.md index 9824056a0a..48fc0bb424 100644 --- a/solutions/search/search-connection-details.md +++ b/solutions/search/search-connection-details.md @@ -21,14 +21,14 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. Navigate to the Elastic Cloud home page. 2. In the main menu, click **Manage this deployment**. - :::{image} ../../../images/kibana-manage-deployment.png + :::{image} ../../images/kibana-manage-deployment.png :alt: manage deployment :class: screenshot ::: 3. The Cloud ID is displayed on the right side of the page. - :::{image} ../../../images/kibana-cloud-id.png + :::{image} ../../images/kibana-cloud-id.png :alt: cloud id :class: screenshot ::: @@ -39,14 +39,14 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. To navigate to **API keys**, use the [**global search bar**](../../get-started/the-stack.md#kibana-navigation-search). 
- :::{image} ../../../images/kibana-api-keys-search-bar.png + :::{image} ../../images/kibana-api-keys-search-bar.png :alt: api keys search bar :class: screenshot ::: 2. Click **Create API key**. - :::{image} ../../../images/kibana-click-create-api-key.png + :::{image} ../../images/kibana-click-create-api-key.png :alt: click create api key :class: screenshot ::: @@ -62,7 +62,7 @@ To connect to your {{es}} deployment, you need either a Cloud ID or an {{es}} en 1. Navigate to the serverless project’s home page. 2. Scroll down to the **Copy your connection details** section, and copy the **Elasticsearch endpoint**. - :::{image} ../../../images/kibana-serverless-connection-details.png + :::{image} ../../images/kibana-serverless-connection-details.png :alt: serverless connection details :class: screenshot ::: @@ -79,7 +79,7 @@ The **Cloud ID** is also displayed in the Copy your connection details section, 1. Navigate to the serverless project’s home page. 2. Scroll down to the **Add an API Key** section, and click **New**. - :::{image} ../../../images/kibana-serverless-create-an-api-key.png + :::{image} ../../images/kibana-serverless-create-an-api-key.png :alt: serverless create an api key :class: screenshot ::: diff --git a/solutions/search/serverless-elasticsearch-get-started.md b/solutions/search/serverless-elasticsearch-get-started.md index f15f5ccbe2..19c7132c50 100644 --- a/solutions/search/serverless-elasticsearch-get-started.md +++ b/solutions/search/serverless-elasticsearch-get-started.md @@ -60,7 +60,7 @@ Once your project is set up, you’ll be directed to a page where you can create 1. Enter a name for your index. 2. Click **Create my index**. You can also create the index by clicking on **Code** to view and run code through the command line. - :::{image} ../../../images/serverless-get-started-create-an-index.png + :::{image} ../../images/serverless-get-started-create-an-index.png :alt: Create an index. :::