Commit bbdb35f

szabosteve, kosabogi, and leemthompo authored
[SEARCH] Improves bring your own vectors page based on SEO guidelines (#2781)
## Description

This PR optimizes the content of the `Bring your own vectors` page based on the SEO best practices outlined [here](https://stunning-adventure-qrvr1k2.pages.github.io/style-guide/seo/).

---------

Co-authored-by: kosabogi <[email protected]>
Co-authored-by: Liam Thompson <[email protected]>
1 parent 4a41495 commit bbdb35f

1 file changed

+17
-29
lines changed

solutions/search/vector/bring-own-vectors.md

Lines changed: 17 additions & 29 deletions
Original file line number | Diff line number | Diff line change
@@ -10,15 +10,13 @@ products:
1010
description: An introduction to vectors and knn search in Elasticsearch.
1111
---
1212

13-
# Bring your own dense vectors [bring-your-own-vectors]
13+
# Bring your own dense vectors to {{es}} [bring-your-own-vectors]
1414

15-
{{es}} enables you store and search mathematical representations of your content called _embeddings_ or _vectors_, which help machines understand and process your data more effectively.
16-
There are two types of representation (_dense_ and _sparse_), which are suited to different types of queries and use cases (for example, finding similar images and content or storing expanded terms and weights).
15+
{{es}} enables you to store and search mathematical representations of your content - _embeddings_ or _vectors_ - which power AI-driven relevance. There are two types of vector representation - _dense_ and _sparse_ - suited to different queries and use cases (for example, finding similar images and content or storing expanded terms and weights).
1716

18-
In this introduction to [vector search](/solutions/search/vector.md), you'll store and search for dense vectors.
19-
You'll also learn the syntax for searching these documents using a [k-nearest neighbour](/solutions/search/vector/knn.md) (kNN) query.
17+
In this introduction to [vector search](/solutions/search/vector.md), you’ll store and search for dense vectors in {{es}}. You’ll also learn the syntax for querying these documents with a [k-nearest neighbour](/solutions/search/vector/knn.md) (kNN) query.
2018

21-
## Prerequisites
19+
## Prerequisites for vector search
2220

2321
- If you're using {{es-serverless}}, create a project with the general purpose configuration. To add the sample data, you must have a `developer` or `admin` predefined role or an equivalent custom role.
2422
- If you're using {{ech}} or a self-managed cluster, start {{es}} and {{kib}}. The simplest method to complete the steps in this guide is to log in with a user that has the `superuser` built-in role.
@@ -27,11 +25,9 @@ To learn about role-based access control, check out [](/deploy-manage/users-role
2725

2826
## Create a vector database
2927

30-
When you create vectors (or _vectorize_ your data), you convert complex and nuanced content (such as text, videos, images, or audio) into multidimensional numerical representations.
31-
They must be stored in specialized data structures designed to ensure efficient similarity search and speedy vector distance calculations.
28+
When you create vectors (or _vectorize_ your data), you convert complex content (text, images, audio, video) into multidimensional numeric representations. These vectors are stored in specialized data structures that enable efficient similarity search and fast kNN distance calculations.
3229

33-
In this quide, you'll use documents that already have dense vector embeddings.
34-
To deploy a vector embedding model in {{es}} and generate vectors while ingesting and searching your data, refer to the links in [Learn more](#bring-your-own-vectors-learn-more).
30+
In this guide, you’ll use documents that already include dense vector embeddings. To deploy a vector embedding model in {{es}} and generate vectors during ingest and search, refer to the links in [Learn more](#bring-your-own-vectors-learn-more).
3531

3632
::::{tip}
3733
This is an advanced use case that uses the `dense_vector` field type. Refer to [](/solutions/search/semantic-search.md) for an overview of your options for semantic search with {{es}}.
@@ -47,7 +43,7 @@ Each document in our simple data set will have:
4743
* An embedding of that review: stored in a `review_vector` field, which is defined as a [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) data type.
4844

4945
:::{tip}
50-
The `dense_vector` type automatically uses `int8_hnsw` quantization by default to reduce the memory footprint required when searching float vectors. Learn more about balancing performance and accuracy in [Dense vector quantization](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization).
46+
The `dense_vector` type automatically uses `int8_hnsw` quantization by default to reduce the memory footprint when searching float vectors. Learn how to balance performance and accuracy in [Dense vector quantization](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization).
5147
:::
5248

5349
The following API request defines the `review_text` and `review_vector` fields:
@@ -75,9 +71,7 @@ PUT /amazon-reviews
7571
2. The `index` parameter is set to `true` to enable the use of the `knn` query.
7672
3. The `similarity` parameter defines the similarity function used to compare the query vector to the document vectors. `cosine` is the default similarity function for `dense_vector` fields in {{es}}.
7773

78-
Here we're using an 8-dimensional embedding for readability.
79-
The vectors that neural network models work with can have several hundreds or even thousands of dimensions that represent a point in a multi-dimensional space.
80-
Each vector dimension represents a _feature_ or a characteristic of the unstructured data.
74+
Here we’re using an 8-dimensional embedding for readability. The vectors that neural network models work with can have several hundreds or even thousands of dimensions that represent a point in a multi-dimensional space. Each dimension represents a feature or characteristic of the unstructured data.
8175
::::
8276
::::{step} Add documents with embeddings
8377

@@ -113,9 +107,7 @@ POST /_bulk
113107

114108
## Test vector search [bring-your-own-vectors-search-documents]
115109

116-
Now you can query these document vectors using a [`knn` retriever](elasticsearch://reference/elasticsearch/rest-apis/retrievers.md#knn-retriever).
117-
`knn` is a type of vector search, which finds the `k` most similar documents to a query vector.
118-
Here we're using a raw vector for the query text for demonstration purposes:
110+
Now you can query these document vectors using a [`knn` retriever](elasticsearch://reference/elasticsearch/rest-apis/retrievers.md#knn-retriever). `knn` is a type of vector similarity search that finds the `k` most similar documents to a query vector. Here we're using a raw vector for the query text for demonstration purposes:
119111

120112
```console
121113
POST /amazon-reviews/_search
@@ -131,31 +123,27 @@ POST /amazon-reviews/_search
131123
}
132124
```
133125

134-
1. A raw vector serves as the query text in this example. In a real-world scenario, you'll need to generate vectors for queries using an embedding model.
126+
1. A raw vector serves as the query text in this example. In a real-world scenario, you'll generate query vectors using an embedding model.
135127
2. The `k` parameter specifies the number of results to return.
136128
3. The `num_candidates` parameter is optional. It limits the number of candidates returned by the search node. This can improve performance and reduce costs.
137129

138-
## Next steps
130+
## Next steps: implementing vector search
139131

140-
If you want to try a similar set of steps from an {{es}} client, check out the guided index workflow:
132+
If you want to try a similar workflow from an {{es}} client, use the guided index workflow:
141133

142-
- If you're using Elasticsearch Serverless, go to **{{es}} > Home**, select the vector search workflow, and **Create a vector optimized index**.
143-
- If you're using {{ech}} or a self-managed cluster, go to **Elasticsearch > Home** and click **Create API index**. Select the vector search workflow.
134+
* If you're using {{es}} Serverless, go to **{{es}} > Home**, select the vector search workflow, and **Create a vector optimized index**.
135+
* If you're using {{ech}} or a self-managed cluster, go to **{{es}} > Home** and click **Create API index**. Select the vector search workflow.
144136

145137
When you finish your tests and no longer need the sample data set, delete the index:
146138

147139
```console
148140
DELETE /amazon-reviews
149141
```
150142

151-
## Learn more [bring-your-own-vectors-learn-more]
143+
## Learn more about vector search [bring-your-own-vectors-learn-more]
152144

153-
In these simple examples, we're sending a raw vector for the query text.
154-
In a real-world scenario you won't know the query text ahead of time.
155-
You'll need to generate query vectors, on the fly, using the same embedding model that generated the document vectors.
156-
For this you'll need to deploy a text embedding model in {{es}} and use the [`query_vector_builder` parameter](elasticsearch://reference/query-languages/query-dsl/query-dsl-knn-query.md#knn-query-top-level-parameters).
157-
Alternatively, you can generate vectors client-side and send them directly with the search request.
145+
In these simple examples, we send a raw vector for the query text. In a real-world scenario, you won’t know the query text ahead of time. You’ll generate query vectors on the fly using the same embedding model that produced the document vectors. For this, deploy a text embedding model in {{es}} and use the [`query_vector_builder` parameter](elasticsearch://reference/query-languages/query-dsl/query-dsl-knn-query.md#knn-query-top-level-parameters). Alternatively, you can generate vectors client-side and send them directly with the search request.
158146

159147
For an example of using pipelines to generate text embeddings, check out [](/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md).
160148

161-
To learn about more search options, such as semantic, full-text, and hybrid, go to [](/solutions/search/search-approaches.md).
149+
To learn more about the search options in {{es}}, such as semantic, full-text, and hybrid, refer to [](/solutions/search/search-approaches.md).
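
As a reviewer's aid for the `knn` wording touched in this diff ("finds the `k` most similar documents to a query vector"), the following is a minimal Python sketch of what an exact cosine kNN lookup computes. It is not how {{es}} implements the search: per the page, `dense_vector` fields use `int8_hnsw` by default, which is an approximate, quantized index. The document IDs, the 8-dimensional vectors, and the helper names `cosine_similarity` and `knn` are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vector, documents, k):
    """Return the IDs of the k documents most similar to query_vector.

    This is an exact, brute-force scan over every document, shown only
    to illustrate the idea; it is not an HNSW or quantized search.
    """
    scored = [
        (cosine_similarity(query_vector, vector), doc_id)
        for doc_id, vector in documents.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical 8-dimensional embeddings (illustrative values only).
documents = {
    "doc-a": [1, 0, 0, 0, 0, 0, 0, 0],
    "doc-b": [0, 1, 0, 0, 0, 0, 0, 0],
    "doc-c": [1, 1, 0, 0, 0, 0, 0, 0],
}

query = [1, 0.1, 0, 0, 0, 0, 0, 0]
print(knn(query, documents, k=2))  # ['doc-a', 'doc-c']
```

The query points almost entirely along the first dimension, so `doc-a` ranks first, `doc-c` (which shares that dimension) second, and `doc-b` is dropped because `k` is 2.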
