
Commit 88a3793

Merge pull request #1994 from redis/DOC-5557-lettuce-query-vec
DOC-5557 Lettuce vector index/query page
---
categories:
- docs
- develop
- stack
- oss
- rs
- rc
- oss
- kubernetes
- clients
description: Learn how to index and query vector embeddings with Redis
linkTitle: Index and query vectors
title: Index and query vectors
weight: 3
---

[Redis Query Engine]({{< relref "/develop/ai/search-and-query" >}})
lets you index vector fields in [hash]({{< relref "/develop/data-types/hashes" >}})
or [JSON]({{< relref "/develop/data-types/json" >}}) objects (see the
[Vectors]({{< relref "/develop/ai/search-and-query/vectors" >}})
reference page for more information).
Among other things, vector fields can store *text embeddings*, which are AI-generated vector
representations of the semantic information in pieces of text. The
[vector distance]({{< relref "/develop/ai/search-and-query/vectors#distance-metrics" >}})
between two embeddings indicates how similar they are semantically. By comparing the
similarity of an embedding generated from some query text with embeddings stored in hash
or JSON fields, Redis can retrieve documents that closely match the query in terms
of their meaning.

The example below uses the [Hugging Face](https://huggingface.co/) model
[`all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
to generate the vector embeddings to store and index with Redis Query Engine.
The code is first demonstrated for hash documents, with a
separate section to explain the
[differences with JSON documents](#differences-with-json-documents).

## Initialize

If you are using [Maven](https://maven.apache.org/), add the following
dependencies to your `pom.xml` file:

```xml
<dependency>
    <groupId>io.lettuce</groupId>
    <artifactId>lettuce-core</artifactId>
    <!-- Check for the latest version on Maven Central -->
    <version>6.7.1.RELEASE</version>
</dependency>

<dependency>
    <groupId>ai.djl.huggingface</groupId>
    <artifactId>tokenizers</artifactId>
    <version>0.33.0</version>
</dependency>

<dependency>
    <groupId>ai.djl.pytorch</groupId>
    <artifactId>pytorch-model-zoo</artifactId>
    <version>0.33.0</version>
</dependency>

<dependency>
    <groupId>ai.djl</groupId>
    <artifactId>api</artifactId>
    <version>0.33.0</version>
</dependency>
```

If you are using [Gradle](https://gradle.org/), add the following
dependencies to your `build.gradle` file:

```groovy
compileOnly 'io.lettuce:lettuce-core:6.7.1.RELEASE'
compileOnly 'ai.djl.huggingface:tokenizers:0.33.0'
compileOnly 'ai.djl.pytorch:pytorch-model-zoo:0.33.0'
compileOnly 'ai.djl:api:0.33.0'
```

## Import dependencies

Import the following classes in your source file:

{{< clients-example set="home_query_vec" step="import" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

## Define a helper method

When you store vectors in a hash object, or pass them as query parameters,
you must encode the `float` components of the vector
array as a `byte` string. The helper method `floatArrayToByteBuffer()`
shown below does this for you:

{{< clients-example set="home_query_vec" step="helper_method" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

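If you want a rough idea of what such a helper looks like outside the tabbed example
above, here is a minimal sketch. The method name comes from the text; the class name is
illustrative, and the little-endian, four-bytes-per-component layout assumed here is the
encoding commonly used for `FLOAT32` vector fields:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class VectorUtils {

    // Packs each float component into four little-endian bytes, producing
    // the binary string form used for FLOAT32 vector fields.
    public static ByteBuffer floatArrayToByteBuffer(float[] vector) {
        ByteBuffer buffer = ByteBuffer
                .allocate(Float.BYTES * vector.length)
                .order(ByteOrder.LITTLE_ENDIAN);

        for (float component : vector) {
            buffer.putFloat(component);
        }

        buffer.flip(); // Rewind so the buffer is ready to be read.
        return buffer;
    }
}
```
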
## Create an embedding model instance

The example below uses the
[`all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
model to generate the embeddings. This model generates vectors with 384 dimensions,
regardless of the length of the input text, but note that the input is truncated to 256
tokens (see
[Word piece tokenization](https://huggingface.co/learn/nlp-course/en/chapter6/6)
in the [Hugging Face](https://huggingface.co/) docs to learn more about the way tokens
are related to the original text).

An instance of the [`Predictor`](https://javadoc.io/doc/ai.djl/api/latest/ai/djl/inference/Predictor.html)
class runs the model to generate the embeddings. The code below
creates a `Predictor` that uses the `all-MiniLM-L6-v2` model:

{{< clients-example set="home_query_vec" step="model" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

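If you are not following the tabbed example, the sketch below shows one way to create such
a `Predictor` with DJL. The `djl://` model URL, the `TextEmbeddingTranslatorFactory`, and
the method name are assumptions based on DJL's standard Hugging Face integration rather
than details taken from this page:

```java
import ai.djl.huggingface.translator.TextEmbeddingTranslatorFactory;
import ai.djl.inference.Predictor;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class EmbeddingModel {

    // Loads all-MiniLM-L6-v2 from the DJL model zoo and returns a Predictor
    // that maps a String to a float[384] embedding.
    public static Predictor<String, float[]> createPredictor() throws Exception {
        Criteria<String, float[]> criteria = Criteria.builder()
                .setTypes(String.class, float[].class)
                .optModelUrls(
                    "djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2")
                .optEngine("PyTorch")
                .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                .build();

        ZooModel<String, float[]> model = criteria.loadModel();
        return model.newPredictor();
    }
}
```
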
## Create the index

As noted in [Define a helper method](#define-a-helper-method) above, you must
pass the embeddings to the hash and query commands as a binary string.

Lettuce has an option to specify a `ByteBufferCodec` for the connection to Redis.
This lets you construct binary strings for Redis keys and values conveniently using
the standard
[`ByteBuffer`](https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html)
class (see [Codecs](https://redis.github.io/lettuce/integration-extension/#codecs)
in the Lettuce documentation for more information). However, you will probably find
it more convenient to use the default `StringCodec` for commands that don't require
binary strings. It is therefore helpful to have two connections available, one using
`ByteBufferCodec` and the other using `StringCodec`.

The code below shows how to declare one connection with the
`ByteBufferCodec` and another with the default `StringCodec` in the
try-with-resources block. You also need two separate instances of
`RedisAsyncCommands` to use the two connections:

{{< clients-example set="home_query_vec" step="connect" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

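A rough sketch of this two-connection setup might look like the following. The connection
URI and variable names are illustrative, and `ByteBufferCodec` is the codec referenced in
the text, assuming it is available in your Lettuce version; the imports are those shown in
the Import dependencies step:

```java
RedisClient redisClient = RedisClient.create("redis://localhost:6379");

try (
        // Default codec: String keys and values for ordinary commands.
        StatefulRedisConnection<String, String> connection = redisClient.connect();
        // ByteBufferCodec: binary keys and values for the vector data.
        StatefulRedisConnection<ByteBuffer, ByteBuffer> binConnection =
                redisClient.connect(new ByteBufferCodec())) {

    RedisAsyncCommands<String, String> asyncCommands = connection.async();
    RedisAsyncCommands<ByteBuffer, ByteBuffer> binAsyncCommands = binConnection.async();

    // Index creation, data loading, and queries go here.
}

redisClient.shutdown();
```
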
Next, create the index.
The schema in the example below includes three fields:

- The text content to index
- A [tag]({{< relref "/develop/ai/search-and-query/advanced-concepts/tags" >}})
  field to represent the "genre" of the text
- The embedding vector generated from the original text content

The `embedding` field specifies
[HNSW]({{< relref "/develop/ai/search-and-query/vectors#hnsw-index" >}})
indexing, the
[L2]({{< relref "/develop/ai/search-and-query/vectors#distance-metrics" >}})
vector distance metric, `Float32` values to represent the vector's components,
and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.

The `CreateArgs` object specifies hash objects for storage and a
prefix `doc:` that identifies the hash objects to index.

{{< clients-example set="home_query_vec" step="create_index" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

## Add data

You can now supply the data objects, which will be indexed automatically
when you add them with [`hset()`]({{< relref "/commands/hset" >}}), as long as
you use the `doc:` prefix specified in the index definition.

Use the `predict()` method of the `Predictor` object
as shown below to create the embedding that represents the `content` field
and use the `floatArrayToByteBuffer()` helper method to convert it to a binary string.
Use the binary string representation when you are
indexing hash objects, but use an array of `float` for
JSON objects (see [Differences with JSON documents](#differences-with-json-documents)
below).

You must use instances of `Map<ByteBuffer, ByteBuffer>` to supply the data to `hset()`
when using the `ByteBufferCodec` connection, which adds a little complexity. Note
that the `predict()` call is in a `try`/`catch` block because it will throw
exceptions if it can't download the embedding model (in production code, you
should handle these exceptions appropriately).

{{< clients-example set="home_query_vec" step="add_data" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

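To give a feel for the shape of this step, here is a hedged sketch for a single document,
reusing the `predictor`, `binAsyncCommands`, and `floatArrayToByteBuffer()` names from the
sketches above. The `content` text and the `doc:1` key come from the sample output below;
the `genre` value is illustrative, and `Map`, `HashMap`, and `StandardCharsets` come from
`java.util` and `java.nio.charset`:

```java
try {
    String content = "That is a very happy person";

    // Generate the embedding and convert it to a binary string.
    float[] embedding = predictor.predict(content);
    ByteBuffer embeddingBytes = floatArrayToByteBuffer(embedding);

    // With the ByteBufferCodec connection, both keys and values are ByteBuffers.
    Map<ByteBuffer, ByteBuffer> doc1 = new HashMap<>();
    doc1.put(StandardCharsets.UTF_8.encode("content"), StandardCharsets.UTF_8.encode(content));
    doc1.put(StandardCharsets.UTF_8.encode("genre"), StandardCharsets.UTF_8.encode("persons"));
    doc1.put(StandardCharsets.UTF_8.encode("embedding"), embeddingBytes);

    // The doc: prefix matches the prefix given in the index definition,
    // so the new hash is indexed automatically.
    binAsyncCommands.hset(StandardCharsets.UTF_8.encode("doc:1"), doc1).get();
} catch (Exception e) {
    // In production code, handle model download and Redis errors explicitly.
    e.printStackTrace();
}
```
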
## Run a query

After you have created the index and added the data, you are ready to run a query.
To do this, you must create another embedding vector from your chosen query
text. Redis calculates the vector distance between the query vector and each
embedding vector in the index as it runs the query. You can request that the
results be sorted in ascending order of distance.

The code below creates the query embedding using the `predict()` method, as with
the indexing, and passes it as a parameter when the query executes (see
[Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
for more information about using query parameters with embeddings).
The query is a
[K nearest neighbors (KNN)]({{< relref "/develop/ai/search-and-query/vectors#knn-vector-search" >}})
search that sorts the results in order of vector distance from the query vector.

{{< clients-example set="home_query_vec" step="query" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

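Whatever form the search call itself takes, the query vector parameter is built the same
way as the stored embeddings. A brief sketch, using the query text discussed below and the
helper from earlier (wrap the `predict()` call in a `try`/`catch` block, as when adding
data):

```java
// Embed the query text and convert it to the binary form expected
// by the query's vector parameter.
float[] queryEmbedding = predictor.predict("That is a happy person");
ByteBuffer queryVector = floatArrayToByteBuffer(queryEmbedding);
// Pass queryVector as the value of the vector parameter when you run the search.
```
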
Assuming you have added the code from the steps above to your source file,
it is now ready to run, but note that it may take a while to complete the
first time you run it (because the `all-MiniLM-L6-v2` model data must be
downloaded before the embeddings can be generated). When you run the code,
it outputs the following result text:

```
Results:
ID: doc:1, Content: That is a very happy person, Distance: 0.114169836044
ID: doc:2, Content: That is a happy dog, Distance: 0.610845506191
ID: doc:3, Content: Today is a sunny day, Distance: 1.48624765873
```

Note that the results are ordered according to the value of the `distance`
field, with the lowest distance indicating the greatest similarity to the query.
As you would expect, the result for `doc:1` with the content text
*"That is a very happy person"*
is the result that is most similar in meaning to the query text
*"That is a happy person"*.

## Differences with JSON documents

Indexing JSON documents is similar to hash indexing, but there are some
important differences. JSON allows much richer data modeling with nested fields, so
you must supply a [path]({{< relref "/develop/data-types/json/path" >}}) in the schema
to identify each field you want to index. However, you can declare a short alias for each
of these paths (using the `as()` option) to avoid typing it in full for
every query. Also, you must specify `CreateArgs.TargetType.JSON` when you create the index.

The code below shows these differences, but the index is otherwise very similar to
the one created previously for hashes:

{{< clients-example set="home_query_vec" step="json_schema" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

An important difference with JSON indexing is that the vectors are
specified using arrays of `float` instead of binary strings. This means
you don't need to use the `ByteBufferCodec` connection, and you can use
[`Arrays.toString()`](https://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html#toString-float:A-)
to convert the `float` array to a suitable JSON string.

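For example, a JSON document body could be assembled along these lines (a hedged sketch:
the `content` text and the `jdoc:1` key come from the sample output below, the `genre`
value is illustrative, and the `predict()` call again needs a `try`/`catch` block):

```java
float[] embedding = predictor.predict("That is a very happy person");

// Arrays.toString() produces "[0.12, 0.34, ...]", which is also a valid
// JSON array, so it can be embedded directly in the document string.
String jsonDoc = "{"
        + "\"content\": \"That is a very happy person\","
        + "\"genre\": \"persons\","
        + "\"embedding\": " + Arrays.toString(embedding)
        + "}";

// Pass jsonDoc to jsonSet() with the key jdoc:1.
```
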
Use [`jsonSet()`]({{< relref "/commands/json.set" >}}) to add the data
instead of [`hset()`]({{< relref "/commands/hset" >}}). Use instances
of `JSONObject` to supply the data instead of the `Map` instances you
would use for hash objects.

{{< clients-example set="home_query_vec" step="json_data" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

The query is almost identical to the one for the hash documents. This
demonstrates how the right choice of aliases for the JSON paths can
save you from having to write complex queries. An important point to notice
is that the vector parameter for the query is still specified as a
binary string, even though the data for the `embedding` field of the JSON
documents was supplied as an array.

{{< clients-example set="home_query_vec" step="json_query" lang_filter="Java-Async,Java-Reactive" >}}
{{< /clients-example >}}

The distance values are not identical to those from the hash query because
the vectors here are stored as strings, with a different precision from the
binary representation used for hashes. However, the relative order of the
results is the same:

```
Results:
ID: jdoc:1, Content: That is a very happy person, Distance: 0.628328084946
ID: jdoc:2, Content: That is a happy dog, Distance: 0.895147025585
ID: jdoc:3, Content: Today is a sunny day, Distance: 1.49569523335
```

## Learn more

See
[Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
for more information about the indexing options, distance metrics, and query format
for vectors.
