model to generate the embeddings. This model generates vectors with 384
dimensions, regardless of the length of the input text, but note that the
input is truncated to 256 tokens.
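For example, you can load the model and check the size of the vectors it
produces with code along these lines (a minimal sketch that assumes the model
is loaded through the Deep Java Library (DJL) model zoo; the model URL and
translator factory shown here are typical for DJL's Hugging Face support but
may differ from the setup used in the rest of this page):

```java
import ai.djl.huggingface.translator.TextEmbeddingTranslatorFactory;
import ai.djl.inference.Predictor;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class EmbeddingSketch {
    public static void main(String[] args) throws Exception {
        // Describe the model to load: text in, float[] embedding out.
        Criteria<String, float[]> criteria = Criteria.builder()
                .setTypes(String.class, float[].class)
                .optModelUrls(
                        "djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2")
                .optEngine("PyTorch")
                .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                .build();

        try (ZooModel<String, float[]> model = criteria.loadModel();
                Predictor<String, float[]> predictor = model.newPredictor()) {
            float[] embedding = predictor.predict("That is a happy person");
            System.out.println(embedding.length); // 384, whatever the input length
        }
    }
}
```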
Commands that involve binary strings require a connection that uses the
`ByteBufferCodec` class (see [Codecs](https://redis.github.io/lettuce/integration-extension/#codecs)
in the Lettuce documentation for more information). However, you will probably find
it more convenient to use the default `StringCodec` for commands that don't require
binary strings. It is therefore helpful to have two connections available, one using
`ByteBufferCodec` and one using `StringCodec`.

The code below shows how to declare one connection with the
`ByteBufferCodec` and another without in the try-with-resources
block. You also need two separate instances of `RedisAsyncCommands` to
use the two connections:
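A sketch of these declarations is shown below (it assumes a Redis server
running locally on the default port; the connection URI and surrounding code
in the full example may differ):

```java
import java.nio.ByteBuffer;

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;
import io.lettuce.core.codec.ByteBufferCodec;

public class ConnectionSketch {
    public static void main(String[] args) {
        RedisClient redisClient = RedisClient.create("redis://localhost:6379");

        try (
            // Default StringCodec connection for commands that use ordinary strings.
            StatefulRedisConnection<String, String> connection = redisClient.connect();
            // ByteBufferCodec connection for commands that need binary strings.
            StatefulRedisConnection<ByteBuffer, ByteBuffer> binConnection =
                    redisClient.connect(new ByteBufferCodec())
        ) {
            RedisAsyncCommands<String, String> asyncCommands = connection.async();
            RedisAsyncCommands<ByteBuffer, ByteBuffer> binAsyncCommands =
                    binConnection.async();

            // ... issue commands using asyncCommands and binAsyncCommands ...
        } finally {
            redisClient.shutdown();
        }
    }
}
```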
The vector field in the index schema specifies a
vector distance metric, `Float32` values to represent the vector's components,
and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.

The `CreateArgs` object specifies hash objects for storage and a
prefix `doc:` that identifies the hash objects to index.
Use the `predict()` method of the `Predictor` object
as shown below to create the embedding that represents the `content` field
and use the `floatArrayToByteBuffer()` helper method to convert it to a binary string.
Use the binary string representation when you are
indexing hash objects, but use an array of `float` for
JSON objects (see [Differences with JSON objects](#differences-with-json-documents)
below).

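One plausible implementation of a helper like `floatArrayToByteBuffer()` is
sketched below (an assumption, not necessarily this page's own code): Redis
expects the vector as the raw bytes of its `Float32` components, in
little-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Convert a float array to the binary string format used for
// FLOAT32 vector fields (raw little-endian bytes of each component).
public static ByteBuffer floatArrayToByteBuffer(float[] floats) {
    ByteBuffer buffer = ByteBuffer
            .allocate(Float.BYTES * floats.length)
            .order(ByteOrder.LITTLE_ENDIAN);

    for (float f : floats) {
        buffer.putFloat(f);
    }

    buffer.flip(); // make the written bytes readable by the codec
    return buffer;
}
```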
You must use instances of `Map<ByteBuffer, ByteBuffer>` to supply the data to `hset()`
when using the `ByteBufferCodec` connection, which adds a little complexity. Note
that the `predict()` call is in a `try`/`catch` block because it will throw
exceptions if it can't download the embedding model (you should add code to handle
the exceptions in production code).

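The sketch below shows these pieces together. The key name `doc:0001`, the
field names, and the sample text are placeholders, and `predictor`,
`binAsyncCommands`, and `floatArrayToByteBuffer()` are assumed to be defined
as discussed above:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import ai.djl.translate.TranslateException;

// ...

String content = "That is a very happy person";

Map<ByteBuffer, ByteBuffer> fields = new HashMap<>();
fields.put(StandardCharsets.UTF_8.encode("content"),
        StandardCharsets.UTF_8.encode(content));

try {
    // predict() throws if it can't download or run the embedding model.
    float[] embedding = predictor.predict(content);
    fields.put(StandardCharsets.UTF_8.encode("embedding"),
            floatArrayToByteBuffer(embedding));
} catch (TranslateException e) {
    // Handle the error properly in production code.
    throw new RuntimeException(e);
}

// Use the ByteBufferCodec connection to store the binary embedding.
binAsyncCommands.hset(StandardCharsets.UTF_8.encode("doc:0001"), fields);
```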
An important difference with JSON indexing is that the vectors are
specified using arrays of `float` instead of binary strings. This means
you don't need to use the `ByteBufferCodec` connection, and you can use
[`Arrays.toString()`](https://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html#toString-float:A-) to convert the `float` array to a suitable JSON string.
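For example (with hypothetical values):

```java
float[] vec = { 0.1f, 0.2f, 0.3f };
String json = Arrays.toString(vec); // "[0.1, 0.2, 0.3]", also a valid JSON array
```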
Use [`jsonSet()`]({{< relref "/commands/json.set" >}}) to add the data
instead of [`hset()`]({{< relref "/commands/hset" >}}). Use instances