@@ -56,16 +56,8 @@ pip install sentence-transformers
 
 In a new Python source file, start by importing the required classes:
 
-```python
-from sentence_transformers import SentenceTransformer
-from redis.commands.search.query import Query
-from redis.commands.search.field import TextField, TagField, VectorField
-from redis.commands.search.indexDefinition import IndexDefinition, IndexType
-from redis.commands.json.path import Path
-
-import numpy as np
-import redis
-```
+{{< clients-example set="home_query_vec" step="import" >}}
+{{< /clients-example >}}
 
 The first of these imports is the
 `SentenceTransformer` class, which generates an embedding from a section of text.
@@ -78,9 +70,8 @@ tokens (see
 at the [Hugging Face](https://huggingface.co/) docs to learn more about the way tokens
 are related to the original text).
 
-```python
-model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
-```
+{{< clients-example set="home_query_vec" step="model" >}}
+{{< /clients-example >}}
 
 ## Create the index
@@ -89,14 +80,8 @@ name `vector_idx`. (The `dropindex()` call throws an exception if
 the index doesn't already exist, which is why you need the
 `try: except:` block.)
 
-```python
-r = redis.Redis(decode_responses=True)
-
-try:
-    r.ft("vector_idx").dropindex(True)
-except redis.exceptions.ResponseError:
-    pass
-```
+{{< clients-example set="home_query_vec" step="connect" >}}
+{{< /clients-example >}}
 
 Next, create the index.
 The schema in the example below specifies hash objects for storage and includes
@@ -110,24 +95,8 @@ indexing, the
 vector distance metric, `Float32` values to represent the vector's components,
 and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.
 
-```python
-schema = (
-    TextField("content"),
-    TagField("genre"),
-    VectorField("embedding", "HNSW", {
-        "TYPE": "FLOAT32",
-        "DIM": 384,
-        "DISTANCE_METRIC": "L2"
-    })
-)
-
-r.ft("vector_idx").create_index(
-    schema,
-    definition=IndexDefinition(
-        prefix=["doc:"], index_type=IndexType.HASH
-    )
-)
-```
+{{< clients-example set="home_query_vec" step="create_index" >}}
+{{< /clients-example >}}
 
 ## Add data
@@ -144,31 +113,8 @@ Use the binary string representation when you are indexing hashes
 or running a query (but use a list of `float` for
 [JSON documents](#differences-with-json-documents)).
 
-```python
-content = "That is a very happy person"
-
-r.hset("doc:0", mapping={
-    "content": content,
-    "genre": "persons",
-    "embedding": model.encode(content).astype(np.float32).tobytes(),
-})
-
-content = "That is a happy dog"
-
-r.hset("doc:1", mapping={
-    "content": content,
-    "genre": "pets",
-    "embedding": model.encode(content).astype(np.float32).tobytes(),
-})
-
-content = "Today is a sunny day"
-
-r.hset("doc:2", mapping={
-    "content": content,
-    "genre": "weather",
-    "embedding": model.encode(content).astype(np.float32).tobytes(),
-})
-```
+{{< clients-example set="home_query_vec" step="add_data" >}}
+{{< /clients-example >}}
 
 ## Run a query
@@ -184,21 +130,8 @@ the indexing, and passes it as a parameter when the query executes
 [Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
 for more information about using query parameters with embeddings).
 
-```python
-q = Query(
-    "*=>[KNN 3 @embedding $vec AS vector_distance]"
-).return_field("score").dialect(2)
-
-query_text = "That is a happy person"
-
-res = r.ft("vector_idx").search(
-    q, query_params={
-        "vec": model.encode(query_text).astype(np.float32).tobytes()
-    }
-)
-
-print(res)
-```
+{{< clients-example set="home_query_vec" step="query" >}}
+{{< /clients-example >}}
 
 The code is now ready to run, but note that it may take a while to complete when
 you run it for the first time (which happens because RedisVL must download the
@@ -250,27 +183,8 @@ every query. Also, you must specify `IndexType.JSON` when you create the index.
 The code below shows these differences, but the index is otherwise very similar to
 the one created previously for hashes:
 
-```py
-schema = (
-    TextField("$.content", as_name="content"),
-    TagField("$.genre", as_name="genre"),
-    VectorField(
-        "$.embedding", "HNSW", {
-            "TYPE": "FLOAT32",
-            "DIM": 384,
-            "DISTANCE_METRIC": "L2"
-        },
-        as_name="embedding"
-    )
-)
-
-r.ft("vector_json_idx").create_index(
-    schema,
-    definition=IndexDefinition(
-        prefix=["jdoc:"], index_type=IndexType.JSON
-    )
-)
-```
+{{< clients-example set="home_query_vec" step="json_index" >}}
+{{< /clients-example >}}
 
 Use [`json().set()`]({{< relref "/commands/json.set" >}}) to add the data
 instead of [`hset()`]({{< relref "/commands/hset" >}}). The dictionaries
@@ -283,31 +197,8 @@ specified using lists instead of binary strings. Generate the list
 using the `tolist()` method instead of `tobytes()` as you would with a
 hash.
 
-```py
-content = "That is a very happy person"
-
-r.json().set("jdoc:0", Path.root_path(), {
-    "content": content,
-    "genre": "persons",
-    "embedding": model.encode(content).astype(np.float32).tolist(),
-})
-
-content = "That is a happy dog"
-
-r.json().set("jdoc:1", Path.root_path(), {
-    "content": content,
-    "genre": "pets",
-    "embedding": model.encode(content).astype(np.float32).tolist(),
-})
-
-content = "Today is a sunny day"
-
-r.json().set("jdoc:2", Path.root_path(), {
-    "content": content,
-    "genre": "weather",
-    "embedding": model.encode(content).astype(np.float32).tolist(),
-})
-```
+{{< clients-example set="home_query_vec" step="json_data" >}}
+{{< /clients-example >}}
 
 The query is almost identical to the one for the hash documents. This
 demonstrates how the right choice of aliases for the JSON paths can
@@ -316,19 +207,8 @@ is that the vector parameter for the query is still specified as a
 binary string (using the `tobytes()` method), even though the data for
 the `embedding` field of the JSON was specified as a list.
 
-```py
-q = Query(
-    "*=>[KNN 3 @embedding $vec AS vector_distance]"
-).return_field("vector_distance").return_field("content").dialect(2)
-
-query_text = "That is a happy person"
-
-res = r.ft("vector_json_idx").search(
-    q, query_params={
-        "vec": model.encode(query_text).astype(np.float32).tobytes()
-    }
-)
-```
+{{< clients-example set="home_query_vec" step="json_query" >}}
+{{< /clients-example >}}
 
 Apart from the `jdoc:` prefixes for the keys, the result from the JSON
 query is the same as for hash:
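The hash and JSON paths in the diff above differ mainly in how the embedding is serialized: hashes store a binary string of `float32` values (`tobytes()`), while JSON documents store a plain list of floats (`tolist()`). As a minimal standalone sketch of that difference, using only the standard library's `struct` module (a toy 4-dimensional vector stands in for the model's 384-dimensional output):

```python
import struct

# Toy 4-dimensional embedding; all-MiniLM-L6-v2 actually produces 384 dims.
embedding = [0.1, -0.25, 0.5, 0.75]

# Hash documents: serialize as a binary string of little-endian float32
# values, equivalent to np.array(embedding, dtype=np.float32).tobytes().
hash_value = struct.pack(f"<{len(embedding)}f", *embedding)
print(len(hash_value))  # 4 components x 4 bytes per float32 = 16

# JSON documents: keep a plain Python list, which serializes as a JSON
# array (this is what tolist() returns from a NumPy array).
json_value = list(embedding)

# Round-trip check: unpacking the bytes recovers the same components.
recovered = struct.unpack(f"<{len(embedding)}f", hash_value)
print([round(x, 2) for x in recovered])  # [0.1, -0.25, 0.5, 0.75]
```

Note that, as the text above points out, query parameters always use the binary form, regardless of which storage type the index uses.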