Commit bc65551

Merge pull request #1936 from redis/DOC-5537-localise-vec-query-examples
DOC-5537 localise vector examples
2 parents af39b1d + 7e90166 commit bc65551

10 files changed, +1086 -895 lines changed

content/develop/clients/go/vecsearch.md

Lines changed: 20 additions & 231 deletions
@@ -66,19 +66,8 @@ go get github.com/henomis/lingoose/embedder/huggingface
 
 Add the following imports to your module's main program file:
 
-```go
-package main
-
-import (
-	"context"
-	"encoding/binary"
-	"fmt"
-	"math"
-
-	huggingfaceembedder "github.com/henomis/lingoose/embedder/huggingface"
-	"github.com/redis/go-redis/v9"
-)
-```
+{{< clients-example set="home_query_vec" step="import" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 You must also create a [HuggingFace account](https://huggingface.co/join)
 and add a new access token to use the embedding model. See the
@@ -96,18 +85,8 @@ must convert this array to a `byte` string before adding it as a hash field.
 The function shown below uses Go's [`binary`](https://pkg.go.dev/encoding/binary)
 package to produce the `byte` string:
 
-```go
-func floatsToBytes(fs []float32) []byte {
-	buf := make([]byte, len(fs)*4)
-
-	for i, f := range fs {
-		u := math.Float32bits(f)
-		binary.NativeEndian.PutUint32(buf[i*4:], u)
-	}
-
-	return buf
-}
-```
+{{< clients-example set="home_query_vec" step="helper" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 Note that if you are using [JSON]({{< relref "/develop/data-types/json" >}})
 objects to store your documents instead of hashes, then you should store
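Reviewer sketch (not part of the commit): the `floatsToBytes()` helper removed in the hunk above packs each `float32` into four bytes. The standalone program below illustrates the same idea with a round trip back to floats. It is a minimal sketch under two assumptions: it uses `binary.LittleEndian` rather than the `binary.NativeEndian` of the removed snippet, so the layout is fixed regardless of host, and the names `floatsToBytesLE` and `bytesToFloatsLE` are hypothetical, chosen only for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// floatsToBytesLE (hypothetical name) packs each float32 as 4 little-endian bytes.
func floatsToBytesLE(fs []float32) []byte {
	buf := make([]byte, len(fs)*4)
	for i, f := range fs {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

// bytesToFloatsLE (hypothetical name) reverses floatsToBytesLE.
// Any trailing bytes that do not fill a whole float32 are ignored.
func bytesToFloatsLE(buf []byte) []float32 {
	fs := make([]float32, len(buf)/4)
	for i := range fs {
		fs[i] = math.Float32frombits(binary.LittleEndian.Uint32(buf[i*4:]))
	}
	return fs
}

func main() {
	vec := []float32{0.5, -1.25, 3}
	buf := floatsToBytesLE(vec)

	// Three float32 values occupy 12 bytes and survive the round trip unchanged.
	fmt.Println(len(buf), bytesToFloatsLE(buf)) // prints: 12 [0.5 -1.25 3]
}
```

A 384-dimension embedding encoded this way would occupy 1536 bytes. Whether a fixed little-endian layout or the host's native order is the right choice for a given deployment is something to verify against the server you are targeting.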
@@ -120,22 +99,8 @@ below).
 In the `main()` function, connect to Redis and delete any index previously
 created with the name `vector_idx`:
 
-```go
-ctx := context.Background()
-rdb := redis.NewClient(&redis.Options{
-	Addr: "localhost:6379",
-	Password: "", // no password docs
-	DB: 0, // use default DB
-	Protocol: 2,
-})
-
-rdb.FTDropIndexWithArgs(ctx,
-	"vector_idx",
-	&redis.FTDropIndexOptions{
-		DeleteDocs: true,
-	},
-)
-```
+{{< clients-example set="home_query_vec" step="connect" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 Next, create the index.
 The schema in the example below specifies hash objects for storage and includes
@@ -149,38 +114,8 @@ indexing, the
 vector distance metric, `Float32` values to represent the vector's components,
 and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.
 
-```go
-_, err := rdb.FTCreate(ctx,
-	"vector_idx",
-	&redis.FTCreateOptions{
-		OnHash: true,
-		Prefix: []any{"doc:"},
-	},
-	&redis.FieldSchema{
-		FieldName: "content",
-		FieldType: redis.SearchFieldTypeText,
-	},
-	&redis.FieldSchema{
-		FieldName: "genre",
-		FieldType: redis.SearchFieldTypeTag,
-	},
-	&redis.FieldSchema{
-		FieldName: "embedding",
-		FieldType: redis.SearchFieldTypeVector,
-		VectorArgs: &redis.FTVectorArgs{
-			HNSWOptions: &redis.FTHNSWOptions{
-				Dim: 384,
-				DistanceMetric: "L2",
-				Type: "FLOAT32",
-			},
-		},
-	},
-).Result()
-
-if err != nil {
-	panic(err)
-}
-```
+{{< clients-example set="home_query_vec" step="create_index" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 ## Create an embedder instance
 
@@ -190,11 +125,8 @@ instance that uses the `sentence-transformers/all-MiniLM-L6-v2`
 model, passing your HuggingFace access token to the `WithToken()`
 method.
 
-```go
-hf := huggingfaceembedder.New().
-	WithToken("<your-access-token>").
-	WithModel("sentence-transformers/all-MiniLM-L6-v2")
-```
+{{< clients-example set="home_query_vec" step="embedder" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 ## Add data
 
@@ -210,44 +142,8 @@ Use the `ToFloat32()` method of `Embedding` to produce the array of float
 values that we need, and use the `floatsToBytes()` function we defined
 above to convert this array to a `byte` string.
 
-```go
-sentences := []string{
-	"That is a very happy person",
-	"That is a happy dog",
-	"Today is a sunny day",
-}
-
-tags := []string{
-	"persons", "pets", "weather",
-}
-
-embeddings, err := hf.Embed(ctx, sentences)
-
-if err != nil {
-	panic(err)
-}
-
-for i, emb := range embeddings {
-	buffer := floatsToBytes(emb.ToFloat32())
-
-	if err != nil {
-		panic(err)
-	}
-
-	_, err = rdb.HSet(ctx,
-		fmt.Sprintf("doc:%v", i),
-		map[string]any{
-			"content": sentences[i],
-			"genre": tags[i],
-			"embedding": buffer,
-		},
-	).Result()
-
-	if err != nil {
-		panic(err)
-	}
-}
-```
+{{< clients-example set="home_query_vec" step="add_data" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 ## Run a query
 
@@ -263,47 +159,8 @@ the indexing, and passes it as a parameter when the query executes
 [Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
 for more information about using query parameters with embeddings).
 
-```go
-queryEmbedding, err := hf.Embed(ctx, []string{
-	"That is a happy person",
-})
-
-if err != nil {
-	panic(err)
-}
-
-buffer := floatsToBytes(queryEmbedding[0].ToFloat32())
-
-if err != nil {
-	panic(err)
-}
-
-results, err := rdb.FTSearchWithArgs(ctx,
-	"vector_idx",
-	"*=>[KNN 3 @embedding $vec AS vector_distance]",
-	&redis.FTSearchOptions{
-		Return: []redis.FTSearchReturn{
-			{FieldName: "vector_distance"},
-			{FieldName: "content"},
-		},
-		DialectVersion: 2,
-		Params: map[string]any{
-			"vec": buffer,
-		},
-	},
-).Result()
-
-if err != nil {
-	panic(err)
-}
-
-for _, doc := range results.Docs {
-	fmt.Printf(
-		"ID: %v, Distance:%v, Content:'%v'\n",
-		doc.ID, doc.Fields["vector_distance"], doc.Fields["content"],
-	)
-}
-```
+{{< clients-example set="home_query_vec" step="query" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 The code is now ready to run, but note that it may take a while to complete when
 you run it for the first time (which happens because `huggingfacetransformer`
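Reviewer sketch (not part of the commit): the query removed in the hunk above hard-codes the string `*=>[KNN 3 @embedding $vec AS vector_distance]`. The snippet below shows how the parts of that string fit together: the neighbour count, the indexed vector field, the name of the query parameter that carries the binary vector, and the alias under which the distance is returned. `knnQuery` is a hypothetical helper written for this illustration. It is not a go-redis function, and it does no validation or escaping of its arguments.

```go
package main

import "fmt"

// knnQuery (hypothetical helper, not part of go-redis) assembles a KNN query
// string of the shape used in the removed snippet:
//
//	*=>[KNN <k> @<field> $<param> AS <alias>]
//
// The caller is still responsible for supplying the vector itself under the
// same parameter name, for example through the Params map of the search options.
func knnQuery(k int, field, param, alias string) string {
	return fmt.Sprintf("*=>[KNN %d @%s $%s AS %s]", k, field, param, alias)
}

func main() {
	// Reproduces the literal query string from the removed example.
	fmt.Println(knnQuery(3, "embedding", "vec", "vector_distance"))
}
```

With these arguments the result matches the removed query string exactly, and the `vec` name is the key that the removed code used in `Params` to pass the encoded embedding.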
@@ -334,37 +191,8 @@ every query. Also, you must set `OnJSON` to `true` when you create the index.
 The code below shows these differences, but the index is otherwise very similar to
 the one created previously for hashes:
 
-```go
-_, err = rdb.FTCreate(ctx,
-	"vector_json_idx",
-	&redis.FTCreateOptions{
-		OnJSON: true,
-		Prefix: []any{"jdoc:"},
-	},
-	&redis.FieldSchema{
-		FieldName: "$.content",
-		As: "content",
-		FieldType: redis.SearchFieldTypeText,
-	},
-	&redis.FieldSchema{
-		FieldName: "$.genre",
-		As: "genre",
-		FieldType: redis.SearchFieldTypeTag,
-	},
-	&redis.FieldSchema{
-		FieldName: "$.embedding",
-		As: "embedding",
-		FieldType: redis.SearchFieldTypeVector,
-		VectorArgs: &redis.FTVectorArgs{
-			HNSWOptions: &redis.FTHNSWOptions{
-				Dim: 384,
-				DistanceMetric: "L2",
-				Type: "FLOAT32",
-			},
-		},
-	},
-).Result()
-```
+{{< clients-example set="home_query_vec" step="json_index" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 Use [`JSONSet()`]({{< relref "/commands/json.set" >}}) to add the data
 instead of [`HSet()`]({{< relref "/commands/hset" >}}). The maps
@@ -375,23 +203,8 @@ specified using lists instead of binary strings. The loop below is similar
 to the one used previously to add the hash data, but it doesn't use the
 `floatsToBytes()` function to encode the `float32` array.
 
-```go
-for i, emb := range embeddings {
-	_, err = rdb.JSONSet(ctx,
-		fmt.Sprintf("jdoc:%v", i),
-		"$",
-		map[string]any{
-			"content": sentences[i],
-			"genre": tags[i],
-			"embedding": emb.ToFloat32(),
-		},
-	).Result()
-
-	if err != nil {
-		panic(err)
-	}
-}
-```
+{{< clients-example set="home_query_vec" step="json_data" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 The query is almost identical to the one for the hash documents. This
 demonstrates how the right choice of aliases for the JSON paths can
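Reviewer sketch (not part of the commit): the JSON hunks in this file rely on one vector having two representations, a plain array inside the stored JSON document and a binary string for the query parameter. The program below shows both forms side by side using only the standard library. The function names `jsonDoc` and `queryBlob` are hypothetical, and the little-endian byte order is an assumption made for the sketch (the removed `floatsToBytes()` helper used `binary.NativeEndian`).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// jsonDoc (hypothetical name) renders a document whose embedding is a plain
// JSON array of numbers, the form used when storing JSON documents.
func jsonDoc(content string, emb []float32) (string, error) {
	b, err := json.Marshal(map[string]any{
		"content":   content,
		"embedding": emb,
	})
	return string(b), err
}

// queryBlob (hypothetical name) packs the same values as raw 4-byte floats,
// the form used for the query parameter. Little-endian order is assumed here.
func queryBlob(emb []float32) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, emb); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	emb := []float32{0.5, -1.25, 3}

	doc, err := jsonDoc("That is a happy person", emb)
	if err != nil {
		panic(err)
	}

	blob, err := queryBlob(emb)
	if err != nil {
		panic(err)
	}

	fmt.Println(doc)                // the array form, as JSON text
	fmt.Println(len(blob), "bytes") // the binary form: 4 bytes per component
}
```

The JSON text grows with the number of digits in each component, while the binary form is always four bytes per `float32`.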
@@ -400,32 +213,8 @@ is that the vector parameter for the query is still specified as a
 binary string (using the `floatsToBytes()` method), even though the data for
 the `embedding` field of the JSON was specified as an array.
 
-```go
-jsonQueryEmbedding, err := hf.Embed(ctx, []string{
-	"That is a happy person",
-})
-
-if err != nil {
-	panic(err)
-}
-
-jsonBuffer := floatsToBytes(jsonQueryEmbedding[0].ToFloat32())
-
-jsonResults, err := rdb.FTSearchWithArgs(ctx,
-	"vector_json_idx",
-	"*=>[KNN 3 @embedding $vec AS vector_distance]",
-	&redis.FTSearchOptions{
-		Return: []redis.FTSearchReturn{
-			{FieldName: "vector_distance"},
-			{FieldName: "content"},
-		},
-		DialectVersion: 2,
-		Params: map[string]any{
-			"vec": jsonBuffer,
-		},
-	},
-).Result()
-```
+{{< clients-example set="home_query_vec" step="json_query" lang_filter="Go" >}}
+{{< /clients-example >}}
 
 Apart from the `jdoc:` prefixes for the keys, the result from the JSON
 query is the same as for hash:
