Commit 9e1123e

DOC-5533 updated vector search example to use same library as vector sets example
1 parent b332cd8

2 files changed: +80 -56 lines

content/develop/clients/go/vecsearch.md

Lines changed: 22 additions & 39 deletions
@@ -29,9 +29,8 @@ or JSON fields, Redis can retrieve documents that closely match the query in terms
 of their meaning.
 
 In the example below, we use the
-[`huggingfaceembedder`](https://pkg.go.dev/github.com/henomis/[email protected]/embedder/huggingface)
-package from the [`LinGoose`](https://pkg.go.dev/github.com/henomis/[email protected])
-framework to generate vector embeddings to store and index with
+[`Hugot`](https://pkg.go.dev/github.com/knights-analytics/hugot)
+library to generate vector embeddings to store and index with
 Redis Query Engine. The code is first demonstrated for hash documents with a
 separate section to explain the
 [differences with JSON documents](#differences-with-json-documents).
@@ -47,38 +46,23 @@ for more information.
 
 ## Initialize
 
-Start a new Go module with the following command:
-
-```bash
-go mod init vecexample
-```
-
-Then, in your module folder, install
-[`go-redis`]({{< relref "/develop/clients/go" >}})
-and the
-[`huggingfaceembedder`](https://pkg.go.dev/github.com/henomis/[email protected]/embedder/huggingface)
-package:
+First, install [`go-redis`]({{< relref "/develop/clients/go" >}})
+if you haven't already done so. Then, install
+[`Hugot`](https://pkg.go.dev/github.com/knights-analytics/hugot)
+using the following command:
 
 ```bash
-go get github.com/redis/go-redis/v9
-go get github.com/henomis/lingoose/embedder/huggingface
+go get github.com/knights-analytics/hugot
 ```
 
 Add the following imports to your module's main program file:
 
 {{< clients-example set="home_query_vec" step="import" lang_filter="Go" >}}
 {{< /clients-example >}}
 
-You must also create a [HuggingFace account](https://huggingface.co/join)
-and add a new access token to use the embedding model. See the
-[HuggingFace](https://huggingface.co/docs/hub/en/security-tokens)
-docs to learn how to create and manage access tokens. Note that the
-account and the `all-MiniLM-L6-v2` model that we will use to produce
-the embeddings for this example are both available for free.
-
 ## Add a helper function
 
-The `huggingfaceembedder` model outputs the embeddings as a
+The `Hugot` model outputs the embeddings as a
 `[]float32` array. If you are storing your documents as
 [hash]({{< relref "/develop/data-types/hashes" >}}) objects, then you
 must convert this array to a `byte` string before adding it as a hash field.
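The `floatsToBytes()` helper that this section refers to is not changed by the commit, so it does not appear in the diff. As a minimal sketch, assuming little-endian byte order (the layout used by the published Redis vector examples), such a converter needs only the standard library:

```go
package main

import "math"

// floatsToBytes packs a []float32 into a byte slice, four little-endian
// bytes per component, which is the raw form stored in the hash field
// and read back by the FLOAT32 vector index.
func floatsToBytes(fs []float32) []byte {
	buf := make([]byte, len(fs)*4)
	for i, f := range fs {
		bits := math.Float32bits(f)
		buf[i*4+0] = byte(bits)
		buf[i*4+1] = byte(bits >> 8)
		buf[i*4+2] = byte(bits >> 16)
		buf[i*4+3] = byte(bits >> 24)
	}
	return buf
}
```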
@@ -119,11 +103,10 @@ and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.
 
 ## Create an embedder instance
 
-You need an instance of the `huggingfaceembedder` class to
+You need an instance of the `FeatureExtractionPipeline` class to
 generate the embeddings. Use the code below to create an
 instance that uses the `sentence-transformers/all-MiniLM-L6-v2`
-model, passing your HuggingFace access token to the `WithToken()`
-method.
+model:
 
 {{< clients-example set="home_query_vec" step="embedder" lang_filter="Go" >}}
 {{< /clients-example >}}
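The index definition referenced in the hunk header above (a FLAT index with L2 distance and 384 dimensions) is not part of this diff. For context, a minimal sketch of such a definition, assuming the go-redis v9 search API (`FTCreate`, `FieldSchema`, `FTVectorArgs`) and the `doc:` prefix and field names used elsewhere in the example; `rdb` and `ctx` are the client and context from the example program:

```go
// Sketch: hash index over keys with the "doc:" prefix, with a FLAT
// FLOAT32 vector field of 384 dimensions and L2 distance, matching
// the output of the all-MiniLM-L6-v2 model.
_, err = rdb.FTCreate(ctx,
	"vector_idx",
	&redis.FTCreateOptions{
		OnHash: true,
		Prefix: []interface{}{"doc:"},
	},
	&redis.FieldSchema{
		FieldName: "content",
		FieldType: redis.SearchFieldTypeText,
	},
	&redis.FieldSchema{
		FieldName: "genre",
		FieldType: redis.SearchFieldTypeTag,
	},
	&redis.FieldSchema{
		FieldName: "embedding",
		FieldType: redis.SearchFieldTypeVector,
		VectorArgs: &redis.FTVectorArgs{
			FlatOptions: &redis.FTFlatOptions{
				Type:           "FLOAT32",
				Dim:            384,
				DistanceMetric: "L2",
			},
		},
	},
).Result()
if err != nil {
	panic(err)
}
```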
@@ -134,12 +117,12 @@ You can now supply the data objects, which will be indexed automatically
 when you add them with [`HSet()`]({{< relref "/commands/hset" >}}), as long as
 you use the `doc:` prefix specified in the index definition.
 
-Use the `Embed()` method of `huggingfacetransformer`
+Use the `RunPipeline()` method of `FeatureExtractionPipeline`
 as shown below to create the embeddings that represent the `content` fields.
 This method takes an array of strings and outputs a corresponding
-array of `Embedding` objects.
-Use the `ToFloat32()` method of `Embedding` to produce the array of float
-values that we need, and use the `floatsToBytes()` function we defined
+set of embeddings in a `FeatureExtractionOutput` object.
+The `Embeddings` field of `FeatureExtractionOutput` contains the array of float
+values that you need for the index. Use the `floatsToBytes()` function defined
 above to convert this array to a `byte` string.
 
 {{< clients-example set="home_query_vec" step="add_data" lang_filter="Go" >}}
@@ -153,7 +136,7 @@ text. Redis calculates the similarity between the query vector and each
 embedding vector in the index as it runs the query. It then ranks the
 results in order of this numeric similarity value.
 
-The code below creates the query embedding using `Embed()`, as with
+The code below creates the query embedding using `RunPipeline()`, as with
 the indexing, and passes it as a parameter when the query executes
 (see
 [Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
@@ -163,14 +146,14 @@ for more information about using query parameters with embeddings).
 {{< /clients-example >}}
 
 The code is now ready to run, but note that it may take a while to complete when
-you run it for the first time (which happens because `huggingfacetransformer`
+you run it for the first time (which happens because `Hugot`
 must download the `all-MiniLM-L6-v2` model data before it can
 generate the embeddings). When you run the code, it outputs the following text:
 
 ```
-ID: doc:0, Distance:0.114169843495, Content:'That is a very happy person'
-ID: doc:1, Distance:0.610845327377, Content:'That is a happy dog'
-ID: doc:2, Distance:1.48624765873, Content:'Today is a sunny day'
+ID: doc:0, Distance:2.96992516518, Content:'That is a very happy person'
+ID: doc:1, Distance:17.3678302765, Content:'That is a happy dog'
+ID: doc:2, Distance:43.7771987915, Content:'Today is a sunny day'
 ```
 
 The results are ordered according to the value of the `vector_distance`
@@ -220,9 +203,9 @@ Apart from the `jdoc:` prefixes for the keys, the result from the JSON
 query is the same as for hash:
 
 ```
-ID: jdoc:0, Distance:0.114169843495, Content:'That is a very happy person'
-ID: jdoc:1, Distance:0.610845327377, Content:'That is a happy dog'
-ID: jdoc:2, Distance:1.48624765873, Content:'Today is a sunny day'
+ID: jdoc:0, Distance:2.96992516518, Content:'That is a very happy person'
+ID: jdoc:1, Distance:17.3678302765, Content:'That is a happy dog'
+ID: jdoc:2, Distance:43.7771987915, Content:'Today is a sunny day'
 ```
 
 ## Learn more

local_examples/client-specific/home_query_vec.go

Lines changed: 58 additions & 17 deletions
@@ -8,7 +8,7 @@ import (
 	"fmt"
 	"math"
 
-	huggingfaceembedder "github.com/henomis/lingoose/embedder/huggingface"
+	"github.com/knights-analytics/hugot"
 	"github.com/redis/go-redis/v9"
 )
 
@@ -79,9 +79,37 @@ func main() {
 	// STEP_END
 
 	// STEP_START embedder
-	hf := huggingfaceembedder.New().
-		WithToken("<your-access-token>").
-		WithModel("sentence-transformers/all-MiniLM-L6-v2")
+	// Create a Hugot session
+	session, err := hugot.NewGoSession()
+	if err != nil {
+		panic(err)
+	}
+	defer func() {
+		err := session.Destroy()
+		if err != nil {
+			panic(err)
+		}
+	}()
+
+	// Download the model
+	downloadOptions := hugot.NewDownloadOptions()
+	downloadOptions.OnnxFilePath = "onnx/model.onnx" // Specify which ONNX file to use
+	modelPath, err := hugot.DownloadModel("sentence-transformers/all-MiniLM-L6-v2", "./models/", downloadOptions)
+	if err != nil {
+		panic(err)
+	}
+
+	// Create feature extraction pipeline configuration
+	config := hugot.FeatureExtractionConfig{
+		ModelPath: modelPath,
+		Name:      "embeddingPipeline",
+	}
+
+	// Create the feature extraction pipeline
+	embeddingPipeline, err := hugot.NewPipeline(session, config)
+	if err != nil {
+		panic(err)
+	}
 	// STEP_END
 
 	// STEP_START add_data
@@ -95,16 +123,17 @@ func main() {
 		"persons", "pets", "weather",
 	}
 
-	embeddings, err := hf.Embed(ctx, sentences)
+	// Generate embeddings using Hugot
+	embeddingResult, err := embeddingPipeline.RunPipeline(sentences)
 	if err != nil {
 		panic(err)
 	}
 
+	// Extract the embeddings from the result
+	embeddings := embeddingResult.Embeddings
+
 	for i, emb := range embeddings {
-		buffer := floatsToBytes(emb.ToFloat32())
-		if err != nil {
-			panic(err)
-		}
+		buffer := floatsToBytes(emb)
 
 		_, err = rdb.HSet(ctx,
 			fmt.Sprintf("doc:%v", i),
@@ -114,25 +143,25 @@ func main() {
 				"embedding": buffer,
 			},
 		).Result()
+
 		if err != nil {
 			panic(err)
 		}
 	}
 	// STEP_END
 
 	// STEP_START query
-	queryEmbedding, err := hf.Embed(ctx, []string{
+	// Generate query embedding using Hugot
+	queryResult, err := embeddingPipeline.RunPipeline([]string{
 		"That is a happy person",
 	})
-	if err != nil {
-		panic(err)
-	}
 
-	buffer := floatsToBytes(queryEmbedding[0].ToFloat32())
 	if err != nil {
 		panic(err)
 	}
 
+	buffer := floatsToBytes(queryResult.Embeddings[0])
+
 	results, err := rdb.FTSearchWithArgs(ctx,
 		"vector_idx",
 		"*=>[KNN 3 @embedding $vec AS vector_distance]",
@@ -147,6 +176,7 @@ func main() {
 			},
 		},
 	).Result()
+
 	if err != nil {
 		panic(err)
 	}
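The loop that prints these results sits between this hunk and the next and is unchanged, so it is not shown in the diff. A minimal sketch of what it looks like, assuming go-redis v9's `FTSearchResult`/`Document` types and the `results` value returned by the query above:

```go
// Print each ranked document: key, distance computed by the KNN query,
// and the original text stored in the "content" field.
for _, doc := range results.Docs {
	fmt.Printf(
		"ID: %v, Distance:%v, Content:'%v'\n",
		doc.ID,
		doc.Fields["vector_distance"],
		doc.Fields["content"],
	)
}
```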
@@ -160,6 +190,13 @@ func main() {
 	// STEP_END
 
 	// STEP_START json_index
+	rdb.FTDropIndexWithArgs(ctx,
+		"vector_json_idx",
+		&redis.FTDropIndexOptions{
+			DeleteDocs: true,
+		},
+	)
+
 	_, err = rdb.FTCreate(ctx,
 		"vector_json_idx",
 		&redis.FTCreateOptions{
@@ -202,24 +239,27 @@ func main() {
 			map[string]any{
 				"content": sentences[i],
 				"genre":   tags[i],
-				"embedding": emb.ToFloat32(),
+				"embedding": emb,
 			},
 		).Result()
+
 		if err != nil {
 			panic(err)
 		}
 	}
 	// STEP_END
 
 	// STEP_START json_query
-	jsonQueryEmbedding, err := hf.Embed(ctx, []string{
+	// Generate query embedding for JSON search using Hugot
+	jsonQueryResult, err := embeddingPipeline.RunPipeline([]string{
 		"That is a happy person",
 	})
+
 	if err != nil {
 		panic(err)
 	}
 
-	jsonBuffer := floatsToBytes(jsonQueryEmbedding[0].ToFloat32())
+	jsonBuffer := floatsToBytes(jsonQueryResult.Embeddings[0])
 
 	jsonResults, err := rdb.FTSearchWithArgs(ctx,
 		"vector_json_idx",
@@ -235,6 +275,7 @@ func main() {
 			},
 		},
 	).Result()
+
 	if err != nil {
 		panic(err)
 	}
