@@ -66,19 +66,8 @@ go get github.com/henomis/lingoose/embedder/huggingface
6666
6767Add the following imports to your module's main program file:
6868
69- ```go
70- package main
71-
72- import (
73-     "context"
74-     "encoding/binary"
75-     "fmt"
76-     "math"
77-
78-     huggingfaceembedder "github.com/henomis/lingoose/embedder/huggingface"
79-     "github.com/redis/go-redis/v9"
80- )
81- ```
69+ {{< clients-example set="home_query_vec" step="import" lang_filter="Go" >}}
70+ {{< /clients-example >}}
8271
8372You must also create a [HuggingFace account](https://huggingface.co/join)
8473and add a new access token to use the embedding model. See the
@@ -96,18 +85,8 @@ must convert this array to a `byte` string before adding it as a hash field.
9685The function shown below uses Go's [`binary`](https://pkg.go.dev/encoding/binary)
9786package to produce the `byte` string:
9887
99- ```go
100- func floatsToBytes(fs []float32) []byte {
101-     buf := make([]byte, len(fs)*4)
102-
103-     for i, f := range fs {
104-         u := math.Float32bits(f)
105-         binary.NativeEndian.PutUint32(buf[i*4:], u)
106-     }
107-
108-     return buf
109- }
110- ```
88+ {{< clients-example set="home_query_vec" step="helper" lang_filter="Go" >}}
89+ {{< /clients-example >}}
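
As a quick sanity check of the encoding described above, the sketch below round-trips a small vector through a `floatsToBytes()`-style helper and a hypothetical inverse, `bytesToFloats()`, which is not part of the original example. It assumes little-endian byte order (`binary.LittleEndian`) rather than `binary.NativeEndian`; the two agree on common x86 and ARM hosts.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// floatsToBytes packs each float32 into 4 bytes.
// Assumption: little-endian order, used here in place of NativeEndian.
func floatsToBytes(fs []float32) []byte {
	buf := make([]byte, len(fs)*4)
	for i, f := range fs {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

// bytesToFloats is a hypothetical inverse helper, added only to
// check that the encoding is lossless. Trailing bytes that do not
// fill a whole float32 are ignored.
func bytesToFloats(buf []byte) []float32 {
	fs := make([]float32, len(buf)/4)
	for i := range fs {
		fs[i] = math.Float32frombits(binary.LittleEndian.Uint32(buf[i*4:]))
	}
	return fs
}

func main() {
	in := []float32{0.5, -1.25, 3}
	encoded := floatsToBytes(in)

	// Each float32 occupies exactly 4 bytes in the encoded string.
	fmt.Println(len(encoded), bytesToFloats(encoded))
}
```

A 384-dimensional embedding encoded this way occupies 384 × 4 = 1536 bytes in the hash field.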
11190
11291Note that if you are using [JSON]({{< relref "/develop/data-types/json" >}})
11392objects to store your documents instead of hashes, then you should store
@@ -120,22 +99,8 @@ below).
12099In the `main()` function, connect to Redis and delete any index previously
121100created with the name `vector_idx`:
122101
123- ```go
124- ctx := context.Background()
125- rdb := redis.NewClient(&redis.Options{
126-     Addr: "localhost:6379",
127-     Password: "", // no password docs
128-     DB: 0, // use default DB
129-     Protocol: 2,
130- })
131-
132- rdb.FTDropIndexWithArgs(ctx,
133-     "vector_idx",
134-     &redis.FTDropIndexOptions{
135-         DeleteDocs: true,
136-     },
137- )
138- ```
102+ {{< clients-example set="home_query_vec" step="connect" lang_filter="Go" >}}
103+ {{< /clients-example >}}
139104
140105Next, create the index.
141106The schema in the example below specifies hash objects for storage and includes
@@ -149,38 +114,8 @@ indexing, the
149114vector distance metric, `Float32` values to represent the vector's components,
150115and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.
151116
152- ```go
153- _, err := rdb.FTCreate(ctx,
154-     "vector_idx",
155-     &redis.FTCreateOptions{
156-         OnHash: true,
157-         Prefix: []any{"doc:"},
158-     },
159-     &redis.FieldSchema{
160-         FieldName: "content",
161-         FieldType: redis.SearchFieldTypeText,
162-     },
163-     &redis.FieldSchema{
164-         FieldName: "genre",
165-         FieldType: redis.SearchFieldTypeTag,
166-     },
167-     &redis.FieldSchema{
168-         FieldName: "embedding",
169-         FieldType: redis.SearchFieldTypeVector,
170-         VectorArgs: &redis.FTVectorArgs{
171-             HNSWOptions: &redis.FTHNSWOptions{
172-                 Dim: 384,
173-                 DistanceMetric: "L2",
174-                 Type: "FLOAT32",
175-             },
176-         },
177-     },
178- ).Result()
179-
180- if err != nil {
181-     panic(err)
182- }
183- ```
117+ {{< clients-example set="home_query_vec" step="create_index" lang_filter="Go" >}}
118+ {{< /clients-example >}}
184119
185120## Create an embedder instance
186121
@@ -190,11 +125,8 @@ instance that uses the `sentence-transformers/all-MiniLM-L6-v2`
190125model, passing your HuggingFace access token to the `WithToken()`
191126method.
192127
193- ```go
194- hf := huggingfaceembedder.New().
195-     WithToken("<your-access-token>").
196-     WithModel("sentence-transformers/all-MiniLM-L6-v2")
197- ```
128+ {{< clients-example set="home_query_vec" step="embedder" lang_filter="Go" >}}
129+ {{< /clients-example >}}
198130
199131## Add data
200132
@@ -210,44 +142,8 @@ Use the `ToFloat32()` method of `Embedding` to produce the array of float
210142values that we need, and use the `floatsToBytes()` function we defined
211143above to convert this array to a `byte` string.
212144
213- ```go
214- sentences := []string{
215-     "That is a very happy person",
216-     "That is a happy dog",
217-     "Today is a sunny day",
218- }
219-
220- tags := []string{
221-     "persons", "pets", "weather",
222- }
223-
224- embeddings, err := hf.Embed(ctx, sentences)
225-
226- if err != nil {
227-     panic(err)
228- }
229-
230- for i, emb := range embeddings {
231-     buffer := floatsToBytes(emb.ToFloat32())
232-
233-     if err != nil {
234-         panic(err)
235-     }
236-
237-     _, err = rdb.HSet(ctx,
238-         fmt.Sprintf("doc:%v", i),
239-         map[string]any{
240-             "content": sentences[i],
241-             "genre": tags[i],
242-             "embedding": buffer,
243-         },
244-     ).Result()
245-
246-     if err != nil {
247-         panic(err)
248-     }
249- }
250- ```
145+ {{< clients-example set="home_query_vec" step="add_data" lang_filter="Go" >}}
146+ {{< /clients-example >}}
251147
252148## Run a query
253149
@@ -263,47 +159,8 @@ the indexing, and passes it as a parameter when the query executes
263159[Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
264160for more information about using query parameters with embeddings).
265161
266- ```go
267- queryEmbedding, err := hf.Embed(ctx, []string{
268-     "That is a happy person",
269- })
270-
271- if err != nil {
272-     panic(err)
273- }
274-
275- buffer := floatsToBytes(queryEmbedding[0].ToFloat32())
276-
277- if err != nil {
278-     panic(err)
279- }
280-
281- results, err := rdb.FTSearchWithArgs(ctx,
282-     "vector_idx",
283-     "*=>[KNN 3 @embedding $vec AS vector_distance]",
284-     &redis.FTSearchOptions{
285-         Return: []redis.FTSearchReturn{
286-             {FieldName: "vector_distance"},
287-             {FieldName: "content"},
288-         },
289-         DialectVersion: 2,
290-         Params: map[string]any{
291-             "vec": buffer,
292-         },
293-     },
294- ).Result()
295-
296- if err != nil {
297-     panic(err)
298- }
299-
300- for _, doc := range results.Docs {
301-     fmt.Printf(
302-         "ID: %v, Distance:%v, Content:'%v'\n",
303-         doc.ID, doc.Fields["vector_distance"], doc.Fields["content"],
304-     )
305- }
306- ```
162+ {{< clients-example set="home_query_vec" step="query" lang_filter="Go" >}}
163+ {{< /clients-example >}}
307164
308165The code is now ready to run, but note that it may take a while to complete when
309166you run it for the first time (which happens because `huggingfacetransformer`
@@ -334,37 +191,8 @@ every query. Also, you must set `OnJSON` to `true` when you create the index.
334191The code below shows these differences, but the index is otherwise very similar to
335192the one created previously for hashes:
336193
337- ```go
338- _, err = rdb.FTCreate(ctx,
339-     "vector_json_idx",
340-     &redis.FTCreateOptions{
341-         OnJSON: true,
342-         Prefix: []any{"jdoc:"},
343-     },
344-     &redis.FieldSchema{
345-         FieldName: "$.content",
346-         As: "content",
347-         FieldType: redis.SearchFieldTypeText,
348-     },
349-     &redis.FieldSchema{
350-         FieldName: "$.genre",
351-         As: "genre",
352-         FieldType: redis.SearchFieldTypeTag,
353-     },
354-     &redis.FieldSchema{
355-         FieldName: "$.embedding",
356-         As: "embedding",
357-         FieldType: redis.SearchFieldTypeVector,
358-         VectorArgs: &redis.FTVectorArgs{
359-             HNSWOptions: &redis.FTHNSWOptions{
360-                 Dim: 384,
361-                 DistanceMetric: "L2",
362-                 Type: "FLOAT32",
363-             },
364-         },
365-     },
366- ).Result()
367- ```
194+ {{< clients-example set="home_query_vec" step="json_index" lang_filter="Go" >}}
195+ {{< /clients-example >}}
368196
369197Use [`JSONSet()`]({{< relref "/commands/json.set" >}}) to add the data
370198instead of [`HSet()`]({{< relref "/commands/hset" >}}). The maps
@@ -375,23 +203,8 @@ specified using lists instead of binary strings. The loop below is similar
375203to the one used previously to add the hash data, but it doesn't use the
376204`floatsToBytes()` function to encode the `float32` array.
377205
378- ```go
379- for i, emb := range embeddings {
380-     _, err = rdb.JSONSet(ctx,
381-         fmt.Sprintf("jdoc:%v", i),
382-         "$",
383-         map[string]any{
384-             "content": sentences[i],
385-             "genre": tags[i],
386-             "embedding": emb.ToFloat32(),
387-         },
388-     ).Result()
389-
390-     if err != nil {
391-         panic(err)
392-     }
393- }
394- ```
206+ {{< clients-example set="home_query_vec" step="json_data" lang_filter="Go" >}}
207+ {{< /clients-example >}}
395208
396209The query is almost identical to the one for the hash documents. This
397210demonstrates how the right choice of aliases for the JSON paths can
@@ -400,32 +213,8 @@ is that the vector parameter for the query is still specified as a
400213binary string (using the `floatsToBytes()` function), even though the data for
401214the `embedding` field of the JSON was specified as an array.
402215
403- ```go
404- jsonQueryEmbedding, err := hf.Embed(ctx, []string{
405-     "That is a happy person",
406- })
407-
408- if err != nil {
409-     panic(err)
410- }
411-
412- jsonBuffer := floatsToBytes(jsonQueryEmbedding[0].ToFloat32())
413-
414- jsonResults, err := rdb.FTSearchWithArgs(ctx,
415-     "vector_json_idx",
416-     "*=>[KNN 3 @embedding $vec AS vector_distance]",
417-     &redis.FTSearchOptions{
418-         Return: []redis.FTSearchReturn{
419-             {FieldName: "vector_distance"},
420-             {FieldName: "content"},
421-         },
422-         DialectVersion: 2,
423-         Params: map[string]any{
424-             "vec": jsonBuffer,
425-         },
426-     },
427- ).Result()
428- ```
216+ {{< clients-example set="home_query_vec" step="json_query" lang_filter="Go" >}}
217+ {{< /clients-example >}}
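
To make the contrast between the two representations concrete, the sketch below serializes one small vector both ways: as a JSON array (the form stored in the `embedding` field of a JSON document) and as a binary string (the form passed as the `vec` query parameter). The helper `asJSONArray()` is hypothetical and used only for illustration, and the byte order is assumed to be little-endian rather than `binary.NativeEndian`.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
	"math"
)

// floatsToBytes packs each float32 into 4 bytes.
// Assumption: little-endian order, used here in place of NativeEndian.
func floatsToBytes(fs []float32) []byte {
	buf := make([]byte, len(fs)*4)
	for i, f := range fs {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

// asJSONArray is a hypothetical helper that shows how a float32
// slice is rendered when it is marshaled as part of a JSON document.
func asJSONArray(fs []float32) (string, error) {
	b, err := json.Marshal(fs)
	return string(b), err
}

func main() {
	vec := []float32{0.5, -1.25, 3}

	arr, err := asJSONArray(vec)
	if err != nil {
		panic(err)
	}

	// Document side: a plain JSON array of numbers.
	fmt.Println("JSON field value:", arr)

	// Query side: a fixed-width binary string, 4 bytes per component.
	fmt.Printf("query parameter: %d bytes\n", len(floatsToBytes(vec)))
}
```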
429218
430219Apart from the `jdoc:` prefixes for the keys, the result from the JSON
431220query is the same as for hash: