@@ -29,9 +29,8 @@ or JSON fields, Redis can retrieve documents that closely match the query in ter
 of their meaning.

 In the example below, we use the
-[`huggingfaceembedder`](https://pkg.go.dev/github.com/henomis/[email protected]/embedder/huggingface)
-package from the [`LinGoose`](https://pkg.go.dev/github.com/henomis/[email protected])
-framework to generate vector embeddings to store and index with
+[`Hugot`](https://pkg.go.dev/github.com/knights-analytics/hugot)
+library to generate vector embeddings to store and index with
 Redis Query Engine. The code is first demonstrated for hash documents with a
 separate section to explain the
 [differences with JSON documents](#differences-with-json-documents).
@@ -47,38 +46,23 @@ for more information.

 ## Initialize

-Start a new Go module with the following command:
-
-```bash
-go mod init vecexample
-```
-
-Then, in your module folder, install
-[`go-redis`]({{< relref "/develop/clients/go" >}})
-and the
-[`huggingfaceembedder`](https://pkg.go.dev/github.com/henomis/[email protected]/embedder/huggingface)
-package:
+First, install [`go-redis`]({{< relref "/develop/clients/go" >}})
+if you haven't already done so. Then, install
+[`Hugot`](https://pkg.go.dev/github.com/knights-analytics/hugot)
+using the following command:

 ```bash
-go get github.com/redis/go-redis/v9
-go get github.com/henomis/lingoose/embedder/huggingface
+go get github.com/knights-analytics/hugot
 ```

 Add the following imports to your module's main program file:

 {{< clients-example set="home_query_vec" step="import" lang_filter="Go" >}}
 {{< /clients-example >}}

-You must also create a [HuggingFace account](https://huggingface.co/join)
-and add a new access token to use the embedding model. See the
-[HuggingFace](https://huggingface.co/docs/hub/en/security-tokens)
-docs to learn how to create and manage access tokens. Note that the
-account and the `all-MiniLM-L6-v2` model that we will use to produce
-the embeddings for this example are both available for free.
-
 ## Add a helper function

-The `huggingfaceembedder` model outputs the embeddings as a
+The `Hugot` model outputs the embeddings as a
 `[]float32` array. If you are storing your documents as
 [hash]({{< relref "/develop/data-types/hashes" >}}) objects, then you
 must convert this array to a `byte` string before adding it as a hash field.
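
The conversion described in the last context lines above could look roughly like the following. This is a minimal sketch, not necessarily the helper that the page's example set defines: the name `floatsToBytes` is taken from later hunks of this diff, but the body is an assumption, as is the choice of little-endian byte order for each 32-bit float.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// floatsToBytes packs each float32 into 4 consecutive bytes.
// Assumption: the vector field expects 32-bit floats in little-endian order.
func floatsToBytes(fs []float32) []byte {
	buf := make([]byte, len(fs)*4)
	for i, f := range fs {
		// Reinterpret the float's IEEE 754 bit pattern as a uint32
		// and write it at the i-th 4-byte slot.
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

func main() {
	b := floatsToBytes([]float32{1.0, -2.5})
	fmt.Println(len(b)) // 4 bytes per float32 value
}
```

With this layout, a 384-dimension embedding becomes a 1536-byte value, which can be stored directly as a hash field.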
@@ -119,11 +103,10 @@ and 384 dimensions, as required by the `all-MiniLM-L6-v2` embedding model.

 ## Create an embedder instance

-You need an instance of the `huggingfaceembedder` class to
+You need an instance of the `FeatureExtractionPipeline` class to
 generate the embeddings. Use the code below to create an
 instance that uses the `sentence-transformers/all-MiniLM-L6-v2`
-model, passing your HuggingFace access token to the `WithToken()`
-method.
+model:

 {{< clients-example set="home_query_vec" step="embedder" lang_filter="Go" >}}
 {{< /clients-example >}}
@@ -134,12 +117,12 @@ You can now supply the data objects, which will be indexed automatically
 when you add them with [`HSet()`]({{< relref "/commands/hset" >}}), as long as
 you use the `doc:` prefix specified in the index definition.

-Use the `Embed()` method of `huggingfacetransformer`
+Use the `RunPipeline()` method of `FeatureExtractionPipeline`
 as shown below to create the embeddings that represent the `content` fields.
 This method takes an array of strings and outputs a corresponding
-array of `Embedding` objects.
-Use the `ToFloat32()` method of `Embedding` to produce the array of float
-values that we need, and use the `floatsToBytes()` function we defined
+array of `FeatureExtractionOutput` objects.
+The `Embeddings` field of `FeatureExtractionOutput` contains the array of float
+values that you need for the index. Use the `floatsToBytes()` function defined
 above to convert this array to a `byte` string.

 {{< clients-example set="home_query_vec" step="add_data" lang_filter="Go" >}}
@@ -153,7 +136,7 @@ text. Redis calculates the similarity between the query vector and each
 embedding vector in the index as it runs the query. It then ranks the
 results in order of this numeric similarity value.

-The code below creates the query embedding using `Embed()`, as with
+The code below creates the query embedding using `RunPipeline()`, as with
 the indexing, and passes it as a parameter when the query executes
 (see
 [Vector search]({{< relref "/develop/ai/search-and-query/query/vector-search" >}})
@@ -163,14 +146,14 @@ for more information about using query parameters with embeddings).
 {{< /clients-example >}}

 The code is now ready to run, but note that it may take a while to complete when
-you run it for the first time (which happens because `huggingfacetransformer`
+you run it for the first time (which happens because `Hugot`
 must download the `all-MiniLM-L6-v2` model data before it can
 generate the embeddings). When you run the code, it outputs the following text:

 ```
-ID: doc:0, Distance:0.114169843495, Content:'That is a very happy person'
-ID: doc:1, Distance:0.610845327377, Content:'That is a happy dog'
-ID: doc:2, Distance:1.48624765873, Content:'Today is a sunny day'
+ID: doc:0, Distance:2.96992516518, Content:'That is a very happy person'
+ID: doc:1, Distance:17.3678302765, Content:'That is a happy dog'
+ID: doc:2, Distance:43.7771987915, Content:'Today is a sunny day'
 ```

 The results are ordered according to the value of the `vector_distance`
@@ -220,9 +203,9 @@ Apart from the `jdoc:` prefixes for the keys, the result from the JSON
 query is the same as for hash:

 ```
-ID: jdoc:0, Distance:0.114169843495, Content:'That is a very happy person'
-ID: jdoc:1, Distance:0.610845327377, Content:'That is a happy dog'
-ID: jdoc:2, Distance:1.48624765873, Content:'Today is a sunny day'
+ID: jdoc:0, Distance:2.96992516518, Content:'That is a very happy person'
+ID: jdoc:1, Distance:17.3678302765, Content:'That is a happy dog'
+ID: jdoc:2, Distance:43.7771987915, Content:'Today is a sunny day'
 ```

 ## Learn more