@@ -47,12 +47,8 @@ pip install sentence-transformers
4747
4848In a new Python file, import the required classes:
4949
50- ```python
51- from sentence_transformers import SentenceTransformer
52-
53- import redis
54- import numpy as np
55- ```
50+ {{< clients-example set="home_vecsets" step="import" >}}
51+ {{< /clients-example >}}
5652
5753The first of these imports is the
5854`SentenceTransformer` class, which generates an embedding from a section of text.
@@ -65,77 +61,16 @@ tokens (see
6561at the [Hugging Face](https://huggingface.co/) docs to learn more about the way tokens
6662are related to the original text).
6763
68- ```python
69- model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
70- ```
64+ {{< clients-example set="home_vecsets" step="model" >}}
65+ {{< /clients-example >}}
7166
7267## Create the data
7368
7469The example data is contained in a dictionary with some brief
7570descriptions of famous people:
7671
77- ```python
78- peopleData = {
79-     "Marie Curie": {
80-         "born": 1867, "died": 1934,
81-         "description": """
82-         Polish-French chemist and physicist. The only person ever to win
83-         two Nobel prizes for two different sciences.
84-         """
85-     },
86-     "Linus Pauling": {
87-         "born": 1901, "died": 1994,
88-         "description": """
89-         American chemist and peace activist. One of only two people to win two
90-         Nobel prizes in different fields (chemistry and peace).
91-         """
92-     },
93-     "Freddie Mercury": {
94-         "born": 1946, "died": 1991,
95-         "description": """
96-         British musician, best known as the lead singer of the rock band
97-         Queen.
98-         """
99-     },
100-     "Marie Fredriksson": {
101-         "born": 1958, "died": 2019,
102-         "description": """
103-         Swedish multi-instrumentalist, mainly known as the lead singer and
104-         keyboardist of the band Roxette.
105-         """
106-     },
107-     "Paul Erdos": {
108-         "born": 1913, "died": 1996,
109-         "description": """
110-         Hungarian mathematician, known for his eccentric personality almost
111-         as much as his contributions to many different fields of mathematics.
112-         """
113-     },
114-     "Maryam Mirzakhani": {
115-         "born": 1977, "died": 2017,
116-         "description": """
117-         Iranian mathematician. The first woman ever to win the Fields medal
118-         for her contributions to mathematics.
119-         """
120-     },
121-     "Masako Natsume": {
122-         "born": 1957, "died": 1985,
123-         "description": """
124-         Japanese actress. She was very famous in Japan but was primarily
125-         known elsewhere in the world for her portrayal of Tripitaka in the
126-         TV series Monkey.
127-         """
128-     },
129-     "Chaim Topol": {
130-         "born": 1935, "died": 2023,
131-         "description": """
132-         Israeli actor and singer, usually credited simply as 'Topol'. He was
133-         best known for his many appearances as Tevye in the musical Fiddler
134-         on the Roof.
135-         """
136-     }
137- }
138- ```
72+ {{< clients-example set="home_vecsets" step="data" >}}
73+ {{< /clients-example >}}
13974
14075## Add the data to a vector set
14176
@@ -164,22 +99,8 @@ The call to `vadd()` also adds the `born` and `died` values from the
16499original dictionary as attribute data. You can access this during a query
165100or by using the [`vgetattr()`]({{< relref "/commands/vgetattr" >}}) method.
166101
167- ```py
168- r = redis.Redis(decode_responses=True)
169-
170- for name, details in peopleData.items():
171-     emb = model.encode(details["description"]).astype(np.float32).tobytes()
172-
173-     r.vset().vadd(
174-         "famousPeople",
175-         emb,
176-         name,
177-         attributes={
178-             "born": details["born"],
179-             "died": details["died"]
180-         }
181-     )
182- ```
102+ {{< clients-example set="home_vecsets" step="add_data" >}}
103+ {{< /clients-example >}}
183104
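The `add_data` step above passes each embedding to `vadd()` as raw bytes produced by `.astype(np.float32).tobytes()`. As a minimal sketch of what that conversion does, the following uses only NumPy, with a made-up four-element vector standing in for a real embedding (the variable names are illustrative and not part of the original example):

```python
import numpy as np

# Hypothetical 4-dimensional vector standing in for a real embedding.
embedding = np.array([0.25, -0.5, 0.75, 1.0], dtype=np.float64)

# Convert to 32-bit floats and serialize to raw bytes (4 bytes per component).
blob = embedding.astype(np.float32).tobytes()

# Recover the vector from the raw bytes to confirm nothing was lost.
restored = np.frombuffer(blob, dtype=np.float32)

print(len(blob))          # 16
print(restored.tolist())  # [0.25, -0.5, 0.75, 1.0]
```

The byte length is always four times the number of dimensions, which can be a quick sanity check before sending a vector to the server.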
184105## Query the vector set
185106
@@ -191,16 +112,8 @@ of the set, ranked in order of similarity to the query.
191112
192113Start with a simple query for "actors":
193114
194- ```py
195- query_value = "actors"
196-
197- actors_results = r.vset().vsim(
198-     "famousPeople",
199-     model.encode(query_value).astype(np.float32).tobytes(),
200- )
201-
202- print(f"'actors': {actors_results}")
203- ```
115+ {{< clients-example set="home_vecsets" step="basic_query" >}}
116+ {{< /clients-example >}}
204117
205118This returns the following list of elements (formatted slightly for clarity):
206119
@@ -218,18 +131,8 @@ on the information contained in the embedding model.
218131You can use the `count` parameter of `vsim()` to limit the list of elements
219132to just the most relevant few items:
220133
221- ```py
222- query_value = "actors"
223-
224- two_actors_results = r.vset().vsim(
225-     "famousPeople",
226-     model.encode(query_value).astype(np.float32).tobytes(),
227-     count=2
228- )
229-
230- print(f"'actors (2)': {two_actors_results}")
231- # >>> 'actors (2)': ['Masako Natsume', 'Chaim Topol']
232- ```
134+ {{< clients-example set="home_vecsets" step="limited_query" >}}
135+ {{< /clients-example >}}
233136
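As a rough illustration of what ranking by similarity with a `count` limit amounts to, the sketch below scores a few made-up 2-D vectors by cosine similarity in plain NumPy and keeps the top results. The function name `top_k_by_cosine`, the toy vectors, and the choice of cosine similarity are assumptions for illustration only; this is not a description of how `vsim()` works internally.

```python
import numpy as np

def top_k_by_cosine(query, items, count=None):
    """Return item names sorted by descending cosine similarity to query."""
    q = np.asarray(query, dtype=np.float32)
    q = q / np.linalg.norm(q)
    scored = []
    for name, vec in items.items():
        v = np.asarray(vec, dtype=np.float32)
        # Cosine similarity is the dot product of the two unit vectors.
        scored.append((float(np.dot(q, v / np.linalg.norm(v))), name))
    scored.sort(reverse=True)
    names = [name for _, name in scored]
    # With no count, return the full ranked list.
    return names if count is None else names[:count]

# Toy data: "a" points along x, "b" along y, "c" along the diagonal.
toy = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

print(top_k_by_cosine([1.0, 0.1], toy, count=2))  # ['a', 'c']
```

A query pointing almost along the x axis ranks `a` first and the diagonal vector `c` second, and `count=2` drops the least similar item from the list.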
234137The reason for using text embeddings rather than simple text search
235138is that the embeddings represent semantic information. This allows a query
@@ -238,19 +141,8 @@ different. For example, the word "entertainer" doesn't appear in any of the
238141descriptions but if you use it as a query, the actors and musicians are ranked
239142highest in the results list:
240143
241- ```py
242- query_value = "entertainer"
243-
244- entertainer_results = r.vset().vsim(
245-     "famousPeople",
246-     model.encode(query_value).astype(np.float32).tobytes()
247- )
248-
249- print(f"'entertainer': {entertainer_results}")
250- # >>> 'entertainer': ['Chaim Topol', 'Freddie Mercury',
251- # >>> 'Marie Fredriksson', 'Masako Natsume', 'Linus Pauling',
252- # 'Paul Erdos', 'Maryam Mirzakhani', 'Marie Curie']
253- ```
144+ {{< clients-example set="home_vecsets" step="entertainer_query" >}}
145+ {{< /clients-example >}}
254146
255147Similarly, if you use "science" as a query, you get the following results:
256148
@@ -270,19 +162,8 @@ with `vsim()` to restrict the search further. For example,
270162repeat the "science" query, but this time limit the results to people
271163who died before the year 2000:
272164
273- ```py
274- query_value = "science"
275-
276- science2000_results = r.vset().vsim(
277-     "famousPeople",
278-     model.encode(query_value).astype(np.float32).tobytes(),
279-     filter=".died < 2000"
280- )
281-
282- print(f"'science2000': {science2000_results}")
283- # >>> 'science2000': ['Marie Curie', 'Linus Pauling',
284- # 'Paul Erdos', 'Freddie Mercury', 'Masako Natsume']
285- ```
165+ {{< clients-example set="home_vecsets" step="filtered_query" >}}
166+ {{< /clients-example >}}
286167
287168Note that the boolean filter expression is applied to items in the list
288169before the vector distance calculation is performed. Items that don't