Commit 4f0b50b

DOC-5501 Python prob examples
1 parent 4acf67f commit 4f0b50b

content/develop/clients/redis-py/prob.md

Lines changed: 173 additions & 12 deletions
@@ -99,16 +99,45 @@ add. The following example adds some names to a Bloom filter representing
 a list of users and checks for the presence or absence of users in the list.
 Note that you must use the `bf()` method to access the Bloom filter commands.
 
-{{< clients-example home_prob_dts bloom Python >}}
-{{< /clients-example >}}
+```py
+res1 = r.bf().madd("recorded_users", "andy", "cameron", "david", "michelle")
+print(res1)  # >>> [1, 1, 1, 1]
+
+res2 = r.bf().exists("recorded_users", "cameron")
+print(res2)  # >>> 1
+
+res3 = r.bf().exists("recorded_users", "kaitlyn")
+print(res3)  # >>> 0
+```
+<!--< clients-example home_prob_dts bloom Python >}}
+< /clients-example >}} -->
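
To make the "presence or absence" behaviour concrete without a Redis server, here is a minimal pure-Python sketch of a Bloom filter. The class `TinyBloom`, its bit-array size, and its SHA-256 based hashing are all hypothetical choices for illustration; they are not how Redis implements `BF.ADD` or `BF.EXISTS`.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter (hypothetical; not the Redis implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


names = ["andy", "cameron", "david", "michelle"]
users = TinyBloom()
for name in names:
    users.add(name)

known = [users.might_contain(name) for name in names]
print(known)  # >>> [True, True, True, True]
```

Items that were added always test positive. An item that was never added usually tests negative, but it can occasionally collide with bits set by other items, which is the false-positive trade-off the filter accepts in exchange for compact storage.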
 
 A Cuckoo filter has similar features to a Bloom filter, but also supports
 a deletion operation to remove hashes from a set, as shown in the example
 below. Note that you must use the `cf()` method to access the Cuckoo filter
 commands.
 
-{{< clients-example home_prob_dts cuckoo Python >}}
-{{< /clients-example >}}
+```py
+res4 = r.cf().add("other_users", "paolo")
+print(res4)  # >>> 1
+
+res5 = r.cf().add("other_users", "kaitlyn")
+print(res5)  # >>> 1
+
+res6 = r.cf().add("other_users", "rachel")
+print(res6)  # >>> 1
+
+res7 = r.cf().mexists("other_users", "paolo", "rachel", "andy")
+print(res7)  # >>> [1, 1, 0]
+
+res8 = r.cf().delete("other_users", "paolo")
+print(res8)  # >>> 1
+
+res9 = r.cf().exists("other_users", "paolo")
+print(res9)  # >>> 0
+```
+<!-- < clients-example home_prob_dts cuckoo Python >}}
+< /clients-example >}} -->
 
 Which of these two data types you choose depends on your use case.
 Bloom filters are generally faster than Cuckoo filters when adding new items,
@@ -128,8 +157,27 @@ You can also merge two or more HyperLogLogs to find the cardinality of the
 [union](https://en.wikipedia.org/wiki/Union_(set_theory)) of the sets they
 represent.
 
-{{< clients-example home_prob_dts hyperloglog Python >}}
-{{< /clients-example >}}
+```py
+res10 = r.pfadd("group:1", "andy", "cameron", "david")
+print(res10)  # >>> 1
+
+res11 = r.pfcount("group:1")
+print(res11)  # >>> 3
+
+res12 = r.pfadd("group:2", "kaitlyn", "michelle", "paolo", "rachel")
+print(res12)  # >>> 1
+
+res13 = r.pfcount("group:2")
+print(res13)  # >>> 4
+
+res14 = r.pfmerge("both_groups", "group:1", "group:2")
+print(res14)  # >>> True
+
+res15 = r.pfcount("both_groups")
+print(res15)  # >>> 7
+```
+<!--< clients-example home_prob_dts hyperloglog Python >}}
+< /clients-example >}} -->
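
As a cross-check on the counts above, the same data can be modelled exactly with ordinary Python sets. This is only a sketch of what "cardinality of the union" means; a HyperLogLog estimates these numbers rather than storing the members, and `group_3` is a hypothetical extra group added here to show the effect of overlap.

```python
group_1 = {"andy", "cameron", "david"}
group_2 = {"kaitlyn", "michelle", "paolo", "rachel"}

# The union of two disjoint groups has the sum of their sizes.
both_groups = group_1 | group_2
print(len(group_1), len(group_2), len(both_groups))  # >>> 3 4 7

# Hypothetical third group that overlaps the others: shared members
# are counted once, so only "zoe" increases the cardinality.
group_3 = {"andy", "rachel", "zoe"}
all_groups = both_groups | group_3
print(len(all_groups))  # >>> 8
```

The exact version needs memory proportional to the number of distinct members, which is the cost a HyperLogLog avoids.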
 
 The main benefit that HyperLogLogs offer is their very low
 memory usage. They can count up to 2^64 items with less than
@@ -169,8 +217,35 @@ a Count-min sketch object, add data to it, and then query it.
 Note that you must use the `cms()` method to access the Count-min
 sketch commands.
 
-{{< clients-example home_prob_dts cms Python >}}
-{{< /clients-example >}}
+```py
+# Specify that you want to keep the counts within 0.01
+# (1%) of the true value with a 0.005 (0.5%) chance
+# of going outside this limit.
+res16 = r.cms().initbyprob("items_sold", 0.01, 0.005)
+print(res16)  # >>> True
+
+# The parameters for `incrby()` are two lists. The count
+# for each item in the first list is incremented by the
+# value at the same index in the second list.
+res17 = r.cms().incrby(
+    "items_sold",
+    ["bread", "tea", "coffee", "beer"],  # Items sold
+    [300, 200, 200, 100]
+)
+print(res17)  # >>> [300, 200, 200, 100]
+
+res18 = r.cms().incrby(
+    "items_sold",
+    ["bread", "coffee"],
+    [100, 150]
+)
+print(res18)  # >>> [400, 350]
+
+res19 = r.cms().query("items_sold", "bread", "tea", "coffee", "beer")
+print(res19)  # >>> [400, 200, 350, 100]
+```
+<!--< clients-example home_prob_dts cms Python >}}
+< /clients-example >}} -->
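
The two-list `incrby()` call and the "estimate" nature of the counts may be easier to see in a small pure-Python model. The class `TinyCountMin`, its `width` and `depth` values, and its SHA-256 hashing are hypothetical and chosen only for illustration; Redis derives its own dimensions from the error and probability arguments.

```python
import hashlib


class TinyCountMin:
    """Illustrative Count-min sketch (hypothetical; not the Redis implementation)."""

    def __init__(self, width=2000, depth=8):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _column(self, row, item):
        # Each row uses a differently salted hash of the item.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def incrby(self, items, increments):
        # Pair each item with the increment at the same index.
        for item, inc in zip(items, increments):
            for row in range(self.depth):
                self.rows[row][self._column(row, item)] += inc

    def query(self, *items):
        # The smallest counter across rows is the least inflated
        # by collisions, so it is used as the estimate.
        return [
            min(self.rows[row][self._column(row, item)] for row in range(self.depth))
            for item in items
        ]


sketch = TinyCountMin()
sketch.incrby(["bread", "tea", "coffee", "beer"], [300, 200, 200, 100])
sketch.incrby(["bread", "coffee"], [100, 150])

true_counts = {"bread": 400, "tea": 200, "coffee": 350, "beer": 100}
estimates = dict(zip(true_counts, sketch.query(*true_counts)))
print(estimates)
```

Because counters are only ever incremented, an estimate can be higher than the true count when items collide, but never lower.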
 
 The advantage of using a CMS over keeping an exact count with a
 [sorted set]({{< relref "/develop/data-types/sorted-sets" >}})
@@ -202,8 +277,52 @@ shows how to merge two or more t-digest objects to query the combined
 data set. Note that you must use the `tdigest()` method to access the
 t-digest commands.
 
-{{< clients-example home_prob_dts tdigest Python >}}
-{{< /clients-example >}}
+```py
+res20 = r.tdigest().create("male_heights")
+print(res20)  # >>> True
+
+res21 = r.tdigest().add(
+    "male_heights",
+    [175.5, 181, 160.8, 152, 177, 196, 164]
+)
+print(res21)  # >>> OK
+
+res22 = r.tdigest().min("male_heights")
+print(res22)  # >>> 152.0
+
+res23 = r.tdigest().max("male_heights")
+print(res23)  # >>> 196.0
+
+res24 = r.tdigest().quantile("male_heights", 0.75)
+print(res24)  # >>> [181]
+
+# Note that the CDF value for 181 is not exactly
+# 0.75. Both values are estimates.
+res25 = r.tdigest().cdf("male_heights", 181)
+print(res25)  # >>> [0.7857142857142857]
+
+res26 = r.tdigest().create("female_heights")
+print(res26)  # >>> True
+
+res27 = r.tdigest().add(
+    "female_heights",
+    [155.5, 161, 168.5, 170, 157.5, 163, 171]
+)
+print(res27)  # >>> OK
+
+res28 = r.tdigest().quantile("female_heights", 0.75)
+print(res28)  # >>> [170]
+
+res29 = r.tdigest().merge(
+    "all_heights", 2, "male_heights", "female_heights"
+)
+print(res29)  # >>> OK
+
+res30 = r.tdigest().quantile("all_heights", 0.75)
+print(res30)  # >>> [175.5]
+```
+<!--< clients-example home_prob_dts tdigest Python >}}
+< /clients-example >}} -->
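
The CDF figure for 181 can be reproduced by hand from the seven male heights. The sketch below assumes the convention of counting the observations smaller than the value plus half of those equal to it; `empirical_cdf` is a hypothetical helper written for this illustration, and a real t-digest works from compressed centroids rather than the raw values.

```python
male_heights = [175.5, 181, 160.8, 152, 177, 196, 164]


def empirical_cdf(values, x):
    """Fraction of values below x, counting values equal to x as one half."""
    below = sum(1 for v in values if v < x)
    equal = sum(1 for v in values if v == x)
    return (below + 0.5 * equal) / len(values)


# Five heights are below 181 and one equals it: (5 + 0.5) / 7.
cdf_181 = empirical_cdf(male_heights, 181)
print(cdf_181)  # roughly 0.7857
```

With so few samples the 0.75 quantile and the CDF of the resulting value cannot line up exactly, which is why the two figures in the example differ slightly.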
 
 A t-digest object also supports several other related commands, such
 as querying by rank. See the
@@ -225,5 +344,47 @@ top *k* items and query whether or not a given item is in the
 list. Note that you must use the `topk()` method to access the
 Top-K commands.
 
-{{< clients-example home_prob_dts topk Python >}}
-{{< /clients-example >}}
+```py
+# The `reserve()` method creates the Top-K object with
+# the given key. The parameters are the number of items
+# in the ranking and values for `width`, `depth`, and
+# `decay`, described in the Top-K reference page.
+res31 = r.topk().reserve("top_3_songs", 3, 7, 8, 0.9)
+print(res31)  # >>> True
+
+# The parameters for `incrby()` are two lists. The count
+# for each item in the first list is incremented by the
+# value at the same index in the second list.
+res32 = r.topk().incrby(
+    "top_3_songs",
+    [
+        "Starfish Trooper",
+        "Only one more time",
+        "Rock me, Handel",
+        "How will anyone know?",
+        "Average lover",
+        "Road to everywhere"
+    ],
+    [
+        3000,
+        1850,
+        1325,
+        3890,
+        4098,
+        770
+    ]
+)
+print(res32)
+# >>> [None, None, None, 'Rock me, Handel', 'Only one more time', None]
+
+res33 = r.topk().list("top_3_songs")
+print(res33)
+# >>> ['Average lover', 'How will anyone know?', 'Starfish Trooper']
+
+res34 = r.topk().query(
+    "top_3_songs", "Starfish Trooper", "Road to everywhere"
+)
+print(res34)  # >>> [1, 0]
+```
+<!-- < clients-example home_prob_dts topk Python >}}
+< /clients-example >}} -->
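
For comparison, the same ranking can be computed exactly with `collections.Counter` from the standard library. This sketch keeps a counter for every song, which is what a Top-K object is designed to avoid; it is shown only to confirm which three songs should appear in the list, and the variable names are hypothetical.

```python
from collections import Counter

plays = Counter({
    "Starfish Trooper": 3000,
    "Only one more time": 1850,
    "Rock me, Handel": 1325,
    "How will anyone know?": 3890,
    "Average lover": 4098,
    "Road to everywhere": 770,
})

# most_common(3) returns the three highest counts in descending order.
top_3 = [song for song, _ in plays.most_common(3)]
print(top_3)
# >>> ['Average lover', 'How will anyone know?', 'Starfish Trooper']

# Membership test equivalent to the query in the example above.
membership = [int(song in top_3) for song in ("Starfish Trooper", "Road to everywhere")]
print(membership)  # >>> [1, 0]
```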
