Skip to content

Commit 3fece21

Browse files
authored
Apply suggestions from code review
1 parent 2ccc347 commit 3fece21

File tree

1 file changed

+2
-2
lines changed

1 file changed

+2
-2
lines changed

content/develop/data-types/probabilistic/bloom-filter.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ weight: 10
1919

2020
A Bloom filter is a probabilistic data structure in Redis Stack that enables you to check if an item is present in a set using a very small memory space of a fixed size.
2121

22-
Instead of storing all the items in a set, a Bloom Filter stores only the items' hashed representation, thus sacrificing some precision. The trade-off is that Bloom Filters are very space-efficient and fast.
22+
Instead of storing all the items in a set, a Bloom Filter stores only the items' hashed representations, thus sacrificing some precision. The trade-off is that Bloom Filters are very space-efficient and fast.
2323

2424
A Bloom filter can guarantee the absence of an item from a set, but it can only give an estimation about its presence. So when it responds that an item is not present in a set (a negative answer), you can be sure that indeed is the case. But one out of every N positive answers will be wrong. Even though it looks unusual at first glance, this kind of uncertainty still has its place in computer science. There are many cases out there where a negative answer will prevent more costly operations, for example checking if a username has been taken, if a credit card has been reported as stolen, if a user has already seen an ad and much more.
2525

@@ -114,7 +114,7 @@ The rate is a decimal value between 0 and 1. For example, for a desired false po
114114
This is the number of items you expect having in your filter in total and is trivial when you have a static set but it becomes more challenging when your set grows over time. It's important to get the number right because if you **oversize** - you'll end up wasting memory. If you **undersize**, the filter will fill up and a new one will have to be stacked on top of it (sub-filter stacking). In the cases when a filter consists of multiple sub-filters stacked on top of each other latency for adds stays the same, but the latency for presence checks increases. The reason for this is the way the checks work: a regular check would first be performed on the top (latest) filter and if a negative answer is returned the next one is checked and so on. That's where the added latency comes from.
115115

116116
#### 3. Scaling (`EXPANSION`)
117-
Adding an item to a Bloom filter never fails due to the data structure "filling up". Instead, the error rate starts to grow. To keep the error close to the one set on filter initialization - the Bloom filter will auto-scale, meaning, when capacity is reached, an additional sub-filter will be created.
117+
Adding an item to a Bloom filter never fails due to the data structure "filling up". Instead, the error rate starts to grow. To keep the error close to the one set on filter initialization, the Bloom filter will auto-scale, meaning, when capacity is reached, an additional sub-filter will be created.
118118
The size of the new sub-filter is the size of the last sub-filter multiplied by `EXPANSION`. If the number of items to be stored in the filter is unknown, we recommend that you use an expansion of 2 or more to reduce the number of sub-filters. Otherwise, we recommend that you use an expansion of 1 to reduce memory consumption. The default expansion value is 2.
119119

120120
The filter will keep adding more hash functions for every new sub-filter in order to keep your desired error rate.

0 commit comments

Comments
 (0)