You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: readme.md
+12-2Lines changed: 12 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,13 +1,23 @@
1
1
### Sprout
2
2
3
-
A bloom filter is a probabilistic data structure that is used to determine if an element is present in a set. Bloom filters are fast and space efficient. Bloom filters allow for false positives, but mitigate the probability with an expected false positive rate. An error rate of 0.001 implies that the probability of a false positive is 1 in 1000.
3
+
A bloom filter is a probabilistic data structure that is used to determine if an element is present in a set. Bloom filters are fast and space efficient. They allow for false positives, but mitigate the probability with an expected false positive rate. An error rate of 0.001 implies that the probability of a false positive is 1 in 1000.
4
4
5
-
Sprout implements a bloom filter in Go, while using boltdb and badgerdb as optional in-memory persistent storage. Sprout writes the bloom filter to a memory-mapped file, and reads it from disk when needed.
5
+
To fulfil the false positive rate, bloom filters are initialized with a capacity. The capacity is the number of elements that can be inserted into the bloom filter, and this cannot be changed.
6
+
7
+
Sprout implements a bloom filter in Go, while using boltdb and badgerdb as optional in-memory persistent storage. Sprout writes the bloom filter to a memory-mapped file.
6
8
7
9
Sprout also implement a scalable bloom filter described in a paper written by [P. Almeida, C.Baquero, N. Preguiça, D. Hutchison](https://haslab.uminho.pt/cbm/files/dbloom.pdf).
8
10
9
11
A scalable bloom filter allows you to grow the filter beyond the initial filter capacity, while preserving the desired false positive rate.
10
12
13
+
### Memory Usage
14
+
15
+
Bloom filters are space efficient, as they are only storing the bits that are set. For a filter with a capacity of 20,000,000 and a error rate of 0.001, the storage size is approximately 34MB. That implies approx 1.78 bytes (~14 bits) per element.
16
+
The number of bits per element is as a result of the number of hash functions derived from the capacity and the error rate.
17
+
18
+
**Scalable Bloom Filters**
19
+
The scalable bloom filter initialized with a capacity of 2,000,000 and a error rate of 0.001, when grown to a capacity of 20,000,000, the total storage size is approximately 37.3MB.
0 commit comments