Commit c328cdd

committed
update readme
1 parent 7495613 commit c328cdd

2 files changed: 10 additions, 9 deletions

cmd/main.go

Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@ import (
 func main() {
 	num := 20_000_00
 	// div := num / 10
-	// main5(num)
+	// main6()
 	// return
 	opts := &sprout.BloomOptions{
 		Err_rate: 0.001,
@@ -89,7 +89,7 @@ func bToMb(b uint64) uint64 {
 
 func main6() {
 	num := 2_000_000
-	db, err := bolt.Open("store1.db", 0644, nil)
+	db, err := bolt.Open("store.db", 0644, nil)
 	if err != nil {
 		panic(err)
 	}
@@ -105,7 +105,7 @@ func main6() {
 		err = db.Update(func(tx *bolt.Tx) error {
 			b := tx.Bucket([]byte("test"))
 			for j := 0; j < num/10; j++ {
-				err := b.Put([]byte(fmt.Sprintf("foo-i%d-j%d", i, j)), []byte("bar"))
+				err := b.Put([]byte(fmt.Sprintf("i%d-j%d", i, j)), []byte("b"))
 				if err != nil {
 					return err
 				}

readme.md

Lines changed: 7 additions & 6 deletions
@@ -1,24 +1,25 @@
 ### Sprout
 
-A bloom filter is a probabilistic data structure that is used to determine if an element is present in a set. Bloom filters are fast and space efficient. They allow for false positives, but mitigate the probability with an expected false positive rate. An error rate of 0.001 implies that the probability of a false positive is 1 in 1000.
+A bloom filter is a probabilistic data structure that is used to determine if an element is present in a set. Bloom filters are fast and space efficient. They allow for false positives, but mitigate the probability with an expected false positive rate. An error rate of 0.001 implies that the probability of a false positive is 1 in 1000. Bloom filters don't store the elements themselves, but instead use a set of hash functions to determine the presence of an element.
 
-To fulfil the false positive rate, bloom filters are initialized with a capacity. The capacity is the number of elements that can be inserted into the bloom filter, and this cannot be changed.
+To fulfil the false positive rate, bloom filters can be initialized with a capacity. The capacity is the number of elements that can be inserted into the bloom filter, and this cannot be changed.
 
-Sprout implements a bloom filter in Go, while using boltdb and badgerdb as optional in-memory persistent storage for the values. The bloom filter is written to a memory-mapped file.
+Sprout implements a bloom filter in Go. The bits of the filter are stored in a memory-mapped file. Sprout also allows attaching a persistent storage (boltdb and badgerdb) to store the key value pairs.
 
 Sprout also implement a scalable bloom filter described in a paper written by [P. Almeida, C.Baquero, N. Preguiça, D. Hutchison](https://haslab.uminho.pt/cbm/files/dbloom.pdf).
 
 A scalable bloom filter allows you to grow the filter beyond the initial filter capacity, while preserving the desired false positive rate.
 
 ### Memory Usage
 
-Bloom filters are space efficient, as they only store the bits that are set. For a filter with a capacity of 20,000,000 and a error rate of 0.001, the storage size is approximately 34MB. That implies that there are approximately 1.78 bytes (~14 bits) per element.
+Bloom filters are space efficient, as they only store the bits that are set. For a filter with a capacity of 2,000,000 and a error rate of 0.001, the storage size is approximately 3.4MB. That implies that there are approximately 1.8 bytes (~14 bits) per element.
 The number of bits per element is as a result of the number of hash functions, which is derived from the capacity and the error rate.
 
-In comparison, adding 2 million key/value pair to a boltdb uses a storage size of about 128MB (134 bytes per pair).
+In comparison, adding 2 million elements (with a singly byte value) to a boltdb database uses a storage size of about 108MB (113 bytes per pair).
 
 **Scalable Bloom Filters**
-The scalable bloom filter initialized with a capacity of 2,000,000 and a error rate of 0.001, when grown to a capacity of 20,000,000, the total storage size is approximately 37.3MB.
+
+A scalable bloom filter initialized with a capacity of 2,000,000 and an error rate of 0.001, when grown to a capacity of 20,000,000, the total storage size is approximately 37.3MB.
 
 ### Installation
 
