Commit 3e2b44d

bloomfilter-blocked: update the README
1 parent 488224b commit 3e2b44d

2 files changed: 61 additions and 16 deletions.

bloomfilter-blocked/README.md

Lines changed: 58 additions & 16 deletions
@@ -1,29 +1,71 @@
1-
# A fast, space efficient Bloom filter implementation
1+
# bloomfilter-blocked
22

3-
Copyright 2008, 2009, 2010, 2011 Bryan O'Sullivan <[email protected]>.
3+
`bloomfilter-blocked` is a Haskell library providing multiple fast and efficient
4+
implementations of [bloom filters][bloom-filter:wiki]. It is a full rewrite of
5+
the [`bloomfilter`][bloomfilter:hackage] package, originally authored by Bryan
6+
O'Sullivan <[email protected]>.
47

5-
This package provides both mutable and immutable Bloom filter data
6-
types, along with a family of hash function and an easy-to-use
7-
interface.
8+
A bloom filter is a space-efficient data structure representing a set that can
9+
be probabilistically queried for set membership. The set membership query returns
10+
no false negatives, but it might return false positives. That is, if an element
11+
was added to a bloom filter, then a subsequent query definitely returns `True`.
12+
If an element was *not* added to a filter, then a subsequent query may still
13+
return `True` even though `False` would be the correct answer. The probability of false
14+
positives -- the false positive rate (FPR) -- is configurable, as we will
15+
describe later.
816

9-
To build:
17+
The library includes two implementations of bloom filters: classic, and blocked.
1018

11-
cabal install bloomfilter
19+
* **Classic** bloom filters, found in the `Data.BloomFilter.Classic` module: a
20+
default implementation that is faithful to the canonical description of a
21+
bloom filter data structure.
1222

13-
For examples of usage, see the Haddock documentation and the files in
14-
the examples directory.
23+
* **Blocked** bloom filters, found in the `Data.BloomFilter.Blocked` module: an
24+
implementation that optimises the memory layout of a classic bloom filter for
25+
speed (cheaper CPU cache reads), at the cost of a slightly higher FPR for the
26+
same amount of assigned memory.
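
To make the difference between the two layouts concrete, here is a minimal, self-contained sketch. It is not the library's implementation: the names `Probe`, `classicProbe`, `blockedProbe`, `insertWith` and `memberWith`, the two toy string hashes, and the 512-bit block size are all assumptions made for illustration, and the bit array is modelled as a plain `Integer`. The classic probe spreads its `k` bit positions over the whole array, while the blocked probe first picks one block and keeps all `k` positions inside it.

```haskell
import Data.Bits (setBit, testBit)
import Data.Char (ord)
import Data.List (foldl')

-- | Given the filter size in bits (m), the number of hashes (k) and an
-- element, return the bit positions to set or test. Hypothetical type.
type Probe = Int -> Int -> String -> [Int]

-- Two toy string hashes, for illustration only (the library uses XXH3).
hashA, hashB :: String -> Int
hashA = foldl' (\h c -> (h * 31 + ord c) `mod` 1000003) 7
hashB = foldl' (\h c -> (h * 131 + ord c) `mod` 998244353) 1

-- | Classic layout: k positions anywhere in the m-bit array,
-- derived by double hashing.
classicProbe :: Probe
classicProbe m k x = [ (hashA x + i * hashB x) `mod` m | i <- [0 .. k - 1] ]

-- | Blocked layout: one hash selects a block, and all k positions fall
-- inside that block. Assumes m is a multiple of the (assumed) block size.
blockedProbe :: Probe
blockedProbe m k x =
  [ base + (hashB x + i * step) `mod` blockBits | i <- [0 .. k - 1] ]
  where
    blockBits = 512                            -- assumed: one 64-byte cache line
    numBlocks = max 1 (m `div` blockBits)
    base      = (hashA x `mod` numBlocks) * blockBits
    step      = 2 * hashA x + 1                -- odd stride within the block

-- | Insert an element by setting every probed bit.
insertWith :: Probe -> Int -> Int -> String -> Integer -> Integer
insertWith probe m k x bs = foldl' setBit bs (probe m k x)

-- | Query an element: True only if every probed bit is set.
memberWith :: Probe -> Int -> Int -> String -> Integer -> Bool
memberWith probe m k x bs = all (testBit bs) (probe m k x)
```

Because insertion sets exactly the bits that a later query tests, an inserted element is always reported as present in both layouts. The blocked variant touches a single block per operation, which is where the cheaper cache behaviour comes from, and confining all bits to one block is also why its FPR is somewhat higher for the same memory.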
1527

28+
The FPR scales inversely with how much memory is assigned to the filter. It also
29+
increases with how many elements are added to the set. The user can
30+
configure how much memory is assigned to a filter, and the user also controls
31+
how many elements are added to a set. Each implementation comes with helper
32+
functions, like `sizeForFPR` and `sizeForBits`, that the user can leverage to
33+
configure filters.
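
As a rough guide to how memory and element count interact, the sketch below evaluates the standard textbook approximation for a classic bloom filter, FPR ≈ (1 - e^(-k·n/m))^k, where `m` is the number of bits, `n` the number of inserted elements and `k` the number of hash functions. The function names `classicFPR` and `optimalHashes` are hypothetical and are not part of this library; the library's own helpers (`sizeForFPR`, `sizeForBits`) may compute sizes differently, and blocked filters have a somewhat higher FPR than this formula predicts.

```haskell
-- | Approximate false positive rate of a classic bloom filter.
-- Arguments: number of bits (m), number of inserted elements (n),
-- number of hash functions (k). Hypothetical helper, not a library function.
classicFPR :: Double -> Double -> Double -> Double
classicFPR m n k = (1 - exp (negate (k * n) / m)) ** k

-- | Number of hash functions that minimises the approximation above
-- for a given m and n: k = (m / n) * ln 2.
optimalHashes :: Double -> Double -> Double
optimalHashes m n = (m / n) * log 2
```

For example, `classicFPR 10000 1000 7` (10 bits per element, 7 hashes) evaluates to a little under 1%. Doubling `m` lowers the result, and doubling `n` raises it.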
1634

17-
# Get involved!
35+
Both immutable (`Bloom`) and mutable (`MBloom`) bloom filters, including
36+
functions to convert between the two, are provided for each implementation. Note
37+
however that a (mutable) bloom filter cannot be resized once created, and that
38+
elements cannot be deleted once inserted.
1839

19-
Please report bugs via the
20-
[github issue tracker](https://github.com/haskell-pkg-janitors/bloomfilter).
40+
For more information about the library and examples of how to use it, see the
41+
Haddock documentation of the different modules.
2142

22-
Master [git repository](https://github.com/haskell-pkg-janitors/bloomfilter):
43+
# Usage notes
2344

24-
* `git clone git://github.com/haskell-pkg-janitors/bloomfilter.git`
45+
Users should take the following into account:
2546

47+
* This package is not supported on 32-bit systems.
2648

27-
# Authors
49+
# Differences from the `bloomfilter` package
2850

29-
This library is written by Bryan O'Sullivan, <[email protected]>.
51+
The library is a full rewrite of the [`bloomfilter`][bloomfilter:hackage]
52+
package, originally authored by Bryan O'Sullivan <[email protected]>. The main
53+
differences are:
54+
55+
* `bloomfilter-blocked` supports both classic and blocked bloom filters, whereas
56+
`bloomfilter` only supports the former.
57+
* `bloomfilter-blocked` supports bloom filters of arbitrary sizes, whereas
58+
`bloomfilter` limits the sizes to powers of two. Moreover, we support sizes up
59+
to `2^48` for classic bloom filters and up to `2^41` for blocked bloom
60+
filters, instead of `2^32`.
61+
* In `bloomfilter`, the `Bloom` and `MBloom` types are parametrised over a
62+
`Hashable` type class. In `bloomfilter-blocked` the hashing scheme is static.
63+
This allows clean (de-)serialisation of bloom filters.
64+
* For hashing, `bloomfilter-blocked` uses `XXH3` instead of Jenkins'
65+
`lookup3` that `bloomfilter` uses.
66+
67+
68+
<!-- Sources -->
69+
70+
[bloom-filter:wiki]: https://en.wikipedia.org/wiki/Bloom_filter
71+
[bloomfilter:hackage]: https://hackage.haskell.org/package/bloomfilter

bloomfilter-blocked/src/Data/BloomFilter.hs

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
1+
-- | By default, this module re-exports the classic bloom filter implementation
2+
-- from "Data.BloomFilter.Classic". If you want to use the blocked bloom filter
3+
-- implementation, import "Data.BloomFilter.Blocked".
14
module Data.BloomFilter (
25
module Data.BloomFilter.Classic
36
) where
