Commit e12ad7f

Merge #19968: doc: clarify CRollingBloomFilter size estimate
d9141a0 doc: clarify CRollingBloomFilter size estimate (Anthony Towns)

Pull request description:

  Based on #19130, this change improves the comment for `CRollingBloomFilter` in `bloom.h`:

  - Give examples to illustrate the heuristic "1.8 bytes per element per factor 0.1 of false positive rate"
  - Add some Python code which can be copy/pasted for convenient filter size calculation (in an interpreter)
  - Reconcile the newly added code with the existing approximation

ACKs for top commit:
  laanwj:
    ACK d9141a0

Tree-SHA512: e7138b3c531883a750ead06368975c750863fde7ef6f2633b137eca011079226e9205316217322014399fba05a48f294c788dd700bb7d479c58fe1f23e40419f
2 parents 47b6ad8 + d9141a0 commit e12ad7f

1 file changed: +12 -1 lines changed

src/bloom.h

Lines changed: 12 additions & 1 deletion
@@ -94,7 +94,18 @@ class CBloomFilter
  * insert()'ed ... but may also return true for items that were not inserted.
  *
  * It needs around 1.8 bytes per element per factor 0.1 of false positive rate.
- * (More accurately: 3/(log(256)*log(2)) * log(1/fpRate) * nElements bytes)
+ * For example, if we want 1000 elements, we'd need:
+ * - ~1800 bytes for a false positive rate of 0.1
+ * - ~3600 bytes for a false positive rate of 0.01
+ * - ~5400 bytes for a false positive rate of 0.001
+ *
+ * If we make these simplifying assumptions:
+ * - logFpRate / log(0.5) doesn't get rounded or clamped in the nHashFuncs calculation
+ * - nElements is even, so that nEntriesPerGeneration == nElements / 2
+ *
+ * Then we get a more accurate estimate for filter bytes:
+ *
+ * 3/(log(256)*log(2)) * log(1/fpRate) * nElements
  */
 class CRollingBloomFilter
 {
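
The size estimate in the comment can be checked in a Python interpreter. The sketch below only evaluates the closed-form approximation from the comment under its stated simplifying assumptions; it does not reproduce the rounding or clamping mentioned there. The function name `rolling_bloom_bytes` is hypothetical and is not part of `bloom.h`.

```python
import math

def rolling_bloom_bytes(n_elements, fp_rate):
    """Approximate filter size in bytes (hypothetical helper, not from bloom.h).

    Evaluates the formula from the comment:
        3/(log(256)*log(2)) * log(1/fpRate) * nElements
    where log is the natural logarithm.
    """
    return 3 / (math.log(256) * math.log(2)) * math.log(1 / fp_rate) * n_elements

# Compare against the rule of thumb: ~1.8 bytes per element
# per factor 0.1 of false positive rate.
for fp_rate in (0.1, 0.01, 0.001):
    estimate = rolling_bloom_bytes(1000, fp_rate)
    print(f"fpRate={fp_rate}: about {estimate:.0f} bytes")
```

For 1000 elements the results land within about 10 bytes of the rounded figures quoted in the comment (~1800, ~3600, ~5400), since each factor of 0.1 in the false positive rate adds roughly 1.8 bytes per element.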

0 commit comments
