Commit d74224a

Clean out rejected experiments #3
1 parent b0ca9b6 commit d74224a

4 files changed, +76 -64 lines changed
README.md

Lines changed: 32 additions & 39 deletions
@@ -6,13 +6,8 @@ This is a research library currently. It is not meant for production use.
 
 Developers might want to consider our [Header-only Xor Filter library in C](https://github.com/FastFilter/xor_singleheader/).
 
-## Reference
 
-* Thomas Mueller Graf, Daniel Lemire, [Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters](https://arxiv.org/abs/1912.08258), Journal of Experimental Algorithmics 25 (1), 2020. DOI: 10.1145/3376122
-
-
-
-## Prerequisites_fastfilter
+## Prerequisites
 
 - A C++11 compiler such as GNU G++ or LLVM Clang++
 - Make
@@ -44,17 +39,17 @@ Your results will depend on the hardware, on the compiler and how the system is
 
 ```
 $ ./bulk-insert-and-query.exe 10000000
-find find find find find optimal wasted million
-add remove 0% 25% 50% 75% 100% ε bits/item bits/item space keys
-
-add cycles: 325.5/key, instructions: (303.2/key, 0.93/cycle) cache misses: 12.41/key branch misses: 1.17/key
-0.00% cycles: 81.7/key, instructions: ( 48.0/key, 0.59/cycle) cache misses: 3.06/key branch misses: 0.00/key
-0.25% cycles: 81.8/key, instructions: ( 48.0/key, 0.59/cycle) cache misses: 3.06/key branch misses: 0.00/key
-0.50% cycles: 81.8/key, instructions: ( 48.0/key, 0.59/cycle) cache misses: 3.06/key branch misses: 0.00/key
-0.75% cycles: 82.0/key, instructions: ( 48.0/key, 0.59/cycle) cache misses: 3.06/key branch misses: 0.00/key
-1.00% cycles: 81.9/key, instructions: ( 48.0/key, 0.59/cycle) cache misses: 3.06/key branch misses: 0.00/key
-Xor8 106.79 0.00 25.92 25.88 25.86 25.94 25.98 0.3892% 9.84 8.01 22.9% 10.0
-
+./bulk-insert-and-query.exe 10000000
+find find find find find 1*add+ optimal wasted million
+add remove 0% 25% 50% 75% 100% 3*find ε% bits/item bits/item space% keys
+
+add cycles: 351.3/key, instructions: (332.4/key, 0.95/cycle) cache misses: 13.99/key branch misses: 1.23/key
+0.00% cycles: 87.8/key, instructions: ( 48.0/key, 0.55/cycle) cache misses: 2.89/key branch misses: 0.00/key
+0.25% cycles: 87.8/key, instructions: ( 48.0/key, 0.55/cycle) cache misses: 2.89/key branch misses: 0.00/key
+0.50% cycles: 87.8/key, instructions: ( 48.0/key, 0.55/cycle) cache misses: 2.90/key branch misses: 0.00/key
+0.75% cycles: 87.9/key, instructions: ( 48.0/key, 0.55/cycle) cache misses: 2.90/key branch misses: 0.00/key
+1.00% cycles: 87.8/key, instructions: ( 48.0/key, 0.55/cycle) cache misses: 2.89/key branch misses: 0.00/key
+Xor8 106.59 0.00 23.85 23.85 23.85 23.88 23.83 178.15 0.3908 9.84 8.00 23.0 10.000
 ... # many more lines omitted
 ```
 
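Review note on the new output columns: the commit does not spell out how `1*add+3*find`, `optimal bits/item` and `wasted space%` are computed. The sketch below is one reading inferred from the column headers and the Xor8 row. The helper names are hypothetical, and the formulas are assumptions: add cost plus three times a find cost, log2(1/ε) as the lower bound on bits per key, and actual over optimal minus one as the waste.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the commit. The formulas are inferred
# from the column headers and checked against the Xor8 row only.

# 1*add + 3*find: cost of one insertion plus three lookups.
combined_cost() {
    awk -v a="$1" -v f="$2" 'BEGIN { printf "%.2f\n", a + 3 * f }'
}

# Lower bound on bits per key for a false-positive rate given in percent.
optimal_bits() {
    awk -v eps="$1" 'BEGIN { printf "%.2f\n", log(100 / eps) / log(2) }'
}

# Space overhead of the actual bits per key relative to the lower bound.
wasted_percent() {
    awk -v actual="$1" -v optimal="$2" \
        'BEGIN { printf "%.1f\n", (actual / optimal - 1) * 100 }'
}

combined_cost 106.59 23.85    # add and find columns of the Xor8 row
optimal_bits 0.3908           # the ε% column
wasted_percent 9.84 8.00      # bits/item and optimal bits/item
```

With the rounded Xor8 figures this prints 178.14, 8.00 and 23.0. The table reports 178.15 for the first value, presumably because the benchmark computes it from unrounded timings.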
@@ -66,50 +61,48 @@ As part of the benchmark, we check the correctness of the implementation.
 
 ## Benchmarking
 
-The shell script `benchmark/benchmark.sh` runs the benchmark 3 times for the most important algorithms,
-with entry sizes of 10 million and 100 million keys.
-It is much slower than the above, because each invocation runs only one algorithm
-(to ensure running one algorithm doesn't influence benchmark results of other algorithms).
-It stores the results in the file `benchmark-results.txt`.
-To futher analyze the results, use the java tool `AnalyzeResults.java`
-from the project https://github.com/FastFilter/fastfilter_java.
-Requires GCC and Java 8.
+A simple way to run the benchmark for the most important algorithms
+is to run `./bulk-insert-and-query.exe <number of entries>`.
 To get a low error, it is best run on a Linux machine that is not otherwise in use.
-Steps to run the tests and analyze the results:
+You might want to run the tests multiple times to verify it is working properly.
+The steps to run the tests are:
 
 git clone https://github.com/FastFilter/fastfilter_cpp.git
-git clone https://github.com/FastFilter/fastfilter_java.git
 cd fastfilter_cpp/benchmarks
 make clean ; make
-# this may take an hour to run
-./benchmark.sh
-
-cd ../..
-cd fastfilter_java/fastfilter
-mvn clean install
-java -cp target/test-classes org.fastfilter.analysis.AnalyzeResults ../../fastfilter_cpp/benchmarks/benchmark-results.txt
-
+# 100 million entries
+./bulk-insert-and-query.exe 100000000
 
 ## Where is your code?
 
 The filter implementations are in `src/<type>/`. Most implementations depend on `src/hashutil.h`. Examples:
 
 * src/bloom/bloom.h
 * src/xorfilter/xorfilter.h
+* src/xorfilter/3wise_xor_binary_fuse_filter_lowmem.h
+
+
+## References
+
+- [Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters](https://arxiv.org/abs/1912.08258), Journal of Experimental Algorithmics 25 (1), 2020
+
 
 ## Credit
 
 The cuckoo filter and the benchmark are derived from https://github.com/efficient/cuckoofilter by Bin Fan et al.
 The SIMD blocked Bloom filter is from https://github.com/apache/impala (via the cuckoo filter).
 The Morton filter is from https://github.com/AMDComputeLibraries/morton_filter.
 The Counting Quotient Filter (CQF) is from https://github.com/splatlab/cqf.
+The ribbon filters are from https://github.com/pdillinger/fastfilter_cpp.
 
 
-# Implementations of xor filters in other programming languages
+# Implementations of xor and binary fuse filters in other programming languages
 
-* [Go](https://github.com/FastFilter/xorfilter)
+* [C](https://github.com/FastFilter/xor_singleheader)
+* [C99](https://github.com/skeeto/xf8)
 * [Erlang](https://github.com/mpope9/exor_filter)
-* Rust: [1](https://github.com/bnclabs/xorfilter), [2](https://github.com/codri/xorfilter-rs), [3](https://github.com/Polochon-street/rustxorfilter)
+* [Go](https://github.com/FastFilter/xorfilter)
 * [Java](https://github.com/FastFilter/fastfilter_java)
-* [C](https://github.com/FastFilter/xor_singleheader)
 * [Python](https://github.com/GreyDireWolf/pyxorfilter)
+* Rust: [1](https://github.com/bnclabs/xorfilter), [2](https://github.com/codri/xorfilter-rs), [3](https://github.com/Polochon-street/rustxorfilter)
+* [Zig](https://github.com/hexops/xorfilter)

benchmarks/benchmark-fast.sh

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+#!/bin/sh
+# run the benchmark multiple times with all important algorithms
+# for algorithm ids and other parameters, see
+# bulk-insert-and-query.cc
+#
+# rnd: random number generators to use
+for rnd in `seq -1 -1`; do
+# m: number of entries (in millions)
+for m in `seq 100 100`; do
+# test: test id
+for test in `seq 1 10`; do
+sleep 5;
+# default algorithms
+./bulk-insert-and-query.exe ${m}000000;
+done;
+done;
+done > benchmark-fast-results.txt 2>&1
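Review note on the loop bounds: the script uses `seq` with identical first and last values, so the outer two loops each run exactly once, and it builds the entry count by appending six zeros to `m` as text. A minimal sketch of both idioms follows. The helper name `entries_for_millions` is hypothetical and does not appear in the commit.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): expand a size given in millions
# the same way the script does, by appending six zeros as text.
entries_for_millions() {
    m="$1"
    echo "${m}000000"
}

# `seq FIRST LAST` with FIRST equal to LAST prints a single value.
seq -1 -1        # the only rnd value
seq 100 100      # the only m value

# `seq FIRST INCREMENT LAST`, as used by the old benchmark.sh loop.
seq 10 90 100    # two values: 10 and 100

entries_for_millions 100
```

The last call prints 100000000, which is the argument `benchmark-fast.sh` passes to `./bulk-insert-and-query.exe`.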

benchmarks/benchmark.sh

Lines changed: 8 additions & 7 deletions
@@ -5,14 +5,15 @@
 #
 # rnd: random number generators to use
 for rnd in `seq -1 -1`; do
-# alg: algorithms to test
-for alg in 0 2 3 4 11 12 13 15 16 17 20 30 40 41 42 51 80 100; do
-# m: number of entries
-for m in `seq 10 90 100`; do
-# test: test id
-for test in `seq 1 3`; do
+# m: number of entries (in millions)
+for m in `seq 100 100`; do
+# test: test id
+for test in `seq 1 20`; do
+# alg: algorithms to test
+for alg in 0 2 11 12 44 45 51 116 117 118 119 1086 1156 3086 3156; do
 now=$(date +"%T");
-echo ${now} alg ${alg} size ${m} ${rnd};
+echo ${test} ${now} alg ${alg} size ${m} ${rnd};
+sleep 10;
 ./bulk-insert-and-query.exe ${m}000000 ${alg} ${rnd};
 done;
 done;
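Review note on run time: with the new loop order and the added `sleep 10`, the idle time alone is easy to estimate from the loop bounds in this hunk. The sketch below counts the algorithm ids and multiplies them out. `count_words` is a hypothetical helper, and the time spent inside `./bulk-insert-and-query.exe` is not included because the commit gives no figure for it.

```shell
#!/bin/sh
# Values copied from the new benchmark.sh loops.
algs="0 2 11 12 44 45 51 116 117 118 119 1086 1156 3086 3156"
tests=20         # for test in `seq 1 20`
sleep_secs=10    # sleep 10 before each run

# Hypothetical helper: count whitespace-separated words in one argument.
count_words() {
    set -- $1
    echo "$#"
}

n_algs=$(count_words "$algs")
runs=$((n_algs * tests))
echo "algorithms: $n_algs"
echo "runs: $runs"
echo "seconds spent sleeping: $((runs * sleep_secs))"
```

This reports 15 algorithms, 300 runs and 3000 seconds of sleep, so the script idles for 50 minutes before any benchmark time is counted.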

benchmarks/bulk-insert-and-query.cc

Lines changed: 19 additions & 18 deletions
@@ -289,12 +289,13 @@ Statistics FilterBenchmark(
 #ifdef __linux__
 unified.end(results);
 printf("remove ");
-printf("cycles: %5.1f/key, instructions: (%5.1f/key, %4.2f/cycle) cache misses: %5.2f/key branch misses: %4.2f/key\n",
+printf("cycles: %5.1f/key, instructions: (%5.1f/key, %4.2f/cycle) cache misses: %5.2f/key branch misses: %4.2f/key effective frequency %4.2f GHz\n",
 results[0]*1.0/add_count,
 results[1]*1.0/add_count ,
 results[1]*1.0/results[0],
 results[2]*1.0/add_count,
-results[3]*1.0/add_count);
+results[3]*1.0/add_count,
+results[0]*1.0/time);
 #else
 std::cout << "." << std::flush;
 #endif
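Review note on the new `effective frequency` field: the value printed is `results[0]*1.0/time`, that is, the cycle count divided by the elapsed time. The result is in GHz only if `time` is measured in nanoseconds, which this hunk does not show, so treat that unit as an assumption. A small sketch of the arithmetic, with a hypothetical helper name and made-up sample numbers:

```shell
#!/bin/sh
# Hypothetical helper, not part of the commit.
# Assumption: elapsed time is in nanoseconds, so cycles / ns equals GHz.
effective_ghz() {
    awk -v cycles="$1" -v ns="$2" 'BEGIN { printf "%.2f\n", cycles / ns }'
}

# Made-up sample: 3,513,000,000 cycles counted over one second (1e9 ns).
effective_ghz 3513000000 1000000000
```

For the sample values this prints 3.51. If `time` turns out to be in another unit, the label in the format string would need a matching change.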
@@ -347,7 +348,7 @@ int main(int argc, char * argv[]) {
 {0, "Xor8"}, {1, "Xor12"}, {2, "Xor16"},
 {3, "Xor+8"}, {4, "Xor+16"},
 {5, "Xor10"}, {6, "Xor10.666"},
-{7, "Xor10 (NBitArray)"}, {8, "Xor14 (NBitArray)"}, {9, "Xor8-2^n"},
+{7, "Xor10-NBitArray"}, {8, "Xor14-NBitArray"}, {9, "Xor8-2^n"},
 // Cuckooo
 {10,"Cuckoo8"}, {11,"Cuckoo12"}, {12,"Cuckoo16"},
 {13,"CuckooSemiSort13"},
@@ -363,27 +364,27 @@ int main(int argc, char * argv[]) {
 #endif
 // Bloom
 {40, "Bloom8"}, {41, "Bloom12" }, {42, "Bloom16"},
-{43, "Bloom8 (addall)"}, {44, "Bloom12 (addall)"}, {45, "Bloom16 (addall)"},
-{46, "BranchlessBloom8 (addall)"},
-{47, "BranchlessBloom12 (addall)"},
-{48, "BranchlessBloom16 (addall)"},
+{43, "Bloom8-addAll"}, {44, "Bloom12-addAll"}, {45, "Bloom16-addAll"},
+{46, "BranchlessBloom8-addAll"},
+{47, "BranchlessBloom12-addAll"},
+{48, "BranchlessBloom16-addAll"},
 // Blocked Bloom
 {50, "SimpleBlockedBloom"},
 #ifdef __aarch64__
 {51, "BlockedBloom"},
-{52, "BlockedBloom (addall)"},
+{52, "BlockedBloom-addAll"},
 #elif defined( __AVX2__)
 {51, "BlockedBloom"},
-{52, "BlockedBloom (addall)"},
+{52, "BlockedBloom-addAll"},
 {53, "BlockedBloom64"},
 #endif
 #ifdef __SSE41__
 {54, "BlockedBloom16"},
 #endif
 
 // Counting Bloom
-{60, "CountingBloom10 (addall)"},
-{61, "SuccCountingBloom10 (addall)"},
+{60, "CountingBloom10-addAll"},
+{61, "SuccCountingBloom10-addAll"},
 {62, "SuccCountBlockBloom10"},
 {63, "SuccCountBlockBloomRank10"},
 
@@ -393,8 +394,8 @@ int main(int argc, char * argv[]) {
 
 {80, "Morton"},
 
-{96, "XorBinaryFuse8"},
-{97, "XorBinaryFuse16"},
+{96, "XorBinaryFuse8-Naive"},
+{97, "XorBinaryFuse16-Naive"},
 {98, "XorBinaryFuse8-4Wise-Prefetch"},
 {99, "XorBinaryFuse16-4Wise-Prefetch"},
 {100, "XorBinaryFuse8-Prefetched"},
@@ -411,10 +412,10 @@ int main(int argc, char * argv[]) {
 {113, "XorBinaryFuse16-4Wise-PSorted"},
 {114, "XorBinaryFuse8-OneHash"},
 {115, "XorBinaryFuse16-OneHash"},
-{116, "XorBinaryFuse8-LowMem"},
-{117, "XorBinaryFuse16-LowMem"},
-{118, "XorBinaryFuse8-4Wise-LowMem"},
-{119, "XorBinaryFuse16-4Wise-LowMem"},
+{116, "XorBinaryFuse8"},
+{117, "XorBinaryFuse16"},
+{118, "XorBinaryFuse8-4Wise"},
+{119, "XorBinaryFuse16-4Wise"},
 {1056, "HomogRibbon64_5"},
 {1076, "HomogRibbon64_7"}, // interesting
 {1086, "HomogRibbon64_8"},
@@ -801,7 +802,7 @@ int main(int argc, char * argv[]) {
 cout << setw(NAME_WIDTH) << names[a] << cf << endl;
 }
 a = 41;
-if (algorithmId == a || (algos.find(a) != algos.end())) {
+if (algorithmId == a || (algos.find(a) != algos.end())) {
 auto cf = FilterBenchmark<
 BloomFilter<uint64_t, 12, false, SimpleMixSplit>>(
 add_count, to_add, intersectionsize, mixed_sets);
