Optimize the "Hash" and make it faster by MeteorsLiu · Pull Request #10 · panmari/cuckoofilter

MeteorsLiu · 2022-11-26T10:18:03Z

github.com/seiflotfy/cuckoofilter first calculate the hash when it's in initialization.

And I try this to optimize this "better" version of cuckoofilter.

Test is all passed.

=== RUN   TestBucket_Reset
--- PASS: TestBucket_Reset (0.00s)
=== RUN   TestInsertion
--- PASS: TestInsertion (0.03s)
=== RUN   TestLookup
=== RUN   TestLookup/cf.Lookup("one")
=== PAUSE TestLookup/cf.Lookup("one")
=== RUN   TestLookup/cf.Lookup("two")
=== PAUSE TestLookup/cf.Lookup("two")
=== RUN   TestLookup/cf.Lookup("three")
=== PAUSE TestLookup/cf.Lookup("three")
=== RUN   TestLookup/cf.Lookup("four")
=== PAUSE TestLookup/cf.Lookup("four")
=== RUN   TestLookup/cf.Lookup("five")
=== PAUSE TestLookup/cf.Lookup("five")
=== CONT  TestLookup/cf.Lookup("one")
=== CONT  TestLookup/cf.Lookup("four")
=== CONT  TestLookup/cf.Lookup("five")
=== CONT  TestLookup/cf.Lookup("three")
=== CONT  TestLookup/cf.Lookup("two")
--- PASS: TestLookup (0.00s)
    --- PASS: TestLookup/cf.Lookup("one") (0.00s)
    --- PASS: TestLookup/cf.Lookup("four") (0.00s)
    --- PASS: TestLookup/cf.Lookup("five") (0.00s)
    --- PASS: TestLookup/cf.Lookup("three") (0.00s)
    --- PASS: TestLookup/cf.Lookup("two") (0.00s)
=== RUN   TestFilter_LookupLarge
--- PASS: TestFilter_LookupLarge (0.09s)
=== RUN   TestFilter_Insert
--- PASS: TestFilter_Insert (0.00s)
=== RUN   TestDelete
=== RUN   TestDelete/cf.Delete("four")
=== RUN   TestDelete/cf.Delete("five")
=== RUN   TestDelete/cf.Delete("one")
=== RUN   TestDelete/cf.Delete("two")
=== RUN   TestDelete/cf.Delete("three")
--- PASS: TestDelete (0.00s)
    --- PASS: TestDelete/cf.Delete("four") (0.00s)
    --- PASS: TestDelete/cf.Delete("five") (0.00s)
    --- PASS: TestDelete/cf.Delete("one") (0.00s)
    --- PASS: TestDelete/cf.Delete("two") (0.00s)
    --- PASS: TestDelete/cf.Delete("three") (0.00s)
=== RUN   TestDeleteMultipleSame
    cuckoofilter_test.go:229: Filter state full: &{[[65037 65037 65037 65037] [65037 0 0 0]] 5 1}
=== RUN   TestDeleteMultipleSame/cf.Delete("missing")
=== RUN   TestDeleteMultipleSame/cf.Delete("missing2")
=== RUN   TestDeleteMultipleSame/cf.Delete("some_item")
=== RUN   TestDeleteMultipleSame/cf.Delete("some_item")#01
=== RUN   TestDeleteMultipleSame/cf.Delete("some_item")#02
=== RUN   TestDeleteMultipleSame/cf.Delete("some_item")#03
=== RUN   TestDeleteMultipleSame/cf.Delete("some_item")#04
=== RUN   TestDeleteMultipleSame/cf.Delete("some_item")#05
--- PASS: TestDeleteMultipleSame (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("missing") (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("missing2") (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("some_item") (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("some_item")#01 (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("some_item")#02 (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("some_item")#03 (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("some_item")#04 (0.00s)
    --- PASS: TestDeleteMultipleSame/cf.Delete("some_item")#05 (0.00s)
=== RUN   TestEncodeDecode
--- PASS: TestEncodeDecode (0.00s)
=== RUN   TestIndexAndFP
--- PASS: TestIndexAndFP (0.00s)
=== RUN   FuzzDecode
=== RUN   FuzzDecode/seed#0
--- PASS: FuzzDecode (0.00s)
    --- PASS: FuzzDecode/seed#0 (0.00s)
=== RUN   Example
--- PASS: Example (0.00s)
=== RUN   ExampleFilter_Lookup
--- PASS: ExampleFilter_Lookup (0.00s)
=== RUN   ExampleFilter_Delete
--- PASS: ExampleFilter_Delete (0.00s)
=== RUN   Example_threadSafe
--- PASS: Example_threadSafe (0.00s)

Here is the benchmark result:

Before optimization:

BenchmarkFilter_Reset
BenchmarkFilter_Reset-8    	   97090	     12767 ns/op
BenchmarkFilter_Insert
BenchmarkFilter_Insert-8   	39633634	        30.11 ns/op
BenchmarkFilter_Lookup
BenchmarkFilter_Lookup-8   	37294400	        31.90 ns/op
PASS

After optimization:

BenchmarkFilter_Reset
BenchmarkFilter_Reset-8    	   97978	     12345 ns/op
BenchmarkFilter_Insert
BenchmarkFilter_Insert-8   	54499240	        22.14 ns/op
BenchmarkFilter_Lookup
BenchmarkFilter_Lookup-8   	57099548	        20.85 ns/op
PASS

github.com/seiflotfy/cuckoofilter(The benchmark file has been replaced by the same.):

BenchmarkFilter_Insert
BenchmarkFilter_Insert-8   	38737519	        29.49 ns/op
BenchmarkFilter_Lookup
BenchmarkFilter_Lookup-8   	44468107	        27.07 ns/op
BenchmarkFilter_Reset
BenchmarkFilter_Reset-8    	   97950	     12242 ns/op
PASS

MeteorsLiu · 2022-11-26T10:40:04Z

@panmari

panmari

Thanks for your PR, I like the direction! It does a lot of things at once and I'd prefere to be more diligent and evaluate each separately.

panmari · 2022-11-28T19:21:36Z

util.go

 func randi(i1, i2 uint) uint {
-	if rand.Int31()%2 == 0 {
+	// it's faster than mod, but the result is the almost same.
+	if uint32(uint64(uint32(rng.Uint64()))*uint64(2)>>32) == 0 {


This makes it much harder to read, is the optimization really worth it? Please do a separate PR/evaluation.

After studying the code in rng.Uint64(), I see that there will be a speed up in particular on concurrent use. So moving to the new library looks good to me.

But I don't like the obfuscated code here. Please revert it to

rng.Uint64() % 2 == 0

In my opinion, readability is more important here than the small gain from converting and bit shifting.

panmari · 2022-11-28T19:23:04Z

util.go

+}
+func init() {
+	b := make([]byte, 2)
+	endian := getEndian()


What's the advantage in deriving endianes? Hard-coding either big or little for all platforms should do the same afaik.

Undo, I don't see the benefit.

panmari · 2022-11-28T19:26:54Z

util.go

+
+var (
+	// plus 1 to prevent out of range.
+	altHash = [maxFingerprint + 1]uint{}


This comes with a memory penalty of 2^16 /1024 = 64kb.

Please do a separate PR/evaluation to see if it's worth.

Ok, I agree with your argument that in most cases, the filter itself will already be quite big, hence the additional 64 kb don't add much. Please send as separate pull request.

MeteorsLiu · 2022-12-04T13:46:41Z

Hello, sorry for delay.

Here is the evaluation benchmark result.

The source code :
https://github.com/MeteorsLiu/test-cmp

Compare Randi(Wyrng VS Go):

BenchmarkRandiGoMod-8      	75846440	        15.78 ns/op
BenchmarkRandiGo-8         	76128673	        15.77 ns/op
BenchmarkRandiWyrng-8      	196768308	         5.974 ns/op
BenchmarkRandiWyrngMod-8   	200895794	         5.999 ns/op

Compare Hash Algorithm (XXH3 VS Metro)

BenchmarkMetroHash-8    	175887223	         6.801 ns/op
BenchmarkXXH3Hash-8     	176485131	         6.815 ns/op

Compare Static looking up with hashing again.

BenchmarkMetroHash-8    	175887223	         6.801 ns/op
BenchmarkXXH3Hash-8     	176485131	         6.815 ns/op
BenchmarkStaticHash-8   	1000000000	         0.2779 ns/op

Answer:

This makes it much harder to read, is the optimization really worth it? Please do a separate PR/evaluation.

Yes, it worths.

Golang's rand has a mutex lock, which slows down the process, while SRNG doesn't require a lock(replaced by atomical operation)

Also, Wyrng is a faster pseudorandom algorithm than the go's.

What's the advantage in deriving endianes? Hard-coding either big or little for all platforms should do the same afaik.

Maybe no?

This comes with a memory penalty of 2^16 /1024 = 64kb. Please do a separate PR/evaluation to see if it's worth.

Although it takes 64kb to store the hash result, its advantage is that it only calculate for the first time.

When someone is quering the cuckoofilter for many times, it's a great advantage without doubt.

Because getAltIndex is used for many times, which calculates the repeat Hash value for many times, this repeat calculation can be reduced by this.

And I believe that 64kb is a "small" memory block, while cuckoofilter may takes 2MB to store the hash entries.

@panmari

MeteorsLiu · 2022-12-04T14:02:08Z

And Why I replace the MetroHash with XXH3 .

According to the tests,
https://github.com/rurban/smhasher

MetroHash 64 has a LongNeighbor problem, while XXH3 hasn't.

https://github.com/hmakholm/smhasher/blob/master/src/LongNeighborTest.md

As far as i'm concerned, XXH3 is more quality than Metro.
Also, according to my benchmark result, its speed is close to each other.

A quality Hash algorithm will have a great impact on the correctness of the cuckoofilter.

panmari · 2022-12-13T19:21:43Z

Thanks for breaking up the evaluation!

I would still prefer merging/discussing this as separate pull requests. Could you please split them up? See also comments above.

MeteorsLiu · 2022-12-16T05:47:19Z

Sorry, i misunderstand your meaning.
I will separate this requests a few days later.

GuoHaoLan added 5 commits November 26, 2022 18:02

Optimize the cuckoofilter

6c860a6

Update the go.mod

ad028b7

Update the go.sum

c893297

Update util.go

590cde9

Update util.go

1cd0746

panmari requested changes Nov 28, 2022

View reviewed changes

Merge branch 'panmari:master' into master

b610af5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize the "Hash" and make it faster#10

Optimize the "Hash" and make it faster#10
MeteorsLiu wants to merge 6 commits intopanmari:masterfrom
InteractivePlus:master

MeteorsLiu commented Nov 26, 2022 •

edited

Loading

Uh oh!

MeteorsLiu commented Nov 26, 2022

Uh oh!

panmari left a comment

Uh oh!

panmari Nov 28, 2022

Uh oh!

panmari Dec 13, 2022

Uh oh!

panmari Nov 28, 2022

Uh oh!

panmari Dec 13, 2022

Uh oh!

panmari Nov 28, 2022

Uh oh!

panmari Dec 13, 2022

Uh oh!

MeteorsLiu commented Dec 4, 2022 •

edited

Loading

Uh oh!

MeteorsLiu commented Dec 4, 2022 •

edited

Loading

Uh oh!

panmari commented Dec 13, 2022

Uh oh!

MeteorsLiu commented Dec 16, 2022 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

MeteorsLiu commented Nov 26, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MeteorsLiu commented Nov 26, 2022

Uh oh!

panmari left a comment

Choose a reason for hiding this comment

Uh oh!

panmari Nov 28, 2022

Choose a reason for hiding this comment

Uh oh!

panmari Dec 13, 2022

Choose a reason for hiding this comment

Uh oh!

panmari Nov 28, 2022

Choose a reason for hiding this comment

Uh oh!

panmari Dec 13, 2022

Choose a reason for hiding this comment

Uh oh!

panmari Nov 28, 2022

Choose a reason for hiding this comment

Uh oh!

panmari Dec 13, 2022

Choose a reason for hiding this comment

Uh oh!

MeteorsLiu commented Dec 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MeteorsLiu commented Dec 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

panmari commented Dec 13, 2022

Uh oh!

MeteorsLiu commented Dec 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

MeteorsLiu commented Nov 26, 2022 •

edited

Loading

MeteorsLiu commented Dec 4, 2022 •

edited

Loading

MeteorsLiu commented Dec 4, 2022 •

edited

Loading

MeteorsLiu commented Dec 16, 2022 •

edited

Loading