Description
This PR tries to speed up the current BMT hasher by using SIMD when available.
Motivation and Context (Optional)
The current BMT hasher is highly inefficient - it uses massive goroutine spawning to compute a single chunk address. For a full chunk, this means 255 goroutines are spawned per chunk, creating GC stress and significant scheduler stress. For the ReserveSample function used in the redistribution game, this means excessive memory and CPU stress in order to calculate the reserve sample.
The idea here was to use an existing Keccak implementation, compile it to assembly and try to call it directly from our Go code without having to use `cgo`, which comes with its own set of side effects.
I used (I == me + Claude) the XKCP project (from the Keccak authors) and wrote a build script that builds, extracts and wraps the compiled code correctly. Currently only linux amd64 is supported. Windows and mac should fall back to the Go legacy sha3 hasher.
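The platform fallback described above could be sketched roughly as follows. This is only an illustration of the selection rule, not the PR's code: `hasherBackend`, `backendSIMD`, `backendLegacy` and `selectBackend` are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
)

// hasherBackend names which implementation would be picked (hypothetical type).
type hasherBackend string

const (
	backendSIMD   hasherBackend = "simd"      // wrapped XKCP assembly path
	backendLegacy hasherBackend = "legacy-go" // existing Go sha3 hasher
)

// selectBackend applies the rule from the description: only linux/amd64
// gets the SIMD path, every other platform falls back to the Go hasher.
func selectBackend(goos, goarch string) hasherBackend {
	if goos == "linux" && goarch == "amd64" {
		return backendSIMD
	}
	return backendLegacy
}

func main() {
	// Report the backend that would be chosen on the current machine.
	fmt.Println(selectBackend(runtime.GOOS, runtime.GOARCH))
}
```

In an actual build the split would more likely happen at compile time with build constraints such as `//go:build linux && amd64`, so the runtime check here is purely illustrative.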
So far the results are promising:
- x1.6 faster BMT hashing on my local machine (laptop)
- x2.5 faster on AVX2 supported data-center CPUs (Hetzner)
- x5 faster on newer AVX512 architectures

There are a few more things to iron out and test:
- whether `bmtpool` would improve anything at all
- /usr/bin/ld: warning: /tmp/go-link-1425710637/000001.o: missing .note.GNU-stack section implies executable stack
  /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

Worth noting:
- shadow space (don't ask)

Test plan
Related Issue (Optional)
#5174
References: