-
Notifications
You must be signed in to change notification settings - Fork 9
feat(bloomfilter): add salt #742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
559e85f to
08445e8
Compare
dcoutts
previously requested changes
May 29, 2025
Collaborator
dcoutts
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good so far.
A few minor suggestions:
- arg order
- do we really need a class for hashing with salt, it's just used internally
- where the Salt type lives
And then we can move on to the uses in lsm-tree.
a0d8b27 to
caeb000
Compare
2fc1622 to
f13d95b
Compare
f13d95b to
fe4ec82
Compare
fe4ec82 to
6b676f4
Compare
Co-authored-by: Wen Kokke <[email protected]>
642aff3 to
7ff8e01
Compare
Making the hash salt configurable improves security, because bloom filter hashes are not cryptographic hashes. The hash salt can be configured on a session-wide basis only. That is, all bloom filters for all tables in a single session use the same salt. As a result, batch lookup performance is not impacted by the salt. The performance currently relies on being able to compute a hash only once for multiple bloom filters, which is only possible if the salt for each bloom filter is the same. For now the user has the responsibility of passing in the same salt each time a session is restored. We will change this in the next few commits. Co-authored-by: Wen Kokke <[email protected]>
Co-authored-by: Wen Kokke <[email protected]>
7ff8e01 to
72e0b7f
Compare
This helped troubleshoot a bug that we will address in the next commit.
`HasBlockIO` is passed into `openSession` by the user, so it should also be closed by the user. Otherwise, it would prevent reuse of `HasBlockIO`. This bug popped up in the test we add in the next commit.
…the time) If a session is restored with a salt that was different from the salt it was created with initially, then lookups will often return incorrect results. We will fix this in the next few commits.
... to `withKeepSessionOpen` and `withKeepTableOpen` respectively. This is to avoid name conflicts in the next few commits.
Only the former requires a salt, and the latter does not require it. We still keep a version of `openSession` that defers to `newSession` or `restoreSession` based on the session directory contents, because we expect it to be useful for users. In `newSession`, the input salt is written to a new metadata file in the session directory. In `restoreSession`, we read this metadata file back into memory to extract the salt.
…salt This prevents opening snapshots with the wrong salt, which would lead to incorrect lookup results.
… tests ... and update the tests a bit.
4cecb21 to
5e7cf18
Compare
jorisdral
approved these changes
Jun 26, 2025
jorisdral
added a commit
that referenced
this pull request
Jul 1, 2025
There are a few places where functions in the public API create a `HasBlockIO` for the user, but since #742 when a session is closed the `HasBlockIO` it contains does not get closed automatically anymore. So in the public API, we have to be extra careful to do so when the public API opens a `HasBlockIO`.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds salt to the Bloom filter API.
I expect the CI to break, since I have not modified the usage of the bloomfilter package in lsm-tree.