Fix checksum cache with compacting #92
Merged
The checksum cache does more than cache specific checksum queries. It also reuses earlier checksums and adds only the partial checksum of any newer operations on the bucket, which can be significantly faster than recalculating the entire checksum from scratch.
Compacting a bucket guarantees that the final checksum for the bucket stays the same. However, there is one edge case:
The core issue is that a CLEAR operation indicates that the checksum must be reset, instead of being added to the checksum of previous operations. While the client handles this correctly, the checksum cache did not account for it.
This only happened when the compact resulted in a CLEAR operation that is later than the last cached checksum. This could happen, for example, when a table was re-created from scratch (or every row was updated) and the bucket was then compacted, without looking up and caching checksums in between. As a result, the issue was rare, but had a major impact when it did occur.
The fix is to detect CLEAR operations when computing a partial checksum, and to reset the checksum in that case.
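As a rough illustration of the fix, the sketch below combines a cached checksum with the partial checksum of newer operations and resets instead of adding when the partial range contains a CLEAR. The type names, the `hasClear` flag, and the 32-bit wrapping addition are assumptions made for this example, not the actual code in this PR.

```typescript
// Hypothetical types for illustration only; not the service's real API.
interface BucketChecksum {
  count: number; // number of operations covered by this checksum
  checksum: number; // assumed to be a signed 32-bit value combined by wrapping addition
}

interface PartialChecksum extends BucketChecksum {
  // True if the range of newer operations contains a CLEAR operation.
  hasClear: boolean;
}

// Add two checksums, wrapping the result to a signed 32-bit integer.
function add32(a: number, b: number): number {
  return (a + b) | 0;
}

// Combine the cached checksum (if any) with the partial checksum of newer operations.
function applyPartial(
  cached: BucketChecksum | undefined,
  partial: PartialChecksum
): BucketChecksum {
  if (cached === undefined || partial.hasClear) {
    // A CLEAR resets the bucket, so the partial result already is the full
    // checksum and anything cached before it must be discarded.
    return { count: partial.count, checksum: partial.checksum };
  }
  return {
    count: cached.count + partial.count,
    checksum: add32(cached.checksum, partial.checksum)
  };
}
```

Without the `hasClear` branch, the cached value would be added to a checksum that already represents the whole bucket, which is the failure mode described above.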
This also adds a 1-hour TTL to cache entries. This is not needed to keep the cache small (it is size-limited already), but it helps to eventually refresh bucket counts after a compact. If the checksum cache for a bucket is updated within the TTL period, the TTL has no effect, since the new entries will have a new expiration time. We could eventually add a mechanism to explicitly flush the cache after a compact operation.
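To make the TTL behaviour concrete, here is a minimal sketch of a size-limited cache whose entries expire after a fixed time and get a fresh expiration whenever they are rewritten. The `TtlCache` class, its eviction policy, and the injectable clock are hypothetical and are not the cache implementation used by this PR.

```typescript
// Hypothetical size-limited cache with a per-entry TTL, for illustration only.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    // Delete first so the key is re-inserted at the newest position.
    this.entries.delete(key);
    // Every write gets a fresh expiration time.
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxSize) {
      // A Map iterates in insertion order, so the first key is the oldest write.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) {
        this.entries.delete(oldest);
      }
    }
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry so the caller recomputes from scratch.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// One-hour TTL, as described above; the size limit of 1000 is an arbitrary example value.
const checksumCache = new TtlCache<number>(1000, 60 * 60 * 1000);
```

An entry that is rewritten within the hour never expires in practice, which matches the note above that the TTL only matters for buckets whose cached checksum is not otherwise updated.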