Fix checksum calculations in large buckets with > 4m rows #282
The bug
As background, we calculate an unsigned 32-bit checksum for each synced operation. For checkpoints, we aggregate these by summing them and discarding any higher bits, i.e. summing modulo 2^32. As a side note, the protocol is a little inconsistent in whether these are returned as signed or unsigned integers, so the clients handle both.
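For reference, the aggregation itself is straightforward; a minimal sketch of the idea (the function names here are illustrative, not the actual service code):

```ts
// Illustrative sketch only; these helpers are hypothetical, not the service code.

// A checksum may arrive as a signed or an unsigned 32-bit integer; normalize to unsigned.
function normalizeChecksum(value: number): number {
  return value >>> 0; // reinterpret as unsigned 32-bit
}

// Aggregate per-operation checksums for a checkpoint: sum modulo 2^32.
function aggregateChecksums(checksums: number[]): number {
  let total = 0;
  for (const checksum of checksums) {
    total = (total + normalizeChecksum(checksum)) >>> 0; // discard bits above 32
  }
  return total;
}
```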
For MongoDB storage, we stored these u32 numbers directly. Internally, MongoDB could store these as:
- int32, if the value fits in a signed 32-bit integer
- double, if it does not
What happened is that numbers that don't fit into a signed int32 were converted to double, which was the case for roughly half of the rows.
When aggregating, MongoDB then often has to sum an int32 and a double, so it converts both to double for the sum.
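This behavior can be observed directly with js-bson, the serializer used by the Node.js MongoDB driver; a standalone check with default serializer settings (not code from this PR):

```ts
// Standalone demonstration of js-bson's default number handling; not code from this PR.
import { serialize, deserialize } from 'bson';

// promoteValues: false returns BSON wrapper types instead of plain JS numbers,
// so we can see which BSON type the serializer picked.
const fits = deserialize(serialize({ checksum: 2 ** 31 - 1 }), { promoteValues: false });
const tooLarge = deserialize(serialize({ checksum: 2 ** 31 }), { promoteValues: false });

console.log(fits.checksum.constructor.name);     // 'Int32'  - fits in a signed int32
console.log(tooLarge.checksum.constructor.name); // 'Double' - does not fit, stored as double
```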
In most cases, this is fine: up to around 4 million* operations in a single bucket, the double addition is still exact. For buckets with more than 4 million operations, however, the checksum calculation can be inaccurate - typically off by 1. The result is that the client repeatedly gets checksum failures when syncing the bucket.
The same issue does not occur when the data is spread over multiple buckets - it only affects individual buckets containing more than 4 million operations.
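The rounding error is easy to reproduce outside MongoDB by comparing a float64 sum with an exact BigInt sum; a standalone sketch, not taken from this PR:

```ts
// Standalone demonstration of the float64 rounding problem; not code from this PR.
const OPERATIONS = 5_000_000; // more than the ~4 million threshold

let doubleSum = 0;  // what a double-based sum produces
let exactSum = 0n;  // exact reference using BigInt

for (let i = 0; i < OPERATIONS; i++) {
  // Checksum-like values spread over the full unsigned 32-bit range.
  const checksum = Math.floor(Math.random() * 2 ** 32);
  doubleSum += checksum;
  exactSum += BigInt(checksum);
}

// Reduce both modulo 2^32, as the checkpoint checksum does.
const doubleChecksum = doubleSum % 2 ** 32;
const exactChecksum = Number(exactSum & 0xffffffffn);

// With > ~4 million operations the double sum exceeds 2^53 and starts rounding,
// so the two results frequently differ (typically by a small amount).
console.log(doubleChecksum, exactChecksum, doubleChecksum === exactChecksum);
```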
The fix
This fixes the issue on two levels:
In theory, we could convert the numbers from unsigned to signed int32 when storing, instead of using a long. However:
We could still investigate that option in the future.
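For illustration, one way to keep the aggregation exact regardless of how existing values were stored is to convert to a 64-bit integer inside the pipeline with $toLong. This is a rough sketch of that idea, not necessarily the exact change in this PR; the database, collection, and field names (powersync, bucket_data, bucket, checksum) are assumptions:

```ts
// Rough sketch of the aggregation idea; names are assumptions, not the PR's code.
import { MongoClient, Long } from 'mongodb';

async function bucketChecksum(client: MongoClient, bucket: string): Promise<number> {
  const results = await client
    .db('powersync')
    .collection('bucket_data')
    .aggregate([
      { $match: { bucket } },
      {
        $group: {
          _id: '$bucket',
          // $toLong makes the sum use 64-bit integer arithmetic even when some
          // stored values are int32 and others are double, avoiding float64 rounding.
          checksum: { $sum: { $toLong: '$checksum' } },
        },
      },
    ])
    .toArray();

  const total = results[0]?.checksum as Long | number | undefined;
  if (total == null) {
    return 0;
  }
  // Reduce to an unsigned 32-bit checksum, discarding higher bits.
  const asBigInt = typeof total === 'number' ? BigInt(total) : BigInt(total.toString());
  return Number(asBigInt & 0xffffffffn);
}
```

Converting inside the pipeline works with whatever representation existing documents already use, so it does not depend on rewriting stored data.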
Why 4 million?
Integers larger than 2^53 cannot be safely stored in a double/float8 without losing precision. Since individual checksums go up to 2^32, that gives an upper bound of 2^53 / 2^32 = 2^21 = 2,097,152 operations per bucket for which the checksum calculation is always exact.
In practice, checksums have a fairly even spread, averaging around 2^31. That doubles the practical limit to 2^53 / 2^31 = 2^22 = 4,194,304, a little over 4 million operations per bucket.
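The same bounds can be checked with a few lines of arithmetic (nothing below is from the PR):

```ts
// Worked arithmetic for the limits above; not code from the PR.
const MAX_SAFE = 2 ** 53;        // largest magnitude where float64 still has integer precision
const MAX_CHECKSUM = 2 ** 32;    // checksums are unsigned 32-bit values
const AVG_CHECKSUM = 2 ** 31;    // typical average with an even spread

console.log(MAX_SAFE / MAX_CHECKSUM); // 2097152 -> guaranteed-safe operation count
console.log(MAX_SAFE / AVG_CHECKSUM); // 4194304 -> practical limit, ~4.2 million
console.log(Number.isSafeInteger(2 ** 53 - 1)); // true
console.log(Number.isSafeInteger(2 ** 53));     // false: precision loss starts here
```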