Fix checksum calculations in large buckets with > 4m rows #282
The bug
As background, we calculate an unsigned 32-bit checksum for each synced operation. For checkpoints, we aggregate these by summing them and discarding any higher bits, i.e. summing modulo 2^32. As a side note, the protocol is a little inconsistent in whether these are returned as signed or unsigned integers, so the clients handle both.
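For reference, the aggregation itself is straightforward; a minimal sketch of the idea (the function names here are illustrative, not the actual service code):

```ts
// Illustrative sketch only; these helpers are hypothetical, not the service code.

// A checksum may arrive as a signed or an unsigned 32-bit integer; normalize to unsigned.
function normalizeChecksum(value: number): number {
  return value >>> 0; // reinterpret as unsigned 32-bit
}

// Aggregate per-operation checksums for a checkpoint: sum modulo 2^32.
function aggregateChecksums(checksums: number[]): number {
  let total = 0;
  for (const checksum of checksums) {
    total = (total + normalizeChecksum(checksum)) >>> 0; // discard bits above 32
  }
  return total;
}
```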
For MongoDB storage, we stored these u32 numbers directly. Internally, MongoDB could store these as:
- int32, if the value fits in a signed 32-bit integer
- double, if it does not
What happened is that numbers that don't fit into a signed int32 were converted to double, which was the case for roughly half of the rows.
When aggregating, MongoDB then often has to sum an int32 and a double, so it converts both to double for the sum.
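This behavior can be observed directly with js-bson, the serializer used by the Node.js MongoDB driver; a standalone check with default serializer settings (not code from this PR):

```ts
// Standalone demonstration of js-bson's default number handling; not code from this PR.
import { serialize, deserialize } from 'bson';

// promoteValues: false returns BSON wrapper types instead of plain JS numbers,
// so we can see which BSON type the serializer picked.
const fits = deserialize(serialize({ checksum: 2 ** 31 - 1 }), { promoteValues: false });
const tooLarge = deserialize(serialize({ checksum: 2 ** 31 }), { promoteValues: false });

console.log(fits.checksum.constructor.name);     // 'Int32'  - fits in a signed int32
console.log(tooLarge.checksum.constructor.name); // 'Double' - does not fit, stored as double
```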
In most cases, this is fine: up to around 4 million* operations in a single bucket, the double addition is still exact. For buckets with more than 4 million operations, however, the checksum calculation can be inaccurate - typically off by 1. The result is that the client repeatedly gets checksum failures when syncing the bucket.
The same issue does not occur when the data is spread over multiple buckets - it only affects individual buckets containing more than 4 million operations.
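The rounding error is easy to reproduce outside MongoDB by comparing a float64 sum with an exact BigInt sum; a standalone sketch, not taken from this PR:

```ts
// Standalone demonstration of the float64 rounding problem; not code from this PR.
const OPERATIONS = 5_000_000; // more than the ~4 million threshold

let doubleSum = 0;  // what a double-based sum produces
let exactSum = 0n;  // exact reference using BigInt

for (let i = 0; i < OPERATIONS; i++) {
  // Checksum-like values spread over the full unsigned 32-bit range.
  const checksum = Math.floor(Math.random() * 2 ** 32);
  doubleSum += checksum;
  exactSum += BigInt(checksum);
}

// Reduce both modulo 2^32, as the checkpoint checksum does.
const doubleChecksum = doubleSum % 2 ** 32;
const exactChecksum = Number(exactSum & 0xffffffffn);

// With > ~4 million operations the double sum exceeds 2^53 and starts rounding,
// so the two results frequently differ (typically by a small amount).
console.log(doubleChecksum, exactChecksum, doubleChecksum === exactChecksum);
```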
The fix
This fixes the issue on two levels:
In theory, we could convert the numbers from unsigned to signed int32 when storing, instead of using a long. However:
We could still investigate that option in the future.
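For illustration, one way to keep the aggregation exact regardless of how existing values were stored is to convert to a 64-bit integer inside the pipeline with $toLong. This is a rough sketch of that idea, not necessarily the exact change in this PR; the database, collection, and field names (powersync, bucket_data, bucket, checksum) are assumptions:

```ts
// Rough sketch of the aggregation idea; names are assumptions, not the PR's code.
import { MongoClient, Long } from 'mongodb';

async function bucketChecksum(client: MongoClient, bucket: string): Promise<number> {
  const results = await client
    .db('powersync')
    .collection('bucket_data')
    .aggregate([
      { $match: { bucket } },
      {
        $group: {
          _id: '$bucket',
          // $toLong makes the sum use 64-bit integer arithmetic even when some
          // stored values are int32 and others are double, avoiding float64 rounding.
          checksum: { $sum: { $toLong: '$checksum' } },
        },
      },
    ])
    .toArray();

  const total = results[0]?.checksum as Long | number | undefined;
  if (total == null) {
    return 0;
  }
  // Reduce to an unsigned 32-bit checksum, discarding higher bits.
  const asBigInt = typeof total === 'number' ? BigInt(total) : BigInt(total.toString());
  return Number(asBigInt & 0xffffffffn);
}
```

Converting inside the pipeline works with whatever representation existing documents already use, so it does not depend on rewriting stored data.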
Why 4 million?
Integers larger than 2^53 cannot be safely stored in a double/float8 without losing precision. Since individual checksums go up to 2^32, that gives an upper bound of 2^53 / 2^32 = 2^21 = 2,097,152 operations per bucket for which the checksum calculation is always exact.
In practice, checksums have a fairly even spread, averaging around 2^31. That doubles the practical limit to 2^53 / 2^31 = 2^22 = 4,194,304, a little over 4 million operations per bucket.
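The same bounds can be checked with a few lines of arithmetic (nothing below is from the PR):

```ts
// Worked arithmetic for the limits above; not code from the PR.
const MAX_SAFE = 2 ** 53;        // largest magnitude where float64 still has integer precision
const MAX_CHECKSUM = 2 ** 32;    // checksums are unsigned 32-bit values
const AVG_CHECKSUM = 2 ** 31;    // typical average with an even spread

console.log(MAX_SAFE / MAX_CHECKSUM); // 2097152 -> guaranteed-safe operation count
console.log(MAX_SAFE / AVG_CHECKSUM); // 4194304 -> practical limit, ~4.2 million
console.log(Number.isSafeInteger(2 ** 53 - 1)); // true
console.log(Number.isSafeInteger(2 ** 53));     // false: precision loss starts here
```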