@rkistner
Contributor

The issue

This is a regression in v1.13.0.

In #163, we changed the initial snapshot queries on MongoDB to chunk on _id, using {_id: {$gt: ...}} queries to skip earlier chunks. The issue is that when a collection uses different types for the _id field, the $gt query operator only matches documents whose _id has the same BSON type as the comparison value. As a result, in a large collection with, for example, both string and ObjectId _ids, only the documents with a string _id were replicated in the initial snapshot.

This would be visible in the logs: the collection finishes replicating with mismatched counts, such as:

Replicating "mydb"."mycollection" 12356/~23000
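
The effect described above can be reproduced without a database. The following is a minimal sketch that models only two _id types (strings and ObjectIds) and imitates the two comparison behaviours in plain TypeScript. Every name in it (Id, bracketedGt, totalOrderGt, snapshot) is hypothetical and chosen for illustration; none of this is the service's actual code, and real BSON ordering covers many more types.

```typescript
// Simplified model of two BSON _id types. ObjectIds are represented by their hex string.
type Id = string | { oid: string };

// Rank in the cross-type sort order: strings sort before ObjectIds.
const typeRank = (id: Id): number => (typeof id === 'string' ? 0 : 1);
const rawValue = (id: Id): string => (typeof id === 'string' ? id : id.oid);

// Total ordering: compare by type first, then by value within the type.
function compareIds(a: Id, b: Id): number {
  const byType = typeRank(a) - typeRank(b);
  if (byType !== 0) return byType;
  const av = rawValue(a);
  const bv = rawValue(b);
  return av < bv ? -1 : av > bv ? 1 : 0;
}

// Imitates {_id: {$gt: last}}: only ids of the same type as `last` can match.
function bracketedGt(ids: Id[], last: Id): Id[] {
  return ids.filter((id) => typeRank(id) === typeRank(last) && compareIds(id, last) > 0);
}

// Imitates {$expr: {$gt: ['$_id', {$literal: last}]}}: comparison across all types.
function totalOrderGt(ids: Id[], last: Id): Id[] {
  return ids.filter((id) => compareIds(id, last) > 0);
}

// Reads the collection in sorted chunks, resuming after the last id of each chunk.
function snapshot(ids: Id[], chunkSize: number, gt: (ids: Id[], last: Id) => Id[]): Id[] {
  const sorted = [...ids].sort(compareIds);
  const seen: Id[] = [];
  let remaining = sorted;
  while (remaining.length > 0) {
    const chunk = remaining.slice(0, chunkSize);
    seen.push(...chunk);
    remaining = gt(sorted, chunk[chunk.length - 1]);
  }
  return seen;
}

const ids: Id[] = [{ oid: 'aa01' }, 'a', 'b', { oid: 'aa02' }, 'c'];
const brokenCount = snapshot(ids, 2, bracketedGt).length; // ObjectId documents are never reached
const fixedCount = snapshot(ids, 2, totalOrderGt).length; // every document is visited
```

With a chunk size of 2, the same-type comparison stops after the string ids, while the total-order comparison continues into the ObjectId ids.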

The fix

This changes the snapshot query to { $expr: { $gt: ['$_id', { $literal: ... }] } }. The two forms are almost equivalent, but the $expr version respects the total ordering of BSON types (the same ordering the sort stage uses) and does not filter out documents of other types. It still uses the _id index.
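
As a rough illustration of the two filter shapes, here is a sketch of helper functions that build them. The function names chunkFilterBefore and chunkFilterAfter and the Filter type are hypothetical; the actual change lives in MongoSnapshotQuery.ts and may be structured differently.

```typescript
// A loose filter type for illustration; real code would use the driver's own types.
type Filter = Record<string, unknown>;

// Pre-fix shape: a query operator on _id, which only matches ids of the same BSON type.
function chunkFilterBefore(lastId: unknown): Filter {
  return { _id: { $gt: lastId } };
}

// Post-fix shape: an aggregation expression compared under the cross-type BSON ordering.
// $literal keeps the value from being parsed as an expression, for example a string
// id that starts with "$" would otherwise be read as a field path.
function chunkFilterAfter(lastId: unknown): Filter {
  return { $expr: { $gt: ['$_id', { $literal: lastId }] } };
}
```

Either filter would typically be passed to a find() sorted ascending on _id with a chunk-sized limit, with lastId taken from the final document of the previous chunk.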

@rkistner rkistner requested a review from Copilot June 30, 2025 12:32
@changeset-bot
changeset-bot bot commented Jun 30, 2025

🦋 Changeset detected

Latest commit: f6b1d0f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 11 packages
| Name | Type |
| --- | --- |
| @powersync/service-module-mongodb | Patch |
| @powersync/service-core | Patch |
| @powersync/service-image | Patch |
| @powersync/service-schema | Patch |
| @powersync/service-core-tests | Patch |
| @powersync/service-module-core | Patch |
| @powersync/service-module-mongodb-storage | Patch |
| @powersync/service-module-mysql | Patch |
| @powersync/service-module-postgres-storage | Patch |
| @powersync/service-module-postgres | Patch |
| test-client | Patch |

Copilot AI left a comment

Pull Request Overview

This pull request fixes MongoDB replication issues when collections contain mixed _id types by updating the snapshot query to use a $expr comparison. Key changes include:

  • Adding a new test case for chunked snapshots with mixed _id types.
  • Updating the snapshot query filter in MongoSnapshotQuery.ts to use $expr with $literal.
  • Minor comment improvements and changeset updates.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| modules/module-mongodb/test/src/chunked_snapshot.test.ts | Added a new test case to exercise mixed _id type behavior. |
| modules/module-mongodb/src/replication/MongoSnapshotQuery.ts | Updated the filter query to properly handle mixed _id types. |
| modules/module-mongodb/src/replication/ChangeStream.ts | Added a clarifying comment on when iteration stops. |
| .changeset/shy-pugs-teach.md | Updated changeset message to document the fix. |

@rkistner rkistner marked this pull request as ready for review June 30, 2025 12:48
@rkistner rkistner requested a review from stevensJourney June 30, 2025 12:48
@rkistner rkistner merged commit 3e7d629 into main Jun 30, 2025
21 checks passed
@rkistner rkistner deleted the fix-mongo-id-types branch June 30, 2025 13:46