
Conversation

@rkistner (Contributor) commented on Mar 3, 2025

In some cases, MongoDB initial replication ran at a rate as low as 150 documents/second, while we're aiming for 5-10k documents/second. After some investigation, the main bottleneck appears to be reading the documents from the source database, rather than writing to bucket storage.

I can't figure out why, but reading the documents using a MongoDB session appears to be the main culprit. Using the session, it could take as long as 10-20s to read 10k documents in some cases. This is not consistent, which makes it difficult to test. Without the session, that drops down to 0.5-1s for the same 10k documents. The latter is fairly consistent.

This PR changes the query to no longer use a session. We initially used a session for snapshot reads, but that is no longer required, so there is no good reason to keep using sessions here.
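
A minimal sketch of the shape of this change, using the MongoDB Node.js driver (illustrative only, not the actual PR diff; `snapshotCollection` and the empty filter are placeholders):

```ts
import { MongoClient } from 'mongodb';

async function snapshotCollection(client: MongoClient, dbName: string, collName: string) {
  const collection = client.db(dbName).collection(collName);

  // Before (simplified): the snapshot query ran inside an explicit session.
  // const session = client.startSession();
  // const cursor = collection.find({}, { session });

  // After (simplified): a plain cursor with no session attached.
  const cursor = collection.find({});

  for await (const doc of cursor) {
    // ... process doc (e.g. save to bucket storage)
  }
}
```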

The performance both before and after is fairly inconsistent, and I could not find clear patterns. Even with the changes here, replication is slow in some cases. However, the median performance does appear to be better with these changes.

Other smaller changes:

  1. Use readBufferedDocuments() to read documents in batches, rather than an async iterator (see the sketch after this list).
  2. Increase the read batch size from the default 1000 to 6000. This has no effect if the cursor is limited due to the size of the documents, so does not increase memory in the worst case scenario. This improves read latency in some cases.
  3. Log the duration of each read batch (up to 6k documents), as well as the flushing of each PersistedBatch (up to 2k documents).
  4. Filter out empty "resumeBatch" batches to avoid the empty "Flushed 0 + 0 + 0 updates" messages after each actual batch.
  5. Start reading the next batch while processing the current batch.
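
To make the batching, prefetching, and timing changes above more concrete, here is a hypothetical sketch using a plain MongoDB Node.js driver cursor; `readInBatches()`, `processBatch()`, and the database/collection names are placeholders, not the actual replication code:

```ts
import { MongoClient, type Document } from 'mongodb';

async function readInBatches(
  client: MongoClient,
  processBatch: (docs: Document[]) => Promise<void>
) {
  const cursor = client
    .db('example_db') // placeholder database name
    .collection('example') // placeholder collection name
    .find({}, { batchSize: 6_000 }); // larger batches reduce getMore round trips

  // hasNext() triggers the next getMore and fills the driver's internal buffer;
  // readBufferedDocuments() then drains that buffer without further network calls.
  let hasMore = await cursor.hasNext();
  while (hasMore) {
    const docs = cursor.readBufferedDocuments();

    // Start fetching the next batch before processing this one, so the network
    // read overlaps with the processing work.
    const nextBatch = cursor.hasNext();

    const start = Date.now();
    await processBatch(docs);
    console.log(`Processed ${docs.length} documents in ${Date.now() - start}ms`);

    hasMore = await nextBatch;
  }
}
```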

@rkistner requested a review from stevensJourney on March 3, 2025 at 10:21
changeset-bot commented on Mar 3, 2025

🦋 Changeset detected

Latest commit: 5695410

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 9 packages
| Name                                        | Type  |
| ------------------------------------------- | ----- |
| @powersync/service-module-mongodb-storage   | Patch |
| @powersync/service-module-mongodb           | Patch |
| @powersync/service-module-mysql             | Patch |
| @powersync/service-module-postgres          | Patch |
| @powersync/service-image                    | Patch |
| @powersync/service-core                     | Patch |
| @powersync/service-core-tests               | Patch |
| @powersync/service-module-postgres-storage  | Patch |
| test-client                                 | Patch |


@rkistner marked this pull request as ready for review on March 3, 2025 at 10:23

@stevensJourney previously approved these changes on Mar 3, 2025
@rkistner (Contributor, Author) commented on Mar 3, 2025

Update: The changes do not appear to completely solve the read issue in all cases; I still get slightly inconsistent results when testing. However, in my main test case, the time to replicate 100k documents is reduced from over 2 minutes (800 docs/s) to around 22s (4,500 docs/s). This is much more in line with the performance we expect, and similar to what we get for Postgres.

At this point, replication becomes CPU-bound in the replication process. We can eventually implement concurrent replication threads to improve replication performance further.

@rkistner requested a review from stevensJourney on March 3, 2025 at 12:17

@rkistner merged commit 0dd746a into main on Mar 3, 2025 (21 checks passed)

@rkistner deleted the mongo-initial-replication-performance branch on March 3, 2025 at 12:52