Merge pull request #158 from FGasper/felipe_reduce_change_stream_recheck_dupes
Migration Verifier listens for changes on both the source and destination clusters and enqueues rechecks for both in the same collection. Any change that hits the source also hits the destination, so for every change on the source we expect to see a duplicate change on the destination.
Until now we have handled this via an insert that tolerates duplicate keys. The server’s duplicate-key path, though, is quite slow. It’s much faster to write both documents and then deduplicate them when reading.
This changeset makes that change. Rechecks triggered by source changes are no longer document-level duplicates of destination-triggered rechecks because the `_id` now contains a `rand` field, set to a random int32, that distinguishes them. When we convert those into recheck tasks, we project the `_id.rand` field out so that it’s easy to deduplicate them.
This also avoids duplicate-key errors in the “hot documents” case.
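To illustrate the idea, here is a minimal in-memory sketch in Go. It is not the code from this changeset: the type and function names (`recheckKey`, `recheckID`, `newRecheckID`, `dedupe`) and the key fields are hypothetical, and a slice stands in for the recheck collection. Each enqueued `_id` carries a random int32, so source- and destination-triggered rechecks never collide on write. Dropping that field on read makes duplicates compare equal.

```go
package main

import (
	"fmt"
	"math/rand"
)

// recheckKey identifies the document to recheck.
// Hypothetical fields; the real schema may differ.
type recheckKey struct {
	DB    string
	Coll  string
	DocID string
}

// recheckID models the enqueued document's _id: the key plus a
// random int32 that keeps otherwise-identical rechecks distinct.
type recheckID struct {
	recheckKey
	Rand int32
}

// newRecheckID builds an _id for one enqueued recheck.
func newRecheckID(key recheckKey) recheckID {
	return recheckID{recheckKey: key, Rand: rand.Int31()}
}

// dedupe drops the Rand field from each _id and returns each
// distinct key once, in first-seen order.
func dedupe(ids []recheckID) []recheckKey {
	seen := make(map[recheckKey]struct{}, len(ids))
	out := make([]recheckKey, 0, len(ids))
	for _, id := range ids {
		if _, ok := seen[id.recheckKey]; ok {
			continue
		}
		seen[id.recheckKey] = struct{}{}
		out = append(out, id.recheckKey)
	}
	return out
}

func main() {
	doc1 := recheckKey{DB: "db", Coll: "coll", DocID: "doc1"}
	doc2 := recheckKey{DB: "db", Coll: "coll", DocID: "doc2"}

	// The same change is seen once on the source and once on the
	// destination, so doc1 is enqueued twice.
	queue := []recheckID{
		newRecheckID(doc1), // source-triggered
		newRecheckID(doc1), // destination-triggered
		newRecheckID(doc2),
	}

	tasks := dedupe(queue)
	fmt.Printf("%d enqueued, %d tasks\n", len(queue), len(tasks))
}
```

In the real verifier the deduplication happens when rechecks are read back and converted into tasks. The map-based pass above only shows why projecting out the random field is enough to make duplicates comparable.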