PCSM-237: Fix intermittent test failures due to race condition in wait_for_current_optime() #156
Merged
inelpandzic merged 8 commits into main on Dec 2, 2025
…ry is replicated.
boris-ilijic approved these changes on Dec 2, 2025
We're waiting for Timestamp(S, 10) where S is the current second
should be Timestamp(S, 1)
For ts comparison it's possible to use: https://pkg.go.dev/go.mongodb.org/mongo-driver/bson/primitive#Timestamp
Problem
Tests would intermittently fail with assertions indicating that operations performed on the source were not replicated to the target before the test comparison ran.
Root cause: The wait_for_current_optime() method had two critical race conditions:

Race Condition 1: Clock Advancement with ping Command

The original implementation used the ping command to capture a cluster timestamp:
- ping doesn't ensure that the clusterTime it returns is after all previous write operations
- The ping command advances the cluster time view but doesn't guarantee that prior operations have been written to the oplog

BSON Timestamp comparison causes premature wait completion:
- Timestamp(seconds, increment) comparison prioritizes seconds over increment: if seconds differ, the increment is ignored
- We're waiting for Timestamp(S, 10) where S is the current second
- … Timestamp(S, 2)
- … Timestamp(S, 3-9) is pending
- … Timestamp(S+1, 1) (new second)
- Timestamp(S, 10) <= Timestamp(S+1, 1) evaluates to TRUE

Result: Test comparison runs before the collection creation operation is replicated.
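The ordering rule behind the premature wait completion can be illustrated with a small Python sketch. The Ts class here is a hypothetical stand-in that only models the (seconds, increment) ordering of a BSON Timestamp; it is not the driver's own type, and S is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Ts:
    """Hypothetical stand-in for a BSON Timestamp.

    With order=True the fields are compared in declaration order,
    so seconds decides first and increment only breaks ties.
    """
    seconds: int
    increment: int


def wait_is_satisfied(awaited: Ts, last_applied: Ts) -> bool:
    # Same shape as the check described above: awaited <= last_applied.
    return awaited <= last_applied


S = 1_700_000_000  # arbitrary example second

# Same second: the increment is compared, so the wait correctly continues.
print(wait_is_satisfied(Ts(S, 10), Ts(S, 2)))      # False

# A timestamp from the next second compares greater regardless of increment,
# so the wait ends even if it does not correspond to the awaited operation.
print(wait_is_satisfied(Ts(S, 10), Ts(S + 1, 1)))  # True
```

The second call is the situation described in the list: a timestamp from a new second satisfies the check on its own, without saying anything about the increments still outstanding in second S.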
Race Condition 2: Two-Phase Replication Lag
Even after fixing the timestamp issue, tests still failed intermittently because PCSM's replication has two phases:
- … lastReplicatedOpTime in status

When wait_for_current_optime() returned immediately after detecting curr_optime <= last_applied_op, Phase 1 was complete but Phase 2 might still be in progress. The test comparison would run before operations were fully written to the target database.

This was discovered when adding debug print statements accidentally fixed the tests: the print statements added ~100-200ms of delay (HTTP calls + string formatting + I/O), which gave Phase 2 enough time to complete.
Solution
Ensure clusterTime reflects all prior write ops:
- Use appendOplogNote to get clusterTime in wait_for_current_optime()

Add a post-wait delay to allow Phase 2 to complete.
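A minimal sketch of how the two changes might fit together, not the code from this PR. It assumes a PyMongo-style client whose admin.command() reply carries a "$clusterTime" document, as replies from a replica set normally do. The get_last_replicated callable, the note text, and the timeout, poll and settle values are all hypothetical.

```python
import time


def cluster_time_via_oplog_note(client):
    """Write a no-op oplog entry and return the cluster time from the reply.

    Because appendOplogNote is itself written to the oplog, the returned
    time is ordered after writes that were already in the oplog.
    Assumption: the reply contains "$clusterTime" (replica set deployment).
    """
    reply = client.admin.command(
        {"appendOplogNote": 1, "data": {"msg": "wait_for_current_optime"}}
    )
    return reply["$clusterTime"]["clusterTime"]


def wait_for_current_optime(client, get_last_replicated,
                            timeout=30.0, poll=0.1, settle=0.5):
    """Wait until replication reaches the current optime, then settle.

    get_last_replicated is a hypothetical callable returning the last
    replicated optime reported in status, or None if not yet available.
    Returns True on success and False if the timeout expires.
    """
    target = cluster_time_via_oplog_note(client)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        last = get_last_replicated()
        if last is not None and target <= last:
            # Post-wait delay: give the second phase time to finish
            # writing to the target before the test compares data.
            time.sleep(settle)
            return True
        time.sleep(poll)
    return False
```

A fixed settle delay is a pragmatic choice for tests; its value would need tuning against how long the second phase takes in practice.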
This fix eliminates both race conditions by ensuring proper operation ordering in the oplog and allowing sufficient time for operations to be fully applied to the target database, making tests deterministic.