From 58c6b2a637767bb327f570cec9d33344e716fbe7 Mon Sep 17 00:00:00 2001 From: Shane Harvey Date: Thu, 21 Nov 2024 15:29:13 -0800 Subject: [PATCH 1/3] DRIVERS-2798 Drivers do not gossip $clusterTime on SDAM commands --- source/sessions/driver-sessions.md | 20 ++++++++++++++++---- source/sessions/tests/README.md | 14 ++++++++++++++ 2 files changed, 30 insertions(+), 4 deletions(-) diff --git a/source/sessions/driver-sessions.md b/source/sessions/driver-sessions.md index 0494f25d69..17eae4b8e4 100644 --- a/source/sessions/driver-sessions.md +++ b/source/sessions/driver-sessions.md @@ -710,7 +710,8 @@ expiration. ## Gossipping the cluster time -Drivers MUST gossip the cluster time when connected to a deployment that uses cluster times. +Drivers MUST gossip the cluster time when connected to a deployment that uses cluster times. However, drivers MUST NOT +gossip cluster time on SDAM commands. Gossipping the cluster time is a process in which the driver participates in distributing the logical cluster time in a deployment. Drivers learn the current cluster time (from a particular server's perspective) in responses they receive @@ -722,7 +723,7 @@ received from a server. ### Receiving the current cluster time -Drivers MUST examine all responses from the server commands to see if they contain a top level field named +Drivers MUST examine all non-SDAM responses from the server commands to see if they contain a top level field named `$clusterTime` formatted as follows: ```typescript @@ -748,8 +749,9 @@ not participate in the comparison. ### Sending the highest seen cluster time -Whenever a driver sends a command to a server it MUST include the highest seen cluster time in a top level field called -`$clusterTime`, in the same format as it was received in (but see Gossipping with mixed server versions below). +Whenever a driver sends a non-SDAM command to a server it MUST include the highest seen cluster time in a top level +field called `$clusterTime`, in the same format as it was received in (but see Gossipping with mixed server versions +below). ### How to compute the `$clusterTime` to send to a server @@ -791,6 +793,15 @@ the command. A server supports `$clusterTime` when the `maxWireVersion` >= 6. This supports the (presumably short lived) scenario where not all servers have been upgraded to 3.6. +### Do not gossip on SDAM commands + +Drivers MUST NOT gossip cluster time on SDAM commands. In earlier versions of this spec, drivers did gossip cluster time +on SDAM commands, however it was discovered that doing so provides little benefit and can result in a loss of +availability. For example, if the driver attempts to connect to a member of a different replica set, the driver can end +up with an invalid cluster time. Worse, this invalid cluster time may remain the highest for an indefinite time since +clocks between different clusters are not synced. This results in all operations failing until the application is +restarted. To fix this issue we decided it was more robust to stop gossiping on SDAM commands altogether. + ## Test Plan See the [README](tests/README.md) for tests. @@ -918,6 +929,7 @@ has risks that do not justify the potential guaranteed `ServerSession` allocatio ## Changelog +- 2024-11-21: Drivers MUST NOT gossip $clusterTime on SDAM commands. - 2024-05-08: Migrated from reStructuredText to Markdown. - 2017-09-13: If causalConsistency option is omitted assume true - 2017-09-16: Omit session ID when opening and authenticating a connection diff --git a/source/sessions/tests/README.md b/source/sessions/tests/README.md index c7d2495789..47ba60477e 100644 --- a/source/sessions/tests/README.md +++ b/source/sessions/tests/README.md @@ -238,6 +238,20 @@ and configure a `MongoClient` with default options. - Attempt to send a write command to the server (e.g., `insertOne`) with the explicit session passed in - Assert that a client-side error is generated indicating that sessions are not supported +### 20. Drivers do not gossip `$clusterTime` on SDAM commands. + +- Skip this test when connected to a deployment that does not support cluster times +- Create a client, C1, with a small heartbeatFrequencyMS + - `c1 = MongoClient(heartbeatFrequencyMS=10)` +- Run a ping command using C1 and record the `$clusterTime` in the response, as `clusterTime`. + - `clusterTime = c1.admin.command({"ping": 1})["$clusterTime"]` +- Using a separate client, C2, run an insert to advance the cluster time + - `c2.test.test.insert_one({"advance": "$clusterTime"})` +- Next, wait until the client C1 processes at least 1 SDAM heartbeat + - If possible, assert the SDAM heartbeats do not send `$clusterTime` +- Run a ping command using C1 and assert that `$clusterTime` sent is the same as the `clusterTime` recorded earlier. + This assertion proves that C1's `$clusterTime` was not advanced by gossiping through SDAM. + ## Changelog - 2024-05-08: Migrated from reStructuredText to Markdown. From 037f3b6d4bf881c97855e1dd559001369a6187ba Mon Sep 17 00:00:00 2001 From: Shane Harvey Date: Mon, 25 Nov 2024 16:09:20 -0800 Subject: [PATCH 2/3] DRIVERS-2798 Add SDAM to terms --- source/sessions/driver-sessions.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/source/sessions/driver-sessions.md b/source/sessions/driver-sessions.md index 17eae4b8e4..32fae390c4 100644 --- a/source/sessions/driver-sessions.md +++ b/source/sessions/driver-sessions.md @@ -90,6 +90,11 @@ with sessions. Any network exception writing to or reading from a socket (e.g. a socket timeout or error). +**SDAM** + +An abbreviated form of "Server Discovery and Monitoring", specification defined in +[Server Discovery and Monitoring Specification](../server-discovery-and-monitoring/server-discovery-and-monitoring.md). + ## Specification Drivers currently have no concept of a session. The driver API will be expanded to provide a way for applications to From edf512655bca83dc3ec87f73a11f786c5eb123b5 Mon Sep 17 00:00:00 2001 From: Shane Harvey Date: Thu, 23 Jan 2025 11:42:10 -0800 Subject: [PATCH 3/3] DRIVERS-2798 Fix issues in prose test discovered in python implementation --- source/sessions/driver-sessions.md | 2 +- source/sessions/tests/README.md | 11 +++++------ 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/source/sessions/driver-sessions.md b/source/sessions/driver-sessions.md index 32fae390c4..556ee721d3 100644 --- a/source/sessions/driver-sessions.md +++ b/source/sessions/driver-sessions.md @@ -934,7 +934,7 @@ has risks that do not justify the potential guaranteed `ServerSession` allocatio ## Changelog -- 2024-11-21: Drivers MUST NOT gossip $clusterTime on SDAM commands. +- 2025-02-24: Drivers MUST NOT gossip $clusterTime on SDAM commands. - 2024-05-08: Migrated from reStructuredText to Markdown. - 2017-09-13: If causalConsistency option is omitted assume true - 2017-09-16: Omit session ID when opening and authenticating a connection diff --git a/source/sessions/tests/README.md b/source/sessions/tests/README.md index 47ba60477e..8d817a59fd 100644 --- a/source/sessions/tests/README.md +++ b/source/sessions/tests/README.md @@ -49,7 +49,6 @@ This test applies to drivers with session pools. ### 3. `$clusterTime` in commands -- Turn `heartbeatFrequencyMS` up to a very large number. - Register a command-started and a command-succeeded APM listener. If the driver has no APM support, inspect commands/replies in another idiomatic way, such as monkey-patching or a mock server. - Send a `ping` command to the server with the generic `runCommand` method. @@ -58,8 +57,7 @@ This test applies to drivers with session pools. - Record the `$clusterTime`, if any, in the reply passed to the command-succeeded APM listener. - Send another `ping` command. - Assert that `$clusterTime` in the command passed to the command-started listener, if any, equals the `$clusterTime` in - the previous server reply. (Turning `heartbeatFrequencyMS` up prevents an intervening heartbeat from advancing the - `$clusterTime` between these final two steps.) + the previous server reply. Repeat the above for: @@ -241,19 +239,20 @@ and configure a `MongoClient` with default options. ### 20. Drivers do not gossip `$clusterTime` on SDAM commands. - Skip this test when connected to a deployment that does not support cluster times -- Create a client, C1, with a small heartbeatFrequencyMS - - `c1 = MongoClient(heartbeatFrequencyMS=10)` +- Create a client, C1, directly connected to a writable server and a small heartbeatFrequencyMS + - `c1 = MongoClient(directConnection=True, heartbeatFrequencyMS=10)` - Run a ping command using C1 and record the `$clusterTime` in the response, as `clusterTime`. - `clusterTime = c1.admin.command({"ping": 1})["$clusterTime"]` - Using a separate client, C2, run an insert to advance the cluster time - `c2.test.test.insert_one({"advance": "$clusterTime"})` -- Next, wait until the client C1 processes at least 1 SDAM heartbeat +- Next, wait until the client C1 processes the next pair of SDAM heartbeat started + succeeded events. - If possible, assert the SDAM heartbeats do not send `$clusterTime` - Run a ping command using C1 and assert that `$clusterTime` sent is the same as the `clusterTime` recorded earlier. This assertion proves that C1's `$clusterTime` was not advanced by gossiping through SDAM. ## Changelog +- 2025-02-24: Test drivers do not gossip $clusterTime on SDAM. - 2024-05-08: Migrated from reStructuredText to Markdown. - 2019-05-15: Initial version. - 2021-06-15: Added snapshot-session tests. Introduced legacy and unified folders.