Add support for server selection's deprioritized servers to all topologies. by vbabanin · Pull Request #1860 · mongodb/mongo-java-driver

vbabanin · 2026-01-12T21:06:22Z

Relevant specification changes:

DRIVERS-3344 - Add support for server selection's deprioritized servers to all topologies (#1865)
DRIVERS-3380 Filter deprioritized candidates by address only (#1886) - the change has not yet been addressed in this PR, because it was introduced after creation of the PR.
DRIVERS-3404 - Server selection deprioritization only for overload errors on replica sets (#1900) - the change has not yet been addressed in this PR, because it was introduced after creation of the PR. Addressing it requires us to restore the condition we had previously, but in a changed form, and either introduce it in a different place or attach additional information to a deprioritized server address.
DRIVERS-3406 test deprioritized selection with tag sets (#1903) - the change has not yet been addressed in this PR, because it was introduced after creation of the PR.

JAVA-6021,
JAVA-6074,
JAVA-6105,
JAVA-6114

stIncMale · 2026-01-12T22:34:56Z

driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java


-    protected void updateDescription(final ClusterDescription newDescription) {
+    @VisibleForTesting(otherwise = PROTECTED)
+    public void updateDescription(final ClusterDescription newDescription) {


It seems we can avoid making this method public by overriding it in the subclass. We won't even need to make it public in the subclass, because the test code using the method will be in the same package as the subclass. However, this requires the subclass to not be anonymous (if we could use var, the proposed approach would have worked even for an anonymous subclass).
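A minimal sketch of the suggestion above, using hypothetical stand-in classes (`SketchBaseCluster`, `SketchTestCluster`) rather than the driver's real types. In the real layout the base class and the test code live in different packages; here everything sits in one file, so the package boundary is described only in comments.

```java
// Hypothetical stand-in for BaseCluster: the method stays protected.
abstract class SketchBaseCluster {
    private String description = "initial";

    protected void updateDescription(final String newDescription) {
        this.description = newDescription;
    }

    String getDescription() {
        return description;
    }
}

// Named (non-anonymous) subclass, assumed to live in the same package as the test.
// The override re-declares the method in this class, so code in the subclass's
// package can call it through a SketchTestCluster reference while the base
// method remains protected.
final class SketchTestCluster extends SketchBaseCluster {
    @Override
    protected void updateDescription(final String newDescription) {
        super.updateDescription(newDescription);
    }
}

final class SketchClusterUsage {
    // Plays the role of the test code calling the overridden method.
    static String updateAndRead(final String newDescription) {
        SketchTestCluster cluster = new SketchTestCluster();
        cluster.updateDescription(newDescription);
        return cluster.getDescription();
    }
}
```

The variable must be declared with the subclass type for this to work, which is why the comment notes that an anonymous subclass would not do without `var`.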

vbabanin · 2026-01-17T21:27:13Z

driver-core/src/test/unit/com/mongodb/connection/ServerSelectionSelectionTest.java

+        } catch (MongoTimeoutException mongoTimeoutException) {
+            List<ServerDescription> inLatencyWindowServers = buildServerDescriptions(definition.getArray("in_latency_window"));
+            assertTrue("Expected emtpy but was " + inLatencyWindowServers.size() + " " + definition.toJson(
+                    JsonWriterSettings.builder()
+                            .indent(true).build()), inLatencyWindowServers.isEmpty());
+            return;


Since we now perform the full server-selection path (via BaseCluster), the behavior of "no servers selected" is observed differently than before.

Previously, these tests invoked the Selector directly, got an empty list, and asserted on that result. With BaseCluster, server selection runs the normal selection loop: it will retry until either a server becomes selectable or the selection timeout elapses.

In this setup, a server selection timeout is the expected signal that no servers are available/eligible for selection. The timeout is set to 200ms to keep the test fast, while giving enough headroom to avoid any flakiness.
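The behavior described above can be sketched roughly as follows. This is not the driver's selection loop: `SelectionLoopSketch`, `SketchSelectionTimeoutException`, and the string addresses are hypothetical stand-ins, and the 50 ms timeout is chosen only to keep the example short.

```java
import java.util.List;
import java.util.concurrent.locks.LockSupport;
import java.util.function.Supplier;

// Hypothetical stand-in for MongoTimeoutException.
final class SketchSelectionTimeoutException extends RuntimeException {
    SketchSelectionTimeoutException(final String message) {
        super(message);
    }
}

final class SelectionLoopSketch {
    // Retries until a candidate appears or the deadline passes.
    static String selectOrTimeout(final Supplier<List<String>> candidates, final long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            List<String> current = candidates.get();
            if (!current.isEmpty()) {
                return current.get(0);
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new SketchSelectionTimeoutException("no selectable server within " + timeoutMillis + " ms");
            }
            LockSupport.parkNanos(1_000_000L); // brief pause before the next attempt
        }
    }

    // Mirrors the shape of the test: a timeout passes only when no server was expected.
    static boolean timedOutAsExpected(final List<String> available, final List<String> expectedInLatencyWindow) {
        try {
            selectOrTimeout(() -> available, 50);
            return false;
        } catch (SketchSelectionTimeoutException e) {
            return expectedInLatencyWindow.isEmpty();
        }
    }
}
```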

vbabanin · 2026-01-17T21:41:57Z

driver-core/src/test/unit/com/mongodb/connection/ServerSelectionSelectionTest.java

-        List<ServerDescription> latencyBasedSelectedServers = latencyBasedServerSelector.select(clusterDescription);
-        assertServers(latencyBasedSelectedServers, inLatencyWindowServers);
+        assertNotNull(serverTuple);
+        assertTrue(inLatencyWindowServers.stream().anyMatch(s -> s.getAddress().equals(serverTuple.getServerDescription().getAddress())));


There’s an ambiguity in the server selection spec docs about what drivers must assert for in_latency_window:

tests/README.md says to verify the set of servers in in_latency_window.

"Drivers implementing server selection MUST test that their implementation correctly returns the set of servers in in_latency_window."

server-selection-tests.md says to verify the selection returns one of the servers in in_latency_window.

"Drivers implementing server selection MUST test that their implementations correctly return one of the servers in in_latency_window."

This test follows server-selection-tests.md, so asserting that the selected server is within the expected in_latency_window set is consistent with the requirements in the spec.

P.S. Both files were created in this single commit with contradictory requirements:

Feb 6, 2015 - Commit 6b63123a - "Add Server Selection Spec"

File 1: server-selection-tests.rst
Drivers implementing server selection MUST test that their implementations correctly return one of the servers in ``in_latency_window``.

File 2: tests/README.rst
Drivers implementing server selection MUST test that their implementations correctly return the set of servers in ``in_latency_window``.
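To make the difference between the two readings concrete, here is a small sketch with hypothetical helper names and plain string addresses in place of `ServerDescription`. It only contrasts the two assertion shapes and is not taken from the test code in this PR.

```java
import java.util.HashSet;
import java.util.List;

final class LatencyWindowAssertionsSketch {
    // tests/README reading: the selector's full result must equal the expected set.
    static boolean returnsExpectedSet(final List<String> selected, final List<String> inLatencyWindow) {
        return new HashSet<>(selected).equals(new HashSet<>(inLatencyWindow));
    }

    // server-selection-tests reading: the single chosen server must be a member of the expected set.
    static boolean returnsOneOfExpected(final String selected, final List<String> inLatencyWindow) {
        return inLatencyWindow.contains(selected);
    }
}
```

The second form is the only one available once selection goes through the full path and yields a single server instead of a list.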

stIncMale

The last reviewed commit is f20b482.

The files I haven't yet reviewed:

ServerDeprioritizationTest
ServerSelectionSelectionTest

@vbabanin I am proposing to resolve all the outstanding comments before proceeding with the changes required by DRIVERS-3404 (for details, see the description of this PR).

stIncMale · 2026-02-10T08:28:16Z

driver-core/src/test/functional/com/mongodb/ClusterFixture.java

-            new ReadConcernAwareNoOpSessionContext(ReadConcern.DEFAULT),
-            new TimeoutContext(TIMEOUT_SETTINGS),
-            getServerApi());
+    public static OperationContext getOperationContext() {


Thank you for noticing the issue with OperationContext being mutable (it was mutable before this PR), and fixing it.

The methods returning OperationContext in this class are a mess:

getOperationContext()
createOperationContext(TimeoutSettings timeoutSettings)
createNewOperationContext(TimeoutSettings timeoutSettings)
getOperationContext(ReadPreference readPreference)

1. Let's name the methods consistently. I think all of the aforementioned methods should use the "create" prefix.
1.1. Let's do this automatically via IDE in a separate commit, and express in the commit message that the commit was done via automatic refactoring, so that reviewers know not to review it.

2. Let's remove the weirdly named and trivial createNewOperationContext method, and inline it where it is used (fortunately, it is used only in two places in ClusterFixture, and nowhere else).

stIncMale · 2026-02-10T09:03:25Z

driver-core/src/test/functional/com/mongodb/connection/ConnectionSpecification.groovy

-        def source = getBinding().getReadConnectionSource(OPERATION_CONTEXT)
-        def connection = source.getConnection(OPERATION_CONTEXT)
+        def source = getBinding().getReadConnectionSource(getOperationContext())
+        def connection = source.getConnection(getOperationContext())


I think the same OperationContext instance should be used for both calls here. There may be many more changed places in this PR where this is the case, and given how many changes there are, it may not be too easy to identify them all.

For brevity, I'll be marking such places with just the "same context" comment. For now I am not sure if it is even practically achievable to identify them all.

The problem is exacerbated by ClusterFixture creating a new OperationContext each time it needs one, which means that the ClusterFixture.getBinding never uses the same context as the one used by the test calling ClusterFixture.getBinding.
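A small sketch of what "same context" refers to, under the assumption that the fixture method creates a fresh instance on every call as described above. `SketchContext` and the helper methods are hypothetical stand-ins, not the driver's `OperationContext` or `ClusterFixture`.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ContextReuseSketch {
    // Hypothetical stand-in for OperationContext; only identity matters here.
    static final class SketchContext {
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
    }

    // Like the fixture method in the diff: every call yields a fresh instance.
    static SketchContext createOperationContext() {
        return new SketchContext();
    }

    // Shape flagged by "same context": two calls, two distinct contexts.
    static boolean callsShareContextWhenCreatedTwice() {
        SketchContext forSource = createOperationContext();
        SketchContext forConnection = createOperationContext();
        return forSource == forConnection;
    }

    // Alternative shape: create once and pass the same instance to both calls.
    static boolean callsShareContextWhenCreatedOnce() {
        SketchContext context = createOperationContext();
        SketchContext forSource = context;
        SketchContext forConnection = context;
        return forSource == forConnection;
    }
}
```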

stIncMale · 2026-02-10T09:11:04Z

driver-core/src/test/functional/com/mongodb/connection/ConnectionSpecification.groovy

-        def source = getBinding().getReadConnectionSource(OPERATION_CONTEXT)
-        def connection = source.getConnection(OPERATION_CONTEXT)
+        def source = getBinding().getReadConnectionSource(getOperationContext())
+        def connection = source.getConnection(getOperationContext())


same context

stIncMale · 2026-02-10T19:35:10Z

.../src/test/unit/com/mongodb/internal/connection/AbstractServerDiscoveryAndMonitoringTest.java

@@ -98,7 +98,7 @@ protected void applyApplicationError(final BsonDocument applicationError) {
        switch (type) {
            case "command":
                exception = getCommandFailureException(applicationError.getDocument("response"), serverAddress,
-                        OPERATION_CONTEXT.getTimeoutContext());
+                        getOperationContext().getTimeoutContext());


same context

stIncMale · 2026-02-10T19:49:30Z

driver-core/src/test/unit/com/mongodb/internal/connection/ServerDiscoveryAndMonitoringTest.java

+            Timeout serverSelectionTimeout = getOperationContext().getTimeoutContext().computeServerSelectionTimeout();
            DefaultServer server = (DefaultServer) getCluster()
-                    .getServersSnapshot(serverSelectionTimeout, OPERATION_CONTEXT.getTimeoutContext())
+                    .getServersSnapshot(serverSelectionTimeout, getOperationContext().getTimeoutContext())


same context

stIncMale · 2026-02-11T17:25:10Z

driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java

-                final ClusterableServerFactory serverFactory,
-                final ClientMetadata clientMetadata) {
+    @VisibleForTesting(otherwise = PRIVATE)
+    protected BaseCluster(final ClusterId clusterId,


The value of otherwise is incorrect here.

stIncMale · 2026-02-11T17:34:29Z

driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java

                getRaceConditionPreFilteringSelector(serversSnapshot),
-                serverSelector,
-                serverDeprioritization.getServerSelector(),
+                serverDeprioritization.applyDeprioritization(serverSelector),


Let's rename the method to apply. The name of the type (ServerDeprioritization) of the object whose instance method we call tells what is being applied. There is no need to duplicate that in the names of methods.

Let's do this automatically via IDE in a separate commit, and express in the commit message that the commit was done via automatic refactoring, so that reviewers know not to review it.

stIncMale · 2026-02-12T02:08:50Z

driver-core/src/main/com/mongodb/internal/connection/OperationContext.java

+                if (serverDescriptions.size() == 1 || deprioritized.isEmpty()) {
+                    return wrappedSelector.select(clusterDescription);
                }
+
                List<ServerDescription> nonDeprioritizedServerDescriptions = serverDescriptions
                        .stream()
                        .filter(serverDescription -> !deprioritized.contains(serverDescription.getAddress()))
                        .collect(toList());


Let's leave a TODO-JAVA-XXXX comment linking this code (the if optimization as well as the use of Stream) to the existing performance ticket about server selection. We should also mention in the description of that ticket that there are TODO-JAVA-XXXX comments that need to be addressed.

The open questions here are:

whether the if optimization is worth it;

whether we should use a loop instead of using Stream.
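For reference, the two variants in the second question could look roughly like this. The sketch uses plain string addresses and hypothetical method names instead of `ServerDescription` and `ServerAddress`, and it makes no claim about which variant performs better.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class DeprioritizedFilterSketch {
    // Stream-based filtering, as in the diff above.
    static List<String> filterWithStream(final List<String> addresses, final Set<String> deprioritized) {
        return addresses.stream()
                .filter(address -> !deprioritized.contains(address))
                .collect(Collectors.toList());
    }

    // Loop-based alternative producing the same result in the same order.
    static List<String> filterWithLoop(final List<String> addresses, final Set<String> deprioritized) {
        List<String> result = new ArrayList<>(addresses.size());
        for (String address : addresses) {
            if (!deprioritized.contains(address)) {
                result.add(address);
            }
        }
        return result;
    }
}
```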

stIncMale · 2026-02-12T02:24:36Z

driver-core/src/main/com/mongodb/internal/connection/OperationContext.java

+                        new ClusterDescription(clusterDescription.getConnectionMode(), clusterDescription.getType(),
+                                nonDeprioritizedServerDescriptions,
+                                clusterDescription.getClusterSettings(),
+                                clusterDescription.getServerSettings()));


We should use the

public ClusterDescription(final ClusterConnectionMode connectionMode, final ClusterType type, @Nullable final MongoException srvResolutionException, final List<ServerDescription> serverDescriptions, @Nullable final ClusterSettings clusterSettings, @Nullable final ServerSettings serverSettings) {

constructor.

stIncMale · 2026-02-12T02:36:26Z

driver-core/src/main/com/mongodb/internal/connection/OperationContext.java

+         * The returned {@link ServerSelector} wraps the provided selector and attempts server selection in two passes:
+         * <ol>
+         *   <li>First pass: calls the wrapped selector with only non-deprioritized {@link ServerDescription}s</li>
+         *   <li>Second pass: if the first pass returns no servers, calls the wrapped selector again with all servers (including deprioritized ones)</li>
+         * </ol>


[optional]

Suggested change

-         * The returned {@link ServerSelector} wraps the provided selector and attempts server selection in two passes:
-         * <ol>
-         *   <li>First pass: calls the wrapped selector with only non-deprioritized {@link ServerDescription}s</li>
-         *   <li>Second pass: if the first pass returns no servers, calls the wrapped selector again with all servers (including deprioritized ones)</li>
-         * </ol>
+         * The returned {@link ServerSelector} wraps the provided selector and attempts
+         * {@linkplain ServerSelector#select(ClusterDescription) server selection} in two passes:
+         * <ol>
+         *   <li>First pass: selects using the wrapped selector with only non-deprioritized {@link ServerDescription}s.</li>
+         *   <li>Second pass: if the first pass selects no {@link ServerDescription}s,
+         *   selects using the wrapped selector again with all {@link ServerDescription}s, including deprioritized ones.</li>
+         * </ol>
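The two passes described in the Javadoc can be sketched as below. This is an illustration under simplifying assumptions, not the code in this PR: a `UnaryOperator<List<String>>` stands in for `ServerSelector`, string addresses stand in for `ServerDescription`, and `TwoPassSelectorSketch` is a hypothetical name.

```java
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

final class TwoPassSelectorSketch {
    // Wraps a selector (candidates -> selected) with deprioritization handling.
    static UnaryOperator<List<String>> apply(final UnaryOperator<List<String>> wrapped,
                                             final Set<String> deprioritized) {
        return candidates -> {
            // First pass: only non-deprioritized candidates.
            List<String> nonDeprioritized = candidates.stream()
                    .filter(address -> !deprioritized.contains(address))
                    .collect(Collectors.toList());
            List<String> firstPass = wrapped.apply(nonDeprioritized);
            if (!firstPass.isEmpty()) {
                return firstPass;
            }
            // Second pass: all candidates, including deprioritized ones.
            return wrapped.apply(candidates);
        };
    }
}
```

With a wrapped selector that accepts every candidate, a deprioritized address is returned only when every candidate is deprioritized.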

Add support for server selection's deprioritized servers to all topol…

282bf57

…ogies. JAVA-6021

stIncMale reviewed Jan 12, 2026

Remove global OperationContext in tests as it has mutable state.

cb8905c

vbabanin self-assigned this Jan 15, 2026

vbabanin added 4 commits January 14, 2026 23:43

Merge branch 'main' into JAVA-6021

74ae21f

Merge branch 'main' into JAVA-6021

f39b7a0

Add more test-cases.

2a4cd70

Fix static checks.

fa35dd9

vbabanin commented Jan 17, 2026

Allow invoking connect.

7b613e4

vbabanin marked this pull request as ready for review January 19, 2026 22:31

vbabanin requested a review from a team as a code owner January 19, 2026 22:31

vbabanin requested a review from stIncMale January 19, 2026 22:31

nhachicha mentioned this pull request Jan 28, 2026

JAVA-5949 preserve connection pool on backpressure errors when establishing connections #1854

Closed

vbabanin added 2 commits February 9, 2026 23:17

Merge branch 'main' into JAVA-6021

7e30f2a

# Conflicts: # driver-core/src/test/resources/specifications # driver-core/src/test/unit/com/mongodb/connection/ServerSelectionSelectionTest.java

Bump specification commit.

f20b482

stIncMale requested changes Feb 25, 2026
