-
Notifications
You must be signed in to change notification settings - Fork 246
DRIVERS-2972: add message requirement to ServerDescription.error #1729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
ea5698d
6b75475
2668eb9
889e3dd
75ce4dd
bc971f2
5d121b1
21187a9
b7adc0b
b6dfeee
07976f0
cd84c0e
a2bd209
becaa6b
47777df
483db4e
8bf359e
70532c7
9d51f24
cdb27c9
e97908e
4b8516d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -226,7 +226,9 @@ Fields: | |
field in the server's hello or legacy hello response, in the case that the server reports an address different from | ||
the address the client uses. | ||
|
||
- (=) `error`: information about the last error related to this server. Default null. | ||
- (=) `error`: information about the last error related to this server. Default null. MUST contain or be able to produce | ||
a string describing the error. The name of the field containing the string describing the error SHOULD be what is | ||
most idiomatic for each driver. | ||
|
||
- `roundTripTime`: the duration of the hello or legacy hello call. Default null. | ||
|
||
|
@@ -485,7 +487,14 @@ removed once the primary is checked. | |
#### error | ||
|
||
If the client experiences any error when checking a server, it stores error information in the ServerDescription's error | ||
field. | ||
field. The message contained in this field MUST contain the substrings detailed in the table below when the | ||
ServerDescription is changed to Unknown in the circumstances outlined. | ||
|
||
| circumstance | error substring | | ||
| -------------------------------------------------------------------- | -------------------------------------------------------- | | ||
| RSPrimary with a stale electionId is discovered | `'primary marked stale due to electionId mismatch'` | | ||
| RSPrimary with a stale setVersion is discovered | `'primary marked stale due to setVersion mismatch'` | | ||
|
||
| A more current RSPrimary is discovered alongside an existing primary | `'primary marked stale due to discovery of new primary'` | | ||
|
||
|
||
#### roundTripTime | ||
|
||
|
@@ -863,38 +872,46 @@ else if topologyDescription.setName != serverDescription.setName: | |
|
||
if serverDescription.maxWireVersion >= 17: # MongoDB 6.0+ | ||
# Null values for both electionId and setVersion are always considered less than | ||
if serverDescription.electionId > topologyDescription.maxElectionId or ( | ||
serverDescription.electionId == topologyDescription.maxElectionId | ||
and serverDescription.setVersion >= topologyDescription.maxSetVersion | ||
): | ||
topologyDescription.maxElectionId = serverDescription.electionId | ||
topologyDescription.maxSetVersion = serverDescription.setVersion | ||
else: | ||
# Stale primary. | ||
# replace serverDescription with a default ServerDescription of type "Unknown" | ||
if serverDescription.electionId < topologyDescription.maxElectionId: | ||
|
||
# Stale primary due to electionId mismatch | ||
# replace serverDescription with a default ServerDescription of type "Unknown" and an error | ||
# field with a message containing the substring "primary marked stale due to mismatched electionId" | ||
checkIfHasPrimary() | ||
return | ||
else: | ||
# Maintain old comparison rules, namely setVersion is checked before electionId | ||
if serverDescription.setVersion is not null and serverDescription.electionId is not null: | ||
if ( | ||
topologyDescription.maxSetVersion is not null | ||
and topologyDescription.maxElectionId is not null | ||
and ( | ||
topologyDescription.maxSetVersion > serverDescription.setVersion | ||
or ( | ||
topologyDescription.maxSetVersion == serverDescription.setVersion | ||
and topologyDescription.maxElectionId > serverDescription.electionId | ||
) | ||
) | ||
): | ||
# Stale primary. | ||
# replace serverDescription with a default ServerDescription of type "Unknown" | ||
elif serverDescription.electionId == topologyDescription.maxElectionId : | ||
if serverDescription.setVersion < topology.maxSetVersion: | ||
# Stale primary due to setVersion mismatch | ||
# replace serverDescription with a default ServerDescription of type "Unknown" and an error | ||
# field with a message containing the substring "primary marked stale due to mismatched setVersion" | ||
checkIfHasPrimary() | ||
return | ||
|
||
else: | ||
topologyDescription.maxElectionId = serverDescription.electionId | ||
topologyDescription.maxSetVersion = serverDescription.setVersion | ||
else: | ||
topologyDescription.maxElectionId = serverDescription.electionId | ||
topologyDescription.maxSetVersion = serverDescription.setVersion | ||
|
||
|
||
else: | ||
# Maintain old comparison rules, namely setVersion is checked before electionId | ||
if serverDescription.setVersion is not null and serverDescription.electionId is not null | ||
and topologyDescription.maxSetVersion is not null and topologyDescription.maxElectionId is not | ||
null: | ||
if topologyDescription.maxSetVersion > serverDescription.setVersion: | ||
# replace serverDescription with a default ServerDescription of type "Unknown" and an | ||
# error field with a message containing the substring "primary marked stale due to mismatched setVersion" | ||
checkIfHasPrimary() | ||
return | ||
elif topologyDescription.maxSetVersion == serverDescription.setVersion && | ||
topologyDescription.maxElectionId > serverDescription.electionId: | ||
# Stale primary due to electionId mismatch | ||
# replace serverDescription with a default ServerDescription of type "Unknown" and an error | ||
# field with a message containing the substring "primary marked stale due to mismatched electionId" | ||
checkIfHasPrimary() | ||
return | ||
else: | ||
topologyDescription.maxElectionId = serverDescription.electionId | ||
if serverDescription.setVersion is not null and ( | ||
topologyDescription.maxSetVersion is null | ||
or serverDescription.setVersion > topologyDescription.maxSetVersion | ||
|
@@ -905,9 +922,9 @@ else: | |
for each server in topologyDescription.servers: | ||
if server.address != serverDescription.address: | ||
if server.type is RSPrimary: | ||
# See note below about invalidating an old primary. | ||
replace the server with a default ServerDescription of type "Unknown" | ||
|
||
# See note below about invalidating an old primary | ||
# Replace the server with a default ServerDescription of type "Unknown" and an error field | ||
# with a message containing the substring "primary marked stale due to discovery of newer primary" | ||
ShaneHarvey marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
for each address in serverDescription's "hosts", "passives", and "arbiters": | ||
if address is not in topologyDescription.servers: | ||
add new default ServerDescription of type "Unknown" | ||
|
@@ -921,9 +938,11 @@ checkIfHasPrimary() | |
``` | ||
|
||
A note on invalidating the old primary: when a new primary is discovered, the client finds the previous primary (there | ||
should be none or one) and replaces its description with a default ServerDescription of type "Unknown." A multi-threaded | ||
client MUST [request an immediate check](server-monitoring.md#requesting-an-immediate-check) for that server as soon as | ||
possible. | ||
should be none or one) and replaces its description with a default ServerDescription of type "Unknown". Additionally, | ||
the `error` field of the new `ServerDescription` object MUST include a descriptive error explaining that it was | ||
invalidated because the primary was determined to be stale. Drivers MAY additionally specify whether this was due to an | ||
|
||
electionId or setVersion mismatch. A multi-threaded client MUST | ||
|
||
[request an immediate check](server-monitoring.md#requesting-an-immediate-check) for that server as soon as possible. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we expand the scope of this ticket to include more info in other cases where we reset a server to "unknown" besides stale primary? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah I think that makes sense to do here. Seems like the only other place we do that is in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So there are two places we do this in our |
||
|
||
If the old primary server version is 4.0 or earlier, the client MUST clear its connection pool for the old primary, too: | ||
the connections are all bad because the old primary has closed its sockets. If the old primary server version is 4.2 or | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -71,6 +71,8 @@ following keys: | |
|
||
- type: A ServerType name, like "RSSecondary". See [ServerType](../server-discovery-and-monitoring.md#servertype) for | ||
details pertaining to async and multi-threaded drivers. | ||
- error: An optional object with a with a string field containing a string that must be a substring of the message on | ||
|
||
the `ServerDescription.error` object | ||
- setName: A string with the expected replica set name, or null. | ||
- setVersion: absent or an integer. | ||
- electionId: absent, null, or an ObjectId. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/new_primary.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/primary_disconnect_setversion.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/set_version_can_rollback.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/setversion_equal_max_without_electionid.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/setversion_greaterthan_max_without_electionid.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/setversion_without_electionid-pre-6.0.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/setversion_without_electionid.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/stepdown_change_set_name.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/use_setversion_without_electionid-pre-6.0.yml | ||
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/tests/rs/use_setversion_without_electionid.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the motivation for adding "The name of the field containing the string describing the error SHOULD be what is most idiomatic for each driver."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that the
error
field would be an object representing an error and since different languages have different ways of doing that, I think it's better to defer to driver engineers on what is the best way to communicate that to users. In Node we'd likely do that with a newMongoError
subclass (which itself subclasses the native NodeError
type) that uses amessage
field to hold human readable information about the error. We currently have it typed as a genericMongoError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I think we can remove that sentence since it generally always true of any spec prescribed api.