=====================
Retryable Write Tests
=====================

.. contents::

----

Introduction
============

The YAML and JSON files in this directory tree are platform-independent tests
that drivers can use to prove their conformance to the Retryable Writes spec.

Several prose tests, which are not easily expressed in YAML, are also presented
in this file. Those tests will need to be manually implemented by each driver.

Tests will require a MongoClient created with options defined in the tests.
Integration tests will require a running MongoDB cluster with server versions
3.6.0 or later. The ``{setFeatureCompatibilityVersion: "3.6"}`` admin command
will also need to have been executed to enable support for retryable writes on
the cluster. Some tests may have more stringent version requirements depending
on the fail points used.

Server Fail Point
=================

onPrimaryTransactionalWrite
---------------------------

Some tests depend on a server fail point, ``onPrimaryTransactionalWrite``, which
allows us to force a network error before the server would return a write result
to the client. The fail point also allows control over whether the server will
successfully commit the write via its ``failBeforeCommitExceptionCode`` option.
Keep in mind that the fail point only triggers for transactional writes (i.e.
write commands including ``txnNumber`` and ``lsid`` fields). See `SERVER-29606`_
for more information.

.. _SERVER-29606: https://jira.mongodb.org/browse/SERVER-29606

The fail point may be configured like so::

    db.runCommand({
        configureFailPoint: "onPrimaryTransactionalWrite",
        mode: <string|document>,
        data: <document>
    });

``mode`` is a generic fail point option and may be assigned a string or document
value. The string values ``"alwaysOn"`` and ``"off"`` may be used to enable or
disable the fail point, respectively. A document may be used to specify either
``times`` or ``skip``, which are mutually exclusive:

- ``{ times: <integer> }`` may be used to limit the number of times the fail
  point may trigger before transitioning to ``"off"``.
- ``{ skip: <integer> }`` may be used to defer the first trigger of a fail
  point, after which it will transition to ``"alwaysOn"``.

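
The ``mode`` rules above can be modelled with a small client-side sketch. This
is only an illustration of the described semantics, not server code; the
function name ``fail_point_triggers`` is hypothetical and is not part of any
driver or of the server.

```python
def fail_point_triggers(mode, attempts):
    """Model which of `attempts` consecutive eligible writes would trigger
    the fail point, following the `mode` rules described in the text.

    Hypothetical helper for illustration only.
    """
    if mode == "alwaysOn":
        return [True] * attempts
    if mode == "off":
        return [False] * attempts
    # A document mode must contain exactly one of `times` or `skip`,
    # because the two options are mutually exclusive.
    if not isinstance(mode, dict) or len(mode) != 1:
        raise ValueError("mode must be 'alwaysOn', 'off', {times: n} or {skip: n}")
    if "times" in mode:
        # Trigger for the first `times` writes, then behave as "off".
        return [i < mode["times"] for i in range(attempts)]
    if "skip" in mode:
        # Let the first `skip` writes through, then behave as "alwaysOn".
        return [i >= mode["skip"] for i in range(attempts)]
    raise ValueError("unknown mode document")
```

For example, ``fail_point_triggers({"times": 2}, 4)`` models a fail point that
fires on the first two writes only, while ``{"skip": 1}`` lets the first write
through and fires on every write after it.
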
The ``data`` option is a document that may be used to specify options that
control the fail point's behavior. As noted in `SERVER-29606`_,
``onPrimaryTransactionalWrite`` supports the following ``data`` options, which
may be combined if desired:

- ``closeConnection``: Boolean option, which defaults to ``true``. If ``true``,
  the connection on which the write is executed will be closed before a result
  can be returned.
- ``failBeforeCommitExceptionCode``: Integer option, which is unset by default.
  If set, the specified exception code will be thrown and the write will not be
  committed. If unset, the write will be allowed to commit.

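
As a rough sketch, a test runner might assemble the command document from these
options as shown below. The helper name and its parameters are hypothetical;
only the field names come from the text above. The sketch relies on Python
dictionaries preserving insertion order so that ``configureFailPoint`` stays
the first field.

```python
def on_primary_transactional_write(mode, close_connection=None,
                                   fail_before_commit_exception_code=None):
    """Build a configureFailPoint command document (hypothetical helper).

    Options left as None are omitted so that the server defaults described
    in the text apply.
    """
    # `configureFailPoint` is inserted first so it is the first field.
    command = {"configureFailPoint": "onPrimaryTransactionalWrite", "mode": mode}
    data = {}
    if close_connection is not None:
        data["closeConnection"] = close_connection
    if fail_before_commit_exception_code is not None:
        data["failBeforeCommitExceptionCode"] = fail_before_commit_exception_code
    if data:
        command["data"] = data
    return command
```

The resulting dictionary would then be passed to whatever admin-command API the
driver under test exposes.
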
failCommand
-----------

Some tests depend on a server fail point, ``failCommand``, which allows the
client to force the server to return an error. Unlike
``onPrimaryTransactionalWrite``, ``failCommand`` does not allow the client to
directly control whether the server will commit the operation (execution of the
write depends on whether the ``closeConnection`` and/or ``errorCode`` options
are specified). See: `failCommand <../../transactions/tests#failcommand>`_ in
the Transactions spec test suite for more information.

Disabling Fail Points after Test Execution
------------------------------------------

After each test that configures a fail point, drivers should disable the fail
point to avoid spurious failures in subsequent tests. The fail point may be
disabled like so::

    db.runCommand({
        configureFailPoint: <fail point name>,
        mode: "off"
    });

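
One way a test runner could guarantee this cleanup is a context manager that
always sends the ``"off"`` command, even when the test body raises. This is a
minimal sketch: ``run_admin_command`` is an assumed callable standing in for
the driver's own admin-command API, and ``fail_point`` is a hypothetical name.

```python
from contextlib import contextmanager


@contextmanager
def fail_point(run_admin_command, command):
    """Enable a fail point for the duration of a `with` block.

    `run_admin_command` is an assumed callable that sends one command
    document to the server; substitute the driver's real API.
    """
    run_admin_command(command)
    try:
        yield
    finally:
        # Always disable the fail point so later tests are unaffected.
        run_admin_command({
            "configureFailPoint": command["configureFailPoint"],
            "mode": "off",
        })
```
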
Use as Integration Tests
========================

Integration tests are expressed in YAML and can be run against a replica set or
sharded cluster as denoted by the top-level ``runOn`` field. Tests that rely on
the ``onPrimaryTransactionalWrite`` fail point cannot be run against a sharded
cluster because the fail point is not supported by mongos.

The tests exercise the following scenarios:

- Single-statement write operations

  - Each test expecting a write result will encounter at most one network error
    for the write command. Retry attempts should return without error and allow
    the operation to succeed. Observation of the collection state will assert
    that the write occurred at most once.

  - Each test expecting an error will encounter successive network errors for
    the write command. Observation of the collection state will assert that the
    write was never committed on the server.

- Multi-statement write operations

  - Each test expecting a write result will encounter at most one network error
    for some write command(s) in the batch. Retry attempts should return without
    error and allow the batch to ultimately succeed. Observation of the
    collection state will assert that each write occurred at most once.

  - Each test expecting an error will encounter successive network errors for
    some write command in the batch. The batch will ultimately fail with an
    error, but observation of the collection state will assert that the failing
    write was never committed on the server. We may observe that earlier writes
    in the batch occurred at most once.

We cannot test a scenario where the first and second attempts both encounter
network errors but the write does actually commit during one of those attempts.
This is because (1) the fail point only triggers when a write would be committed
and (2) the ``skip`` and ``times`` options are mutually exclusive. That said,
such a test would mainly assert the server's correctness for at-most-once
semantics and is not essential to assert driver correctness.

Test Format
-----------

Each YAML file has the following keys:

- ``runOn`` (optional): An array of server version and/or topology requirements
  for which the tests can be run. If the test environment satisfies one or more
  of these requirements, the tests may be executed; otherwise, this file should
  be skipped. If this field is omitted, the tests can be assumed to have no
  particular requirements and should be executed. Each element will have some or
  all of the following fields:

  - ``minServerVersion`` (optional): The minimum server version (inclusive)
    required to successfully run the tests. If this field is omitted, it should
    be assumed that there is no lower bound on the required server version.

  - ``maxServerVersion`` (optional): The maximum server version (inclusive)
    against which the tests can be run successfully. If this field is omitted,
    it should be assumed that there is no upper bound on the required server
    version.

  - ``topology`` (optional): An array of server topologies against which the
    tests can be run successfully. Valid topologies are "single", "replicaset",
    and "sharded". If this field is omitted, the default is all topologies (i.e.
    ``["single", "replicaset", "sharded"]``).

- ``data``: The data that should exist in the collection under test before each
  test run.

- ``tests``: An array of tests that are to be run independently of each other.
  Each test will have some or all of the following fields:

  - ``description``: The name of the test.

  - ``clientOptions``: Parameters to pass to MongoClient().

  - ``useMultipleMongoses`` (optional): If ``true``, the MongoClient for this
    test should be initialized with multiple mongos seed addresses. If ``false``
    or omitted, only a single mongos address should be specified. This field has
    no effect for non-sharded topologies.

  - ``failPoint`` (optional): The ``configureFailPoint`` command document to run
    to configure a fail point on the primary server. Drivers must ensure that
    ``configureFailPoint`` is the first field in the command. This option and
    ``useMultipleMongoses: true`` are mutually exclusive.

  - ``operation``: Document describing the operation to be executed. The
    operation should be executed through a collection object derived from a
    client that has been created with ``clientOptions``. The operation will have
    some or all of the following fields:

    - ``name``: The name of the operation as defined in the CRUD specification.

    - ``arguments``: The names and values of arguments from the CRUD
      specification.

  - ``outcome``: Document describing the return value and/or expected state of
    the collection after the operation is executed. This will have some or all
    of the following fields:

    - ``error``: If ``true``, the test should expect an error or exception. Note
      that some drivers may report server-side errors as a write error within a
      write result object.

    - ``result``: The return value from the operation. This will correspond to
      an operation's result object as defined in the CRUD specification. This
      field may be omitted if ``error`` is ``true``. If this field is present
      and ``error`` is ``true`` (generally for multi-statement tests), the
      result reports information about operations that succeeded before an
      unrecoverable failure. In that case, drivers may choose to check the
      result object if their BulkWriteException (or equivalent) provides access
      to a write result object.

      - ``errorLabelsContain``: A list of error label strings that the
        error is expected to have.

      - ``errorLabelsOmit``: A list of error label strings that the
        error is expected not to have.

    - ``collection``:

      - ``name`` (optional): The name of the collection to verify. If this isn't
        present then use the collection under test.

      - ``data``: The data that should exist in the collection after the
        operation has been run.

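
The ``runOn`` matching rule described above (run the file if any one
requirement is satisfied, or if the field is omitted) could be sketched as
follows. The function names are hypothetical, and the sketch assumes plain
dotted numeric version strings such as ``"4.0"`` or ``"4.1.7"``; shorter
versions are padded with zeros before the inclusive comparison.

```python
ALL_TOPOLOGIES = ["single", "replicaset", "sharded"]


def parse_version(text):
    """Turn "4.0" into (4, 0, 0). Assumes purely numeric components."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def requirement_met(requirement, server_version, topology):
    """Check one `runOn` element against the test environment."""
    version = parse_version(server_version)
    if "minServerVersion" in requirement:
        if version < parse_version(requirement["minServerVersion"]):
            return False
    if "maxServerVersion" in requirement:
        if version > parse_version(requirement["maxServerVersion"]):
            return False
    # An omitted `topology` means every topology is acceptable.
    return topology in requirement.get("topology", ALL_TOPOLOGIES)


def should_run(run_on, server_version, topology):
    """Return True if the file's tests may run. `None` means `runOn` omitted."""
    if run_on is None:
        return True
    return any(requirement_met(r, server_version, topology) for r in run_on)
```
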
Split Batch Tests
=================

The YAML tests specify bulk write operations that are split by command type
(e.g. sequence of insert, update, and delete commands). Multi-statement write
operations may also be split due to ``maxWriteBatchSize``,
``maxBsonObjectSize``, or ``maxMessageSizeBytes``.

For instance, an insertMany operation with five 10 MiB documents executed using
OP_MSG payload type 0 (i.e. entire command in one document) would be split into
five insert commands in order to respect the 16 MiB ``maxBsonObjectSize`` limit.
The same insertMany operation executed using OP_MSG payload type 1 (i.e. command
arguments pulled out into a separate payload vector) would be split into two
insert commands in order to respect the 48 MB ``maxMessageSizeBytes`` limit.

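
The arithmetic behind this example can be checked with a simplified greedy
batching sketch. It ignores command overhead and ``maxWriteBatchSize``, and it
assumes limits of 16 MiB (16,777,216 bytes) and 48 MB (48,000,000 bytes); real
drivers should use the limits reported by the server. The function name
``count_commands`` is hypothetical.

```python
MIB = 1024 * 1024
MAX_BSON_OBJECT_SIZE = 16 * MIB      # assumed value: 16 MiB
MAX_MESSAGE_SIZE_BYTES = 48_000_000  # assumed value: 48 MB


def count_commands(document_sizes, limit_bytes):
    """Greedily pack documents into commands without exceeding `limit_bytes`.

    Simplification: only document sizes are counted, not command overhead.
    """
    commands = 0
    current = 0
    for size in document_sizes:
        if size > limit_bytes:
            raise ValueError("a single document exceeds the limit")
        # Start a new command when the next document would not fit.
        if commands == 0 or current + size > limit_bytes:
            commands += 1
            current = 0
        current += size
    return commands


documents = [10 * MIB] * 5
payload_type_0 = count_commands(documents, MAX_BSON_OBJECT_SIZE)
payload_type_1 = count_commands(documents, MAX_MESSAGE_SIZE_BYTES)
```

Under these assumptions, four 10 MiB documents (41,943,040 bytes) fit within
48,000,000 bytes but five (52,428,800 bytes) do not, which is why the payload
type 1 case yields two commands.
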
Noting when a driver might split operations, the ``onPrimaryTransactionalWrite``
fail point's ``skip`` option may be used to control when the fail point first
triggers. Once triggered, the fail point will transition to the ``alwaysOn``
state until disabled. Driver authors should also note that the server attempts
to process all documents in a single insert command within a single commit (i.e.
one insert command with five documents may only trigger the fail point once).
This behavior is unique to insert commands (each statement in an update and
delete command is processed independently).

If testing an insert that is split into two commands, a ``skip`` of one will
allow the fail point to trigger on the second insert command (because all
documents in the first command will be processed in the same commit). When
testing an update or delete that is split into two commands, the ``skip`` should
be set to the number of statements in the first command to allow the fail point
to trigger on the second command.

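
The rule for choosing ``skip`` can be written down as a tiny helper. This is a
sketch of the reasoning in the paragraph above, with a hypothetical function
name, and it only covers the two-command case discussed here.

```python
def skip_for_second_command(command_name, first_command_statements):
    """Return the `skip` value that makes the fail point first trigger on
    the second of two split commands (hypothetical helper)."""
    if command_name == "insert":
        # All documents in one insert command share a single commit,
        # so the first command counts once regardless of its size.
        return 1
    if command_name in ("update", "delete"):
        # Each statement is processed independently, so every statement
        # in the first command must be skipped.
        return first_command_statements
    raise ValueError("unsupported command: " + command_name)
```
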
Command Construction Tests
==========================

Drivers should also assert that command documents are properly constructed with
or without a transaction ID, depending on whether the write operation is
supported. `Command Monitoring`_ may be used to check for the presence of a
``txnNumber`` field in the command document. Note that command documents may
always include an ``lsid`` field per the `Driver Session`_ specification.

.. _Command Monitoring: ../../command-monitoring/command-monitoring.rst
.. _Driver Session: ../../sessions/driver-sessions.rst

These tests may be run against both a replica set and sharded cluster.

Drivers should test that transaction IDs are never included in commands for
unsupported write operations:

* Write commands with unacknowledged write concerns (e.g. ``{w: 0}``)

* Unsupported single-statement write operations

  - ``updateMany()``
  - ``deleteMany()``

* Unsupported multi-statement write operations

  - ``bulkWrite()`` that includes ``UpdateMany`` or ``DeleteMany``

* Unsupported write commands

  - ``aggregate`` with write stage (e.g. ``$out``, ``$merge``)

Drivers should test that transaction IDs are always included in commands for
supported write operations:

* Supported single-statement write operations

  - ``insertOne()``
  - ``updateOne()``
  - ``replaceOne()``
  - ``deleteOne()``
  - ``findOneAndDelete()``
  - ``findOneAndReplace()``
  - ``findOneAndUpdate()``

* Supported multi-statement write operations

  - ``insertMany()`` with ``ordered=true``
  - ``insertMany()`` with ``ordered=false``
  - ``bulkWrite()`` with ``ordered=true`` (no ``UpdateMany`` or ``DeleteMany``)
  - ``bulkWrite()`` with ``ordered=false`` (no ``UpdateMany`` or ``DeleteMany``)

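
The two lists above can be condensed into a predicate that a test might use
when inspecting command documents captured through command monitoring. This is
a sketch only: the function names are hypothetical, and the command document is
assumed to be available as a plain dictionary.

```python
SUPPORTED_OPERATIONS = {
    "insertOne", "updateOne", "replaceOne", "deleteOne",
    "findOneAndDelete", "findOneAndReplace", "findOneAndUpdate",
    "insertMany",
}
UNSUPPORTED_BULK_MODELS = {"UpdateMany", "DeleteMany"}


def expects_txn_number(operation, acknowledged=True, bulk_models=()):
    """Return True if the command should carry a `txnNumber` field."""
    if not acknowledged:
        # Unacknowledged write concerns (e.g. {w: 0}) are never retryable.
        return False
    if operation in SUPPORTED_OPERATIONS:
        return True
    if operation == "bulkWrite":
        return not any(m in UNSUPPORTED_BULK_MODELS for m in bulk_models)
    # updateMany, deleteMany, aggregate with a write stage, and so on.
    return False


def command_is_well_formed(command, operation, **kwargs):
    """Compare a captured command document against the expectation."""
    return ("txnNumber" in command) == expects_txn_number(operation, **kwargs)
```

Note that the check deliberately ignores ``lsid``, since command documents may
always include it.
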
Prose Tests
===========

The following tests ensure that retryable writes work properly with replica sets
and sharded clusters.

#. Test that retryable writes raise an exception when using the MMAPv1 storage
   engine. For this test, execute a write operation, such as ``insertOne``,
   which should generate an exception. Assert that the error message is the
   replacement error message::

       This MongoDB deployment does not support retryable writes. Please add
       retryWrites=false to your connection string.

   and the error code is 20.

   **Note**: Drivers that rely on ``serverStatus`` to determine the storage engine
   in use MAY skip this test for sharded clusters, since ``mongos`` does not report
   this information in its ``serverStatus`` response.

Changelog
=========

:2019-10-21: Add ``errorLabelsContain`` and ``errorLabelsOmit`` fields to ``result``

:2019-08-07: Add Prose Tests section

:2019-06-07: Mention $merge stage for aggregate alongside $out

:2019-03-01: Add top-level ``runOn`` field to denote server version and/or
             topology requirements for the test file. Removes the
             ``minServerVersion`` and ``maxServerVersion`` top-level fields,
             which are now expressed within ``runOn`` elements.

             Add test-level ``useMultipleMongoses`` field.
