DRIVERS-2888 Support QE with Client.bulkWrite #1770
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"fields": [ | ||
{ | ||
"keyId": { | ||
"$binary": { | ||
"base64": "LOCALAAAAAAAAAAAAAAAAA==", | ||
"subType": "04" | ||
} | ||
}, | ||
"path": "foo", | ||
"bsonType": "string" | ||
} | ||
] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{ | ||
"foo": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -563,10 +563,13 @@ First, perform the setup. | |
2. Using `client`, drop and create the collection `db.coll` configured with the included JSON schema | ||
[limits/limits-schema.json](../limits/limits-schema.json). | ||
|
||
3. Using `client`, drop the collection `keyvault.datakeys`. Insert the document | ||
3. Using `client`, drop and create the collection `db.coll2` configured with the included encryptedFields | ||
[limits/limits-encryptedFields.json](../limits/limits-encryptedFields.json). | ||
|
||
|
||
4. Using `client`, drop the collection `keyvault.datakeys`. Insert the document | ||
[limits/limits-key.json](../limits/limits-key.json) | ||
|
||
4. Create a MongoClient configured with auto encryption (referred to as `client_encrypted`) | ||
5. Create a MongoClient configured with auto encryption (referred to as `client_encrypted`) | ||
|
||
Configure with the `local` KMS provider as follows: | ||
|
||
|
@@ -578,27 +581,27 @@ First, perform the setup. | |
|
||
Using `client_encrypted` perform the following operations: | ||
|
||
1. Insert `{ "_id": "over_2mib_under_16mib", "unencrypted": <the string "a" repeated 2097152 times> }`. | ||
1. Insert `{ "_id": "over_2mib_under_16mib", "unencrypted": <the string "a" repeated 2097152 times> }` into `coll`. | ||
|
||
Expect this to succeed since this is still under the `maxBsonObjectSize` limit. | ||
|
||
2. Insert the document [limits/limits-doc.json](../limits/limits-doc.json) concatenated with | ||
`{ "_id": "encryption_exceeds_2mib", "unencrypted": < the string "a" repeated (2097152 - 2000) times > }` Note: | ||
`{ "_id": "encryption_exceeds_2mib", "unencrypted": < the string "a" repeated (2097152 - 2000) times > }` into `coll`. Note: | ||
limits-doc.json is a 1005 byte BSON document that encrypts to a ~10,000 byte document. | ||
|
||
Expect this to succeed since after encryption this still is below the normal maximum BSON document size. Note, before | ||
auto encryption this document is under the 2 MiB limit. After encryption it exceeds the 2 MiB limit, but does NOT | ||
exceed the 16 MiB limit. | ||
|
||
3. Bulk insert the following: | ||
3. Use MongoCollection.bulkWrite to insert the following into `coll`: | ||
|
||
- `{ "_id": "over_2mib_1", "unencrypted": <the string "a" repeated (2097152) times> }` | ||
- `{ "_id": "over_2mib_2", "unencrypted": <the string "a" repeated (2097152) times> }` | ||
|
||
Expect the bulk write to succeed and split after first doc (i.e. two inserts occur). This may be verified using | ||
[command monitoring](../../command-logging-and-monitoring/command-logging-and-monitoring.md). | ||
|
||
4. Bulk insert the following: | ||
4. Use MongoCollection.bulkWrite to insert the following into `coll`: | ||
|
||
- The document [limits/limits-doc.json](../limits/limits-doc.json) concatenated with | ||
`{ "_id": "encryption_exceeds_2mib_1", "unencrypted": < the string "a" repeated (2097152 - 2000) times > }` | ||
|
@@ -608,15 +611,36 @@ Using `client_encrypted` perform the following operations: | |
Expect the bulk write to succeed and split after first doc (i.e. two inserts occur). This may be verified using | ||
[command logging and monitoring](../../command-logging-and-monitoring/command-logging-and-monitoring.md). | ||
|
||
5. Insert `{ "_id": "under_16mib", "unencrypted": <the string "a" repeated 16777216 - 2000 times>`. | ||
5. Insert `{ "_id": "under_16mib", "unencrypted": <the string "a" repeated (16777216 - 2000) times> }` into `coll`. | ||
|
||
Expect this to succeed since this is still (just) under the `maxBsonObjectSize` limit. | ||
|
||
6. Insert the document [limits/limits-doc.json](../limits/limits-doc.json) concatenated with | ||
`{ "_id": "encryption_exceeds_16mib", "unencrypted": < the string "a" repeated (16777216 - 2000) times > }` | ||
`{ "_id": "encryption_exceeds_16mib", "unencrypted": < the string "a" repeated (16777216 - 2000) times > }` into `coll`. | ||
|
||
Expect this to fail since encryption results in a document exceeding the `maxBsonObjectSize` limit. | ||
|
||
> [!NOTE] | ||
> MongoDB 8.0+ is required for MongoClient.bulkWrite | ||
Review comment: Suggest removing this blurb, and instead changing steps 7 and 8 to be:
|
||
|
||
7. Use MongoClient.bulkWrite to insert the following into `coll2`: | ||
|
||
|
||
- `{ "_id": "over_2mib_3", "unencrypted": <the string "a" repeated (2097152 - 1500) times> }` | ||
- `{ "_id": "over_2mib_4", "unencrypted": <the string "a" repeated (2097152 - 1500) times> }` | ||
|
||
Expect the bulk write to succeed and split after first doc (i.e. two inserts occur). This may be verified using | ||
[command logging and monitoring](../../command-logging-and-monitoring/command-logging-and-monitoring.md). | ||
|
||
8. Use MongoClient.bulkWrite to insert the following into `coll2`: | ||
|
||
- The document [limits/limits-qe-doc.json](../limits/limits-qe-doc.json) concatenated with | ||
`{ "_id": "encryption_exceeds_2mib_3", "foo": < the string "a" repeated (2097152 - 2000 - 1500) times > }` | ||
- The document [limits/limits-qe-doc.json](../limits/limits-qe-doc.json) concatenated with | ||
`{ "_id": "encryption_exceeds_2mib_4", "foo": < the string "a" repeated (2097152 - 2000 - 1500) times > }` | ||
|
||
Expect the bulk write to succeed and split after first doc (i.e. two inserts occur). This may be verified using | ||
[command logging and monitoring](../../command-logging-and-monitoring/command-logging-and-monitoring.md). | ||
|
||
Optionally, if it is possible to mock the maxWriteBatchSize (i.e. the maximum number of documents in a batch) test that | ||
setting maxWriteBatchSize=1 and inserting the two documents `{ "_id": "a" }, { "_id": "b" }` with `client_encrypted` | ||
splits the operation into two inserts. | ||
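
The sizes used in the steps above can be sanity-checked before running the test. The following is a minimal sketch, assuming documents whose values are all strings; the helper names `bson_string_doc_size` and `limits_doc` are hypothetical and not part of any driver. It only measures the unencrypted BSON size and does not model the growth caused by encryption.

```python
TWO_MIB = 2 * 1024 * 1024        # 2097152, the reduced limit referenced in the steps
SIXTEEN_MIB = 16 * 1024 * 1024   # 16777216, the usual maxBsonObjectSize


def bson_string_doc_size(doc: dict) -> int:
    """Return the BSON size in bytes of a document whose values are all strings."""
    size = 4 + 1  # int32 length prefix plus the trailing 0x00 terminator
    for key, value in doc.items():
        size += 1                                    # element type byte (0x02, string)
        size += len(key.encode("utf-8")) + 1         # key as a NUL-terminated cstring
        size += 4 + len(value.encode("utf-8")) + 1   # int32 length, bytes, NUL
    return size


def limits_doc(doc_id: str, repeat: int, field: str = "unencrypted") -> dict:
    """Build a test document with `field` set to the string "a" repeated `repeat` times."""
    return {"_id": doc_id, field: "a" * repeat}


# Step 1: over 2 MiB but under 16 MiB before encryption.
step1 = limits_doc("over_2mib_under_16mib", 2097152)
# Step 5: just under the 16 MiB limit.
step5 = limits_doc("under_16mib", 16777216 - 2000)

print(bson_string_doc_size(step1), bson_string_doc_size(step5))
```

A check like this helps confirm that a test harness builds the documents with the intended sizes; it is not a substitute for verifying the split with command monitoring.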
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -459,7 +459,7 @@ class BulkWriteResult { | |||||||||||
* The results of each individual write operation that was successfully performed. | ||||||||||||
* | ||||||||||||
* This value will only be populated if the verboseResults option was set to true. | ||||||||||||
*/ | ||||||||||||
*/ | ||||||||||||
verboseResults: Optional<VerboseResults>; | ||||||||||||
|
||||||||||||
/* rest of fields */ | ||||||||||||
|
@@ -553,7 +553,9 @@ The `bulkWrite` server command has the following format: | |||||||||||
} | ||||||||||||
``` | ||||||||||||
|
||||||||||||
Drivers MUST use document sequences ([`OP_MSG`](../message/OP_MSG.md) payload type 1) for the `ops` and `nsInfo` fields. | ||||||||||||
If auto-encryption is not enabled, drivers MUST use document sequences ([`OP_MSG`](../message/OP_MSG.md) payload type 1) | ||||||||||||
for the `ops` and `nsInfo` fields. If auto-encryption is enabled, drivers MUST NOT use document sequences and MUST | ||||||||||||
append the `ops` and `nsInfo` fields to the `bulkWrite` command document. | ||||||||||||
|
||||||||||||
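
The rule above can be sketched as a small helper that decides where `ops` and `nsInfo` go. This is an illustration only: the function name `build_bulk_write_message` and its return shape are hypothetical, the `"bulkWrite": 1` command field and the sample operation are assumptions about the command format, and a real driver would serialize the result into an `OP_MSG` rather than return dictionaries.

```python
from typing import Any


def build_bulk_write_message(
    ops: list[dict[str, Any]],
    ns_info: list[dict[str, Any]],
    auto_encryption_enabled: bool,
) -> tuple[dict[str, Any], dict[str, list[dict[str, Any]]]]:
    """Return (command_document, document_sequences) for one bulkWrite batch.

    command_document stands in for the OP_MSG payload type 0 body;
    document_sequences stands in for the payload type 1 sections.
    """
    command: dict[str, Any] = {"bulkWrite": 1}  # assumed command shape
    if auto_encryption_enabled:
        # Auto-encryption: no document sequences, fields live in the command document.
        command["ops"] = list(ops)
        command["nsInfo"] = list(ns_info)
        return command, {}
    # No auto-encryption: ops and nsInfo are sent as document sequences.
    return command, {"ops": list(ops), "nsInfo": list(ns_info)}


# Illustrative inputs only.
sample_ops = [{"insert": 0, "document": {"_id": "a"}}]
sample_ns_info = [{"ns": "db.coll2"}]

encrypted_cmd, encrypted_seqs = build_bulk_write_message(sample_ops, sample_ns_info, True)
plain_cmd, plain_seqs = build_bulk_write_message(sample_ops, sample_ns_info, False)
```

Keeping the decision in one place would make it straightforward to switch auto-encrypted bulk writes back to document sequences later.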
The `bulkWrite` command is executed on the "admin" database. | ||||||||||||
|
||||||||||||
|
@@ -645,13 +647,6 @@ write concern containing the following message: | |||||||||||
|
||||||||||||
> Cannot request unacknowledged write concern and ordered writes | ||||||||||||
|
||||||||||||
## Auto-Encryption | ||||||||||||
|
||||||||||||
If `MongoClient.bulkWrite` is called on a `MongoClient` configured with `AutoEncryptionOpts`, drivers MUST return an | ||||||||||||
Review comment: Remove prose test 13 (expects an error) in the CRUD tests: https://github.com/mongodb/specifications/blob/0f4e11e78f50cf7f4dd167d5ba37a519dd8164f5/source/crud/tests/README.md#13-mongoclientbulkwrite-returns-an-error-if-auto-encryption-is-configured |
||||||||||||
error with the message: "bulkWrite does not currently support automatic encryption". | ||||||||||||
|
||||||||||||
This is expected to be removed once [DRIVERS-2888](https://jira.mongodb.org/browse/DRIVERS-2888) is implemented. | ||||||||||||
|
||||||||||||
## Command Batching | ||||||||||||
|
||||||||||||
Drivers MUST accept an arbitrary number of operations as input to the `MongoClient.bulkWrite` method. Because the server | ||||||||||||
|
@@ -672,8 +667,10 @@ multiple commands if the user provides more than `maxWriteBatchSize` operations | |||||||||||
|
||||||||||||
### Total Message Size | ||||||||||||
|
||||||||||||
Drivers MUST ensure that the total size of the `OP_MSG` built for each `bulkWrite` command does not exceed | ||||||||||||
`maxMessageSizeBytes`. | ||||||||||||
#### Unencrypted bulk writes | ||||||||||||
|
||||||||||||
When auto-encryption is not enabled, drivers MUST ensure that the total size of the `OP_MSG` built for each `bulkWrite` | ||||||||||||
command does not exceed `maxMessageSizeBytes`. | ||||||||||||
|
||||||||||||
The upper bound for the size of an `OP_MSG` includes opcode-related bytes (e.g. the `OP_MSG` header) and | ||||||||||||
operation-agnostic command field bytes (e.g. `txnNumber`, `lsid`). Drivers MUST limit the combined size of the | ||||||||||||
|
@@ -727,6 +724,12 @@ was determined. | |||||||||||
|
||||||||||||
Drivers MUST return an error if there is not room to add at least one operation to `ops`. | ||||||||||||
|
||||||||||||
#### Auto-encrypted bulk writes | ||||||||||||
|
||||||||||||
Drivers MUST use the reduced size limit defined in | ||||||||||||
[Size limits for Write Commands](../client-side-encryption/client-side-encryption.md#size-limits-for-write-commands) for | ||||||||||||
the size of the `bulkWrite` command document when auto-encryption is enabled. | ||||||||||||
|
||||||||||||
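
One way to picture batching under a reduced size limit is a greedy split over per-operation sizes. The sketch below is a simplification under stated assumptions: the function name `split_ops_by_size` is hypothetical, the default limit of 2097152 bytes (2 MiB) is taken from the limits prose test rather than from this section, and command overhead such as `nsInfo` and other command fields is ignored. The authoritative rule is the one in the linked client-side encryption specification.

```python
def split_ops_by_size(op_sizes: list[int], limit: int = 2097152) -> list[list[int]]:
    """Greedily group operation indexes so each batch stays within `limit` bytes.

    A batch always receives at least one operation, so a single operation
    larger than `limit` is sent alone instead of being rejected here.
    """
    batches: list[list[int]] = []
    current: list[int] = []
    current_size = 0
    for index, size in enumerate(op_sizes):
        # Start a new batch if adding this operation would cross the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Two operations of roughly 2 MiB each end up in separate batches.
print(split_ops_by_size([2097206, 2097206]))
```

This mirrors the expectation in the limits test that two large inserts split after the first document.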
## Handling the `bulkWrite` Server Response | ||||||||||||
|
||||||||||||
The server's response to `bulkWrite` has the following format: | ||||||||||||
|
@@ -857,6 +860,12 @@ When a `getMore` fails with a retryable error when attempting to iterate the res | |||||||||||
entire `bulkWrite` command to receive a fresh cursor and retry iteration. This work was omitted to minimize the scope of | ||||||||||||
the initial implementation and testing of the new bulk write API, but may be revisited in the future. | ||||||||||||
|
||||||||||||
### Use document sequences for auto-encrypted bulk writes | ||||||||||||
|
||||||||||||
Auto-encryption does not currently support document sequences. This specification should be updated when | ||||||||||||
[DRIVERS-2859](https://jira.mongodb.org/browse/DRIVERS-2859) is completed to require use of document sequences for `ops` | ||||||||||||
and `nsInfo` when auto-encryption is enabled. | ||||||||||||
Review comment: Suggest adding a blurb that recommends waiting for DRIVERS-2859 before implementing if using non-document-sequences is a significant change. Motivated by this comment: #1770 (comment) |
||||||||||||
|
||||||||||||
## Q&A | ||||||||||||
|
||||||||||||
### Is `bulkWrite` supported on Atlas Serverless? | ||||||||||||
|
@@ -928,6 +937,8 @@ error in this specific situation does not seem helpful enough to require size ch | |||||||||||
|
||||||||||||
## **Changelog** | ||||||||||||
|
||||||||||||
- 2025-08-13: Removed the requirement to error when QE is enabled. | ||||||||||||
|
||||||||||||
- 2025-06-27: Added `rawData` option. | ||||||||||||
|
||||||||||||
- 2024-11-05: Updated the requirements regarding the size validation. | ||||||||||||
|
Review comment: This step requires at least QE server support (server 7.0+). Since the `MongoCollection.bulkWrite` cases can run on older servers, suggest conditioning this step on server 8.0+.