Merged
9 changes: 9 additions & 0 deletions source/crud/crud.md
@@ -815,6 +815,13 @@ database-level aggregation will allow users to receive a cursor from these colle

##### Insert, Update, Replace, Delete, and Bulk Writes

###### Generated identifiers

The insert and bulk insert operations described below MUST generate identifiers for all documents that do not already
Member

Noted that this is already discussed in the client bulk write spec.

Contributor Author

Having this behavior discussed here for general CRUD operations makes sense to me, as the client bulk write spec follows from this broader context.

Member

The client bulk write spec stands on its own, except for a reference back to CRUD for modeling unacknowledged write results.

With my last comment, I just meant to acknowledge that no changes were needed to the client bulk write spec since it already addressed _id ordering. This was in the vein of previous PRs like #1644 that required updates to both specs.

have them. These identifiers SHOULD be prepended to the document so they are the first field, in order to prevent the
server from spending time re-ordering the document. If a document already has a user-provided identifier, the driver MAY
re-order the document so the identifier is the first field.
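
A minimal sketch of the rule above, assuming documents are modeled as plain objects. The names `PlainDocument`, `generateId`, and `ensureIdFirst` are hypothetical and not part of the spec; a real driver would generate an ObjectId through its BSON library.

```typescript
// Hypothetical document model. Plain objects keep insertion order for
// string keys that are not integer-like, which is enough for this sketch.
type PlainDocument = Record<string, unknown>;

// Hypothetical stand-in for a driver's ObjectId generator.
let counter = 0;
function generateId(): string {
  counter += 1;
  return `generated-${counter}`;
}

// Returns a document whose `_id` is the first field when the driver
// generates it. When the user already supplied an `_id`, re-ordering is
// optional (the spec says MAY), so it is controlled by a flag here.
function ensureIdFirst(doc: PlainDocument, reorderExisting = false): PlainDocument {
  if (!("_id" in doc)) {
    // Generated identifier: prepend it so the server does not have to re-order.
    return { _id: generateId(), ...doc };
  }
  if (reorderExisting) {
    // User-provided identifier: move it to the front without changing its value.
    const { _id, ...rest } = doc;
    return { _id, ...rest };
  }
  return doc;
}
```

For example, `ensureIdFirst({ x: 1 })` yields an object whose first key is `_id`, while `ensureIdFirst({ x: 1, _id: "u" })` returns the input unchanged unless `reorderExisting` is set.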

```typescript
interface Collection {
```

@@ -2474,6 +2481,8 @@ aforementioned allowance in the SemVer spec.

## Changelog

- 2024-10-28: Clarified that generated identifiers should be prepended to documents.

- 2024-10-01: Add sort option to `replaceOne` and `updateOne`.

- 2024-09-12: Specify that explain helpers support maxTimeMS.
12 changes: 12 additions & 0 deletions source/crud/tests/README.md
@@ -737,3 +737,15 @@ that `firstEvent.operationId` is equal to `secondEvent.operationId`. Assert both
To force completion of the `w:0` writes, execute `coll.countDocuments` and expect the returned count is
`maxMessageSizeBytes / maxBsonObjectSize + 1`. This is intended to avoid incomplete writes interfering with other tests
that may use this collection.

### 16. Generated document identifiers are the first field in their document

Construct a `MongoClient` (referred to as `client`) with
[command monitoring](../../command-logging-and-monitoring/command-logging-and-monitoring.md) enabled to observe
CommandStartedEvents. For each of `insertOne`, client `bulkWrite`, and collection `bulkWrite`, do the following:

- Execute the command with a document that does not contain an `_id` field.
- If possible, capture the wire protocol message (referred to as `request`) of the command.
Member

Is this only necessary in languages where the ordering is still not deterministic in the CommandStartedEvent's command document?

Contributor Author

@NoahStapp, Oct 29, 2024

This is the preferred way to verify the ordering of the document, as drivers may modify the document between emitting the CommandStartedEvent and the actual wire transfer of the command. For example, the Python driver re-orders _id to be the first field during BSON conversion, which takes place after the CommandStartedEvent would be emitted.

Verifying the order of the actual transmitted payload document ensures that the server receives exactly what we expect it to.

Contributor

I think it could potentially be rephrased if wire protocol parsing is difficult for drivers to achieve, but I agree it would be easy to assert _id is the first key in command monitoring and for that not to actually be the case when the wire message is produced.

For this test, I think we can encourage drivers to use command monitoring as the "main path" set of assertions to implement. But we should also encourage drivers to check either that their wire message key order matches their command events or that reordering is done when the wire message is built.

So for Node, I would write a test that asserts the JS object that is inspectable on the event (for any command) matches key order in BSON. Python may check that their wire messages are ordered correctly.

Contributor Author

If drivers are able to check wire messages directly, they should take that path. Otherwise, they should fall back to using command monitoring, ideally verifying that they don't modify field ordering before sending data over the wire.

Contributor

I agree. After all, the point here is that the bytes are in an order that the server benefits from. If your command monitoring is a source of truth for that, you could/should use it, but the real goal is in the wire message.

  - Assert that the first field of `request.documents[0]` is `_id`.
- Otherwise, capture the CommandStartedEvent (referred to as `event`) emitted by the command.
  - Assert that the first field of `event.command.documents[0]` is `_id`.
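
The two assertion paths above could be sketched as follows. This is an illustration under stated assumptions, not a prescribed test implementation: `PlainDocument`, `firstFieldIsIdInEvent`, and `firstFieldIsIdInBson` are hypothetical names, and extracting the document bytes from the wire message (for example, from an OP_MSG payload) is assumed to have been done already by the driver's test harness.

```typescript
// Hypothetical model of a document as exposed on a CommandStartedEvent.
type PlainDocument = Record<string, unknown>;

// Command monitoring path: check the key order of the first document.
// Relies on plain objects preserving insertion order for string keys that
// are not integer-like.
function firstFieldIsIdInEvent(documents: PlainDocument[]): boolean {
  const doc = documents[0];
  return doc !== undefined && Object.keys(doc)[0] === "_id";
}

// Wire message path: check the raw BSON bytes of a single document.
// A BSON document starts with a 4-byte little-endian length, followed by
// elements laid out as: 1 type byte, a NUL-terminated field name, the value.
// The first field name therefore starts at byte offset 5.
function firstFieldIsIdInBson(bson: Uint8Array): boolean {
  const expected = [0x5f, 0x69, 0x64, 0x00]; // "_id" followed by NUL
  if (bson.length < 5 + expected.length) {
    return false; // too short to contain an `_id` element
  }
  return expected.every((byte, i) => bson[5 + i] === byte);
}
```

Checking the bytes directly covers drivers that re-order `_id` during BSON conversion, after the CommandStartedEvent has already been emitted.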
1 change: 0 additions & 1 deletion source/transactions/transactions.md
@@ -1072,7 +1072,6 @@ objective of avoiding duplicate commits.

## **Changelog**


- 2024-10-28: Clarify when drivers must add TransientTransactionError label.

- 2024-10-28: Note read preference must always be primary in a transaction.