DRIVERS-719 OpenTelementry specification #1826

Draft
wants to merge 34 commits into
base: master
97d8a5b
added first tracing tests
nhachicha Jul 11, 2025
c82a6a2
using unsetOrMatches for txNumber, since it's not set for single topo…
nhachicha Jul 16, 2025
61270d6
updating tests
nhachicha Jul 20, 2025
581faa1
Tests and schema update for the upcoming Otel spec
nhachicha Aug 8, 2025
c447d10
Merge branch 'nh/tracing'
nhachicha Aug 8, 2025
3a578ee
Update generated files
nhachicha Aug 8, 2025
dbb1a9c
update schema to latest (1.26)
nhachicha Aug 8, 2025
29e25a1
update tests
nhachicha Aug 8, 2025
f801d78
DRIVERS-719 OpenTelementry specification
comandeo-mongo Jun 26, 2025
9d66a8a
Cursor operations
comandeo-mongo Aug 1, 2025
269221c
update doc
nhachicha Aug 9, 2025
452696d
moved tests under open-telemtry directroy
nhachicha Aug 9, 2025
9f05c59
Fixing formatting
nhachicha Aug 9, 2025
0523751
update formatting using pre-commit mdformat
nhachicha Aug 9, 2025
8466c1b
Adding Test README
nhachicha Aug 9, 2025
18ad723
Adding retries test, fixing duplicate exception tag
nhachicha Aug 13, 2025
04b947a
Update source/open-telemetry/open-telemetry.md
nhachicha Aug 14, 2025
88c7be0
Update source/open-telemetry/open-telemetry.md
nhachicha Aug 14, 2025
ced82eb
Update source/unified-test-format/tests/invalid/entity-client-observe…
nhachicha Aug 15, 2025
f1804c3
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
03d4e3c
Update source/unified-test-format/tests/invalid/entity-client-observe…
nhachicha Aug 15, 2025
3c69ed5
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
240d009
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
6a4d217
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
b596f05
Update generated files
nhachicha Aug 15, 2025
bbb075b
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
a5638da
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
56570a5
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
a4aa513
Update source/unified-test-format/tests/invalid/expectedTracingSpans-…
nhachicha Aug 15, 2025
cb74b54
Using YAML anchors and references
nhachicha Aug 14, 2025
3caa87f
PR feedback: Updating schema, adding more invalid tests
nhachicha Aug 15, 2025
7a68cbc
Fixing lint errors
nhachicha Aug 15, 2025
1c24759
Update source/open-telemetry/open-telemetry.md
comandeo-mongo Aug 18, 2025
a989a49
Update source/open-telemetry/open-telemetry.md
comandeo-mongo Aug 18, 2025
1 change: 1 addition & 0 deletions source/index.md
@@ -36,6 +36,7 @@
- [MongoDB Handshake](mongodb-handshake/handshake.md)
- [OCSP Support](ocsp-support/ocsp-support.md)
- [OP_MSG](message/OP_MSG.md)
- [OpenTelemetry](open-telemetry/open-telemetry.md)
- [Performance Benchmarking](benchmarking/benchmarking.md)
- [Polling SRV Records for mongos Discovery](polling-srv-records-for-mongos-discovery/polling-srv-records-for-mongos-discovery.md)
- [Read and Write Concern](read-write-concern/read-write-concern.md)
251 changes: 251 additions & 0 deletions source/open-telemetry/open-telemetry.md
@@ -0,0 +1,251 @@
# OpenTelemetry

- Title: OpenTelemetry
- Status: Accepted
- Minimum Server Version: N/A

______________________________________________________________________

## Abstract

This specification defines requirements for drivers' OpenTelemetry integration and behavior. Drivers will trace database
commands and driver operations with a pre-defined set of attributes when OpenTelemetry is enabled and configured in an
application.

## META

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

## Specification

### Terms

**Host Application**

An application that uses the MongoDB driver.

#### Span

A Span represents a single operation within a trace. Spans can be nested to form a trace tree. Each trace contains a
root span, which typically describes the entire operation and, optionally, one or more sub-spans for its sub-operations.

Spans encapsulate:

- The span name
- An immutable SpanContext that uniquely identifies the Span
- A parent span in the form of a Span, SpanContext, or null
- A SpanKind
- A start timestamp
- An end timestamp
- Attributes
- A list of links to other Spans
- A list of timestamped Events
- A Status.

#### Tracer

A Tracer is responsible for creating spans, and using a tracer is the only way to create a span. A Tracer is not
responsible for configuration; this should be the responsibility of the TracerProvider instead.

**OpenTelemetry API and SDK**

OpenTelemetry offers two components for implementing instrumentation: the API and the SDK. The OpenTelemetry API provides
all the necessary types and method signatures. If no OpenTelemetry SDK is available at runtime, API methods are no-ops.
The OpenTelemetry SDK is the actual implementation of the API; when it is available, API methods perform real work.

### Implementation Requirements

#### External Dependencies

Drivers MAY add a dependency on the corresponding OpenTelemetry API. This is the recommended way to implement
OpenTelemetry in libraries. Alternatively, drivers can implement OpenTelemetry support using any suitable tools within
the driver ecosystem. Drivers MUST NOT add a dependency on the OpenTelemetry SDK.

#### Enabling and Disabling OpenTelemetry

OpenTelemetry SHOULD be disabled by default.

Drivers SHOULD support configuring OpenTelemetry on multiple levels.

Review comment (Member):

Do I understand correctly that the spec tests will only exercise enabling OpenTelemetry at the MongoClient level? Would you intend to have prose tests to cover the other two mechanisms (and just assert functionality with a single operation)?

- **MongoClient Level**: Drivers SHOULD provide a configuration option for `MongoClient`'s Configuration/Settings that
enables or disables tracing for operations and commands executed with this client. This option MUST override
settings on higher levels. This configuration can be implemented with a `MongoClient` option, for example,
`tracing.enabled`.

Review comment (Member):

Should this section clarify that this is not a URI option, or is that redundant? I'm not familiar with the language in other specs for non-URI MongoClient options, but we should follow prior art here. Disregard if there's no concern.

- **Driver Level**: Drivers SHOULD provide a global setting that enables or disables OpenTelemetry for all `MongoClient`
instances (excluding those that explicitly override the setting). This configuration SHOULD be implemented with an
environment variable `OTEL_#{LANG}_INSTRUMENTATION_MONGODB_ENABLED`. Drivers MAY provide other means to globally
disable OpenTelemetry that are more suitable for their language ecosystem. This option MUST override settings on the
higher level.
- **Host Application Level**: If the host application enables OpenTelemetry for all available instrumentations (e.g.,
Ruby), and a driver can detect this, OpenTelemetry SHOULD be enabled in the driver.

Drivers MUST NOT try to detect whether the OpenTelemetry SDK library is available, and enable tracing based on this.
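
The precedence between the three levels can be sketched as follows. This is a minimal illustration, not a prescribed API: the function name `resolve_tracing_enabled`, the `PYTHON` token substituted for `#{LANG}`, and the set of values treated as "enabled" are all assumptions, since the specification does not define them.

```python
import os
from typing import Mapping, Optional

# Hypothetical name: the spec's template with #{LANG} assumed to be "PYTHON".
ENV_VAR = "OTEL_PYTHON_INSTRUMENTATION_MONGODB_ENABLED"
# Assumption: the spec does not say which values mean "enabled".
TRUTHY = {"1", "true", "yes"}


def resolve_tracing_enabled(
    client_option: Optional[bool] = None,
    env: Optional[Mapping[str, str]] = None,
    host_app_enabled: bool = False,
) -> bool:
    """Resolve the effective setting; the most specific level wins."""
    env = os.environ if env is None else env
    # 1. MongoClient level: an explicit option overrides everything else.
    if client_option is not None:
        return client_option
    # 2. Driver level: the environment variable overrides the host application.
    raw = env.get(ENV_VAR)
    if raw is not None:
        return raw.strip().lower() in TRUTHY
    # 3. Host application level, if detectable; otherwise tracing stays disabled.
    return host_app_enabled
```

Passing the environment as a parameter keeps the sketch testable; a driver would more likely read `os.environ` once at client construction.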

#### Tracer Attributes

If a driver creates a Tracer using OpenTelemetry API, drivers MUST use the following attributes:

- `name`: A string that identifies the driver. It can be the name of a driver's component (e.g., "mongo", "PyMongo") or
a package name (e.g., "com.mongo.Driver"). Drivers SHOULD select a name that is idiomatic for their language and
ecosystem. Drivers SHOULD follow the Instrumentation Scope guidance.
- `version`: The version of the driver.

#### Instrumenting Driver Operations

When a user calls the driver's public API, the driver MUST create a span for every driver operation. Drivers MUST start
the span as soon as possible so that the span’s duration reflects all activities made by the driver, such as server
selection and serialization/deserialization.

The span for the operation MUST be created within the current span of the host application, with the exceptions listed
below.

##### Cursors

If the driver operation returns a cursor, spans for all the subsequent operations on the cursor SHOULD be nested into
the operation span. This includes operations such as `getMore`, `next`, `close`.

##### `withTransaction`

The `withTransaction` operation is a special case because it may include other operations that are executed "in scope"
of `withTransaction`. In this case, spans for operations that are executed inside the callbacks SHOULD be nested into
the `withTransaction` span.

##### Span Name

The span name SHOULD be:

- `driver_operation_name db.collection_name` if the command is executed on a collection (e.g.,
`findOneAndDelete warehouse.users`).
- `db.driver_operation_name` if there is no specific collection for the command (e.g., `warehouse.runCommand`).
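
The two naming cases, as written in this draft, can be expressed in a few lines. The helper name `operation_span_name` is hypothetical, and the format mirrors the draft text only; it is still under review and may change.

```python
from typing import Optional


def operation_span_name(operation: str, database: str, collection: Optional[str] = None) -> str:
    """Format an operation span name following the two cases in the current draft."""
    if collection:
        # Collection-level operation: "<operation> <db>.<collection>"
        return f"{operation} {database}.{collection}"
    # Database-level operation: "<db>.<operation>"
    return f"{database}.{operation}"
```

For example, `operation_span_name("findOneAndDelete", "warehouse", "users")` yields `findOneAndDelete warehouse.users`, and `operation_span_name("runCommand", "warehouse")` yields `warehouse.runCommand`.
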

Comment on lines +118 to +120

Review comment (Member):

This looks inconsistent. Why is the driver_operation_name at the beginning for a collection operation and at the end for a database operation?

One of these would be more consistent:

- warehouse.users.findOneAndDelete and warehouse.runCommand
- findOneAndDelete warehouse.users and runCommand warehouse

Review comment (Member):

@GromNaN: Note the same inconsistency exists in the Instrumenting Server Commands > Span Name section below. I'd be in favor of using the operation/command name as the initial prefix and then appending the database or full namespace as appropriate.

##### Span Kind

Span kind MUST be "client".

##### Span Attributes

Spans SHOULD have the following attributes:

Review comment (Member):

db.client.connection.pool.name is not listed. Is it because it doesn't have the "stable" flag? It would be easy to fill when the connection is created by an ODM, they usually have a connection name.

| Attribute | Type | Description | Requirement Level |
| :--------------------- | :------- | :------------------------------------------------------------------------- | :-------------------- |
| `db.system` | `string` | MUST be 'mongodb' | Required |
| `db.namespace` | `string` | The database name | Required if available |
| `db.collection.name` | `string` | The collection being accessed within the database stated in `db.namespace` | Required if available |
| `db.operation.name` | `string` | The name of the driver operation being executed | Required |
| `db.operation.summary` | `string` | Equivalent to span name | Required |
| `db.mongodb.cursor_id` | `int64` | If a cursor is created or used in the operation | Required if available |

Not all attributes are available at the moment of span creation. Drivers need to add attributes at later stages, which
requires an operation span to be available throughout the complete operation lifecycle.
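
A minimal sketch of why the span has to stay reachable for the whole operation: some attributes are set at creation, while `db.mongodb.cursor_id` only becomes known after the server replies. `OperationSpanSketch` is a hypothetical stand-in for a real span object and is not part of any OpenTelemetry API.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


class OperationSpanSketch:
    """Stand-in for a real span; it only stores a name and attributes."""

    def __init__(self, name: str, operation: str, database: Optional[str] = None,
                 collection: Optional[str] = None) -> None:
        self.name = name
        # Attributes that are known when the span is created.
        self.attributes: Dict[str, AttributeValue] = {
            "db.system": "mongodb",
            "db.operation.name": operation,
            "db.operation.summary": name,
        }
        if database is not None:
            self.attributes["db.namespace"] = database
        if collection is not None:
            self.attributes["db.collection.name"] = collection

    def set_cursor_id(self, cursor_id: int) -> None:
        # Known only after the server replies, so it is added later.
        self.attributes["db.mongodb.cursor_id"] = cursor_id


span = OperationSpanSketch("find warehouse.users", "find", "warehouse", "users")
span.set_cursor_id(4242)  # illustrative cursor id
```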

##### Exceptions

If the driver operation fails with an exception, drivers MUST record an exception to the current operation span. When
recording an exception, drivers SHOULD add the following attributes to the span, when the content for the attribute is
available:

- `exception.message`
- `exception.type`
- `exception.stacktrace`
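
As an illustration, the three attributes can be derived from a caught exception with the standard library alone. The helper name `exception_attributes` and the choice to qualify non-builtin types with their module are assumptions; a driver using the OpenTelemetry API would normally rely on the API's own exception-recording facility instead.

```python
import traceback
from typing import Dict


def exception_attributes(exc: BaseException) -> Dict[str, str]:
    """Build the three exception attributes from a caught exception."""
    exc_type = type(exc)
    module = exc_type.__module__
    # Qualify the type name with its module unless it is a builtin.
    if module == "builtins":
        type_name = exc_type.__qualname__
    else:
        type_name = f"{module}.{exc_type.__qualname__}"
    stack = "".join(traceback.format_exception(exc_type, exc, exc.__traceback__))
    return {
        "exception.type": type_name,
        "exception.message": str(exc),
        "exception.stacktrace": stack,
    }


try:
    raise TimeoutError("server selection timed out")
except TimeoutError as err:
    attrs = exception_attributes(err)
```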

#### Instrumenting Server Commands

Drivers MUST create a span for every server command sent to the server as a result of a public API call, except for
sensitive commands as listed in the command logging and monitoring specification.

Spans for commands MUST be nested under the corresponding driver operation span. If the command is being
retried, the driver MUST create a separate span for each retry; all retries MUST be nested under the same operation
span.
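
The nesting rule for retries can be pictured with a toy tracer that records only parent/child relationships. `RecordingTracer` and `SpanNode` are hypothetical names used for illustration; they are not OpenTelemetry types.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class SpanNode:
    name: str
    parent: Optional["SpanNode"] = None
    children: List["SpanNode"] = field(default_factory=list)


class RecordingTracer:
    """Toy tracer that records parent/child relationships only."""

    def __init__(self) -> None:
        self.roots: List[SpanNode] = []
        self._current: Optional[SpanNode] = None

    @contextmanager
    def start_span(self, name: str) -> Iterator[SpanNode]:
        node = SpanNode(name, parent=self._current)
        if self._current is not None:
            self._current.children.append(node)
        else:
            self.roots.append(node)
        previous, self._current = self._current, node
        try:
            yield node
        finally:
            self._current = previous  # restore the enclosing span


tracer = RecordingTracer()
with tracer.start_span("insertOne warehouse.users") as operation:
    for attempt in range(2):  # first attempt plus one retry
        with tracer.start_span("insert warehouse.users"):
            pass  # the command would be sent to the server here
```

After this runs, the single operation span has two sibling command spans as children, one per attempt.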

##### Span Name

The span name SHOULD be:

- `server_command db.collection_name` if the command is executed on a collection (e.g.,
`findAndModify warehouse.users`).
- `db.server_command` if there is no specific collection for the command.

##### Span Kind

Span kind MUST be "client".

##### Span Attributes

Spans SHOULD have the following attributes:

| Attribute | Type | Description | Requirement Level |
| :-------------------------------- | :------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------- |
| `db.system` | `string` | MUST be 'mongodb' | Required |
| `db.namespace` | `string` | The database name | Required if available |
| `db.collection.name` | `string` | The collection being accessed within the database stated in `db.namespace` | Required if available |
| `db.command.name` | `string` | The name of the server command being executed | Required |
| `db.response.status_code` | `string` | MongoDB error code represented as a string. This attribute should be added only if an error happens. | Required if an error happens |
| `error.type` | `string` | Describes a class of error the operation ended with. This attribute should be added only if an error happens. Examples: `timeout; java.net.UnknownHostException; server_certificate_invalid; 500`. | Required if an error happens |
| `server.port` | `int64` | Server port number | Required |
| `server.address` | `string` | Name of the database host, or IP address if name is not known | Required |
| `network.transport` | `string` | MUST be 'tcp' or 'unix' depending on the protocol | Required |
| `db.query.summary` | `string` | Equivalent to span name | Required |
| `db.mongodb.server_connection_id` | `int64` | Server connection id | Required if available |
| `db.mongodb.driver_connection_id` | `int64` | Local connection id | Required if available |
| `db.query.text` | `string` | Database command that was sent to the server. Content should be equivalent to the `document` field of the CommandStartedEvent of the command monitoring. | Conditional |
| `db.mongodb.cursor_id` | `int64` | If a cursor is created or used in the operation | Required if available |

##### db.response.status_code and error.type

These attributes should be added only if the command was not successful. The content of `error.type` is language
specific; a driver decides what best describes the error.
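
A sketch of how the command span attributes might be assembled so that the two error attributes appear only on failure. The function name and parameters are hypothetical, the connection id, cursor id, and query text attributes are omitted for brevity, and the `error_type` value shown is only one possible choice.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


def command_span_attributes(
    command_name: str,
    span_name: str,
    database: str,
    host: str,
    port: int,
    transport: str = "tcp",
    collection: Optional[str] = None,
    error_code: Optional[int] = None,
    error_type: Optional[str] = None,
) -> Dict[str, AttributeValue]:
    """Assemble command span attributes; error fields appear only on failure."""
    if transport not in ("tcp", "unix"):
        raise ValueError("network.transport must be 'tcp' or 'unix'")
    attrs: Dict[str, AttributeValue] = {
        "db.system": "mongodb",
        "db.namespace": database,
        "db.command.name": command_name,
        "db.query.summary": span_name,
        "server.address": host,
        "server.port": port,
        "network.transport": transport,
    }
    if collection is not None:
        attrs["db.collection.name"] = collection
    if error_code is not None:
        # The status code is a string attribute, so the numeric code is converted.
        attrs["db.response.status_code"] = str(error_code)
    if error_type is not None:
        attrs["error.type"] = error_type
    return attrs


ok = command_span_attributes("find", "find warehouse.users", "warehouse",
                             "localhost", 27017, collection="users")
failed = command_span_attributes("find", "find warehouse.users", "warehouse",
                                 "localhost", 27017, collection="users",
                                 error_code=50, error_type="MaxTimeMSExpired")
```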

##### db.query.text

This attribute contains the full database command that was executed, serialized to Extended JSON. If not truncated, the
content of this attribute SHOULD be equivalent to the `document` field of the CommandStartedEvent of the command
monitoring, excluding the following fields: `lsid`, `$db`, `$clusterTime`, `signature`.

Drivers MUST NOT add this attribute by default. Drivers MUST provide a toggle to enable this attribute. This
configuration can be implemented with an environment variable
`OTEL_#{LANG}_INSTRUMENTATION_MONGODB_QUERY_TEXT_MAX_LENGTH` set to a positive integer value. The attribute will be
added and truncated to the provided value (similar to the Logging specification).

On the `MongoClient` level this configuration can be implemented with a `MongoClient` option, for example,
`tracing.query_text_max_length`.
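
The filtering and truncation described in this section can be sketched as below. Several things here are assumptions: the helper name `query_text`, filtering at the top level of the command only, and `json.dumps` standing in for a real Extended JSON serializer, which a driver would use instead.

```python
import json
from typing import Any, Mapping, Optional

# Fields the specification excludes from db.query.text.
EXCLUDED_FIELDS = ("lsid", "$db", "$clusterTime", "signature")


def query_text(command: Mapping[str, Any], max_length: Optional[int]) -> Optional[str]:
    """Return the db.query.text value, or None when the attribute is disabled."""
    if max_length is None or max_length <= 0:
        return None  # the attribute is not added by default
    # Assumption: only top-level fields are filtered.
    filtered = {k: v for k, v in command.items() if k not in EXCLUDED_FIELDS}
    # Assumption: json.dumps approximates Extended JSON for this simple command.
    text = json.dumps(filtered, separators=(",", ":"))
    return text[:max_length]


cmd = {"find": "users", "filter": {"x": 1}, "$db": "warehouse", "lsid": {"id": "abc"}}
full = query_text(cmd, 1000)
short = query_text(cmd, 10)
```

Here `full` keeps only the `find` and `filter` fields, and `short` is the same text cut to ten characters.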

##### db.mongodb.cursor_id

If the command returns a cursor, or uses a cursor, the `cursor_id` attribute SHOULD be added.

##### Exception Handling

Exceptions MUST be added to the parent span of the command span, which is the driver operation span.

## Motivation for Change

A common complaint from our support team is that they don't know how to easily get debugging information from drivers.
Some drivers provide debug logging, but others do not. For drivers that do provide it, the log messages produced and the
mechanisms for enabling debug logging are inconsistent.

Although users can implement their own debug logging support via existing driver events (SDAM, APM, etc), this requires
code changes. It is often difficult to quickly implement and deploy such changes in production at the time they are
needed, and to remove the changes afterward. Additionally, there are useful scenarios to log that do not correspond to
existing events. Standardizing on debug log messages that drivers produce and how to enable/configure logging will
provide TSEs, CEs, and MongoDB users an easier way to get debugging information out of our drivers, facilitate support
of drivers for our internal teams, and improve our documentation around troubleshooting.

## Test Plan

TODO

## Backwards Compatibility

Introduction of OpenTelemetry in new driver versions should not significantly affect existing applications that do not
enable OpenTelemetry. However, since the no-op tracing operation may introduce some performance degradation (though it
should be negligible), customers should be informed of this feature and how to disable it completely.

If a driver is used in an application that has OpenTelemetry enabled, customers will see traces from the driver in their
OpenTelemetry backends. This may be unexpected and may cause negative effects in some cases (e.g., the OpenTelemetry
backend may not have enough capacity to process new traces). Customers should be informed of this feature and how to
disable it completely.

## Security Implication

Drivers MUST take care to avoid exposing sensitive information (e.g. authentication credentials) in traces.
34 changes: 34 additions & 0 deletions source/open-telemetry/tests/README.md
@@ -0,0 +1,34 @@
# OpenTelemetry Tests

______________________________________________________________________

## Testing

### Automated Tests

The YAML and JSON files in this directory are platform-independent tests meant to exercise a driver's implementation of
the OpenTelemetry specification. These tests utilize the
[Unified Test Format](../../unified-test-format/unified-test-format.md).

For each test, create a MongoClient and configure it to enable tracing.

```yaml
createEntities:
  - client:
      id: client0
      observeTracingMessages:
        enableCommandPayload: true
```

These tests require the ability to collect tracing [spans](../open-telemetry.md#span) data in a structured form as
described in the
[Unified Test Format specification.expectTracingMessages](../../unified-test-format/unified-test-format.md#expectTracingMessages).
For example, the Java driver uses [Micrometer](https://jira.mongodb.org/browse/JAVA-5732) to collect tracing spans.

```yaml
expectTracingMessages:
  client: client0
  ignoreExtraSpans: false
  spans:
    ...
```