Commit da80b73

First draft
1 parent db69351 commit da80b73

2 files changed

+227
-0
lines changed

source/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@
3636
- [MongoDB Handshake](mongodb-handshake/handshake.md)
3737
- [OCSP Support](ocsp-support/ocsp-support.md)
3838
- [OP_MSG](message/OP_MSG.md)
39+
- [OpenTelemetry](open-telemetry/open-telemetry.md)
3940
- [Performance Benchmarking](benchmarking/benchmarking.md)
4041
- [Polling SRV Records for mongos Discovery](polling-srv-records-for-mongos-discovery/polling-srv-records-for-mongos-discovery.md)
4142
- [Read and Write Concern](read-write-concern/read-write-concern.md)
Lines changed: 226 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,226 @@
1+
# OpenTelemetry
2+
3+
- Title: OpenTelemetry
4+
- Status: Accepted
5+
- Minimum Server Version: N/A
6+
7+
______________________________________________________________________
8+
9+
## Abstract
10+
11+
This specification defines requirements for drivers' OpenTelemetry integration and behavior. Drivers will trace database
12+
commands and driver operations with a pre-defined set of attributes when OpenTelemetry is enabled and configured in an
13+
application.
14+
15+
## META
16+
17+
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
18+
"OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).
19+
20+
## Specification
21+
22+
### Terms
23+
24+
**Host Application**
25+
26+
An application that uses the MongoDB driver.
27+
28+
**Span**
29+
30+
A Span represents a single operation within a trace. Spans can be nested to form a trace tree. Each trace contains a
31+
root span, which typically describes the entire operation and, optionally, one or more sub-spans for its sub-operations.
32+
33+
Spans encapsulate:
34+
35+
- The span name
36+
- An immutable SpanContext that uniquely identifies the Span
37+
- A parent span in the form of a Span, SpanContext, or null
38+
- A SpanKind
39+
- A start timestamp
40+
- An end timestamp
41+
- Attributes
42+
- A list of Links to other Spans
43+
- A list of timestamped Events
44+
- A Status.
45+
46+
**Tracer**
47+
48+
A Tracer is responsible for creating spans, and using a tracer is the only way to create a span. A Tracer is not
49+
responsible for configuration; this should be the responsibility of the TracerProvider instead.
50+
51+
**OpenTelemetry API and SDK**
52+
53+
OpenTelemetry offers two components for implementing instrumentation – API and SDK. The OpenTelemetry API provides all
54+
the necessary types and method signatures. If there is no OpenTelemetry SDK available at runtime, API methods are no-ops.
55+
The OpenTelemetry SDK is the actual implementation of the API. If the SDK is available, API methods perform real work.
56+
57+
### Implementation Requirements
58+
59+
Drivers MAY add a dependency on the corresponding OpenTelemetry API. This is the recommended way of implementing
60+
OpenTelemetry in libraries. Alternatively, drivers can implement OpenTelemetry support using any suitable tools within
61+
the driver ecosystem. Drivers MUST NOT add a dependency on the OpenTelemetry SDK.
62+
63+
#### Enabling and Disabling OpenTelemetry
64+
65+
OpenTelemetry SHOULD be disabled by default.
66+
67+
Drivers SHOULD support configuring OpenTelemetry on multiple levels.
68+
69+
- **MongoClient Level**: Drivers SHOULD provide a configuration option for `MongoClient`'s Configuration/Settings that
70+
enables or disables tracing for operations and commands executed with this client. This option MUST override
71+
settings on higher levels.
72+
- **Driver Level**: Drivers SHOULD provide a global setting that enables or disables OpenTelemetry for all `MongoClient`
73+
instances (excluding those that explicitly override the setting). This configuration can be implemented with an
74+
environment variable `OTEL_#{LANG}_INSTRUMENTATION_MONGODB_ENABLED`. Drivers MAY provide other means to globally
75+
disable OpenTelemetry that are more suitable for their language ecosystem. This option MUST override settings on the
76+
higher level.
77+
- **Host Application Level**: If the host application enables OpenTelemetry for all available instrumentations (e.g.,
78+
Ruby), and a driver can detect this, OpenTelemetry SHOULD be enabled in the driver.
79+
80+
Drivers MUST NOT detect whether the OpenTelemetry SDK library is available and enable tracing based on the result.
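
The precedence between the three levels can be sketched as follows. This is only an illustration, not a required implementation: the variable name assumes `#{LANG}` is replaced with `PYTHON`, the set of accepted string values is an assumption (the specification does not define them), and `tracing_enabled` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional

# Hypothetical name: "#{LANG}" from the spec's template replaced with PYTHON.
ENV_VAR = "OTEL_PYTHON_INSTRUMENTATION_MONGODB_ENABLED"

# Assumption: the spec does not say which strings mean enabled or disabled.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _parse_flag(raw: Optional[str]) -> Optional[bool]:
    """Return True/False for a recognised value, None if unset or unrecognised."""
    if raw is None:
        return None
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    return None


def tracing_enabled(
    client_option: Optional[bool] = None,
    env: Mapping[str, str] = os.environ,
    host_app_enabled: Optional[bool] = None,
) -> bool:
    """Resolve the effective setting; the most specific level that is set wins."""
    if client_option is not None:       # MongoClient level
        return client_option
    driver_level = _parse_flag(env.get(ENV_VAR))
    if driver_level is not None:        # driver level (environment variable)
        return driver_level
    if host_app_enabled is not None:    # host application level, if detectable
        return host_app_enabled
    return False                        # disabled by default
```

Note that the sketch never inspects whether an SDK is installed; only explicit configuration turns tracing on.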
81+
82+
#### Tracer Attributes
83+
84+
If a driver creates a Tracer using the OpenTelemetry API, it MUST use the following attributes:
85+
86+
- `name`: A string that identifies the driver. It can be the name of a driver's component (e.g., "mongo", "PyMongo") or
87+
a package name (e.g., "com.mongo.Driver"). Drivers SHOULD select a name that is idiomatic for their language and
88+
ecosystem. Drivers SHOULD follow the Instrumentation Scope guidance.
89+
- `version`: The version of the driver.
90+
91+
#### Instrumenting Driver Operations
92+
93+
When a user calls the driver's public API, the driver MUST create a span for every driver operation. Drivers MUST start
94+
the span as soon as possible so that the span’s duration reflects all work performed by the driver, such as server
95+
selection and serialization/deserialization.
96+
97+
##### `withTransaction`
98+
99+
The `withTransaction` operation is a special case because it may include other operations that are executed "in scope"
100+
of `withTransaction`. In this case, spans for operations that are executed inside the callbacks SHOULD be nested into
101+
the `withTransaction` span.
102+
103+
##### Span Name
104+
105+
The span name SHOULD be:
106+
107+
- `driver_operation_name db.collection_name` if the command is executed on a collection (e.g.,
108+
`findOneAndDelete warehouse.users`).
109+
- `db.driver_operation_name` if there is no specific collection for the command (e.g., `warehouse.runCommand`).
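
As an illustration of the two naming rules, a minimal sketch is shown below. The helper name `span_name` is hypothetical and not part of this specification.

```python
from typing import Optional


def span_name(operation: str, database: str, collection: Optional[str] = None) -> str:
    """Build a span name from the operation, database, and optional collection."""
    if collection:
        # Operation runs on a collection: "<operation> <db>.<collection>"
        return f"{operation} {database}.{collection}"
    # No specific collection: "<db>.<operation>"
    return f"{database}.{operation}"
```

With the examples from the list above, `span_name("findOneAndDelete", "warehouse", "users")` yields `findOneAndDelete warehouse.users`, and `span_name("runCommand", "warehouse")` yields `warehouse.runCommand`.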
110+
111+
##### Span Kind
112+
113+
Span kind MUST be "client".
114+
115+
##### Span Attributes
116+
117+
Spans SHOULD have the following attributes:
118+
119+
| Attribute              | Type     | Description                                                                | Requirement Level     |
| :--------------------- | :------- | :------------------------------------------------------------------------- | :-------------------- |
| `db.system`            | `string` | MUST be 'mongodb'                                                          | Required              |
| `db.namespace`         | `string` | The database name                                                          | Required if available |
| `db.collection.name`   | `string` | The collection being accessed within the database stated in `db.namespace` | Required if available |
| `db.operation.name`    | `string` | The name of the driver operation being executed                            | Required              |
| `db.operation.summary` | `string` | Equivalent to span name                                                    | Required              |
| `db.mongodb.cursor_id` | `int64`  | If a cursor is created or used in the operation                            | Required if available |
127+
128+
Not all attributes are available at the moment of span creation. Drivers need to add attributes at later stages, which
129+
requires an operation span to be available throughout the complete operation lifecycle.
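
The staged population of attributes might look like the sketch below. It models attributes as a plain dictionary rather than a real span object, and both helper names are hypothetical.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


def initial_operation_attributes(
    operation: str,
    summary: str,
    database: Optional[str] = None,
    collection: Optional[str] = None,
) -> Dict[str, AttributeValue]:
    """Attributes known when the operation span is created."""
    attrs: Dict[str, AttributeValue] = {
        "db.system": "mongodb",
        "db.operation.name": operation,
        "db.operation.summary": summary,  # equivalent to the span name
    }
    # "Required if available": add only when the value is known.
    if database is not None:
        attrs["db.namespace"] = database
    if collection is not None:
        attrs["db.collection.name"] = collection
    return attrs


def add_cursor_id(attrs: Dict[str, AttributeValue], cursor_id: Optional[int]) -> None:
    """Attach the cursor id later, once the server reply is available."""
    if cursor_id is not None:
        attrs["db.mongodb.cursor_id"] = int(cursor_id)
```

The second helper is called after the first, which is why the operation span has to stay reachable until the operation completes.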
130+
131+
##### Exceptions
132+
133+
If the driver operation fails with an exception, drivers MUST record an exception to the current operation span. When
134+
recording an exception, drivers SHOULD add the following attributes to the span, when the content for the attribute is
135+
available:
136+
137+
- `exception.message`
138+
- `exception.type`
139+
- `exception.stacktrace`
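
A sketch of how these attributes could be derived from a Python exception using only the standard library is shown below. The helper name `exception_attributes` is hypothetical, and reporting built-in exceptions without a module prefix is a convention chosen for this sketch, not a requirement of this specification.

```python
import traceback
from typing import Dict


def exception_attributes(exc: BaseException) -> Dict[str, str]:
    """Collect exception attributes, skipping any whose content is empty."""
    cls = type(exc)
    module = cls.__module__
    # Convention assumed here: built-in exceptions carry no module prefix.
    type_name = cls.__qualname__ if module == "builtins" else f"{module}.{cls.__qualname__}"
    attrs = {"exception.type": type_name}

    message = str(exc)
    if message:
        attrs["exception.message"] = message

    stack = "".join(traceback.format_exception(cls, exc, exc.__traceback__))
    if stack:
        attrs["exception.stacktrace"] = stack
    return attrs
```

A driver that uses the OpenTelemetry API directly would normally let the API's own exception-recording call populate these fields instead.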
140+
141+
#### Instrumenting Server Commands
142+
143+
Drivers MUST create a span for every server command sent to the server as a result of a public API call, except for
144+
sensitive commands as listed in the command logging and monitoring specification.
145+
146+
Spans for commands MUST be nested under the corresponding driver operation span. If the command is being
147+
retried, the driver MUST create a separate span for each retry.
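
The nesting and per-retry rules can be illustrated with a toy span model. Everything here is hypothetical: `Span` is a stand-in for a real tracing span, and treating `ConnectionError` as the only retryable failure is an assumption made for the sketch (retryability is defined by other specifications).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass(eq=False)
class Span:
    """Toy stand-in for a tracing span; not the OpenTelemetry type."""
    name: str
    kind: str = "client"
    parent: Optional["Span"] = None
    children: List["Span"] = field(default_factory=list)
    error: Optional[str] = None

    def child(self, name: str) -> "Span":
        span = Span(name=name, parent=self)
        self.children.append(span)
        return span


def run_command_with_retries(
    operation_span: Span,
    command_span_name: str,
    send: Callable[[], Any],
    max_attempts: int = 2,
) -> Any:
    """Send a command, creating one command span per attempt."""
    for attempt in range(max_attempts):
        span = operation_span.child(command_span_name)  # a new span for every attempt
        try:
            return send()
        except ConnectionError as exc:  # assumed retryable error for this sketch
            span.error = type(exc).__name__
            if attempt == max_attempts - 1:
                raise
```

After one failed attempt and one successful retry, the operation span has two child command spans, and only the first carries an error.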
148+
149+
##### Span Name
150+
151+
The span name SHOULD be:
152+
153+
- `server_command db.collection_name` if the command is executed on a collection (e.g.,
154+
`findAndModify warehouse.users`).
155+
- `db.server_command` if there is no specific collection for the command.
156+
157+
##### Span Kind
158+
159+
Span kind MUST be "client".
160+
161+
##### Span Attributes
162+
163+
Spans SHOULD have the following attributes:
164+
165+
| Attribute                         | Type     | Description                                                                                                                                                                                          | Requirement Level            |
| :-------------------------------- | :------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------- |
| `db.system`                       | `string` | MUST be 'mongodb'                                                                                                                                                                                    | Required                     |
| `db.namespace`                    | `string` | The database name                                                                                                                                                                                    | Required if available        |
| `db.collection.name`              | `string` | The collection being accessed within the database stated in `db.namespace`                                                                                                                           | Required if available        |
| `db.command.name`                 | `string` | The name of the server command being executed                                                                                                                                                        | Required                     |
| `db.response.status_code` (\*)    | `string` | MongoDB error code represented as a string. This attribute should be added only if an error happens.                                                                                                 | Required if an error happens |
| `error.type` (\*)                 | `string` | Describes a class of error the operation ended with. This attribute should be added only if an error happens. Examples: `timeout; java.net.UnknownHostException; server_certificate_invalid; 500`.   | Required if an error happens |
| `server.port`                     | `int64`  | Server port number                                                                                                                                                                                   | Required                     |
| `server.address`                  | `string` | Name of the database host, or IP address if name is not known                                                                                                                                        | Required                     |
| `network.transport`               | `string` | MUST be 'tcp' or 'unix' depending on the protocol                                                                                                                                                    | Required                     |
| `db.query.summary`                | `string` | Equivalent to span name                                                                                                                                                                              | Required                     |
| `db.mongodb.server_connection_id` | `int64`  | Server connection id                                                                                                                                                                                 | Required if available        |
| `db.mongodb.driver_connection_id` | `int64`  | Local connection id                                                                                                                                                                                  | Required if available        |
| `db.query.text` (\*\*)            | `string` | Database command that was sent to the server. Content should be equivalent to the `document` field of the CommandStartedEvent of the command monitoring.                                             | Conditional                  |
| `db.mongodb.cursor_id` (\*\*\*)   | `int64`  | If a cursor is created or used in the operation                                                                                                                                                      | Required if available        |
181+
182+
(\*) `db.response.status_code` and `error.type` attributes should be added only if the command was not successful. The
183+
content of `error.type` is language specific; a driver decides what best describes the error.
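
A minimal sketch of the error attributes follows. The helper name is hypothetical, and the values used in the usage note are examples only, since the choice of `error.type` is left to each driver.

```python
from typing import Dict, Optional


def command_error_attributes(
    error_code: Optional[int],
    error_type: Optional[str],
) -> Dict[str, str]:
    """Return error attributes for a failed command; empty on success."""
    attrs: Dict[str, str] = {}
    if error_code is not None:
        # The server error code is reported as a string, not an integer.
        attrs["db.response.status_code"] = str(error_code)
    if error_type is not None:
        attrs["error.type"] = error_type
    return attrs
```

For a server-reported duplicate key error this could produce `{"db.response.status_code": "11000", "error.type": "DuplicateKeyError"}`, while a client-side timeout with no server code would produce only `error.type`.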
184+
185+
(\*\*) `db.query.text` contains the full database command that was executed, serialized to Extended JSON. Drivers MUST NOT add
186+
this attribute by default. Drivers MUST provide a toggle to enable this attribute. This configuration can be implemented
187+
with an environment variable `OTEL_#{LANG}_INSTRUMENTATION_MONGODB_QUERY_TEXT_MAX_LENGTH` set to a positive integer
188+
value. The attribute will be added and truncated to the provided value (similar to the Logging specification).
189+
190+
(\*\*\*) If the command returns a cursor, or uses a cursor, the `cursor_id` attribute SHOULD be added.
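
The `db.query.text` toggle described in note (\*\*) could be sketched as below. The variable name assumes `#{LANG}` is replaced with `PYTHON`, the helper name is hypothetical, and truncating by Python string length (code points) is an assumption, since this specification does not state the unit.

```python
import os
from typing import Mapping, Optional

# Hypothetical name: "#{LANG}" from the spec's template replaced with PYTHON.
MAX_LENGTH_VAR = "OTEL_PYTHON_INSTRUMENTATION_MONGODB_QUERY_TEXT_MAX_LENGTH"


def query_text_attribute(
    command_json: str,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the value for db.query.text, or None if the attribute is disabled."""
    raw = env.get(MAX_LENGTH_VAR)
    if raw is None:
        return None                  # not added by default
    try:
        max_length = int(raw)
    except ValueError:
        return None                  # not an integer: treat as disabled
    if max_length <= 0:
        return None                  # only positive integers enable the attribute
    return command_json[:max_length]  # assumption: truncate by code points
```

The caller is expected to pass the command already serialized to Extended JSON and to skip the attribute entirely when the function returns `None`.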
191+
192+
##### Exception Handling
193+
194+
If an exception was thrown, it MUST be recorded in accordance with OpenTelemetry specifications for exceptions.
195+
196+
## Motivation for Change
197+
198+
A common complaint from our support team is that they don't know how to easily get debugging information from drivers.
199+
Some drivers provide debug logging, but others do not. For drivers that do provide it, the log messages produced and the
200+
mechanisms for enabling debug logging are inconsistent.
201+
202+
Although users can implement their own debug logging support via existing driver events (SDAM, APM, etc), this requires
203+
code changes. It is often difficult to quickly implement and deploy such changes in production at the time they are
204+
needed, and to remove the changes afterward. Additionally, there are useful scenarios to log that do not correspond to
205+
existing events. Standardizing on debug log messages that drivers produce and how to enable/configure logging will
206+
provide TSEs, CEs, and MongoDB users an easier way to get debugging information out of our drivers, facilitate support
207+
of drivers for our internal teams, and improve our documentation around troubleshooting.
208+
209+
## Test Plan
210+
211+
TODO
212+
213+
## Backwards Compatibility
214+
215+
Introduction of OpenTelemetry in new driver versions should not significantly affect existing applications that do not
216+
enable OpenTelemetry. However, since the no-op tracing operation may introduce some performance degradation (though it
217+
should be negligible), customers should be informed of this feature and how to disable it completely.
218+
219+
If a driver is used in an application that has OpenTelemetry enabled, customers will see traces from the driver in their
220+
OpenTelemetry backends. This may be unexpected and may cause negative effects in some cases (e.g., the OpenTelemetry
221+
backend may not have enough capacity to process new traces). Customers should be informed of this feature and how to
222+
disable it completely.
223+
224+
## Security Implication
225+
226+
Drivers MUST take care to avoid exposing sensitive information (e.g. authentication credentials) in traces.
