-- [Trade-offs and mitigations](#trade-offs-and-mitigations)
+- [Prototypes](#prototypes)
 - [Prior art and alternatives](#prior-art-and-alternatives)
-- [Open questions](#open-questions)
 - [Future possibilities](#future-possibilities)

 <!-- tocstop -->

-This OTEP provides guidance on how to record errors using OpenTelemetry Logs,
+This OTEP continues the [Span Event API deprecation plan OTEP](./4430-span-event-api-deprecation-plan.md)
+and provides guidance on how to record errors and exceptions using OpenTelemetry Logs,
 focusing on minimizing duplication and providing context to reduce noise.

-In the long term, errors recorded in logs **will replace span events**
-(according to the [Event vision OTEP](./0265-event-vision.md)).
-
 > [!NOTE]
 > Throughout this OTEP, the terms exception and error are defined as follows:
 >
@@ -36,18 +32,12 @@ In the long term, errors recorded in logs **will replace span events**

 ## Motivation

-Today, OTel supports recording *exceptions* using span events available through the Trace API. Outside the OTel world,
-*errors* are usually recorded by user apps and libraries using logging libraries,
-and may be recorded as OTel logs via a logging bridge.
-
-Using logs to record errors has the following advantages over using span events:
+Today, OTel supports recording *exceptions* using span events available through
+the Trace API, which is [being deprecated](./4430-span-event-api-deprecation-plan.md).
+Outside the OTel world, *exceptions* and *errors* are usually recorded by user apps
+and libraries using logging libraries, and may be recorded as OTel logs via a logging bridge.

-- They can be recorded for operations that don't have any tracing instrumentation.
-- They can be sampled along with or separately from spans.
-- They can have different severity levels to reflect how critical the error is.
-- They are already reported natively by many frameworks and libraries.
-
-Recording errors is essential for troubleshooting, but regardless of how they are recorded, they can be noisy:
+Recording errors is essential for troubleshooting, but they can be noisy:

 - Distributed applications experience transient errors at a rate proportional to their scale, and
 errors in logs can be misleading. Individual occurrences of transient errors
@@ -86,8 +76,8 @@ be to record exception stack traces when logging exceptions at `Error` or higher

 ### Details

-1. Errors SHOULD be recorded as [logs](https://github.com/open-telemetry/semantic-conventions/blob/v1.29.0/docs/exceptions/exceptions-logs.md)
-or as [log-based events](https://github.com/open-telemetry/semantic-conventions/blob/v1.29.0/docs/general/events.md).
+1. Errors SHOULD be recorded as [logs](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/exceptions/exceptions-logs.md)
+or as [log-based events](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/general/events.md).

 2. Instrumentations for incoming requests, message processing, background job execution, or others that wrap application code and usually
 create local root spans, SHOULD record logs for unhandled errors with `Error` severity.
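
As a rough illustration of item 2, the sketch below wraps application code and records one log entry at `Error` severity when an error escapes unhandled, then rethrows it. `RequestInstrumentation`, `LogEntry`, and the in-memory `emitted` list are hypothetical stand-ins invented for this example. They are not OTel API types, and a real instrumentation would presumably emit through the Logs API instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical record of an emitted log; not an OTel type.
final class LogEntry {
  final String severityText;
  final String body;
  final Throwable error;

  LogEntry(String severityText, String body, Throwable error) {
    this.severityText = severityText;
    this.body = body;
    this.error = error;
  }
}

// Hypothetical instrumentation that wraps application code, e.g. a request handler.
final class RequestInstrumentation {
  // Stand-in for a log pipeline so the example stays self-contained.
  final List<LogEntry> emitted = new ArrayList<>();

  <T> T handle(String operation, Supplier<T> applicationCode) {
    try {
      return applicationCode.get();
    } catch (RuntimeException e) {
      // The error escaped application code unhandled: record it once, at Error severity.
      emitted.add(new LogEntry("ERROR", operation + " failed", e));
      // Rethrow so the surrounding framework still observes the failure.
      throw e;
    }
  }
}
```

Only the outermost wrapper records the error here; code deeper in the call stack that merely propagates the exception would not log it again, which is one way to keep duplication down.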
@@ -159,19 +149,18 @@ be to record exception stack traces when logging exceptions at `Error` or higher
 - The application detects an invalid configuration at startup and shuts down.
 - The application encounters a (presumably) terminal error, such as an out-of-memory condition.

-6. When recording exceptions in logs, applications and instrumentations are encouraged to add additional attributes
-to describe the context in which the exception was thrown.
-They are also encouraged to define their own error events and enrich them with exception details.
+6. When recording exceptions/errors in logs, applications and instrumentations are encouraged to add additional attributes
+to describe the context in which the exception/error occurred.
+They are also encouraged to define their own events and enrich them with exception/error details.

-7. The OTel SDK SHOULD record stack traces on exceptions with severity `Error` or higher and SHOULD allow users to
-change the threshold.
+7. The OTel SDK SHOULD record exception stack traces on logs with severity `Error` or higher and drop
+them on logs with lower severity. It SHOULD allow users to change the threshold.

 See [logback exception config](https://logback.qos.ch/manual/layouts.html#ex) for an example of configuration that
 records stack traces conditionally.

 8. Instrumentation libraries that record exceptions using span events SHOULD gracefully migrate
-to log-based exceptions, offering it as an opt-in feature first and then switching to log-based exceptions
-in the next major version update.
+to log-based exceptions following the migration path outlined in the [Span Event API deprecation plan OTEP](./4430-span-event-api-deprecation-plan.md).
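
The severity threshold from item 7 could look roughly like the sketch below: the exception type and message are always recorded, while the stack trace is attached only when the log severity reaches a configurable threshold. `ExceptionAttributes` and its `forLog` method are hypothetical names made up for this illustration and are not part of any OTel SDK. The severity numbers follow the OTel Logs data model ranges, and the attribute keys are the `exception.*` semantic convention attributes.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an OTel SDK class: illustrates item 7 only.
final class ExceptionAttributes {
  // First severity number of each range in the OTel Logs data model.
  static final int INFO = 9;
  static final int WARN = 13;
  static final int ERROR = 17;

  // Logs at or above this severity keep the stack trace; users can change it.
  private final int stackTraceThreshold;

  ExceptionAttributes(int stackTraceThreshold) {
    this.stackTraceThreshold = stackTraceThreshold;
  }

  Map<String, String> forLog(int severityNumber, Throwable error) {
    Map<String, String> attributes = new LinkedHashMap<>();
    // Type and message are cheap, so they are recorded at every severity.
    attributes.put("exception.type", error.getClass().getName());
    if (error.getMessage() != null) {
      attributes.put("exception.message", error.getMessage());
    }
    // The stack trace is recorded only at or above the threshold.
    if (severityNumber >= stackTraceThreshold) {
      StringWriter buffer = new StringWriter();
      PrintWriter writer = new PrintWriter(buffer);
      error.printStackTrace(writer);
      writer.flush();
      attributes.put("exception.stacktrace", buffer.toString());
    }
    return attributes;
  }
}
```

With `new ExceptionAttributes(ExceptionAttributes.ERROR)`, a log at `WARN` carries only `exception.type` and `exception.message`, while a log at `ERROR` also carries `exception.stacktrace`. Lowering the constructor argument to `WARN` would be the equivalent of a user changing the threshold.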

 ## API changes

@@ -190,32 +179,17 @@ The OTel Logs API SHOULD provide methods that enrich log records with exception

 The OTel SDK, based on the log severity and configuration, SHOULD record exception details fully or partially.

-The signature of the method is to be determined by each language
-and can be overloaded as appropriate, including the ability to customize stack trace
-collection.
+The signature of the method is to be determined by each language and can be overloaded
+as appropriate.

-It MUST be possible to efficiently set exception information on a log record based on configuration
+It MUST be possible to efficiently set exception and error information on a log record based on configuration
 and without using the `setException` method.

-## Migrating instrumentations
-
-> [!NOTE]
-> New instrumentations or existing ones that do not record exceptions on span events SHOULD
-> NOT start recording exceptions on span events. They SHOULD NOT implement the migration plan
-> described below.
->
-> This section covers migration recommendations for existing instrumentations that already
-> report exceptions using span events.
-
-We will define a configuration option to let users choose whether they want instrumentations to record exceptions
-on span events or logs.
+## SDK changes

-A specific instrumentation SHOULD default to recording exceptions on span events in its current major version
-and record them in logs only when the user opts in.
-
-In the next major version, this instrumentation SHOULD stop recording exceptions on span events.
-
-This is a simplified version of [stability opt-in migration](https://github.com/open-telemetry/semantic-conventions/blob/727700406f9e6cc3f4e4680a81c4c28f2eb71569/docs/http/README.md?plain=1#L13-L37) used in semantic conventions.
+TODO: we should consider whether exception instances should reach the log processing pipeline,
+where their processing can be customized, or whether this is better done via a separate concept such as an exception
+customizer.

 ## Examples

@@ -382,19 +356,9 @@ final class InstrumentedRecordInterceptor<K, V> implements RecordInterceptor<K,

 See the [corresponding Java (tracing) instrumentation](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/spring/spring-kafka-2.7/library/src/main/java/io/opentelemetry/instrumentation/spring/kafka/v2_7/InstrumentedRecordInterceptor.java) for details.

-## Trade-offs and mitigations
-
-1. Switching from recording exceptions as span events to log records is a breaking change
-for any component following existing [exception guidance](/specification/trace/exceptions.md).
-
-2. Recording exceptions as log-based events would result in UX degradation for users
-leveraging trace-only backends such as Jaeger.
+## Prototypes

-**Mitigation:**
-
-In addition to the plan outlined in the [Migration](#migrating-instrumentations) section, we
-should provide opt-in [log <-> span events conversion](https://github.com/open-telemetry/opentelemetry-specification/issues/4393)
-following the [Event vision OTEP](./0265-event-vision.md#relationship-to-span-events).
+TODO (at least a prototype in a language that does not have exceptions).

 ## Prior art and alternatives

@@ -405,11 +369,6 @@ Alternatives:
 We should still provide optimal guidance for greenfield applications and libraries,
 covering the wider problem of recording errors.

-## Open questions
-
-- Do we need to have log-related limits similar to [span event limits](/specification/trace/sdk.md#span-limits)
-on the SDK level?
-
 ## Future possibilities

 Exception stack traces can be recorded in structured form instead of their