Commit ed14d07

Add design docs
1 parent 0592075 commit ed14d07

3 files changed

+237
-0
lines changed

docs/design/logs.md

Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
# OpenTelemetry Rust Logs Design

Status:
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md)

## Overview

OpenTelemetry (OTel) Logs support differs from Metrics and Traces as it does not
introduce a new logging API for end users. Instead, OTel recommends leveraging
existing logging libraries such as `log` and `tracing`, while providing bridges
(appenders) to route logs through OpenTelemetry.

Unlike Traces and Metrics, which introduced new APIs, Logs took a different
approach due to the long history of existing logging solutions. In Rust, the
most widely used logging libraries are `log` and `tracing`. OTel Rust maintains
appenders for these libraries, allowing users to seamlessly integrate with
OpenTelemetry without changing their existing logging instrumentation.

The `tracing` appender is particularly optimized for performance due to its
widespread adoption and the fact that `tracing` itself has a bridge from the
`log` crate. Notably, OpenTelemetry Rust itself is instrumented using `tracing`
for internal logs. Additionally, when OTel began supporting logging as a signal,
the `log` crate lacked structured logging support, reinforcing the decision to
prioritize `tracing`.

## Benefits of OpenTelemetry Logs

- **Unified configuration** across Traces, Metrics, and Logs.
- **Automatic correlation** with Traces.
- **Consistent Resource attributes** across signals.
- **Multiple destinations support**: Logs can continue flowing to existing
  destinations like stdout while also being sent to an OpenTelemetry-capable
  backend, typically via an OTLP Exporter or exporters that export to operating
  system native systems like `Windows ETW` or `Linux user_events`.
- **Standalone logging support** for applications that use OpenTelemetry as
  their primary logging mechanism.

## Key Design Principles

- High performance - no locks/contention in the hot path, minimal/no heap
  allocation.
- Capped resource usage - well-defined behavior when overloaded.
- Self-observable.
- Well-defined error handling, returning `Result` as appropriate instead of
  panicking.
- Minimal public API, exposing items only when needed.

## Logs API

The OTel Logs API is not intended for direct end-user usage. Instead, it is
designed for appender/bridge authors to integrate existing logging libraries
with OpenTelemetry. However, there is nothing preventing it from being used by
end-users.

### API Components

1. **Key-Value Structs**: Used in `LogRecord`, where keys are shared across
   signals but values differ from Metrics and Traces. This is because values in
   Logs can contain more complex structures than those in Traces and Metrics.
2. **Traits**:
   - `LoggerProvider` - provides methods to obtain a `Logger`.
   - `Logger` - provides methods to create a `LogRecord` and emit the created
     `LogRecord`.
   - `LogRecord` - provides methods to populate a `LogRecord`.
3. **No-Op Implementations**: By default, the API performs no operations until
   an SDK is attached.

### Logs Flow

1. Obtain a `LoggerProvider` implementation.
2. Use the `LoggerProvider` to create `Logger` instances, specifying a scope
   name (module/component emitting logs). Optional attributes and version are
   also supported.
3. Use the `Logger` to create an empty `LogRecord` instance.
4. Populate the `LogRecord` with body, timestamp, attributes, etc.
5. Call `Logger.emit(LogRecord)` to process and export the log.

If only the Logs API is used (without an SDK), all the above steps result in no
operations, following OpenTelemetry’s philosophy of separating API from SDK. The
official Logs SDK provides real implementations to process and export logs.
Users or vendors can also provide alternative SDK implementations.

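The five-step flow and the no-op default can be sketched with a few traits. This is a minimal illustration only: the trait and method names below are simplified stand-ins chosen for this example and are not the actual `opentelemetry` crate signatures.

```rust
// Hypothetical, simplified stand-ins for the API traits described above.
// These are NOT the real `opentelemetry` crate signatures.

/// Populated by an appender before being emitted.
pub trait LogRecord {
    fn set_body(&mut self, body: &str);
    fn add_attribute(&mut self, key: &'static str, value: &str);
}

/// Creates and emits log records.
pub trait Logger {
    type Record: LogRecord;
    fn create_log_record(&self) -> Self::Record;
    fn emit(&self, record: Self::Record);
}

/// Hands out loggers for a given scope name.
pub trait LoggerProvider {
    type Logger: Logger;
    fn logger(&self, scope_name: &'static str) -> Self::Logger;
}

// No-op implementations: what an API-only setup does when no SDK is attached.
pub struct NoopLogRecord;
impl LogRecord for NoopLogRecord {
    fn set_body(&mut self, _body: &str) {}
    fn add_attribute(&mut self, _key: &'static str, _value: &str) {}
}

pub struct NoopLogger;
impl Logger for NoopLogger {
    type Record = NoopLogRecord;
    fn create_log_record(&self) -> NoopLogRecord {
        NoopLogRecord
    }
    fn emit(&self, _record: NoopLogRecord) {}
}

pub struct NoopLoggerProvider;
impl LoggerProvider for NoopLoggerProvider {
    type Logger = NoopLogger;
    fn logger(&self, _scope_name: &'static str) -> NoopLogger {
        NoopLogger
    }
}

/// Steps 2-5 of the flow, written against the traits only. Step 1
/// (obtaining a provider) is the caller's job.
pub fn emit_hello<P: LoggerProvider>(provider: &P) {
    let logger = provider.logger("my-component"); // step 2
    let mut record = logger.create_log_record(); // step 3
    record.set_body("hello"); // step 4
    record.add_attribute("user", "alice");
    logger.emit(record); // step 5
}
```

Calling `emit_hello(&NoopLoggerProvider)` runs every step without doing any work. Because the code depends only on the traits, swapping in an SDK-backed provider would not require changes to the appender-side code.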
## Logs SDK

The OpenTelemetry Logs SDK provides an OTel specification-compliant
implementation of the Logs API, handling log processing and export.

### Core Components

#### `SdkLoggerProvider`

- Implements the `LoggerProvider` trait.
- Creates and manages `SdkLogger` instances.
- Holds logging configuration, including `Resource` and processors.
- Does not retain a list of created loggers. Instead, it passes an owned clone
  of itself to each logger created. This is done so that loggers get a hold of
  the configuration (like which processor to invoke).
- Uses an `Arc<LoggerProviderInner>` and delegates all configuration to
  `LoggerProviderInner`. This allows cheap cloning of itself and ensures all
  clones point to the same underlying configuration.
- As `SdkLoggerProvider` only holds an `Arc` of its inner, its methods such as
  flush and shutdown can only take `&self`. Mutating its own state would
  otherwise require interior mutability, which comes with runtime performance
  costs. Since methods like shutdown usually need to mutate internal state,
  components like the exporter use interior mutability to handle shutdown.
  (More on this in the exporter section)
- `LoggerProviderInner` implements `Drop`, triggering `shutdown()` when no
  references remain. However, in practice, loggers are often stored statically
  inside appenders (like the `tracing` appender), so explicit shutdown by the
  user is required.

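The cloning and shutdown behavior described above can be modeled in a few lines. This is a sketch under stated assumptions: `Provider`, `ProviderInner`, `ScopedLogger` and the `shutdown_runs` counter are hypothetical names used only for illustration, and an `AtomicBool` stands in for whatever interior mutability the real components use.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical names: a simplified model of the pattern, not the SDK's code.
struct ProviderInner {
    resource: String,
    is_shutdown: AtomicBool,
    // Shared counter so the example can observe how often shutdown work ran.
    shutdown_runs: Arc<AtomicUsize>,
}

impl ProviderInner {
    // Takes `&self`: the flag is flipped through an atomic (interior mutability).
    fn shutdown(&self) {
        // `swap` returns the previous value, so only the first call does the work.
        if !self.is_shutdown.swap(true, Ordering::SeqCst) {
            self.shutdown_runs.fetch_add(1, Ordering::SeqCst);
        }
    }
}

impl Drop for ProviderInner {
    // Runs once the last `Arc` clone (held by a provider or a logger) is gone.
    fn drop(&mut self) {
        self.shutdown();
    }
}

#[derive(Clone)]
pub struct Provider {
    inner: Arc<ProviderInner>,
}

pub struct ScopedLogger {
    provider: Provider, // owned clone; gives access to the shared configuration
    scope: &'static str,
}

impl Provider {
    pub fn new(resource: &str, shutdown_runs: Arc<AtomicUsize>) -> Self {
        Provider {
            inner: Arc::new(ProviderInner {
                resource: resource.to_string(),
                is_shutdown: AtomicBool::new(false),
                shutdown_runs,
            }),
        }
    }

    // The provider keeps no list of loggers; each logger gets a cheap clone.
    pub fn logger(&self, scope: &'static str) -> ScopedLogger {
        ScopedLogger { provider: self.clone(), scope }
    }

    pub fn shutdown(&self) {
        self.inner.shutdown();
    }
}

impl ScopedLogger {
    pub fn describe(&self) -> String {
        format!("{} @ {}", self.scope, self.provider.inner.resource)
    }
}

pub fn demo() -> (String, usize, usize) {
    let runs = Arc::new(AtomicUsize::new(0));
    let provider = Provider::new("service.name=demo", runs.clone());
    let logger = provider.logger("my-component");
    let description = logger.describe();
    drop(provider);
    // The logger still holds a clone, so the inner state is alive.
    let before = runs.load(Ordering::SeqCst);
    drop(logger);
    let after = runs.load(Ordering::SeqCst);
    (description, before, after)
}
```

In `demo`, dropping the provider alone does not trigger shutdown because the logger still owns a clone. This mirrors why a logger stored statically inside an appender keeps the inner state alive and an explicit `shutdown()` call is needed.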
#### `SdkLogger`

- Implements the `Logger` trait.
- Creates `SdkLogRecord` instances and emits them.
- Calls `OnEmit()` on all registered processors when emitting logs.
- Passes mutable references to each processor (`&mut log_record`), i.e.,
  ownership is not passed to the processor. This ensures that the logger avoids
  cloning costs. Since a mutable reference is passed, processors can modify the
  log, and it will be visible to the next processor in the chain.
- Since the processor only gets a reference to the log, it cannot store it
  beyond the `OnEmit()`. If a processor needs to buffer logs, it must explicitly
  copy them to the heap.
- This design allows for stack-only log processing when exporting to operating
  system native facilities like `Windows ETW` or `Linux user_events`.
- OTLP Exporting requires network calls (HTTP/gRPC) and batching of logs for
  efficiency purposes. These exporters buffer log records by copying them to the
  heap. (More on this in the BatchLogProcessor section)

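A small model of the processor chain shows both properties: changes made through the `&mut` reference are visible to later processors, and a processor that wants to keep a record must clone it. The `Record`, `Processor`, `AddHost` and `Buffering` types are hypothetical and exist only for this sketch.

```rust
use std::sync::Mutex;

// Hypothetical, simplified record type for illustration.
#[derive(Clone, Debug, PartialEq)]
pub struct Record {
    pub body: String,
    pub attributes: Vec<(&'static str, String)>,
}

pub trait Processor {
    // Borrowed mutably: the processor may modify the record but cannot keep it.
    fn on_emit(&self, record: &mut Record);
}

/// Enriches the record in place.
pub struct AddHost;
impl Processor for AddHost {
    fn on_emit(&self, record: &mut Record) {
        record.attributes.push(("host", "node-1".to_string()));
    }
}

/// Buffers records. It only borrows the record, so it has to clone it
/// (copying it to the heap) in order to retain it past `on_emit`.
pub struct Buffering {
    pub buffered: Mutex<Vec<Record>>,
}
impl Processor for Buffering {
    fn on_emit(&self, record: &mut Record) {
        self.buffered.lock().unwrap().push(record.clone());
    }
}

/// Calls every processor in registration order with the same record.
pub fn emit(processors: &[&dyn Processor], mut record: Record) {
    for processor in processors {
        processor.on_emit(&mut record);
    }
}

pub fn demo() -> Vec<Record> {
    let enrich = AddHost;
    let buffer = Buffering { buffered: Mutex::new(Vec::new()) };
    let chain: [&dyn Processor; 2] = [&enrich, &buffer];
    emit(&chain, Record { body: "hi".to_string(), attributes: Vec::new() });
    let out = buffer.buffered.lock().unwrap().clone();
    out
}
```

The record buffered by the second processor already carries the `host` attribute added by the first, and the only copy made is the explicit `clone()` inside `Buffering`.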
#### `LogRecord`

- Holds log data, including attributes.
- Uses an inline array for up to 5 attributes to optimize stack usage.
- Falls back to a heap-allocated `Vec` if more attributes are required.
- Inspired by Go’s `slog` library for efficiency.

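The inline-then-spill storage can be sketched as follows. This is an illustrative model, not the SDK's actual type: `Attributes`, `Attr` and `INLINE_CAP` are assumed names, and attribute values are plain `i64` to keep the example small.

```rust
// Hypothetical sketch of "inline array first, Vec on overflow" storage.
const INLINE_CAP: usize = 5;

// Simplified attribute: a static key and an integer value (an assumption).
type Attr = (&'static str, i64);

pub struct Attributes {
    inline: [Option<Attr>; INLINE_CAP], // lives inside the struct itself
    inline_len: usize,
    overflow: Vec<Attr>, // allocates only once something is pushed into it
}

impl Attributes {
    pub fn new() -> Self {
        Attributes {
            inline: [None; INLINE_CAP],
            inline_len: 0,
            overflow: Vec::new(), // `Vec::new` does not allocate
        }
    }

    pub fn push(&mut self, attr: Attr) {
        if self.inline_len < INLINE_CAP {
            self.inline[self.inline_len] = Some(attr);
            self.inline_len += 1;
        } else {
            self.overflow.push(attr); // sixth attribute onward goes to the heap
        }
    }

    pub fn len(&self) -> usize {
        self.inline_len + self.overflow.len()
    }

    pub fn spilled(&self) -> bool {
        !self.overflow.is_empty()
    }

    /// Iterates in insertion order: inline slots first, then the overflow.
    pub fn iter(&self) -> impl Iterator<Item = &Attr> + '_ {
        self.inline.iter().flatten().chain(self.overflow.iter())
    }
}
```

With five or fewer attributes nothing is pushed to the `Vec`, so this container itself performs no heap allocation. The sixth `push` is the first one that does.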
#### LogRecord Processors

`SdkLoggerProvider` can be configured with any number of LogProcessors.
They get called in the order of registration. Log records are passed to the
`OnEmit` method of LogProcessor. LogProcessors can be used to process the log
records, enrich them, filter them, and export to destinations by leveraging
LogRecord Exporters.

The following built-in log processors are provided in the Logs SDK:

##### SimpleLogProcessor

This processor is designed to be used for exporting purposes. Export is handled
by an Exporter (which is a separate component). SimpleLogProcessor is "simple"
in the sense that it does not attempt to do any processing - it just calls the
exporter and passes the log record to it. To comply with OTel specification, it
synchronizes calls to the `Export()` method, i.e., only one `Export()` call will
be done at any given time.

SimpleLogProcessor is intended only for test/learning purposes and is often used
along with a `stdout` exporter.

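One way to get the "only one `Export()` at a time" guarantee is to keep the exporter behind a `Mutex`. The sketch below assumes that approach; the source does not say how the SDK implements it. `Exporter`, `InMemory` and `SimpleProcessor` are hypothetical simplified types, with records reduced to strings.

```rust
use std::sync::Mutex;

// Hypothetical exporter trait; records are plain strings in this sketch.
pub trait Exporter {
    fn export(&mut self, record: &str);
}

/// Collects exported records in a list, in the spirit of an in-memory exporter.
#[derive(Default)]
pub struct InMemory {
    pub records: Vec<String>,
}

impl Exporter for InMemory {
    fn export(&mut self, record: &str) {
        self.records.push(record.to_string());
    }
}

/// Forwards every record straight to the exporter, one export at a time.
pub struct SimpleProcessor<E: Exporter> {
    exporter: Mutex<E>,
}

impl<E: Exporter> SimpleProcessor<E> {
    pub fn new(exporter: E) -> Self {
        SimpleProcessor { exporter: Mutex::new(exporter) }
    }

    pub fn on_emit(&self, record: &str) {
        // Holding the lock for the whole call serializes exports across threads.
        if let Ok(mut exporter) = self.exporter.lock() {
            exporter.export(record);
        }
    }

    pub fn into_exporter(self) -> E {
        self.exporter.into_inner().unwrap()
    }
}

/// Emits 25 records from each of 4 threads and returns how many were exported.
pub fn demo() -> usize {
    let processor = SimpleProcessor::new(InMemory::default());
    std::thread::scope(|s| {
        for t in 0..4 {
            let p = &processor;
            s.spawn(move || {
                for i in 0..25 {
                    p.on_emit(&format!("thread {t} log {i}"));
                }
            });
        }
    });
    processor.into_exporter().records.len()
}
```

The lock sits on the hot path, so every emitting thread waits while another export is in progress. That cost is consistent with this processor being meant for test/learning use.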
##### BatchLogProcessor

This is another "exporting" processor. As with SimpleLogProcessor, a different
component named LogExporter handles the actual export logic. BatchLogProcessor
buffers/batches the logs it receives into an in-memory buffer. It invokes the
exporter every 1 second or when 512 items are in the batch (customizable). It
uses a background thread to do the export, and communication between the user
thread (where logs are emitted) and the background thread occurs with `mpsc`
channels.

The maximum number of items the buffer holds is 2048 (customizable). Once the
limit is reached, any *new* logs are dropped. It *does not* apply back-pressure
to the user thread and instead drops logs.

As with SimpleLogProcessor, this component also ensures only one export is
active at a given time. A modified version of this is required to achieve higher
throughput in some environments.

In this design, at most 2048+512 logs can be in memory at any given point. In
other words, that many logs can be lost if the app crashes in the middle.

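The drop-instead-of-block behavior can be illustrated with a bounded `std::sync::mpsc` channel and `try_send`. This is a deliberately reduced sketch: it runs on a single thread so the outcome is deterministic, it omits the background thread and the 1-second timer, and it uses a queue of 8 and a batch size of 3 in place of the 2048 and 512 defaults. `BatchSender` and `drain_batch` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Worst case held in memory with the defaults quoted above: a full queue
/// plus one batch that is being exported.
pub const MAX_IN_MEMORY: usize = 2048 + 512;

/// Emitting side: never blocks the caller, counts what it had to drop.
pub struct BatchSender {
    tx: SyncSender<String>,
    pub dropped: usize,
}

impl BatchSender {
    pub fn on_emit(&mut self, record: &str) {
        // `try_send` fails immediately when the queue is full (or closed),
        // so the new record is dropped and no back-pressure is applied.
        if self.tx.try_send(record.to_string()).is_err() {
            self.dropped += 1;
        }
    }
}

/// Exporting side: pulls at most `max_batch` records that are already queued.
pub fn drain_batch(rx: &Receiver<String>, max_batch: usize) -> Vec<String> {
    let mut batch = Vec::with_capacity(max_batch);
    while batch.len() < max_batch {
        match rx.try_recv() {
            Ok(record) => batch.push(record),
            Err(_) => break, // queue is empty or the sender is gone
        }
    }
    batch
}

/// Emits 10 records into a queue of 8, then drains in batches of up to 3.
pub fn demo() -> (usize, Vec<usize>) {
    let (tx, rx) = sync_channel(8); // stands in for the 2048-item queue
    let mut sender = BatchSender { tx, dropped: 0 };
    for i in 0..10 {
        sender.on_emit(&format!("log {i}"));
    }
    let mut batch_sizes = Vec::new();
    loop {
        let batch = drain_batch(&rx, 3); // stands in for the 512-item batch
        if batch.is_empty() {
            break;
        }
        batch_sizes.push(batch.len());
    }
    (sender.dropped, batch_sizes)
}
```

Here the ninth and tenth records are dropped because nothing drained the queue in time, and the eight queued records leave in batches of 3, 3 and 2. With the real defaults, the same accounting gives the 2048 + 512 = 2560 upper bound stated above.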
## LogExporters

LogExporters are responsible for exporting logs to a destination. Some of them
include:

1. **InMemoryExporter** - exports to an in-memory list, primarily for
   unit-testing. This is used extensively in the repo itself, and external users
   are also encouraged to use this.
2. **Stdout exporter** - prints telemetry to stdout. Only for debugging/learning
   purposes. The output format is not defined, and the exporter is not
   performance optimized. A production-recommended version with a standardized
   output format is planned.
3. **OTLP Exporter** - OTel's official exporter which uses the OTLP protocol
   that is designed with the OTel data model in mind. Both HTTP and gRPC-based
   exporting are offered.
4. **Exporters to OS Kernel facilities** - These exporters are not maintained in
   the core repo but are listed for completeness. They export telemetry to
   Windows ETW or Linux user_events. They are designed for high-performance
   workloads. Because they export synchronously, they do not require
   buffering/batching. This allows log processing to happen entirely on the
   stack and to scale easily with the number of CPU cores. (The kernel uses
   per-CPU buffers for the events, ensuring no contention)

## `tracing` Log Appender

The `tracing` appender bridges `tracing` logs to OpenTelemetry. Logs emitted via
`tracing` macros (`info!`, `warn!`, etc.) are forwarded to OpenTelemetry through
this integration.

- `tracing` is designed for high performance, using *layers* or *subscribers* to
  handle emitted logs (events).
- The appender implements a `Layer`, receiving logs from `tracing`.
- Uses the OTel Logs API to create `LogRecord`, populate it, and emit it via
  `Logger.emit(LogRecord)`.
- If no Logs SDK is present, the process is a no-op.

## Summary

- OpenTelemetry Logs does not provide a user-facing logging API.
- Instead, it integrates with existing logging libraries (`log`, `tracing`).
- The Logs API defines key traits but performs no operations unless an SDK is
  installed.
- The Logs SDK enables log processing, transformation, and export.
- The Logs SDK is performance optimized to minimize copying and heap allocation,
  wherever feasible.
- The `tracing` appender efficiently routes logs to OpenTelemetry without
  modifying existing logging workflows.

docs/design/metrics.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# OpenTelemetry Rust Metrics Design

Status:
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md)

TODO:

docs/design/traces.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# OpenTelemetry Rust Traces Design

Status:
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md)

TODO:
