Commit 7019656

Merge branch 'main' into cijothomas/exporter-build-result-logs
2 parents 3561fc6 + ac66848 commit 7019656

8 files changed

+362
-24
lines changed

docs/design/logs.md

Lines changed: 302 additions & 0 deletions
@@ -0,0 +1,302 @@
# OpenTelemetry Rust Logs Design

Status:
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md)

## Overview

[OpenTelemetry (OTel)
Logs](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/README.md)
support differs from Metrics and Traces in that it does not introduce a new
logging API for end users. Instead, OTel recommends leveraging existing logging
libraries such as [log](https://crates.io/crates/log) and
[tracing](https://crates.io/crates/tracing), while providing bridges (appenders)
to route logs through OpenTelemetry.

OTel took this different approach due to the long history of existing logging
solutions. In Rust, these are [log](https://crates.io/crates/log) and
[tracing](https://crates.io/crates/tracing), both of which have long been
embraced by the community. OTel Rust maintains appenders for these libraries,
allowing users to integrate with OpenTelemetry without changing their existing
logging instrumentation.

The `tracing` appender is particularly optimized for performance due to its
widespread adoption and the fact that `tracing` itself has a bridge from the
`log` crate. Notably, OpenTelemetry Rust itself is instrumented using `tracing`
for internal logs. Additionally, when OTel began supporting logging as a signal,
the `log` crate lacked structured logging support, reinforcing the decision to
prioritize `tracing`.

## Benefits of OpenTelemetry Logs

- **Unified configuration** across Traces, Metrics, and Logs.
- **Automatic correlation** with Traces.
- **Consistent Resource attributes** across signals.
- **Multiple destinations support**: Logs can continue flowing to existing
  destinations such as stdout while also being sent to an
  OpenTelemetry-capable backend, typically via an OTLP Exporter or exporters
  that export to operating system native systems like `Windows ETW` or `Linux
  user_events`.
- **Standalone logging support** for applications that use OpenTelemetry as
  their primary logging mechanism.

## Key Design Principles

- High performance - no locks/contention in the hot path, with minimal/no heap
  allocation where possible.
- Capped resource (memory) usage - well-defined behavior when overloaded.
- Self-observable - exposes telemetry about itself to aid in troubleshooting.
- Robust error handling, returning `Result` where possible instead of panicking.
- Minimal public API, exposing items only when there is a need.

## Architecture Overview

```mermaid
graph TD
    subgraph Application
        A1[Application Code]
    end
    subgraph Logging Libraries
        B1[log crate]
        B2[tracing crate]
    end
    subgraph OpenTelemetry
        C1[OpenTelemetry Appender for log]
        C2[OpenTelemetry Appender for tracing]
        C3[OpenTelemetry Logs API]
        C4[OpenTelemetry Logs SDK]
        C5[OTLP Exporter]
    end
    subgraph Observability Backend
        D1[OTLP-Compatible Backend]
    end
    A1 --> |Emits Logs| B1
    A1 --> |Emits Logs| B2
    B1 --> |Bridged by| C1
    B2 --> |Bridged by| C2
    C1 --> |Sends to| C3
    C2 --> |Sends to| C3
    C3 --> |Processes with| C4
    C4 --> |Exports via| C5
    C5 --> |Sends to| D1
```

## Logs API

The Logs API is part of the [opentelemetry](https://crates.io/crates/opentelemetry)
crate.

The OTel Logs API is not intended for direct end-user usage. Instead, it is
designed for appender/bridge authors to integrate existing logging libraries
with OpenTelemetry. However, there is nothing preventing it from being used by
end users.

### API Components

1. **Key-Value Structs**: Used in `LogRecord`, where the `Key` struct is shared
   across signals but the `Value` struct differs from the one used by Metrics
   and Traces. This is because values in Logs can contain more complex
   structures than those in Traces and Metrics.
2. **Traits**:
   - `LoggerProvider` - provides methods to obtain a `Logger`.
   - `Logger` - provides methods to create a `LogRecord` and emit the created
     `LogRecord`.
   - `LogRecord` - provides methods to populate a `LogRecord`.
3. **No-Op Implementations**: By default, the API performs no operations until
   an SDK is attached.

### Logs Flow

1. Obtain a `LoggerProvider` implementation.
2. Use the `LoggerProvider` to create `Logger` instances, specifying a scope
   name (module/component emitting logs). Optional attributes and version are
   also supported.
3. Use the `Logger` to create an empty `LogRecord` instance.
4. Populate the `LogRecord` with body, timestamp, attributes, etc.
5. Call `Logger.emit(LogRecord)` to process and export the log.

If only the Logs API is used (without an SDK), all the above steps result in no
operations, following OpenTelemetry’s philosophy of separating API from SDK. The
official Logs SDK provides real implementations to process and export logs.
Users or vendors can also provide alternative SDK implementations.

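
The five steps above, and the no-op behavior when no SDK is attached, can be
sketched with std-only stand-ins. The trait and method names below are
simplified assumptions made for illustration; they are not the actual
definitions from the `opentelemetry` crate.

```rust
// Simplified stand-ins for the API traits. Names and signatures are
// illustrative assumptions, not the real `opentelemetry` crate items.
pub trait LogRecord {
    fn set_body(&mut self, body: String);
    fn add_attribute(&mut self, key: &'static str, value: String);
}

pub trait Logger {
    type Record: LogRecord;
    fn create_log_record(&self) -> Self::Record;
    fn emit(&self, record: Self::Record);
}

pub trait LoggerProvider {
    type Logger: Logger;
    fn logger(&self, scope_name: &'static str) -> Self::Logger;
}

// No-op implementations: every step compiles and runs, but nothing is stored.
pub struct NoopLogRecord;

impl LogRecord for NoopLogRecord {
    fn set_body(&mut self, _body: String) {}
    fn add_attribute(&mut self, _key: &'static str, _value: String) {}
}

pub struct NoopLogger;

impl Logger for NoopLogger {
    type Record = NoopLogRecord;
    fn create_log_record(&self) -> NoopLogRecord {
        NoopLogRecord
    }
    fn emit(&self, _record: NoopLogRecord) {}
}

pub struct NoopLoggerProvider;

impl LoggerProvider for NoopLoggerProvider {
    type Logger = NoopLogger;
    fn logger(&self, _scope_name: &'static str) -> NoopLogger {
        NoopLogger
    }
}

// Steps 1 to 5 of the flow, written against the traits only, the way an
// appender author would write them.
pub fn emit_one<P: LoggerProvider>(provider: &P) {
    let logger = provider.logger("my-component"); // step 2
    let mut record = logger.create_log_record(); // step 3
    record.set_body("user logged in".to_string()); // step 4
    record.add_attribute("user.id", "42".to_string());
    logger.emit(record); // step 5
}
```

Because `emit_one` is generic over the provider, the same appender code would
work unchanged once a real SDK implementation is supplied in place of the no-op
one.
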
## Logs SDK

The Logs SDK is part of the
[opentelemetry_sdk](https://crates.io/crates/opentelemetry_sdk) crate.

The OpenTelemetry Logs SDK provides an OTel specification-compliant
implementation of the Logs API, handling log processing and export.

### Core Components

#### `SdkLoggerProvider`

This is the implementation of the `LoggerProvider` and deals with concerns such
as processing and exporting Logs.

- Implements the `LoggerProvider` trait.
- Creates and manages `SdkLogger` instances.
- Holds logging configuration, including `Resource` and processors.
- Does not retain a list of created loggers. Instead, it passes an owned clone
  of itself to each logger created. This is done so that loggers get a hold of
  the configuration (like which processor to invoke).
- Uses an `Arc<LoggerProviderInner>` and delegates all configuration to
  `LoggerProviderInner`. This allows cheap cloning of itself and ensures all
  clones point to the same underlying configuration.
- As `SdkLoggerProvider` only holds an `Arc` of its inner, it can only take
  `&self` in its methods like flush and shutdown. Otherwise, it would need to
  rely on interior mutability, which comes with runtime performance costs.
  Since methods like shutdown usually need to mutate interior state, but this
  component can only take `&self`, it defers to components like the exporter to
  use interior mutability to handle shutdown. (More on this in the exporter
  section)
- An alternative design was to let `SdkLogger` hold a `Weak` reference to the
  `SdkLoggerProvider`. This would require a `weak->arc` upgrade on every log
  emission, significantly affecting throughput.
- `LoggerProviderInner` implements `Drop`, triggering `shutdown()` when no
  references remain. However, in practice, loggers are often stored statically
  inside appenders (such as the `tracing` appender), so explicit shutdown by
  the user is required.

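
The ownership layout described above can be sketched with std-only types.
`Provider`, `ProviderInner`, and `Logger` are hypothetical stand-ins for
`SdkLoggerProvider`, `LoggerProviderInner`, and `SdkLogger`. Keeping the
shutdown flag as an atomic on the inner type is an assumption made to keep the
sketch short; the document says the real SDK defers that state to components
such as the exporter.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for the shared configuration. The real SDK type also
// carries processors and a Resource.
struct ProviderInner {
    service_name: String,
    is_shutdown: AtomicBool,
}

impl ProviderInner {
    // Takes `&self`, so state changes go through interior mutability.
    // Returns true only for the call that actually performed the shutdown.
    fn shutdown(&self) -> bool {
        !self.is_shutdown.swap(true, Ordering::SeqCst)
    }
}

impl Drop for ProviderInner {
    // Runs once the last clone (held by a provider or a logger) is dropped.
    fn drop(&mut self) {
        self.shutdown();
    }
}

// Cloning the provider only bumps the Arc reference count.
#[derive(Clone)]
pub struct Provider {
    inner: Arc<ProviderInner>,
}

pub struct Logger {
    // An owned clone of the provider: no Weak -> Arc upgrade per emitted log.
    provider: Provider,
}

impl Provider {
    pub fn new(service_name: &str) -> Self {
        Provider {
            inner: Arc::new(ProviderInner {
                service_name: service_name.to_string(),
                is_shutdown: AtomicBool::new(false),
            }),
        }
    }

    pub fn logger(&self) -> Logger {
        Logger { provider: self.clone() }
    }

    pub fn shutdown(&self) -> bool {
        self.inner.shutdown()
    }

    // Number of live handles (providers plus loggers) sharing the inner state.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.inner)
    }
}

impl Logger {
    pub fn service_name(&self) -> &str {
        &self.provider.inner.service_name
    }

    pub fn is_shutdown(&self) -> bool {
        self.provider.inner.is_shutdown.load(Ordering::SeqCst)
    }
}
```

In this sketch a logger stored in a `static` keeps the reference count above
zero for the life of the process, so `Drop` never runs. This is the situation
that makes an explicit `shutdown()` call necessary.
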
#### `SdkLogger`

This is an implementation of the `Logger`, and contains functionality to create
and emit logs.

- Implements the `Logger` trait.
- Creates `SdkLogRecord` instances and emits them.
- Calls `OnEmit()` on all registered processors when emitting logs.
- Passes mutable references to each processor (`&mut log_record`), i.e.,
  ownership is not passed to the processor. This ensures that the logger avoids
  cloning costs. Since a mutable reference is passed, processors can modify the
  log, and it will be visible to the next processor in the chain.
- Since the processor only gets a reference to the log, it cannot store it
  beyond the `OnEmit()` call. If a processor needs to buffer logs, it must
  explicitly copy them to the heap.
- This design allows for stack-only log processing when exporting to operating
  system native facilities like `Windows ETW` or `Linux user_events`.
- OTLP exporting requires network calls (HTTP/gRPC) and batching of logs for
  efficiency purposes. These exporters buffer log records by copying them to the
  heap. (More on this in the BatchLogProcessor section)

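
The borrowing rules described above can be illustrated with a small std-only
sketch. `Record`, `Processor`, `Enricher`, and `Buffering` are hypothetical
names chosen for this example; only the idea of passing `&mut` to each
processor in registration order comes from the design.

```rust
use std::sync::Mutex;

// Hypothetical, simplified log record.
#[derive(Clone, Debug, PartialEq)]
pub struct Record {
    pub body: String,
    pub attributes: Vec<(String, String)>,
}

pub trait Processor {
    // The processor only borrows the record for the duration of this call.
    fn on_emit(&self, record: &mut Record);
}

// Mutates the record in place; later processors observe the change.
pub struct Enricher;

impl Processor for Enricher {
    fn on_emit(&self, record: &mut Record) {
        record
            .attributes
            .push(("host".to_string(), "node-1".to_string()));
    }
}

// Needs to keep records beyond the call, so it must clone them.
#[derive(Default)]
pub struct Buffering {
    buffered: Mutex<Vec<Record>>,
}

impl Buffering {
    pub fn snapshot(&self) -> Vec<Record> {
        self.buffered.lock().unwrap().clone()
    }
}

impl Processor for Buffering {
    fn on_emit(&self, record: &mut Record) {
        self.buffered.lock().unwrap().push(record.clone());
    }
}

// The logger owns the record and lends it to each processor in order.
pub fn emit(processors: &[&dyn Processor], mut record: Record) -> Record {
    for processor in processors {
        processor.on_emit(&mut record);
    }
    record
}
```

A processor that exports synchronously would never need the `clone()` shown in
`Buffering`, which is what keeps that path free of extra copies.
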
#### `LogRecord`

- Holds log data, including attributes.
- Uses an inline array for up to 5 attributes to optimize stack usage.
- Falls back to a heap-allocated `Vec` if more attributes are required.
- Inspired by Go’s `slog` library for efficiency.

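
A minimal sketch of the inline-array-then-`Vec` idea follows. The `Attributes`
and `Attribute` types, their fields, and the use of `i64` values are
assumptions made for this example and do not mirror the SDK's internal
container.

```rust
// Inline capacity taken from the description above.
const INLINE_CAPACITY: usize = 5;

// Hypothetical attribute type; an integer value keeps the example free of
// per-attribute heap allocations.
#[derive(Debug, Clone, PartialEq)]
pub struct Attribute {
    pub key: &'static str,
    pub value: i64,
}

#[derive(Default)]
pub struct Attributes {
    // The first five attributes live inline with the record.
    inline: [Option<Attribute>; INLINE_CAPACITY],
    inline_len: usize,
    // An empty Vec does not allocate; the heap is touched only on overflow.
    overflow: Vec<Attribute>,
}

impl Attributes {
    pub fn push(&mut self, attribute: Attribute) {
        if self.inline_len < INLINE_CAPACITY {
            self.inline[self.inline_len] = Some(attribute);
            self.inline_len += 1;
        } else {
            self.overflow.push(attribute);
        }
    }

    pub fn len(&self) -> usize {
        self.inline_len + self.overflow.len()
    }

    // True once at least one attribute had to go to the heap.
    pub fn spilled(&self) -> bool {
        !self.overflow.is_empty()
    }

    // Iterates in insertion order: inline slots first, then the overflow.
    pub fn iter(&self) -> impl Iterator<Item = &Attribute> {
        self.inline.iter().flatten().chain(self.overflow.iter())
    }
}
```
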
#### LogRecord Processors

`SdkLoggerProvider` can be configured with any number of LogProcessors. They
are called in the order of registration. Log records are passed to the `OnEmit`
method of each LogProcessor. LogProcessors can be used to process the log
records, enrich them, filter them, and export them to destinations by
leveraging LogRecord Exporters.

The following built-in log processors are provided in the Logs SDK:

##### SimpleLogProcessor

This processor is designed to be used for exporting purposes. Export is handled
by an Exporter (which is a separate component). SimpleLogProcessor is "simple"
in the sense that it does not attempt to do any processing - it just calls the
exporter and passes the log record to it. To comply with the OTel specification,
it synchronizes calls to the `Export()` method, i.e., only one `Export()` call
will be in progress at any given time.

SimpleLogProcessor is intended only for testing and learning purposes and is
often used along with a `stdout` exporter.

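
One way to get the "only one `Export()` at a time" guarantee is to keep the
exporter behind a `Mutex`, as in the std-only sketch below. The `Exporter`
trait, `SimpleProcessor`, and `VecExporter` are hypothetical names, and using a
`Mutex` here is an assumption for illustration rather than a statement about
how the SDK implements it.

```rust
use std::sync::Mutex;

// Hypothetical exporter interface; `&mut self` makes exclusive access explicit.
pub trait Exporter {
    fn export(&mut self, record: &str);
}

pub struct SimpleProcessor<E> {
    exporter: Mutex<E>,
}

impl<E: Exporter> SimpleProcessor<E> {
    pub fn new(exporter: E) -> Self {
        SimpleProcessor { exporter: Mutex::new(exporter) }
    }

    // Called on the emitting thread. The lock is held for the whole export,
    // so at most one export call is in flight at any time.
    pub fn on_emit(&self, record: &str) {
        let mut exporter = self.exporter.lock().unwrap();
        exporter.export(record);
    }

    // Consumes the processor and returns the exporter, e.g. for inspection.
    pub fn into_exporter(self) -> E {
        self.exporter.into_inner().unwrap()
    }
}

// Test-style exporter that collects records in memory.
#[derive(Default)]
pub struct VecExporter {
    pub exported: Vec<String>,
}

impl Exporter for VecExporter {
    fn export(&mut self, record: &str) {
        self.exported.push(record.to_string());
    }
}
```

The lock sits directly on the emit path in this sketch, so every emitting
thread waits for the export in progress. That cost is one reason a processor of
this shape suits tests and demos better than high-throughput use.
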
##### BatchLogProcessor

This is another "exporting" processor. As with SimpleLogProcessor, a different
component named LogExporter handles the actual export logic. BatchLogProcessor
buffers/batches the logs it receives into an in-memory buffer. It invokes the
exporter every 1 second or when 512 items are in the batch (customizable). It
uses a background thread to do the export, and communication between the user
thread (where logs are emitted) and the background thread occurs with `mpsc`
channels.

The maximum number of items the buffer holds is 2048 (customizable). Once the
limit is reached, any *new* logs are dropped. It *does not* apply back-pressure
to the user thread and instead drops logs.

As with SimpleLogProcessor, this component also ensures only one export is
active at a given time. A modified version of this is required to achieve higher
throughput in some environments.

In this design, at most 2048+512 logs can be in memory at any given point. In
other words, that many logs can be lost if the app crashes in the middle.

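
The drop-on-full and batch-or-interval behavior can be sketched with the
standard library's bounded `sync_channel`. The names `BoundedEmitter`,
`bounded_queue`, and `run_worker` are hypothetical, and the choice of
`sync_channel` with `try_send` is an assumption for this example; the text
above only states that `mpsc` channels are used.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, RecvTimeoutError, SyncSender};
use std::time::{Duration, Instant};

// Producer side, used on the thread that emits logs.
pub struct BoundedEmitter {
    tx: SyncSender<String>,
    dropped: AtomicUsize,
}

impl BoundedEmitter {
    // Never blocks: if the queue is full (or the worker is gone), the new
    // record is dropped and counted instead of applying back-pressure.
    pub fn emit(&self, record: String) {
        if self.tx.try_send(record).is_err() {
            self.dropped.fetch_add(1, Ordering::Relaxed);
        }
    }

    pub fn dropped(&self) -> usize {
        self.dropped.load(Ordering::Relaxed)
    }
}

pub fn bounded_queue(max_queue_size: usize) -> (BoundedEmitter, Receiver<String>) {
    let (tx, rx) = sync_channel(max_queue_size);
    (BoundedEmitter { tx, dropped: AtomicUsize::new(0) }, rx)
}

// Consumer side, intended to run on a background thread. Exports when the
// batch is full or the interval elapses, and flushes once all senders are gone.
pub fn run_worker(
    rx: Receiver<String>,
    max_batch_size: usize,
    interval: Duration,
    mut export: impl FnMut(Vec<String>),
) {
    let mut batch = Vec::with_capacity(max_batch_size);
    let mut deadline = Instant::now() + interval;
    loop {
        let timeout = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(timeout) {
            Ok(record) => {
                batch.push(record);
                if batch.len() < max_batch_size {
                    continue;
                }
            }
            Err(RecvTimeoutError::Timeout) => {}
            Err(RecvTimeoutError::Disconnected) => {
                if !batch.is_empty() {
                    export(batch);
                }
                return;
            }
        }
        // Reached on a full batch or an elapsed interval.
        if !batch.is_empty() {
            export(std::mem::take(&mut batch));
        }
        deadline = Instant::now() + interval;
    }
}
```

In an application, `run_worker` would be started with `std::thread::spawn`, and
`bounded_queue(2048)` with a batch size of 512 and a one-second interval would
match the defaults quoted above. Because a single worker thread calls `export`,
only one export is active at a time.
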
## LogExporters

LogExporters are responsible for exporting logs to a destination. Some of them
include:

1. **InMemoryExporter** - exports to an in-memory list, primarily for
   unit-testing. This is used extensively in the repo itself, and external users
   are also encouraged to use this.
2. **Stdout exporter** - prints telemetry to stdout. Only for debugging/learning
   purposes. The output format is not defined and is not performance optimized.
   A production-recommended version with a standardized output format is
   planned.
3. **OTLP Exporter** - OTel's official exporter, which uses the OTLP protocol
   that is designed with the OTel data model in mind. Both HTTP and gRPC-based
   exporting are offered.
4. **Exporters to OS Kernel facilities** - These exporters are not maintained in
   the core repo but are listed for completeness. They export telemetry to
   Windows ETW or Linux user_events. They are designed for high-performance
   workloads. Because they export synchronously, they do not require
   buffering/batching. This allows logs to operate entirely on the stack and
   scale easily with the number of CPU cores. (The kernel uses per-CPU buffers
   for the events, ensuring no contention)

## `tracing` Log Appender

The `tracing` appender is part of the
[opentelemetry-appender-tracing](https://crates.io/crates/opentelemetry-appender-tracing)
crate.

The `tracing` appender bridges `tracing` logs to OpenTelemetry. Logs emitted via
`tracing` macros (`info!`, `warn!`, etc.) are forwarded to OpenTelemetry through
this integration.

- `tracing` is designed for high performance, using *layers* or *subscribers* to
  handle emitted logs (events).
- The appender implements a `Layer`, receiving logs from `tracing`.
- Uses the OTel Logs API to create a `LogRecord`, populate it, and emit it via
  `Logger.emit(LogRecord)`.
- If no Logs SDK is present, the process is a no-op.

Note on terminology: Within OpenTelemetry, "tracing" refers to distributed
tracing (i.e., the creation of Spans) and not in-process structured logging and
execution traces. The `tracing` crate has a notion of creating Spans as well as
Events. Events from the `tracing` crate are what get converted to OTel Logs
when using this appender. Spans created using the `tracing` crate are not
handled by this appender.

## Performance

// Call out things done specifically for performance

### Perf test - benchmarks

// Share ~~ numbers

### Perf test - stress test

// Share ~~ numbers

## Summary

- OpenTelemetry Logs does not provide a user-facing logging API.
- Instead, it integrates with existing logging libraries (`log`, `tracing`).
- The Logs API defines key traits but performs no operations unless an SDK is
  installed.
- The Logs SDK enables log processing, transformation, and export.
- The Logs SDK is performance optimized to minimize copying and heap allocation,
  wherever feasible.
- The `tracing` appender efficiently routes logs to OpenTelemetry without
  modifying existing logging workflows.

docs/design/metrics.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# OpenTelemetry Rust Metrics Design

Status:
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md)

TODO:

docs/design/traces.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# OpenTelemetry Rust Traces Design

Status:
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md)

TODO:

opentelemetry-otlp/examples/basic-otlp-http/Cargo.toml

Lines changed: 0 additions & 1 deletion
@@ -10,7 +10,6 @@ default = ["reqwest-blocking"]
 reqwest-blocking = ["opentelemetry-otlp/reqwest-blocking-client"]
 
 [dependencies]
-once_cell = { workspace = true }
 opentelemetry = { path = "../../../opentelemetry" }
 opentelemetry_sdk = { path = "../../../opentelemetry-sdk" }
 opentelemetry-otlp = { path = "../.."}

opentelemetry-otlp/examples/basic-otlp-http/src/main.rs

Lines changed: 14 additions & 10 deletions
@@ -1,4 +1,3 @@
-use once_cell::sync::Lazy;
 use opentelemetry::{
     global,
     trace::{TraceContextExt, Tracer},
@@ -11,16 +10,21 @@ use opentelemetry_sdk::Resource;
 use opentelemetry_sdk::{
     logs::SdkLoggerProvider, metrics::SdkMeterProvider, trace::SdkTracerProvider,
 };
-use std::error::Error;
+use std::{error::Error, sync::OnceLock};
 use tracing::info;
 use tracing_subscriber::prelude::*;
 use tracing_subscriber::EnvFilter;
 
-static RESOURCE: Lazy<Resource> = Lazy::new(|| {
-    Resource::builder()
-        .with_service_name("basic-otlp-example-http")
-        .build()
-});
+fn get_resource() -> Resource {
+    static RESOURCE: OnceLock<Resource> = OnceLock::new();
+    RESOURCE
+        .get_or_init(|| {
+            Resource::builder()
+                .with_service_name("basic-otlp-example-grpc")
+                .build()
+        })
+        .clone()
+}
 
 fn init_logs() -> SdkLoggerProvider {
     let exporter = LogExporter::builder()
@@ -31,7 +35,7 @@ fn init_logs() -> SdkLoggerProvider {
 
     SdkLoggerProvider::builder()
         .with_batch_exporter(exporter)
-        .with_resource(RESOURCE.clone())
+        .with_resource(get_resource())
         .build()
 }
 
@@ -44,7 +48,7 @@ fn init_traces() -> SdkTracerProvider {
 
     SdkTracerProvider::builder()
         .with_batch_exporter(exporter)
-        .with_resource(RESOURCE.clone())
+        .with_resource(get_resource())
         .build()
 }
 
@@ -57,7 +61,7 @@ fn init_metrics() -> SdkMeterProvider {
 
     SdkMeterProvider::builder()
         .with_periodic_exporter(exporter)
-        .with_resource(RESOURCE.clone())
+        .with_resource(get_resource())
         .build()
 }
 

opentelemetry-otlp/examples/basic-otlp/Cargo.toml

Lines changed: 0 additions & 1 deletion
@@ -6,7 +6,6 @@ license = "Apache-2.0"
 publish = false
 
 [dependencies]
-once_cell = { workspace = true }
 opentelemetry = { path = "../../../opentelemetry" }
 opentelemetry_sdk = { path = "../../../opentelemetry-sdk" }
 opentelemetry-otlp = { path = "../../../opentelemetry-otlp", features = ["grpc-tonic"] }
