Conversation

@jack-berg
Member

@pellared and I were having a discussion recently which highlighted some differences between log appender implementations in Go and other languages like Java and .NET.

The comment I linked to spells out the differences, but at a high level: Go log appenders implement the entire logger API, whereas a typical Java log appender is plugged into the built-in logger implementation as an extension that is only responsible for mapping a log record to the otel log API.

The otel log API wasn't designed to support bridges which act as a full implementation of a log API. This is not to say that it cannot be evolved to meet the requirements, but I want to explore whether there is a simpler option. Can Go log appenders be written to more closely mirror the appender model in other languages like Java?

This draft PR sketches out that idea for zap:

  • We have a new implementation of zapcore.Core called WrappedCore, which requires the following parameters:
    • delegate zapcore.Core - a delegate core instance with the user's existing log config
    • provider log.LoggerProvider - an otel log provider, with the user's otel log configuration. Typically, this will be a batch log processor paired with an otlp exporter.
  • WrappedCore delegates to the delegate for all methods. However, when Check is called, it registers a CheckWriteHook, which is invoked after the log entry is written. The hook is responsible for bridging the log to the otel logger provider.
  • Since we delegate to the core for everything, the core configuration is solely responsible for evaluating whether or not a record should be logged based on the user's configuration. And only logs that are written to the delegate core are bridged to opentelemetry.
  • The bridging becomes a simple mapping exercise between the zap and otel log data models.
  • Users don't need to worry about configuring their otel logger provider with complex configuration pipelines including multiple destinations, sampling, log level thresholds, etc. All of that is configured in the delegate core.

Note: I'm not a go developer and this is only meant to sketch out an idea. If the go SIG likes this direction, someone would have to pick it up and run with it.

@codecov

codecov bot commented Mar 19, 2025

Codecov Report

Attention: Patch coverage is 41.50943% with 31 lines in your changes missing coverage. Please review.

Project coverage is 75.5%. Comparing base (06004ce) to head (0e7afaf).
Report is 144 commits behind head on main.

Files with missing lines Patch % Lines
bridges/otelzap/core.go 41.5% 27 Missing and 4 partials ⚠️
Additional details and impacted files


@@           Coverage Diff           @@
##            main   #6964     +/-   ##
=======================================
- Coverage   75.6%   75.5%   -0.1%     
=======================================
  Files        207     207             
  Lines      19354   19407     +53     
=======================================
+ Hits       14643   14665     +22     
- Misses      4275    4302     +27     
- Partials     436     440      +4     
Files with missing lines Coverage Δ
bridges/otelzap/core.go 81.8% <41.5%> (-16.0%) ⬇️

@pellared
Member

Here are existing alternatives:

@jack-berg
Member Author

Here are existing alternatives:

The Custom zapcore.Core solution is essentially the same thing I propose here. It more closely mirrors the concept of an appender that the otel log API was designed to support.

Why not provide a built in utility so this pattern is easily accessible?

@pellared
Member

pellared commented Mar 25, 2025

Why not provide a built in utility so this pattern is easily accessible?

Based on the feedback we have gotten so far, users are not interested in it and prefer the current design, which offers more diverse setups.

PS. Notice that even the issue author decided not to use the custom zapcore.Core solution.

@jack-berg
Member Author

Based on the feedback we got so far, users are not interested in it and prefer the current design which offers more diverse setups.
PS. Notice that even the issue author decided not to use the custom zapcore.Core solution.

Well, the user was given the choice between simple configuration and providing their own core implementation, so of course they chose the config option.

The point of this rearrangement is that it shifts the config story to zap and away from the OTel SDK. Go's requirements around multiple independent processor pipelines, filtering / sampling, and processor-level severity configuration are a function of Go log appenders being designed to act as a full implementation of the log API. As discussed, the otel log API / SDK were not designed to replace existing log frameworks as a feature-rich logging library.

Existing logging libraries generally provide a much richer set of features than what is defined in OpenTelemetry. It is NOT a goal of OpenTelemetry to ship a feature-rich logging library.

@pellared
Member

pellared commented Mar 25, 2025

The point of this rearrangement is that it shifts the config story to zap

It does not shift the config to zap but to one of the zapcore.Core implementations. Users do not have to rely on other zapcore.Core implementations; they may choose to emit logs only via OTLP, which is a more performant way of emitting logs.

@jack-berg
Member Author

It does not shift the config to zap but to one of the zapcore.Core implementations.

Yes, and most of the time that will be the zapcore.Core implementation that is published from the same repo as the zap API.

Users may choose to emit logs only via OTLP.

People won't want to do this. Sending to a network location will (almost) always be supplementary to logging to stdout / the console. You're always going to need an escape hatch to see what's happening if the network exporter is failing.

@pellared
Member

pellared commented Mar 25, 2025

People won't want to do this. Sending to a network location will (almost) always be supplementary to logging to the stdout / console. You're always going to need an escape hatch to see what's happening if the network exporter is failing.

otelzap is used by the OTel Collector:

otelzap is also used by other open-source repositories; see: https://pkg.go.dev/go.opentelemetry.io/contrib/bridges/otelzap?tab=importedby. Here is an example where users want to just use the OTel Logs SDK with OTLP exporter for production config: https://github.com/danielbukowski/twitch-chatbot/blob/main/internal/logger/logger.go.

It also follows the OTel Logging Specification:

The second approach is to modify the application so that the logs are output via a network protocol, e.g. via OTLP.
[...]
The addons implement sending over such network protocols, which would then typically require small, localized changes to the application code to change the logging target.
[...]
log appenders use the API to bridge logs from existing logging libraries to the OpenTelemetry data model, where the SDK controls how the logs are processed and exported.

We received no feedback from users that supports your statement. So far we have received only positive feedback about the Go Logs API, SDK, and bridges design (e.g. in person during KubeCon EU 2024 Paris). Since Aug 23, 2024 (the initial release of otelzap) we have only been asked to add one feature:

PS. People can always use the STDOUT Exporter in case they have some problems e.g, with the network.

@jack-berg
Member Author

We received no feedback from users that supports your statement.

The design of the OpenTelemetry API and SDK built on top of years of iteration and feedback supports my statement. The whole design is built on top of the assumption that there is a set of popular log APIs with implementations that have rich feature sets that would be difficult / wasteful to recreate / compete with.

What about all the users that wanted multiple independent logging pipelines, with the filtering and severity configuration options? These types of requests reflect a user base that wants to: 1. continue using their existing log API and configuration to route logs to stdout, and 2. bolt on the additional capability to export logs to an OTLP network location. You don't ask for multiple independent pipelines if you only care about exporting to an OTLP network location.

Here is an example where users want to just use the OTel Logs SDK with OTLP exporter for production config

So only exporting logs to a network location? No local logging to stdout or filesystem? If so, this is a bad idea and not a pattern we should promote. Imagine being the person responsible for maintaining such an application when an inevitable network issue occurs! 😬

PS. People can always use the STDOUT Exporter in case they have some problems e.g, with the network.

Yes, but with a very opinionated encoding and basically no config options. What if they prefer a different encoding (i.e. not the verbose protobuf JSON, or a different timestamp format)? Or want to log to files with periodic rotation? Or want to export to a different network location, like Kafka or a database? Mature log frameworks already have strong configuration stories for all of these features. Why reinvent the wheel by re-implementing them in OpenTelemetry?

@pellared
Member

pellared commented Mar 25, 2025

The design of the OpenTelemetry API and SDK built on top of years of iteration and feedback supports my statement. The whole design is built on top of the assumption that there is a set of popular log APIs with implementations that have rich feature sets that would be difficult / wasteful to recreate / compete with.

The assumption seems wrong for Go. You are actually getting feedback in opposition to your statement.

Mature log frameworks already have strong configuration stories for all of these features. Why reinvent the wheel by re-implementing in OpenTelemetry?

Go logging libraries that are considered mature, such as slog and zap, do not have "strong configuration stories". More: open-telemetry/opentelemetry-specification#3917 (comment)

I think it is important to share that the slog bridge was even reviewed by the author of slog: #5138 (comment)

@pellared
Member

pellared commented Mar 25, 2025

So only exporting logs to a network location? No local logging to stdout or filesystem? If so, this is a bad idea and not a pattern we should promote. Imagine being the person responsible for maintaining such an application when an inevitable network issue occurs! 😬

I think this is an orthogonal problem that should still have a solution in the SDK; relying on stdout may also not be an appropriate solution. Here is an example of how it could be solved at the SDK level: open-telemetry/opentelemetry-specification#3645. Currently, users can also choose to use an OTLP file exporter.

@jack-berg
Member Author

The assumption seems wrong for Go. You are actually getting feedback in opposition with your statement.

But I don't see a technical reason for Go's divergence. I was initially under the impression that something about the Go language design or log ecosystem necessitated re-implementing the entire log API, but with this draft PR, I now see that it's a design decision.

So I ask: why make a design decision that diverges from the intended design of the OpenTelemetry log API / SDK? Sure, you've met user requirements by defining new concepts like the filtering processor and LogRecordProcessor#enabled, but these aren't necessary if Go log appenders don't implement the entire log API.

The reason this all matters is consistency across languages. OpenTelemetry is a better project if logs feel familiar regardless of which language implementation you're using. Imagine this through the lens of declarative config. We want to get to a place where you can take the same SDK config and plug it into all your apps written in different languages. This is a strong user story. Log SDK config looks about the same for almost all languages: a batch processor paired with an exporter sending to an OTLP network location. But for Go, the config needs to look quite different, because if you plug this simple config into Go, you'll only be logging to a network location, which is almost always the wrong thing.

I think this is an orthogonal problem that should still have a solution in the SDK

It's not a problem at all if a log appender is not responsible for being a full implementation of a log API.

@pellared
Member

pellared commented Mar 26, 2025

I was initially under the impression that something about the go language design or log ecosystem necessitated re-implementing the entire log API
[...]
It's not a problem at all if a log appender is not responsible for being a full implementation of a log API.

You were under the correct impression.
The slog.Handler (as well as zapcore.Core and logr.Sink) is expected to implement the whole logs "backend".
More: https://go.dev/blog/slog.

but with this draft PR

The fact that something is possible does not mean that it is idiomatic in Go (or idiomatic usage of the Go logging libraries).

Similarly, for Rust's log crate, the logging implementation is expected to implement fn enabled.

Notice also that with recent changes in the OTel Spec it is allowed to use the Logs API directly (without any log appender/bridge). This was needed e.g. to allow instrumentation libraries to emit log records (or event records) without requiring them to depend on any concrete logging library. With this, it becomes necessary to add fundamental logging features to the Logs SDK. Languages where adding a Logger.Enabled for optimization purposes is not idiomatic or necessary do not need newly added features like LogRecordProcessor#enabled.

you've met user requirements by defining new concepts like filtering processor and LogRecordProcessor#enabled, but these aren't necessary if go log appenders don't implement the entire log API.

This is not true. For instance, filtering processors can be used to set different log levels for different exporters (we call it a "minimum severity processor") or to add a "trace-based sampling" filtering processor.

@jack-berg
Member Author

The slog.Handler (as well as zapcore.Core and logr.Sink) is expected to implement the whole logs "backend".

Just as it was possible to wrap a zapcore.Core, it appears possible to wrap a slog.Handler.

The fact that something is possible does not mean that it is idiomatic in Go (or idiomatic usage of the Go logging libraries).

You obviously have the authority on what constitutes idiomatic Go, but I've read the docs and am struggling to find guidance that expresses a preference for implementing the entire backend (zapcore.Core, slog.Handler, etc.) over wrapping.

In Java, we have a fractured log ecosystem, with major log APIs including JUL (java.util.logging), Log4j, SLF4J, and Logback. All of these log APIs want a good interop story with alternative log APIs, so they publish bridges which implement competing APIs purely at the API level. The idea is that a user selects a log API and backend for their app (say Logback), installs bridges such that logs from JUL, Log4j, and SLF4J all get routed to the Logback API, and then configures the Logback implementation (i.e. the backend, in Go vocabulary) to control how the logs from all the APIs are handled.

This is analogous to the otel Go log story. The otel Go log appenders bridge slog, zap, etc. to otel at the API level. Users install these bridges and configure the otel log SDK (i.e. the backend) to control how the logs from all the APIs are handled.

We could have taken that direction in otel Java, and bridged JUL, Log4j, SLF4J, and Logback to the otel log API at the API level. There is plenty of prior art in Java for bridging log APIs at the API level, such that we would have been able to make a strong case that doing so is idiomatic in Java.

But again, the otel log API / SDK were designed for bridging at the backend level, not at the API level. The idea was always to allow users to bolt on OTLP logs to whatever log tools they were already using. And again, the benefit of bridging at the backend level and not the API level is that it means the otel log SDK doesn't have to be fully featured.

This is not true.

What's not true? The claim that I make is "you've met user requirements by defining new concepts like filtering processor and LogRecordProcessor#enabled, but these aren't necessary if go log appenders don't implement the entire log API".

It is true that we don't need SDK features like the filtering processor or LogRecordProcessor#enabled if the otel log SDK isn't trying to be a fully featured log backend. Those are the types of features that normally come from the log backend. If bridging at the backend level, all an otel log SDK really needs to do is buffer logs and send them to an OTLP network location.

And we would most likely have designed them differently if the otel log SDK didn't need to be a fully featured log backend. Specifically, we would likely have filtering by logger and severity at the LoggerProvider level instead of at the LogRecordProcessor level. Configuration at the LogRecordProcessor level is less ergonomic than at the LoggerProvider level, and only makes sense if you are prioritizing multiple independent processing pipelines, which, again, is not a priority if we're not trying to build a fully featured log backend.

@jack-berg jack-berg closed this Apr 11, 2025
@pellared
Member

pellared commented Apr 17, 2025

The claim that I make is "you've met user requirements by defining new concepts like filtering processor

From my previous comment:

filtering processors can be used e.g. to set different log levels for different exporters (we call it "minimum severity processor")

Thanks for your feedback 🙏
