feat(log)!: intoduce log v2 with OpenTelemetry and slog#25701
feat(log)!: intoduce log v2 with OpenTelemetry and slog#25701technicallyty wants to merge 137 commits intomainfrom
Conversation
log/CHANGELOG.md
Outdated
|
|
||
| ## [Unreleased] | ||
|
|
||
| * [#25701](https://github.com/cosmos/cosmos-sdk/pull/25701) Introduce log v2, enabling OpenTelemetry logging with slog. The logging interface has been updated to accommodate Context logging methods, which allows correlation of logs with traces. |
There was a problem hiding this comment.
can we link to the upgrade guide for this?
log/logger.go
Outdated
| opt(&logCfg) | ||
| // newZerologLogger creates a Logger backed by zerolog directly. | ||
| // This is the fast path with zero allocations. | ||
| func newZerologLogger(dst io.Writer, cfg *Config) Logger { |
There was a problem hiding this comment.
previously we filled out the default config and overrode it. Why are we making this change and should cfg be nullable (a pointer)?
There was a problem hiding this comment.
we still do that, it's just pulled up to the top level NewLogger now. ill remove the pointers
| } | ||
| } | ||
|
|
||
| func TestVerboseMode(t *testing.T) { |
There was a problem hiding this comment.
did we remove this test accidentally?
There was a problem hiding this comment.
yes, adding back in
| // Impl returns the underlying zerolog logger. | ||
| // It can be used to use zerolog structured API directly instead of the wrapper. | ||
| func (l zeroLogWrapper) Impl() interface{} { | ||
| func (l *zeroLogWrapper) Impl() interface{} { |
There was a problem hiding this comment.
should return the concrete type here
There was a problem hiding this comment.
this would break the interface
| // nil = check LoggerProvider. | ||
| // true = force enable | ||
| // false = force disable | ||
| EnableOTEL *bool |
There was a problem hiding this comment.
i wanted to have presence checking here. if set at all, we know that this was explicitly set with the WithOTEL or WIthoutOTEL options. if unset, we can make a decision based on if they had a provider set in their otel configuration.
on second thought though, perhaps we just enable it if they have otel set at all. not actually sure its useful to have this expression here.
There was a problem hiding this comment.
should be a bool → false by default. no need for a pointer
There was a problem hiding this comment.
see above comment
| // this test ensures that when the With and WithContext methods are called, | ||
| // that the log wrapper is properly copied with all of its associated options | ||
| // otherwise, verbose mode will fail | ||
| func TestLoggerWith(t *testing.T) { |
There was a problem hiding this comment.
seems like this test would be fine to still include?
There was a problem hiding this comment.
added back, but removed the wrapper3 case because the return type changed, making that test case invalid.
server/cmd/execute.go
Outdated
| ctx := CreateExecuteContext(context.Background()) | ||
|
|
||
| rootCmd.PersistentFlags().String(flags.FlagLogLevel, zerolog.InfoLevel.String(), "The logging level (trace|debug|info|warn|error|fatal|panic|disabled or '*:<level>,<key>:<level>')") | ||
| rootCmd.PersistentFlags().String(flags.FlagLogLevel, "info", "The logging level (debug|info|warn|error|disabled or '*:<level>,<key>:<level>')") |
There was a problem hiding this comment.
can we not use the library values from slog etc or something?
There was a problem hiding this comment.
yeah added slog.InfoLevel.String() here instead of "info"
server/util.go
Outdated
| viper.New(), | ||
| cmtcfg.DefaultConfig(), | ||
| log.NewLogger(os.Stdout), | ||
| log.NewLogger("cosmos-sdk", log.WithConsoleWriter(os.Stdout)), |
There was a problem hiding this comment.
should we define a constant for this name? or a default somehow?
There was a problem hiding this comment.
added it to a const block
| @@ -1,373 +0,0 @@ | |||
| //go:build !app_v1 | |||
There was a problem hiding this comment.
why did this need to be removed (not opposed)
There was a problem hiding this comment.
oops. meant to add back after testing.
swift1337
left a comment
There was a problem hiding this comment.
Great to see OTEL in logging, left some comments related to the design
| FlagVerboseLogLevel = "verbose_log_level" | ||
| FlagLogFormat = "log_format" | ||
| FlagLogNoColor = "log_no_color" | ||
| FlagLogNoConsole = "log_no_console" |
There was a problem hiding this comment.
I think this refactoring is a good opportunity to fix flags' naming:
--log-level--log-level-verbose(what's that?)--log-format--log-no-color--log-disable-stdout→ is a more suitable name IMO
There was a problem hiding this comment.
trying to reduce API breakage here, leaning towards keeping the flags the same
as for --log-disable-stdout, kinda lean against that as well. we can log to stderr as well, and using a flag with that name could imply stderr logs will display, when the flag currently disables all console output entirely.
There was a problem hiding this comment.
I think it's fine to break a couple of flags since this PR already introduces breaking changes. Especially, since current log flags have this weird -- + _ casing.
we can log to stderr as well
good call, maybe then --log-disable-console ?
log/logger.go
Outdated
|
|
||
| // InfoContext takes a context, message and key/value pairs and logs with level INFO. | ||
| // The context is used for trace/span correlation when using OpenTelemetry. | ||
| InfoContext(ctx context.Context, msg string, keyVals ...any) |
There was a problem hiding this comment.
I find this API a bit too excessive. All of this can be expressed with:
// Logger is the Cosmos SDK logger interface.
type Logger interface {
// Info takes a message and a set of key/value pairs and logs with level INFO.
// The key of the tuple must be a string.
Info(msg string, keyVals ...any)
// Warn takes a message and a set of key/value pairs and logs with level WARN.
// The key of the tuple must be a string.
Warn(msg string, keyVals ...any)
// Error takes a message and a set of key/value pairs and logs with level ERR.
// The key of the tuple must be a string.
Error(msg string, keyVals ...any)
// Debug takes a message and a set of key/value pairs and logs with level DEBUG.
// The key of the tuple must be a string.
Debug(msg string, keyVals ...any)
// Ctx returns logger with context attached. Usable for propagating OTEL traces.
Ctx(ctx context.Context) Logger
// With returns a new wrapped logger with additional context provided by a set.
With(keyVals ...any) Logger
// Impl returns the underlying logger implementation.
// It is used to access the full functionalities of the underlying logger.
// Advanced users can type cast the returned value to the actual logger.
Impl() any
}Example:
app.logger.InfoContext(ctx, "InitChain", "initialHeight", req.InitialHeight, "chainID", req.ChainId)becomes →
app.logger.Ctx(ctx).Info("InitChain", "initialHeight", req.InitialHeight, "chainID", req.ChainId)this also allows to keep most of the code that doesn't need OTEL. Another benefit is that now we can reuse ctx logger in larger functions:
func (k Keeper) LargeMigration(...) error {
ctx, span := tracer.Start(...)
logger := app.logger.Ctx(ctx)
// 20 LoC ...
logger.Info("abc", "k", "v")
// another 50 LoC
logger.Info("xyz", "k2", "v2")
}logger.Ctx() can be a noop for zerolog. for slog we can have:
type slogLogger struct {
+ ctx context.Context
log *slog.Logger
}and call Info() / InfoContext() based on ctx == nil
There was a problem hiding this comment.
I think the rationale for all the InfoContext, etc. functions is basically to match the slog API. But your Ctx method approach also makes sense.
There was a problem hiding this comment.
yeah this is much better - updated here remove contextual methods, add context attachment
| // nil = check LoggerProvider. | ||
| // true = force enable | ||
| // false = force disable | ||
| EnableOTEL *bool |
There was a problem hiding this comment.
should be a bool → false by default. no need for a pointer
| The `cosmossdk.io/log` package provides a structured logging implementation for the Cosmos SDK using [zerolog](https://github.com/rs/zerolog) with optional OpenTelemetry integration. | ||
|
|
||
| To use a logger wrapping an instance of the standard library's `log/slog` package, use `cosmossdk.io/log/slog`. | ||
| ## Features |
There was a problem hiding this comment.
I think we miss two more features:
- Ability to set and get a logger from
ctx(zerolog example:ctx = zerolog.WithContext(ctx),zerolog.Ctx(ctx)) - Ability to retrieve a global logger for fallback ie use
log.Info(...)(where log is the package name)
This would result in a "feature parity" with other logging libs
There was a problem hiding this comment.
the global thing is interesting, i think id prefer to kick that can down the road though. esp since logging conventions in SDK are already pretty solidified imo. could be wrong though
There was a problem hiding this comment.
Agree, it's not urgent + easy to implement.
There was a problem hiding this comment.
I responded in the issue. Basically, I think this proposed context approach follow zerolog conventions but is quite different than the otel approach. If we're adopting otel more generally, I'd prefer we lean towards otel conventions.
| ctx context.Context | ||
| } | ||
|
|
||
| func (l slogLogger) Ctx(ctx context.Context) Logger { |
There was a problem hiding this comment.
Let's explicitly create a new struct
| func (l slogLogger) Ctx(ctx context.Context) Logger { | |
| func (l *slogLogger) Ctx(ctx context.Context) Logger { | |
| return &slogLogger{log: l.log, ctx: ctx} | |
| } |
| return slogLogger{log: log} | ||
| } | ||
|
|
||
| func (l slogLogger) Info(msg string, keyVals ...any) { |
There was a problem hiding this comment.
Let's skip slogLogger allocation for each call
| func (l slogLogger) Info(msg string, keyVals ...any) { | |
| func (l *slogLogger) Info(msg string, keyVals ...any) { |
| // | ||
| // // OTEL-only (no console output) | ||
| // logger := log.NewLogger("cosmos-sdk", log.WithOTEL(), log.WithoutConsole()) | ||
| func NewLogger(name string, opts ...Option) Logger { |
There was a problem hiding this comment.
What do you think of a simpler and more idiomatic naming?
| func NewLogger(name string, opts ...Option) Logger { | |
| func New(name string, opts ...Option) Logger { |
|
|
||
| // slogLogger satisfies Logger with logging backed by an instance of *slog.Logger. | ||
| type slogLogger struct { | ||
| log *slog.Logger |
There was a problem hiding this comment.
This isn't quite right. We can't wrap a *slog.Logger because then file and line information will be all wrong. We can only wrap a *slog.Handler but then we need to basically reimplement the logger functionality of checking enabled and capturing the program counter
Description
Closes: SDK-430
refactors the log package to add support for OpenTelemetry trace correlation. notable changes:
log.With<Option>rather thanlog.SomeOption