Skip to content

Conversation

@ndr-ds
Copy link
Contributor

@ndr-ds ndr-ds commented Sep 12, 2025

Motivation

We currently compile out trace logs, even though the overhead is minimal (it's 2 CPU instructions https://www.reddit.com/r/rust/comments/x9nypb/comment/inutkiv/)

Proposal

Stop compiling it out, so we can have trace logs if we want for debugging with just a restart, without having to recompile the binary.

Test Plan

I'm benchmarking networks with this, saw no visible performance regression

Release Plan

  • Nothing to do / These changes follow the usual release cycle.

Copy link
Contributor Author

ndr-ds commented Sep 12, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@Twey
Copy link
Contributor

Twey commented Sep 13, 2025

What does max_level_trace do, concretely? As far as I know there's no class of logs more verbose than trace 🤔

@ma2bd
Copy link
Contributor

ma2bd commented Sep 13, 2025

Not sure about this. It's also the difference between tracing and debugging

Copy link
Contributor Author

ndr-ds commented Sep 15, 2025

@Twey It actually does nothing 😅 i.e. it's the same as if we remove it, I think.
@ma2bd if we're sure that we'll never need trace level logs in production, then maybe this isn't needed, but that hasn't been the case in some of the investigations we've done. Also, why not just have it available? There's no overhead, and we're still protected by whatever we set RUST_LOG to be 🤔

@ndr-ds ndr-ds marked this pull request as ready for review September 15, 2025 13:07
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from 568a3b0 to 3d9a544 Compare September 15, 2025 13:38
@ndr-ds ndr-ds force-pushed the 09-09-decrease_block_cache_size_increase_execution_state_cache_size branch from d7b6485 to 62be5ef Compare September 15, 2025 13:38
@ndr-ds ndr-ds requested review from Twey, afck, deuszx and ma2bd September 15, 2025 13:49
@ndr-ds ndr-ds force-pushed the 09-09-decrease_block_cache_size_increase_execution_state_cache_size branch from 62be5ef to f768ed5 Compare September 16, 2025 03:05
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from 3d9a544 to 7c77512 Compare September 16, 2025 03:05
@ndr-ds ndr-ds force-pushed the 09-09-decrease_block_cache_size_increase_execution_state_cache_size branch from f768ed5 to 9f0c9c4 Compare September 16, 2025 13:01
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from 7c77512 to a30737b Compare September 16, 2025 13:01
@ndr-ds ndr-ds changed the base branch from 09-09-decrease_block_cache_size_increase_execution_state_cache_size to graphite-base/4554 September 16, 2025 17:28
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from a30737b to 6832dc0 Compare September 16, 2025 17:42
@ndr-ds ndr-ds changed the base branch from graphite-base/4554 to 09-09-decrease_block_cache_size_increase_execution_state_cache_size September 16, 2025 17:42
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from 6832dc0 to bf70753 Compare September 16, 2025 17:56
@ndr-ds ndr-ds changed the base branch from graphite-base/4554 to main September 17, 2025 14:30
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from dfb6602 to dd8730c Compare September 17, 2025 16:10
Copy link
Contributor

@deuszx deuszx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you compare binary sizes?

@ndr-ds ndr-ds changed the base branch from main to graphite-base/4554 September 22, 2025 13:50
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from dd8730c to 766c4cd Compare September 22, 2025 13:50
@ndr-ds ndr-ds changed the base branch from graphite-base/4554 to 09-12-tracing_dashboard September 22, 2025 13:50
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from 766c4cd to 6e92379 Compare September 22, 2025 14:17
@ndr-ds ndr-ds force-pushed the 09-12-tracing_dashboard branch 2 times, most recently from 2b5ccd2 to da54d77 Compare September 22, 2025 17:12
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch 2 times, most recently from a28cb7b to f91b05d Compare September 22, 2025 17:24
@ndr-ds ndr-ds force-pushed the 09-12-tracing_dashboard branch from da54d77 to c08e648 Compare September 22, 2025 17:24
@ndr-ds ndr-ds changed the base branch from 09-12-tracing_dashboard to graphite-base/4554 October 9, 2025 18:58
@ndr-ds ndr-ds force-pushed the graphite-base/4554 branch from c08e648 to 9fa3039 Compare October 9, 2025 18:58
@ndr-ds ndr-ds force-pushed the 09-10-use_release_max_level_trace_instead branch from f91b05d to 2edac28 Compare October 9, 2025 18:58
@ndr-ds ndr-ds changed the base branch from graphite-base/4554 to main October 9, 2025 18:58
@ma2bd
Copy link
Contributor

ma2bd commented Oct 10, 2025

What I'm saying is that one could very well define "trace" as the logs that you will never use in production. Perhaps because they are too slow, or perhaps for security/privacy reasons.

Then, "debug" would be the logs you usually don't activate in production, but you could do so for debugging. Etc.

@Twey
Copy link
Contributor

Twey commented Oct 10, 2025

What I'm saying is that one could very well define "trace" as the logs that you will never use in production. Perhaps because they are too slow, or perhaps for security/privacy reasons.

It's impossible to tell in advance what logs will definitely never be needed to debug an issue that may arise in production. If we know for sure that a log line will never be needed in production, I'd suggest we just don't write it (or at least never commit it to main).

Then, "debug" would be the logs you usually don't activate in production, but you could do so for debugging. Etc.

Unfortunately that leaves us in a very awkward place with respect to INFO. We want to be able to do at least some level of retrospective debugging, at least on our internal validators. If DEBUG is turned off by default, that means that we have to either overwhelm INFO (the log level of interest to people who don't know/care about Linera codebase internals) with internal debugging details and references to concepts that end users shouldn't have to care about, or accept that we won't be able to figure out what happened to a service in the past without trying to reproduce the issue with DEBUG turned on.

@deuszx
Copy link
Contributor

deuszx commented Oct 11, 2025

I am of an opinion that TRACE is for logs that are very spammy and give full coverage of the flows. Rarely turned on on production deployments due to the amount of logs produced.

Fortunately, log levels can be set selectively on the per-module base as well. So we could say: linera=INFO,linera::storage=TRACE.

@ndr-ds ndr-ds added this pull request to the merge queue Oct 22, 2025
Merged via the queue into main with commit 30fcdb6 Oct 22, 2025
86 of 194 checks passed
@ndr-ds ndr-ds deleted the 09-10-use_release_max_level_trace_instead branch October 22, 2025 15:38
ndr-ds added a commit that referenced this pull request Oct 30, 2025
## Motivation

We currently compile out trace logs, even though the overhead is minimal
(it's 2 CPU instructions
https://www.reddit.com/r/rust/comments/x9nypb/comment/inutkiv/)

## Proposal

Stop compiling it out, so we can have `trace` logs if we want for
debugging with just a restart, without having to recompile the binary.

## Test Plan

I'm benchmarking networks with this, saw no visible performance
regression

## Release Plan

- Nothing to do / These changes follow the usual release cycle.
ndr-ds added a commit that referenced this pull request Oct 30, 2025
## Motivation

We currently compile out trace logs, even though the overhead is minimal
(it's 2 CPU instructions
https://www.reddit.com/r/rust/comments/x9nypb/comment/inutkiv/)

## Proposal

Stop compiling it out, so we can have `trace` logs if we want for
debugging with just a restart, without having to recompile the binary.

## Test Plan

I'm benchmarking networks with this, saw no visible performance
regression

## Release Plan

- Nothing to do / These changes follow the usual release cycle.
ndr-ds added a commit that referenced this pull request Oct 30, 2025
## Motivation
Backport a few PRs:

- **Make sending traces to Tempo less confusing (#4771)**
- **Add otlp_exporter_endpoint option to proxy/server as well (#4829)**
- **Stop using init and entrypoint scripts (#4830)**
- **Use release_max_level_trace instead (#4554)**
- **Make all heap profile logs trace (#4845)**
- **exporter: Add retry logging to indexer connection (#4846)**
- **Fix notification forwarding to avoid duplicate messages and handle
lag (#4848)**
- **Fix a few explorer frontend bugs (#4849)**
- **Add support for exporter/indexer/explorer stack deployment (#4850)**
- **Stop getting service monitor CRD from GitHub (#4855)**
- **Fix small bug on exporter config (#4870)**
- **Make sure that the Scylla config ConfigMap exists when Scylla gets
created (#4871)**
- **Remove UNIQUE constraints – those should be tracked by the node
anyway (#4852)**

## Proposal

Backport

## Test Plan

CI

---------

Co-authored-by: deuszx <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants