Better GSM tracing and metrics #6306
Merged
Description
Three commits:
- More consistent naming.
- Higher severities for the GSM state transition messages. These are very rare and (especially with Genesis enabled) strongly influence the behavior of a node, so Notice seems appropriate. GsmEventLeaveCaughtUp indicates that something is wrong, as the chain isn't growing from the perspective of the node, so Warning is advisable.
- Add a metric for the GSM state (cardano_node_metrics_GSM_state_int), which encodes PreSyncing = 0, Syncing = 1 and CaughtUp = 2. Tooling such as Grafana can then easily display the GSM state over time.

Currently, the initial GSM state isn't traced; this is fixed by IntersectMBO/ouroboros-consensus#1628.
Works towards IntersectMBO/ouroboros-consensus#1530
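
For illustration, the severity choices and the metric encoding described above could be sketched roughly as follows. This is a minimal, self-contained sketch and not the code from this PR: the types GsmState, GsmEvent and Severity are simplified stand-ins, the event constructors other than GsmEventLeaveCaughtUp are assumed names, and the real events in ouroboros-consensus may carry additional data.

```haskell
module Main (main) where

-- Hypothetical stand-in for the GSM state; the real type may differ.
data GsmState = PreSyncing | Syncing | CaughtUp
  deriving (Show, Eq, Enum, Bounded)

-- Hypothetical, payload-free stand-in for the GSM transition events.
-- Only GsmEventLeaveCaughtUp is named in the PR description; the other
-- constructor names are assumptions made for this sketch.
data GsmEvent
  = GsmEventEnterCaughtUp
  | GsmEventLeaveCaughtUp
  | GsmEventPreSyncingToSyncing
  | GsmEventSyncingToPreSyncing
  deriving (Show, Eq, Enum, Bounded)

-- Simplified severity scale, just enough for the example.
data Severity = Info | Notice | Warning
  deriving (Show, Eq, Ord)

-- Integer encoding of the state, as described for
-- cardano_node_metrics_GSM_state_int.
gsmStateToInt :: GsmState -> Int
gsmStateToInt PreSyncing = 0
gsmStateToInt Syncing    = 1
gsmStateToInt CaughtUp   = 2

-- Leaving CaughtUp means the chain is not growing from the node's
-- perspective, hence Warning; all other transitions are Notice.
gsmEventSeverity :: GsmEvent -> Severity
gsmEventSeverity GsmEventLeaveCaughtUp = Warning
gsmEventSeverity _                     = Notice

main :: IO ()
main = do
  mapM_ (\s -> putStrLn (show s ++ " = " ++ show (gsmStateToInt s)))
        [minBound .. maxBound :: GsmState]
  mapM_ (\e -> putStrLn (show e ++ ": " ++ show (gsmEventSeverity e)))
        [minBound .. maxBound :: GsmEvent]
```

With an integer encoding like this, a dashboard can plot the state as a step graph over time and, for example, alert when the value drops below 2.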
Checklist
- See Running tests for more details
- CHANGELOG.md for affected package
- .cabal files are updated
- hlint. See .github/workflows/check-hlint.yml to get the hlint version
- stylish-haskell. See .github/workflows/stylish-haskell.yml to get the stylish-haskell version
- ghc-9.6 and ghc-9.12
Note on CI
If your PR is from a fork, the necessary CI jobs won't trigger automatically for security reasons.
You will need someone with write privileges to trigger them. Please contact IOG node developers to do this
for you.