consensus: standardize slow block JSON output for cross-client metrics by CPerezz · Pull Request #10288 · NethermindEth/nethermind

CPerezz · 2026-01-21T09:47:28Z

Summary

Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz.

Motivation

Standardized execution metrics are critical for:

Cross-client performance comparison
Network health monitoring
Data-driven protocol research

Real-world example: The EIP-7907 analysis used execution metrics to measure code read latency, per-call overhead scaling, and block execution breakdown. Without standardized metrics across clients, such analysis cannot be validated cross-client.

References

Original proposal: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ (Gary Rong)
EIP-7907 analysis: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

Configuration

Two new flags in Blocks config section:

Flag	Default	Description
`Blocks.SlowBlockThresholdMs`	`-1` (disabled)	Threshold in ms for logging slow blocks. Set to `1000` to log blocks slower than 1s. Set to `0` to log all blocks.
`Blocks.SlowBlockPerTxThresholdMs`	`-1` (disabled)	Per-transaction threshold in ms. When enabled, individual transactions exceeding this threshold are included in the slow block JSON. Set to `0` to log all transactions.

Both flags default to -1 (disabled). When disabled, there is zero overhead on block processing — no metric captures, no delta computation, no JSON serialization.

When enabled, the overhead is ~31μs per block (~0.08% on a 40ms block) from reading thread-local counters. Per-tx timing adds ~6μs for a 150-tx block when enabled.

Example usage

# Log blocks slower than 1s with all transactions
--Blocks.SlowBlockThresholdMs=1000 --Blocks.SlowBlockPerTxThresholdMs=0

# Log blocks slower than 500ms, only transactions slower than 50ms
--Blocks.SlowBlockThresholdMs=500 --Blocks.SlowBlockPerTxThresholdMs=50

# Log every block, no per-tx breakdown
--Blocks.SlowBlockThresholdMs=0

JSON Output Format

{
  "level": "warn",
  "msg": "Slow block",
  "block": {
    "number": 22075123,
    "hash": "0xabc...",
    "gas_used": 29850000,
    "gas_limit": 30000000,
    "tx_count": 142,
    "blob_count": 6
  },
  "timing": {
    "execution_ms": 52.1,
    "evm_ms": 45.2,
    "blooms_ms": 1.3,
    "receipts_root_ms": 3.8,
    "commit_ms": 2.1,
    "storage_merkle_ms": 18.4,
    "state_root_ms": 5.2,
    "state_hash_ms": 23.6,
    "total_ms": 82.0
  },
  "throughput": {
    "mgas_per_sec": 364.1
  },
  "state_reads": {
    "accounts": 1842,
    "storage_slots": 3201,
    "code": 215,
    "code_bytes": 482310
  },
  "state_writes": {
    "accounts": 312,
    "accounts_deleted": 0,
    "storage_slots": 1805,
    "storage_slots_deleted": 12,
    "code": 3,
    "code_bytes": 8420,
    "eip7702_delegations_set": 0,
    "eip7702_delegations_cleared": 0
  },
  "cache": {
    "account": { "hits": 1520, "misses": 322, "hit_rate": 82.52 },
    "storage": { "hits": 2800, "misses": 401, "hit_rate": 87.48 },
    "code": { "hits": 180, "misses": 35, "hit_rate": 83.72 }
  },
  "evm": {
    "opcodes": 1542000,
    "sload": 3201,
    "sstore": 1805,
    "calls": 2412,
    "empty_calls": 48,
    "creates": 3,
    "self_destructs": 0,
    "contracts_analyzed": 118,
    "cached_contracts_used": 97
  },
  "transactions": [
    {
      "index": 42,
      "hash": "0xdef...",
      "gas_used": 8200000,
      "execution_ms": 12.3,
      "type": "Legacy",
      "to": "0x7a25..."
    },
    {
      "index": 107,
      "hash": "0x123...",
      "gas_used": 5100000,
      "execution_ms": 8.7,
      "type": "EIP1559",
      "to": "0xdead..."
    }
  ]
}

Timing field definitions

Field	Definition
`total_ms`	Wall-clock block processing time
`execution_ms`	`total - state_hash - commit` (backwards-compatible)
`evm_ms`	`total - state_hash - commit - blooms - receipts_root` (pure EVM)
`blooms_ms`	Bloom filter calculation (parallel)
`receipts_root_ms`	Receipts Patricia trie root computation
`commit_ms`	State commit without root recalculation
`storage_merkle_ms`	Storage trie merkleization (`Commit(commitRoots: true)`)
`state_root_ms`	State root recalculation + commit tree
`state_hash_ms`	`storage_merkle + state_root` (total merkleization)

The transactions array is only present when SlowBlockPerTxThresholdMs >= 0.

Implementation

All metrics use ZeroContentionCounter (lock-free thread-local counters) — no locks, no contention
Timing uses Stopwatch.GetTimestamp() only at coarse block-phase boundaries (not per state read/write)
JSON is serialized via Utf8JsonWriter + ArrayBufferWriter<byte> on the ThreadPool, not the block-processing thread
Per-tx timing uses a thread-static array with one Stopwatch.GetTimestamp() per tx (~20ns)
Delta computation and JSON output are queued to ThreadPool via UnsafeQueueUserWorkItem

Implements execution metrics following the cross-client specification: https://github.com/ethereum/execution-specs/blob/main/docs/execution-metrics-spec.md - Add slow block JSON logging (threshold: 1000ms) - Include timing, throughput, and EVM operation counts - Leverage existing SLOAD/SSTORE/CALL metrics infrastructure - Output format matches cross-client standardization spec

Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research. This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ The standardized metrics enabled data-driven analysis like the EIP-7907 research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850 JSON format includes: - block: number, hash, gas_used, tx_count - timing: execution_ms, total_ms - throughput: mgas_per_sec - state_reads: accounts, storage_slots, code, code_bytes - state_writes: accounts, storage_slots - cache: account/storage/code hits, misses, hit_rate - evm: sload, sstore, calls, creates Also adds ThreadLocal accessors to Db.Metrics for per-block cache statistics tracking without global synchronization overhead.

Convert timing calculations from long to double to preserve sub-millisecond precision in slow block JSON output. Changes: - ProcessingStats: timing variables changed to double with float division - Math.Round(..., 3) for consistent 3 decimal place output - ZeroContentionCounter: add Increment(long) overload for type safety

Copilot

Pull request overview

This pull request implements standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research. The changes add comprehensive execution metrics tracking and output them in a cross-client standardized JSON format when block processing exceeds a threshold.

Changes:

Added ThreadLocal accessor methods to Db.Metrics for per-block cache statistics
Defined new state access and timing metrics in Evm.Metrics (account/storage/code reads and writes)
Integrated metric tracking into StateProvider and PersistentStorageProvider for state access monitoring
Updated ProcessingStats to capture metrics and output slow block logs in standardized JSON format

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
src/Nethermind/Nethermind.State/StateProvider.cs	Adds account read tracking via EvmMetrics.IncrementAccountReads()
src/Nethermind/Nethermind.State/PersistentStorageProvider.cs	Adds storage read tracking via EvmMetrics.IncrementStorageReads()
src/Nethermind/Nethermind.Evm/Metrics.cs	Defines new state access and timing metrics with ThreadLocal accessors
src/Nethermind/Nethermind.Evm/CodeInfoRepository.cs	Adds code read and code bytes tracking
src/Nethermind/Nethermind.Db/Metrics.cs	Adds ThreadLocal accessors for cache hit metrics
src/Nethermind/Nethermind.Core/Threading/ZeroContentionCounter.cs	Adds long value support for metrics tracking
src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs	Implements slow block JSON logging with comprehensive metrics capture

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/Nethermind/Nethermind.Evm/Metrics.cs

src/Nethermind/Nethermind.State/StateProvider.cs

src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs

src/Nethermind/Nethermind.Evm/Metrics.cs

Add tracking for EIP-7702 delegation set/cleared operations as part of the cross-client execution metrics standardization effort. New metrics in Nethermind.Evm.Metrics: - Eip7702DelegationsSet: Number of EIP-7702 delegations set - Eip7702DelegationsCleared: Number of EIP-7702 delegations cleared Both metrics include thread-local variants for use in slow block logging. The slow block JSON output now includes: - state_writes.eip7702_delegations_set - state_writes.eip7702_delegations_cleared These fields will be 0 for pre-Pectra blocks (per spec).

ContractsAnalysed tracks jump destination analysis, not cache misses. CodeReads is incremented when code is loaded from DB (cache miss).

Instead of duplicating total_ms, calculate: execution_ms = total_ms - state_read_ms - state_hash_ms - commit_ms Falls back to total_ms if timing metrics not fully captured.

- Add IncrementCodeWrites/IncrementCodeBytesWritten in InsertCode - Add IncrementAccountWrites in SetState - Add IncrementStorageWrites in Set

Move IncrementAccountReads inside GetState to only count trie/DB reads, matching storage read semantics. Previously counted all reads including in-memory cache hits.

…metrics Pipe timing data to EvmMetrics for state_read_ms, state_hash_ms, and commit_ms: - WorldStateMetricsDecorator: pipe existing timing to EvmMetrics for RecalculateStateRoot, Commit (both overloads), and CommitTree - StateProvider.GetState: add timing around DB reads (not cache hits) - PersistentStorageProvider.LoadFromTree: add timing for storage reads Uses Stopwatch.GetTimestamp()/GetElapsedTime() with TimeSpan.Ticks (100ns precision) for minimal overhead (~10-20ns per measurement).

…tandardization Add new metrics for tracking account and storage deletions during EVM execution: - AccountDeleted: incremented when SetState is called with null account - StorageDeleted: incremented when SSTORE sets a slot to zero - Both metrics flow through ProcessingStats to slow block JSON output

Wire up existing but unused EIP-7702 delegation metric increment methods: - IncrementEip7702DelegationsSet() when setting delegation (non-zero address) - IncrementEip7702DelegationsCleared() when clearing delegation (zero address)

Add comprehensive integration tests for all cross-client execution metrics: - EIP-7702 delegation tests (set, clear, multiple) - Account metrics tests (reads, writes, deleted) - Storage metrics tests (reads, writes, deleted) - Code metrics tests (loaded, updated)

LukaszRozmej

Looks good, added some questions.

Food for thought @benaadams
With so many metrics, maybe it would be useful to have them continuous in memory, so we could make copies by copying memory region and potentially delta's using SIMD?

src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs

src/Nethermind/Nethermind.Core.Test/MetricsSnapshot.cs

benaadams · 2026-01-27T09:11:16Z

Number of additional thread locals bothers me; probably better to have as normal statics but only write on main block (i.e. are these metrics relevent to any other background or parallel activities?)

They are only interesting if mainnet block is slow?

Also they are expensive to write to, so rolling up counts via class field, and then writing batch change would be better

LukaszRozmej · 2026-01-27T10:03:22Z

Number of additional thread locals bothers me; probably better to have as normal statics but only write on main block (i.e. are these metrics relevent to any other background or parallel activities?)

They are only interesting if mainnet block is slow?

Also they are expensive to write to, so rolling up counts via class field, and then writing batch change would be better

I'm kind of thinking it shouldn't be a static, or the only threadstatic should be an object (struct?). That would require bigger changes.

Dead code — the MetricsSnapshot and MetricsDelta classes were never referenced anywhere in the codebase.

src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs

flcl42 · 2026-01-27T11:04:58Z

src/Nethermind/Nethermind.Evm.Test/MetricsIntegrationTests.cs

+    private void DeployCode(Address address, byte[] code)
+    {
+        _stateProvider.CreateAccountIfNotExists(address, 0);
+        _stateProvider.InsertCode(address, code, _specProvider.GetSpec(ForkActivation.TimestampOnly(MainnetSpecProvider.PragueBlockTimestamp)));


Can user just Prague.Instance or so instead of getting it from the spec provider

Yes but this will need to be maintained always. In fact, we would have to use Osaka here to be sure we are fwd. compatible no?

I updated anyways, but I'm not fully convinced..

It's just so using mainnet timestamps for single spec provider does not look perfect to me. Btw you can init a _releaseSpec some where with value like Prague.Instance or Osaka.Instance and reuse it.

If you mean that you would better use just the top mainnet spec, we might try to introduce Mainnet.Instance that points to the most recent fork and we change it when time comes. Can't recall of something existing like this

flcl42 · 2026-01-27T11:26:52Z

src/Nethermind/Nethermind.State/PartialStorageProviderBase.cs

        public void Set(in StorageCell storageCell, byte[] newValue)
        {
+            // Track storage write for execution metrics
+            EvmMetrics.IncrementStorageWrites();


@benaadams @LukaszRozmej maybe we can make some generic argument trick to avoid calling metric if don't need to(if MetricsOff is passed metrics are excluded due to unconditional if)?

…cumulator Replace 16 separate ZeroContentionCounter instances (each with its own ThreadLocal<BoxedLong>) with a single ThreadLocal<ExecutionMetricsAccumulator>. This addresses Ben's review feedback about the number of additional thread locals on hot execution paths. The consolidated accumulator reduces ThreadLocal overhead by requiring only one TLS lookup per increment instead of one per counter. Prometheus exposure is preserved via SumExecutionMetric() which iterates the shared ThreadLocal's Values collection. Call sites now use direct field increments (e.g. ThreadExecutionMetrics.AccountReads++) instead of method calls.

CPerezz · 2026-01-27T12:19:15Z

@LukaszRozmej @benaadams Can yoou look at ead23b9 and see if that cuts it for you?

…simplify test spec - Replace anonymous object in LogSlowBlock with named record types for structured JSON serialization (SlowBlockLog, SlowBlockInfo, etc.) - Simplify spec retrieval in MetricsIntegrationTests to use Prague.Instance directly instead of resolving via MainnetSpecProvider

LukaszRozmej · 2026-01-30T10:07:52Z

@CPerezz I think it is fine for now, but it will break when parallel execution comes in. Regardless if it is Block-STM or BAL based.

asdacap · 2026-01-30T11:33:29Z

Maybe this can help. https://github.com/NethermindEth/nethermind/blob/master/src/Nethermind/Nethermind.Init/Modules/MainProcessingContext.cs#L39. There is a HintGet for get that is cached by prewarmer.

CPerezz · 2026-01-30T14:04:13Z

@CPerezz I think it is fine for now, but it will break when parallel execution comes in. Regardless if it is Block-STM or BAL based.

Hey @LukaszRozmej thanks! Indeed the plan is to also start metering based on BALs via: https://notes.ethereum.org/@ethpandaops/bal-otel

CPerezz · 2026-02-09T11:41:07Z

@benaadams @LukaszRozmej any way we can merge this this week?

So far Reth and Besu are almost done, so you're the last team we are waiting on such that the stateful tests come with more insights.

LMK if you need any help 🙏

…flicts with master - Replace ExecutionMetricsAccumulator with ZeroContentionCounter pattern - Remove Stopwatch.GetTimestamp() from per-read hot paths (StateProvider, PersistentStorageProvider) - Use Utf8JsonWriter for zero-allocation slow block JSON serialization - Collapse BlockData Start/Current pairs into pre-computed deltas - Adapt to master's refactored CodeInfoRepository, WrappedWorldState, IProcessingStats - Condense tests from ~1600 to ~530 lines

Address PR review feedback from flcl42.

…block log Add per-phase timing (blooms_ms, receipts_root_ms, storage_merkle_ms, state_root_ms, evm_ms) and extended EVM/block fields (opcodes, self_destructs, empty_calls, contracts_analyzed, cached_contracts_used, blob_count, gas_limit) to the slow block JSON output. execution_ms retains its original definition for backwards compatibility. All new timing is captured via the same ZeroContentionCounter pattern used by existing metrics — ~25μs overhead per block.

Wire slow block threshold to IBlocksConfig so it can be set via config file or CLI. Default: 1000ms (log blocks slower than 1s). Set to 0 to log all blocks, -1 to disable slow block logging entirely. When disabled (-1), CaptureStartStats and delta computation are skipped on the block-processing thread — zero overhead from slow block analysis.

Slow block logging is opt-in. Set Blocks.SlowBlockThresholdMs=1000 to enable logging blocks slower than 1s.

New config flag Blocks.SlowBlockPerTxThresholdMs (default -1 = disabled). When enabled alongside SlowBlockThresholdMs, each transaction exceeding the per-tx threshold is included individually in the slow block JSON: "transactions": [ { "index": 42, "hash": "0x...", "gas_used": 8200000, "execution_ms": 12.3, "type": "Legacy", "to": "0x..." } ] Uses a thread-static PerTxTimingCollector with one Stopwatch.GetTimestamp per tx (~20ns). When disabled, zero overhead — the collector is not even checked in the tx loop.

When SlowBlockThresholdMs defaults to -1, processingMs >= -1 was always true causing every block to be logged. But CaptureStartStats/UpdateStats correctly skipped delta capture for -1, resulting in all-zero fields.

The async processing loop (RunProcessingLoop) can resume on a different ThreadPool thread after each await WaitToReadAsync. Since _enabled is [ThreadStatic], setting it once in Start() only affects the first thread. Moving it to CaptureStartStats ensures it's set on the current block- processing thread for every block.

CPerezz requested review from Demuirgos, LukaszRozmej, MarekM25, asdacap, benaadams, damian-orzechowski and flcl42 as code owners January 21, 2026 09:47

CPerezz added 2 commits January 21, 2026 12:00

CPerezz force-pushed the feature/execution-metrics-standardization branch from b6dfcec to 8cdef27 Compare January 21, 2026 11:01

benaadams requested a review from Copilot January 22, 2026 10:41

Copilot started reviewing on behalf of benaadams January 22, 2026 10:42 View session

Copilot AI reviewed Jan 22, 2026

View reviewed changes

CPerezz added 7 commits January 22, 2026 12:20

fix(metrics): set slow block threshold to 1000ms per spec

170e2cf

fix(metrics): use CodeReads for code cache miss tracking

1c0573d

ContractsAnalysed tracks jump destination analysis, not cache misses. CodeReads is incremented when code is loaded from DB (cache miss).

fix(metrics): derive execution_ms from timing breakdown

a355db7

Instead of duplicating total_ms, calculate: execution_ms = total_ms - state_read_ms - state_hash_ms - commit_ms Falls back to total_ms if timing metrics not fully captured.

feat(metrics): add state write tracking

20aadd6

- Add IncrementCodeWrites/IncrementCodeBytesWritten in InsertCode - Add IncrementAccountWrites in SetState - Add IncrementStorageWrites in Set

fix(metrics): track account reads at DB level only

84c84c7

Move IncrementAccountReads inside GetState to only count trie/DB reads, matching storage read semantics. Previously counted all reads including in-memory cache hits.

CPerezz force-pushed the feature/execution-metrics-standardization branch from 147cd96 to cd54c85 Compare January 22, 2026 15:14

CPerezz mentioned this pull request Jan 25, 2026

fix(chainspec): add Prague support to spaceneth dev chain #10316

Merged

4 tasks

CPerezz force-pushed the feature/execution-metrics-standardization branch from cd54c85 to 49ba4db Compare January 25, 2026 21:27

CPerezz added 6 commits January 25, 2026 22:29

fix(chainspec): add maxCodeSize to spaceneth for EIP-3860

6d0a208

fix(chainspec): add explicit chainID to spaceneth

2061afc

fix(chainspec): add Prague system contracts to spaceneth genesis

b28789e

LukaszRozmej reviewed Jan 26, 2026

View reviewed changes

src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs Show resolved Hide resolved

src/Nethermind/Nethermind.Core.Test/MetricsSnapshot.cs Outdated Show resolved Hide resolved

chore: remove unused MetricsSnapshot.cs

d53bebc

Dead code — the MetricsSnapshot and MetricsDelta classes were never referenced anywhere in the codebase.

flcl42 reviewed Jan 27, 2026

View reviewed changes

shekhirin mentioned this pull request Jan 27, 2026

feat(engine): slow block logs paradigmxyz/reth#21433

Merged

benaadams added evm database labels Feb 13, 2026

CPerezz mentioned this pull request Feb 27, 2026

Gas Lighting Committee #18, Mar 3, 2026 ethpandaops/gas-lighting-tracker#38

Closed

kamilchodola added 2 commits March 20, 2026 16:13

Merge origin/master into feature/execution-metrics-standardization

b5f2594

github-actions bot added the state+storage label Mar 20, 2026

kamilchodola added 2 commits March 20, 2026 16:44

test: use Prague.Instance directly instead of spec provider lookups

88b20ab

Address PR review feedback from flcl42.

kamilchodola force-pushed the feature/execution-metrics-standardization branch from 8be8415 to 8d7cb78 Compare March 20, 2026 16:38

kamilchodola force-pushed the feature/execution-metrics-standardization branch from 5d6022c to ae494cf Compare March 20, 2026 16:49

kamilchodola requested a review from rubo as a code owner March 20, 2026 16:49

github-actions bot added the configuration label Mar 20, 2026

kamilchodola added 4 commits March 20, 2026 17:49

chore: default SlowBlockThresholdMs to -1 (disabled)

7103fa3

Slow block logging is opt-in. Set Blocks.SlowBlockThresholdMs=1000 to enable logging blocks slower than 1s.

fix: guard slow block logging when threshold is disabled (-1)

3511aef

When SlowBlockThresholdMs defaults to -1, processingMs >= -1 was always true causing every block to be logged. But CaptureStartStats/UpdateStats correctly skipped delta capture for -1, resulting in all-zero fields.

Conversation

CPerezz commented Jan 21, 2026 • edited by kamilchodola Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

References

Configuration

Example usage

JSON Output Format

Timing field definitions

Implementation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

LukaszRozmej left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

benaadams commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

LukaszRozmej commented Jan 27, 2026

Uh oh!

Uh oh!

flcl42 Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

CPerezz Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

flcl42 Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

flcl42 Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

CPerezz commented Jan 27, 2026

Uh oh!

LukaszRozmej commented Jan 30, 2026

Uh oh!

asdacap commented Jan 30, 2026

Uh oh!

CPerezz commented Jan 30, 2026

Uh oh!

CPerezz commented Feb 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

CPerezz commented Jan 21, 2026 •

edited by kamilchodola

Loading

benaadams commented Jan 27, 2026 •

edited

Loading