analyzer: Prometheus metric reports negative queue lengths, leading to massive uint64 value #1201

@uscinski

SUMMARY

{consensus,<runtime>}_queue_length metrics report massive values of 18446744073709552000, especially when Nexus is indexing most recent blocks.

ISSUE TYPE
  • Bug Report
STEPS TO REPRODUCE

Observe the metrics mentioned above, or run the following SQL query, which points to the root cause of the issue:

WITH latest_height AS (
  SELECT height
  FROM chain.latest_node_heights
  WHERE layer = 'consensus'
)
SELECT b.height
FROM chain.blocks b, latest_height h
WHERE b.height > h.height

This query occasionally returns a non-empty result. When it does, the height recorded by the node stats analyzer lags behind heights that have already been processed. In block.go, the metric is calculated as nodeHeight - height, which is negative in that case. When cast to uint64, the negative value wraps around to a massive positive integer close to 2^64.
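The wraparound described above can be reproduced in isolation. The following is a minimal sketch, not the code from block.go: the function names and the int64 input types are assumptions made for illustration, and the clamped variant is only one possible way to avoid the underflow.

```go
package main

import "fmt"

// naiveQueueLength mirrors the subtraction described in the issue.
// The int64 parameter types are an assumption; block.go may differ.
func naiveQueueLength(nodeHeight, height int64) uint64 {
	// If height > nodeHeight, the difference is negative and the
	// conversion wraps around to a value just below 2^64.
	return uint64(nodeHeight - height)
}

// clampedQueueLength is a hypothetical guard: it reports 0 whenever the
// processed height is at or ahead of the recorded node height.
func clampedQueueLength(nodeHeight, height int64) uint64 {
	if height >= nodeHeight {
		return 0
	}
	return uint64(nodeHeight - height)
}

func main() {
	// Recorded node height 100, but block 103 is already processed.
	fmt.Println(naiveQueueLength(100, 103))   // 18446744073709551613
	fmt.Println(clampedQueueLength(100, 103)) // 0
	fmt.Println(clampedQueueLength(103, 100)) // 3
}
```

Clamping to zero hides the fact that the recorded node height is stale, so a real fix might instead refresh the node height before computing the metric.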

ACTUAL RESULTS

Observe PromQL sum(consensus_queue_length{service=<service>})

Time                 Metric value
2025-11-15 18:04:00  18446744073709552000
2025-11-15 18:04:30  18446744073709552000
2025-11-15 18:05:00  18446744073709552000
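The reported value ends in 552000 rather than matching 2^64 minus a small number exactly. A likely explanation is that Prometheus sample values are 64-bit floats, so any wrapped uint64 just below 2^64 rounds to 2^64 itself and is then printed in shortest round-trip form. The sketch below illustrates this under that assumption; renderAsSample is a hypothetical helper, not part of any Prometheus API.

```go
package main

import (
	"fmt"
	"strconv"
)

// renderAsSample converts a uint64 to float64, as happens when a value
// becomes a Prometheus sample, and formats it with the shortest decimal
// representation that round-trips. Hypothetical helper for illustration.
func renderAsSample(v uint64) string {
	return strconv.FormatFloat(float64(v), 'f', -1, 64)
}

func main() {
	diff := int64(-3)       // e.g. nodeHeight - height when the analyzer lags
	wrapped := uint64(diff) // 18446744073709551613
	fmt.Println(wrapped)
	fmt.Println(renderAsSample(wrapped)) // 18446744073709552000
}
```

Because every small negative difference collapses to the same float, the exact lag cannot be recovered from the metric value alone.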
EXPECTED RESULTS

The metrics should report correct queue lengths.
