[Prometheus] Metrics are sometimes returning 0 #15355
Describe the bug
We recently upgraded RabbitMQ running under Docker from rabbitmq:4.1.3-management to rabbitmq:4.2.2-management. After the upgrade, several Prometheus metrics started behaving inconsistently. This began about 7 hours post upgrade, at midnight. The metrics we noticed this with are:
We are scraping the /metrics/per-object endpoint every 10s. Metric counters would randomly return 0 and then go back up to what appears to be a correct value. Not all of them return 0, only some, at random. Metrics returned from the Management Plugin do not seem to have this issue.

Reproduction steps
Expected behavior
Time out or return the correct metric.

Additional context
Only one log message from this time window, and it does not seem unusual:
rabbit_sysmon_handler busy_dist_port <0.859.0> [{name,delegate_management_4},{initial_call,{delegate,init,1}},{gen_server2,process_next_msg,1},{message_queue_len,0}] {#Port<0.18>,unknown}
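To document the symptom more precisely, a small poller can record every sample where a counter goes backwards, since counters are expected to be monotonic between restarts. This is only a rough sketch, not an official RabbitMQ tool: the endpoint URL, port 15692, the absence of authentication/TLS, and the rabbitmq_ prefix filter are assumptions to adjust for your deployment.

```python
# Minimal sketch: poll /metrics/per-object and flag counter regressions.
# Assumptions: endpoint on localhost:15692, no auth/TLS, 10s interval.
import time
import urllib.request

ENDPOINT = "http://localhost:15692/metrics/per-object"  # assumed host/port
WATCH_PREFIX = "rabbitmq_"   # track every rabbitmq_* sample line
INTERVAL_S = 10              # matches the 10s scrape interval in the report

def scrape():
    """Fetch the text exposition format and return {sample_with_labels: value}."""
    samples = {}
    with urllib.request.urlopen(ENDPOINT, timeout=5) as resp:
        for raw in resp.read().decode("utf-8").splitlines():
            if not raw or raw.startswith("#") or not raw.startswith(WATCH_PREFIX):
                continue
            # "metric{labels} value" -> value is the last whitespace-separated field
            name, _, value = raw.rpartition(" ")
            try:
                samples[name] = float(value)
            except ValueError:
                pass
    return samples

previous = {}
while True:
    current = scrape()
    now = time.strftime("%Y-%m-%dT%H:%M:%S")
    for key, value in current.items():
        prev = previous.get(key)
        # A counter should never go down; a sudden drop to 0 and back up
        # matches the symptom described above.
        if prev is not None and value < prev:
            print(f"{now} regression: {key} went {prev} -> {value}")
    previous = current
    time.sleep(INTERVAL_S)
```

Correlating the printed timestamps with entries in the node's log (such as the busy_dist_port message above) can help narrow down what changes around midnight.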
@dforste that "doesn't seem unusual" message tells you that the inter-node communication link on that node has been overloaded for a certain amount of time continuously. That can directly affect the metrics that are aggregated across all nodes: not all responses arrive within the short timeout such operations use, and therefore you get underreported metrics in the UI. Without clear evidence of other scenarios, that's my conclusion. Perhaps you have periodic processes that publish large messages running at midnight, or something like that. |