3.11.28 is an old version that is out of community support. Please try your test against the only supported community version: 3.13.x
Hi,
When a node in a cluster is restarting, running `rabbitmqctl list_queues` on another node can sometimes cause the second node to hang. All subsequent `rabbitmqctl` and `rabbitmq-diagnostics` commands fail because the target node is unreachable. The hung node also shows high, persistent CPU use by the `beam.smp` Erlang process. The node recovers only when we restart the container running RabbitMQ.

There are previous reports of the `list_queues` command hanging, but in this case it is the node that it is run against that hangs. The `list_queues` command itself exits immediately with the following error:

We have been able to reproduce the issue with both `list_queues` and `list_unresponsive_queues` on version 3.11.28.

Reproduction steps:
- `rabbitmqctl stop_app` etc.:
- `list_queues` command loop:

Corresponding output from node 2:

The `for` loop completes successfully on node 2 and node 2 remains responsive.

In some cases, node 1 does NOT become unresponsive and the `for` loop continues even after receiving the `badrpc` error:

Other information
- `rabbit.log` on the node that is hung. Even enabling debug logs did not yield anything new.
- `erl -remsh` also times out:
- `rabbitmq-diagnostics` on the node before it hangs also eventually times out when the node becomes unresponsive:
- `top` output showing 100% CPU core usage:
- `cluster_status` on other nodes shows that the status of hung nodes is `unknown`:
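For anyone trying to reproduce this, the command loop from the reproduction steps could look roughly like the sketch below. It is not the script used in this report. `probe_loop` and its arguments are hypothetical names, and it assumes bash plus the GNU coreutils `timeout` utility so that a hung call cannot block the loop forever.

```shell
#!/usr/bin/env bash
# Sketch only: run a probe command repeatedly and stop at the first failure.
# probe_loop is a hypothetical helper, not part of RabbitMQ.

# Usage: probe_loop <iterations> <per-call timeout in seconds> <command> [args...]
probe_loop() {
  local iterations=$1 limit=$2
  shift 2
  local i status
  for ((i = 1; i <= iterations; i++)); do
    # GNU `timeout` kills the probe if it hangs and then exits with status 124.
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if ((status != 0)); then
      echo "iteration $i failed with exit status $status"
      return "$status"
    fi
  done
  echo "all $iterations iterations succeeded"
}
```

On one node this might be invoked as `probe_loop 100 30 rabbitmqctl list_queues` while another node is being restarted. The iteration count and the 30 second limit are arbitrary values chosen for illustration.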