RabbitMQ sometimes unresponsive ("doing siesta") #6944
This does not match what's in your config files, but you seem to be using the caching auth backend. Another hypothesis: you use LDAP, an external dependency. If it slows down or fails to accept connections without refusing them immediately, the node has to wait before it can respond to the client. Start by removing both of those backends, then re-introduce LDAP and get that to work.
I suggest making it clear in the log files when an LDAP server is unresponsive.
I am running RabbitMQ 3.11.5 / Erlang 25.1.2 on a RHEL 8.6 system, patched until Nov 30, 2022.
My system runs as a three-node cluster with quorum queues. An HAProxy load balancer sits in front of it, but my issue is independent of HAProxy.
After "some time" my nodes seem to enter a state where they do not respond (correctly) to connection attempts; instead, the client runs into a timeout after 12 seconds. The next connection attempt to the same node succeeds. "Some time" is anything below 30 minutes; I have not measured it yet.
Naturally, in this scenario clients going through the load balancer can get up to three timeouts before they connect successfully. But as I said: the issue is independent of the load balancer.
To produce some diagnostic data I wrote a small Perl script, attached as send.txt. It only does connect(), channel_open() and close(). It tends to run into "request timed out" while doing connect() the first time. See attached send.out.txt.
My users use the Java client and other libraries and experience the same thing. This is not a Perl issue; I use Perl just because I'm old. :-)
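For anyone who wants to reproduce this without Perl or a client library, here is a minimal sketch of a similar probe in Python. It is not the attached script: the function name `probe` and its return shape are made up for illustration, and port 5672 is only assumed to be the plain AMQP listener. It sends the AMQP 0-9-1 protocol header and times how long the node takes to answer with its first frame. In AMQP 0-9-1 the broker sends Connection.Start before the client supplies credentials, so a fast reply here would suggest the stall happens later, for example during authentication.

```python
import socket
import time

# AMQP 0-9-1 protocol header that a client sends right after the TCP connect.
AMQP_HEADER = b"AMQP\x00\x00\x09\x01"


def probe(host, port=5672, timeout=12.0):
    """Time the TCP connect and the wait for the broker's first frame.

    Hypothetical helper: returns (ok, connect_seconds, handshake_seconds, detail).
    """
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return False, time.monotonic() - start, 0.0, f"connect failed: {exc}"
    connected = time.monotonic()
    try:
        sock.settimeout(timeout)
        sock.sendall(AMQP_HEADER)
        # Frame header layout: type (1 byte), channel (2 bytes), size (4 bytes).
        reply = sock.recv(7)
    except socket.timeout:
        waited = time.monotonic() - connected
        return False, connected - start, waited, "timed out waiting for first frame"
    except OSError as exc:
        waited = time.monotonic() - connected
        return False, connected - start, waited, f"socket error: {exc}"
    finally:
        sock.close()
    waited = time.monotonic() - connected
    # Connection.Start arrives as a method frame (type 1) on channel 0.
    ok = len(reply) >= 3 and reply[0] == 1 and reply[1:3] == b"\x00\x00"
    detail = "got method frame on channel 0" if ok else f"unexpected reply: {reply!r}"
    return ok, connected - start, waited, detail
```

Calling `probe()` against each node directly, bypassing the load balancer, and logging the two durations over half an hour or so should show whether the 12-second stalls occur before or after the first frame.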
When the timeout happens, I see this in the log (but with a different number than the failing connection):
I have attached [email protected], rabbitmq.conf.txt, advanced.config.txt, output of rabbitmq-diagnostics.report.txt and rabbitmq-diagnostics.cluster_status.txt.
Can I add any other diagnostic data to help solve this issue?
On a side note: this link is displayed when opening a new issue, but the page does not exist:
https://github.com/rabbitmq/rabbitmq-server/blob/main/CONTRIBUTING.md#github-issues