Classic mirrored queues ended up without an electable leader #9379
Describe the bug
We had an incident in production where a mirrored queue stopped working and could not recover, but we cannot explain why. It seems as if the cluster just 'forgot' which node was the master for this queue. Some background information:
Reproduction steps
Expected behavior
We expected that in a high availability setup, when one node encounters problems, another node would take over, and the queue would continue to function.
Additional context
This log entry coincides exactly with when we started experiencing issues with the queue:
But this message does not make sense because no pods were shut down/restarted. All were up for 23 days. There is also no sign of an actual restart in the logs (usually you would see a
Attempts to reconnect to the queue were met with timeouts:
We eventually deleted and recreated the queue, and this solved the issue. This was shown in the logs when the queue was deleted:
The logs for each instance are below. They are the only logs from the day of the incident.
Node 1:
Node 2:
Node 3:
Replies: 1 comment
RabbitMQ 3.9 has reached end of life. Classic mirrored queues have been deprecated for a couple of years now, scheduled to be removed in 4.0, which is expected to come out in Q1-Q2 next year. You must switch to quorum queues and/or streams (and upgrade from 3.9).
is the line you are looking for. This scenario has a dedicated documentation section.
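For readers following the advice to move to quorum queues, here is a minimal sketch of declaring one from a client. It assumes the Python `pika` client; the helper names, the queue name `orders`, and the `localhost` connection are hypothetical and not taken from the original report. A quorum queue is requested with the `x-queue-type` argument and must be durable, non-exclusive, and not auto-delete.

```python
def quorum_queue_declare_kwargs(name):
    """Build keyword arguments for declaring a quorum queue (hypothetical helper)."""
    return {
        "queue": name,
        "durable": True,       # quorum queues must be durable
        "exclusive": False,    # and cannot be exclusive
        "auto_delete": False,  # or auto-delete
        "arguments": {"x-queue-type": "quorum"},
    }


def declare_quorum_queue(channel, name):
    """Declare a quorum queue on an already open channel (hypothetical helper)."""
    return channel.queue_declare(**quorum_queue_declare_kwargs(name))


# Usage with pika, assuming a broker reachable on localhost:
#
#   import pika
#   connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   channel = connection.channel()
#   declare_quorum_queue(channel, "orders")
#   connection.close()
```

The queue type is fixed at declaration time, so an existing classic mirrored queue cannot be switched in place. It would need to be deleted and redeclared, or the quorum queue declared under a new name, with publishers and consumers moved over to it.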