We don't have enough specifics to pinpoint the root cause. The only fundamental solution is to use Khepri, which will be considered a mature option by 4.0 this fall.
-
I had an issue with release 3.13.4, described in discussion #11712.
When 3.13.5 was released, everything seemed to be fixed, at least with the few updates that I tried.
Now, when updating from 3.13.5 to 3.13.6, I am getting a different breakage: the cluster goes into a network partition. This happens after the upgrade, when doing a rollout restart of the StatefulSet.
I am seeing lines like this in the attached logs:
2024-07-28 00:38:25.963849+00:00 [error] <0.1115.0> Mnesia('[email protected]'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, '[email protected]'}
2024-07-28 00:38:25.963849+00:00 [error] <0.1115.0>
2024-07-28 00:38:25.965275+00:00 [info] <0.1711.0> Autoheal request sent to '[email protected]'
The strange thing is that the cluster rebalance reports that it rebalanced across all the nodes in the cluster, and running the rabbitmq-queues quorum_status command on every RabbitMQ node shows consistent tables.
The only workaround I have found is to restart the nodes that the portal shows as partitioned from rabbitmq-0.
I am attaching the logs from all three nodes and a screen capture of the network partitioning message in the portal.

rabbitmq-2.txt
rabbitmq-1.txt
rabbitmq-0.txt
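
For anyone triaging logs like the ones attached, here is a minimal Python sketch that pulls out the Mnesia `inconsistent_database` events and the autoheal requests quoted above, so the timeline across nodes is easier to compare. It is not part of RabbitMQ: the names `Event` and `find_partition_events` are hypothetical, and the regular expressions assume the exact line format shown in the excerpt.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed format, taken from the log excerpt in this post:
#   <date> <time> [level] <pid> ... {inconsistent_database, <reason>, '<node>'}
PARTITION_RE = re.compile(
    r"^(?P<ts>\S+ \S+) .*\{inconsistent_database,\s*(?P<reason>\w+),\s*'(?P<peer>[^']+)'\}"
)
#   <date> <time> [level] <pid> Autoheal request sent to '<node>'
AUTOHEAL_RE = re.compile(
    r"^(?P<ts>\S+ \S+) .*Autoheal request sent to '(?P<peer>[^']+)'"
)


class Event(NamedTuple):
    timestamp: str  # date and time exactly as printed in the log
    kind: str       # Mnesia reason atom, or "autoheal_request"
    peer: str       # node name quoted in the log line


def find_partition_events(lines: Iterable[str]) -> List[Event]:
    """Return partition-related events in the order they appear."""
    events = []
    for line in lines:
        match = PARTITION_RE.search(line)
        if match:
            events.append(Event(match["ts"], match["reason"], match["peer"]))
            continue
        match = AUTOHEAL_RE.search(line)
        if match:
            events.append(Event(match["ts"], "autoheal_request", match["peer"]))
    return events
```

To use it, open each node's log file (for example rabbitmq-0.txt) and pass the file object to `find_partition_events`, then compare the timestamps and peers reported by each node.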