Keep exclusive/auto-delete queues with Khepri + network partition
[Why]
With Mnesia, when the network partition strategy is set to
`pause_minority`, nodes on the "minority side" are stopped.
Thus, the exclusive queues that were hosted by nodes on that minority
side are lost:
* Consumers connected on these nodes are disconnected because the nodes
are stopped.
* Queue records on the majority side are deleted from the metadata
store.
This was acceptable with Mnesia, given how this network partition
handling strategy is implemented. However, it does not work with Khepri
because the nodes on the "minority side" continue to run and serve
clients. As a result, the cluster ends up in an inconsistent situation:
1. The "majority side" deleted the queue records.
2. When the network partition is resolved, the "minority side" receives
the record deletion, but the queue processes continue to run.
The situation was similar for auto-delete queues.
[How]
With Khepri, we no longer delete transient queue records simply because
a node goes down. Thanks to this, an exclusive or auto-delete queue and
its consumer(s) are not affected by a network partition: they continue
to work.
However, if a node is permanently lost, we need to clean up dead queue
records. This was already done for durable queues with both Mnesia and
Khepri. But with Khepri, transient queue records persist in the store
like durable queue records (unlike with Mnesia).
That's why this commit renames the clean-up function
`rabbit_amqqueue:forget_all_durable/1` to
`rabbit_amqqueue:forget_all/1`, which deletes all queue records of
queues that were hosted on the given node, regardless of whether they
are transient or durable.
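As a rough illustration only (not the actual implementation), the
clean-up boils down to something like the Erlang sketch below, where
`queues_on_node/1` and `delete_record/1` are hypothetical placeholders
for the real metadata-store calls:

```erlang
-module(forget_all_sketch).
-export([forget_all/1]).

%% Hypothetical sketch: delete every queue record, durable or
%% transient, for queues that were hosted on the lost node.
forget_all(DeadNode) ->
    Queues = queues_on_node(DeadNode),
    lists:foreach(fun delete_record/1, Queues),
    ok.

%% Hypothetical placeholder: list queue records hosted on a node.
queues_on_node(_Node) ->
    [].

%% Hypothetical placeholder: delete a single queue record from the
%% metadata store.
delete_record(_QName) ->
    ok.
```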
In addition, the queue process spawns a temporary process that retries
the deletion of the underlying record indefinitely, if no other process
is waiting for a reply from the queue process. That is the case for
queues that are deleted because of an internal event (such as the
exclusive/auto-delete conditions). The queue process then exits, which
notifies connections that the queue is gone.
Thanks to this, the temporary process will do its best to delete the
record in case of a network partition, whether the consumers go away
during or after that partition. That said, the node monitor drives some
failsafe code that cleans up the record if the queue process was killed
before it could delete its own record.
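A minimal sketch of that retry loop, assuming a hypothetical
`delete_record/1` standing in for the real metadata-store deletion
(this is not the actual RabbitMQ code):

```erlang
-module(record_deleter_sketch).
-export([spawn_deleter/1]).

%% Spawn an unlinked helper so it survives the queue process exiting.
spawn_deleter(QName) ->
    spawn(fun() -> retry_delete(QName) end).

%% Keep retrying until the metadata store accepts the deletion, e.g.
%% once the network partition heals.
retry_delete(QName) ->
    case delete_record(QName) of
        ok ->
            ok;
        {error, _Reason} ->
            timer:sleep(1000),
            retry_delete(QName)
    end.

%% Hypothetical placeholder for the real metadata-store deletion.
delete_record(_QName) ->
    ok.
```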
Fixes #12949, #12597, #14527.
(cherry picked from commit 3c4d073)