Please start by explaining what you mean by |
Given: an RMQ (`v3.12.0`, if it matters) cluster consisting of 3 nodes. There are several services. Each service operates with a set of `N` QQs. The volumes processed vary from service to service, i.e. some queues receive larger messages (and in greater quantities) than others.
Problem:
If a service that processes large amounts of data stops reading messages, its queues can grow until the entire RMQ cluster overflows, which crashes the whole system.
Solution:
In order to reduce the blast radius and protect services from each other, the idea arose to implement something like `anti-affinity` rules: use `x-quorum-initial-group-size` for queues and try to bind them (via something like #8532, which I suppose needs updating) to certain cluster nodes, so that queues of service A are located on nodes `0 to 2`, B on `3 to 5`, and C on `6 to 8`. In this case, the system continues to operate within one cluster, and an overflow of the queues of one group does not affect the state of the entire cluster or the other groups, but this will require adding an additional 6 nodes to the cluster.
Actually, I would like to hear thoughts on this matter: to what extent would such a use of RMQ be reasonable?
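To make the proposed layout concrete, here is a minimal sketch of the placement plan and the node arithmetic. The node names (`rabbit@node0` ...), the queue name, and all helper functions are hypothetical. It also assumes that `x-quorum-initial-group-size` only sets the replica count and does not pin replicas to particular nodes, so the sketch generates `rabbitmq-queues add_member` / `delete_member` commands to move replicas afterwards. Whether that is an acceptable way to enforce the placement is exactly the open question here.

```python
from typing import Dict, List, Set

GROUP_SIZE = 3  # replicas per quorum queue, matching x-quorum-initial-group-size


def plan_node_groups(services: List[str], group_size: int = GROUP_SIZE) -> Dict[str, List[str]]:
    """Assign each service a disjoint, contiguous block of node names."""
    plan = {}
    for i, service in enumerate(services):
        start = i * group_size
        # Node names are hypothetical; real names depend on the deployment.
        plan[service] = [f"rabbit@node{n}" for n in range(start, start + group_size)]
    return plan


def declare_arguments(group_size: int = GROUP_SIZE) -> Dict[str, object]:
    """Optional arguments to pass when declaring a quorum queue."""
    return {
        "x-queue-type": "quorum",
        "x-quorum-initial-group-size": group_size,
    }


def membership_commands(queue: str, current: Set[str], desired: Set[str]) -> List[str]:
    """Build rabbitmq-queues commands that move a queue's replicas onto `desired`.

    Additions are listed before deletions so the replica count does not
    shrink while the queue is being moved.
    """
    adds = [f"rabbitmq-queues add_member {queue} {node}" for node in sorted(desired - current)]
    deletes = [f"rabbitmq-queues delete_member {queue} {node}" for node in sorted(current - desired)]
    return adds + deletes


plan = plan_node_groups(["A", "B", "C"])
total_nodes = sum(len(nodes) for nodes in plan.values())
extra_nodes = total_nodes - 3  # the cluster currently has 3 nodes
```

With three services and a group size of 3, the plan covers 9 nodes, which matches the 6 additional nodes mentioned above. The dictionary from `declare_arguments()` is meant to be passed as the `arguments` of a client's queue declaration.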
Let's assume this is a working solution. Then a question arises based on this discussion: #7209
What pitfalls may arise in the future if the nodes hosting a service's queues fail: will the queues located on them automatically be promoted to other nodes of the cluster? Will it be possible to disable promotion per queue (if not, then there can be no talk of any anti-affinity)?