Queue Federation doesn't work with V2 Classic Queues #8297
-
Describe the bug
A recent upgrade of RabbitMQ from 3.9.x to 3.11.x, together with the use of V2 classic queues, has broken queue federation between our clusters. Queue federation has worked fine for us for multiple years, with no change in upstreams or federation policy. It seems to be broken with V2 classic mirrored queues.
Reproduction steps
Expected behavior
Additional context
This appears to have broken with the move to V2 classic queues in 3.11.x.
-
Queue federation does not depend on which storage implementation of classic queues is used. I find it hard to believe that CQv2 is the root cause.
A more likely cause is a common misunderstanding of how queue federation works. Queue federation only moves messages between clusters if there are no local consumers. Federation links are very low (negative) priority consumers that only kick in if the node does not have any local consumers.
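To make the "low priority consumer" point concrete, here is a toy model in plain Ruby. It is not RabbitMQ's implementation and it ignores prefetch, acknowledgements and blocked consumers. The class names and the priority value of -1 are made up for illustration. It only shows that a negative-priority consumer receives nothing while a regular local consumer is attached.

```ruby
# Toy model only: NOT RabbitMQ code, just "highest-priority consumer wins".
# Consumer, ToyQueue and the priority value -1 are hypothetical.
Consumer = Struct.new(:name, :priority, :received) do
  def deliver(message)
    received << message
  end
end

class ToyQueue
  def initialize
    @consumers = []
  end

  # Register a consumer; regular consumers default to priority 0.
  def add_consumer(name, priority: 0)
    consumer = Consumer.new(name, priority, [])
    @consumers << consumer
    consumer
  end

  def remove_consumer(consumer)
    @consumers.delete(consumer)
  end

  # Hand the message to the consumer with the highest priority.
  # Returns the chosen consumer, or nil when nobody is attached.
  def publish(message)
    target = @consumers.max_by(&:priority)
    target&.deliver(message)
    target
  end
end

queue = ToyQueue.new
link  = queue.add_consumer("federation-link", priority: -1)
local = queue.add_consumer("local-app")

queue.publish("m1")            # a local consumer exists, so it gets m1
queue.remove_consumer(local)
queue.publish("m2")            # no local consumers left, so the link gets m2
```

In this sketch the link stays idle for as long as `local-app` is attached, which is the situation described above: seeing no transfer over the federation link while local consumers are present is expected rather than a sign of breakage.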
-
If you want messages to be replicated from cluster A to B, you want exchange federation. If you want them to be unconditionally moved from A to B, you want
-
I just tested this, and it (the instructions in "Expected Behaviour") seems to work just fine. I think we'd need very detailed instructions, with the exact policies, parameters and so on, to investigate further.
-
Here are some specific steps that can be used to reproduce.
Given two nodes, A (standard ports) and B (AMQP port: 5673, HTTP API: 15673), with the following configuration:
# config.a.conf
classic_queue.default_version = 2
# config.b.conf
classic_queue.default_version = 2
management.tcp.port = 15673
I can observe a federation link from A to B:
I open a publishing connection to node B and a consuming one to node A:
require "bunny"
# c72 connects to node A (consumer side), c73 to node B (publisher side)
c72 = Bunny.new(port: 5672); c72.start
c73 = Bunny.new(port: 5673); c73.start
ch1 = c72.create_channel
ch2 = c73.create_channel
# the same queue name is declared on both nodes
q1 = ch1.queue("a.cqv2.1", durable: true)
q2 = ch2.queue("a.cqv2.1", durable: true)
q1.subscribe { |*delivery| puts(delivery) }
# publish on ch2
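The steps above do not show the upstream and policy definitions. As a guess at what they might look like, the sketch below builds the JSON bodies one could send to the management HTTP API on node A (`PUT /api/parameters/federation-upstream/{vhost}/{name}` and `PUT /api/policies/{vhost}/{name}`). The helper names, the URI and the pattern are assumptions, not values taken from this thread.

```ruby
require "json"

# Hypothetical helpers: only the JSON shapes are meant to match the HTTP API.
# The URI and pattern passed in below are assumptions for this two-node setup.

# Body for a federation upstream parameter pointing at the upstream node.
def federation_upstream_body(uri)
  { "value" => { "uri" => uri } }
end

# Body for a policy that federates matching queues with all upstreams.
def queue_federation_policy_body(pattern)
  {
    "pattern"    => pattern,
    "definition" => { "federation-upstream-set" => "all" },
    "apply-to"   => "queues"
  }
end

upstream = federation_upstream_body("amqp://localhost:5673")
policy   = queue_federation_policy_body("^a\\.cqv2\\.")

puts JSON.pretty_generate(upstream)
puts JSON.pretty_generate(policy)
```

Posting the actual equivalents of these two documents is the sort of detail that makes a report like this reproducible.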
-
Here is some evidence that queue federation works as expected with CQv1 with the setup above (or a slightly modified version, to test both CQv1 and CQv2 using a single pair of nodes).
Evidence for CQv1
Below you see a federation link connection that has a non-zero transfer rate, and two queues with identical names, one on side B (the upstream) and another on side A (the downstream), having
The above screenshots are with CQv1:
Evidence for CQv2
Now CQv2:
And the same screenshots with non-zero rates as I was publishing 2M messages to the upstream queue:
-
Having tested a few different scenarios makes me fairly certain that the behavior comes down to
This paused state can be easily observed only by inspecting the number of consumers on the
While CQv2 is very unlikely to report the relevant metrics differently, we will take a look.