Message partitioning & consumer groups #9638
Replies: 1 comment 2 replies
-
Superstreams have been available for more than a year now, and (together with Single Active Consumer) accomplish the same thing with a standard stream plot twist: if you have competing consumers (as opposed to a SAC), your consumers need to be ready to deal with duplicates. Our team is small and we need to focus on what the majority of our users benefit from. I am not convinced that "everyone" needs the same feature for queues. In the world where superstreams (partitioned streams) exist and no longer a very novel feature, I don't think this is as important as having reliable well-defined failure recovery, significant improvements to MQTT, keeping up with protocol evolution, correctness and efficiency improvements for AMQP 1.0 and STOMP. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Is your feature request related to a problem? Please describe.
Kafka-like partitioning and consumer groups in RMQ is something that is desired by me and many other people for years.
Its all needed to be able to keep the correct order of messages while still being able to scale horizontally in an elegant way.
Consider the case where message processing order is critical and messages throughput is not constant.
It differs so much that sometimes its enough to have only 1 consumer working on these messages, and sometimes you need to run 10 or 30 workers to finish jobs in reasonable time.
Its just a short description of a problem, but its more complex when you really go with such a thing to a production.
Other things like simplicity of understanding (by the whole team), simplicity of configuration for different environments and others comes in.
There are so many blogs, articles describing the problem and separate peaces of software trying to solve the problem:
https://jack-vanlightly.com/blog/2018/11/14/why-i-am-not-a-fan-of-the-rabbitmq-sharding-plugin
https://jack-vanlightly.com/blog/2018/7/22/creating-consumer-groups-in-rabbitmq-with-rebalanser-part-1
https://helix.apache.org/1.0.2-docs/recipes/rabbitmq_consumer_group.html
https://www.cloudamqp.com/blog/the-consistent-hash-exchange-making-rabbitmq-a-better-broker.html#hard-things-about-using-the-consistent-hash-exchange
Ive seen that this issue was already here but wasn't really answered and closed:
#4974
Ive seen that someone mentioned x-consistent-hash exchange together with single-active-consumer.
Yes its very close but it only solves half of the problem.
Using this method messages are processed in a correct order and different consumer is picked in case of problems with the "active" consumer.
But on the other hand this doesnt help to scale horizontally at all.
When you have x-consistent-hash exchange, 10 queues bound to it (evenly distributed with load).
Then you start 10 processes subscribing to all 10 queues - you end up having the first process to be an "active" consumer on all 10 queues, getting all messages and other 9 processes doing nothing.
Of course you can ask "why would you subscribe to all queues with all your workers? If you do, its an expected behavior".
This is only because its easier and doesnt require any extra coordination with any external piece of software. Specially when sometimes you need 1 worker and sometimes you want 10 workers running because of a huge load. Without extra coding or complex environment configuration its the only way of keeping the correct order and be sure that all messages are processed.
Describe the solution you'd like
The minimum solution would be to introduce some kind of a "random-single-active-consumer".
I heard that there are plans of taking consumer priority into consideration while picking the "active" consumer with "single-active-consumer". While using such a feature together with above mentioned setup & x-consistent-hash exchange, doing this complex and messy setup should solve the real problem.
The solution that everyone is really hoping for is something like:
A new exchange type, lets call it "ordered-distributed-exchange" which will hide all this magic underneath.
I dream of an extra argument:
partitionsCount: integer
andhash-header
just like in x-consistent-hash.I also dream of a logical queue similar to sharding plugin, created automatically (partitionsCount queues are created under the hood).
Consumers are subscribing to the logical queue with the same name as an exchange but under the hood they really subscribe to all queues with "single-active-consumer-load-balanced". This ensures that only one consumer will get messages from one queue, but additionally it re-balance consumers when they attach or leave so that when there are 5 conusmers and 10 queues, every consumer will have exactly 2 "active" queues.
Describe alternatives you've considered
No response
Additional context
No response
Beta Was this translation helpful? Give feedback.
All reactions