Message partitioning & consumer groups #9638

mwilniewiec · 2023-10-05T00:14:46Z

Is your feature request related to a problem? Please describe.

Kafka-like partitioning and consumer groups in RMQ are something that I and many other people have wanted for years.

It's all needed to keep messages in the correct order while still being able to scale horizontally in an elegant way.
Consider the case where message processing order is critical and message throughput is not constant.
Throughput varies so much that sometimes a single consumer is enough, and sometimes you need to run 10 or 30 workers to finish the jobs in a reasonable time.

This is just a short description of the problem; it gets more complex when you actually take such a setup to production.
Other concerns come into play, such as how easy the setup is for the whole team to understand and how easy it is to configure for different environments.

There are many blog posts and articles describing the problem, and separate pieces of software trying to solve it:
https://jack-vanlightly.com/blog/2018/11/14/why-i-am-not-a-fan-of-the-rabbitmq-sharding-plugin
https://jack-vanlightly.com/blog/2018/7/22/creating-consumer-groups-in-rabbitmq-with-rebalanser-part-1
https://helix.apache.org/1.0.2-docs/recipes/rabbitmq_consumer_group.html
https://www.cloudamqp.com/blog/the-consistent-hash-exchange-making-rabbitmq-a-better-broker.html#hard-things-about-using-the-consistent-hash-exchange

I've seen that this was already raised here, but it was closed without really being answered:
#4974

I've seen that someone mentioned using the x-consistent-hash exchange together with single-active-consumer.
Yes, it's very close, but it only solves half of the problem.
With this method messages are processed in the correct order, and a different consumer is picked if the "active" consumer has problems.
On the other hand, it doesn't help with scaling horizontally at all.
Say you have an x-consistent-hash exchange with 10 queues bound to it (load evenly distributed across them).
Then you start 10 processes that each subscribe to all 10 queues: the first process ends up as the "active" consumer on all 10 queues and gets all the messages, while the other 9 processes do nothing.
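
To make the failure mode concrete, here is a small simulation of it. It is only a simplified model, not broker code: it assumes the workers start strictly one after another and that, on each queue, the first consumer to register stays active. The queue and worker names are made up for illustration.

```python
from collections import Counter

def single_active_consumers(queues, workers):
    """Simplified model of single-active-consumer (an assumption, not broker code):
    on each queue, the first consumer to register becomes and stays active."""
    active = {}
    for worker in workers:        # workers start one after another
        for queue in queues:      # every worker subscribes to every queue
            active.setdefault(queue, worker)
    return active

# Hypothetical names: 10 queues bound to the exchange, 10 worker processes.
queues = [f"orders.{i}" for i in range(10)]
workers = [f"worker-{i}" for i in range(1, 11)]

active = single_active_consumers(queues, workers)
load = Counter(active.values())
print(load)  # Counter({'worker-1': 10})
```

On a real broker, workers starting at nearly the same time could race and split the queues a little differently, but nothing in this setup spreads the queues evenly on purpose.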

Of course you can ask "why would you subscribe to all queues with all your workers? If you do, that's expected behavior".
The only reason is that it's easier and doesn't require any extra coordination through an external piece of software, especially when sometimes you need 1 worker and sometimes you want 10 workers running because of a huge load. Without extra coding or complex environment configuration, it's the only way to keep the correct order and be sure that all messages are processed.

Describe the solution you'd like

The minimum solution would be to introduce some kind of "random-single-active-consumer".
I've heard that there are plans to take consumer priority into account when picking the "active" consumer with "single-active-consumer". Using such a feature together with the setup described above and the x-consistent-hash exchange would be complex and messy, but it should solve the real problem.

The solution that everyone is really hoping for is something like this:
A new exchange type, let's call it "ordered-distributed-exchange", which would hide all this magic underneath.
I dream of an extra argument, partitionsCount: integer, and a hash-header just like in x-consistent-hash.
I also dream of a logical queue similar to the one in the sharding plugin, created automatically (partitionsCount queues are created under the hood).
Consumers subscribe to the logical queue, which has the same name as the exchange, but under the hood they really subscribe to all queues with "single-active-consumer-load-balanced". This ensures that only one consumer gets messages from any one queue, and additionally it rebalances consumers when they attach or leave, so that when there are 5 consumers and 10 queues, every consumer has exactly 2 "active" queues.
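
The behavior I have in mind could be sketched roughly like this. This is only an illustration of the idea, not an existing RabbitMQ API: `partition_for` and `rebalance` are hypothetical names, the hashing is plain modulo hashing rather than the ring that x-consistent-hash uses, and the round-robin rebalance makes no attempt to minimize how many queues change hands when a consumer joins or leaves.

```python
import hashlib
from collections import Counter

def partition_for(hash_header_value: str, partitions_count: int) -> int:
    """Pick a partition queue for a message (hypothetical helper).
    The same header value always maps to the same partition, which preserves per-key order."""
    digest = hashlib.sha256(hash_header_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions_count

def rebalance(partitions_count: int, consumers: list[str]) -> dict[int, str]:
    """Give every partition queue exactly one active consumer, round-robin (hypothetical helper).
    Meant to be re-run whenever a consumer attaches or leaves."""
    if not consumers:
        return {}
    ordered = sorted(consumers)  # stable order so repeated runs agree
    return {p: ordered[p % len(ordered)] for p in range(partitions_count)}

# 10 partition queues, 5 consumers: every consumer is active on exactly 2 queues.
assignment = rebalance(10, [f"consumer-{i}" for i in range(1, 6)])
per_consumer = Counter(assignment.values())
assert all(count == 2 for count in per_consumer.values())

# One consumer leaves: the 10 queues are spread over the remaining 4 as 3/3/2/2.
after_leave = Counter(rebalance(10, [f"consumer-{i}" for i in range(1, 5)]).values())
assert sorted(after_leave.values()) == [2, 2, 3, 3]
```

The important property is that each queue always has exactly one active consumer, so ordering within a partition is kept while the work is still spread across all running workers.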

Describe alternatives you've considered

No response

Additional context

No response

michaelklishin · 2023-10-05T00:51:29Z

michaelklishin
Oct 5, 2023
Maintainer

Superstreams have been available for more than a year now, and (together with Single Active Consumer) accomplish the same thing with a standard stream plot twist: if you have competing consumers (as opposed to a SAC), your consumers need to be ready to deal with duplicates.

Our team is small and we need to focus on what the majority of our users benefit from. I am not convinced that "everyone" needs the same feature for queues.

In a world where superstreams (partitioned streams) exist and are no longer a very novel feature, I don't think this is as important as having reliable, well-defined failure recovery, significant improvements to MQTT, keeping up with protocol evolution, and correctness and efficiency improvements for AMQP 1.0 and STOMP.

2 replies

mwilniewiec Oct 5, 2023
Author

I've heard that superstreams can solve this problem but haven't tried them yet. If superstreams can do it, I expected that queues could work the same way.
Queues and streams solve similar problems but have some differences, and I believe there are cases where you want to use queues over streams.
This is especially true when you go to production with a single queue and a single-active-consumer, which may solve the problem of ordered messages for a long time. Later, when you decide to scale because your consumer is too slow, it's quite a big deal to refactor your simple queue to RMQ superstreams. From what I know, a different RMQ client is recommended and possibly lots of other changes are required.
At that point you may also decide to migrate to Kafka or make some other difficult and complex decision :)
I personally worked around this problem by implementing my own external subscriber coordinator for x-consistent-hash.

I'm not saying that this is a top priority for RMQ or that all users will benefit from this feature. I'm just saying that lots of people have been trying to solve this problem with RMQ over the years. RMQ introduced features like x-consistent-hash, the sharding plugin, and single-active-consumer that looked like a cure for the problem, but they only got us closer without ever fully solving it.

The recently added superstreams may be a real cure, but I think adopting them may turn everything upside down in working systems with years of development behind them. I will take a deeper dive into them, as they may solve some of my future problems.

michaelklishin Oct 5, 2023
Maintainer

@mwilniewiec so it is a big deal to refactor your consumers to use superstreams (no, it's not) but for the RabbitMQ core team it should not be a big deal to sink in several person-months to reinvent a feature that RabbitMQ already supports, but this time for queues. Right.

Our small team does not have the cycles to work on this. RabbitMQ is open source software you very likely have been getting entirely for free, with upgrades not costing anything either. You are welcome to implement what you need, or adopt superstreams, or migrate to any alternative you like.
