New node in a cluster with thousands of Shovels takes too long to restart (more than 40 minutes) #9714
-
Describe the bug

Hi,

System Specifications:
Cluster Specification:
Issue in detail: the typical shovel configuration looks like the one below; both the source and the destination are within the same cluster.

```json
{
  "component": "shovel",
  "name": "to-vhost-a",
  "value": {
    "ack-mode": "on-confirm",
    "dest-add-forward-headers": true,
    "dest-exchange": "command",
    "dest-protocol": "amqp091",
    "dest-uri": "amqp:///vhost-a",
    "src-delete-after": "never",
    "src-exchange": "command",
    "src-exchange-key": "*.<uuid>.#",
    "src-prefetch-count": 1,
    "src-protocol": "amqp091",
    "src-uri": "amqp:///vhost-b"
  },
  "vhost": "vhost-b"
}
```

Also note that importing the original definitions into a fresh 3-node cluster only takes about 10 minutes.

Reproduction steps
Expected behavior

The 3rd node should finish starting up in about 10 minutes at most (comparable to a fresh definitions import) and bring the AMQP and HTTP/Prometheus ports online.

Additional context

The reason we have such a high number of shovels is that we need to pass messages between each customer vhost and our own managed-services vhost. We do so at the exchange level (with both command and event exchanges).
-
Starting hundreds of virtual hosts can take a long time, since every virtual host waits for all nodes to report success.
-
@yohanAnushka any time you report a software issue, you must say what version you are using. In this case it would be:
-
@yohanAnushka a large number of vhosts is almost certainly the culprit. There's a lot of Mnesia locking going on in such scenarios. It's worth mentioning, however, that we are getting close to replacing Mnesia altogether: RabbitMQ 3.13-rc1 shipped last week and already adds experimental (opt-in) support for Khepri, our new metadata store. I just ran a quick test to import 1000 empty vhosts and then stop/start a node (no clustering). With Mnesia, I get:
With Khepri (in 3.13, you need to do
This is not to say that every scenario will see this level of improvement, but we are addressing your issue as part of this multi-year effort to replace how metadata is stored and replicated within the cluster. It would be fantastic if you could re-run your tests with 3.13-rc1 and Khepri enabled (again, you need to enable it first).
It should get much faster in the next 3.11/3.12 patch releases: from a few minutes to a few seconds to import 1000 shovels on a single node, and from 20-30 minutes to just seconds to scale out the cluster. Thanks for reporting this.
#9785