Replies: 8 comments 1 reply
-
Any single-process part can become a bottleneck given a degenerate enough case. Unless there is a specific suggestion as to what can be done to make it scale better, there is not much to act on here. Dropping events, much like Erlang logger and Lager do, would not be well received by some users.
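For context, this is the kind of overload protection Erlang's `logger` applies: past configurable queue-length thresholds the handler first logs synchronously, then drops new messages, then flushes its queue entirely. A sketch using the documented handler options (the values shown are the OTP defaults):

```erlang
%% Overload protection thresholds on the default logger handler.
%% These options are part of OTP's logger; the values are the defaults.
logger:update_handler_config(default, config, #{
    sync_mode_qlen => 10,    %% above 10 queued messages, log synchronously
    drop_mode_qlen => 200,   %% above 200, start dropping new messages
    flush_qlen     => 1000   %% above 1000, flush the whole queue
}).
```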
-
My suggestion would be to add a configuration option which sets the priority of the `rabbit_event` process.
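A sketch of what such an option might look like (the `event_handler_priority` key is hypothetical and does not exist today). Since an Erlang process can only change its own priority via `process_flag(priority, high)`, `rabbit_event` would have to apply the flag itself at startup:

```erlang
%% advanced.config sketch; event_handler_priority is a made-up key that
%% rabbit_event would read on init and turn into process_flag(priority, high).
[
  {rabbit, [
    {event_handler_priority, high}
  ]}
].
```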
-
Sure, but our prior experience suggests that it's a band-aid that won't help much. @mkuratczyk @lhoguin @dumbbell FYI.
-
I would highly suggest never using `high` process priority.
-
@luos would you be able to test #5301? Ideally with and without the additional `priority` change.
-
Regarding earlier discussions, we set the priority of the `rabbit_event` process to `high`.
-
I just noticed this PR. I haven't tested it, but there's a pretty good chance that it will further improve the situation (although moving tracking tables to ETS was probably the most important step): erlang/otp#6199
-
Hi,
In installations with high connection churn, the `rabbit_event` process can become a bottleneck. We know such high connection churn is not advised, but it happens sometimes. This can lead to RabbitMQ crashing with an out-of-memory error, even when only the management interface is enabled and no other plugins. When RabbitMQ is allocated a high number of CPU cores it is especially susceptible, as it is able to accept more connections per second.
I made some tests, and if the `rabbit_event` process is set to `high` priority, the issue does not happen. Would you be open to exposing this as a config option?
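One way to experiment with this without patching RabbitMQ (a sketch, not necessarily how the tests above were done): `gen_event` handler callbacks run inside the event-manager process itself, so a throwaway handler can raise the priority of `rabbit_event` from its `init/1`. The module name below is made up:

```erlang
%% Hypothetical helper: because gen_event callbacks execute in the context
%% of the event-manager process, init/1 raises the priority of the
%% rabbit_event process itself, not of the process that adds the handler.
-module(raise_event_priority).
-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2]).

init(_Args) ->
    process_flag(priority, high),
    {ok, no_state}.

handle_event(_Event, State) ->
    {ok, State}.

handle_call(_Request, State) ->
    {ok, ok, State}.
```

It can then be attached from a remote shell with `gen_event:add_handler(rabbit_event, raise_event_priority, []).`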
This could also affect `pg_local`, as it seems to be next in line for this bottleneck, though that only seems to happen with very high core counts (128). Maybe it could just be set to `high` by default?
Probably a future step is to rework `rabbit_event`, as it is getting more and more overloaded with events; for example, remove stats from it and move them to a different event handler at least.

To reproduce this issue, start RabbitMQ on a machine with 36 cores or more and start an application with many workers connecting/reconnecting from at least two other hosts (a sketch of such a workload follows).
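As an illustration only (the exact client configuration used in these tests is not reproduced here), a churn workload with the official Erlang AMQP client could look like the following; the module and function names are made up:

```erlang
%% Illustrative churn workload: each worker repeatedly opens and
%% immediately closes a connection, producing the high connection churn
%% described above. Run with a large N from several hosts.
-module(churn_worker).
-export([start/2]).

-include_lib("amqp_client/include/amqp_client.hrl").

start(Host, N) ->
    [spawn(fun() -> loop(Host) end) || _ <- lists:seq(1, N)],
    ok.

loop(Host) ->
    {ok, Conn} = amqp_connection:start(#amqp_params_network{host = Host}),
    ok = amqp_connection:close(Conn),
    loop(Host).
```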
It's easy to reproduce if you turn on the `rabbit_event_exchange` plugin, but it can happen without it as well. Let me know what you think.