Improve rabbitmq-management-agent gc handling #9320
Note: I debated for a while whether this should be an 'issue' or a 'discussion', and clearly ended up here. Also note this is a situation of (but I still found it interesting):
Setup: 3 node cluster, CMQ.

We noticed a scenario recently where a client created a lot (over 100k) of consumers in a channel, and then the client died. I reproduced the scenario, but with a single broker, and ran some flamegraphs, and ended up with the conclusion the issue was with the ets deletions in the

So, I looked into how the cleanup works, and I think we can do some improvements. When a channel dies like that, rabbitmq produces a bunch of events, among others it will produce a

rabbitmq-management-agent subscribes on both. For the the

So, there is a bit of overwork there. I suggest that when a channel goes down, in

The above change will not improve the CPU usage much though. As the
And instead do the following:
I've only tried the above for the consumer_stats table, and imagine the MatchPattern might differ for other tables. With the above changes, I saw no CPU issue when killing 100k consumers in one go.

Thoughts? Am I missing something vital? I have a draft PR but would like to have this discussed here first.
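For readers without the original snippets at hand, the general idea of replacing many per-row ETS deletions with one pattern-based deletion can be sketched as follows. This is not the code from the draft PR. The module name, the table name, and the row layout `{{QueueName, ChannelPid, ConsumerTag}, Props}` are assumptions made for illustration only; `ets:match_object/2`, `ets:delete/2` and `ets:match_delete/2` are standard ETS functions.

```erlang
%% Minimal sketch, not the management agent's actual code.
%% Assumed (hypothetical) row layout: {{QueueName, ChannelPid, ConsumerTag}, Props}
-module(ets_bulk_delete_sketch).
-export([delete_per_key/2, delete_by_pattern/2, demo/0]).

%% Per-row approach: fetch every row that belongs to the channel,
%% then issue one ets:delete/2 call per row.
delete_per_key(Table, ChPid) ->
    Rows = ets:match_object(Table, {{'_', ChPid, '_'}, '_'}),
    lists:foreach(fun({Key, _Props}) -> ets:delete(Table, Key) end, Rows),
    length(Rows).

%% Pattern approach: a single ets:match_delete/2 call removes every row
%% whose key contains the given channel pid. It returns true.
delete_by_pattern(Table, ChPid) ->
    MatchPattern = {{'_', ChPid, '_'}, '_'},
    ets:match_delete(Table, MatchPattern).

%% Small demonstration: 1000 consumers on one channel, 1 on another.
demo() ->
    Table = ets:new(consumer_stats_sketch, [set, public]),
    Ch1 = self(),
    Ch2 = spawn(fun() -> ok end),
    ets:insert(Table, [{{<<"q">>, Ch1, N}, []} || N <- lists:seq(1, 1000)]),
    ets:insert(Table, {{<<"q">>, Ch2, 1}, []}),
    true = delete_by_pattern(Table, Ch1),
    %% Only the row that belongs to the other channel remains.
    1 = ets:info(Table, size),
    ets:delete(Table),
    ok.
```

Calling `ets_bulk_delete_sketch:demo()` in an Erlang shell exercises the pattern-based variant. With a partially bound key on a `set` table, the pattern call still scans the table, but it does so once inside ETS rather than through one call per consumer.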
Replies: 1 comment
@SimonUnge please submit a PR, both changes sound good to me.
I have changed my mind on whether we should keep the consumer_deleted events in place. These internal events are used by audit and monitoring systems. In this particular case, we are not really dealing with consumer cancellation, so I suspect there isn't much use in emitting the consumer_deleted event. In fact, it may even be counterintuitive and counterproductive to delete them.