
Quorum queues: memory spike when applying a max-length policy retroactively to a long queue #12608

@mkuratczyk

Description


Describe the bug

Given a long quorum queue, if I apply a policy to limit the queue's length (in a real-world scenario, likely with the intention of preventing further queue growth and running out of memory), a significant memory spike occurs while the messages above the new threshold are dropped. This can easily have the opposite of the intended effect: I run out of memory because I was trying to prevent running out of memory...

In my particular case, it was even "funnier": I had a cluster on Kubernetes and applied the policy; the leader was OOMkilled, a new leader was elected, tried to apply the policy, and was OOMkilled as well. The remaining node survived only because a leader could not be elected, but as soon as one of the other nodes restarted, a new leader was elected and promptly OOMkilled. A policy that was meant to limit memory usage caused an OOMkill loop. :)
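For context, the policy in question is an ordinary max-length policy. A minimal sketch of applying such a policy via the management HTTP API (the reproduction steps below use rabbitmqctl instead), assuming the default vhost "/" and guest credentials, with the policy name and queue-name pattern chosen purely for illustration:

```bash
# Sketch only: apply a max-length policy through the management HTTP API.
# Assumes the default vhost "/" (URL-encoded as %2F) and guest:guest credentials.
curl -u guest:guest -X PUT \
  -H 'content-type: application/json' \
  -d '{"pattern": "qq", "definition": {"max-length": 1234}, "apply-to": "queues"}' \
  http://localhost:15672/api/policies/%2F/max
```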

Reproduction steps

  1. make run-broker (tested on main)
  2. Publish a significant number of messages: perf-test -qq -u qq -x 4 -y 0 -c 100 -s 5000 -ms -C 1250000
  3. Apply a policy that sets the limit to a low value: rabbitmqctl set_policy max qq '{"max-length": 1234}'
  4. Observe memory usage (one way to watch it is sketched below the screenshot)
(Screenshot, 2024-10-29: node memory usage spiking after set_policy was applied on main.)
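A minimal sketch of one way to do step 4, assuming a local node started with make run-broker (exact flags and breakdown categories can vary between RabbitMQ versions):

```bash
# Poll the per-category memory breakdown once a second; the spike is expected
# to show up under the quorum queue process categories while messages are dropped.
watch -n 1 'rabbitmq-diagnostics memory_breakdown --unit "MB"'

# Alternative: total memory used by each node, via the management HTTP API
# (default guest:guest credentials assumed).
curl -s -u guest:guest http://localhost:15672/api/nodes | grep -o '"mem_used":[0-9]*'
```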

Expected behavior

Ideally there should be no significant spike when dropping messages.

Additional context

No response
