Skip to content

Conversation

kjnilsson
Copy link
Contributor

@kjnilsson kjnilsson commented Aug 29, 2025

As it cannot be updated as part of a policy when it is disabled (-1)
it needs to be excluded from comparison

Fixes #14202

As it cannot be updated as part of a policy when it is disabled (-1)
 it needs to be excluded from comparison
@gomoripeti
Copy link
Contributor

is it only when it is disabled (-1) that it cannot be updated from a policy?
can it be updated eg from 20 to 40? if it can then this change will prevent the periodic policy repair to notice such change?

I really did not investigate thoroughly to find out what prevents the disabled delivery limit to be changed by a policy but this code part seems to allow it https://github.com/rabbitmq/rabbitmq-server/blob/v4.2.0-beta.2/deps/rabbit/src/rabbit_fifo.erl#L227.

isn't to problem of continuous update_config that rabbit_fifo converts a negative delivery limit to undefined so when maybe_apply_policies compares current config with new config, the former contains undefined while the later contains a negative value for the same unchanged "disabled" delivery limit?

ShouldUpdate = NewPolicyConfig =/= CurrentPolicyConfig,

I will shortly link a commit to show what I mean. (Sorry I just came back from vacation and couldn't look at this issue earlier)

@kjnilsson kjnilsson added this to the 4.2.0 milestone Aug 29, 2025
@kjnilsson
Copy link
Contributor Author

is it only when it is disabled (-1) that it cannot be updated from a policy?
can it be updated eg from 20 to 40? if it can then this change will prevent the periodic policy repair to notice such change?

yes good point, this will probably disable "valid" repairs. that said with the defaulting logic I feel it is probably safest to exclude this key irrespectively.

@gomoripeti
Copy link
Contributor

This change f855eb5 prevents update_config every tick if delivery-limit is negative, but still allows the periodic check to detect when delivery-limit is changed via policy. What do you think?

@kjnilsson
Copy link
Contributor Author

This change f855eb5 prevents update_config every tick if delivery-limit is negative, but still allows the periodic check to detect when delivery-limit is changed via policy. What do you think?

yes I think this is fine - do we need the not IsQueueDeclaration guard?

@gomoripeti
Copy link
Contributor

do we need the not IsQueueDeclaration guard?

Not necessarily. Just wanted to highlight that the conversion is necessary only for the comparison, for the actual config update it does not matter.
Does this conversion (https://github.com/rabbitmq/rabbitmq-server/blob/v4.2.0-beta.2/deps/rabbit/src/rabbit_fifo.erl#L203-L205) still needed if the guard is removed and the fifo_client passes the disabled value as undefined? (I guess yes because of mixed version clusters or when applying older raft logs)

I will add tests and submit a PR early next week.

@kjnilsson
Copy link
Contributor Author

I have been doing some testing and your change allows the delivery_limit to be disable after it has been active, this didn't use to be the case and was a deliberate choice. We could ofc extract the delivery_limit key in the comparison and compare them using adjusted logic but I'm not convinced it is worth it. Especially as we are probably going to make some adjustments to deliver limit defaulting once we introduce Ra v3 (Which is better able to cope with repeated returns).

@gomoripeti
Copy link
Contributor

thanks for testing and looking Karl. Don't want to delay the fix for #14202 so I'm fine with merging your PR which has well understood behaviour.

To continue the discussion do you mean it is not allowed to disable after active on main or by your PR?

My change (especially with the not IsQueueDeclaration guard) and your change as well only modify when a config update is triggered in case the queue would miss the policy_changed notification, but does not change what values the config is updated with. So I don't think it is modified which transitions are allowed.

@kjnilsson
Copy link
Contributor Author

yes I think just removing the key from comparison is fine for now, Once we ship Ra v3 (with log compaction) the restrictions on delivery limits may well be lifted and we will start allowing unlimited returns to the queue (potentially). So we can keep it simple for now even if it means that we'd not repair the case where only the delivery limit has changed.

@kjnilsson kjnilsson marked this pull request as ready for review September 3, 2025 08:02
@michaelklishin michaelklishin merged commit bd74503 into main Sep 3, 2025
283 checks passed
@michaelklishin michaelklishin deleted the qq-policy-repair-fix branch September 3, 2025 15:07
michaelklishin added a commit that referenced this pull request Sep 3, 2025
QQ: exclude delivery_limit from policy repair comparison. (backport #14458)
michaelklishin added a commit that referenced this pull request Sep 3, 2025
QQ: exclude delivery_limit from policy repair comparison. (backport #14458) (backport #14504)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Quorum queue delivery count set to unlimited can cause the number segment files to increase without truncation
3 participants