
Raise offence queue eras bound limit #11435

Open
cirko33 wants to merge 11 commits into master from cirko33-offence-queue-raise-limit

Conversation

@cirko33
Member

@cirko33 cirko33 commented Mar 19, 2026

Raise offence queue BondingDuration limit to ensure no loss of offences.

@cirko33
Member Author

cirko33 commented Mar 19, 2026

/cmd prdoc --audience runtime_dev --bump patch

@cirko33 cirko33 self-assigned this Mar 19, 2026
@cirko33
Member Author

cirko33 commented Mar 19, 2026

/cmd prdoc --audience runtime_dev --bump minor --force

@cirko33 cirko33 added the A4-backport-stable2603 Pull request must be backported to the stable2603 release branch label Mar 19, 2026

@cirko33
Member Author

cirko33 commented Mar 20, 2026

/cmd bench --pallet pallet_staking_async --runtime asset-hub-westend

@github-actions
Contributor

Command "bench --pallet pallet_staking_async --runtime asset-hub-westend" has started 🚀 See logs here

@github-actions
Contributor

Command "bench --pallet pallet_staking_async --runtime asset-hub-westend" has finished ✅ See logs here

Details

Subweight results:
All rows refer to `cumulus/parachains/runtimes/assets/asset-hub-westend/src/weights/pallet_staking_async.rs`:

| Extrinsic | Old | New | Change [%] |
|---|---|---|---|
| set_staking_configs_all_remove | 8.12us | 805.28us | +9817.19 |
| set_staking_configs_all_set | 8.89us | 805.70us | +8959.91 |
| prune_era_stakers_paged | 385.92us | 12.81ms | +3219.05 |
| set_validator_count | 4.03us | 103.03us | +2455.18 |
| prune_era_stakers_overview | 399.79us | 10.16ms | +2442.31 |
| set_min_commission | 4.10us | 102.96us | +2412.40 |
| prune_era_claimed_rewards | 419.64us | 10.18ms | +2325.32 |
| prune_era_validator_prefs | 426.54us | 10.17ms | +2283.41 |
| prune_era_validator_slash_in_era | 417.46us | 6.28ms | +1403.44 |
| prune_era_single_entry_cleanups | 48.32us | 384.41us | +695.49 |
| kick | 2.36ms | 18.14ms | +669.43 |
| deprecate_controller_batch | 45.77ms | 334.39ms | +630.54 |
| prune_era_validator_reward | 46.04us | 282.25us | +513.09 |
| prune_era_reward_points | 46.11us | 281.86us | +511.24 |
| validate | 161.42us | 913.96us | +466.21 |
| set_controller | 77.23us | 427.51us | +453.55 |
| force_new_era | 21.74us | 115.58us | +431.69 |
| force_new_era_always | 22.01us | 115.63us | +425.30 |
| force_no_eras | 22.65us | 115.54us | +410.04 |
| bond | 162.70us | 666.82us | +309.85 |
| restore_ledger | 114.86us | 430.86us | +275.14 |
| withdraw_unbonded_update | 181.11us | 609.86us | +236.74 |
| set_payee | 61.30us | 188.83us | +208.03 |
| update_payee | 75.53us | 223.85us | +196.35 |
| migrate_currency | 169.85us | 478.55us | +181.76 |
| force_apply_min_commission | 67.31us | 187.29us | +178.25 |
| rc_on_offence | 57.11ms | 147.50ms | +158.26 |
| apply_slash | 34.48ms | 43.99ms | +27.59 |
| chill | 5.96ms | 1.03ms | -82.78 |
| rc_on_session_report | 11.44ms | 1.88ms | -83.57 |
| reap_stash | 13.40ms | 1.87ms | -86.05 |
| withdraw_unbonded_kill | 14.83ms | 1.95ms | -86.85 |
| unbond | 10.29ms | 1.25ms | -87.81 |
| nominate | 13.24ms | 1.61ms | -87.83 |
| force_unstake | 16.18ms | 1.90ms | -88.24 |
| rebond | 10.70ms | 1.17ms | -89.09 |
| bond_extra | 13.18ms | 1.31ms | -90.04 |
| chill_other | 12.29ms | 1.13ms | -90.78 |
| process_offence_queue | 12.92ms | 738.74us | -94.28 |
Command output:

✅ Successful benchmarks of runtimes/pallets:
-- asset-hub-westend: ['pallet_staking_async']

```rust
// Minimum execution time: 151_823_000 picoseconds.
Weight::from_parts(162_698_000, 0)
    .saturating_add(Weight::from_parts(0, 0))
// Estimated: `4218`
```
Contributor

@ggwpez - the weights generated by #10794 seemed a bit off: a bunch of Estimated: 0 entries, for example.
This PR doesn't introduce significant changes and affects only a few benchmarks, those that actually access OffenceQueueEras. As expected, we are seeing e.g.

Staking::OffenceQueueEras max_size increased: 917 (added: 504512)    

This is the change from this branch: the raised offence queue limit.
For the rest, the results are basically very close to what we had on chain before #10794.

Member

Hm, weird... I think there is a non-determinism bug somewhere in the PoV benchmarking, but it's highly unlikely to trigger for all extrinsics like it did in that MR.
No idea honestly, but I can look into it when I have time.

@sigurpol
Contributor

Few comments:

@kianenigma, @Ank4n wdyt? Kian, in the related issue you were hinting at raising the bound as the proper solution. I am fine with that, but I would prefer the conversion to Vec + force_from approach, and I would like to understand the reason behind your preference (or was it mostly because it was good enough and faster to implement?)

```rust
impl<T: Config> Get<u32> for OffenceQueueErasBound<T> {
    fn get() -> u32 {
        let bonding_duration = T::BondingDuration::get();
        bonding_duration.saturating_add(bonding_duration)
    }
}
```
Contributor

Maybe we should take the max of this and something like 10, so it's never 0.

Contributor

Actually, we don't need to double it; we just need to add a few eras so the system has time to clear the offences. So, something like bonding_duration.saturating_add(10) will do.
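A minimal sketch of the bound getter being discussed, with the suggested fixed buffer instead of doubling. This is illustrative only: the free function `offence_queue_eras_bound` stands in for the real `Get<u32>` impl on `OffenceQueueErasBound<T>`, and the buffer of 10 is the value floated above, not a decided constant.

```rust
// Hypothetical stand-in for OffenceQueueErasBound::get(): add a fixed buffer
// of eras on top of BondingDuration so the system has time to drain old
// offences, and so the bound is never 0 even for tiny test configurations.
fn offence_queue_eras_bound(bonding_duration: u32) -> u32 {
    bonding_duration.saturating_add(10)
}

fn main() {
    assert_eq!(offence_queue_eras_bound(28), 38); // Polkadot-like BondingDuration
    assert_eq!(offence_queue_eras_bound(0), 10); // never 0, even degenerate configs
    assert_eq!(offence_queue_eras_bound(u32::MAX), u32::MAX); // saturates, no overflow
    println!("ok");
}
```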

@Ank4n
Contributor

Ank4n commented Mar 24, 2026

Few comments:

@kianenigma, @Ank4n wdyt? Kian, in the related issue you were hinting at raising the bound as the proper solution. I am fine with that, but I would prefer the conversion to Vec + force_from approach, and I would like to understand the reason behind your preference (or was it mostly because it was good enough and faster to implement?)

IMO this is a non-issue. I don't see how we can exceed BondingDuration. We could try to reproduce the overflow path, but I believe it should be impossible unless you manipulate the storage somehow or shorten the era to 1 block or something (so not a practical scenario).

In any case, I don't see any harm in going with this solution. Also, the benchmarks should not really be affected because we still have bonding duration's worth of eras.

@sigurpol
Contributor

sigurpol commented Mar 24, 2026

IMO this is a non-issue. I don't see how we can exceed BondingDuration. We could try to reproduce the overflow path but I believe it should be impossible unless you manipulate the storage somehow or shorten the era to 1 block or something (so not a practical scenario).

In any case, I don't see any harm in going with this solution . Also benchmark should not really get affected because we still have bonding duration worth of eras.

IIUC, this is true for polkadot / any real current runtime configuration.
But can OffenceQueueEras bound overflow with SlashDeferDuration = 0 ?

Let's consider this example:

  • BondingDuration = 3
  • active_era = 3
  • SlashDeferDuration = 0

In this case:

  • oldest_reportable_offence_era = active_era.index.saturating_sub(T::BondingDuration::get()) = 3 - 3 = 0
  • OffenceQueueEras is bounded at BondingDuration=3 (on master, not on this PR!)

The check here discards new offences if offence_era < oldest_reportable_offence_era.

With oldest_reportable_offence_era = 0, offences for eras 0, 1, 2, 3 are all accepted because (offence_era >= oldest_reportable_offence_era).

After three offence reports arrive (one each for eras 0, 1, and 2), OffenceQueueEras = [0, 1, 2] => the vec is at its bound of 3 (again, not on this PR, but on master!).

A fourth offence report for era 3 arrives => it is inserted into OffenceQueue, but try_insert into OffenceQueueEras fails silently.

Do I get it right?

If my example makes sense (maybe it's just theoretical / not plausible / not possible / I got something seriously wrong - all very possible 😄), then this PR makes the code more robust and it may be worth adding a similar test to prove it.
But maybe the point of the original issue is to move this class of issues from "unrealistic given current runtime configurations" to "impossible by construction", by replacing try_insert with conversion to Vec + force_from.
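The counting in the scenario above can be checked with a few lines of plain Rust. The helper `reportable_eras` is a hypothetical stand-in for the pallet's acceptance check, not the actual code; it only reproduces the arithmetic.

```rust
// With SlashDeferDuration == 0, an offence for era E is accepted when
// E >= active_era - BondingDuration. This enumerates the accepted eras.
fn reportable_eras(active_era: u32, bonding_duration: u32) -> Vec<u32> {
    let oldest = active_era.saturating_sub(bonding_duration);
    (oldest..=active_era).collect()
}

fn main() {
    // BondingDuration = 3, active_era = 3: eras 0, 1, 2 and 3 are all accepted.
    let eras = reportable_eras(3, 3);
    assert_eq!(eras, vec![0, 1, 2, 3]);
    // On master the OffenceQueueEras bound equals BondingDuration (3), so once
    // [0, 1, 2] fill the queue, the fourth era's try_insert fails silently.
    assert!(eras.len() as u32 > 3);
    println!("{} reportable eras vs bound 3", eras.len());
}
```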

@Ank4n
Contributor

Ank4n commented Mar 25, 2026

IMO this is a non-issue. I don't see how we can exceed BondingDuration. We could try to reproduce the overflow path but I believe it should be impossible unless you manipulate the storage somehow or shorten the era to 1 block or something (so not a practical scenario).
In any case, I don't see any harm in going with this solution . Also benchmark should not really get affected because we still have bonding duration worth of eras.

IIUC, this is true for polkadot / any real current runtime configuration. But can OffenceQueueEras bound overflow with SlashDeferDuration = 0 ?

Let's consider this example:

  • BondingDuration = 3:
  • active_era = 3
  • SlashDeferDuration = 0

In this case:

  • oldest_reportable_offence_era = active_era.index.saturating_sub(T::BondingDuration::get()) = 3 - 3 = 0
  • OffenceQueueEras is bounded at BondingDuration=3 (on master, not on this PR!)

The check here discards new offences if offence_era < oldest_reportable_offence_era.

With oldest_reportable_offence_era = 0, offences for eras 0, 1, 2, 3 are all accepted because (offence_era >= oldest_reportable_offence_era).

After three offence reports arrive (one each for eras 0, 1, and 2), OffenceQueueEras = [0, 1, 2] => the vec is at its bound of 3 (again, not on this PR, but on master!).

A fourth offence report for era 3 arrives => it is inserted into OffenceQueue, but try_insert into OffenceQueueEras fails silently.

Do I get it right?

If my example makes sense (maybe it's just theoretical / not plausible / not possible / I get something seriously wrong - all very possible 😄 ), then this PR makes the code more robust and maybe it's worth adding a similar test to prove it. But maybe the point of the original issue is to make this class of issues from unrealistic because of current runtime configuration to impossible by construction replacing the try_insert with conversion to vec + force_from

I think you are right. This is the corner case (buggy line).

When SlashDeferDuration == 0, the range is [active_era - BondingDuration, active_era], this is one greater than BondingDuration.

When SlashDeferDuration > 0, the range is [active_era - SlashDeferDuration + 1, active_era], and given SlashDeferDuration < BondingDuration, this is always lower than BondingDuration.

I think we should fix the range for SlashDeferDuration == 0 path as well in this PR (along with increasing the bound).

@Ank4n
Contributor

Ank4n commented Mar 25, 2026

When SlashDeferDuration == 0, the range is [active_era - BondingDuration, active_era], this is one greater than BondingDuration.

We should make this [active_era - BondingDuration + 2, active_era]. This is similar to SlashDeferDuration > 0 range as SlashDeferDuration in polkadot is 1 less than Bonding Duration.

cc: @cirko33
Thoughts @sigurpol ?
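The proposed range change can be sanity-checked with a quick sketch. The numbers are illustrative (a Polkadot-like BondingDuration of 28), and `range_len` is just the size of an inclusive era interval, not pallet code.

```rust
// Number of eras in the inclusive range [oldest, active].
fn range_len(oldest: u32, active: u32) -> u32 {
    active - oldest + 1
}

fn main() {
    let bonding = 28u32;
    let active = 100u32;

    // Unfixed, SlashDeferDuration == 0: [active - bonding, active]
    // holds BondingDuration + 1 eras, one more than the queue bound.
    assert_eq!(range_len(active - bonding, active), bonding + 1);

    // Proposed fix: [active - bonding + 2, active] holds BondingDuration - 1.
    assert_eq!(range_len(active - bonding + 2, active), bonding - 1);

    // SlashDeferDuration > 0 case (defer = bonding - 1, as on Polkadot):
    // [active - defer + 1, active] also holds BondingDuration - 1 eras.
    let defer = bonding - 1;
    assert_eq!(range_len(active - defer + 1, active), bonding - 1);
    println!("ok");
}
```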

@sigurpol
Contributor

When SlashDeferDuration == 0, the range is [active_era - BondingDuration, active_era], this is one greater than BondingDuration.

We should make this [active_era - BondingDuration + 2, active_era]. This is similar to SlashDeferDuration > 0 range as SlashDeferDuration in polkadot is 1 less than Bonding Duration.

cc: @cirko33 Thoughts @sigurpol ?

That should work indeed. Again, there is always the other option of converting to Vec + force_from, which wouldn't require any of this. What is the strong argument against that approach?

@cirko33
Member Author

cirko33 commented Mar 25, 2026

When SlashDeferDuration == 0, the range is [active_era - BondingDuration, active_era], this is one greater than BondingDuration.

We should make this [active_era - BondingDuration + 2, active_era]. This is similar to SlashDeferDuration > 0 range as SlashDeferDuration in polkadot is 1 less than Bonding Duration.
cc: @cirko33 Thoughts @sigurpol ?

that should work indeed. Again, there is always the other option to convert to Vec + force_from that wouldn't require any of this. What is the strong argument against that approach?

I'll quote Kian's answer (which kind of makes sense to me):

A Vec on a parachain will have no upper bound on its size, and will be tricky to benchmark

@Ank4n
Contributor

Ank4n commented Mar 25, 2026

When SlashDeferDuration == 0, the range is [active_era - BondingDuration, active_era], this is one greater than BondingDuration.

We should make this [active_era - BondingDuration + 2, active_era]. This is similar to SlashDeferDuration > 0 range as SlashDeferDuration in polkadot is 1 less than Bonding Duration.
cc: @cirko33 Thoughts @sigurpol ?

that should work indeed. Again, there is always the other option to convert to Vec + force_from that wouldn't require any of this. What is the strong argument against that approach?

IMO (but not a strong opinion 😆) try_insert is the right API here. force_from is essentially a hack: it bypasses the bound, which defeats the purpose of using a bounded type. If we need to exceed the bound, the bound itself is wrong and should be fixed.

@sigurpol
Contributor

When SlashDeferDuration == 0, the range is [active_era - BondingDuration, active_era], this is one greater than BondingDuration.

We should make this [active_era - BondingDuration + 2, active_era]. This is similar to SlashDeferDuration > 0 range as SlashDeferDuration in polkadot is 1 less than Bonding Duration.
cc: @cirko33 Thoughts @sigurpol ?

that should work indeed. Again, there is always the other option to convert to Vec + force_from that wouldn't require any of this. What is the strong argument against that approach?

IMO (but not a strong opinion 😆) try_insert is the right API here. force_from is essentially a hack: it bypasses the bound, which defeats the purpose of using a bounded type. If we need to exceed the bound, the bound itself is wrong and should be fixed.

I don't disagree 😄 Well, to be fair, I consider the whole weak bounded vector a debatable hack by itself, not only force_from: something that should be used in very limited cases, like a runtime migration, but not in regular code (not a strong opinion either, just the way I see it). If we identify a proper upper bound for OffenceQueueEras, maybe it should be a BoundedVec and not a WeakBoundedVec.

@Ank4n
Contributor

Ank4n commented Mar 25, 2026

I don't disagree 😄 Well, to be fair, I consider the whole weak bounded vector a debatable hack by itself, not only force_from: something that should be used in very limited cases, like a runtime migration, but not in regular code (not a strong opinion either, just the way I see it). If we identify a proper upper bound for OffenceQueueEras, maybe it should be a BoundedVec and not a WeakBoundedVec.

I think WeakBoundedVec is useful if a runtime update reduces the bound: we want older values to still decode, but current code to not exceed the new bound.
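That decode-vs-insert asymmetry can be sketched without pulling in the actual frame_support types. `WeakBounded` below is a hypothetical stand-in, not the real WeakBoundedVec API; it only illustrates the trade-off: over-bound data loaded from storage still works, while new writes respect the (possibly reduced) bound.

```rust
// Hypothetical stand-in for the WeakBoundedVec trade-off described above.
struct WeakBounded {
    items: Vec<u32>,
    bound: usize,
}

impl WeakBounded {
    // Decoding old storage: accept even over-bound values (the real type
    // logs a warning in this case rather than failing).
    fn from_storage(items: Vec<u32>, bound: usize) -> Self {
        Self { items, bound }
    }

    // New writes: enforce the bound.
    fn try_push(&mut self, v: u32) -> Result<(), ()> {
        if self.items.len() >= self.bound {
            return Err(());
        }
        self.items.push(v);
        Ok(())
    }
}

fn main() {
    // A runtime upgrade reduced the bound from 4 to 3; old data still loads.
    let mut q = WeakBounded::from_storage(vec![0, 1, 2, 3], 3);
    assert_eq!(q.items.len(), 4); // over-bound, but decodable
    assert!(q.try_push(4).is_err()); // new code cannot exceed the bound
    println!("ok");
}
```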

```diff
 // this implies that slashes are applied immediately, so we can accept any offence up to
 // bonding duration old.
-active_era.index.saturating_sub(T::BondingDuration::get())
+active_era.index.saturating_sub(T::BondingDuration::get()).saturating_add(2)
```
Contributor

We should also add a comment explaining the magic number 2: it reduces the range to BondingDuration - 1 eras, and the extra buffer era ensures old offences have time to be cleared out.
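One possible shape for that comment, in a hedged sketch. The free function is illustrative: the real code lives in the pallet and uses `active_era.index` with `T::BondingDuration::get()`.

```rust
// Hypothetical stand-in for the fixed oldest-reportable-era computation.
fn oldest_reportable_offence_era(active_era: u32, bonding_duration: u32) -> u32 {
    // Slashes are applied immediately when SlashDeferDuration == 0. The `+ 2`
    // trims the reportable range to BondingDuration - 1 eras, leaving a buffer
    // era so old offences are cleared before the queue bound is reached.
    active_era
        .saturating_sub(bonding_duration)
        .saturating_add(2)
}

fn main() {
    // sigurpol's example: BondingDuration = 3, active_era = 3.
    // Range becomes [2, 3]: two eras, i.e. BondingDuration - 1.
    assert_eq!(oldest_reportable_offence_era(3, 3), 2);
    println!("ok");
}
```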
