PlatformVM: Support arbitrary validator addition and deletions#5003
PlatformVM: Support arbitrary validator addition and deletions#5003yacovm wants to merge 1 commit intoava-labs:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors the Platform VM's staker diff logic to support validator replacements by allowing a validator to be deleted and re-added in the same diff. Previously, this operation was explicitly forbidden via ErrAddingStakerAfterDeletion.
Changes:
- Refactored
diffValidatorto hold separateaddedandremovedpointers instead of a singlevalidatorpointer with status, enabling simultaneous tracking of validator replacements - Updated weight diff calculation to compute net changes when both validators are present, ensuring delegator weights are preserved across replacements
- Modified test expectations to verify that validator replacements are now allowed and function correctly
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| vms/platformvm/state/staker.go | Added Equals() method to compare Staker instances for determining if add/delete operations cancel out |
| vms/platformvm/state/stakers.go | Core refactoring: replaced validatorStatus field with added/removed pointers in diffValidator; updated PutValidator() and DeleteValidator() to handle replacements |
| vms/platformvm/state/state.go | Updated getInheritedPublicKey(), updateValidatorManager(), and persistence functions to handle validator replacements correctly |
| vms/platformvm/state/diff.go | Modified Apply() to process removals before additions for validator replacements |
| vms/platformvm/state/stakers_test.go | Added comprehensive unit tests for add-delete-add sequences, replacements, and weight diff calculations |
| vms/platformvm/vm_regression_test.go | Updated TestAddValidatorDuringRemoval to expect success instead of error; added integration tests for weight persistence and multiple replacements |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f272767 to
6b19328
Compare
51f60e9 to
5363c9b
Compare
f43295a to
b0d88d8
Compare
| validatorDiff := v.getOrCreateValidatorDiff(staker.SubnetID, staker.NodeID) | ||
| validatorDiff.validatorStatus = deleted | ||
| validatorDiff.validator = staker | ||
| validatorDiff.added = nil |
There was a problem hiding this comment.
validatorDiff is used to keep track of uncommitted changes.
If Delete is called after Put, validatorDiff.validatorStatus() will return deleted. The correct status should be unmodified.
And also, because of that, WeightDiff has a wrong value.
I think this test is a comprehensive TLDR of my comment.
func TestBaseStakersValidatorDiff(t *testing.T) {
require := require.New(t)
v := newBaseStakers()
staker := newTestStaker()
v.PutValidator(staker)
v.DeleteValidator(staker)
vdrDiff := v.getOrCreateValidatorDiff(staker.SubnetID, staker.NodeID)
require.Equal(unmodified, vdrDiff.validatorStatus())
weighDiff, err := vdrDiff.WeightDiff()
require.NoError(err)
require.Equal(uint64(0), weighDiff.Amount)
}
There was a problem hiding this comment.
validatorDiffis used to keep track of uncommitted changes.If
Deleteis called afterPut,validatorDiff.validatorStatus()will returndeleted. The correct status should beunmodified.And also, because of that,
WeightDiffhas a wrong value.I think this test is a comprehensive TLDR of my comment.
func TestBaseStakersValidatorDiff(t *testing.T) { require := require.New(t) v := newBaseStakers() staker := newTestStaker() v.PutValidator(staker) v.DeleteValidator(staker) vdrDiff := v.getOrCreateValidatorDiff(staker.SubnetID, staker.NodeID) require.Equal(unmodified, vdrDiff.validatorStatus()) weighDiff, err := vdrDiff.WeightDiff() require.NoError(err) require.Equal(uint64(0), weighDiff.Amount) }
On the master branch, your test fails. 🤷♂️
If I change the test to the below:
func TestBaseStakersValidatorDiff(t *testing.T) {
require := require.New(t)
v := newBaseStakers()
staker := newTestStaker()
v.PutValidator(staker)
v.DeleteValidator(staker)
vdrDiff := v.getOrCreateValidatorDiff(staker.SubnetID, staker.NodeID)
require.Equal(deleted, vdrDiff.validatorStatus)
weighDiff, err := vdrDiff.WeightDiff()
require.NoError(err)
require.Equal(uint64(1), weighDiff.Amount)
}
the test passes both on master and on my branch.
There was a problem hiding this comment.
Ok, but that's not correct.
PutValidator shouldn't allow (through invariants or errors) to be called with an existing validator.
So that means, the calls of PutValidator and after DeleteValidator should be a no-op.
There was a problem hiding this comment.
The behavior you are describing is implemented in the diffStakers.
The baseStakers just blindly follows the directives of its caller before this PR, so I kept the behavior as is.
The goal of this PR is to change the functionality of the diff stakers.
| newWeight, err := computeNewStakerWeight(weightDiff, currentWeight) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| if newWeight > 0 { |
There was a problem hiding this comment.
weightDiff.Decrease is false, because of line 2548.
weightDiff > 0 and currentWeight > 0 => newWeight > 0.
Isnt this if check redundant?
There was a problem hiding this comment.
This is to support the flow of updating a validator.
We can update the validator by removing it first and then re-adding it:
// The validator was removed and re-added. We need to remove the
// old entry entirely (including any delegator weight) and
// re-add with the correct total weight.
So, computeNewStakerWeight may return a > 0 newWeight.
There was a problem hiding this comment.
Also if you put a panic here and run the tests you'll see that we actually do reach this branch.
There was a problem hiding this comment.
I put a panic and I ran all tests, and I cannot see the panic.
But how can newWeight <= 0?
How can you remove more than the initial weight (or equal)?
There was a problem hiding this comment.
diff --git a/vms/platformvm/state/state.go b/vms/platformvm/state/state.go
index 0b122e9f2a..8ce3431635 100644
--- a/vms/platformvm/state/state.go
+++ b/vms/platformvm/state/state.go
@@ -2582,6 +2582,7 @@ func (s *state) updateValidatorManager(updateValidators bool) error {
}
if newWeight > 0 {
+ panic("bla")
err = s.validators.AddStaker(
subnetID,
nodeID,
fails with:
--- FAIL: TestSubnetValidatorRemoveAndReplaceInSingleBlock/lower_weight (0.00s)
--- FAIL: TestSubnetValidatorRemoveAndReplaceInSingleBlock (0.00s)
panic: bla [recovered, repanicked]
goroutine 65207 [running]:
testing.tRunner.func1.2({0x100e22d80, 0x10103bd90})
/Users/yacov.manevich/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.7.darwin-arm64/src/testing/testing.go:1872 +0x190
testing.tRunner.func1()
/Users/yacov.manevich/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.7.darwin-arm64/src/testing/testing.go:1875 +0x31c
panic({0x100e22d80?, 0x10103bd90?})
/Users/yacov.manevich/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.25.7.darwin-arm64/src/runtime/panic.go:783 +0x120
github.com/ava-labs/avalanchego/vms/platformvm/state.(*state).updateValidatorManager(0x14000401c08, 0xd2?)
/Users/yacov.manevich/avalanchego/vms/platformvm/state/state.go:2585 +0xe48
github.com/ava-labs/avalanchego/vms/platformvm/state.(*state).write(0x14000401c08, 0x1, 0x3)
/Users/yacov.manevich/avalanchego/vms/platformvm/state/state.go:2248 +0x78
github.com/ava-labs/avalanchego/vms/platformvm/state.(*state).CommitBatch(0x14000401c08)
/Users/yacov.manevich/avalanchego/vms/platformvm/state/state.go:2394 +0x28
There was a problem hiding this comment.
I am saying that newWeight cannot be <= 0, and that checking newWeight > 0 is redundant.
I put the panic in else branch.
if newWeight > 0 {
...
} else {
panic("blabla")
}
vms/platformvm/state/state.go
Outdated
| // The validator was removed and re-added. We need to remove the | ||
| // old entry entirely (including any delegator weight) and | ||
| // re-add with the correct total weight. |
There was a problem hiding this comment.
I don't think we should allow removing and re-adding of validators with delegators. 🤔
There was a problem hiding this comment.
This isn't the right place to do such sanity checks. We're in the updateValidatorManager called by the write() function.
Sanity checks and illegal logic checks should be done earlier, not here.
There was a problem hiding this comment.
I removed the text from the comment
vms/platformvm/state/state.go
Outdated
There was a problem hiding this comment.
Lets define:
- staker A: [nodeID_A, subnetID_A, txID_A, weight_A]
- staker B: [nodeID_A, subnetID_A, txID_B, weight_B], weight_B < weight_A.
If:
- delete staker A
- put staker A'
weightDiff will be: decrease: True, amount: weight_A - weight_B
That means the flow will stop in this if and s.validators (validators.Manager type) will contain old txID of the validator.
There was a problem hiding this comment.
We cannot do that because we need to update the validator manager with the new information about the validator.
The way we do it, is removing it and then adding it again.
I don't want to further extend the code of other packages to achieve this. I want to preserve the logic and API of other packages and that's why I am using this pattern of delete - then - add.
There was a problem hiding this comment.
That means the flow will stop in this if and s.validators (validators.Manager type) will contain old txID of the validator.
Yes you're right, nice catch 👍
I thought I covered this but I didn't check the txID in the test... ouch.
I refactored the area in the code to be more explicit about re-adding a validator and changed the test to test also the txID and all 3 combinations of weight changes.
c22e2f4 to
4c3a16d
Compare
This commit simplifies the staker diff logic by allowing post deletion addition. By doing so, we essentially allow an update of a validator, which was previously not supported by diffs. Refactor of `diffValidator`. We now derive the status dynamically from the internal state, and instead of holding a single `validator *Staker`, we hold the two updates we could have: `added *Staker` and `removed *Staker`. If both are non-nil then it means we have updated the validator. Added UTs and in CI. - **TestSubnetValidatorManagerAfterMultipleExpiration**: Performs two consecutive subnet validator replacements on the same node, each via expiry+add in a single block. - **TestSubnetValidatorRemoveAddRemoveInSingleBlock**: single block with 3 transactions: remove validator, add validator (weight 20), remove validator. After acceptance, verifies the node is no longer a validator and the validator set is empty. - **TestSubnetValidatorRemoveAndReplaceInSingleBlock**: Adds validator, then builds a single block with 2 transactions: remove the validator, add the same validator (with weight 20). After acceptance, verifies V2 is the current validator with weight 20 in the validator set. No. Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
| if vdr.added != nil { | ||
| return vdr.added.PublicKey, nil | ||
| } |
There was a problem hiding this comment.
I don't think vdr.added != nil can happen in the second if, because an added/modified validator that is present in the diff, should also be present in the s.currentStakers.validators, otherwise something is wrong.
Should we throw an error?
Why this should be merged
This commit simplifies the staker diff logic by allowing post deletion addition. By doing so, we essentially allow an update of a validator, which was previously not supported by diffs.
How this works
Refactor of
diffValidator. We now derive the status dynamically from the internal state, and instead of holding a singlevalidator *Staker, we hold the two updates we could have:added *Stakerandremoved *Staker.If both are non-nil then it means we have updated the validator.
How this was tested
Added UTs and in CI.
TestSubnetValidatorManagerAfterMultipleExpiration: Performs two consecutive subnet validator replacements on the same node, each via expiry+add in a single block.
TestSubnetValidatorRemoveAddRemoveInSingleBlock: single block with 3 transactions: remove validator, add validator (weight 20), remove validator. After acceptance, verifies the node is no longer a validator and the validator set is empty.
TestSubnetValidatorRemoveAndReplaceInSingleBlock: Adds validator, then builds a single block with 2 transactions: remove the validator, add the same validator (with weight 20). After acceptance, verifies the validator is the current validator with weight 20 in the validator set.
Some tests in
stakers_test.gothat add and remove validators in various combinations.Need to be documented in RELEASES.md?
No.