-
Notifications
You must be signed in to change notification settings - Fork 11.7k
[Consensus] update peer address via EndpointManager #25162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
mwtian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think later we will need an integration test to make sure nodes running at the new address can get connected.
| } | ||
|
|
||
| // Otherwise, set the first address as the override. | ||
| // TODO: support multiple addresses. For now, we only support one address per peer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's file a task because this is a big missing piece.
|
|
||
| struct Inner { | ||
| discovery_handle: discovery::Handle, | ||
| consensus_address_updater: ArcSwapOption<Arc<dyn ConsensusAddressUpdater>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless we think we are going to add a lot more non-clonable things to the EndpointManager struct, it might be simpler to just wrap this in an Arc rather than have a separate Inner struct thing? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how I had this initially, but especially with the EndpointManager deriving Clone, I believe it would be simple to have the Inner struct now and extend as we add more things. If it's not too much of an issue from your point of view, I would suggest to keep as is.
| epoch_start_state.get_validator_as_p2p_peers(config.protocol_public_key()) | ||
| { | ||
| endpoint_manager.update_endpoint(EndpointId::P2p(peer_id), vec![address]); | ||
| let _ = endpoint_manager.update_endpoint(EndpointId::P2p(peer_id), vec![address]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we prob should not be ignoring errors here?
| .parse() | ||
| .unwrap(); | ||
| endpoint_manager_1.update_endpoint( | ||
| let _ = endpoint_manager_1.update_endpoint( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably should not be ignoring errors here?
| Ok(()) | ||
| } else { | ||
| warn!( | ||
| "Consensus authority node is not running, ignoring update of peer addresses for network public key {network_pubkey:?} and addresses {addresses:?}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it safe to ignore updates here? could you get into a race where you have an update that you want to be applied once consensus starts, but it gets dropped here because it came too soon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method returns an error and basically I am thinking it's the responsibility of the caller to retry or take further decisions. Now, for the use cases we are looking at I guess this will be quite an edge case?
| // Size is limited by known authorities in the committee. | ||
| channels: RwLock<BTreeMap<AuthorityIndex, Channel>>, | ||
| // Address overrides for peers, indexed by AuthorityIndex | ||
| address_overrides: RwLock<BTreeMap<AuthorityIndex, Multiaddr>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will probably need some refactoring to work with the source-based priorities
maybe it's worth moving that logic to track which address group currently has priority into EndpointManager, instead of duplicating it in both the p2p network code and the consensus code? but I will leave that up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok let me rebase and see what can I do here
Description
Add the ability to update a consensus peer address via the EndpointManager and the Admin server. The new address is stored separately as an override. Once the new address is provided then the node will attempt to re-subscribe to the new address and also invalidate the previous connection and establish new one for the TonicClient.
Test plan
CI/PT
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.