@akichidis (Contributor) commented Jan 29, 2026

Description

Add the ability to update a consensus peer address via the EndpointManager and the Admin server. The new address is stored separately as an override. Once a new address is provided, the node attempts to re-subscribe at the new address, invalidates the previous connection, and establishes a new one for the TonicClient.
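To illustrate the override-then-reconnect behavior described above, here is a minimal sketch, not the PR's actual code. It assumes a hypothetical `ChannelPool` that keeps committee addresses, cached channels, and overrides in separate maps; `Channel` is a stand-in for a real gRPC channel, and `AuthorityIndex` is simplified to `u32`. Setting an override drops the cached channel so the next lookup rebuilds it against the new address.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Simplified stand-in for the consensus authority index (assumption).
pub type AuthorityIndex = u32;

/// Stand-in for a real network channel; it only remembers the dialed address.
#[derive(Debug, Clone, PartialEq)]
pub struct Channel {
    pub address: String,
}

/// Hypothetical pool: committee addresses are fixed, overrides are stored separately.
pub struct ChannelPool {
    committee_addresses: BTreeMap<AuthorityIndex, String>,
    channels: RwLock<BTreeMap<AuthorityIndex, Channel>>,
    address_overrides: RwLock<BTreeMap<AuthorityIndex, String>>,
}

impl ChannelPool {
    pub fn new(committee_addresses: BTreeMap<AuthorityIndex, String>) -> Self {
        Self {
            committee_addresses,
            channels: RwLock::new(BTreeMap::new()),
            address_overrides: RwLock::new(BTreeMap::new()),
        }
    }

    /// Records the override and invalidates the cached channel for that peer.
    pub fn set_address_override(&self, peer: AuthorityIndex, address: String) {
        self.address_overrides.write().unwrap().insert(peer, address);
        self.channels.write().unwrap().remove(&peer);
    }

    /// Returns the cached channel, or builds one using the override if present,
    /// falling back to the committee address otherwise.
    pub fn get_channel(&self, peer: AuthorityIndex) -> Option<Channel> {
        let cached = self.channels.read().unwrap().get(&peer).cloned();
        if let Some(channel) = cached {
            return Some(channel);
        }
        let address = self
            .address_overrides
            .read()
            .unwrap()
            .get(&peer)
            .cloned()
            .or_else(|| self.committee_addresses.get(&peer).cloned())?;
        let channel = Channel { address };
        self.channels.write().unwrap().insert(peer, channel.clone());
        Some(channel)
    }
}
```

Keeping the override in its own map means the committee address is never mutated, so removing an override later would fall back to it naturally.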

Test plan

CI/PT


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • gRPC:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • Indexing Framework:

@akichidis akichidis marked this pull request as ready for review January 29, 2026 22:14
@akichidis akichidis requested a review from a team as a code owner January 29, 2026 22:14
@akichidis akichidis requested a review from aschran January 29, 2026 22:14
@mwtian (Contributor) left a comment

I think later we will need an integration test to make sure nodes running at the new address can get connected.

}

// Otherwise, set the first address as the override.
// TODO: support multiple addresses. For now, we only support one address per peer.
Contributor:

Let's file a task because this is a big missing piece.


struct Inner {
    discovery_handle: discovery::Handle,
    consensus_address_updater: ArcSwapOption<Arc<dyn ConsensusAddressUpdater>>,
Contributor:

unless we think we are going to add a lot more non-clonable things to the EndpointManager struct, it might be simpler to just wrap this in an Arc rather than have a separate Inner struct thing? wdyt?

@akichidis (Contributor, Author):

That's how I had it initially, but especially with the EndpointManager deriving Clone, I believe it's simpler to have the Inner struct now and extend it as we add more things. If it's not too much of an issue from your point of view, I'd suggest keeping it as is.
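For readers following the Inner-struct discussion, here is a minimal sketch of the pattern, under stated assumptions. The `update_address` method on the trait, the `set_consensus_address_updater` and `update_consensus_address` methods, and the `RecordingUpdater` helper are all hypothetical. The PR uses `ArcSwapOption`; this sketch swaps in `RwLock<Option<..>>` only to stay within the standard library.

```rust
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical shape of the updater trait; the method name is an assumption.
pub trait ConsensusAddressUpdater: Send + Sync {
    fn update_address(&self, peer: &str, address: &str) -> Result<(), String>;
}

/// Shared state lives here, so the outer handle stays cheap to clone.
struct Inner {
    // Stand-in for ArcSwapOption<Arc<dyn ConsensusAddressUpdater>>.
    consensus_address_updater: RwLock<Option<Arc<dyn ConsensusAddressUpdater>>>,
}

/// Cloning copies only the Arc pointer; all clones see the same Inner.
#[derive(Clone)]
pub struct EndpointManager {
    inner: Arc<Inner>,
}

impl EndpointManager {
    pub fn new() -> Self {
        Self {
            inner: Arc::new(Inner {
                consensus_address_updater: RwLock::new(None),
            }),
        }
    }

    /// Installs the updater once consensus is available.
    pub fn set_consensus_address_updater(&self, updater: Arc<dyn ConsensusAddressUpdater>) {
        *self.inner.consensus_address_updater.write().unwrap() = Some(updater);
    }

    /// Forwards the update, or returns an error if no updater is installed yet.
    pub fn update_consensus_address(&self, peer: &str, address: &str) -> Result<(), String> {
        let updater = self.inner.consensus_address_updater.read().unwrap().clone();
        match updater {
            Some(updater) => updater.update_address(peer, address),
            None => Err("consensus address updater is not set".to_string()),
        }
    }
}

/// Test helper (hypothetical) that records every update it receives.
#[derive(Default)]
pub struct RecordingUpdater {
    pub calls: Mutex<Vec<(String, String)>>,
}

impl ConsensusAddressUpdater for RecordingUpdater {
    fn update_address(&self, peer: &str, address: &str) -> Result<(), String> {
        self.calls.lock().unwrap().push((peer.to_string(), address.to_string()));
        Ok(())
    }
}
```

With this layout, adding another shared field later means touching only `Inner`; wrapping the single field in an `Arc` directly, as the reviewer suggests, would be shorter while there is just one such field.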

    epoch_start_state.get_validator_as_p2p_peers(config.protocol_public_key())
{
-    endpoint_manager.update_endpoint(EndpointId::P2p(peer_id), vec![address]);
+    let _ = endpoint_manager.update_endpoint(EndpointId::P2p(peer_id), vec![address]);
Contributor:

we probably should not be ignoring errors here?
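One way to avoid silently discarding the result, sketched under assumptions: the `UpdateEndpoint` trait, `UpdateError`, and the `String`-based `EndpointId::P2p` below are simplified stand-ins rather than the PR's real types, and `RejectEmpty` exists only to exercise the failure path. The idea is to log each failure and hand the failed entries back to the caller instead of writing `let _ =`.

```rust
/// Simplified stand-in; the real EndpointId::P2p wraps a peer ID type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EndpointId {
    P2p(String),
}

/// Hypothetical error type for a rejected update.
#[derive(Debug, PartialEq, Eq)]
pub struct UpdateError(pub String);

/// Hypothetical abstraction over the manager's fallible update call.
pub trait UpdateEndpoint {
    fn update_endpoint(&self, id: EndpointId, addresses: Vec<String>) -> Result<(), UpdateError>;
}

/// Applies every update, logs failures, and returns them so the caller can react.
pub fn apply_updates<M: UpdateEndpoint>(
    manager: &M,
    peers: Vec<(String, String)>,
) -> Vec<(EndpointId, UpdateError)> {
    let mut failures = Vec::new();
    for (peer_id, address) in peers {
        let id = EndpointId::P2p(peer_id);
        if let Err(err) = manager.update_endpoint(id.clone(), vec![address]) {
            eprintln!("failed to update endpoint {id:?}: {err:?}");
            failures.push((id, err));
        }
    }
    failures
}

/// Example implementation that rejects empty addresses.
pub struct RejectEmpty;

impl UpdateEndpoint for RejectEmpty {
    fn update_endpoint(&self, _id: EndpointId, addresses: Vec<String>) -> Result<(), UpdateError> {
        if addresses.iter().any(|address| address.is_empty()) {
            Err(UpdateError("empty address".to_string()))
        } else {
            Ok(())
        }
    }
}
```

Whether a failure here should be logged, retried, or propagated is a decision for the PR; the sketch only shows that the error stays visible.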

    .parse()
    .unwrap();
- endpoint_manager_1.update_endpoint(
+ let _ = endpoint_manager_1.update_endpoint(
Contributor:

we probably should not be ignoring errors here?

    Ok(())
} else {
    warn!(
        "Consensus authority node is not running, ignoring update of peer addresses for network public key {network_pubkey:?} and addresses {addresses:?}"
Contributor:

is it safe to ignore updates here? could you get into a race where you have an update that you want to be applied once consensus starts, but it gets dropped here because it came too soon?

@akichidis (Contributor, Author):

The method returns an error, and my thinking is that it's the caller's responsibility to retry or decide what to do next. For the use cases we are looking at, I'd guess this will be quite an edge case?
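If the caller is responsible for retrying, a bounded retry loop is one option. The sketch below is generic and hypothetical: `retry_update` is not part of the PR, and the attempt count and delay are arbitrary parameters. It retries a fallible closure until it succeeds or the attempt budget is exhausted, returning the number of attempts used.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `attempt` until it succeeds or `max_attempts` is reached.
/// Always makes at least one attempt. Returns the number of attempts on
/// success, or the last error on failure.
pub fn retry_update<F, E>(mut attempt: F, max_attempts: u32, delay: Duration) -> Result<u32, E>
where
    F: FnMut() -> Result<(), E>,
{
    let mut tries = 0;
    loop {
        tries += 1;
        match attempt() {
            Ok(()) => return Ok(tries),
            // Budget exhausted: surface the last error to the caller.
            Err(err) if tries >= max_attempts => return Err(err),
            // Otherwise wait and try again, e.g. while consensus is still starting.
            Err(_) => sleep(delay),
        }
    }
}
```

This blocks the calling thread between attempts; an async caller would want a non-blocking delay instead. It also does not address the reviewer's concern about updates arriving before consensus starts unless some caller actually retries.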

// Size is limited by known authorities in the committee.
channels: RwLock<BTreeMap<AuthorityIndex, Channel>>,
// Address overrides for peers, indexed by AuthorityIndex
address_overrides: RwLock<BTreeMap<AuthorityIndex, Multiaddr>>,
Contributor:

this will probably need some refactoring to work with the source-based priorities

maybe it's worth moving that logic to track which address group currently has priority into EndpointManager, instead of duplicating it in both the p2p network code and the consensus code? but I will leave that up to you
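As a rough illustration of tracking priority in one place, here is a sketch under heavy assumptions: the thread does not define the source-based priorities, so the `AddressSource` variants and their ordering below are invented for the example, as is the `PeerAddresses` type. Each source keeps its own address group, and the effective group is the one from the highest-priority source that currently has addresses.

```rust
use std::collections::BTreeMap;

/// Hypothetical sources; later variants outrank earlier ones via derived Ord.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AddressSource {
    Committee,
    Admin,
}

/// Per-peer address groups, keyed by the source that supplied them.
#[derive(Default)]
pub struct PeerAddresses {
    by_source: BTreeMap<AddressSource, Vec<String>>,
}

impl PeerAddresses {
    /// Stores the group for a source; an empty list clears that source.
    pub fn set(&mut self, source: AddressSource, addresses: Vec<String>) {
        if addresses.is_empty() {
            self.by_source.remove(&source);
        } else {
            self.by_source.insert(source, addresses);
        }
    }

    /// Returns the group from the highest-priority source, if any.
    pub fn effective(&self) -> Option<&[String]> {
        // BTreeMap iterates in key order, so the last entry has the highest priority.
        self.by_source.values().next_back().map(|group| group.as_slice())
    }
}
```

If something like this lived in EndpointManager, both the p2p code and the consensus code could ask for `effective()` rather than each tracking overrides separately.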

@akichidis (Contributor, Author):

Ah ok, let me rebase and see what I can do here.

4 participants