Make vault fall back to other overlay node if overlay rejects connection #568

@ebma

Context

A validator node can reach its limit of open connections. When that happens, the node returns Error{ code:ErrLoad message:peer rejected }. We already fall back to other overlay nodes when the initial connection fails, but in this case the validator only returns the error later in the flow, so the default fallback is never triggered. As the following logs show, the connection is accepted at first, and the peer is rejected in a later message.

Nov 07 09:22:22.182  INFO service: Version: 1.0.13
Nov 07 09:22:22.182  INFO service: Vault uses Substrate account with ID: 6jvAnyNWmtZSF5efLSxgx9awf4nn44r2Pbwuu8L3Ru5B77aN
Nov 07 09:22:22.182  INFO runtime::conn: Connecting to the spacewalk-parachain...    
Nov 07 09:22:22.320  INFO runtime::conn: Connected!    
Nov 07 09:22:22.584  INFO runtime::rpc: spec_name="amplitude"    
Nov 07 09:22:22.584  INFO runtime::rpc: spec_version=19    
Nov 07 09:22:22.584  INFO runtime::rpc: transaction_version=13    
Nov 07 09:22:22.587  INFO wallet::cache: Caching stellar transactions at .//GDI3XRT3OXFIUNGKEOC5273DUM6BJUBKBJ4BLTIQU7M3HYGDXTYH5SHY_true/txs
Nov 07 09:22:24.402  INFO vault::system: Got new block at height 4702605
Nov 07 09:22:24.402  INFO vault::system: Starting client service...
Nov 07 09:22:24.458  INFO vault::system: Not registering public key -- already registered
Nov 07 09:22:24.519  INFO vault::system: [6jvAnyNWmtZSF5efLSxgx9awf4nn44r2Pbwuu8L3Ru5B77aN[XCM(0)->{ code: AUDD, issuer: GDC7X2MXTYSAKUUGAIQ7J7RPEIM7GXSAIWFYWWH4GLNFECQVJJLB2EEU }]] Not registering vault -- already registered
Nov 07 09:22:24.716  INFO vault::system: Adding vault with ID: VaultId { account_id: AccountId32([134, 170, 122, 75, 136, 133, 147, 64, 79, 224, 247, 246, 119, 142, 219, 181, 233, 52, 167, 28, 46, 70, 222, 230, 11, 186, 32, 110, 34, 29, 213, 122]), currencies: VaultCurrencyPair { collateral: Static(XCM(0)), wrapped: Static({ code: AUDD, issuer: GDC7X2MXTYSAKUUGAIQ7J7RPEIM7GXSAIWFYWWH4GLNFECQVJJLB2EEU }) } }
Nov 07 09:22:24.719  INFO stellar_relay_lib::config: connection_info(): Connecting to Stellar overlay network using public key: GDI3XRT3OXFIUNGKEOC5273DUM6BJUBKBJ4BLTIQU7M3HYGDXTYH5SHY
Nov 07 09:22:24.723  INFO stellar_relay_lib::overlay: connect(): connecting to ConnectionInfo { address: "85.190.254.217", port: 11625, secret_key: "****", auth_cert_expiration: 0, receive_tx_messages: false, receive_scp_messages: true, remote_called_us: false, timeout_in_seconds: 10 }
Nov 07 09:22:24.837  INFO stellar_relay_lib::connection::connector::message_reader: poll_messages_from_stellar(): started.
Nov 07 09:22:24.945  INFO stellar_relay_lib::connection::connector::message_handler: process_stellar_message(): Hello message processed successfully
Nov 07 09:22:25.048 ERROR stellar_relay_lib::connection::connector::message_handler: process_raw_message(): Received ErrorMsg during authentication: Error{ code:ErrLoad message:peer rejected }
Nov 07 09:22:25.049 ERROR stellar_relay_lib::connection::error: Stellar Node returned error: Error{ code:ErrLoad message:peer rejected }
Nov 07 09:22:25.049 ERROR stellar_relay_lib::connection::connector::message_reader: poll_messages_from_stellar(): Error occurred during processing xdr message: OverlayError(ErrLoad)
Nov 07 09:22:25.049  INFO stellar_relay_lib::connection::connector::message_reader: poll_messages_from_stellar(): stopped.
Nov 07 09:22:35.278  INFO vault::system: Starting all services...
Nov 07 09:22:35.278  INFO vault::requests::execution: execute_open_requests(): started
Nov 07 09:22:35.278  INFO vault::oracle::agent: listen_for_stellar_messages(): started
Nov 07 09:22:35.278  INFO stellar_relay_lib::overlay: stop(): closing connection to overlay network
Nov 07 09:22:35.278 ERROR stellar_relay_lib::overlay: listen(): sender half of overlay has closed.
Nov 07 09:22:35.278 ERROR vault::oracle::agent: listen_for_stellar_messages(): encounter error in overlay: Disconnected
Nov 07 09:22:35.278  INFO vault::oracle::agent: listen_for_stellar_messages(): shutting down overlay connection

The error seems to be thrown here and there is a similar check here
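
To illustrate the idea, here is a minimal sketch of how a rejection that arrives after the Hello message could be classified as a reason to switch nodes. The types `ErrorCode`, `HandshakeState` and `Reaction` and the function `reaction_to_error` are hypothetical stand-ins invented for this example; they are not the actual types or functions of stellar_relay_lib.

```rust
/// Hypothetical stand-in for the overlay error codes; only `ErrLoad`
/// appears in the logs above, the other variants are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorCode {
    ErrLoad,
    ErrAuth,
    ErrMisc,
}

/// Hypothetical handshake progress of a single overlay connection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HandshakeState {
    Connecting,
    GotHello,
    Completed,
}

/// What the caller should do after receiving an error message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reaction {
    /// The node is overloaded: treat this like a failed initial connection.
    TryOtherNode,
    /// Any other situation: surface the error as before.
    ReportError,
}

/// Assumption: an `ErrLoad` received before the handshake has completed
/// means the peer rejected us, so it should trigger the existing fallback.
fn reaction_to_error(state: HandshakeState, code: ErrorCode) -> Reaction {
    match (state, code) {
        (HandshakeState::Connecting, ErrorCode::ErrLoad)
        | (HandshakeState::GotHello, ErrorCode::ErrLoad) => Reaction::TryOtherNode,
        _ => Reaction::ReportError,
    }
}
```

With this classification, the `ErrLoad` seen at 09:22:25.048 (after the Hello message was processed) would map to `Reaction::TryOtherNode` instead of only stopping the message reader.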

TODO

Make the vault fall back to a different Stellar validator node when this error is encountered. A simple solution to this would be to make the vault connect to a different validator node every time it restarts, regardless of the reason for the restart.
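
The "different node on every restart" approach could look roughly like the sketch below. It assumes the vault has a list of configured overlay nodes and remembers the index it used last; `NodeAddress`, `ConnectError`, `connect_with_fallback` and `next_start_index` are hypothetical names for illustration, and the `try_connect` closure stands in for whatever the vault actually calls to open and authenticate an overlay connection.

```rust
/// Hypothetical description of one configured overlay node.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NodeAddress {
    address: String,
    port: u16,
}

/// Hypothetical connection errors; `PeerRejected` models the
/// `ErrLoad` / "peer rejected" case from the logs.
#[derive(Debug, PartialEq, Eq)]
enum ConnectError {
    PeerRejected,
    Other(String),
}

/// Index to start from after a restart, so that every restart
/// begins with a different node than the previous run.
fn next_start_index(previous: usize, node_count: usize) -> usize {
    if node_count == 0 {
        0
    } else {
        (previous % node_count + 1) % node_count
    }
}

/// Tries each node once, starting at `start` and wrapping around.
/// Returns the index of the node that accepted us plus the connection.
fn connect_with_fallback<F, C>(
    nodes: &[NodeAddress],
    start: usize,
    mut try_connect: F,
) -> Result<(usize, C), ConnectError>
where
    F: FnMut(&NodeAddress) -> Result<C, ConnectError>,
{
    if nodes.is_empty() {
        return Err(ConnectError::Other("no overlay nodes configured".to_string()));
    }
    let first = start % nodes.len();
    let mut last_err = ConnectError::Other("no connection attempted".to_string());
    for offset in 0..nodes.len() {
        let idx = (first + offset) % nodes.len();
        match try_connect(&nodes[idx]) {
            Ok(conn) => return Ok((idx, conn)),
            // Any failure, including a late "peer rejected", moves on.
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}
```

The key design point is that `try_connect` should only return `Ok` once authentication has completed, so that a rejection arriving after the Hello message still counts as a failed attempt and moves the loop on to the next node.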
