Peer discovery and unavailable host #1871

@martinsumner

If a replrtq sink has a peer that is unavailable, i.e. a request to connect to the peer will time out (due to repeated SYN handshake failure), this may cause some issues:

1 - The PB client will try to connect on initialisation (whereas an HTTP client does not). The clients are initialised during the init function of the riak_kv_replrtq_snk process, meaning that startup of Riak can be delayed by the 1-2 minute timeout of each client.

2 - If peer discovery (riak_kv_replrtq_peer) times out, it will try to update the riak_kv_replrtq_snk to use the timed-out peer. This prompts client initialisation, which takes longer than the gen_server:call timeout, and so crashes the riak_kv_replrtq_peer process.

3 - If peer discovery times out using HTTP, the error will directly crash the riak_kv_replrtq_peer process:

exception error: {function_clause,[{riak_kv_replrtq_peer,handle_info,[{#Ref<0.3982582579.1267466245.194086>,{error,{conn_failed,{error,timeout}}}},{state,[{q1_ttaaefs,[{1,8,<<redacted>>,8087,http}]}]}],[{file,"/Users/martinsumner/dbroot/basho/riak/_build/default/lib/riak_kv/src/riak_kv_replrtq_peer.erl"},{line,131}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,637}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,711}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
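As a rough illustration of the first and third points, here is a minimal sketch of a gen_server that (a) defers slow client set-up until after init/1 has returned, and (b) has a handle_info clause that accepts a reference-tagged error reply of the shape seen in the trace, rather than failing with function_clause. This is not the riak_kv code or a proposed patch: the module name peer_sketch, the state record, and the init_clients message are all hypothetical, and the real client set-up is replaced by a placeholder.

```erlang
-module(peer_sketch).
-behaviour(gen_server).

%% Hypothetical module for illustration only; not part of riak_kv.
-export([start_link/0, status/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {clients = not_ready :: not_ready | ready,
                ignored = 0 :: non_neg_integer()}).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

status() ->
    gen_server:call(?MODULE, status).

init([]) ->
    %% Return immediately. The potentially slow connection attempt is
    %% deferred to a message handled after init/1 has returned, so the
    %% supervisor (and node startup) is not held up by it.
    self() ! init_clients,
    {ok, #state{}}.

handle_call(status, _From, State) ->
    {reply, {State#state.clients, State#state.ignored}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(init_clients, State) ->
    %% Placeholder for the real (possibly slow) client initialisation.
    {noreply, State#state{clients = ready}};
handle_info({Ref, {error, _Reason}}, State) when is_reference(Ref) ->
    %% A reference-tagged error reply, e.g.
    %% {Ref, {error, {conn_failed, {error, timeout}}}}.
    %% Count it and carry on instead of crashing.
    {noreply, State#state{ignored = State#state.ignored + 1}};
handle_info(_Other, State) ->
    %% Ignore any other unexpected message.
    {noreply, State}.
```

Deferring the set-up only unblocks startup. While the deferred work runs, the process still cannot answer calls, so the second point (a caller exceeding its gen_server:call timeout) would need separate handling, for example a longer timeout, a cast, or catching the exit in the caller.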
