CouchDB v3.1 cluster stability issue #3559
Unanswered · nicknaychov asked this question in Q&A
Hi,

We recently upgraded from 2.3.1 to 3.1.1. We noticed that when we restart one of the nodes, all nodes start to report errors and the whole cluster becomes unusable. To fix the issue we have to restart the rest of the nodes as well. This does not seem like a reliable design for v3; I do not think it is normal for a restart of one node to bring the whole cluster down. Even if our cluster is not set up correctly, that behavior still seems very odd to me.
Our cluster configuration:

```ini
[cluster]
q=5
n=3
placement = z1:2,z2:1
```
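With this placement, each of the 5 shard ranges keeps n=3 replicas: two in zone z1 and one in zone z2. A quick way to confirm that every node reports the same effective settings is to read the `[cluster]` section from each node's config API. This is only a diagnostic sketch; the hostnames, port, and admin credentials below are assumptions, not values from this post:

```python
import base64
import json
import urllib.request

# Assumed node addresses and admin credentials -- replace with your own.
NODES = ["pbx1-z1", "pbx2-z1", "pbx1-z2"]
AUTH = "Basic " + base64.b64encode(b"admin:password").decode()

for node in NODES:
    # GET /_node/_local/_config/cluster returns the effective [cluster]
    # section of the node that answers the request.
    req = urllib.request.Request(
        f"http://{node}:5984/_node/_local/_config/cluster",
        headers={"Authorization": AUTH},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        print(node, json.load(resp))
```

If the nodes disagree on q, n, or placement, that is worth fixing before digging further.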
Errors we get after restarting pbx1-z2:

On pbx1-z2:

```
[error] 2021-05-11T11:35:43.688273Z [email protected] <0.502.0> -------- Error checking security objects for _replicator :: {error,timeout}
[error] 2021-05-11T11:35:43.723096Z [email protected] <0.561.0> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/99999999-cccccccb/_users.1619581901">>
[error] 2021-05-11T11:35:43.723325Z [email protected] <0.561.0> -------- Error checking security objects for _users :: {error,timeout}
```

On pbx2-z1:

```
[error] 2021-05-11T11:35:42.566342Z [email protected] <0.17661.0> 0794fd3f5d fabric_worker_timeout open_doc,'[email protected]',<<"shards/66666666-99999998/_users.1619581901">>
[error] 2021-05-11T11:35:42.566343Z [email protected] <0.17660.0> dbd0ec51bc fabric_worker_timeout open_doc,'[email protected]',<<"shards/66666666-99999998/_users.1619581901">>
```

On pbx1-z1:
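The fabric_worker_timeout entries mean that internal requests from the coordinating node to shard replicas on another node timed out. As a hedged diagnostic sketch (reusing the assumed hostnames and credentials from above), each node can be asked whether it is up and which peers it currently sees:

```python
import base64
import json
import urllib.request

NODES = ["pbx1-z1", "pbx2-z1", "pbx1-z2"]  # assumed hostnames
AUTH = "Basic " + base64.b64encode(b"admin:password").decode()  # assumed credentials

def get(node, path):
    req = urllib.request.Request(f"http://{node}:5984{path}",
                                 headers={"Authorization": AUTH})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

for node in NODES:
    # /_up reports whether the node itself is accepting requests;
    # /_membership lists cluster_nodes (configured members) and
    # all_nodes (nodes this node currently knows about / is connected to).
    print(node, "/_up:", get(node, "/_up"))
    print(node, "/_membership:", get(node, "/_membership"))
```

If the restarted node appears in cluster_nodes but is missing from another node's all_nodes, the Erlang distribution connection between them did not re-establish after the restart, which could explain why the timeouts persist until the remaining nodes are restarted.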
Let me know if you need any further details.

Thank you