CouchDB v3.1 cluster stability issue #3559
Unanswered · nicknaychov asked this question in Q&A
Hi,

We recently upgraded from 2.3.1 to 3.1.1. We noticed that when we restart one of the nodes, all nodes start to report errors and the whole cluster becomes unusable. To fix the issue we have to restart the rest of the nodes as well. This does not seem like a reliable design for v3; I do not think it is normal for a restart of one node to bring the whole cluster down. Even if our cluster is not set up correctly, that behavior still seems very odd to me.
Our cluster configuration:

```ini
[cluster]
q=5
n=3
placement = z1:2,z2:1
```
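With this placement, each of the 5 shard ranges keeps n=3 replicas: two in zone z1 and one in zone z2. A quick way to confirm that every node reports the same effective settings is to read the `[cluster]` section from each node's config API. This is only a diagnostic sketch; the hostnames, port, and admin credentials below are assumptions, not values from this post:

```python
import base64
import json
import urllib.request

# Assumed node addresses and admin credentials -- replace with your own.
NODES = ["pbx1-z1", "pbx2-z1", "pbx1-z2"]
AUTH = "Basic " + base64.b64encode(b"admin:password").decode()

for node in NODES:
    # GET /_node/_local/_config/cluster returns the effective [cluster]
    # section of the node that answers the request.
    req = urllib.request.Request(
        f"http://{node}:5984/_node/_local/_config/cluster",
        headers={"Authorization": AUTH},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        print(node, json.load(resp))
```

If the nodes disagree on q, n, or placement, that is worth fixing before digging further.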
Errors we get after restarting pbx1-z2:

On pbx1-z2:

```
[error] 2021-05-11T11:35:43.688273Z [email protected] <0.502.0> -------- Error checking security objects for _replicator :: {error,timeout}
[error] 2021-05-11T11:35:43.723096Z [email protected] <0.561.0> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/99999999-cccccccb/_users.1619581901">>
[error] 2021-05-11T11:35:43.723325Z [email protected] <0.561.0> -------- Error checking security objects for _users :: {error,timeout}
```

On pbx2-z1:

```
[error] 2021-05-11T11:35:42.566342Z [email protected] <0.17661.0> 0794fd3f5d fabric_worker_timeout open_doc,'[email protected]',<<"shards/66666666-99999998/_users.1619581901">>
[error] 2021-05-11T11:35:42.566343Z [email protected] <0.17660.0> dbd0ec51bc fabric_worker_timeout open_doc,'[email protected]',<<"shards/66666666-99999998/_users.1619581901">>
```

On pbx1-z1:
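The fabric_worker_timeout entries mean that internal requests from the coordinating node to shard replicas on another node timed out. As a hedged diagnostic sketch (reusing the assumed hostnames and credentials from above), each node can be asked whether it is up and which peers it currently sees:

```python
import base64
import json
import urllib.request

NODES = ["pbx1-z1", "pbx2-z1", "pbx1-z2"]  # assumed hostnames
AUTH = "Basic " + base64.b64encode(b"admin:password").decode()  # assumed credentials

def get(node, path):
    req = urllib.request.Request(f"http://{node}:5984{path}",
                                 headers={"Authorization": AUTH})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

for node in NODES:
    # /_up reports whether the node itself is accepting requests;
    # /_membership lists cluster_nodes (configured members) and
    # all_nodes (nodes this node currently knows about / is connected to).
    print(node, "/_up:", get(node, "/_up"))
    print(node, "/_membership:", get(node, "/_membership"))
```

If the restarted node appears in cluster_nodes but is missing from another node's all_nodes, the Erlang distribution connection between them did not re-establish after the restart, which could explain why the timeouts persist until the remaining nodes are restarted.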
Let me know if you need any further details.

Thank you