Member

@kezhuw kezhuw commented Jun 15, 2025

The cause is multifold:

  1. The leader commits a proposal once a quorum has acked it.
  2. A proposal can be committed in a node's memory even if it has not been written to that node's disk.
  3. In case of a disk error, the txn log can lag behind the in-memory database.

As a result, a node that experienced a temporary disk error will have a hole in its txn log after it re-joins. Once that node is restarted, the data in the hole is lost.
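To make the failure mode concrete, here is a minimal toy model of the three points above. It is only a sketch: `TxnLogHoleDemo`, `Node`, `log`, `commit`, `restart`, and `diskFailing` are hypothetical names invented for illustration, not ZooKeeper classes or APIs, and the model ignores quorum, snapshots, and networking.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these names are ZooKeeper classes. */
final class TxnLogHoleDemo {
    static final class Node {
        final List<Long> txnLog = new ArrayList<>();   // what survives a restart
        final List<Long> memoryDb = new ArrayList<>(); // what the running node serves
        boolean diskFailing = false;                   // simulated temporary disk error

        /** Try to persist a proposal; returns false when the disk write fails. */
        boolean log(long zxid) {
            if (diskFailing) {
                return false;
            }
            txnLog.add(zxid);
            return true;
        }

        /** Apply a committed proposal to memory, whether or not it was logged locally. */
        void commit(long zxid) {
            memoryDb.add(zxid);
        }

        /** Restart: the in-memory database is rebuilt from the txn log alone. */
        void restart() {
            memoryDb.clear();
            memoryDb.addAll(txnLog);
        }
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.log(1);
        node.commit(1);

        node.diskFailing = true;  // temporary disk error begins
        node.log(2);              // local write fails, but a quorum of other nodes acked
        node.commit(2);           // so the commit still reaches this node's memory
        node.diskFailing = false; // disk recovers

        node.log(3);
        node.commit(3);

        System.out.println("memory before restart: " + node.memoryDb);
        System.out.println("txn log:               " + node.txnLog);
        node.restart();
        System.out.println("memory after restart:  " + node.memoryDb);
    }
}
```

In this model the memory database holds zxids 1, 2, 3 before the restart while the txn log holds only 1 and 3, so zxid 2 disappears once the node restarts.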

This commit complains about the lag so that the disk database is reloaded into memory. This way, the node will not be able to become leader and will sync the missing txns from the leader.
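The idea can be sketched roughly as follows. This is not the code from this PR: `TxnLogLagCheckSketch`, `TxnLogLagException`, `Node`, and its methods are hypothetical names, and the sketch assumes the check happens at commit time by comparing the last logged zxid with the zxid being committed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the code from this PR and not ZooKeeper's API. */
final class TxnLogLagCheckSketch {
    /** Raised when a commit arrives for a txn the local log never persisted. */
    static final class TxnLogLagException extends RuntimeException {
        TxnLogLagException(String message) {
            super(message);
        }
    }

    static final class Node {
        final List<Long> txnLog = new ArrayList<>();   // persisted txns
        final List<Long> memoryDb = new ArrayList<>(); // applied txns

        long lastLoggedZxid() {
            return txnLog.isEmpty() ? 0L : txnLog.get(txnLog.size() - 1);
        }

        long lastProcessedZxid() {
            return memoryDb.isEmpty() ? 0L : memoryDb.get(memoryDb.size() - 1);
        }

        /** Refuse to apply a commit that the txn log has not caught up to. */
        void commit(long zxid) {
            if (lastLoggedZxid() < zxid) {
                throw new TxnLogLagException(
                        "txn log at " + lastLoggedZxid() + " lags commit " + zxid);
            }
            memoryDb.add(zxid);
        }

        /** Drop in-memory state and rebuild it from what is actually on disk. */
        void reloadFromDisk() {
            memoryDb.clear();
            memoryDb.addAll(txnLog);
        }
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.txnLog.add(1L);
        node.commit(1L);
        try {
            node.commit(2L); // zxid 2 was never logged locally
        } catch (TxnLogLagException e) {
            node.reloadFromDisk(); // memory now matches the disk again
        }
        System.out.println("last processed zxid after reload: " + node.lastProcessedZxid());
    }
}
```

After the reload, the node in this sketch reports only what its disk holds (zxid 1), so a peer that has zxid 2 is ahead of it and the node has to catch up from the leader rather than serve or lead with state it cannot recover after a restart.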

Refs: ZOOKEEPER-4882, ZOOKEEPER-4925

@kezhuw kezhuw force-pushed the ZOOKEEPER-4882-fix-data-loss-from-node-experienced-temporary-disk-error branch 2 times, most recently from 4b44db5 to 0ac0f3f, June 15, 2025 04:04
@kezhuw kezhuw force-pushed the ZOOKEEPER-4882-fix-data-loss-from-node-experienced-temporary-disk-error branch from 0ac0f3f to 2a6a6d3, June 26, 2025 00:44
@kezhuw kezhuw force-pushed the ZOOKEEPER-4882-fix-data-loss-from-node-experienced-temporary-disk-error branch from 2a6a6d3 to 8a868da, July 8, 2025 03:33