Description
Question about the initialization/replay flow and proposal ordering
- Dragonboat version: v3.x
1. Mechanism of Initialization and Replay
My Understanding:
When a NodeHost starts, a node should first load the most recent snapshot and then replay the Raft log entries following the snapshot index to restore the state machine's state.
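To state that understanding precisely, here is a minimal sketch of the recovery invariant I have in mind (hypothetical types and names, not Dragonboat's actual code): load the snapshot first, then apply only entries whose index exceeds the snapshot index.

```go
package main

import "fmt"

// Entry is a simplified Raft log entry (hypothetical, not Dragonboat's pb.Entry).
type Entry struct {
	Index uint64
	Cmd   string
}

// StateMachine is a toy state machine restored from snapshot plus replay.
type StateMachine struct {
	applied uint64
	state   []string
}

// RecoverOnStartup loads the snapshot state first, then replays only the
// log entries whose index is greater than the snapshot index.
func RecoverOnStartup(snapshotIndex uint64, snapshotState []string, log []Entry) *StateMachine {
	sm := &StateMachine{
		applied: snapshotIndex,
		state:   append([]string{}, snapshotState...),
	}
	for _, e := range log {
		if e.Index <= snapshotIndex {
			continue // already covered by the snapshot
		}
		sm.state = append(sm.state, e.Cmd)
		sm.applied = e.Index
	}
	return sm
}

func main() {
	log := []Entry{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	sm := RecoverOnStartup(2, []string{"a", "b"}, log)
	fmt.Println(sm.applied, sm.state) // 4 [a b c d]
}
```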
Question:
I have examined the replayLog method in node.go, which retrieves the EntryCount and Commit index via logdb.ReadRaftState. However, I am struggling to locate the exact "physical" trigger point:
- Who is responsible, and at what exact moment, for actually pushing those pb.Update.CommittedEntries (read from disk) into the TaskChan?
- Is this process triggered during the very first iteration of the node.run loop after StartCluster is called? I suspect it is related to the gap between the applied and committed index during Raft initialization, but I would like to confirm the exact call stack.
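To make the question concrete, this is the pattern I imagine behind the trigger (a hypothetical sketch, not Dragonboat's actual code; the names TaskChan/applyWorker are borrowed from the question): on the first run-loop iteration, the gap between the applied and committed index is turned into a batch and handed to the apply worker over a channel.

```go
package main

import "fmt"

// gapBatch builds the batch of log indexes in (applied, committed] that
// must be handed to the apply worker on startup.
func gapBatch(applied, committed uint64) []uint64 {
	var batch []uint64
	for i := applied + 1; i <= committed; i++ {
		batch = append(batch, i)
	}
	return batch
}

func main() {
	// taskC plays the role of the task channel the apply worker drains.
	taskC := make(chan []uint64, 8)
	done := make(chan []uint64)

	// Apply worker: consumes batches strictly in channel order.
	go func() {
		var applied []uint64
		for batch := range taskC {
			applied = append(applied, batch...)
		}
		done <- applied
	}()

	// First run-loop iteration after startup: the applied index (0 here)
	// lags the committed index (5) recovered from the log DB, so the gap
	// is read back and pushed onto the task channel for replay.
	taskC <- gapBatch(0, 5)
	close(taskC)
	fmt.Println(<-done) // [1 2 3 4 5]
}
```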
2. Ordering and Concurrency Control
My Understanding:
Dragonboat supports high-concurrency proposals. I am interested in how the system guarantees ordering when a SyncPropose is attempted while the system is still replaying a large volume of historical logs during the post-initialization phase.
Question:
- Is this strict ordering enforced by a specific Request Queue with locking at the node.go layer, or does it rely on the internal serialization of the internal/raft protocol stack?
- While the applyWorker is busy replaying a large backlog of logs, will a newly arrived Propose be blocked in a specific queue?
- How does Dragonboat achieve high-efficiency enqueueing? Is it implemented via a lock-free or low-lock mechanism (e.g., specific Go channels or internal task queues)?
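For reference, the kind of low-lock design I am asking about might look like this sketch (my own illustration, not Dragonboat's actual queue): producers append under a very short critical section, while the consumer swaps the whole pending slice out in O(1), so proposers never wait on the slow apply/replay path.

```go
package main

import (
	"fmt"
	"sync"
)

// proposalQueue sketches a low-lock enqueue: add holds the mutex only
// long enough to append, and get swaps the entire pending batch out,
// decoupling proposers from the apply worker's progress.
type proposalQueue struct {
	mu      sync.Mutex
	pending []string
}

// add enqueues a proposal; the critical section is a single append.
func (q *proposalQueue) add(p string) {
	q.mu.Lock()
	q.pending = append(q.pending, p)
	q.mu.Unlock()
}

// get hands the whole pending batch to the consumer in O(1).
func (q *proposalQueue) get() []string {
	q.mu.Lock()
	batch := q.pending
	q.pending = nil
	q.mu.Unlock()
	return batch
}

func main() {
	q := &proposalQueue{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			q.add(fmt.Sprintf("proposal-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(len(q.get())) // 4
}
```

Under this design a Propose issued during replay is not blocked; it simply sits in the pending batch until the worker next calls get, which is the behavior I would like confirmed or corrected for Dragonboat.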