
feat(rpc): add tn_syncing, tn_epochInfo, tn_currentCommittee methods#539

Open
TanguyDeTaxis wants to merge 8 commits into Telcoin-Association:main from TanguyDeTaxis:530-tn-rpc-methods

Conversation

@TanguyDeTaxis (Contributor) commented Feb 4, 2026

Summary

  • Add tn_syncing, tn_epochInfo, and tn_currentCommittee RPC methods to the tn namespace
  • Add current_committee watch channel to ConsensusBus so RPC can access committee info
  • Fix tn_syncing to check both epoch and execution progress before declaring a node synced (addresses review feedback)
  • Add currentEpoch / highestEpoch fields to the sync progress response for better validator visibility
  • Add comprehensive tests: 7 serialization tests + 11 sync logic tests
  • Fix JSON-RPC error codes to use the standard server error range (-32000 to -32099)
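
The error-code fix can be sketched as follows. The variant names below are hypothetical (the real ones live in crates/execution/tn-rpc/src/error.rs); the range is the one the JSON-RPC 2.0 specification reserves for implementation-defined server errors:

```rust
/// Illustrative tn-namespace error kinds (names are assumptions, not the
/// actual variants in error.rs).
#[derive(Debug)]
enum TnRpcError {
    NotSynced,
    CommitteeUnavailable,
}

impl TnRpcError {
    /// Map each error to a code inside the JSON-RPC 2.0 server-error
    /// range (-32099 to -32000), as the fix in this PR requires.
    fn code(&self) -> i32 {
        match self {
            TnRpcError::NotSynced => -32000,
            TnRpcError::CommitteeUnavailable => -32001,
        }
    }
}
```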

tn_syncing behavior

Returns false when synced, or a progress object when syncing:

{
  "startingBlock": 100,
  "currentBlock": 5000,
  "highestBlock": 10000,
  "currentEpoch": 3,
  "highestEpoch": 5
}

A node is considered synced when:

  1. It is an active CVV (CvvActive), or
  2. Its local epoch >= network epoch and execution has caught up to the latest known consensus block

Previously, condition 2 only checked the epoch — a node could report as synced while execution was still lagging behind consensus. This is now fixed.
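
A minimal sketch of this two-condition decision, with illustrative names (the real implementation lives in the node crate and carries progress data on the syncing variant):

```rust
/// Illustrative sync states; the real type also carries a progress object.
#[derive(Debug, PartialEq)]
enum SyncStatus {
    Synced,
    Syncing,
}

/// Sketch of the two-condition rule above; all parameter names are
/// hypothetical, not the actual function signature.
fn sync_status(
    is_active_cvv: bool,
    local_epoch: u64,
    network_epoch: u64,
    executed_block: u64,
    latest_consensus_block: u64,
) -> SyncStatus {
    // Condition 1: an active CVV is synced by definition.
    if is_active_cvv {
        return SyncStatus::Synced;
    }
    // Condition 2: epoch caught up AND execution caught up to the latest
    // known consensus block (the previously missing check).
    if local_epoch >= network_epoch && executed_block >= latest_consensus_block {
        return SyncStatus::Synced;
    }
    SyncStatus::Syncing
}
```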

Files changed

| File | Change |
| --- | --- |
| crates/consensus/primary/src/consensus_bus.rs | Add current_committee watch channel |
| crates/node/src/manager.rs | Populate committee channel on epoch creation |
| crates/execution/tn-rpc/src/rpc_ext.rs | New RPC types, methods, and serialization tests |
| crates/execution/tn-rpc/src/lib.rs | Extend EngineToPrimary trait with new methods |
| crates/execution/tn-rpc/src/error.rs | Fix JSON-RPC error codes |
| crates/node/src/lib.rs | Implement sync logic fix + add 11 logic tests |
| crates/execution/faucet/tests/it/faucet.rs | Update mock to implement new trait methods |

tn_syncing previously returned Synced based only on epoch comparison, ignoring execution lag. A node could have consensus data for the current epoch but still be behind on execution. Now requires both epoch AND execution to be caught up before declaring synced.

Also adds currentEpoch/highestEpoch fields to SyncProgress and comprehensive tests for serialization and sync_status logic.

Add a watch channel to track the current committee in ConsensusBus, enabling RPC and other components to access committee info without going through the primary node. Updated epoch manager to populate the channel when creating a committee.

Extend the tn RPC namespace with sync status, epoch info, and committee endpoints. Add SyncStatus, SyncProgress, EpochInfo, CommitteeInfo, and ValidatorInfo types. Fix JSON-RPC error codes to use the standard server error range (-32000 to -32099). Update the EngineToPrimary trait with the new methods and adapt the faucet test mock accordingly.
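
The committee channel works roughly as follows. The sketch below uses Arc<RwLock<…>> as a stand-in for a watch channel so it stays dependency-free (ConsensusBus most likely uses tokio::sync::watch in practice); all names are illustrative:

```rust
use std::sync::{Arc, RwLock};

/// Illustrative committee type; the real one carries validator data.
#[derive(Clone, Debug, PartialEq)]
struct Committee {
    epoch: u64,
}

/// Stand-in for the ConsensusBus channel (assumption: the real bus uses
/// tokio::sync::watch, which has the same send_replace/borrow shape).
#[derive(Clone, Default)]
struct ConsensusBus {
    current_committee: Arc<RwLock<Option<Committee>>>,
}

impl ConsensusBus {
    /// Publisher side: called by the epoch manager when a committee is created.
    fn send_replace(&self, committee: Committee) {
        *self.current_committee.write().unwrap() = Some(committee);
    }

    /// Subscriber side: RPC reads the latest committee without going
    /// through the primary node.
    fn borrow(&self) -> Option<Committee> {
        self.current_committee.read().unwrap().clone()
    }
}
```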
@TanguyDeTaxis marked this pull request as ready for review February 5, 2026 22:14
@grantkee self-requested a review February 11, 2026 20:41
@grantkee linked an issue Feb 11, 2026 that may be closed by this pull request
@grantkee (Contributor) left a comment

@TanguyDeTaxis Thank you for taking this on. It's going in the right direction, but I think there is more to consider. I'm leaving feedback for you, but I am still reviewing and thinking about some of the implementation details. Your work reflects issue #530; however, I'm realizing there are still outstanding design decisions I need to figure out. I will follow up soon with more comments.

Contributor

nit: please move everything after L47 to execution/tn-rpc/src/rpc_ext.rs - this was missed in a previous refactor and aligns better with the separation of domain logic

Contributor Author

Thanks, agreed on separation. I moved this out of node/lib.rs into node/engine_to_primary_rpc.rs in f586ad9 so lib.rs is only wiring now. I kept it in node for now because it depends on node-side consensus bus/reth wiring, but I can move it into execution/tn-rpc if you want that boundary in this PR.

return SyncStatus::Synced;
}

// Block-level progress: compare executed blocks vs known consensus blocks.
Contributor

I pointed out a similar problem in PR feedback for #542. This is comparing execution state with consensus state which is no longer guaranteed to be 1:1+ after #531

After addressing #542, I think the same approach will work here

Contributor Author

Agreed. In e98b524 I removed the 1:1 execution<->consensus assumption. Sync now tracks consensus progress and execution progress independently (round-based), and returns Synced only when epoch + consensus + execution are all caught up.
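
Under stated assumptions, the reworked decision could look like the sketch below: consensus and execution progress are tracked independently, with no 1:1 block mapping assumed. All names are hypothetical, not the actual e98b524 code:

```rust
/// Sketch of the reworked check: epoch, consensus rounds, and execution
/// progress must ALL be caught up before the node reports synced.
/// Parameter names are illustrative assumptions.
fn is_synced(
    local_epoch: u64,
    network_epoch: u64,
    local_consensus_round: u64,
    highest_known_round: u64,
    execution_caught_up: bool,
) -> bool {
    local_epoch >= network_epoch
        && local_consensus_round >= highest_known_round
        && execution_caught_up
}
```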

fn current_epoch_info(&self) -> Option<EpochInfo> {
    let (_, record) = self.db.last_record::<EpochRecords>()?;
    Some(record.into())
}
Contributor

Epoch records are only generated after the epoch concludes, so this logic needs to be adjusted

Contributor Author

Updated in e98b524. network_epoch is now derived from the highest downloaded epoch record + 1, with explicit handling for the epoch-0 dummy record.

    db: DB,
    /// The block number when sync started, captured on first syncing status check.
    /// Derived from the last EpochRecord's parent_state as specified in issue #530.
    sync_starting_block: OnceLock<u64>,
Contributor

I made RethEnv thread-safe and cheap to clone in #549 so we can use it in the RPC struct

pub struct EngineToPrimaryRpc<DB> {
    /// Access to the execution database.
    reth_env: RethEnv,
    //...snip
}

This will help address other feedback

Contributor Author

Done, I wired RethEnv directly into EngineToPrimaryRpc (new signature includes reth_env), and use it in sync and epoch RPC paths.

Comment on lines +125 to +127
// Epoch-based sync decision: compare network epoch (from committee) vs local epoch (from
// DB). This follows the issue #530 spec: "A node is considered synced when its
// execution state is within the current consensus epoch."
Contributor

The content is good, and I appreciate well-commented code. That being said, please remove the reference to the "issue #530"

The comment might be more accurate by saying:

...compare network epoch (from last epoch record) vs local epoch (from execution DB). A node is considered synced when its execution state is within the current consensus epoch

Contributor Author

Done. I removed the issue #530 wording and updated the comments to describe the data sources directly (network epoch from epoch records, local epoch from execution state).

Comment on lines +1458 to +1459
self.consensus_bus.current_committee().send_replace(Some(committee.clone()));

Contributor

This committee is the local committee - loaded from the latest execution state per epoch. I think the consensus_bus.current_committee().send_replace() should happen when a new epoch record is verified (~manger.rs:1238)

Contributor Author

Good call. I now publish current_committee after persisting a new epoch record (manager.rs around save_epoch_record path), and also publish at committee load so RPC reflects the live epoch promptly.


// Synced only when both epoch AND execution are caught up.
// A node can have consensus data for the current epoch but execution may still
// lag behind — we must wait for execution to complete before declaring synced.
Contributor

Yes, I agree with this comment (which is more specific than the "execution state is within the current consensus epoch" wording from the issue). The problem is that even with ~6h epochs, a node might not be synced for some time.

Contributor Author

Agreed, and this is now reflected in e98b524: even with long epochs we do not report Synced unless execution progress has also caught up.

return SyncStatus::Synced;
}

// Fallback when committee info is unavailable (e.g. early startup):
Contributor

Another edge case to consider is the 0 epoch when there is only a dummy epoch record until the first epoch closes

Contributor Author

Handled in e98b524: I added the epoch-0 dummy-record guard so network_epoch does not advance incorrectly before the first real epoch record.

Comment on lines +62 to +73
pub struct SyncProgress {
    /// The block number the node started syncing from.
    pub starting_block: u64,
    /// The current block number the node has reached.
    pub current_block: u64,
    /// The highest known block number.
    pub highest_block: u64,
    /// The node's current local epoch.
    pub current_epoch: u64,
    /// The network's current epoch.
    pub highest_epoch: u64,
}
Contributor

The starting_block, current_block, highest_block should all mention and correlate to consensus blocks (aka - ConsensusHeader). Please add another field to this struct that includes execution_block_height or something along those lines that uses RethEnv::canonical_tip() number

Contributor Author

Implemented. I clarified these fields as consensus block metrics and added execution_block_height from reth canonical tip, with serialization/deserialization test updates.
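
The extended struct could look like the sketch below (serde derives and camelCase renames omitted; the doc comments restate the clarified semantics, and the canonical-tip source is the one suggested in the review):

```rust
/// Sketch of the clarified progress type; field names are assumptions
/// based on the review discussion, not the merged code.
pub struct SyncProgress {
    /// Consensus block (ConsensusHeader number) the node started syncing from.
    pub starting_block: u64,
    /// Latest consensus block the node has processed.
    pub current_block: u64,
    /// Highest known consensus block.
    pub highest_block: u64,
    /// The node's current local epoch.
    pub current_epoch: u64,
    /// The network's current epoch.
    pub highest_epoch: u64,
    /// Execution progress, taken from the canonical tip
    /// (per the review suggestion, RethEnv::canonical_tip()).
    pub execution_block_height: u64,
}
```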

// DB). This follows the issue #530 spec: "A node is considered synced when its
// execution state is within the current consensus epoch."
let network_epoch =
    self.consensus_bus.current_committee().borrow().as_ref().map(|c| c.epoch());
Contributor

The network_epoch is most likely the highest downloaded epoch record (from DB) + 1 (assuming the node has successfully fetched all existing epoch records as outlined in SYNC.md)

local_epoch = epoch from the node's execution state (current_committee epoch, or canonical tip)

Contributor Author

Implemented this approach in e98b524: network_epoch uses highest downloaded epoch record + 1, while local_epoch comes from execution state (canonical tip epoch), with committee fallback.
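
The derivation, including the epoch-0 dummy-record guard discussed above, can be sketched as follows (function and parameter names are hypothetical):

```rust
/// Sketch: derive the network epoch from the highest downloaded epoch
/// record. Epoch records are written when an epoch closes, so the network
/// is one epoch past the highest real record; the epoch-0 dummy record
/// must not advance the result before the first epoch actually closes.
fn network_epoch(highest_record_epoch: Option<u64>, is_dummy: bool) -> u64 {
    match highest_record_epoch {
        // No records downloaded yet: still in epoch 0.
        None => 0,
        // Only the epoch-0 dummy record exists: do not advance.
        Some(0) if is_dummy => 0,
        // A real record for epoch e means the network is in e + 1.
        Some(e) => e + 1,
    }
}
```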

@TanguyDeTaxis (Contributor Author) commented Feb 19, 2026

Thanks for the detailed feedback, @grantkee.

I pushed a follow-up series tonight to address the sync/epoch/committee points and improve maintainability:

  • e98b524: reworked tn_syncing to align consensus + execution progress (including consensus round tracking), and added executionBlockHeight in SyncProgress.
  • 90e73d2: tn_epochInfo now reflects active execution epoch data (instead of relying only on the latest persisted epoch record), and the active committee is now published earlier so RPC reflects live epoch state.
  • f586ad9: extracted EngineToPrimaryRpc from node::lib into a dedicated module for clarity and easier iteration.

I also added/updated tests around sync status and active-epoch behavior. Happy to adjust further if you want a different interpretation for specific edge cases.



Development

Successfully merging this pull request may close these issues.

feat: tn_ RPC Namespace Methods
