Refactor `manager.rs`: Extract Epoch Vote Collector and Harden Logic

# Problem

`node::manager.rs` should be broken up into smaller mods. The `spawn_epoch_vote_collector` is one example of code that would benefit from separating into a separate file.

## Summary

`crates/node/src/manager.rs` is nearly 1900 lines and mixes epoch vote collection logic with the broader `EpochManager` orchestration. The epoch vote collector should be extracted into its own module, and several logic improvements should be evaluated.

## Refactoring

- Move `spawn_epoch_vote_collector` and `epoch_vote_collector_tests` into a dedicated module (e.g., `crates/node/src/epoch_vote_collector.rs`) to reduce LOC in `manager.rs` and improve logical organization.

## Epoch Vote Collector Logic Improvements

The following items should be evaluated as part of or after the extraction:

### 1. Retrieving epoch record by number

The `consensus_db.get_epoch_by_number` check that skips collection when a cert already exists uses a `tokio::select` with `epoch_rx.changed() => continue`. Confirm this correctly moves to the next epoch record without skipping any watch updates.

### 2. Vote digest validation

Ensure that epoch vote signatures and digests are validated at the network handler layer before they reach the collector. The collector currently trusts that signatures are already verified — confirm this assumption holds and document where validation occurs.

### 3. Alt-record vote tracking — potential attack vector

Currently, votes for alternative epoch records are tracked in `alt_recs` but the voter's key is **not** removed from `committee_keys`. This means a malicious actor could send multiple votes for different alt hashes and inflate the count, potentially triggering the `votes + 1 >= quorum` error branch and breaking out of the collection loop early. Committee members should be removed from the tracking set after casting an alt vote to prevent double-counting.

### 4. Drain `vote_rx` in the initial `tokio::select` loop

While unlikely given ~24hr epoch boundaries, votes could theoretically accumulate in the mpsc buffer while the collector is waiting for an epoch record. Adding `vote_rx` as a branch in the first `tokio::select` loop would drain and buffer early votes, preventing potential overflow under adversarial conditions.

### 5. Handle ejected validators that signed an epoch record (#500)

Address the edge case described in #500 where ejected validators may have already signed an epoch record.

### 6. Replace timeout with interval for vote collection

The current approach uses a 5-second timeout that shortens to 1 second after quorum, with a hard cap of 12 timeouts (~60 seconds total). Consider replacing this with a `tokio::time::interval` that allows votes to arrive for the duration of the epoch. This would lower the chances of a failed epoch record while still providing a mechanism to finalize once quorum is reached and re-gossip records.

### 7. Evaluate fallback approach for ensuring epoch records are signed

When quorum is not reached locally, the collector falls back to requesting epoch certs from peers (up to 3 attempts). Evaluate whether this fallback is sufficient or if additional strategies are needed (e.g., periodic retries, longer collection windows, or broadcasting a request to multiple peers simultaneously).

follow up from #534 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor `manager.rs`: Extract Epoch Vote Collector and Harden Logic #547

Problem

Summary

Refactoring

Epoch Vote Collector Logic Improvements

1. Retrieving epoch record by number

2. Vote digest validation

3. Alt-record vote tracking — potential attack vector

4. Drain `vote_rx` in the initial `tokio::select` loop

5. Handle ejected validators that signed an epoch record (#500)

6. Replace timeout with interval for vote collection

7. Evaluate fallback approach for ensuring epoch records are signed

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Refactor manager.rs: Extract Epoch Vote Collector and Harden Logic #547

Description

Problem

Summary

Refactoring

Epoch Vote Collector Logic Improvements

1. Retrieving epoch record by number

2. Vote digest validation

3. Alt-record vote tracking — potential attack vector

4. Drain vote_rx in the initial tokio::select loop

5. Handle ejected validators that signed an epoch record (#500)

6. Replace timeout with interval for vote collection

7. Evaluate fallback approach for ensuring epoch records are signed

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Refactor `manager.rs`: Extract Epoch Vote Collector and Harden Logic #547

4. Drain `vote_rx` in the initial `tokio::select` loop