Skip to content

Bf/epochs#4096

Draft
bfish713 wants to merge 6 commits intomainfrom
bf/epochs
Draft

Bf/epochs#4096
bfish713 wants to merge 6 commits intomainfrom
bf/epochs

Conversation

@bfish713
Copy link
Copy Markdown
Contributor

Closes #<ISSUE_NUMBER>

This PR:

This PR does not:

Key places to review:

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements epoch transition logic and refactors proposal handling. Key changes include the introduction of an EpochManager, the addition of DRB result validation, and the replacement of QuorumProposal2 with a new Proposal struct. The Consensus implementation is now asynchronous and includes epoch-aware staking checks. Feedback identifies a critical bug in the Proposal to Leaf2 conversion where the next_epoch_justify_qc field is omitted. Additionally, the reviewer recommends addressing TODOs regarding message filtering for old views and the explicit handling of DrbResult view numbers to ensure protocol robustness.

view_number: self.view_number,
epoch: Some(self.epoch),
justify_qc: self.justify_qc,
next_epoch_justify_qc: None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

In the Into<Leaf2<T>> implementation for Proposal<T>, the next_epoch_justify_qc field is hardcoded to None. This means that if a Proposal contains a next_epoch_justify_qc, it will be lost when converting to a Leaf2. Since the Leaf2 is what gets committed to, this is a critical bug that could lead to consensus failures or state divergence, as nodes will not be committing to the full proposal data.

Suggested change
next_epoch_justify_qc: None,
next_epoch_justify_qc: self.next_epoch_justify_qc,

Comment on lines +166 to +168
// TODO: This isn't correct, I think we need to process vote2 for views
// that haven't timed out, but are before the timeout view
if !matches!(input, ConsensusInput::DrbResult(_, _)) && view <= self.timeout_view {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The TODO comment here highlights a potential correctness issue. While DrbResult is now exempt from the old view check, other messages like Vote2 might also need to be processed for views that are before timeout_view but not yet timed out. Ignoring these could prevent certificate formation and stall progress. This should be addressed to ensure robust consensus.

Comment on lines +1014 to +1015
// TODO: where else can this cause problems?
ConsensusInput::DrbResult(..) => ViewNumber::genesis(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The TODO here is valid. While returning ViewNumber::genesis() for DrbResult works for the input filtering logic in handle_input, it's a bit of a hack. If view_number() is ever used for DrbResult in other contexts, this could lead to unexpected behavior. It would be safer to handle DrbResult more explicitly, perhaps by matching on ConsensusInput before calling view_number(), or by restructuring the input handling to not require a view_number for all variants.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant