
fix(consensus): get votes to actually send and get retained #1196

Open
InKryption wants to merge 9 commits into main from ink/landing-votes

Conversation

@InKryption
Contributor

This PR does a few things:

  • Passes the tpu_vote_addr through contact info and other relevant places, so that consensus/replay can send votes to the correct address.
  • Passes the gossip table into consensus, so it can look up contact info for leaders and send votes through it.
  • Uses a single underlying allocator, to avoid an invalid free between two allocators caused by gossip's local message broker (there was no good reason for gossip to use two allocators anyway); see the first sketch after this list.
  • Simplifies the vote packet-sending logic so that sendmmsg is never accidentally handed iovecs over empty buffers; see the second sketch after this list.
  • Sends our contact info to gossip every 500ms instead of every 15s, so that it is refreshed often enough to stay alive in the table.
  • Uses a single latest_validator_votes; one instance was left over from before the overarching SlotData struct was introduced, which left two inconsistent states tracking the same thing differently.
  • Fixes an integer overflow in the RpcContextService that occurs on epoch 0 (which can arise when starting from genesis or testing on a local cluster); see the third sketch after this list.
  • Replaces incorrect usage of std.posix.errno with std.os.linux.E.init.
  • A few other small fixes, like making a var into a const and removing an empty IES.
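
A minimal sketch (the first referenced above) of why unifying the allocator matters: memory must be freed by the allocator that allocated it, and mixing two allocators around the message broker is what produced the invalid free. The test below uses std.testing.allocator and made-up names; it is not the actual gossip code.

```zig
const std = @import("std");

test "free with the allocator that made the allocation" {
    const gpa = std.testing.allocator;

    // Correct: the slice is returned to the allocator that produced it.
    const buf = try gpa.alloc(u8, 64);
    defer gpa.free(buf);

    // Incorrect (do NOT do this): handing `buf` to a *different* allocator,
    // e.g. one owned by a local message broker, corrupts that allocator's
    // bookkeeping and surfaces as an invalid free.
    // broker_allocator.free(buf);
}
```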
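
A rough sketch (the second referenced above) of the simplified send path: empty payloads are filtered out before the batch is built, so the code that ultimately calls sendmmsg never sees a zero-length buffer. `Packet` and `sendBatch` here are stand-ins, not the actual Sig types.

```zig
const std = @import("std");

/// Stand-in for the real vote packet type.
const Packet = struct {
    buf: [1232]u8 = undefined,
    len: usize = 0,

    fn payload(self: *const Packet) []const u8 {
        return self.buf[0..self.len];
    }
};

/// Collect only the non-empty payloads before batching, so the code that
/// eventually calls sendmmsg never builds an iovec around an empty buffer.
/// `sendBatch` stands in for whatever wrapper performs the actual syscall.
fn sendNonEmpty(
    packets: []const Packet,
    sendBatch: *const fn ([]const []const u8) anyerror!void,
) !void {
    var bufs: [64][]const u8 = undefined;
    var n: usize = 0;
    for (packets) |*p| {
        const bytes = p.payload();
        if (bytes.len == 0) continue; // skip empty payloads entirely
        if (n == bufs.len) break; // keep the sketch simple: cap the batch
        bufs[n] = bytes;
        n += 1;
    }
    if (n == 0) return; // nothing to send: never submit an empty batch
    try sendBatch(bufs[0..n]);
}
```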
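
And a tiny illustration (the third referenced above) of the epoch-0 overflow class of bug: the real RpcContextService logic differs, but the failure mode is the classic unsigned `epoch - 1` underflow, which saturating subtraction avoids.

```zig
const std = @import("std");

/// Illustrative only: `epoch - 1` underflows a u64 when epoch == 0, which is
/// exactly the situation when running from genesis on a local cluster.
/// Saturating subtraction (`-|`) clamps at zero instead of overflowing.
fn previousEpoch(epoch: u64) u64 {
    return epoch -| 1;
}

test "epoch 0 does not underflow" {
    try std.testing.expectEqual(@as(u64, 0), previousEpoch(0));
    try std.testing.expectEqual(@as(u64, 41), previousEpoch(42));
}
```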

github-project-automation bot moved this to 🏗 In progress in Sig on Feb 2, 2026
InKryption force-pushed the ink/landing-votes branch 3 times, most recently from 09b7b44 to 6e2dea7 on February 2, 2026 at 20:30
InKryption marked this pull request as ready for review on February 2, 2026 at 20:46

codecov bot commented Feb 2, 2026

Codecov Report

❌ Patch coverage is 86.11111% with 5 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/gossip/service.zig | 69.23% | 4 Missing ⚠️ |
| src/replay/service.zig | 0.00% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| src/accountsdb/snapshot/download.zig | 75.34% <ø> (+0.06%) ⬆️ |
| src/core/transaction.zig | 93.48% <100.00%> (+0.08%) ⬆️ |
| src/gossip/data.zig | 76.60% <100.00%> (+0.07%) ⬆️ |
| src/net/socket_utils.zig | 93.18% <100.00%> (ø) |
| src/replay/consensus/cluster_sync.zig | 91.57% <100.00%> (+0.01%) ⬆️ |
| src/replay/consensus/core.zig | 94.87% <ø> (-0.11%) ⬇️ |
| src/replay/consensus/process_result.zig | 93.35% <100.00%> (ø) |
| src/shred_network/service.zig | 90.38% <ø> (ø) |
| src/replay/service.zig | 93.44% <0.00%> (-0.19%) ⬇️ |
| src/gossip/service.zig | 86.75% <69.23%> (-0.14%) ⬇️ |

... and 2 files with indirect coverage changes

InKryption force-pushed the ink/landing-votes branch 6 times, most recently from 6798730 to 68b68c3 on February 4, 2026 at 18:55
src/adapter.zig Outdated
};
errdefer gpa.free(ls2);
try self.state.put(this_epoch, ctx2);
} else |e| if (this_epoch == this_epoch) {
Contributor

always true

push_msg_queue.queue.deinit();
}

// self.broker.deinit();
Contributor

is this intentional?

Contributor Author

No, nice catch. At some point early on I was debugging gossip and wanted to fix some leaks for local testing code, but I believe the specific way this deinit works caused issues in the actual validator on exit, so I commented it out. I'll probably just delete it, unless you think it's worth handling deallocation here for whatever reason.

Comment on lines +151 to +153
// TODO: This method can result in us signing the same message for the same keypair more than once.
// Should investigate whether a hashmap to amortize the signing is faster. The current only usecase
// is when we send votes, and that *does* sign the same message twice.
Contributor

It's rare to have more than 2-3 signers in a transaction, so I doubt a hashmap would be favorable over a linear search.

Is "This method can result in us signing the same message for the same keypair more than once" supposed to be part of the TODO, or is it a separate note?

Contributor Author

Fair point. I left this in as part of @Rexicon226's original suggestion (pinged in case he has any specific views or intel on this).

The redundant signing should probably be its own comment.
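
For reference, a hypothetical sketch of the linear-scan approach discussed above: with only 2-3 signers per transaction, scanning the keypairs already handled in the same call is almost certainly cheaper than any hashmap bookkeeping. The function and its signature are made up for illustration; this is not the code from this PR.

```zig
const std = @import("std");
const Ed25519 = std.crypto.sign.Ed25519;

/// Signs `message` once per distinct keypair, writing into `out`, and returns
/// the number of signatures produced. Purely illustrative.
fn signUniqueKeypairs(
    message: []const u8,
    keypairs: []const Ed25519.KeyPair,
    out: []Ed25519.Signature,
) !usize {
    var n: usize = 0;
    outer: for (keypairs, 0..) |kp, i| {
        const kp_pub = kp.public_key.toBytes();
        // Linear scan over the keypairs handled earlier in this call; with
        // 2-3 signers this beats building a map just to amortize signing.
        for (keypairs[0..i]) |prev| {
            const prev_pub = prev.public_key.toBytes();
            if (std.mem.eql(u8, &prev_pub, &kp_pub)) continue :outer;
        }
        out[n] = try kp.sign(message, null);
        n += 1;
    }
    return n;
}
```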

github-project-automation bot moved this from 🏗 In progress to 👀 In review in Sig on Feb 10, 2026
Commit messages:

  • also replace `catch unreachable` with `expectEqual` in unit test.
  • the old instance was left over, and using it is a bug.
  • Add test coverage for this to satisfy codecov.
  • Adjust unit tests to insert a hash into the blockhash, instead of setting the slot's bank hash.
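
Illustrative only, relating to the `catch unreachable` → `expectEqual` commit above (not the actual test that was changed): letting std.testing report the mismatch gives a useful diagnostic instead of a bare panic.

```zig
const std = @import("std");

fn mightFail(ok: bool) !u32 {
    return if (ok) 42 else error.Unexpected;
}

test "surface failures through std.testing" {
    // Old pattern: if this ever fails, the test panics with no context.
    _ = mightFail(true) catch unreachable;

    // New pattern: a mismatch or error is reported with expected/actual values.
    try std.testing.expectEqual(@as(u32, 42), try mightFail(true));
}
```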

Labels: None yet

Projects: Status: 👀 In review

2 participants