feat(cluster): wire multi-shard cross-shard communication for server-ng#3269
Open
hubcio wants to merge 1 commit into
Open
feat(cluster): wire multi-shard cross-shard communication for server-ng#3269hubcio wants to merge 1 commit into
hubcio wants to merge 1 commit into
Conversation
e0e289d to
84e79d3
Compare
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (38.61%) is below the target coverage (50.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## master #3269 +/- ##
=============================================
- Coverage 73.87% 62.99% -10.89%
Complexity 943 943
=============================================
Files 1193 1192 -1
Lines 108970 99041 -9929
Branches 85988 76076 -9912
=============================================
- Hits 80505 62393 -18112
- Misses 25733 33771 +8038
- Partials 2732 2877 +145
🚀 New features to boost your workflow:
|
core/server-ng bootstrapped exactly one shard with SHARD_ID=0 and senders=vec![sender] hardcoded; the multi-shard path was dead code. Cross-shard primitives copied from legacy core/server also did not fit core/shard's crossfire model (bounded MTx + try_send-or-drop, fd-transfer coordinator instead of SO_REUSEPORT). bootstrap() now spawns N OS threads from sharding.cpu_allocation, each pinned via nix::sched + hwlocality and running its own compio runtime + IggyMessageBus + IggyShard for the partitions hashed to that shard. Cross-thread shutdown rides an Arc<AtomicBool> polled by a per-shard watchdog, since the bus' Shutdown is !Send and cannot be triggered from the main thread directly. Partial shard-spawn failure and shard-thread panic now signal cluster-wide shutdown instead of hanging; the shutdown watchdog is detached from the bus drain. ShardFrame becomes a concrete enum (Consensus + Lifecycle); the R generic is lifted off IggyShard. Named routers (route_metadata / route_partition / route_consensus_control) replace the duplicated MessageBag match blocks, and a debug_assert at pump entry catches receiver-thread mis-binding that the ctor's assert_sender_ordering cannot see. ShardMetrics records frame_drops_total (counter, variant+reason labels), bumped at every inter-shard try_send rejection; without it, drop-and-recover under VSR retransmit is operationally indistinguishable from a livelock. The counter is atomic, so it is safe to bump from !Send compio reactor contexts. The legacy shard-mapping broadcast subsystem (periodic snapshot refresh task, three-state MappingSlot table, ReplicaMappingUpdate / ReplicaMappingClear frames) is retired entirely. Cross-shard replica routing now flows through the cluster-shared ReplicaOwnerTable: the owning shard's installer stamps its slot on a successful registry insert and CAS-clears it on disconnect, so every bus' send_to_replica slow path reads authoritative state with no broadcast or reconcile loop. Builder accepts the coordinator at ctor; IggyShard stays immutable post-construction. message_bus forward-fn types widen to carry replica/client id, and send_to_replica routes via the shared owner table so non-zero shards reconcile against shared state rather than shard 0's private bus. WAL recovery is serialized across shards: non-zero shards open the WAL read-only, reject mutating ops (drain, set_snapshot_op) on a read-only storage, route an invalid WAL header to truncate_or_fail, and close the read-only fd once recovery completes. Storage milestones (durable PartitionJournal, durable (view, commit_op) watermark) and SDK (client_id, request_id) durability across reconnect remain out of scope; tracked separately.
84e79d3 to
92db4e4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
core/server-ng bootstrapped exactly one shard with SHARD_ID=0 and
senders=vec![sender] hardcoded; the multi-shard path was dead code.
Cross-shard primitives copied from legacy core/server also did not
fit core/shard's crossfire model (bounded MTx + try_send-or-drop,
fd-transfer coordinator instead of SO_REUSEPORT).
bootstrap() now spawns N OS threads from sharding.cpu_allocation,
each pinned via nix::sched + hwlocality and running its own compio
runtime + IggyMessageBus + IggyShard for the partitions hashed to
that shard. Cross-thread shutdown rides an Arc polled by
a per-shard watchdog, since the bus' Shutdown is !Send and cannot be
triggered from the main thread directly. Partial shard-spawn failure
and shard-thread panic now signal cluster-wide shutdown instead of
hanging; the shutdown watchdog is detached from the bus drain.
ShardFrame becomes a concrete enum (Consensus + Lifecycle); the R
generic is lifted off IggyShard. Named routers (route_metadata /
route_partition / route_consensus_control) replace the duplicated
MessageBag match blocks, and a debug_assert at pump entry catches
receiver-thread mis-binding that the ctor's assert_sender_ordering
cannot see.
ShardMetrics records frame_drops_total (counter, variant+reason
labels), bumped at every inter-shard try_send rejection; without it,
drop-and-recover under VSR retransmit is operationally
indistinguishable from a livelock. The counter is atomic, so it is
safe to bump from !Send compio reactor contexts.
The legacy shard-mapping broadcast subsystem (periodic snapshot
refresh task, three-state MappingSlot table, ReplicaMappingUpdate /
ReplicaMappingClear frames) is retired entirely. Cross-shard replica
routing now flows through the cluster-shared ReplicaOwnerTable: the
owning shard's installer stamps its slot on a successful registry
insert and CAS-clears it on disconnect, so every bus' send_to_replica
slow path reads authoritative state with no broadcast or reconcile
loop. Builder accepts the coordinator at ctor; IggyShard stays
immutable post-construction.
message_bus forward-fn types widen to carry replica/client id, and
send_to_replica routes via the shared owner table so non-zero shards
reconcile against shared state rather than shard 0's private bus.
WAL recovery is serialized across shards: non-zero shards open the
WAL read-only, reject mutating ops (drain, set_snapshot_op) on a
read-only storage, route an invalid WAL header to truncate_or_fail,
and close the read-only fd once recovery completes.
Storage milestones (durable PartitionJournal, durable
(view, commit_op) watermark) and SDK (client_id, request_id)
durability across reconnect remain out of scope; tracked separately.