Skip to content

fix: chainsync responder#729

Open
etorreborre wants to merge 4 commits intomainfrom
etorreborre/fix/chainsync-responder
Open

fix: chainsync responder#729
etorreborre wants to merge 4 commits intomainfrom
etorreborre/fix/chainsync-responder

Conversation

@etorreborre
Copy link
Contributor

@etorreborre etorreborre commented Mar 19, 2026

WIP!

Summary by CodeRabbit

  • New Features

    • Added chain-fragment API and best-chain-intersection to compute rollback/forward sequences.
    • Exposed node wakeup control helpers for tests.
    • Added accessors to retrieve generated headers and per-peer actions.
  • Refactor

    • Reworked chain-sync responder to a plan-then-emit model with queued rollback/forward actions.
    • Introduced an effect-based path to obtain chain fragments.
  • Tests

    • Improved simulation draining (quiescence fixed-point, transport-only checks) and block storage during roll-forward.
    • Added unit coverage for best-chain-intersection.

Signed-off-by: Eric Torreborre <etorreborre@yahoo.com>
Signed-off-by: Eric Torreborre <etorreborre@yahoo.com>
Signed-off-by: Eric Torreborre <etorreborre@yahoo.com>
@etorreborre etorreborre self-assigned this Mar 19, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds chain navigation APIs (ChainFragment, chain_fragment, best_chain_intersection) to the consensus store, wires a plan-then-emit flow into the ChainSync responder (pending rollback/forwards), exposes a ChainFragment external effect, and updates simulation/test helpers for wakeups, era-history block storage, and quiescence draining.

Changes

Cohort / File(s) Summary
Consensus store API
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
Add public ChainFragment (rollback_to: Option, forwards: Vec), ReadOnlyChainStore methods chain_fragment(from,to) and best_chain_intersection(points), forwarder impl for Box<dyn ChainStore<H>>, and imports BTreeMap/BTreeSet.
Chainsync responder
crates/amaru-protocols/src/chainsync/responder.rs
Switch to plan-then-emit: add pending_rollback: Option<Point> and pending_forwards: VecDeque<Point>, populate them via store.chain_fragment(...), centralize emission in next_action(), and use best_chain_intersection() for intersections. Removes prior on-the-fly next-header scanning.
Store effects
crates/amaru-protocols/src/store_effects.rs
Introduce ChainFragmentEffect (ExternalEffect / API / Sync) and implement ReadOnlyChainStore<BlockHeader>::chain_fragment for Store<T> by delegating to the external effect; import/export ChainFragment.
Simulation: node helpers
crates/amaru/src/tests/node.rs
Add pub fn next_wakeup(&self) -> Option<Instant> and pub fn advance_to_wakeup(&mut self, at: Instant) -> bool to expose/control node wakeups.
Simulation: quiescence & drain
crates/amaru/src/tests/nodes.rs
Rework Nodes::drain() to iterate to a fixed-point: run runnable effects, when idle check whether only transport-only trace entries were appended, otherwise compute earliest global wakeup and advance nodes; add trace-stage classification helpers and per-node trace-length tracking.
Actions stage & wiring
crates/amaru/src/tests/setup.rs, simulation/amaru-sim/src/simulator/run_tests.rs
Thread EraHistory into the actions stage and store encoded blocks on RollForward; replace local per-peer helpers with GeneratedActions accessors (get_peer_actions, get_headers) and adjust upstream node configuration to use them.
GeneratedActions API
crates/amaru-consensus/src/headers_tree/data_generation/actions.rs
Add pub fn get_headers(&self) -> Vec<BlockHeader> and pub fn get_peer_actions(&self, peer: &Peer) -> Vec<Action>.
RocksDB test
crates/amaru-stores/src/rocksdb/consensus/mod.rs
Add unit test best_chain_intersection_returns_origin_when_requested asserting Point::Origin returned when requested.

Sequence Diagram

sequenceDiagram
    actor Protocol
    participant Responder
    participant Store
    participant Effect

    Note over Protocol,Effect: Plan-then-emit chainsync flow
    Protocol->>Responder: NewTip(to_point)
    Responder->>Store: chain_fragment(from_point, to_point)
    Store-->>Responder: ChainFragment { rollback_to?, forwards[...] }
    Responder->>Responder: enqueue pending_rollback / pending_forwards

    Protocol->>Responder: RequestNext
    activate Responder
    Responder->>Responder: next_action()
    alt pending_rollback exists
        Responder-->>Protocol: RollBackward(point)
        Responder->>Responder: clear pending_rollback, drop matching forwards
    else pending_forwards non-empty
        Responder->>Store: load header(forward_point)
        Store-->>Responder: Header
        Responder-->>Protocol: RollForward(Header)
    else pointer == upstream_tip
        Responder-->>Protocol: AwaitReply
    end
    deactivate Responder

    Protocol->>Responder: FindIntersect(points)
    Responder->>Store: best_chain_intersection(points)
    Store-->>Responder: Option<Point>
    Responder-->>Protocol: IntersectFound / IntersectNotFound
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • rkuhn
  • abailly

Poem

The responder plots a cheeky little trail,
Rollbacks queued, forwards set to sail.
The store sings maps, the sim wakes on cue,
Packets prance, headers line up true. 🎬🌿

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title is vague and generic, using only 'fix: chainsync responder' without conveying the specific nature or scope of changes made. Consider a more descriptive title that captures the main change, such as 'refactor: implement two-phase plan-then-emit approach for chainsync responder' or similar.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch etorreborre/fix/chainsync-responder

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (1)

287-290: Explicit forwarding for chain_fragment on the boxed trait object.

This ensures that if an underlying store has a custom optimized implementation, it gets used rather than falling back to the default trait impl. Smart move.

However, I notice best_chain_intersection doesn't have explicit forwarding here — it relies on the default implementation. This works because the default impl uses forwarded methods (get_best_chain_hash, load_header). But if a store ever provides an optimized best_chain_intersection, it won't be used through Box<dyn ChainStore<H>>. Worth considering adding explicit forwarding for consistency, though it's not blocking since the current behavior is correct.

🔧 Optional: Add explicit forwarding for best_chain_intersection
     fn chain_fragment(&self, from: Point, to: Point) -> Result<ChainFragment, StoreError> {
         self.as_ref().chain_fragment(from, to)
     }
+
+    fn best_chain_intersection(&self, points: &[Point]) -> Result<Option<Point>, StoreError> {
+        self.as_ref().best_chain_intersection(points)
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs` around lines 287 -
290, Add an explicit forwarding impl for best_chain_intersection on the boxed
trait object so any optimized override on the concrete store is used;
specifically, implement fn best_chain_intersection(&self, a: Point, b: Point) ->
Result<Option<Point>, StoreError> that simply delegates to
self.as_ref().best_chain_intersection(a, b) (analogous to the existing
chain_fragment forwarding) for the Box<dyn ChainStore<H>> impl so stores with
custom best_chain_intersection implementations are honored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@simulation/amaru-sim/src/simulator/run_tests.rs`:
- Line 25: Remove the unused IsHeader import from the use statement that
currently reads use amaru_kernel::{IsHeader, Peer}; — keep Peer (which is used)
and delete IsHeader so the import becomes use amaru_kernel::Peer; to satisfy the
linter and remove the dead import.

---

Nitpick comments:
In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs`:
- Around line 287-290: Add an explicit forwarding impl for
best_chain_intersection on the boxed trait object so any optimized override on
the concrete store is used; specifically, implement fn
best_chain_intersection(&self, a: Point, b: Point) -> Result<Option<Point>,
StoreError> that simply delegates to self.as_ref().best_chain_intersection(a, b)
(analogous to the existing chain_fragment forwarding) for the Box<dyn
ChainStore<H>> impl so stores with custom best_chain_intersection
implementations are honored.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 071c8b94-fb1b-4b68-bcec-712406a2ad89

📥 Commits

Reviewing files that changed from the base of the PR and between 8d7f5a5 and 26bae24.

📒 Files selected for processing (7)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-protocols/src/store_effects.rs
  • crates/amaru/src/tests/node.rs
  • crates/amaru/src/tests/nodes.rs
  • crates/amaru/src/tests/setup.rs
  • simulation/amaru-sim/src/simulator/run_tests.rs

@etorreborre etorreborre force-pushed the etorreborre/fix/chainsync-responder branch from 26bae24 to 4453c7b Compare March 20, 2026 11:28
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/amaru-protocols/src/chainsync/responder.rs (1)

117-122: ⚠️ Potential issue | 🔴 Critical

Rebuild the plan when IntersectFound moves the pointer.

Right now only NewTip refreshes pending_rollback / pending_forwards. After an intersect, self.pointer changes but the old plan survives — or stays empty on a fresh sync — so the next RequestNext can replay a stale suffix or drift into AwaitReply even though upstream is already ahead. That’s a proper protocol faceplant.

🛠️ Proposed fix
             ResponderResult::FindIntersect(points) => {
-                let action = intersect(points, &Store::new(eff.clone()), self.upstream)
+                let store = Store::new(eff.clone());
+                let action = intersect(points, &store, self.upstream)
                     .context("failed to find intersection")?;
                 if let ResponderAction::IntersectFound(point, _tip) = &action {
                     self.pointer = *point;
+                    let fragment = store
+                        .chain_fragment(self.pointer, self.upstream.point())
+                        .context("failed to plan fragment after intersection")?;
+                    self.pending_rollback = fragment.rollback_to;
+                    self.pending_forwards = VecDeque::from(fragment.forwards);
+                } else {
+                    self.pending_rollback = None;
+                    self.pending_forwards.clear();
                 }
                 Ok((Some(action), self))
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-protocols/src/chainsync/responder.rs` around lines 117 - 122,
When handling ResponderAction::IntersectFound you update self.pointer but do not
rebuild the plan, so pending_rollback/pending_forwards can remain stale; after
matching ResponderAction::IntersectFound(point, _tip) (the branch that currently
sets self.pointer) run the same plan-rebuild logic used in the NewTip handling
to recompute pending_rollback and pending_forwards (i.e., invoke the
plan-building routine used elsewhere in this module to derive the
forward/rollback suffix from the new pointer and self.upstream/Store), ensuring
RequestNext will use the refreshed plan and not replay a stale suffix or enter
AwaitReply incorrectly.
🧹 Nitpick comments (1)
crates/amaru/src/tests/nodes.rs (1)

238-243: Consider extracting transport prefixes to a constant for discoverability.

The logic here is sound - you're filtering out the "background noise" stages (keepalive, mux, reader, writer) so the drain can focus on the real action. It's like filtering out the HUD elements to see the actual gameplay, yeah?

One small thing though: these hardcoded prefixes could become a maintenance gotcha. If someone adds a new transport-layer stage down the track, they'd need to know to update this list.

🔧 Optional: Extract to a const for visibility
+/// Transport-layer stage prefixes that don't indicate meaningful simulation activity.
+/// Update this list when adding new transport/infrastructure stages.
+const TRANSPORT_STAGE_PREFIXES: &[&str] = &["keepalive", "mux", "reader", "writer"];
+
 fn trace_entry_is_transport_only(entry: &TraceEntry) -> bool {
     let Some(stage) = trace_entry_stage(entry) else {
         return true;
     };
-    matches!(stage_prefix(stage), "keepalive" | "mux" | "reader" | "writer")
+    TRANSPORT_STAGE_PREFIXES.contains(&stage_prefix(stage))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/tests/nodes.rs` around lines 238 - 243, Extract the
hardcoded transport stage prefixes used in trace_entry_is_transport_only into a
single reusable constant (e.g., TRANSPORT_STAGE_PREFIXES or
TRANSPORT_PREFIX_SET) so the list is discoverable and easy to update; replace
the inline "keepalive" | "mux" | "reader" | "writer" pattern with a check
against that constant (use a slice of &str or a HashSet and call contains) and
update trace_entry_is_transport_only (and any other uses of stage_prefix(stage)
matching these prefixes) to reference the new constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs`:
- Around line 126-148: best_chain_intersection currently ignores Point::Origin,
causing intersections to miss; update best_chain_intersection to treat
Point::Origin as a valid intersection when the store has a non-empty best chain:
before entering the loop check if points.contains(&Point::Origin) and if so call
self.load_header(&current) (or otherwise detect the chain is non-empty) and
return Ok(Some(Point::Origin)) only when a header exists; keep using the
existing helpers (get_best_chain_hash and load_header) and preserve the existing
error handling for header loading.

---

Outside diff comments:
In `@crates/amaru-protocols/src/chainsync/responder.rs`:
- Around line 117-122: When handling ResponderAction::IntersectFound you update
self.pointer but do not rebuild the plan, so pending_rollback/pending_forwards
can remain stale; after matching ResponderAction::IntersectFound(point, _tip)
(the branch that currently sets self.pointer) run the same plan-rebuild logic
used in the NewTip handling to recompute pending_rollback and pending_forwards
(i.e., invoke the plan-building routine used elsewhere in this module to derive
the forward/rollback suffix from the new pointer and self.upstream/Store),
ensuring RequestNext will use the refreshed plan and not replay a stale suffix
or enter AwaitReply incorrectly.

---

Nitpick comments:
In `@crates/amaru/src/tests/nodes.rs`:
- Around line 238-243: Extract the hardcoded transport stage prefixes used in
trace_entry_is_transport_only into a single reusable constant (e.g.,
TRANSPORT_STAGE_PREFIXES or TRANSPORT_PREFIX_SET) so the list is discoverable
and easy to update; replace the inline "keepalive" | "mux" | "reader" | "writer"
pattern with a check against that constant (use a slice of &str or a HashSet and
call contains) and update trace_entry_is_transport_only (and any other uses of
stage_prefix(stage) matching these prefixes) to reference the new constant.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a867fa92-2cb8-439b-b88f-9520985bfd5f

📥 Commits

Reviewing files that changed from the base of the PR and between 26bae24 and 4453c7b.

📒 Files selected for processing (3)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru/src/tests/nodes.rs

@etorreborre etorreborre force-pushed the etorreborre/fix/chainsync-responder branch from 4453c7b to 22893af Compare March 20, 2026 12:02
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs (1)

131-149: ⚠️ Potential issue | 🔴 Critical

Origin should be the fallback track, not the opening act.

This short-circuit makes any [..., Point::Origin] probe resolve to genesis, even when the best chain contains a newer common point. It also revives IntersectFound(Origin) on an empty store by skipping the header load that currently fails there. Walk the best chain first, then fall back to Origin only once you hit the root.

🛠️ Minimal fix
-        if points.contains(&Point::Origin) {
-            return Ok(Some(Point::Origin));
-        }
-
-        let points: BTreeSet<Point> = points.iter().copied().collect();
+        let origin_requested = points.contains(&Point::Origin);
+        let points: BTreeSet<Point> = points.iter().copied().collect();
         let mut current = self.get_best_chain_hash();

         loop {
             let header = self
                 .load_header(&current)
@@
             match header.parent() {
                 Some(parent) => current = parent,
-                None => return Ok(None),
+                None => return Ok(origin_requested.then_some(Point::Origin)),
             }
         }

Based on learnings, the chainsync responder intentionally does not support serving headers when the tip is Origin; intersect() is expected to fail in that case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs` around lines 131 -
149, The current early return for Point::Origin causes any probe containing
Origin to always resolve to genesis and bypass header traversal; remove the
initial short-circuit that returns Some(Point::Origin) and instead treat Origin
as a fallback only when you explicitly reach the root: keep walking from
get_best_chain_hash using load_header and header.point(), and when
header.parent() is None check if points.contains(&Point::Origin) and then return
Ok(Some(Point::Origin)) only as that fallback; this preserves the existing
ReadError behavior when the tip is Origin (load_header failing) and otherwise
finds newer common points on the best chain.
🧹 Nitpick comments (1)
crates/amaru/src/tests/nodes.rs (1)

238-273: Nice exhaustive coverage; the naming heuristic is the wobbly bit.

Covering every current TraceEntry / Effect variant is tidy. The fragile part is that drain termination now hangs off "keepalive" | "mux" | "reader" | "writer" plus split('-'), so a stage rename silently changes test semantics. Worth hoisting transport-ness behind shared stage metadata or constants instead of parsing names by hand, mate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/tests/nodes.rs` around lines 238 - 273, The test is brittle
because trace_entry_is_transport_only uses stage_prefix(stage) splitting and
hard-coded strings; instead, hoist the transport-ness into shared stage
metadata/constants and use that from the test. Update
trace_entry_is_transport_only to ask a single canonical predicate (e.g.,
is_transport_stage or TRANSPORT_STAGES constant exported from the runtime)
rather than matching "keepalive"|"mux"|"reader"|"writer" or splitting with
stage_prefix; keep trace_entry_stage and effect_stage as helpers but remove
stage_prefix usage in the boolean check and import/use the runtime's canonical
check/constant so renames won’t silently change semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru-protocols/src/chainsync/responder.rs`:
- Around line 96-102: The current replanning (building Store::chain_fragment and
setting self.pending_rollback / self.pending_forwards before calling
next_action) must be extracted into a helper (e.g., fn replan(&mut self,
store_eff, proto) -> Result<()>) and invoked whenever the responder's pointer
changes on IntersectFound and also lazily when handling RequestNext: if
self.pointer != self.upstream.point() and self.pending_forwards.is_empty() (or
pending_rollback is None) call the replan helper before computing next_action;
ensure NewTip still calls the helper as before so stale fragments aren't reused
and next_action always sees a fresh fragment after a post-intersect rollback.

---

Duplicate comments:
In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs`:
- Around line 131-149: The current early return for Point::Origin causes any
probe containing Origin to always resolve to genesis and bypass header
traversal; remove the initial short-circuit that returns Some(Point::Origin) and
instead treat Origin as a fallback only when you explicitly reach the root: keep
walking from get_best_chain_hash using load_header and header.point(), and when
header.parent() is None check if points.contains(&Point::Origin) and then return
Ok(Some(Point::Origin)) only as that fallback; this preserves the existing
ReadError behavior when the tip is Origin (load_header failing) and otherwise
finds newer common points on the best chain.

---

Nitpick comments:
In `@crates/amaru/src/tests/nodes.rs`:
- Around line 238-273: The test is brittle because trace_entry_is_transport_only
uses stage_prefix(stage) splitting and hard-coded strings; instead, hoist the
transport-ness into shared stage metadata/constants and use that from the test.
Update trace_entry_is_transport_only to ask a single canonical predicate (e.g.,
is_transport_stage or TRANSPORT_STAGES constant exported from the runtime)
rather than matching "keepalive"|"mux"|"reader"|"writer" or splitting with
stage_prefix; keep trace_entry_stage and effect_stage as helpers but remove
stage_prefix usage in the boolean check and import/use the runtime's canonical
check/constant so renames won’t silently change semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 28db97f9-9429-494e-a438-39d58bc541f0

📥 Commits

Reviewing files that changed from the base of the PR and between 4453c7b and 22893af.

📒 Files selected for processing (4)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/tests/nodes.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs

@etorreborre etorreborre force-pushed the etorreborre/fix/chainsync-responder branch from 22893af to 0ed2e13 Compare March 20, 2026 12:25
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
crates/amaru/src/tests/nodes.rs (1)

238-243: Transport stage detection looks solid.

The logic of returning true for entries without a stage (like Clock) makes sense — it's infrastructure timing, not application activity. Smart move.

Just a heads-up, mate: the transport prefix list "keepalive" | "mux" | "reader" | "writer" is hardcoded. If new transport-layer stages get added down the road (like some hypothetical "heartbeat" or "ping" stage), this function would need updating. Might be worth a wee comment to flag that for future maintainers, but no action needed right now.

📝 Optional: Add a clarifying comment
 fn trace_entry_is_transport_only(entry: &TraceEntry) -> bool {
     let Some(stage) = trace_entry_stage(entry) else {
         return true;
     };
+    // Transport-layer stage prefixes; update if new transport stages are added
     matches!(stage_prefix(stage), "keepalive" | "mux" | "reader" | "writer")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/tests/nodes.rs` around lines 238 - 243,
trace_entry_is_transport_only currently hardcodes the transport-stage prefixes
("keepalive" | "mux" | "reader" | "writer"), so add a short clarifying comment
above trace_entry_is_transport_only noting that the list is intentional and must
be updated if new transport stages are added (or alternatively extract the
prefixes into a single constant like TRANSPORT_STAGE_PREFIXES used by
trace_entry_is_transport_only to centralize updates); reference
trace_entry_stage and stage_prefix in the comment so future maintainers know
where the detection logic lives and can update it consistently.
crates/amaru-protocols/src/chainsync/responder.rs (1)

305-360: Solid intersection tests, but next_action is untested.

The intersect function has lovely coverage – finding points, most recent matching, before anchor, empty/unknown cases. Good stuff!

However, the new next_action function and the plan-then-emit flow don't have any direct tests yet. Given the complexity of the buffering logic (pending rollbacks, forwards, lazy replanning), it'd be worth adding tests for:

  • Normal forward progression through a planned fragment
  • Rollback followed by forwards
  • Lazy replanning when buffers are exhausted
  • Edge case: pointer == upstream.point() returning AwaitReply

I know it's marked WIP, but figured I'd mention it – like a "quest incomplete" marker for when you're ready to come back to it!

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-protocols/src/chainsync/responder.rs` around lines 305 - 360,
Add unit tests for next_action and the plan-then-emit flow inside the existing
tests module: cover (1) normal forward progression through a planned fragment
(advance through pending_forwards until plan is exhausted and ensure
ResponderAction::SendNext or similar forward actions are returned), (2) a
rollback followed by forwards (populate pending_rollbacks then pending_forwards
and assert rollbacks are emitted before forwards), (3) lazy replanning when
buffers are exhausted (consume buffers and verify next_action triggers
plan_then_emit/replanning and produces new forwards), and (4) the edge case
pointer == upstream.point() returning AwaitReply (set pointer and
upstream.point() equal and assert next_action returns AwaitReply). Use the
functions/fields next_action, plan_then_emit, pending_rollbacks,
pending_forwards, pointer and upstream.point(), and assert ResponderAction
variants to locate behavior in the responder logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/amaru-protocols/src/chainsync/responder.rs`:
- Around line 305-360: Add unit tests for next_action and the plan-then-emit
flow inside the existing tests module: cover (1) normal forward progression
through a planned fragment (advance through pending_forwards until plan is
exhausted and ensure ResponderAction::SendNext or similar forward actions are
returned), (2) a rollback followed by forwards (populate pending_rollbacks then
pending_forwards and assert rollbacks are emitted before forwards), (3) lazy
replanning when buffers are exhausted (consume buffers and verify next_action
triggers plan_then_emit/replanning and produces new forwards), and (4) the edge
case pointer == upstream.point() returning AwaitReply (set pointer and
upstream.point() equal and assert next_action returns AwaitReply). Use the
functions/fields next_action, plan_then_emit, pending_rollbacks,
pending_forwards, pointer and upstream.point(), and assert ResponderAction
variants to locate behavior in the responder logic.

In `@crates/amaru/src/tests/nodes.rs`:
- Around line 238-243: trace_entry_is_transport_only currently hardcodes the
transport-stage prefixes ("keepalive" | "mux" | "reader" | "writer"), so add a
short clarifying comment above trace_entry_is_transport_only noting that the
list is intentional and must be updated if new transport stages are added (or
alternatively extract the prefixes into a single constant like
TRANSPORT_STAGE_PREFIXES used by trace_entry_is_transport_only to centralize
updates); reference trace_entry_stage and stage_prefix in the comment so future
maintainers know where the detection logic lives and can update it consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 86b8b245-c3a8-48a3-913c-23628f5c20d5

📥 Commits

Reviewing files that changed from the base of the PR and between 22893af and 0ed2e13.

📒 Files selected for processing (4)
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/tests/nodes.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 88.52459% with 42 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/amaru-protocols/src/chainsync/responder.rs 70.76% 19 Missing ⚠️
...amaru-ouroboros-traits/src/stores/consensus/mod.rs 87.28% 15 Missing ⚠️
crates/amaru-stores/src/rocksdb/consensus/mod.rs 95.12% 8 Missing ⚠️
Files with missing lines Coverage Δ
...sensus/src/headers_tree/data_generation/actions.rs 73.21% <100.00%> (-0.29%) ⬇️
crates/amaru-protocols/src/store_effects.rs 69.26% <100.00%> (-9.02%) ⬇️
crates/amaru-stores/src/rocksdb/consensus/mod.rs 87.42% <95.12%> (+1.88%) ⬆️
...amaru-ouroboros-traits/src/stores/consensus/mod.rs 78.63% <87.28%> (+8.80%) ⬆️
crates/amaru-protocols/src/chainsync/responder.rs 91.42% <70.76%> (-4.84%) ⬇️

... and 17 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@etorreborre etorreborre force-pushed the etorreborre/fix/chainsync-responder branch from 0ed2e13 to c006a76 Compare March 23, 2026 10:44
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/amaru-protocols/src/chainsync/responder.rs (1)

116-123: ⚠️ Potential issue | 🟠 Major

Clear the stale plan when IntersectFound changes the pointer.

You've nailed the issue here, mate. After FindIntersect changes self.pointer, the old plan from a prior NewTip becomes a ticking time bomb. The lazy replanning at lines 158-165 won't kick in unless both pending_rollback.is_none() and pending_forwards.is_empty(). If either still holds the old plan, you'll skip the replan entirely.

The real danger: if the next RequestNext arrives in a state that doesn't match CanAwait { send_rollback: true }, the cleanup logic at lines 132-139 won't fire either. You'll then proceed straight to emitting headers from the stale plan—potentially rolling back to a point before your new intersection, which is like rewinding the film mid-scene.

The fix is spot-on: wipe both pending_rollback and pending_forwards when the pointer changes.

 if let ResponderAction::IntersectFound(point, _tip) = &action {
     self.pointer = *point;
+    // Clear stale plan from before the intersection
+    self.pending_rollback = None;
+    self.pending_forwards.clear();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-protocols/src/chainsync/responder.rs` around lines 116 - 123,
When handling ResponderResult::FindIntersect (the block calling
intersect(Store::new(...)) ), if the resulting ResponderAction matches
ResponderAction::IntersectFound and you update self.pointer, also clear any
stale planned work by setting self.pending_rollback to None and clearing
self.pending_forwards (e.g. .clear()) so the old NewTip plan cannot be emitted;
do this immediately after assigning self.pointer and before returning
Ok((Some(action), self)).
🧹 Nitpick comments (1)
simulation/amaru-sim/src/simulator/run_tests.rs (1)

134-135: Avoid recomputing/cloning peer actions and headers in the hot path.

Right now we clone peer actions to filter active peers, then clone again to configure nodes; plus headers are rebuilt for each upstream node. Not a crash-level drama, but it’s extra churn per test run.

♻️ Suggested refactor
-    let active_peers: Vec<_> =
-        upstream_peers.iter().filter(|peer| !actions.get_peer_actions(peer).is_empty()).cloned().collect();
+    let active_peer_actions: Vec<_> = upstream_peers
+        .iter()
+        .filter_map(|peer| {
+            let peer_actions = actions.get_peer_actions(peer);
+            (!peer_actions.is_empty()).then_some((peer.clone(), peer_actions))
+        })
+        .collect();
+
+    let active_peers: Vec<_> = active_peer_actions.iter().map(|(peer, _)| peer.clone()).collect();
+    let all_headers = actions.get_headers();

-    let upstream_nodes = active_peers
-        .iter()
-        .map(|peer| {
+    let upstream_nodes = active_peer_actions
+        .into_iter()
+        .map(|(peer, peer_actions)| {
             NodeTestConfig::default()
                 .with_no_upstream_peers()
                 .with_listen_address(&peer.name)
                 .with_chain_length(run_config.generated_chain_depth)
                 // upstream nodes have all the blocks already
-                .with_validated_blocks(actions.get_headers())
-                .with_actions(actions.get_peer_actions(peer))
+                .with_validated_blocks(all_headers.clone())
+                .with_actions(peer_actions)
                 .with_node_type(UpstreamNode)
         })
         .collect::<Vec<_>>();

Also applies to: 143-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@simulation/amaru-sim/src/simulator/run_tests.rs` around lines 134 - 135, The
code repeatedly clones peer actions and rebuilds headers in the hot path (see
upstream_peers.iter().filter(|peer|
!actions.get_peer_actions(peer).is_empty()).cloned().collect()), causing
unnecessary allocation churn; fix by computing and storing the active peers and
their associated action slices/headers once before node configuration—e.g.,
build a Vec or HashMap of (peer -> &PeerActions) from actions.get_peer_actions
and a headers_map generated by whatever header-building logic (the code that
currently rebuilds headers per upstream node), then iterate that collection when
calling configure_nodes (and any other places at lines ~143-145) so you reuse
precomputed actions and headers instead of cloning/recreating them multiple
times.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs`:
- Around line 77-82: The code currently treats a missing parent header from
store.load_header(&parent_hash) as Point::Origin, silently masking store
corruption; change this to surface the problem by returning an error or logging:
in the match branch where header.parent() => Some(parent_hash), if
store.load_header(&parent_hash) yields None, return
Err(StoreError::ReadError(...)) (or call the crate's logger to warn before
returning an error) instead of Ok(Point::Origin); update the surrounding
function signature/propagation to propagate StoreError and use the symbols
header.parent(), store.load_header, Point::Origin, and StoreError::ReadError to
locate and implement the fix.

---

Outside diff comments:
In `@crates/amaru-protocols/src/chainsync/responder.rs`:
- Around line 116-123: When handling ResponderResult::FindIntersect (the block
calling intersect(Store::new(...)) ), if the resulting ResponderAction matches
ResponderAction::IntersectFound and you update self.pointer, also clear any
stale planned work by setting self.pending_rollback to None and clearing
self.pending_forwards (e.g. .clear()) so the old NewTip plan cannot be emitted;
do this immediately after assigning self.pointer and before returning
Ok((Some(action), self)).

---

Nitpick comments:
In `@simulation/amaru-sim/src/simulator/run_tests.rs`:
- Around line 134-135: The code repeatedly clones peer actions and rebuilds
headers in the hot path (see upstream_peers.iter().filter(|peer|
!actions.get_peer_actions(peer).is_empty()).cloned().collect()), causing
unnecessary allocation churn; fix by computing and storing the active peers and
their associated action slices/headers once before node configuration—e.g.,
build a Vec or HashMap of (peer -> &PeerActions) from actions.get_peer_actions
and a headers_map generated by whatever header-building logic (the code that
currently rebuilds headers per upstream node), then iterate that collection when
calling configure_nodes (and any other places at lines ~143-145) so you reuse
precomputed actions and headers instead of cloning/recreating them multiple
times.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b5d12301-64ae-4445-84f1-1c4055da3430

📥 Commits

Reviewing files that changed from the base of the PR and between 0ed2e13 and c006a76.

📒 Files selected for processing (6)
  • crates/amaru-consensus/src/headers_tree/data_generation/actions.rs
  • crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs
  • crates/amaru/src/tests/nodes.rs
  • simulation/amaru-sim/src/simulator/run_tests.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/amaru-stores/src/rocksdb/consensus/mod.rs

Comment on lines +77 to +82
match header.parent() {
Some(parent_hash) => {
Ok(store.load_header(&parent_hash).map(|parent| parent.point()).unwrap_or(Point::Origin))
}
None => Ok(Point::Origin),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent fallback to Origin might hide store gremlins.

If load_header(&parent_hash) returns None for a hash that should exist, we quietly pretend we hit genesis. That's like Geralt ignoring a monster contract – sometimes you gotta investigate the dodgy stuff.

For a corrupted or incomplete store, this could lead to incorrect chain fragments without any indication something's amiss. Consider whether a StoreError::ReadError would be more appropriate here, or at minimum add a warning log.

🛠️ Optional: make the missing parent explicit
 match header.parent() {
     Some(parent_hash) => {
-        Ok(store.load_header(&parent_hash).map(|parent| parent.point()).unwrap_or(Point::Origin))
+        match store.load_header(&parent_hash) {
+            Some(parent) => Ok(parent.point()),
+            None => {
+                tracing::warn!(parent = %parent_hash, "parent header not found, assuming genesis");
+                Ok(Point::Origin)
+            }
+        }
     }
     None => Ok(Point::Origin),
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru-ouroboros-traits/src/stores/consensus/mod.rs` around lines 77 -
82, The code currently treats a missing parent header from
store.load_header(&parent_hash) as Point::Origin, silently masking store
corruption; change this to surface the problem by returning an error or logging:
in the match branch where header.parent() => Some(parent_hash), if
store.load_header(&parent_hash) yields None, return
Err(StoreError::ReadError(...)) (or call the crate's logger to warn before
returning an error) instead of Ok(Point::Origin); update the surrounding
function signature/propagation to propagate StoreError and use the symbols
header.parent(), store.load_header, Point::Origin, and StoreError::ReadError to
locate and implement the fix.

@etorreborre etorreborre force-pushed the etorreborre/fix/chainsync-responder branch 2 times, most recently from f3a1ab3 to 4fb6f57 Compare March 23, 2026 14:52
Signed-off-by: Eric Torreborre <etorreborre@yahoo.com>
@etorreborre etorreborre force-pushed the etorreborre/fix/chainsync-responder branch from 4fb6f57 to 504c6ea Compare March 23, 2026 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant