Conversation

@songgaoye songgaoye commented Jan 16, 2026

Closes: #XXX

Description

The optimization provides 40-44% performance improvement across all batch sizes by:

  • Replacing HashMap with Vec for sequential index access

  • Eliminating cloning by using drain() instead of iter() + clone()

  • Pre-allocating vector capacity

Batch Size | Legacy (clone) | Optimized (drain) | Improvement
32         | 286 µs         | 171 µs            | 40%
64         | 548 µs         | 321 µs            | 41%
128        | 1,115 µs       | 640 µs            | 43%
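
The before/after shape of the change, sketched with a toy Message type in place of Hermes' TransitMessage (a hedged illustration of the pattern described above, not the actual relay_path.rs or benchmark code):

use std::collections::HashMap;

#[derive(Clone)]
struct Message(u64);

// Legacy shape: a HashMap keyed by position, filled by cloning each message.
fn legacy(batch: &[Message]) -> HashMap<usize, Message> {
    batch.iter().cloned().enumerate().collect()
}

// Optimized shape: a pre-allocated Vec filled by draining (moving) the batch,
// so no Message is cloned and lookups become sequential index accesses.
fn optimized(batch: &mut Vec<Message>) -> Vec<Message> {
    let mut out = Vec::with_capacity(batch.len());
    out.extend(batch.drain(..));
    out
}

fn main() {
    let mut batch: Vec<Message> = (0..4).map(Message).collect();
    let _by_index = legacy(&batch);
    let moved = optimized(&mut batch);
    assert!(batch.is_empty() && moved.len() == 4);
}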

PR author checklist:

  • Added changelog entry, using unclog.
  • Added tests: integration (for Hermes) or unit/mock tests (for modules).
  • Linked to GitHub issue.
  • Updated code comments and documentation (e.g., docs/).
    • If guide has been updated, tag GitHub user mircea-c
  • Tagged one reviewer who will be the one responsible for shepherding this PR.

Reviewer checklist:

  • Reviewed Files changed in the GitHub PR explorer.
  • Manually tested (in case integration/unit/mock tests are absent).

@KirilMihaylov KirilMihaylov left a comment

Heavy refactoring necessary.

P.S.:
Leaving this review as a member of the @nolus-protocol dev team.

use std::str::FromStr;
use std::time::Duration;

use criterion::{black_box, criterion_group, criterion_main, BatchSize, BenchmarkId, Criterion};

criterion::black_box is deprecated; its deprecation notice says to use std::hint::black_box instead.
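
One way to address this in the benchmark's imports, assuming nothing else relies on the criterion re-export (std::hint::black_box has been stable since Rust 1.66):

use std::hint::black_box;
use std::str::FromStr;
use std::time::Duration;

use criterion::{criterion_group, criterion_main, BatchSize, BenchmarkId, Criterion};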

for gm in odata.batch.drain(..) {

Capacity is not reused later; it is instead replaced (L1739, R1737) with an empty vector, whose construction is purely computational (a const fn) and does not allocate.
That can instead be done via std::mem::take, which will also allow current or future versions of std::vec::Vec to take advantage of the fact that the whole container is consumed.

Suggested change
for gm in odata.batch.drain(..) {
for gm in std::mem::take(&mut odata.batch) {

@songgaoye songgaoye (Author)

Using Vec::with_capacity there is intentional: we need to take ownership of every TransitMessage so we can move “timed-out” ones into timed_out and keep only the survivors.
The current flow (for gm in odata.batch.drain(..) { … } followed by odata.batch = retain_batch;) gives us owned gms and preserves ordering; the price is allocating a fresh buffer for retain_batch.
Replacing this with std::mem::take doesn’t actually remove that allocation: once you take the batch you still need a second vector to store the retained items (because retain_mut/retain only hand out &mut references, which we can’t move), so you’d end up allocating again for that second vector or cloning messages.
In short, there’s no practical win here—mem::take can’t simultaneously give us ownership of the drained messages and let us reuse the original buffer without deeper restructuring, so the current approach at relay_path.rs (lines 1688-1739) is fine as-is.
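
A minimal, self-contained sketch of the flow described above, with u64 standing in for TransitMessage and a hypothetical is_timed_out predicate (not the actual relay_path.rs code):

fn split_timed_out(batch: &mut Vec<u64>, is_timed_out: impl Fn(&u64) -> bool) -> Vec<u64> {
    let mut timed_out = Vec::new();
    let mut retain_batch = Vec::with_capacity(batch.len());
    for gm in batch.drain(..) {
        if is_timed_out(&gm) {
            timed_out.push(gm); // move the owned message out
        } else {
            retain_batch.push(gm); // keep the survivor; ordering is preserved
        }
    }
    *batch = retain_batch; // the cost: a freshly allocated buffer for the survivors
    timed_out
}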

@KirilMihaylov KirilMihaylov replied

I'm sorry, but I'll have to strongly disagree. Please look at the signature and documentation of std::mem::take: it only needs the given mutable reference temporarily, replaces the referenced value with Default::default(), and returns the old value. It's a convenience wrapper around std::mem::replace.
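
A minimal illustration of that behavior (plain std, nothing Hermes-specific):

fn main() {
    let mut batch = vec![1, 2, 3];
    // Equivalent to std::mem::replace(&mut batch, Vec::new()): the old Vec,
    // buffer and all, is handed back; an empty, non-allocating Vec is left behind.
    let owned: Vec<i32> = std::mem::take(&mut batch);
    assert_eq!(owned, [1, 2, 3]);
    assert!(batch.is_empty());
}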

The code base is already in an unacceptable state, and we at @nolus-protocol are currently rewriting parts of it, with the possibility of upstreaming, so I personally would rather not add more unnecessary complexity.

I have already removed the need for retain_batch, as it's very error-prone, and instead add the retained messages directly back to odata.batch.

@KirilMihaylov KirilMihaylov Jan 23, 2026

Leaving our implementation, which can be used as a reference and as a proof of concept that this approach works.

https://github.com/nolus-protocol/hermes/blob/hermes-lite-stable/crates/relayer/src/link/relay_path.rs#L1723

Link to the commit: nolus-protocol/hermes@52479c3

let mut all_dst_odata = self.dst_operational_data.clone_vec();

let mut timed_out: HashMap<usize, OperationalData> = HashMap::default();
let mut timed_out: Vec<Option<OperationalData>> = vec![None; all_dst_odata.len()];

[timed_out 1/4]

This only wastes computational time and resources, as it isn't necessary to allocate it with the same length. On top of that, filling it with None instances is also redundant, since all_dst_odata is traversed sequentially from start to end and elements can simply be pushed to the back.

Suggested change
let mut timed_out: Vec<Option<OperationalData>> = vec![None; all_dst_odata.len()];
let mut timed_out = vec![];

If proven necessary, it could be allocated with max(1, half-capacity), as I believe that it's highly unlikely during normal circumstances to get past half-capacity.

let mut timed_out = Vec::with_capacity(1.max(all_dst_odata.len() >> 1));

let mut timed_out: Vec<Option<OperationalData>> = vec![None; all_dst_odata.len()];

// For each operational data targeting the destination chain...
for (odata_pos, odata) in all_dst_odata.iter_mut().enumerate() {

[timed_out 2/4]

Suggested change
for (odata_pos, odata) in all_dst_odata.iter_mut().enumerate() {
for odata in &mut all_dst_odata {

Comment on lines +1705 to +1719
let slot = timed_out
    .get_mut(odata_pos)
    .expect("timed_out vector is sized to match operational data");
slot.get_or_insert_with(|| {
    OperationalData::new(
        dst_current_height,
        OperationalDataTarget::Source,
        odata_tracking_id,
        self.channel.connection_delay,
    )
})
.push(TransitMessage {
    event_with_height: gm.event_with_height.clone(),
    msg: new_msg,
});

[timed_out 3/4]

Suggested change
let mut odata = OperationalData::new(
    dst_current_height,
    OperationalDataTarget::Source,
    odata_tracking_id,
    self.channel.connection_delay,
);
odata.push(TransitMessage { msg: new_msg, ..gm });
timed_out.push(odata);


// Schedule new operational data targeting the source chain
for (_, new_od) in timed_out.into_iter() {
for new_od in timed_out.into_iter().flatten() {

[timed_out 4/4]

Suggested change
for new_od in timed_out.into_iter().flatten() {
for new_od in timed_out {
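
Putting the four [timed_out] suggestions together, a hedged sketch of the resulting loop, with toy types standing in for OperationalData and TransitMessage and a hypothetical has_timed_out check (the real implementation is the linked relay_path.rs above):

struct TransitMessage {
    msg: u64,
}

struct OperationalData {
    batch: Vec<TransitMessage>,
}

fn reschedule_timeouts(
    all_dst_odata: &mut Vec<OperationalData>,
    has_timed_out: impl Fn(&TransitMessage) -> bool,
) -> Vec<OperationalData> {
    let mut timed_out = vec![]; // plain Vec; no pre-sized Option slots
    for odata in all_dst_odata.iter_mut() {
        // Take the whole batch by value; odata.batch is left empty (no allocation).
        for gm in std::mem::take(&mut odata.batch) {
            if has_timed_out(&gm) {
                // New operational data is simply pushed to the back.
                timed_out.push(OperationalData { batch: vec![gm] });
            } else {
                odata.batch.push(gm); // survivors go straight back; no retain_batch
            }
        }
    }
    timed_out
}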

This micro-benchmark only measures the difference between std::vec::Vec and std::collections::HashMap, as it does NOT test against the implementation actually used but rather a dummy, out-of-sync version of the algorithm.

That makes this micro-benchmark wholly redundant, as the algorithm I have provided does not allocate without necessity AND does not allocate for the whole index range.
