Conversation

matheus23
Member

Description

This does the re-batching inside RelayTransport by storing a pending_item: Option<RelayRecvDatagram>.
When we poll_recv, we first try to use that pending item instead of polling a new one.

Once we've got a pending item, we try to split off as much from it as can possibly fit into our receive buffer and handle that.
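
To make the flow above concrete, here is a minimal, std-only sketch of the pending-item idea. It is not the code from this PR: `Batch`, `Receiver`, `incoming` and `recv` are hypothetical stand-ins (the real transport is poll-based and works on `RelayRecvDatagram`), and `max_segments` is assumed to be at least 1.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a received batch: datagrams of `segment_size`
/// bytes stored back to back in `contents` (the last one may be shorter).
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    segment_size: usize,
    contents: Vec<u8>,
}

impl Batch {
    /// Splits off up to `max_segments` datagrams from the front,
    /// leaving the remainder in `self`.
    fn take_segments(&mut self, max_segments: usize) -> Batch {
        let take = (self.segment_size * max_segments).min(self.contents.len());
        let rest = self.contents.split_off(take);
        let taken = std::mem::replace(&mut self.contents, rest);
        Batch { segment_size: self.segment_size, contents: taken }
    }
}

/// Hypothetical receiver; `incoming` stands in for the relay channel.
struct Receiver {
    incoming: VecDeque<Batch>,
    pending_item: Option<Batch>,
}

impl Receiver {
    /// Returns the next chunk that fits a buffer of `max_segments` datagrams,
    /// or `None` when nothing is pending and nothing new has arrived.
    fn recv(&mut self, max_segments: usize) -> Option<Batch> {
        // Prefer the leftover from a previous call over taking a new item.
        let mut item = self
            .pending_item
            .take()
            .or_else(|| self.incoming.pop_front())?;
        let chunk = item.take_segments(max_segments);
        // Anything that did not fit is kept for the next call.
        if !item.contents.is_empty() {
            self.pending_item = Some(item);
        }
        Some(chunk)
    }
}
```

With a 5-byte batch of 2-byte segments and `max_segments = 2`, the first `recv` call hands out 4 bytes and parks the remaining byte in `pending_item`; the second call drains it without touching `incoming`.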

Breaking Changes

  • iroh_relay::protos::relay::Datagrams::take_segments now returns Datagrams instead of Option<Datagrams>, so it no longer distinguishes the case where the returned Datagrams might be empty.

Notes

Probably needs some test.
I'd love to test two RelayTransports talking to each other with different max_transmit_segments/max_receive_segments, but I'm not sure I can make such a test setup happen easily.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.

@matheus23 matheus23 requested a review from flub July 31, 2025 11:57
@matheus23 matheus23 self-assigned this Jul 31, 2025
github-actions bot commented Jul 31, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3421/docs/iroh/

Last updated: 2025-08-20T12:55:08Z

github-actions bot commented Jul 31, 2025

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 7024757

@n0bot n0bot bot added this to iroh Jul 31, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Jul 31, 2025
Contributor

@flub flub left a comment

I'm not 100% sure about my comments. But the Datagrams semantics don't currently match its docs. And I think if my comments aren't completely wild, we could end up with simpler state in all places.

  /// the batch with at most `num_segments` and leaving only the rest in `self`.
  ///
  /// Calling this on a datagram batch that only contains a single datagram (`segment_size == None`)
- /// will result in returning essentially `Some(self.clone())`, while making `self` empty afterwards.
+ /// will result in returning essentially a clone of `self`, while making `self` empty afterwards.
Contributor

One problem I have with this behaviour is that a segment_size on something with empty contents (or even with just one datagram) is supposed to be illegal according to the docs of the struct.

Contributor

Maybe this could be simplified by always having a segment_size? It's only the serialisation that needs to be linux-compatible. Likewise we can declare that an empty Datagrams is allowed and provide an is_empty method.

Member Author

I think the only invariant that's invalidated here is that self.contents can be empty after this call.

Member Author

Actually that's not a claimed invariant. Datagrams doesn't have any restrictions on contents. We only check that on receive, but that's basically it.

Can you tell me which invariant from the Datagrams documentation is actually invalidated?

Contributor

Lines 141-143 in this file in this PR:

/// The segment size if this transmission contains multiple datagrams.
/// This is `None` if the transmit only contains a single datagram
pub segment_size: Option<NonZeroU16>,

We can change this to segment_size: NonZeroU16 and add to the docs of the contents field that it might be empty, just should always be a multiple of segment_size if non-empty. Then this becomes a little more consistent?

Apologies for my vague reviews, I should clearly give more context up front!
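
As an illustration of the suggestion above, a sketch of what an always-present `segment_size` could look like. This is an assumption about the shape of the idea, not the struct that was merged: `contents` is a plain `Vec<u8>` here, and `num_segments` and `is_well_formed` are hypothetical helper names.

```rust
use std::num::NonZeroU16;

/// Hypothetical variant of `Datagrams` where `segment_size` is always set.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Datagrams {
    /// Size of every datagram in `contents`.
    segment_size: NonZeroU16,
    /// May be empty; otherwise its length should be a multiple of `segment_size`.
    contents: Vec<u8>,
}

impl Datagrams {
    /// An empty batch is allowed and simply carries no datagrams.
    fn is_empty(&self) -> bool {
        self.contents.is_empty()
    }

    /// Number of whole datagrams in the batch.
    fn num_segments(&self) -> usize {
        self.contents.len() / usize::from(self.segment_size.get())
    }

    /// Checks the "multiple of `segment_size`" rule; an empty batch passes.
    fn is_well_formed(&self) -> bool {
        self.contents.len() % usize::from(self.segment_size.get()) == 0
    }
}
```

With this shape, "empty", "one datagram" and "many datagrams" are all described by `contents.len()` alone, so splitting a batch never has to switch `segment_size` between `Some` and `None`.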

Member Author

In your initial review comment you write:

One problem I have with this behaviour is that a segment_size on something with empty contents (or even with just one datagram) is supposed to be illegal according to the docs of the struct.

I don't see where it says that?

This is a different invariant:

The segment size if this transmission contains multiple datagrams.
This is None if the transmit only contains a single datagram

The current code still respects that.

Contributor

I think after you call Datagrams::take_segments this guarantee is not upheld? You can end up with a single item while segment_size is still Some, or with empty contents, in which case segment_size cannot be right.

Though I appreciate that the way you use it does at least solve the 2nd issue.

Anyway, I don't feel like my comments here are being helpful, or maybe I'm even misunderstanding things. This conversation isn't really improving the quality of the code. Apologies for holding things up.

Member Author

Ugh, I'm sorry. You're right that I missed a case where the invariant is broken in self.
I kept double-checking the return value and what's stored in self after the early-return and missed what's stored in self after the late return.

Sorry this took so long :X
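
For readers following the invariant discussion, here is a hedged sketch of a split that keeps `segment_size` set only while more than one datagram remains, on both the returned batch and on `self`. It is not the commit that fixed this PR: `contents` is a plain `Vec<u8>`, `normalize` is a hypothetical helper, and `num_segments` is assumed to be at least 1.

```rust
use std::num::NonZeroU16;

#[derive(Debug, Clone, PartialEq, Eq)]
struct Datagrams {
    /// `Some` only while `contents` holds more than one datagram.
    segment_size: Option<NonZeroU16>,
    contents: Vec<u8>,
}

impl Datagrams {
    /// Clears `segment_size` once at most one datagram is left.
    fn normalize(&mut self) {
        if let Some(size) = self.segment_size {
            if self.contents.len() <= usize::from(size.get()) {
                self.segment_size = None;
            }
        }
    }

    /// Splits off up to `num_segments` datagrams, leaving the rest in `self`.
    fn take_segments(&mut self, num_segments: usize) -> Datagrams {
        let size = match self.segment_size {
            Some(size) => size,
            // Early return: a single datagram is handed out whole.
            None => {
                return Datagrams {
                    segment_size: None,
                    contents: std::mem::take(&mut self.contents),
                }
            }
        };
        let take = (usize::from(size.get()) * num_segments).min(self.contents.len());
        let rest = self.contents.split_off(take);
        let mut taken = Datagrams {
            segment_size: Some(size),
            contents: std::mem::replace(&mut self.contents, rest),
        };
        // Late return: both halves must be re-checked, not just the result.
        taken.normalize();
        self.normalize();
        taken
    }
}
```

Taking two segments from a three-segment batch leaves a single datagram behind; the second `normalize` call is what resets `segment_size` to `None` on `self` in that case.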

@matheus23 matheus23 requested a review from flub August 4, 2025 11:34
@matheus23 matheus23 force-pushed the matheus23/rebatch-in-poll-recv branch from 7712341 to 1af03c2 Compare August 4, 2025 11:36
Contributor

@flub flub left a comment

I might be getting something wrong, and this certainly isn't worth holding up the great improvement this is. Apologies for dragging this out.

@matheus23
Member Author

Sorry, you were completely right to "drag this out". In the end it was me dragging things out, because I just kept getting confused about the invariant. I hope the last commit fixed this!

@matheus23 matheus23 added this pull request to the merge queue Aug 21, 2025
Merged via the queue into main with commit b791123 Aug 21, 2025
29 checks passed
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in iroh Aug 21, 2025
@matheus23 matheus23 deleted the matheus23/rebatch-in-poll-recv branch August 21, 2025 07:45