feat: driver high-water marks #161
Conversation
Commits:
- docs: clarify
- fix: default buffer size from crate
- test(hwm): fix behaviour to use hwm error
- fix: cleanup unused error value
On a good track, but I was thinking about it slightly differently: the high-water mark should entail both queued requests (i.e. in the channel between socket frontend and driver, currently defined by DEFAULT_BUFFER_SIZE), and pending requests waiting for a response.
I realize this may be better to split up into 2 options: with_max_queue_size and with_max_pending_requests. Theoretically, the queue should only start filling up once max_pending_requests is hit, or if the driver is just too slow at processing messages.
Wdyt?
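For illustration, the two options could take the shape of a builder-style struct. This is a minimal sketch: only the two method names come from the discussion above; the struct name, fields, and defaults are hypothetical.

```rust
/// Hypothetical options struct for the split discussed above; only the two
/// builder method names come from this thread, the rest is illustrative.
#[derive(Debug, Clone)]
pub struct ReqOptions {
    /// Max requests queued in the channel between socket frontend and driver.
    max_queue_size: usize,
    /// Max in-flight requests still waiting for a response.
    max_pending_requests: usize,
}

impl Default for ReqOptions {
    fn default() -> Self {
        // Placeholder defaults; the thread later settles on ~8k each.
        Self { max_queue_size: 8192, max_pending_requests: 8192 }
    }
}

impl ReqOptions {
    /// Caps the frontend-to-driver queue (the role DEFAULT_BUFFER_SIZE plays today).
    pub fn with_max_queue_size(mut self, size: usize) -> Self {
        self.max_queue_size = size;
        self
    }

    /// Caps requests that have been sent but not yet answered.
    pub fn with_max_pending_requests(mut self, max: usize) -> Self {
        self.max_pending_requests = max;
        self
    }
}
```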
msg-socket/src/req/driver.rs (outdated)

```rust
// We might be able to write more queued messages to the buffer.
// Continue to potentially send more or flush.
continue;
```
This continue is now redundant: we won't be able to write more, since pending_egress is now None. However, in the error case above, we should continue to poll the conn_manager.
Also, in the error case, you should put the message back into pending_egress or it gets lost
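For illustration, the restore-on-error idea might look roughly like this in the driver's write path. This is a sketch with assumed types (ReqMessage, Driver, and the Sink bound are illustrative, not the PR's actual code):

```rust
use std::pin::Pin;

use futures::Sink;

/// Illustrative message and driver types; the PR's actual definitions differ.
#[derive(Clone)]
struct ReqMessage(Vec<u8>);

struct Driver<S> {
    conn: S,
    pending_egress: Option<ReqMessage>,
}

impl<S: Sink<ReqMessage> + Unpin> Driver<S> {
    /// Sketch of the restore-on-error idea: if the write fails, put the
    /// message back into pending_egress so it can be retried once the
    /// conn_manager re-establishes the connection, instead of being lost.
    /// (Real code would call poll_ready before start_send.)
    fn try_write(&mut self) {
        if let Some(msg) = self.pending_egress.take() {
            if Pin::new(&mut self.conn).start_send(msg.clone()).is_err() {
                // Restore the message; the next poll of the driver retries it.
                self.pending_egress = Some(msg);
            }
        }
    }
}
```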
Hmm, not super sure I understand.
If we continue after writing pending_egress, we will poll conn_manager again and also poll the given channel to drive it.
At the end, since we don't have pending_egress, we will get another message when polling from_socket and process that one...
Basically, we try to do as much work as possible before getting "stuck" waiting on the underlying I/O, no?

> Also, in the error case, you should put the message back into pending_egress or it gets lost

Does that also apply to all the other error cases then, like failing to flush? I need to double-check how the channel works here.
> If we continue after writing pending_egress, we will poll conn_manager again and also poll the given channel to drive it.

But we already have the channel in this code path, so there's no need to poll it again. The conn_manager is responsible for ensuring we stay connected in case of a disconnect, so when we get here we don't have to poll it.

> At the end, since we don't have pending_egress, we will get another message when polling from_socket and process that one...
> Basically, we try to do as much work as possible before getting "stuck" waiting on the underlying I/O, no?

Yes, but in this case what work are you doing by continuing after writing to the channel?

> Does that also apply to all the other error cases then, like failing to flush? I need to double-check how the channel works here.

That's a good catch; technically we should, yes. Maybe there's a way to take the Framed buffer and cache it to be written again when we reconnect?
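The Framed-buffer idea could build on tokio_util's Framed, which exposes its write buffer via write_buffer_mut(). A sketch of the concept, assuming that API; the function names are hypothetical:

```rust
use bytes::BytesMut;
use tokio_util::codec::Framed;

/// Sketch: on disconnect, move the buffered-but-unflushed bytes out of the
/// old Framed so they can be written on the new connection.
fn take_unflushed<T, C>(old: &mut Framed<T, C>) -> BytesMut {
    // split() takes ownership of the pending bytes, leaving the buffer empty.
    old.write_buffer_mut().split()
}

/// On reconnect, prepend the cached bytes to the new Framed's write buffer
/// so they go out before any newly queued messages.
fn restore_unflushed<T, C>(new: &mut Framed<T, C>, mut cached: BytesMut) {
    cached.unsplit(new.write_buffer_mut().split());
    *new.write_buffer_mut() = cached;
}
```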
Oh, I thought the conn_manager handled multiple channels, depending on what was ready. Will account for it then.
Ok, I understand it now. If we were to still continue, we'd just get the channel again and poll for incoming messages, but not actually prepare anything else to write out. Honestly it's fine, but we can instead drive the rest, which would also include getting a new message from the queue into pending_egress...
In this scenario, we'll try flushing after we reach the write buffer size or after the linger timer anyway, with potentially N messages written (if the write and from_socket keep being ready).
OTOH, wrt restoring the egress on error/flush, I'm unsure. I think it needs further testing, so I'd leave it for a new issue.
I think we'd change the semantics too much right now, since it has been like this even before this pending_egress addition, just with multiple messages at once.

This is already the case: the channel queue starts to fill up after the HWM limit is reached, since the driver won't poll for new messages anymore.

Agree, for maximum flexibility. I will remove the unbounded growth and put something like 8k items each as a default. Shouldn't be an issue since it's easily configured.
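Concretely, reusing the hypothetical ReqOptions sketch from earlier in the thread, the bounded defaults would stay overridable like this:

```rust
// Hypothetical usage, reusing the ReqOptions sketch from earlier in the
// thread; 8192 mirrors the "8k items each" default suggested here.
fn configure() -> ReqOptions {
    ReqOptions::default()
        .with_max_queue_size(8192)        // bounded frontend-to-driver queue
        .with_max_pending_requests(8192)  // bounded in-flight requests
}
```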
docs: clarify channel_size and pending_requests_hwm coupling
@claude review this PR for correctness
Claude finished @mempirate's task in 1m 48s.

Code Review - High-Water Marks Implementation

I've reviewed the PR implementation for driver high-water marks. Here's my detailed analysis:

Architecture & Design ✅

The implementation correctly uses the SubscriberSession pattern:

The HWM is split into two configurable options as discussed:

Correctness Issues Found

1.
mempirate left a comment:
Naming nits
msg-socket/src/rep/driver.rs (outdated)

```rust
/// Currently, this driver will use the following sources of work:
/// 1. Connected peers (scheduled in task 2 and 3)
/// 2. Authentication tasks for connecting peers (scheduled in task 3)
/// 3. Incoming new connections for connecting peers (scheduled in task 5)
/// 4. Process control signals for the underlying transport (doesn't restart the loop)
/// 5. Incoming connections from the underlying transport
```
What does "scheduled in task n" refer to? I think it's not quite obvious, unless those are specific functions code paths you can name and refer to.
Ok, I will clarify it, but "task n" means "point number n", and in that sense point number n somehow creates work for point number m.
For example, task 5 (incoming connections from the underlying transport) creates a conn_task which is processed by task 3.
Please check the new commit :) Rephrased it a bit and got Claude to do a small ASCII diagram for it.
```rust
/// TODO: there might be some [`Self::pending_requests`] yet to be processed. TBD how to handle
/// them, for now they're dropped.
```
is this still relevant?
Yes, because we would still have requests in pending_requests.
This only flushes pending_egress, where before it was a flush of the egress_queue. In a way, this is the same as an egress_queue with size 1.
This TODO just signals that we don't send any response for pending_requests; we could, for example, send a connection reset instead... Or at least that's my understanding here.
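As one option for that TODO, pending requests could be failed instead of silently dropped on shutdown. A sketch with assumed types (the crate's actual response and error types differ):

```rust
use std::collections::HashMap;

use tokio::sync::oneshot;

/// Illustrative error type; the crate's actual error enum may differ.
#[derive(Debug)]
enum RequestError {
    ConnectionReset,
}

/// Sketch: on shutdown, fail each request still in pending_requests so the
/// requester's oneshot resolves with an error rather than just being dropped.
fn drain_pending(
    pending_requests: &mut HashMap<u32, oneshot::Sender<Result<Vec<u8>, RequestError>>>,
) {
    for (_id, tx) in pending_requests.drain() {
        // Ignore send failures: the requester may have gone away already.
        let _ = tx.send(Err(RequestError::ConnectionReset));
    }
}
```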
mempirate left a comment:
LGTM! Just 2 nits but approved
msg-socket/src/req/mod.rs (outdated)

```rust
/// The default buffer size for the socket.
const DEFAULT_BUFFER_SIZE: usize = 1024;
use crate::DEFAULT_BUFFER_SIZE;
```
Let's rename this to DEFAULT_QUEUE_SIZE, more accurate now
It's also used by the other drivers for various buffers and queues. For example, Rep uses it for the number of pending responses, but also for the channel sizes for control signals and request notifications.
Req uses it for the queue size, yes, and also for pending requests.
Sub uses it for the notification channel size, as well as for the size of the channel that queues messages to the driver...
Overall it's used in many places. I'll group the usages and dedicate separate constants where needed.
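The grouping might end up as dedicated constants along these lines (names and values are hypothetical, following the ~8k default mentioned earlier in the thread):

```rust
// Hypothetical dedicated constants replacing the shared DEFAULT_BUFFER_SIZE;
// names and values are illustrative, not the crate's actual definitions.

/// Frontend-to-driver message queue size (Req).
const DEFAULT_QUEUE_SIZE: usize = 8192;
/// High-water mark for in-flight requests per peer.
const DEFAULT_PENDING_REQUESTS_HWM: usize = 8192;
/// Channel size for control signals and request notifications (Rep, Sub).
const DEFAULT_CONTROL_CHANNEL_SIZE: usize = 1024;
```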
This PR attempts to address the issue outlined in #13.
It does so by using the SubscriberSession pattern: one pending outgoing message in the driver, and the rest in the originating channel.
Additionally, there's a configurable HWM for the number of pending requests for a given peer, allowing only up to N requests to be pulled into the PeerState. This effectively allows HWM + Framed buffer size messages to be received by the network socket before we stop polling data from it.
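A minimal sketch of that gating logic, with field names assumed for illustration (not the crate's actual PeerState definition):

```rust
/// Illustrative stand-in for the per-peer state described above.
struct PeerState {
    /// Requests pulled in but not yet responded to.
    pending_requests: usize,
    /// Configured high-water mark for pending requests.
    pending_requests_hwm: usize,
}

impl PeerState {
    /// Only read a new request off the socket while below the HWM.
    /// Beyond it, bytes stay in the Framed read buffer (and then the
    /// kernel buffer), so backpressure propagates to the sender.
    fn should_poll_socket(&self) -> bool {
        self.pending_requests < self.pending_requests_hwm
    }
}
```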