ensure peer_connected is called before peer_disconnected #3110
	("Route Handler", self.message_handler.route_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
	("Channel Handler", self.message_handler.chan_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
	("Onion Handler", self.message_handler.onion_message_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
];
If you don't like this attempt to DRY up the handling then I'm fine just having separate results where I check them one by one with their own log messages.
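For illustration, the idea of calling every handler and only then deciding whether the connection failed can be sketched as below. This is a minimal, self-contained sketch, not the PR's code: the `PeerHandler` trait, the `Recording` test double, and `connect_all` are hypothetical names, and the real handler traits take different arguments.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a message handler; not LDK's actual trait.
trait PeerHandler {
    fn peer_connected(&self, peer_id: u8, inbound: bool) -> Result<(), ()>;
}

/// Test double that counts calls and accepts or rejects every peer
/// depending on `accept`.
struct Recording {
    accept: bool,
    calls: Cell<u32>,
}

impl Recording {
    fn new(accept: bool) -> Self {
        Recording { accept, calls: Cell::new(0) }
    }
}

impl PeerHandler for Recording {
    fn peer_connected(&self, _peer_id: u8, _inbound: bool) -> Result<(), ()> {
        self.calls.set(self.calls.get() + 1);
        if self.accept { Ok(()) } else { Err(()) }
    }
}

/// Calls `peer_connected` on every handler and returns the names of the
/// handlers that rejected the peer, instead of returning on the first error.
fn connect_all(
    handlers: &[(&'static str, &dyn PeerHandler)],
    peer_id: u8,
    inbound: bool,
) -> Result<(), Vec<&'static str>> {
    let mut rejected = Vec::new();
    for (name, handler) in handlers {
        // No `?` or early `return` here: later handlers must still be notified.
        if handler.peer_connected(peer_id, inbound).is_err() {
            rejected.push(*name);
        }
    }
    if rejected.is_empty() { Ok(()) } else { Err(rejected) }
}
```

Because every handler has seen `peer_connected` by the time an error is reported, it is safe to call `peer_disconnected` on all of them afterwards. The returned names can be used for the per-handler log messages.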
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3110      +/-   ##
==========================================
+ Coverage   89.84%   90.78%   +0.93%
==========================================
  Files         119      119
  Lines       97561   103463    +5902
  Branches    97561   103463    +5902
==========================================
+ Hits        87655    93925    +6270
+ Misses       7331     7032     -299
+ Partials     2575     2506      -69
TheBlueMatt
left a comment
Thanks! I think we also need to move the peer_lock.their_features call up - we only call peer_disconnected if that line has been hit (Peer::handshake_complete checks for it) so we want to always hit that immediately before we call peer_connecteds.
Force-pushed: 4a1cade to db3b148
Whoops, fixed it. It can't go before the calls to peer_connected because they pass a reference to msg, but as long as we do it before returning it should be okay.
Force-pushed: db3b148 to 3a8f3b2
TheBlueMatt
left a comment
Would be nice to get a test (which should be pretty easy), but either way LGTM.
Doesn't look like there's an easy way to test it with the existing test message handlers. Should I create new ones that can error on peer_connected and track whether connected/disconnected have been called, or add that functionality to the existing test handlers? Have used something like
Is this what you had in mind for being able to test it?
Yea, I was figuring you'd just create a trivial
Hm, using a CustomMessageHandler doesn't really test the fix here since it goes last. One of the issues was the early return causing the later handlers to not get the
I guess at least it would catch the fix for ensuring disconnect is called.
Added a test that passes, but it duplicates a ton of code to redo all of the setup with the new message handlers :| Not sure if this is okay; looking for feedback on the test and how to do it better if it's not.
Force-pushed: 2c4c40a to 7a29c39
Force-pushed: 7a29c39 to 922c31f
if let Err(()) = self.message_handler.custom_message_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection) {
	log_debug!(logger, "Custom Message Handler decided we couldn't communicate with peer {}", log_pubkey!(their_node_id));

peer_lock.their_features = Some(msg.features);
I'm a bit confused by this: wouldn't that lead to us falsely assuming the handshake succeeded even though one of our handlers rejected it? And there is a window between us dropping the lock and handling the disconnect event where we would deal with the peer in a 'normal' manner, e.g., accepting further messages and potentially rebroadcasting, etc.?
(cc @TheBlueMatt as he requested this change)
hm, if that's true then seems like we'll need to separate "handshake_completed" from "triggered peer_connected" with a new flag on the peer that we can use to decide whether or not to trigger peer_disconnected in do_disconnect and disconnect_event_internal?
Yea, kinda, we'll end up forwarding broadcasts, as you point out, which is maybe not ideal, but we shouldn't process any further messages - we're currently in a read processing call, and we require read processing calls for any given peer to be serial, so presumably when we return an error the read-processing pipeline for this peer will stall and we won't get any more reads. We could make that explicit in the docs, however.
Mhh, rather than introducing this race-y behavior in the first place, couldn't we just introduce a new handshake_aborted flag and check that alternatively to !peer.handshake_complete in disconnect_event_internal?
That's fine too.
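The handshake_aborted idea discussed in this thread can be sketched roughly as follows. This is a simplified model under stated assumptions, not LDK's `Peer` struct: features are reduced to a `u64`, and `finish_handshake` and `needs_peer_disconnected` are hypothetical helper names. The only detail taken from the thread is that handshake_complete is derived from their_features being set.

```rust
/// Simplified, hypothetical model of per-peer state; not LDK's `Peer` struct.
struct Peer {
    /// Set once the peer's init message has been accepted by all handlers.
    their_features: Option<u64>,
    /// Hypothetical flag: some handler rejected the peer in `peer_connected`.
    handshake_aborted: bool,
}

impl Peer {
    fn new() -> Self {
        Peer { their_features: None, handshake_aborted: false }
    }

    /// Mirrors the check described in the thread: the handshake counts as
    /// complete only once `their_features` has been set.
    fn handshake_complete(&self) -> bool {
        self.their_features.is_some()
    }

    /// Records the outcome after every handler's `peer_connected` has run.
    fn finish_handshake(&mut self, features: u64, all_handlers_accepted: bool) {
        if all_handlers_accepted {
            self.their_features = Some(features);
        } else {
            // Leave `their_features` unset so the peer is never treated as
            // connected (for example when forwarding broadcasts).
            self.handshake_aborted = true;
        }
    }

    /// Whether disconnect handling should call `peer_disconnected`.
    fn needs_peer_disconnected(&self) -> bool {
        self.handshake_complete() || self.handshake_aborted
    }
}
```

With this split, a rejected peer still triggers `peer_disconnected` on the handlers that saw `peer_connected`, but it never passes the `handshake_complete` check, which avoids the window described above.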
use crate::ln::msgs::{Init, LightningError, SocketAddress};
use crate::util::test_utils;
nit: Drop superfluous whitespace.
	}
}

struct TestPeerTrackingMessageHandler {
Hmm, I believe the alternative would be to add TestCustomMessageHandler and TestOnionMessageHandler to test_utils and use them as part of the default test setup?
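As a rough illustration of what a connection-tracking test handler could look like, here is a minimal sketch. The type `TrackingHandler`, its fields, and its method signatures are hypothetical and much simpler than the real handler traits; the struct in this PR may be shaped differently.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical test handler that records which callbacks were invoked and
/// can be configured to reject every peer.
struct TrackingHandler {
    reject_connections: bool,
    connected_called: AtomicBool,
    disconnected_called: AtomicBool,
}

impl TrackingHandler {
    fn new(reject_connections: bool) -> Self {
        TrackingHandler {
            reject_connections,
            connected_called: AtomicBool::new(false),
            disconnected_called: AtomicBool::new(false),
        }
    }

    /// Records the call first, then rejects if configured to do so.
    fn peer_connected(&self) -> Result<(), ()> {
        self.connected_called.store(true, Ordering::SeqCst);
        if self.reject_connections { Err(()) } else { Ok(()) }
    }

    fn peer_disconnected(&self) {
        self.disconnected_called.store(true, Ordering::SeqCst);
    }

    fn saw_connect(&self) -> bool {
        self.connected_called.load(Ordering::SeqCst)
    }

    fn saw_disconnect(&self) -> bool {
        self.disconnected_called.load(Ordering::SeqCst)
    }
}
```

A test can then install one rejecting handler ahead of several accepting ones and assert that every handler reports both `saw_connect()` and `saw_disconnect()` after the failed handshake.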
@johncantrell97 Any interest in finishing this PR?
Superseded by #3580.
Fixes #3108
Makes sure all message handlers' peer_connected methods are called instead of returning early on the first to error.

As for whether or not the user has to call back into socket_disconnected after a PeerManager::read_event, I assume you mean after it returns an Err? I think the user does not have to because read_event will call disconnect_event_internal on any error before returning it to the user.

I took a look at lightning-net-tokio and it appears to be the case over there as well. It does:

Only calling socket_disconnected if the disconnection type is one the user detected. If read_event returns an Err it breaks with a disconnection type of Disconnect::CloseConnection and does not call back into socket_disconnected.

Matt seems to think you do have to, so I'm probably misunderstanding the original question. Happy to dig into it a bit more with some clarification if I misunderstood.
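The read-loop behaviour described above can be modelled with a small sketch. Everything here is a hypothetical stand-in and not the lightning-net-tokio source: `MockPeerManager`, `SocketEvent`, and `drive_connection` are invented for illustration, and the `PeerDisconnected` variant name is an assumption. Only `Disconnect::CloseConnection`, `read_event`, and `socket_disconnected` are named in the description.

```rust
/// Why the read loop stopped. `PeerDisconnected` is an assumed variant name.
#[derive(Debug, PartialEq)]
enum Disconnect {
    /// `read_event` returned an error; the peer manager already cleaned up.
    CloseConnection,
    /// The socket layer itself noticed that the remote side went away.
    PeerDisconnected,
}

/// Hypothetical socket-level events fed to the loop.
enum SocketEvent {
    Data { read_event_ok: bool },
    RemoteClosed,
}

/// Mock that only counts how often `socket_disconnected` is called.
struct MockPeerManager {
    socket_disconnected_calls: u32,
}

impl MockPeerManager {
    fn read_event(&mut self, ok: bool) -> Result<(), ()> {
        // In the flow described above, the error path has already run the
        // internal disconnect handling before the error is returned.
        if ok { Ok(()) } else { Err(()) }
    }

    fn socket_disconnected(&mut self) {
        self.socket_disconnected_calls += 1;
    }
}

fn drive_connection(pm: &mut MockPeerManager, events: &[SocketEvent]) -> Disconnect {
    // Running out of events is treated like the remote closing the socket.
    let mut reason = Disconnect::PeerDisconnected;
    for event in events {
        match event {
            SocketEvent::Data { read_event_ok } => {
                if pm.read_event(*read_event_ok).is_err() {
                    reason = Disconnect::CloseConnection;
                    break;
                }
            }
            SocketEvent::RemoteClosed => break,
        }
    }
    // Only report disconnects that the socket layer detected itself.
    if reason == Disconnect::PeerDisconnected {
        pm.socket_disconnected();
    }
    reason
}
```

In this model an error from `read_event` ends the loop without a second notification, while a disconnect detected by the socket layer is reported exactly once.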