ensure peer_connected is called before peer_disconnected #3110

johncantrell97 · 2024-06-07T15:14:33Z

Makes sure all message handler's peer_connected methods are called instead of returning early on the first to error.

As for whether or not the user has to call back into socket_disconnected after a PeerManager::read_event, I assume you mean after it returns an Err? I think the user does not have to because read_event will call disconnect_event_internal on any error before returning it to the user.

I took a look at lightning-net-tokio and it appears to be the case over there as well. It does:

if let Disconnect::PeerDisconnected = disconnect_type {
    peer_manager.as_ref().socket_disconnected(&our_descriptor);
    peer_manager.as_ref().process_events();
}

Only calling socket_disconnected if the disconnection type is one the user detected. If read_event returns an Err it breaks with a disconnection type of Disconnect::CloseConnection and does not call back into socket_disconnected.

Matt seems to think you do have to so I'm probably misunderstanding the original question. Happy to dig into it a bit more with some clarification if I misunderstood.

johncantrell97 · 2024-06-07T15:15:58Z

lightning/src/ln/peer_handler.rs

+				("Route Handler", self.message_handler.route_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
+				("Channel Handler", self.message_handler.chan_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
+				("Onion Handler", self.message_handler.onion_message_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
+			];


If you don't like this attempt to dry up the handling then I'm find just having separate results where I check them one by one with their own log messages.

codecov-commenter · 2024-06-07T15:24:10Z

Codecov Report

Attention: Patch coverage is 92.56757% with 11 lines in your changes missing coverage. Please review.

Project coverage is 90.78%. Comparing base (9789152) to head (922c31f).
Report is 1274 commits behind head on main.

Files with missing lines	Patch %	Lines
lightning/src/ln/peer_handler.rs	92.56%	10 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3110      +/-   ##
==========================================
+ Coverage   89.84%   90.78%   +0.93%     
==========================================
  Files         119      119              
  Lines       97561   103463    +5902     
  Branches    97561   103463    +5902     
==========================================
+ Hits        87655    93925    +6270     
+ Misses       7331     7032     -299     
+ Partials     2575     2506      -69

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

TheBlueMatt

Thanks! I think we also need to move the peer_lock.their_features call up - we only call peer_disconnected if that line has been hit (Peer::handshake_complete checks for it) so we want to always hit that immediately before we call peer_connecteds.

johncantrell97 · 2024-06-07T16:04:08Z

Thanks! I think we also need to move the peer_lock.their_features call up - we only call peer_disconnected if that line has been hit (Peer::handshake_complete checks for it) so we want to always hit that immediately before we call peer_connecteds.

Whoops, fixed it.

Can't be before calls to peer_connected because they pass a reference to msg but as long as we do it before returning it should be okay.

TheBlueMatt

Would be nice to get a test (which should be pretty easy), but either way LGTM.

johncantrell97 · 2024-06-07T19:13:16Z

Would be nice to get a test (which should be pretty easy), but either way LGTM.

Doesn't look like there's easy way to handle testing it with the existing test message handlers. Should I create new ones that can error on peer_connected and track connected/disconnected have been called or add the functionality to the existing test handlers?

Have used something like mockall for this in the past but without it I'd have to add counters/flags that get updated and a way to check them.

Is this what you had in mind for being able to test it?

TheBlueMatt · 2024-06-07T19:25:25Z

Yea, I was figuring you'd just create a trivial CustomMessageHandler that asserts that connected/disconnecteds all come in order and then errors on connection.

johncantrell97 · 2024-06-07T19:52:37Z

Yea, I was figuring you'd just create a trivial CustomMessageHandler that asserts that connected/disconnecteds all come in order and then errors on connection.

Hm, using a CustomMessageHandler doesn't really test the fix here since it goes last. One of the issues was the early return causing the later handlers to not get the peer_connected at all. Would have to use multiple new handlers to be able to check the one after an error is returned still gets peer_connected called on it (and disconnected)

I guess at least it would catch the fix for ensuring disconnect is called.

johncantrell97 · 2024-06-07T21:15:12Z

@TheBlueMatt

Added a test that passes but it duplicates a ton of code to handle all of the setup but with the new message handlers :|

not sure if this is okay, looking for feedback on the test and how to do it better if it's not okay.

tnull · 2024-06-10T09:35:26Z

lightning/src/ln/peer_handler.rs

-			if let Err(()) = self.message_handler.custom_message_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection) {
-				log_debug!(logger, "Custom Message Handler decided we couldn't communicate with peer {}", log_pubkey!(their_node_id));
+
+			peer_lock.their_features = Some(msg.features);


I'm a bit confused by this, wouldn't that lead to use falsely assuming the handshake succeeded even though one of our handlers rejected it? And there is a window between us dropping the lock and handling the disconnect even where we would deal with it in a 'normal' manner, e.g., accepting further messages, and potentially rebroadcasting etc?

(cc @TheBlueMatt as he requested this change)

hm, if that's true then seems like we'll need to separate "handshake_completed" from "triggered peer_connected" with a new flag on the peer that we can use to decide whether or not to trigger peer_disconnected in do_disconnect and disconnect_event_internal?

Yea, kinda, we'll end up forwarding broadcasts, as you point out, which is maybe not ideal, but we shouldn't process any further messages - we're currently in a read processing call, and we require read processing calls for any given peer to be serial, so presumably when we return an error the read-processing pipeline for this peer will stall and we won't get any more reads. We could make that explicit in the docs, however.

Mhh, rather than introducing this race-y behavior in the first place, couldn't we just introduce a new handshake_aborted flag and check that alternatively to !peer.handshake_complete in disconnect_event_internal?

That's fine too.

tnull · 2024-06-10T09:36:32Z

lightning/src/ln/peer_handler.rs

 	use crate::ln::msgs::{Init, LightningError, SocketAddress};
 	use crate::util::test_utils;

+


nit: Drop superfluous whitespace.

tnull · 2024-06-10T09:43:53Z

lightning/src/ln/peer_handler.rs

 		}
 	}

+	struct TestPeerTrackingMessageHandler {


Hmm, I believe the alternative would be to add TestCustomMessageHandler and TestOnionMessageHandler to test_utils and use them as part of the default test setup?

tnull · 2024-07-17T09:12:42Z

@johncantrell97 Any interest in finishing this PR?

TheBlueMatt · 2025-01-30T20:47:51Z

Supersceded by #3580.

johncantrell97 commented Jun 7, 2024

View reviewed changes

TheBlueMatt reviewed Jun 7, 2024

View reviewed changes

johncantrell97 force-pushed the 2024-06-robust-peer-disconnected-events branch from 4a1cade to db3b148 Compare June 7, 2024 16:02

ensure peer_connected is called before peer_disconnected

3a8f3b2

johncantrell97 force-pushed the 2024-06-robust-peer-disconnected-events branch from db3b148 to 3a8f3b2 Compare June 7, 2024 16:18

TheBlueMatt previously approved these changes Jun 7, 2024

View reviewed changes

johncantrell97 dismissed TheBlueMatt’s stale review via 7c5f0c4 June 7, 2024 21:13

johncantrell97 force-pushed the 2024-06-robust-peer-disconnected-events branch 2 times, most recently from 2c4c40a to 7a29c39 Compare June 7, 2024 23:00

add test

922c31f

johncantrell97 force-pushed the 2024-06-robust-peer-disconnected-events branch from 7a29c39 to 922c31f Compare June 8, 2024 01:22

tnull reviewed Jun 10, 2024

View reviewed changes

TheBlueMatt mentioned this pull request Jan 30, 2025

Ensure peer_disconnected is called after a handler refuses a connection #3580

Merged

TheBlueMatt closed this Jan 30, 2025

		use crate::ln::msgs::{Init, LightningError, SocketAddress};
		use crate::util::test_utils;

ensure peer_connected is called before peer_disconnected #3110

ensure peer_connected is called before peer_disconnected #3110

Uh oh!

Conversation

johncantrell97 commented Jun 7, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Jun 7, 2024 • edited by codecov bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

TheBlueMatt left a comment

Choose a reason for hiding this comment

Uh oh!

johncantrell97 commented Jun 7, 2024

Uh oh!

TheBlueMatt left a comment

Choose a reason for hiding this comment

Uh oh!

johncantrell97 commented Jun 7, 2024

Uh oh!

TheBlueMatt commented Jun 7, 2024

Uh oh!

johncantrell97 commented Jun 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

johncantrell97 commented Jun 7, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tnull commented Jul 17, 2024

Uh oh!

TheBlueMatt commented Jan 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Jun 7, 2024 •

edited by codecov bot

Loading

johncantrell97 commented Jun 7, 2024 •

edited

Loading