Backports for 0.6.2 #613
Conversation
Previously, we had to configure enormous syncing timeouts as the BDK wallet syncing would hold a central mutex that could lead to large parts of event handling and syncing locking up. Here, we drop the configured timeouts considerably across the board, since such huge values are hopefully not required anymore.
Previously, we used a channel to indicate that the background processor task had been stopped. Here, we instead just await the task's `JoinHandle`, which is more robust in that it avoids a race condition.
.. we provide finer-grained logging after each step of `stop`.
👋 Thanks for assigning @TheBlueMatt as a reviewer!
Now also includes #590 to fix the CLN CI on this branch, too.
```rust
loop {
	let timeout_fut = tokio::time::timeout(
		Duration::from_secs(BACKGROUND_TASK_SHUTDOWN_TIMEOUT_SECS),
		tasks.join_next_with_id(),
```
Rather than letting each of these just take their time, IMO we should have a separate `JoinSet` for the things that we can cancel right away vs. the ones we have to join. That way we can just call `abort_all` for the tasks that we really don't want to wait around on for no reason.
Going through the callers we have:
- `continuously_sync_wallets` - not clear that this is cancel-safe? I assume not, since it's all BDK stuff and they don't guarantee that?
- RGS sync - clearly can just be cancelled. There's no need to wait for a timeout.
- inbound TCP acceptor - clearly can just be cancelled, though it should reliably exit fast.
- reconnect loop - clearly can just be cancelled, and it might be important because the connect logic loops until the peer finishes its handshake.
- node announce - clearly we can just cancel.
- liquidity handler - probably not cancel-safe.
We do have a stop signal for all of these, which should be very similar to aborting for most of the cited cases, as ~all of them should be selecting ~instantly on the signal receiver.

Note that most of them also perform IO, even if it's just to update `NodeMetrics` after they're done (e.g., RGS, node announcement). So, for one, some of these tasks might simply not be cancellable (as you can't cancel IOPS/`spawn_blocking`), but I also think it's cleaner/easier to reason about if we simply let them reach the next await point.
If you find it much preferable I can add a separate set for the TCP tasks, but not entirely convinced it's worth it?
> We do have a stop signal for all of these, which should be very similar to aborting for most of the cited cases, as ~all of them should be selecting ~instantly on the signal receiver.
This is at least not true for the background connection task. It has a sleep in the loop but it isn't selected with the exit task. If it happens to be connecting that guarantees we'll wait like a few RTTs to exit, but if the peer is hung it could be a lot longer. Rather than trying to pipe the exit signal through everywhere it seems much simpler to just cancel the tasks.
Okay, now split the TCP tasks out.
Force-pushed from edd22b2 to dbe11e8.
src/lib.rs (outdated)
```rust
@@ -661,6 +684,53 @@ impl Node {
	self.chain_source.stop();
	log_debug!(self.logger, "Stopped chain sources.");

	// Cancel cancellable background tasks
	if let Some(mut tasks) = self.cancellable_background_tasks.lock().unwrap().take() {
		tasks.abort_all();
```
We still want to `join` them even after we `abort`, I think.
Why would we?
Because they may have been in sync code at the time we called `abort_all`, and we want to wait until they reach an await point so that we know they're actually not running. It's not all that critical, but in theory the connection logic is still race-y, I think.
Okay, now doing so.
Force-pushed from dbe11e8 to a0be1a5.
src/lib.rs (outdated)
```rust
	log_debug!(self.logger, "Stopped chain sources.");

	// Cancel cancellable background tasks
	if let Some(mut tasks) = self.cancellable_background_tasks.lock().unwrap().take() {
```
Also, doesn't this need to happen after `disconnect_all_peers`, which is a few lines up?
You mean before? For the TCP tasks yes. Moved up.
Oops, yes, thanks.
Previously, we'd only wait for the background processor task to successfully finish. It turned out that this could lead to races when the other background tasks took too long to shut down. Here, we attempt to wait on all background tasks shutting down for a bit, before moving on.
.. as tokio tends to panic when dropping a runtime in an async context if we're not super careful. Here, we add some test coverage for this edge case in Rust tests.
Force-pushed from a0be1a5 to 1a2d900.
These are backports of #592 and #612 for the upcoming 0.6.2 release.