chancloser: don't set ChanStatusCoopBroadcasted before close tx exists #10782
jtobin wants to merge 4 commits into lightningnetwork:master
Summary of Changes (Gemini Code Assist)
This pull request addresses a critical issue where channels could become stranded in a limbo state during cooperative close negotiation if a peer disconnected before the closing transaction was finalized.
🔴 PR Severity: CRITICAL
🔴 Critical (2 files)
🟢 Low (1 file + excluded)
Analysis
Severity bump check: 3 non-excluded files (< 20 threshold), 31 non-excluded lines changed (< 500 threshold), and only one distinct critical package touched, so no bump applied. This PR warrants expert review due to the wallet/channel-close logic changes, even though the diff is small. The cooperative close path and RBF transitions directly affect fund safety.
Code Review
This pull request addresses a bug where ChanStatusCoopBroadcasted was set before a cooperative close transaction actually existed, leading to issues during node restarts. The changes remove the premature marking of the channel as broadcasted in chancloser.go and associated RBF transition logic, ensuring the channel remains in a default state until a transaction is ready. New integration and unit tests were added to verify that channels with ShutdownInfo now correctly resume negotiation after a restart. One review comment regarding an unnecessary variable declaration was identified.
ziggie1984 left a comment
Nice writeup. The analysis of why MarkCoopBroadcasted(nil, …) is redundant matches what I see tracing readers of ChanStatusCoopBroadcasted through loadActiveChannels, restartCoopClose, and the chain arbitrator. The fix is correct: ShutdownInfo (persisted by MarkShutdownSent in initChanShutdown) is the right durable signal, and the existing peer/brontide.go:1367-1487 reconnect path already consumes it. A few things before this lands:
The new tests don't actually defend the fix. I verified empirically by cherry-picking the itest commit onto the parent of the fix. The held HODL HTLC keeps the channel from flushing, so neither BeginNegotiation nor the RBF ChannelFlushing → ClosingNegotiation transition runs before the suspend; the call sites this PR removes are simply unreached. Same issue with the two new unit tests: TestShutdownInfoChannelStaysActive is tautological (MarkShutdownSent only writes to shutdownInfoKey, so by construction it can't mutate chanStatus or the in-memory map), and TestResendChanSyncFailsForCoopBroadcastedLimbo manually creates the limbo via MarkCoopBroadcasted(nil, …) rather than relying on production code to produce it, so it tests the symptom, not that the fix prevents the state.
Real regression coverage exists on the RBF side, not the legacy side. Removing expectChanPendingClose() from TestRbfChannelFlushingTransitions implicitly turns that test into a "no unexpected MarkCoopBroadcasted call" assertion via testify/mock's Called() + AssertExpectations. ✓ But the legacy BeginNegotiation path is unguarded: mockChannel.MarkCoopBroadcasted (chancloser_test.go:164) is a hand-rolled stub with no call tracking. A regression that re-adds the nil call there would slip through.
Suggestion: drop the itest and the two new unit tests, and add call tracking to the legacy mockChannel: either a markCoopBroadcastedCalls slice with a require.NotContains(t, …, nilTx) assertion in the existing BeginNegotiation test, or convert it to embed mock.Mock like the RBF side. It's a few lines, takes milliseconds to run, and actually defends both paths.
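The call-tracking option suggested above could look roughly like the following. This is a self-contained sketch, not lnd's actual test code: `closeTx` stands in for `*wire.MsgTx`, `mockChannel` is reduced to the one method of interest, and `beginNegotiation` and `nilTxCalls` are hypothetical helpers introduced only for illustration.

```go
package main

import "fmt"

// closeTx is a hypothetical stand-in for *wire.MsgTx.
type closeTx struct {
	txid string
}

// mockChannel is a reduced, hypothetical version of the legacy test stub.
// It records every transaction passed to MarkCoopBroadcasted.
type mockChannel struct {
	markCoopBroadcastedCalls []*closeTx
}

// MarkCoopBroadcasted records the call instead of silently succeeding.
func (m *mockChannel) MarkCoopBroadcasted(tx *closeTx) error {
	m.markCoopBroadcastedCalls = append(m.markCoopBroadcastedCalls, tx)
	return nil
}

// nilTxCalls counts the recorded calls that carried no close transaction.
func nilTxCalls(m *mockChannel) int {
	n := 0
	for _, tx := range m.markCoopBroadcastedCalls {
		if tx == nil {
			n++
		}
	}
	return n
}

// beginNegotiation is a hypothetical placeholder for the code under test.
// With markEarly set, it reproduces the old behaviour of marking the
// channel as broadcast before any close transaction exists.
func beginNegotiation(m *mockChannel, markEarly bool) {
	if markEarly {
		_ = m.MarkCoopBroadcasted(nil)
	}
}

func main() {
	fixed := &mockChannel{}
	beginNegotiation(fixed, false)
	fmt.Println("nil-tx calls after fix:", nilTxCalls(fixed))

	regressed := &mockChannel{}
	beginNegotiation(regressed, true)
	fmt.Println("nil-tx calls with regression:", nilTxCalls(regressed))
}
```

In a real test the count check would be an assertion (for example, requiring that `nilTxCalls` is zero), so a regression that re-adds the nil call fails immediately.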
Two smaller items:
- ChannelFlushed.FreshFlush is now dead: its only producer (peer/brontide.go:4075) sets it unconditionally to true, and the test loops at rbf_coop_test.go:1785, 1822 no longer differentiate between the two values. Worth removing in this PR or a follow-up.
- The PendingChannels/ListChannels visibility change is acknowledged in the PR description but missing from the release note. Operators with monitoring keyed on those buckets will see channels move differently: channels stay in ListChannels (inactive) until the real close tx is broadcast, and WaitingCloseChannel.ClosingTx is no longer ever empty. Worth a sentence.
Hmm, not sure, but I think that before this change the channel would immediately move to the pendingClose channels, and with this change it only moves to pending close once the close transaction has actually been created, which I think is still the right behavior. A channel can still remain in the
Remove the two call sites that set ChanStatusCoopBroadcasted before a cooperative close transaction exists:
- BeginNegotiation in the legacy close path (chancloser.go)
- ChannelFlushed handling in the RBF close path (rbf_coop_transitions.go)
Both calls passed nil as the close tx, creating a "limbo" state where ChanStatusCoopBroadcasted is set but no close transaction is stored.
This is unnecessary because ShutdownInfo (persisted earlier by MarkShutdownSent in initChanShutdown / the RBF ShutdownPending transition) already serves as the durable signal that the shutdown flow was entered. ChanStatusCoopBroadcasted should only be set when a real close transaction exists, which this change preserves.
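The limbo state described in this commit message can be illustrated with a toy model. The sketch below is deliberately simplified and is not lnd's channeldb code: the `channel` struct, the `chanStatus` bit field, the string transaction ID, and `restartAction` are all hypothetical names chosen for illustration, and the restart decisions are an assumption about the general shape of the logic rather than a copy of it.

```go
package main

import "fmt"

// chanStatus is a hypothetical bit field modelled on a channel status flag set.
type chanStatus uint8

const (
	statusDefault         chanStatus = 0
	statusCoopBroadcasted chanStatus = 1 << 0
)

// channel is a simplified, hypothetical view of persisted channel state.
type channel struct {
	status       chanStatus
	closeTxID    string // empty when no close transaction is stored
	shutdownSent bool   // stands in for a persisted ShutdownInfo record
}

// markShutdownSent records that the shutdown flow was entered.
func (c *channel) markShutdownSent() { c.shutdownSent = true }

// markCoopBroadcasted sets the status bit and stores the close transaction
// ID when one is given. An empty ID models the removed nil-tx call sites.
func (c *channel) markCoopBroadcasted(closeTxID string) {
	c.status |= statusCoopBroadcasted
	if closeTxID != "" {
		c.closeTxID = closeTxID
	}
}

// inLimbo reports the inconsistent state: flagged as broadcast, no tx stored.
func (c *channel) inLimbo() bool {
	return c.status&statusCoopBroadcasted != 0 && c.closeTxID == ""
}

// restartAction is an assumed, simplified restart decision for this model.
func restartAction(c *channel) string {
	switch {
	case c.inLimbo():
		return "limbo: flagged as broadcast but nothing to rebroadcast"
	case c.status&statusCoopBroadcasted != 0:
		return "rebroadcast stored close tx"
	case c.shutdownSent:
		return "resume shutdown negotiation"
	default:
		return "normal operation"
	}
}

func main() {
	// Old flow: the flag is set with no transaction, then the peer disconnects.
	before := &channel{}
	before.markShutdownSent()
	before.markCoopBroadcasted("")
	fmt.Println("before fix:", restartAction(before))

	// New flow: only the shutdown record exists until a real tx is available.
	after := &channel{}
	after.markShutdownSent()
	fmt.Println("after fix: ", restartAction(after))
}
```

In this model, once the early call is removed, the flag can only be set together with a transaction ID, so the `inLimbo` branch becomes unreachable.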
Replace the peer-level unit tests and the integration test (dropped in previous rebase) with call tracking on the legacy mockChannel.MarkCoopBroadcasted stub. The old tests either never reached the removed call sites or were tautological.
The mockChannel now records every MarkCoopBroadcasted call, and TestTaprootFastClose asserts that no call was made with a nil tx, directly guarding against the limbo state described in lightninglabs/taproot-assets#2108.
Also remove the now-inert expectChanPendingClose method and its call sites in the RBF close test harness, which existed only to set up a mock expectation for MarkCoopBroadcasted(nil).
FreshFlush is never read in any transition handler. The only
producer (peer/brontide.go) sets it unconditionally to true,
and after the previous commit removed expectChanPendingClose,
the test loops that iterated over {true, false} no longer
differentiate between the two values.
Remove the field, the unconditional assignment, and collapse
the test loops into single sub-tests.
(CI errors from the last run look to be unrelated flakes, from what I can tell.)
what about ?
Was removed in 5150469. 👍 👍
Fixes lightninglabs/taproot-assets#2108.
Lightly-edited summary, courtesy of Opus, of the issue and its fix:
Of note: the OP's logs in the linked issue indicate that he landed in the ChanStatusCoopBroadcasted state without any close transaction, which, I believe, could only have occurred if MarkCoopBroadcasted was called with nil. There appear to be no other code paths that set ChanStatusCoopBroadcasted without storing a close tx.
I was pretty curious why those MarkCoopBroadcasted(nil) calls existed, but after investigating them thoroughly, I couldn't find any good justification for them. Opus's summary of this analysis is also worth including here: