splice_locked + channel_reestablish bug fixes#4654
Conversation
When a splice confirms while disconnected after our channel_reestablish was generated, promotion clears pending_splice before their reestablish arrives. Fall back to the current promoted splice funding txid so we still send splice_locked after reestablishing.
If we have pending updates to send to our counterparty on reestablishment, while also pending a `splice_locked` send, then we must send our `splice_locked` first as the pending updates are considering the post-splice-locked state.
A stale ChannelManager can be reloaded after a monitor update has already completed in a prior runtime and released its post-update messages to the counterparty. The latest ChannelMonitor is not stale, but the serialized manager may still contain the old in-flight monitor state and `monitor_pending_*` resend flags from before the completion action ran. This becomes observable when startup monitor-completion background events are interleaved with splice promotion. On reload, the completed monitor update is queued as a background event. If a splice confirmation is processed before that background event fully resumes the channel, splice promotion can create a new `RenegotiatedFundingLocked` monitor update. The old completion is then blocked behind the new in-flight splice update. Once the channel reconnects and the splice update completes, `monitor_updating_restored` may consume the stale `monitor_pending_revoke_and_ack` / `monitor_pending_commitment_signed` flags and release a duplicate `revoke_and_ack` or `commitment_signed`. The peer's `channel_reestablish` commitment numbers are authoritative for this case. If `next_remote_commitment_number` says the peer is not waiting for a `revoke_and_ack`, clear `monitor_pending_revoke_and_ack`. Likewise, if `next_local_commitment_number` says the peer already has our latest `commitment_signed`, clear `monitor_pending_commitment_signed`.
|
👋 Thanks for assigning @TheBlueMatt as a reviewer! |
| }).or_else(|| { | ||
| // If a splice confirms after we've sent `channel_reestablish` but before we've | ||
| // received theirs, we may promote the splice while disconnected and clear | ||
| // `pending_splice`. We still need to send `splice_locked` after reestablishing | ||
| // unless the promoted txid was already included in our `channel_reestablish`. | ||
| let current_funding_txid = self.funding.get_funding_txid()?; | ||
| (self.funding.channel_transaction_parameters.splice_parent_funding_txid.is_some() | ||
| && Some(current_funding_txid) != funding_locked_txid_sent_in_reestablish) | ||
| .then(|| msgs::SpliceLocked { | ||
| channel_id: self.context.channel_id, | ||
| splice_txid: current_funding_txid, | ||
| }) | ||
| }); |
There was a problem hiding this comment.
This or_else fires whenever the first closure returns None, which happens not only when self.pending_splice is None (the intended case per the comment), but also when self.pending_splice is Some and sent_funding_txid was filtered out because it matched funding_locked_txid_sent_in_reestablish.
In a sequential-splice scenario (first splice promoted, second splice negotiated and confirmed locally, then reconnect), this would send splice_locked for the previously promoted splice's txid. If the peer has already promoted that splice and has a pending_splice for the second splice, their splice_locked handler would check negotiated_candidates for the old txid, not find it, and force-close the channel with "unknown splice funding txid".
Consider adding self.pending_splice.is_none() to the guard to match the comment's intent—this branch should only handle the case where pending_splice was cleared due to promotion:
| }).or_else(|| { | |
| // If a splice confirms after we've sent `channel_reestablish` but before we've | |
| // received theirs, we may promote the splice while disconnected and clear | |
| // `pending_splice`. We still need to send `splice_locked` after reestablishing | |
| // unless the promoted txid was already included in our `channel_reestablish`. | |
| let current_funding_txid = self.funding.get_funding_txid()?; | |
| (self.funding.channel_transaction_parameters.splice_parent_funding_txid.is_some() | |
| && Some(current_funding_txid) != funding_locked_txid_sent_in_reestablish) | |
| .then(|| msgs::SpliceLocked { | |
| channel_id: self.context.channel_id, | |
| splice_txid: current_funding_txid, | |
| }) | |
| }); | |
| }).or_else(|| { | |
| // If a splice confirms after we've sent `channel_reestablish` but before we've | |
| // received theirs, we may promote the splice while disconnected and clear | |
| // `pending_splice`. We still need to send `splice_locked` after reestablishing | |
| // unless the promoted txid was already included in our `channel_reestablish`. | |
| let current_funding_txid = self.funding.get_funding_txid()?; | |
| (self.pending_splice.is_none() | |
| && self.funding.channel_transaction_parameters.splice_parent_funding_txid.is_some() | |
| && Some(current_funding_txid) != funding_locked_txid_sent_in_reestablish) | |
| .then(|| msgs::SpliceLocked { | |
| channel_id: self.context.channel_id, | |
| splice_txid: current_funding_txid, | |
| }) |
Review SummaryOne issue found:
The remaining changes look correct:
|
This PR features commits to fix three different issues found by the
chanmon_consistencytarget around the interaction ofsplice_lockedandchannel_reestablish. It fixes the following payloads from #4363: