Commit e64ed33

Generate new ReleasePaymentComplete monitor updates
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due to a block update), and the node crashes before the `ChannelManager` is persisted again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one it could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post-closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` are replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it when `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it. Constantly re-adding a payment state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment), so `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, or with a timeout after `ANTI_REORG_DELAY` confirmations. These criteria match the criteria for generating a `MonitorEvent`, and work great under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will never see payment resolution events for such an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete. Note that we don't particularly care about non-payment HTLCs, as there is no state to re-hydrate there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs`, since we always attempt to replay the success/failure and figure out whether it mattered based on whether there was still an HTLC to claim/fail.

Here we begin generating the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, updating functional tests for the new `ChannelMonitorUpdate`s where required.

Backport of 71a364c. Conflicts resolved in:
* lightning/src/ln/chanmon_update_fail_tests.rs
* lightning/src/ln/channelmanager.rs
* lightning/src/ln/functional_test_utils.rs
* lightning/src/ln/functional_tests.rs
* lightning/src/ln/monitor_tests.rs
* lightning/src/ln/outbound_payment.rs
* lightning/src/ln/payment_tests.rs

Note that unlike the original commit, on this branch we do not fail to deserialize a `ChannelMonitor` whose `counterparty_node_id` is `None` (implying it has not seen a `ChannelMonitorUpdate` since LDK 0.0.118). In such cases we skip the new logic and generate a warning log instead. As we already assumed it is reasonable to require `counterparty_node_id`s in LDK 0.2, it seems reasonable to skip the new logic here as well, at the cost of potentially generating some additional spurious payment events on restart.
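To make the mechanism concrete, the sketch below models the monitor-side bookkeeping the new update step drives. This is a minimal illustration with simplified stand-in types (`SentHTLCId` is reduced to a wrapped integer; the real `ChannelMonitor` tracks far more), not LDK's actual implementation:

```rust
use std::collections::HashSet;

// Simplified stand-in for LDK's `SentHTLCId`, which uniquely identifies an
// HTLC we originated.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct SentHTLCId(u64);

// Hypothetical monitor-side state: payment HTLCs whose on-chain resolution
// has not yet been durably handed off to the user's event handler.
struct MonitorPaymentState {
    unreleased_resolutions: HashSet<SentHTLCId>,
}

impl MonitorPaymentState {
    // Applying a `ReleasePaymentComplete { htlc }` update marks the HTLC's
    // resolution as fully delivered, so a reloaded `ChannelManager` need not
    // regenerate `PaymentSent`/`PaymentFailed` handling for it.
    fn release_payment_complete(&mut self, htlc: SentHTLCId) {
        self.unreleased_resolutions.remove(&htlc);
    }

    // On startup, only still-unreleased resolutions would be replayed.
    fn resolutions_to_replay(&self) -> impl Iterator<Item = &SentHTLCId> {
        self.unreleased_resolutions.iter()
    }
}
```

The key property is that the "done" marker lives in the `ChannelMonitor` itself, so it survives even if the `ChannelManager` is stale, lost, or never persisted.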
1 parent 6d6ed29 · commit e64ed33

9 files changed: +315 -57 lines

lightning/src/chain/channelmonitor.rs
Lines changed: 1 addition & 0 deletions

```diff
@@ -3356,6 +3356,7 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
         if updates.update_id == LEGACY_CLOSED_CHANNEL_UPDATE_ID || self.lockdown_from_offchain {
             assert_eq!(updates.updates.len(), 1);
             match updates.updates[0] {
+                ChannelMonitorUpdateStep::ReleasePaymentComplete { .. } => {},
                 ChannelMonitorUpdateStep::ChannelForceClosed { .. } => {},
                 // We should have already seen a `ChannelForceClosed` update if we're trying to
                 // provide a preimage at this point.
```
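The guard this hunk extends, sketched with stand-in types: post-close monitor updates must contain exactly one step, and only whitelisted steps (now including `ReleasePaymentComplete`) are accepted. The enum here is a hypothetical reduction; LDK's `ChannelMonitorUpdateStep` has several more variants:

```rust
// Hypothetical reduction of LDK's `ChannelMonitorUpdateStep`.
enum UpdateStep {
    ReleasePaymentComplete,
    ChannelForceClosed,
    PaymentPreimage,
}

// Post-close (or legacy-closed) updates must be a single whitelisted step;
// anything else indicates a bug and would be rejected.
fn post_close_update_acceptable(steps: &[UpdateStep]) -> bool {
    steps.len() == 1
        && matches!(
            steps[0],
            UpdateStep::ReleasePaymentComplete
                | UpdateStep::ChannelForceClosed
                | UpdateStep::PaymentPreimage
        )
}
```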

lightning/src/ln/chanmon_update_fail_tests.rs
Lines changed: 1 addition & 0 deletions

```diff
@@ -3314,6 +3314,7 @@ fn do_test_durable_preimages_on_closed_channel(close_chans_before_reload: bool,
 
     nodes[0].node.force_close_broadcasting_latest_txn(&chan_id_ab, &nodes[1].node.get_our_node_id(), error_message.to_string()).unwrap();
     check_closed_event(&nodes[0], 1, ClosureReason::HolderForceClosed { broadcasted_latest_txn: Some(true) }, false, &[nodes[1].node.get_our_node_id()], 100000);
+    check_added_monitors(&nodes[0], 1);
     let as_closing_tx = nodes[0].tx_broadcaster.txn_broadcasted.lock().unwrap().split_off(0);
     assert_eq!(as_closing_tx.len(), 1);
 
```

lightning/src/ln/channelmanager.rs
Lines changed: 165 additions & 16 deletions

```diff
@@ -1206,6 +1206,12 @@ pub(crate) enum EventCompletionAction {
         channel_funding_outpoint: Option<OutPoint>,
         channel_id: ChannelId,
     },
+
+    /// When a payment's resolution is communicated to the downstream logic via
+    /// [`Event::PaymentSent`] or [`Event::PaymentFailed`] we may want to mark the payment as
+    /// fully-resolved in the [`ChannelMonitor`], which we do via this action.
+    /// Note that this action will be dropped on downgrade to LDK prior to 0.2!
+    ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
 }
 impl_writeable_tlv_based_enum!(EventCompletionAction,
     (0, ReleaseRAAChannelMonitorUpdate) => {
@@ -1218,6 +1224,7 @@ impl_writeable_tlv_based_enum!(EventCompletionAction,
             ChannelId::v1_from_funding_outpoint(channel_funding_outpoint.unwrap())
         })),
     }
+    {1, ReleasePaymentCompleteChannelMonitorUpdate} => (),
 );
 
 /// The source argument which is passed to [`ChannelManager::claim_mpp_part`].
```
```diff
@@ -6907,11 +6914,20 @@ where
             if let Some(update) = from_monitor_update_completion {
                 // If `fail_htlc` didn't `take` the post-event action, we should go ahead and
                 // complete it here as the failure was duplicative - we've already handled it.
-                // This should mostly only happen on startup, but it is possible to hit it in
-                // rare cases where a MonitorUpdate is replayed after restart because a
-                // ChannelMonitor wasn't persisted after it was applied (even though the
-                // ChannelManager was).
-                // TODO
+                // This can happen in rare cases where a MonitorUpdate is replayed after
+                // restart because a ChannelMonitor wasn't persisted after it was applied (even
+                // though the ChannelManager was).
+                // For such cases, we also check that there's no existing pending event to
+                // complete this action already, which we let finish instead.
+                let action =
+                    EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update);
+                let have_action = {
+                    let pending_events = self.pending_events.lock().unwrap();
+                    pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action))
+                };
+                if !have_action {
+                    self.handle_post_event_actions([action]);
+                }
             }
         },
         HTLCSource::PreviousHopData(HTLCPreviousHopData {
```
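The same check-before-rerun pattern appears again in the reload path further down. As a standalone sketch, with hypothetical `Action` and `Event` types standing in for LDK's pending-event queue entries:

```rust
// Hypothetical stand-ins for LDK's pending-event queue entries.
#[derive(PartialEq)]
enum Action {
    ReleasePaymentComplete(u64),
}
struct Event;

// Run `action` only if no queued event already carries an identical
// completion action; if one does, that event will complete the action when
// the user processes it, and running it twice would be redundant.
fn run_unless_pending(
    pending_events: &[(Event, Option<Action>)],
    action: Action,
    run: impl FnOnce(Action),
) {
    let already_pending =
        pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action));
    if !already_pending {
        run(action);
    }
}
```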
```diff
@@ -7397,19 +7413,38 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
         startup_replay: bool, next_channel_counterparty_node_id: Option<PublicKey>,
         next_channel_outpoint: OutPoint, next_channel_id: ChannelId, next_user_channel_id: Option<u128>,
     ) {
+        debug_assert_eq!(
+            startup_replay,
+            !self.background_events_processed_since_startup.load(Ordering::Acquire)
+        );
+        let htlc_id = SentHTLCId::from_source(&source);
         match source {
             HTLCSource::OutboundRoute { session_priv, payment_id, path, .. } => {
-                debug_assert!(self.background_events_processed_since_startup.load(Ordering::Acquire),
+                debug_assert!(!startup_replay,
                     "We don't support claim_htlc claims during startup - monitors may not be available yet");
                 if let Some(pubkey) = next_channel_counterparty_node_id {
                     debug_assert_eq!(pubkey, path.hops[0].pubkey);
                 }
-                let mut ev_completion_action =
+                let mut ev_completion_action = if from_onchain {
+                    if let Some(counterparty_node_id) = next_channel_counterparty_node_id {
+                        let release = PaymentCompleteUpdate {
+                            counterparty_node_id,
+                            channel_funding_outpoint: next_channel_outpoint,
+                            channel_id: next_channel_id,
+                            htlc_id,
+                        };
+                        Some(EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(release))
+                    } else {
+                        log_warn!(self.logger, "Missing counterparty node id in monitor when trying to re-claim a payment resolved on chain. This may lead to redundant PaymentSent events on restart");
+                        None
+                    }
+                } else {
                     Some(EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
                         channel_funding_outpoint: Some(next_channel_outpoint),
                         channel_id: next_channel_id,
                         counterparty_node_id: path.hops[0].pubkey,
-                    });
+                    })
+                };
                 self.pending_outbound_payments.claim_htlc(
                     payment_id,
                     payment_preimage,
```
```diff
@@ -9535,12 +9570,24 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
                 log_trace!(logger, "Failing HTLC with hash {} from our monitor", &htlc_update.payment_hash);
                 let receiver = HTLCDestination::NextHopChannel { node_id: counterparty_node_id, channel_id };
                 let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
+                let completion_update =
+                    if let Some(counterparty_node_id) = counterparty_node_id {
+                        Some(PaymentCompleteUpdate {
+                            counterparty_node_id,
+                            channel_funding_outpoint: funding_outpoint,
+                            channel_id,
+                            htlc_id: SentHTLCId::from_source(&htlc_update.source),
+                        })
+                    } else {
+                        log_warn!(self.logger, "Missing counterparty node id in monitor when trying to re-claim a payment resolved on chain. This may lead to redundant PaymentSent events on restart");
+                        None
+                    };
                 self.fail_htlc_backwards_internal(
                     &htlc_update.source,
                     &htlc_update.payment_hash,
                     &reason,
                     receiver,
-                    None,
+                    completion_update,
                 );
             }
         },
```
```diff
@@ -10894,8 +10941,63 @@ where
                 channel_id,
                 counterparty_node_id,
             } => {
+                let startup_complete =
+                    self.background_events_processed_since_startup.load(Ordering::Acquire);
+                debug_assert!(startup_complete);
                 self.handle_monitor_update_release(counterparty_node_id, channel_id, None);
             },
+            EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(
+                PaymentCompleteUpdate {
+                    counterparty_node_id,
+                    channel_funding_outpoint,
+                    channel_id,
+                    htlc_id,
+                },
+            ) => {
+                let per_peer_state = self.per_peer_state.read().unwrap();
+                let mut peer_state = per_peer_state
+                    .get(&counterparty_node_id)
+                    .map(|state| state.lock().unwrap())
+                    .expect("Channels originating a payment resolution must have peer state");
+                let update_id = peer_state
+                    .closed_channel_monitor_update_ids
+                    .get_mut(&channel_id)
+                    .expect("Channels originating a payment resolution must have a monitor");
+                *update_id += 1;
+
+                let update = ChannelMonitorUpdate {
+                    update_id: *update_id,
+                    channel_id: Some(channel_id),
+                    counterparty_node_id: Some(counterparty_node_id),
+                    updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+                        htlc: htlc_id,
+                    }],
+                };
+
+                let during_startup =
+                    !self.background_events_processed_since_startup.load(Ordering::Acquire);
+                if during_startup {
+                    let event = BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+                        counterparty_node_id,
+                        funding_txo: channel_funding_outpoint,
+                        channel_id,
+                        update,
+                    };
+                    self.pending_background_events.lock().unwrap().push(event);
+                } else {
+                    handle_new_monitor_update!(
+                        self,
+                        channel_funding_outpoint,
+                        update,
+                        peer_state,
+                        peer_state,
+                        per_peer_state,
+                        counterparty_node_id,
+                        channel_id,
+                        POST_CHANNEL_CLOSE
+                    );
+                }
+            },
         }
     }
 }
```
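The `closed_channel_monitor_update_ids` map used above exists because `ChannelMonitorUpdate`s must carry strictly increasing `update_id`s per channel, and for closed channels there is no live `Channel` to hand them out. A minimal sketch of that bookkeeping, with stand-in types:

```rust
use std::collections::HashMap;

type ChannelId = [u8; 32]; // stand-in for LDK's `ChannelId`

// Hypothetical slice of LDK's per-peer state: a monotonic monitor-update-id
// counter for each channel that has already been closed.
struct PeerState {
    closed_channel_monitor_update_ids: HashMap<ChannelId, u64>,
}

// Each post-close update (such as `ReleasePaymentComplete`) bumps the
// counter and uses the new value, keeping ids strictly increasing.
fn next_closed_channel_update_id(peer: &mut PeerState, channel_id: &ChannelId) -> u64 {
    let id = peer
        .closed_channel_monitor_update_ids
        .get_mut(channel_id)
        .expect("closed channels we update must still have a tracked monitor");
    *id += 1;
    *id
}
```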
```diff
@@ -13984,6 +14086,7 @@ where
         if counterparty_opt.is_none() {
             for (htlc_source, (htlc, preimage_opt)) in monitor.get_all_current_outbound_htlcs() {
                 let logger = WithChannelMonitor::from(&args.logger, monitor, Some(htlc.payment_hash));
+                let htlc_id = SentHTLCId::from_source(&htlc_source);
                 match htlc_source {
                     HTLCSource::PreviousHopData(prev_hop_data) => {
                         let pending_forward_matches_htlc = |info: &PendingAddHTLCInfo| {
```
```diff
@@ -14035,6 +14138,21 @@ where
                     HTLCSource::OutboundRoute { payment_id, session_priv, path, .. } => {
                         if let Some(preimage) = preimage_opt {
                             let pending_events = Mutex::new(pending_events_read);
+                            let mut compl_action =
+                                if let Some(counterparty_node_id) = monitor.get_counterparty_node_id() {
+                                    let update = PaymentCompleteUpdate {
+                                        counterparty_node_id,
+                                        channel_funding_outpoint: monitor.get_funding_txo().0,
+                                        channel_id: monitor.channel_id(),
+                                        htlc_id,
+                                    };
+                                    Some(
+                                        EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update)
+                                    )
+                                } else {
+                                    log_warn!(logger, "Missing counterparty node id in monitor when trying to re-claim a payment resolved on chain. This may lead to redundant PaymentSent events on restart");
+                                    None
+                                };
                             // Note that we set `from_onchain` to "false" here,
                             // deliberately keeping the pending payment around forever.
                             // Given it should only occur when we have a channel we're
@@ -14043,13 +14161,6 @@ where
                             // generating a `PaymentPathSuccessful` event but regenerating
                             // it and the `PaymentSent` on every restart until the
                             // `ChannelMonitor` is removed.
-                            let mut compl_action = Some(
-                                EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
-                                    channel_funding_outpoint: Some(monitor.get_funding_txo().0),
-                                    channel_id: monitor.channel_id(),
-                                    counterparty_node_id: path.hops[0].pubkey,
-                                },
-                            );
                             pending_outbounds.claim_htlc(
                                 payment_id,
                                 preimage,
@@ -14060,6 +14171,44 @@ where
                                 &pending_events,
                                 &&logger,
                             );
+                            // If the completion action was not consumed, then there was no
+                            // payment to claim, and we need to tell the `ChannelMonitor`
+                            // we don't need to hear about the HTLC again, at least as long
+                            // as the PaymentSent event isn't still sitting around in our
+                            // event queue.
+                            let have_action = if compl_action.is_some() {
+                                let pending_events = pending_events.lock().unwrap();
+                                pending_events.iter().any(|(_, act)| *act == compl_action)
+                            } else {
+                                false
+                            };
+                            if !have_action && compl_action.is_some() {
+                                if let Some(counterparty_node_id) = monitor.get_counterparty_node_id() {
+                                    let mut peer_state = per_peer_state
+                                        .get(&counterparty_node_id)
+                                        .map(|state| state.lock().unwrap())
+                                        .expect("Channels originating a preimage must have peer state");
+                                    let update_id = peer_state
+                                        .closed_channel_monitor_update_ids
+                                        .get_mut(&monitor.channel_id())
+                                        .expect("Channels originating a preimage must have a monitor");
+                                    *update_id += 1;
+
+                                    pending_background_events.push(BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+                                        counterparty_node_id: counterparty_node_id,
+                                        funding_txo: monitor.get_funding_txo().0,
+                                        channel_id: monitor.channel_id(),
+                                        update: ChannelMonitorUpdate {
+                                            update_id: *update_id,
+                                            channel_id: Some(monitor.channel_id()),
+                                            updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+                                                htlc: htlc_id,
+                                            }],
+                                            counterparty_node_id: Some(counterparty_node_id),
+                                        },
+                                    });
+                                }
+                            }
                             pending_events_read = pending_events.into_inner().unwrap();
                         }
                     },
```

lightning/src/ln/functional_test_utils.rs
Lines changed: 25 additions & 4 deletions

```diff
@@ -2332,10 +2332,14 @@ macro_rules! expect_payment_claimed {
     }
 }
 
-pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM=CM>>(node: &H,
-    expected_payment_preimage: PaymentPreimage, expected_fee_msat_opt: Option<Option<u64>>,
-    expect_per_path_claims: bool, expect_post_ev_mon_update: bool,
+pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM=CM>>(
+    node: &H, expected_payment_preimage: PaymentPreimage,
+    expected_fee_msat_opt: Option<Option<u64>>, expect_per_path_claims: bool,
+    expect_post_ev_mon_update: bool,
 ) {
+    if expect_post_ev_mon_update {
+        check_added_monitors(node, 0);
+    }
     let events = node.node().get_and_clear_pending_events();
     let expected_payment_hash = PaymentHash(
         bitcoin::hashes::sha256::Hash::hash(&expected_payment_preimage.0).to_byte_array());
@@ -2535,6 +2539,7 @@ pub struct PaymentFailedConditions<'a> {
     pub(crate) expected_blamed_scid: Option<u64>,
     pub(crate) expected_blamed_chan_closed: Option<bool>,
     pub(crate) expected_mpp_parts_remain: bool,
+    pub(crate) from_mon_update: bool,
 }
 
 impl<'a> PaymentFailedConditions<'a> {
@@ -2544,6 +2549,7 @@ impl<'a> PaymentFailedConditions<'a> {
             expected_blamed_scid: None,
             expected_blamed_chan_closed: None,
             expected_mpp_parts_remain: false,
+            from_mon_update: false,
         }
     }
     pub fn mpp_parts_remain(mut self) -> Self {
@@ -2562,6 +2568,10 @@ impl<'a> PaymentFailedConditions<'a> {
         self.expected_htlc_error_data = Some((code, data));
         self
     }
+    pub fn from_mon_update(mut self) -> Self {
+        self.from_mon_update = true;
+        self
+    }
 }
 
 #[cfg(test)]
@@ -2647,8 +2657,19 @@ pub fn expect_payment_failed_conditions<'a, 'b, 'c, 'd, 'e>(
     node: &'a Node<'b, 'c, 'd>, expected_payment_hash: PaymentHash, expected_payment_failed_permanently: bool,
     conditions: PaymentFailedConditions<'e>
 ) {
+    if conditions.from_mon_update {
+        check_added_monitors(node, 0);
+    }
     let events = node.node.get_and_clear_pending_events();
-    expect_payment_failed_conditions_event(events, expected_payment_hash, expected_payment_failed_permanently, conditions);
+    if conditions.from_mon_update {
+        check_added_monitors(node, 1);
+    }
+    expect_payment_failed_conditions_event(
+        events,
+        expected_payment_hash,
+        expected_payment_failed_permanently,
+        conditions,
+    );
 }
 
 pub fn send_along_route_with_secret<'a, 'b, 'c>(origin_node: &Node<'a, 'b, 'c>, route: Route, expected_paths: &[&[&Node<'a, 'b, 'c>]], recv_value: u64, our_payment_hash: PaymentHash, our_payment_secret: PaymentSecret) -> PaymentId {
```
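As a usage sketch, a test asserting that a failure delivered from the monitor also queues one of the new monitor updates might look like this (assuming the usual `nodes` fixture and a `payment_hash` from the surrounding test):

```rust
// from_mon_update() makes the helper check that no monitor update existed
// before the events were pulled and exactly one was queued once they were.
let conditions = PaymentFailedConditions::new().from_mon_update();
expect_payment_failed_conditions(&nodes[0], payment_hash, false, conditions);
```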
