
Commit 0626324

Generate new ReleasePaymentComplete monitor updates
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, then the `ChannelMonitor` gets persisted (e.g. due to a block update), and then the node crashes prior to persisting the `ChannelManager` again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one it could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` were replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it in cases where `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it. Because constantly re-adding a payment state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment), `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, or HTLCs which were resolved on chain with a timeout after `ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`, and work great under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will not end up seeing payment resolution events for such an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete. Note that we don't particularly care about non-payment HTLCs, as there is no re-hydration of state to do there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs` as there's nothing to do - we always attempt to replay the success/failure and figure out if it mattered based on whether there was still an HTLC to claim/fail.

Here we begin generating the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, updating functional tests for the new `ChannelMonitorUpdate`s where required.

Backport of 71a364c

Conflicts resolved in:
* lightning/src/ln/chanmon_update_fail_tests.rs
* lightning/src/ln/channelmanager.rs
* lightning/src/ln/functional_test_utils.rs
* lightning/src/ln/functional_tests.rs
* lightning/src/ln/monitor_tests.rs
* lightning/src/ln/outbound_payment.rs
* lightning/src/ln/payment_tests.rs

Note that unlike the original commit, on this branch we do not fail to deserialize a `ChannelMonitor` if its `counterparty_node_id` is `None` (implying it has not seen a `ChannelMonitorUpdate` since LDK 0.0.118). Instead, we skip the new logic in those cases and generate a warning log. As we have assumed it is now reasonable to require `counterparty_node_id`s in LDK 0.2, it seems reasonable to skip the new logic here as well (potentially generating some additional spurious payment events on restart).
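For orientation, the pieces this commit wires together have roughly the following shapes. This is an illustrative sketch only: the field and variant layouts are inferred from how the values are constructed in the hunks below, the stand-in types replace the real LDK ones (`PublicKey`, `OutPoint`, `ChannelId`, `SentHTLCId`), and the actual definitions of `PaymentCompleteUpdate` and `ChannelMonitorUpdateStep::ReleasePaymentComplete` live elsewhere in this patch series, so treat this as an approximation rather than the authoritative API.

// Illustrative sketch only - the stand-ins below replace the real LDK types so the
// shapes are readable in isolation; see the diff hunks below for the actual code.
struct PublicKey;   // stand-in for bitcoin::secp256k1::PublicKey
struct OutPoint;    // stand-in for the channel funding outpoint type
struct ChannelId;   // stand-in for LDK's ChannelId
struct SentHTLCId;  // stand-in for LDK's internal outbound-HTLC identifier

/// Identifies, for a (typically closed) channel, the HTLC whose payment resolution
/// has been surfaced to the user and can therefore be forgotten by the
/// `ChannelMonitor`.
struct PaymentCompleteUpdate {
    counterparty_node_id: PublicKey,
    channel_funding_outpoint: OutPoint,
    channel_id: ChannelId,
    htlc_id: SentHTLCId,
}

/// Post-event action: once the `PaymentSent`/`PaymentFailed` event has been handed
/// to the user, generate a `ChannelMonitorUpdate` marking the payment fully
/// resolved. Serialized with an odd TLV type, so it is dropped (rather than
/// breaking deserialization) on downgrade to LDK prior to 0.2.
enum EventCompletionAction {
    ReleaseRAAChannelMonitorUpdate {}, // pre-existing variant, fields elided here
    ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
}

/// The new monitor update step that the action ultimately applies to the monitor
/// of the closed channel.
enum ChannelMonitorUpdateStep {
    // ...pre-existing variants elided...
    ReleasePaymentComplete { htlc: SentHTLCId },
}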
1 parent fafa0dd · commit 0626324

9 files changed (+315 -57 lines)

lightning/src/chain/channelmonitor.rs

Lines changed: 1 addition & 0 deletions
@@ -3356,6 +3356,7 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
         if updates.update_id == LEGACY_CLOSED_CHANNEL_UPDATE_ID || self.lockdown_from_offchain {
             assert_eq!(updates.updates.len(), 1);
             match updates.updates[0] {
+                ChannelMonitorUpdateStep::ReleasePaymentComplete { .. } => {},
                 ChannelMonitorUpdateStep::ChannelForceClosed { .. } => {},
                 // We should have already seen a `ChannelForceClosed` update if we're trying to
                 // provide a preimage at this point.

lightning/src/ln/chanmon_update_fail_tests.rs

Lines changed: 1 addition & 0 deletions
@@ -3314,6 +3314,7 @@ fn do_test_durable_preimages_on_closed_channel(close_chans_before_reload: bool,
 
     nodes[0].node.force_close_broadcasting_latest_txn(&chan_id_ab, &nodes[1].node.get_our_node_id(), error_message.to_string()).unwrap();
     check_closed_event(&nodes[0], 1, ClosureReason::HolderForceClosed { broadcasted_latest_txn: Some(true) }, false, &[nodes[1].node.get_our_node_id()], 100000);
+    check_added_monitors(&nodes[0], 1);
     let as_closing_tx = nodes[0].tx_broadcaster.txn_broadcasted.lock().unwrap().split_off(0);
     assert_eq!(as_closing_tx.len(), 1);
 
lightning/src/ln/channelmanager.rs

Lines changed: 165 additions & 16 deletions
@@ -1206,6 +1206,12 @@ pub(crate) enum EventCompletionAction {
         channel_funding_outpoint: Option<OutPoint>,
         channel_id: ChannelId,
     },
+
+    /// When a payment's resolution is communicated to the downstream logic via
+    /// [`Event::PaymentSent`] or [`Event::PaymentFailed`] we may want to mark the payment as
+    /// fully-resolved in the [`ChannelMonitor`], which we do via this action.
+    /// Note that this action will be dropped on downgrade to LDK prior to 0.2!
+    ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
 }
 impl_writeable_tlv_based_enum!(EventCompletionAction,
     (0, ReleaseRAAChannelMonitorUpdate) => {
@@ -1218,6 +1224,7 @@ impl_writeable_tlv_based_enum!(EventCompletionAction,
             ChannelId::v1_from_funding_outpoint(channel_funding_outpoint.unwrap())
         })),
     }
+    {1, ReleasePaymentCompleteChannelMonitorUpdate} => (),
 );
 
 /// The source argument which is passed to [`ChannelManager::claim_mpp_part`].
@@ -6907,11 +6914,20 @@ where
                 if let Some(update) = from_monitor_update_completion {
                     // If `fail_htlc` didn't `take` the post-event action, we should go ahead and
                     // complete it here as the failure was duplicative - we've already handled it.
-                    // This should mostly only happen on startup, but it is possible to hit it in
-                    // rare cases where a MonitorUpdate is replayed after restart because a
-                    // ChannelMonitor wasn't persisted after it was applied (even though the
-                    // ChannelManager was).
-                    // TODO
+                    // This can happen in rare cases where a MonitorUpdate is replayed after
+                    // restart because a ChannelMonitor wasn't persisted after it was applied (even
+                    // though the ChannelManager was).
+                    // For such cases, we also check that there's no existing pending event to
+                    // complete this action already, which we let finish instead.
+                    let action =
+                        EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update);
+                    let have_action = {
+                        let pending_events = self.pending_events.lock().unwrap();
+                        pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action))
+                    };
+                    if !have_action {
+                        self.handle_post_event_actions([action]);
+                    }
                 }
             },
             HTLCSource::PreviousHopData(HTLCPreviousHopData {
@@ -7399,19 +7415,38 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
         startup_replay: bool, next_channel_counterparty_node_id: Option<PublicKey>,
         next_channel_outpoint: OutPoint, next_channel_id: ChannelId, next_user_channel_id: Option<u128>,
     ) {
+        debug_assert_eq!(
+            startup_replay,
+            !self.background_events_processed_since_startup.load(Ordering::Acquire)
+        );
+        let htlc_id = SentHTLCId::from_source(&source);
         match source {
             HTLCSource::OutboundRoute { session_priv, payment_id, path, .. } => {
-                debug_assert!(self.background_events_processed_since_startup.load(Ordering::Acquire),
+                debug_assert!(!startup_replay,
                     "We don't support claim_htlc claims during startup - monitors may not be available yet");
                 if let Some(pubkey) = next_channel_counterparty_node_id {
                     debug_assert_eq!(pubkey, path.hops[0].pubkey);
                 }
-                let mut ev_completion_action =
+                let mut ev_completion_action = if from_onchain {
+                    if let Some(counterparty_node_id) = next_channel_counterparty_node_id {
+                        let release = PaymentCompleteUpdate {
+                            counterparty_node_id,
+                            channel_funding_outpoint: next_channel_outpoint,
+                            channel_id: next_channel_id,
+                            htlc_id,
+                        };
+                        Some(EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(release))
+                    } else {
+                        log_warn!(self.logger, "Missing counterparty node id in monitor when trying to re-claim a payment resolved on chain. This may lead to redundant payment claims on restart");
+                        None
+                    }
+                } else {
                     Some(EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
                         channel_funding_outpoint: Some(next_channel_outpoint),
                         channel_id: next_channel_id,
                         counterparty_node_id: path.hops[0].pubkey,
-                    });
+                    })
+                };
                 self.pending_outbound_payments.claim_htlc(
                     payment_id,
                     payment_preimage,
@@ -9537,12 +9572,24 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
                     log_trace!(logger, "Failing HTLC with hash {} from our monitor", &htlc_update.payment_hash);
                     let receiver = HTLCDestination::NextHopChannel { node_id: counterparty_node_id, channel_id };
                     let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
+                    let completion_update =
+                        if let Some(counterparty_node_id) = counterparty_node_id {
+                            Some(PaymentCompleteUpdate {
+                                counterparty_node_id,
+                                channel_funding_outpoint: funding_outpoint,
+                                channel_id,
+                                htlc_id: SentHTLCId::from_source(&htlc_update.source),
+                            })
+                        } else {
+                            log_warn!(self.logger, "Missing counterparty node id in monitor when trying to re-claim a payment resolved on chain. This may lead to redundant payment claims on restart");
+                            None
+                        };
                     self.fail_htlc_backwards_internal(
                         &htlc_update.source,
                         &htlc_update.payment_hash,
                         &reason,
                         receiver,
-                        None,
+                        completion_update,
                     );
                 }
             },
@@ -10896,8 +10943,63 @@ where
                     channel_id,
                     counterparty_node_id,
                 } => {
+                    let startup_complete =
+                        self.background_events_processed_since_startup.load(Ordering::Acquire);
+                    debug_assert!(startup_complete);
                     self.handle_monitor_update_release(counterparty_node_id, channel_id, None);
                 },
+                EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(
+                    PaymentCompleteUpdate {
+                        counterparty_node_id,
+                        channel_funding_outpoint,
+                        channel_id,
+                        htlc_id,
+                    },
+                ) => {
+                    let per_peer_state = self.per_peer_state.read().unwrap();
+                    let mut peer_state = per_peer_state
+                        .get(&counterparty_node_id)
+                        .map(|state| state.lock().unwrap())
+                        .expect("Channels originating a payment resolution must have peer state");
+                    let update_id = peer_state
+                        .closed_channel_monitor_update_ids
+                        .get_mut(&channel_id)
+                        .expect("Channels originating a payment resolution must have a monitor");
+                    *update_id += 1;
+
+                    let update = ChannelMonitorUpdate {
+                        update_id: *update_id,
+                        channel_id: Some(channel_id),
+                        counterparty_node_id: Some(counterparty_node_id),
+                        updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+                            htlc: htlc_id,
+                        }],
+                    };
+
+                    let during_startup =
+                        !self.background_events_processed_since_startup.load(Ordering::Acquire);
+                    if during_startup {
+                        let event = BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+                            counterparty_node_id,
+                            funding_txo: channel_funding_outpoint,
+                            channel_id,
+                            update,
+                        };
+                        self.pending_background_events.lock().unwrap().push(event);
+                    } else {
+                        handle_new_monitor_update!(
+                            self,
+                            channel_funding_outpoint,
+                            update,
+                            peer_state,
+                            peer_state,
+                            per_peer_state,
+                            counterparty_node_id,
+                            channel_id,
+                            POST_CHANNEL_CLOSE
+                        );
+                    }
+                },
             }
         }
     }
@@ -13986,6 +14088,7 @@ where
             if counterparty_opt.is_none() {
                 for (htlc_source, (htlc, preimage_opt)) in monitor.get_all_current_outbound_htlcs() {
                     let logger = WithChannelMonitor::from(&args.logger, monitor, Some(htlc.payment_hash));
+                    let htlc_id = SentHTLCId::from_source(&htlc_source);
                     match htlc_source {
                         HTLCSource::PreviousHopData(prev_hop_data) => {
                             let pending_forward_matches_htlc = |info: &PendingAddHTLCInfo| {
@@ -14037,6 +14140,21 @@ where
                         HTLCSource::OutboundRoute { payment_id, session_priv, path, .. } => {
                             if let Some(preimage) = preimage_opt {
                                 let pending_events = Mutex::new(pending_events_read);
+                                let mut compl_action =
+                                    if let Some(counterparty_node_id) = monitor.get_counterparty_node_id() {
+                                        let update = PaymentCompleteUpdate {
+                                            counterparty_node_id,
+                                            channel_funding_outpoint: monitor.get_funding_txo().0,
+                                            channel_id: monitor.channel_id(),
+                                            htlc_id,
+                                        };
+                                        Some(
+                                            EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update)
+                                        )
+                                    } else {
+                                        log_warn!(logger, "Missing counterparty node id in monitor when trying to re-claim a payment resolved on chain. This may lead to redundant payment claims on restart");
+                                        None
+                                    };
                                 // Note that we set `from_onchain` to "false" here,
                                 // deliberately keeping the pending payment around forever.
                                 // Given it should only occur when we have a channel we're
@@ -14045,13 +14163,6 @@
                                 // generating a `PaymentPathSuccessful` event but regenerating
                                 // it and the `PaymentSent` on every restart until the
                                 // `ChannelMonitor` is removed.
-                                let mut compl_action = Some(
-                                    EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
-                                        channel_funding_outpoint: Some(monitor.get_funding_txo().0),
-                                        channel_id: monitor.channel_id(),
-                                        counterparty_node_id: path.hops[0].pubkey,
-                                    },
-                                );
                                 pending_outbounds.claim_htlc(
                                     payment_id,
                                     preimage,
@@ -14062,6 +14173,44 @@
                                     &pending_events,
                                     &&logger,
                                 );
+                                // If the completion action was not consumed, then there was no
+                                // payment to claim, and we need to tell the `ChannelMonitor`
+                                // we don't need to hear about the HTLC again, at least as long
+                                // as the PaymentSent event isn't still sitting around in our
+                                // event queue.
+                                let have_action = if compl_action.is_some() {
+                                    let pending_events = pending_events.lock().unwrap();
+                                    pending_events.iter().any(|(_, act)| *act == compl_action)
+                                } else {
+                                    false
+                                };
+                                if !have_action && compl_action.is_some() {
+                                    if let Some(counterparty_node_id) = monitor.get_counterparty_node_id() {
+                                        let mut peer_state = per_peer_state
+                                            .get(&counterparty_node_id)
+                                            .map(|state| state.lock().unwrap())
+                                            .expect("Channels originating a preimage must have peer state");
+                                        let update_id = peer_state
+                                            .closed_channel_monitor_update_ids
+                                            .get_mut(&monitor.channel_id())
+                                            .expect("Channels originating a preimage must have a monitor");
+                                        *update_id += 1;
+
+                                        pending_background_events.push(BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+                                            counterparty_node_id: counterparty_node_id,
+                                            funding_txo: monitor.get_funding_txo().0,
+                                            channel_id: monitor.channel_id(),
+                                            update: ChannelMonitorUpdate {
+                                                update_id: *update_id,
+                                                channel_id: Some(monitor.channel_id()),
+                                                updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+                                                    htlc: htlc_id,
+                                                }],
+                                                counterparty_node_id: Some(counterparty_node_id),
+                                            },
+                                        });
+                                    }
+                                }
                                 pending_events_read = pending_events.into_inner().unwrap();
                             }
                         },

lightning/src/ln/functional_test_utils.rs

Lines changed: 25 additions & 4 deletions
@@ -2332,10 +2332,14 @@ macro_rules! expect_payment_claimed {
     }
 }
 
-pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM=CM>>(node: &H,
-    expected_payment_preimage: PaymentPreimage, expected_fee_msat_opt: Option<Option<u64>>,
-    expect_per_path_claims: bool, expect_post_ev_mon_update: bool,
+pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM=CM>>(
+    node: &H, expected_payment_preimage: PaymentPreimage,
+    expected_fee_msat_opt: Option<Option<u64>>, expect_per_path_claims: bool,
+    expect_post_ev_mon_update: bool,
 ) {
+    if expect_post_ev_mon_update {
+        check_added_monitors(node, 0);
+    }
     let events = node.node().get_and_clear_pending_events();
     let expected_payment_hash = PaymentHash(
         bitcoin::hashes::sha256::Hash::hash(&expected_payment_preimage.0).to_byte_array());
@@ -2535,6 +2539,7 @@ pub struct PaymentFailedConditions<'a> {
     pub(crate) expected_blamed_scid: Option<u64>,
     pub(crate) expected_blamed_chan_closed: Option<bool>,
     pub(crate) expected_mpp_parts_remain: bool,
+    pub(crate) from_mon_update: bool,
 }
 
 impl<'a> PaymentFailedConditions<'a> {
@@ -2544,6 +2549,7 @@ impl<'a> PaymentFailedConditions<'a> {
             expected_blamed_scid: None,
             expected_blamed_chan_closed: None,
             expected_mpp_parts_remain: false,
+            from_mon_update: false,
         }
     }
     pub fn mpp_parts_remain(mut self) -> Self {
@@ -2562,6 +2568,10 @@
         self.expected_htlc_error_data = Some((code, data));
         self
     }
+    pub fn from_mon_update(mut self) -> Self {
+        self.from_mon_update = true;
+        self
+    }
 }
 
 #[cfg(test)]
@@ -2647,8 +2657,19 @@
     node: &'a Node<'b, 'c, 'd>, expected_payment_hash: PaymentHash, expected_payment_failed_permanently: bool,
     conditions: PaymentFailedConditions<'e>
 ) {
+    if conditions.from_mon_update {
+        check_added_monitors(node, 0);
+    }
     let events = node.node.get_and_clear_pending_events();
-    expect_payment_failed_conditions_event(events, expected_payment_hash, expected_payment_failed_permanently, conditions);
+    if conditions.from_mon_update {
+        check_added_monitors(node, 1);
+    }
+    expect_payment_failed_conditions_event(
+        events,
+        expected_payment_hash,
+        expected_payment_failed_permanently,
+        conditions,
+    );
 }
 
 pub fn send_along_route_with_secret<'a, 'b, 'c>(origin_node: &Node<'a, 'b, 'c>, route: Route, expected_paths: &[&[&Node<'a, 'b, 'c>]], recv_value: u64, our_payment_hash: PaymentHash, our_payment_secret: PaymentSecret) -> PaymentId {
