
Commit 77430bf

Stop re-hydrating pending payments once they are fully resolved
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due to a block update), and the node crashes before the `ChannelManager` is persisted again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one it could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post-closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` are replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it in cases where `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it.

Because constantly re-adding a payment's state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment), `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, nor for HTLCs which were resolved on chain with a timeout after `ANTI_REORG_DELAY` confirmations. These criteria match the criteria for generating a `MonitorEvent`, and work great under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will not end up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete.

Note that we don't particularly care about non-payment HTLCs, as there is no re-hydration of state to do there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs` as there's nothing to do - we always attempt to replay the success/failure and figure out whether it mattered based on whether there was still an HTLC to claim/fail.

Here we, finally, begin actually using the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, skipping re-hydration of pending payments once they have been fully resolved through to a user `Event`.

Backport of 226520b

Fixed conflicts in:
 * lightning/src/chain/channelmonitor.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/functional_tests.rs
   to address differences causing upstream to (somewhat spuriously) generate
   `ChannelMonitorUpdate`s for channels force-closed by commitment transaction
   confirmation whereas this branch does not,
 * lightning/src/ln/payment_tests.rs
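To make the mechanism concrete, here is a minimal, self-contained sketch of the idea this commit relies on. These are not LDK's real types or APIs; `ToyMonitor`, `SentHtlcId`, and `OutboundHtlc` are simplified stand-ins used only for illustration: the monitor records which HTLCs have already been resolved through to a user `Event`, and re-hydration on reload skips exactly those.

// Hedged sketch, assuming simplified stand-in types rather than LDK's internals.
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct SentHtlcId(u64);

#[derive(Clone, Debug)]
struct OutboundHtlc {
	amount_msat: u64,
	preimage: Option<[u8; 32]>,
}

#[derive(Default)]
struct ToyMonitor {
	// All outbound HTLCs this (closed) channel's monitor knows about.
	outbound_htlcs: HashMap<SentHtlcId, OutboundHtlc>,
	// HTLCs whose resolution has already been surfaced to the user as an `Event`,
	// playing the role of `ChannelMonitorUpdateStep::ReleasePaymentComplete` here.
	htlcs_resolved_to_user: HashSet<SentHtlcId>,
}

impl ToyMonitor {
	// Applying a "payment complete" update marks the HTLC as fully handled.
	fn release_payment_complete(&mut self, id: SentHtlcId) {
		self.htlcs_resolved_to_user.insert(id);
	}

	// On manager reload, only HTLCs not yet resolved to the user are handed back,
	// so stale `PaymentSent`/`PaymentFailed` events are not regenerated.
	fn htlcs_to_rehydrate(&self) -> Vec<(SentHtlcId, OutboundHtlc)> {
		self.outbound_htlcs
			.iter()
			.filter(|(id, _)| !self.htlcs_resolved_to_user.contains(id))
			.map(|(id, htlc)| (*id, htlc.clone()))
			.collect()
	}
}

fn main() {
	let mut monitor = ToyMonitor::default();
	monitor.outbound_htlcs.insert(SentHtlcId(1), OutboundHtlc { amount_msat: 1_000, preimage: None });
	monitor.outbound_htlcs.insert(SentHtlcId(2), OutboundHtlc { amount_msat: 2_000, preimage: None });

	// HTLC 1's resolution was handed to the user, so the monitor is told it is done.
	monitor.release_payment_complete(SentHtlcId(1));

	// Only HTLC 2 is re-hydrated on reload.
	assert_eq!(monitor.htlcs_to_rehydrate().len(), 1);
}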
1 parent e64ed33 commit 77430bf

File tree

4 files changed (+58, -126 lines)


lightning/src/chain/channelmonitor.rs

Lines changed: 10 additions & 93 deletions
@@ -2577,10 +2577,6 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitor<Signer> {
 	/// Gets the set of outbound HTLCs which can be (or have been) resolved by this
 	/// `ChannelMonitor`. This is used to determine if an HTLC was removed from the channel prior
 	/// to the `ChannelManager` having been persisted.
-	///
-	/// This is similar to [`Self::get_pending_or_resolved_outbound_htlcs`] except it includes
-	/// HTLCs which were resolved on-chain (i.e. where the final HTLC resolution was done by an
-	/// event from this `ChannelMonitor`).
 	pub(crate) fn get_all_current_outbound_htlcs(
 		&self,
 	) -> HashMap<HTLCSource, (HTLCOutputInCommitment, Option<PaymentPreimage>)> {
@@ -2593,8 +2589,11 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitor<Signer> {
 		for &(ref htlc, ref source_option) in latest_outpoints.iter() {
 			if let &Some(ref source) = source_option {
 				let htlc_id = SentHTLCId::from_source(source);
-				let preimage_opt = us.counterparty_fulfilled_htlcs.get(&htlc_id).cloned();
-				res.insert((**source).clone(), (htlc.clone(), preimage_opt));
+				if !us.htlcs_resolved_to_user.contains(&htlc_id) {
+					let preimage_opt =
+						us.counterparty_fulfilled_htlcs.get(&htlc_id).cloned();
+					res.insert((**source).clone(), (htlc.clone(), preimage_opt));
+				}
 			}
 		}
 	}
@@ -2650,6 +2649,11 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitor<Signer> {
 			} else {
 				continue;
 			};
+			let htlc_id = SentHTLCId::from_source(source);
+			if us.htlcs_resolved_to_user.contains(&htlc_id) {
+				continue;
+			}
+
 			let confirmed = $htlc_iter.find(|(_, conf_src)| Some(source) == *conf_src);
 			if let Some((confirmed_htlc, _)) = confirmed {
 				let filter = |v: &&IrrevocablyResolvedHTLC| {
@@ -2724,93 +2728,6 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitor<Signer> {
 		res
 	}
 
-	/// Gets the set of outbound HTLCs which are pending resolution in this channel or which were
-	/// resolved with a preimage from our counterparty.
-	///
-	/// This is used to reconstruct pending outbound payments on restart in the ChannelManager.
-	///
-	/// Currently, the preimage is unused, however if it is present in the relevant internal state
-	/// an HTLC is always included even if it has been resolved.
-	pub(crate) fn get_pending_or_resolved_outbound_htlcs(&self) -> HashMap<HTLCSource, (HTLCOutputInCommitment, Option<PaymentPreimage>)> {
-		let us = self.inner.lock().unwrap();
-		// We're only concerned with the confirmation count of HTLC transactions, and don't
-		// actually care how many confirmations a commitment transaction may or may not have. Thus,
-		// we look for either a FundingSpendConfirmation event or a funding_spend_confirmed.
-		let confirmed_txid = us.funding_spend_confirmed.or_else(|| {
-			us.onchain_events_awaiting_threshold_conf.iter().find_map(|event| {
-				if let OnchainEvent::FundingSpendConfirmation { .. } = event.event {
-					Some(event.txid)
-				} else { None }
-			})
-		});
-
-		if confirmed_txid.is_none() {
-			// If we have not seen a commitment transaction on-chain (ie the channel is not yet
-			// closed), just get the full set.
-			mem::drop(us);
-			return self.get_all_current_outbound_htlcs();
-		}
-
-		let mut res = new_hash_map();
-		macro_rules! walk_htlcs {
-			($holder_commitment: expr, $htlc_iter: expr) => {
-				for (htlc, source) in $htlc_iter {
-					if us.htlcs_resolved_on_chain.iter().any(|v| v.commitment_tx_output_idx == htlc.transaction_output_index) {
-						// We should assert that funding_spend_confirmed is_some() here, but we
-						// have some unit tests which violate HTLC transaction CSVs entirely and
-						// would fail.
-						// TODO: Once tests all connect transactions at consensus-valid times, we
-						// should assert here like we do in `get_claimable_balances`.
-					} else if htlc.offered == $holder_commitment {
-						// If the payment was outbound, check if there's an HTLCUpdate
-						// indicating we have spent this HTLC with a timeout, claiming it back
-						// and awaiting confirmations on it.
-						let htlc_update_confd = us.onchain_events_awaiting_threshold_conf.iter().any(|event| {
-							if let OnchainEvent::HTLCUpdate { commitment_tx_output_idx: Some(commitment_tx_output_idx), .. } = event.event {
-								// If the HTLC was timed out, we wait for ANTI_REORG_DELAY blocks
-								// before considering it "no longer pending" - this matches when we
-								// provide the ChannelManager an HTLC failure event.
-								Some(commitment_tx_output_idx) == htlc.transaction_output_index &&
-									us.best_block.height >= event.height + ANTI_REORG_DELAY - 1
-							} else if let OnchainEvent::HTLCSpendConfirmation { commitment_tx_output_idx, .. } = event.event {
-								// If the HTLC was fulfilled with a preimage, we consider the HTLC
-								// immediately non-pending, matching when we provide ChannelManager
-								// the preimage.
-								Some(commitment_tx_output_idx) == htlc.transaction_output_index
-							} else { false }
-						});
-						let counterparty_resolved_preimage_opt =
-							us.counterparty_fulfilled_htlcs.get(&SentHTLCId::from_source(source)).cloned();
-						if !htlc_update_confd || counterparty_resolved_preimage_opt.is_some() {
-							res.insert(source.clone(), (htlc.clone(), counterparty_resolved_preimage_opt));
-						}
-					}
-				}
-			}
-		}
-
-		let txid = confirmed_txid.unwrap();
-		if Some(txid) == us.current_counterparty_commitment_txid || Some(txid) == us.prev_counterparty_commitment_txid {
-			walk_htlcs!(false, us.counterparty_claimable_outpoints.get(&txid).unwrap().iter().filter_map(|(a, b)| {
-				if let &Some(ref source) = b {
-					Some((a, &**source))
-				} else { None }
-			}));
-		} else if txid == us.current_holder_commitment_tx.txid {
-			walk_htlcs!(true, us.current_holder_commitment_tx.htlc_outputs.iter().filter_map(|(a, _, c)| {
-				if let Some(source) = c { Some((a, source)) } else { None }
-			}));
-		} else if let Some(prev_commitment) = &us.prev_holder_signed_commitment_tx {
-			if txid == prev_commitment.txid {
-				walk_htlcs!(true, prev_commitment.htlc_outputs.iter().filter_map(|(a, _, c)| {
-					if let Some(source) = c { Some((a, source)) } else { None }
-				}));
-			}
-		}
-
-		res
-	}
-
 	pub(crate) fn get_stored_preimages(&self) -> HashMap<PaymentHash, (PaymentPreimage, Vec<PaymentClaimDetails>)> {
 		self.inner.lock().unwrap().payment_preimages.clone()
 	}

lightning/src/ln/channelmanager.rs

Lines changed: 1 addition & 1 deletion
@@ -14064,7 +14064,7 @@ where
 			let counterparty_opt = outpoint_to_peer.get(&monitor.get_funding_txo().0);
 			// outpoint_to_peer missing the funding outpoint implies the channel is closed
 			if counterparty_opt.is_none() {
-				for (htlc_source, (htlc, _)) in monitor.get_pending_or_resolved_outbound_htlcs() {
+				for (htlc_source, (htlc, _)) in monitor.get_all_current_outbound_htlcs() {
					let logger = WithChannelMonitor::from(&args.logger, monitor, Some(htlc.payment_hash));
					if let HTLCSource::OutboundRoute { payment_id, session_priv, path, .. } = htlc_source {
						if path.hops.is_empty() {
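For context, a hedged sketch of what this call-site change means on reload, using hypothetical, simplified types rather than the real `ChannelManager` deserialization code: the loop now only ever sees HTLCs the monitor has not already resolved through to a user `Event`, and entries without an outbound-route source (forwarded HTLCs) are skipped as before.

// Hypothetical stand-ins for illustration only; not LDK's real types.
enum HtlcSource {
	OutboundRoute { payment_id: u64, hops: usize },
	PreviousHopChannel,
}

// Re-adds pending payment state for a closed channel on reload. The input is assumed to
// already exclude HTLCs resolved through to a user `Event`, which is what
// `get_all_current_outbound_htlcs` provides after this commit.
fn rehydrate_outbound_payments(htlcs: Vec<HtlcSource>) -> Vec<u64> {
	let mut payment_ids = Vec::new();
	for source in htlcs {
		match source {
			// Forwarded HTLCs carry no local payment state to re-add; skip them.
			HtlcSource::PreviousHopChannel => continue,
			HtlcSource::OutboundRoute { payment_id, hops } => {
				// Mirrors the `path.hops.is_empty()` sanity check in the real call site.
				if hops == 0 {
					continue;
				}
				payment_ids.push(payment_id);
			}
		}
	}
	payment_ids
}

fn main() {
	let htlcs = vec![
		HtlcSource::OutboundRoute { payment_id: 42, hops: 3 },
		HtlcSource::PreviousHopChannel,
	];
	// Only the outbound-route HTLC results in a re-added pending payment.
	assert_eq!(rehydrate_outbound_payments(htlcs), vec![42]);
}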

lightning/src/ln/functional_tests.rs

Lines changed: 3 additions & 1 deletion
@@ -3711,6 +3711,8 @@ fn do_test_commitment_revoked_fail_backward_exhaustive(deliver_bs_raa: bool, use
 	let events = nodes[1].node.get_and_clear_pending_events();
 	if deliver_bs_raa {
 		check_added_monitors(&nodes[1], 1);
+	} else {
+		check_added_monitors(&nodes[1], 0);
 	}
 	assert_eq!(events.len(), if deliver_bs_raa { 3 + nodes.len() - 1 } else { 4 + nodes.len() });
 	assert!(events.iter().any(|ev| matches!(
@@ -3727,7 +3729,7 @@
 	)));
 
 	nodes[1].node.process_pending_htlc_forwards();
-	check_added_monitors!(nodes[1], 1);
+	check_added_monitors(&nodes[1], 1);
 
 	let mut events = nodes[1].node.get_and_clear_pending_msg_events();
 	assert_eq!(events.len(), if deliver_bs_raa { 4 } else { 3 });

lightning/src/ln/payment_tests.rs

Lines changed: 44 additions & 31 deletions
@@ -973,9 +973,6 @@ fn do_test_completed_payment_not_retryable_on_reload(use_dust: bool) {
 	nodes[1].node.peer_disconnected(nodes[0].node.get_our_node_id());
 
 	nodes[0].node.test_process_background_events();
-	check_added_monitors(&nodes[0], 1); // TODO: Removed in the next commit as this only required
-	                                    // when we are still seeing all payments, even resolved
-	                                    // ones.
 
 	let mut reconnect_args = ReconnectArgs::new(&nodes[0], &nodes[1]);
 	reconnect_args.send_channel_ready = (true, true);
@@ -1005,9 +1002,6 @@ fn do_test_completed_payment_not_retryable_on_reload(use_dust: bool) {
 	nodes[1].node.peer_disconnected(nodes[0].node.get_our_node_id());
 
 	nodes[0].node.test_process_background_events();
-	check_added_monitors(&nodes[0], 1); // TODO: Removed in the next commit as this only required
-	                                    // when we are still seeing all payments, even resolved
-	                                    // ones.
 
 	reconnect_nodes(ReconnectArgs::new(&nodes[0], &nodes[1]));
 
@@ -1024,7 +1018,10 @@ fn test_completed_payment_not_retryable_on_reload() {
 	do_test_completed_payment_not_retryable_on_reload(false);
 }
 
-fn do_test_dup_htlc_onchain_doesnt_fail_on_reload(persist_manager_post_event: bool, confirm_commitment_tx: bool, payment_timeout: bool) {
+fn do_test_dup_htlc_onchain_doesnt_fail_on_reload(
+	persist_manager_post_event: bool, persist_monitor_after_events: bool,
+	confirm_commitment_tx: bool, payment_timeout: bool,
+) {
 	// When a Channel is closed, any outbound HTLCs which were relayed through it are simply
 	// dropped. From there, the ChannelManager relies on the ChannelMonitor having a copy of the
 	// relevant fail-/claim-back data and processes the HTLC fail/claim when the ChannelMonitor tells
@@ -1115,36 +1112,58 @@ fn do_test_dup_htlc_onchain_doesnt_fail_on_reload(persist_manager_post_event: bo
 		chan_manager_serialized = nodes[0].node.encode();
 	}
 
-	let chan_0_monitor_serialized = get_monitor!(nodes[0], chan_id).encode();
+	let mut mon_ser = Vec::new();
+	if !persist_monitor_after_events {
+		mon_ser = get_monitor!(nodes[0], chan_id).encode();
+	}
 	if payment_timeout {
 		let conditions = PaymentFailedConditions::new().from_mon_update();
 		expect_payment_failed_conditions(&nodes[0], payment_hash, false, conditions);
 	} else {
 		expect_payment_sent(&nodes[0], payment_preimage, None, true, true);
 	}
+	// Note that if we persist the monitor before processing the events, above, we'll always get
+	// them replayed on restart no matter what
+	if persist_monitor_after_events {
+		mon_ser = get_monitor!(nodes[0], chan_id).encode();
+	}
 
 	// If we persist the ChannelManager after we get the PaymentSent event, we shouldn't get it
 	// twice.
 	if persist_manager_post_event {
 		chan_manager_serialized = nodes[0].node.encode();
+	} else if persist_monitor_after_events {
+		// Persisting the monitor after the events (resulting in a new monitor being persisted) but
+		// didn't persist the manager will result in an FC, which we don't test here.
+		panic!();
 	}
 
 	// Now reload nodes[0]...
-	reload_node!(nodes[0], &chan_manager_serialized, &[&chan_0_monitor_serialized], persister, new_chain_monitor, nodes_0_deserialized);
+	reload_node!(nodes[0], &chan_manager_serialized, &[&mon_ser], persister, new_chain_monitor, nodes_0_deserialized);
 
 	check_added_monitors(&nodes[0], 0);
-	if persist_manager_post_event {
+	if persist_manager_post_event && persist_monitor_after_events {
 		assert!(nodes[0].node.get_and_clear_pending_events().is_empty());
-		check_added_monitors(&nodes[0], 2);
+		check_added_monitors(&nodes[0], 0);
 	} else if payment_timeout {
-		let conditions = PaymentFailedConditions::new().from_mon_update();
+		let mut conditions = PaymentFailedConditions::new();
+		if !persist_monitor_after_events {
+			conditions = conditions.from_mon_update();
+		}
 		expect_payment_failed_conditions(&nodes[0], payment_hash, false, conditions);
+		check_added_monitors(&nodes[0], 0);
 	} else {
+		if persist_manager_post_event {
+			assert!(nodes[0].node.get_and_clear_pending_events().is_empty());
+		} else {
+			expect_payment_sent(&nodes[0], payment_preimage, None, true, false);
+		}
 		// After reload, the ChannelManager identified the failed payment and queued up the
-		// PaymentSent and corresponding ChannelMonitorUpdate to mark the payment handled, but
-		// while processing the pending `MonitorEvent`s (which were not processed before the
-		// monitor was persisted) we will end up with a duplicate ChannelMonitorUpdate.
-		expect_payment_sent(&nodes[0], payment_preimage, None, true, false);
+		// PaymentSent (or not, if `persist_manager_post_event` resulted in us detecting we
+		// already did that) and corresponding ChannelMonitorUpdate to mark the payment
+		// handled, but while processing the pending `MonitorEvent`s (which were not processed
+		// before the monitor was persisted) we will end up with a duplicate
+		// ChannelMonitorUpdate.
 		check_added_monitors(&nodes[0], 2);
 	}
 
@@ -1158,12 +1177,15 @@ fn do_test_dup_htlc_onchain_doesnt_fail_on_reload(persist_manager_post_event: bo
 
 #[test]
 fn test_dup_htlc_onchain_doesnt_fail_on_reload() {
-	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, true, true);
-	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, true, false);
-	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, false, false);
-	do_test_dup_htlc_onchain_doesnt_fail_on_reload(false, true, true);
-	do_test_dup_htlc_onchain_doesnt_fail_on_reload(false, true, false);
-	do_test_dup_htlc_onchain_doesnt_fail_on_reload(false, false, false);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, true, true, true);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, true, true, false);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, true, false, false);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, false, true, true);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, false, true, false);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(true, false, false, false);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(false, false, true, true);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(false, false, true, false);
+	do_test_dup_htlc_onchain_doesnt_fail_on_reload(false, false, false, false);
 }
 
 #[test]
@@ -3513,15 +3535,9 @@ fn do_no_missing_sent_on_reload(persist_manager_with_payment: bool, at_midpoint:
 	let chan_0_monitor_serialized = get_monitor!(nodes[0], chan_id).encode();
 	reload_node!(nodes[0], test_default_channel_config(), &nodes[0].node.encode(), &[&chan_0_monitor_serialized], persister_b, chain_monitor_b, nodes_0_deserialized_b);
 
-	// Because the pending payment will currently stick around forever, we'll apply a
-	// ChannelMonitorUpdate on each startup to attempt to remove it.
-	// TODO: This will be dropped in the next commit after we actually remove the payment!
-	check_added_monitors(&nodes[0], 0);
 	nodes[0].node.test_process_background_events();
-	check_added_monitors(&nodes[0], 1);
 	let events = nodes[0].node.get_and_clear_pending_events();
 	assert!(events.is_empty());
-	check_added_monitors(&nodes[0], 0);
 
 	// Ensure that we don't generate any further events even after the channel-closing commitment
 	// transaction is confirmed on-chain.
@@ -3536,9 +3552,6 @@ fn do_no_missing_sent_on_reload(persist_manager_with_payment: bool, at_midpoint:
 	reload_node!(nodes[0], test_default_channel_config(), &nodes[0].node.encode(), &[&chan_0_monitor_serialized], persister_c, chain_monitor_c, nodes_0_deserialized_c);
 	let events = nodes[0].node.get_and_clear_pending_events();
 	assert!(events.is_empty());
-
-	// TODO: This will be dropped in the next commit after we actually remove the payment!
-	check_added_monitors(&nodes[0], 2);
 }
 
 #[test]
