
Commit 2f2f42d

Prepare to provide new ReleasePaymentComplete monitor updates
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due to a block update), and the node crashes before the `ChannelManager` is persisted again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one it could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` were replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it if `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it.

Constantly re-adding a payment's state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment). Thus, `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, or HTLCs which were resolved on chain with a timeout after `ANTI_REORG_DELAY` confirmations. These criteria match the criteria for generating a `MonitorEvent`, and work great under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will not end up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete. Note that we don't particularly care about non-payment HTLCs, as there is no re-hydration of state to do there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs` as there's nothing to do - we always attempt to replay the success/failure and figure out whether it mattered based on whether there was still an HTLC to claim/fail.

Here we prepare to generate the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, adding a new `PaymentCompleteUpdate` struct to track the new update before we generate the `ChannelMonitorUpdate` and passing it through to the right places in `ChannelManager`. The only cases where we want to generate the new update are after a `PaymentSent` or `PaymentFailed` event when the event was the result of a `MonitorEvent` or the equivalent read during startup.

Backport of 8b637cc

Conflicts resolved in:
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/outbound_payment.rs
1 parent: 81b1628
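To make the acknowledgement-style flow described in the commit message concrete, here is a minimal, self-contained sketch. It is not LDK's actual API - `MockMonitor`, `HtlcResolution`, and `release_payment_complete` are illustrative stand-ins - but it shows the intent: the monitor keeps re-surfacing an on-chain HTLC resolution until the manager explicitly acknowledges it, which is the role the new `ReleasePaymentComplete` update is being prepared for.

use std::collections::HashMap;

// Illustrative stand-ins only; none of these are LDK types.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct SentHtlcId(u64);

#[derive(Clone, Debug, PartialEq, Eq)]
enum HtlcResolution {
    Fulfilled, // claimed on-chain with the preimage
    TimedOut,  // failed on-chain after the timeout
}

#[derive(Default)]
struct MockMonitor {
    // Resolutions we must keep surfacing until the manager acknowledges them.
    unacked: HashMap<SentHtlcId, HtlcResolution>,
}

impl MockMonitor {
    fn htlc_resolved_onchain(&mut self, id: SentHtlcId, res: HtlcResolution) {
        self.unacked.insert(id, res);
    }

    // Called on startup (and whenever events are polled): anything not yet
    // acknowledged is handed to the manager again, so a lost event is replayed.
    fn pending_resolutions(&self) -> Vec<(SentHtlcId, HtlcResolution)> {
        self.unacked.iter().map(|(id, res)| (*id, res.clone())).collect()
    }

    // The analogue of applying a ReleasePaymentComplete update: only after the
    // user-facing PaymentSent/PaymentFailed event has been handled does the
    // manager tell the monitor to stop replaying this resolution.
    fn release_payment_complete(&mut self, id: SentHtlcId) {
        self.unacked.remove(&id);
    }
}

fn main() {
    let mut monitor = MockMonitor::default();
    monitor.htlc_resolved_onchain(SentHtlcId(1), HtlcResolution::TimedOut);
    monitor.htlc_resolved_onchain(SentHtlcId(2), HtlcResolution::Fulfilled);

    // Simulate a crash before the manager persisted: on reload both
    // resolutions are still reported and the payment events can regenerate.
    assert_eq!(monitor.pending_resolutions().len(), 2);

    // Once the events have actually been handled, acknowledge them so the
    // monitor stops replaying them.
    monitor.release_payment_complete(SentHtlcId(1));
    monitor.release_payment_complete(SentHtlcId(2));
    assert!(monitor.pending_resolutions().is_empty());
}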

File tree

2 files changed: +153 -44

lightning/src/ln/channelmanager.rs

Lines changed: 144 additions & 39 deletions
@@ -1183,6 +1183,21 @@ impl_writeable_tlv_based_enum_upgradable!(MonitorUpdateCompletionAction,
 	},
 );
 
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub(crate) struct PaymentCompleteUpdate {
+	counterparty_node_id: PublicKey,
+	channel_funding_outpoint: OutPoint,
+	channel_id: ChannelId,
+	htlc_id: SentHTLCId,
+}
+
+impl_writeable_tlv_based!(PaymentCompleteUpdate, {
+	(1, channel_funding_outpoint, required),
+	(3, counterparty_node_id, required),
+	(5, channel_id, required),
+	(7, htlc_id, required),
+});
+
 #[derive(Clone, Debug, PartialEq, Eq)]
 pub(crate) enum EventCompletionAction {
 	ReleaseRAAChannelMonitorUpdate {
@@ -3298,7 +3313,7 @@ macro_rules! handle_monitor_update_completion {
 		$self.finalize_claims(updates.finalized_claimed_htlcs);
 		for failure in updates.failed_htlcs.drain(..) {
 			let receiver = HTLCDestination::NextHopChannel { node_id: Some(counterparty_node_id), channel_id };
-			$self.fail_htlc_backwards_internal(&failure.0, &failure.1, &failure.2, receiver);
+			$self.fail_htlc_backwards_internal(&failure.0, &failure.1, &failure.2, receiver, None);
 		}
 	} }
 }
@@ -3923,7 +3938,7 @@ where
 		for htlc_source in failed_htlcs.drain(..) {
 			let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
 			let receiver = HTLCDestination::NextHopChannel { node_id: Some(*counterparty_node_id), channel_id: *channel_id };
-			self.fail_htlc_backwards_internal(&htlc_source.0, &htlc_source.1, &reason, receiver);
+			self.fail_htlc_backwards_internal(&htlc_source.0, &htlc_source.1, &reason, receiver, None);
 		}
 
 		if let Some(shutdown_result) = shutdown_result {
@@ -4046,7 +4061,7 @@ where
 			let (source, payment_hash, counterparty_node_id, channel_id) = htlc_source;
 			let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
 			let receiver = HTLCDestination::NextHopChannel { node_id: Some(counterparty_node_id), channel_id };
-			self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver);
+			self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver, None);
 		}
 		if let Some((_, funding_txo, _channel_id, monitor_update)) = shutdown_res.monitor_update {
 			debug_assert!(false, "This should have been handled in `locked_close_channel`");
@@ -5642,7 +5657,7 @@ where
 
 			let failure_reason = HTLCFailReason::from_failure_code(0x4000 | 10);
 			let destination = HTLCDestination::UnknownNextHop { requested_forward_scid: short_channel_id };
-			self.fail_htlc_backwards_internal(&htlc_source, &payment.forward_info.payment_hash, &failure_reason, destination);
+			self.fail_htlc_backwards_internal(&htlc_source, &payment.forward_info.payment_hash, &failure_reason, destination, None);
 		} else { unreachable!() } // Only `PendingHTLCRouting::Forward`s are intercepted
 
 		Ok(())
@@ -6324,7 +6339,13 @@ where
 			&self.pending_events, &self.logger, |args| self.send_payment_along_path(args));
 
 		for (htlc_source, payment_hash, failure_reason, destination) in failed_forwards.drain(..) {
-			self.fail_htlc_backwards_internal(&htlc_source, &payment_hash, &failure_reason, destination);
+			self.fail_htlc_backwards_internal(
+				&htlc_source,
+				&payment_hash,
+				&failure_reason,
+				destination,
+				None,
+			);
 		}
 		self.forward_htlcs(&mut phantom_receives);
 
@@ -6686,7 +6707,7 @@ where
 			let source = HTLCSource::PreviousHopData(htlc_source.0.clone());
 			let reason = HTLCFailReason::from_failure_code(23);
 			let receiver = HTLCDestination::FailedPayment { payment_hash: htlc_source.1 };
-			self.fail_htlc_backwards_internal(&source, &htlc_source.1, &reason, receiver);
+			self.fail_htlc_backwards_internal(&source, &htlc_source.1, &reason, receiver, None);
 		}
 
 		for (err, counterparty_node_id) in handle_errors.drain(..) {
@@ -6751,7 +6772,7 @@ where
 				let reason = self.get_htlc_fail_reason_from_failure_code(failure_code, &htlc);
 				let source = HTLCSource::PreviousHopData(htlc.prev_hop);
 				let receiver = HTLCDestination::FailedPayment { payment_hash: *payment_hash };
-				self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver);
+				self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver, None);
 			}
 		}
 	}
@@ -6830,18 +6851,26 @@ where
 		for (htlc_src, payment_hash) in htlcs_to_fail.drain(..) {
 			let reason = HTLCFailReason::reason(failure_code, onion_failure_data.clone());
 			let receiver = HTLCDestination::NextHopChannel { node_id: Some(counterparty_node_id.clone()), channel_id };
-			self.fail_htlc_backwards_internal(&htlc_src, &payment_hash, &reason, receiver);
+			self.fail_htlc_backwards_internal(&htlc_src, &payment_hash, &reason, receiver, None);
 		}
 	}
 
-	fn fail_htlc_backwards_internal(&self, source: &HTLCSource, payment_hash: &PaymentHash, onion_error: &HTLCFailReason, destination: HTLCDestination) {
-		let push_forward_event = self.fail_htlc_backwards_internal_without_forward_event(source, payment_hash, onion_error, destination);
+	fn fail_htlc_backwards_internal(
+		&self, source: &HTLCSource, payment_hash: &PaymentHash, onion_error: &HTLCFailReason,
+		destination: HTLCDestination,
+		from_monitor_update_completion: Option<PaymentCompleteUpdate>,
+	) {
+		let push_forward_event = self.fail_htlc_backwards_internal_without_forward_event(source, payment_hash, onion_error, destination, from_monitor_update_completion);
 		if push_forward_event { self.push_pending_forwards_ev(); }
 	}
 
 	/// Fails an HTLC backwards to the sender of it to us.
 	/// Note that we do not assume that channels corresponding to failed HTLCs are still available.
-	fn fail_htlc_backwards_internal_without_forward_event(&self, source: &HTLCSource, payment_hash: &PaymentHash, onion_error: &HTLCFailReason, destination: HTLCDestination) -> bool {
+	fn fail_htlc_backwards_internal_without_forward_event(
+		&self, source: &HTLCSource, payment_hash: &PaymentHash, onion_error: &HTLCFailReason,
+		destination: HTLCDestination,
+		mut from_monitor_update_completion: Option<PaymentCompleteUpdate>,
+	) -> bool {
 		// Ensure that no peer state channel storage lock is held when calling this function.
 		// This ensures that future code doesn't introduce a lock-order requirement for
 		// `forward_htlcs` to be locked after the `per_peer_state` peer locks, which calling
@@ -6862,9 +6891,28 @@ where
 		let mut push_forward_event;
 		match source {
 			HTLCSource::OutboundRoute { ref path, ref session_priv, ref payment_id, .. } => {
-				push_forward_event = self.pending_outbound_payments.fail_htlc(source, payment_hash, onion_error, path,
-					session_priv, payment_id, self.probing_cookie_secret, &self.secp_ctx,
-					&self.pending_events, &self.logger);
+				push_forward_event = self.pending_outbound_payments.fail_htlc(
+					source,
+					payment_hash,
+					onion_error,
+					path,
+					session_priv,
+					payment_id,
+					self.probing_cookie_secret,
+					&self.secp_ctx,
+					&self.pending_events,
+					&self.logger,
+					&mut from_monitor_update_completion,
+				);
+				if let Some(update) = from_monitor_update_completion {
+					// If `fail_htlc` didn't `take` the post-event action, we should go ahead and
+					// complete it here as the failure was duplicative - we've already handled it.
+					// This should mostly only happen on startup, but it is possible to hit it in
+					// rare cases where a MonitorUpdate is replayed after restart because a
+					// ChannelMonitor wasn't persisted after it was applied (even though the
+					// ChannelManager was).
+					// TODO
+				}
 			},
 			HTLCSource::PreviousHopData(HTLCPreviousHopData {
 				ref short_channel_id, ref htlc_id, ref incoming_packet_shared_secret,
@@ -6979,7 +7027,7 @@ where
 				let reason = self.get_htlc_fail_reason_from_failure_code(FailureCode::InvalidOnionPayload(None), &htlc);
 				let source = HTLCSource::PreviousHopData(htlc.prev_hop);
 				let receiver = HTLCDestination::FailedPayment { payment_hash };
-				self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver);
+				self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver, None);
 			}
 			return;
 		}
@@ -7085,7 +7133,7 @@ where
 				let source = HTLCSource::PreviousHopData(htlc.prev_hop);
 				let reason = HTLCFailReason::reason(0x4000 | 15, htlc_msat_height_data);
 				let receiver = HTLCDestination::FailedPayment { payment_hash };
-				self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver);
+				self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver, None);
 			}
 			self.claimable_payments.lock().unwrap().pending_claiming_payments.remove(&payment_hash);
 		}
@@ -7358,14 +7406,34 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 				if let Some(pubkey) = next_channel_counterparty_node_id {
 					debug_assert_eq!(pubkey, path.hops[0].pubkey);
 				}
-				let ev_completion_action = EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
-					channel_funding_outpoint: Some(next_channel_outpoint),
-					channel_id: next_channel_id,
-					counterparty_node_id: path.hops[0].pubkey,
+				let mut ev_completion_action =
+					Some(EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
+						channel_funding_outpoint: Some(next_channel_outpoint),
+						channel_id: next_channel_id,
+						counterparty_node_id: path.hops[0].pubkey,
+					});
+				self.pending_outbound_payments.claim_htlc(
+					payment_id,
+					payment_preimage,
+					session_priv,
+					path,
+					from_onchain,
+					&mut ev_completion_action,
+					&self.pending_events,
+					&self.logger,
+				);
+				// If an event was generated, `claim_htlc` set `ev_completion_action` to None, if
+				// not, we should go ahead and run it now (as the claim was duplicative), at least
+				// if a PaymentClaimed event with the same action isn't already pending.
+				let have_action = if ev_completion_action.is_some() {
+					let pending_events = self.pending_events.lock().unwrap();
+					pending_events.iter().any(|(_, act)| *act == ev_completion_action)
+				} else {
+					false
 				};
-				self.pending_outbound_payments.claim_htlc(payment_id, payment_preimage,
-					session_priv, path, from_onchain, ev_completion_action, &self.pending_events,
-					&self.logger);
+				if !have_action {
+					self.handle_post_event_actions(ev_completion_action);
+				}
 			},
 			HTLCSource::PreviousHopData(hop_data) => {
 				let prev_channel_id = hop_data.channel_id;
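The claim path above passes the completion action as `&mut Option<...>` so that `claim_htlc` can `take()` it only when it actually generates an event, leaving duplicative claims for the caller to handle. The following standalone sketch illustrates that pattern; the `CompletionAction` type and `claim_if_new` helper are hypothetical, not LDK's API.

#[derive(Debug, PartialEq)]
struct CompletionAction(&'static str);

/// Returns true and takes the action only if this claim was new (i.e. an event fired).
fn claim_if_new(already_claimed: bool, action: &mut Option<CompletionAction>) -> bool {
    if already_claimed {
        // Duplicate claim: no event is generated and the action is left in place.
        false
    } else {
        let _act = action.take(); // would be attached to the generated event
        true
    }
}

fn main() {
    let mut action = Some(CompletionAction("release-monitor-update"));
    assert!(claim_if_new(false, &mut action));
    assert!(action.is_none()); // consumed alongside the event

    let mut action = Some(CompletionAction("release-monitor-update"));
    assert!(!claim_if_new(true, &mut action));
    // Duplicative claim: the caller still holds the action and must run it
    // itself (unless an equivalent event is already pending).
    assert!(action.is_some());
}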
@@ -8711,7 +8779,7 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 		for htlc_source in dropped_htlcs.drain(..) {
 			let receiver = HTLCDestination::NextHopChannel { node_id: Some(counterparty_node_id.clone()), channel_id: msg.channel_id };
 			let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
-			self.fail_htlc_backwards_internal(&htlc_source.0, &htlc_source.1, &reason, receiver);
+			self.fail_htlc_backwards_internal(&htlc_source.0, &htlc_source.1, &reason, receiver, None);
 		}
 		if let Some(shutdown_res) = finish_shutdown {
 			self.finish_close_channel(shutdown_res);
@@ -9122,7 +9190,13 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 		}
 
 		for (htlc_source, payment_hash, failure_reason, destination) in failed_intercept_forwards.drain(..) {
-			push_forward_event |= self.fail_htlc_backwards_internal_without_forward_event(&htlc_source, &payment_hash, &failure_reason, destination);
+			push_forward_event |= self.fail_htlc_backwards_internal_without_forward_event(
+				&htlc_source,
+				&payment_hash,
+				&failure_reason,
+				destination,
+				None,
+			);
 		}
 
 		if !new_intercept_events.is_empty() {
@@ -9463,7 +9537,13 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 					log_trace!(logger, "Failing HTLC with hash {} from our monitor", &htlc_update.payment_hash);
 					let receiver = HTLCDestination::NextHopChannel { node_id: counterparty_node_id, channel_id };
 					let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
-					self.fail_htlc_backwards_internal(&htlc_update.source, &htlc_update.payment_hash, &reason, receiver);
+					self.fail_htlc_backwards_internal(
+						&htlc_update.source,
+						&htlc_update.payment_hash,
+						&reason,
+						receiver,
+						None,
+					);
 				}
 			},
 			MonitorEvent::HolderForceClosed(_) | MonitorEvent::HolderForceClosedWithInfo { .. } => {
@@ -10808,8 +10888,8 @@ where
 		}
 	}
 
-	fn handle_post_event_actions(&self, actions: Vec<EventCompletionAction>) {
-		for action in actions {
+	fn handle_post_event_actions<I: IntoIterator<Item = EventCompletionAction>>(&self, actions: I) {
+		for action in actions.into_iter() {
 			match action {
 				EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
 					channel_funding_outpoint: _,
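The signature change above works because `Option<T>` implements `IntoIterator<Item = T>` (yielding zero or one item), so both the existing `Vec` of actions and the single optional action used in the claim path can be passed to the same generic method. A standalone illustration follows; the `Action` type and `run_all` function are hypothetical, not LDK code.

#[derive(Debug)]
struct Action(u32);

fn run_all<I: IntoIterator<Item = Action>>(actions: I) {
    for action in actions.into_iter() {
        println!("running {:?}", action);
    }
}

fn main() {
    run_all(vec![Action(1), Action(2)]); // the old Vec-based callers still work
    run_all(Some(Action(3)));            // a single optional action works too
    run_all(None);                       // and "no action" is simply a no-op
}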
@@ -11315,7 +11395,7 @@ where
 		}
 
 		for (source, payment_hash, reason, destination) in timed_out_htlcs.drain(..) {
-			self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, destination);
+			self.fail_htlc_backwards_internal(&source, &payment_hash, &reason, destination, None);
 		}
 	}
 
@@ -13380,7 +13460,7 @@ where
 					log_error!(logger, " The ChannelMonitor for channel {} is at counterparty commitment transaction number {} but the ChannelManager is at counterparty commitment transaction number {}.",
 						&channel.context.channel_id(), monitor.get_cur_counterparty_commitment_number(), channel.get_cur_counterparty_commitment_transaction_number());
 				}
-				let mut shutdown_result = channel.context.force_shutdown(true, ClosureReason::OutdatedChannelManager);
+				let shutdown_result = channel.context.force_shutdown(true, ClosureReason::OutdatedChannelManager);
 				if shutdown_result.unbroadcasted_batch_funding_txid.is_some() {
 					return Err(DecodeError::InvalidValue);
 				}
@@ -13401,7 +13481,9 @@ where
 						counterparty_node_id, funding_txo, channel_id, update
 					});
 				}
-				failed_htlcs.append(&mut shutdown_result.dropped_outbound_htlcs);
+				for (source, hash, cp_id, chan_id) in shutdown_result.dropped_outbound_htlcs {
+					failed_htlcs.push((source, hash, cp_id, chan_id, None));
+				}
 				channel_closures.push_back((events::Event::ChannelClosed {
 					channel_id: channel.context.channel_id(),
 					user_channel_id: channel.context.get_user_id(),
@@ -13428,7 +13510,13 @@ where
 					log_info!(logger,
 						"Failing HTLC with hash {} as it is missing in the ChannelMonitor for channel {} but was present in the (stale) ChannelManager",
 						&channel.context.channel_id(), &payment_hash);
-					failed_htlcs.push((channel_htlc_source.clone(), *payment_hash, channel.context.get_counterparty_node_id(), channel.context.channel_id()));
+					failed_htlcs.push((
+						channel_htlc_source.clone(),
+						*payment_hash,
+						channel.context.get_counterparty_node_id(),
+						channel.context.channel_id(),
+						None,
+					));
 				}
 			}
 		} else {
@@ -13956,14 +14044,23 @@ where
 						// generating a `PaymentPathSuccessful` event but regenerating
 						// it and the `PaymentSent` on every restart until the
 						// `ChannelMonitor` is removed.
-						let compl_action =
+						let mut compl_action = Some(
 							EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
 								channel_funding_outpoint: Some(monitor.get_funding_txo().0),
 								channel_id: monitor.channel_id(),
 								counterparty_node_id: path.hops[0].pubkey,
-							};
-						pending_outbounds.claim_htlc(payment_id, preimage, session_priv,
-							path, false, compl_action, &pending_events, &&logger);
+							},
+						);
+						pending_outbounds.claim_htlc(
+							payment_id,
+							preimage,
+							session_priv,
+							path,
+							false,
+							&mut compl_action,
+							&pending_events,
+							&&logger,
+						);
 						pending_events_read = pending_events.into_inner().unwrap();
 					}
 				},
@@ -13976,11 +14073,18 @@ where
 							"Failing HTLC with payment hash {} as it was resolved on-chain.",
 							payment_hash
 						);
+						let completion_action = Some(PaymentCompleteUpdate {
+							counterparty_node_id: node_id,
+							channel_funding_outpoint: monitor.get_funding_txo().0,
+							channel_id: monitor.channel_id(),
+							htlc_id: SentHTLCId::from_source(&htlc_source),
+						});
 						failed_htlcs.push((
 							htlc_source,
 							payment_hash,
 							node_id,
 							monitor.channel_id(),
+							completion_action,
 						));
 					} else {
 						log_warn!(
@@ -14575,12 +14679,13 @@ where
 		}
 	}
 
-	for htlc_source in failed_htlcs.drain(..) {
-		let (source, payment_hash, counterparty_id, channel_id) = htlc_source;
+	for htlc_source in failed_htlcs {
+		let (source, hash, counterparty_id, channel_id, ev_action) = htlc_source;
 		let receiver =
 			HTLCDestination::NextHopChannel { node_id: Some(counterparty_id), channel_id };
 		let reason = HTLCFailReason::from_failure_code(0x4000 | 8);
-		channel_manager.fail_htlc_backwards_internal(&source, &payment_hash, &reason, receiver);
+		channel_manager
+			.fail_htlc_backwards_internal(&source, &hash, &reason, receiver, ev_action);
 	}
 
 	for (source, preimage, downstream_value, downstream_closed, downstream_node_id, downstream_funding, downstream_channel_id) in pending_claims_to_replay {
