Commit 1440d95
Generate new ReleasePaymentComplete monitor updates
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due to a block update), and the node crashes before the `ChannelManager` is persisted again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one it could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post-closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` were replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it when `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it. Because constantly re-adding a payment state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment), `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, nor for HTLCs which were resolved on chain with a timeout after `ANTI_REORG_DELAY` confirmations. This matches the criteria for generating a `MonitorEvent` and works great under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will never see payment resolution events for such an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete. Note that we don't particularly care about non-payment HTLCs, as there is no re-hydration of state to do there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs` as there's nothing to do - we always attempt to replay the success/failure and figure out whether it mattered based on whether there was still an HTLC to claim/fail.

Here we begin generating the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, updating functional tests for the new `ChannelMonitorUpdate`s where required.
1 parent 8b637cc commit 1440d95

9 files changed: +279 -51 lines
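For orientation, below is a minimal, self-contained sketch of the mechanism described in the commit message. It is not LDK's actual internal code; the types are simplified stand-ins that only mirror the names appearing in this diff (for example, `PaymentCompleteUpdate` is elided and `SentHTLCId` is reduced to a plain id). It shows the core idea: once a payment-resolution event has actually been handed to the user, generate a `ReleasePaymentComplete` monitor update for the closed channel, deduplicating against any event still queued with the same completion action.

    // Illustrative sketch only - simplified stand-ins for LDK-internal types,
    // not the real ChannelManager/ChannelMonitor logic from this commit.

    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    struct SentHTLCId(u64);

    #[derive(Clone, PartialEq, Debug)]
    enum ChannelMonitorUpdateStep {
        ReleasePaymentComplete { htlc: SentHTLCId },
    }

    #[derive(Clone, PartialEq, Debug)]
    struct ChannelMonitorUpdate {
        update_id: u64,
        updates: Vec<ChannelMonitorUpdateStep>,
    }

    #[derive(Clone, PartialEq, Debug)]
    enum EventCompletionAction {
        ReleasePaymentCompleteChannelMonitorUpdate { htlc: SentHTLCId },
    }

    /// Once a `PaymentSent`/`PaymentFailed` event has been consumed, build the monitor
    /// update telling the `ChannelMonitor` that the HTLC's resolution is durable, unless
    /// an event carrying the same completion action is still queued (in which case that
    /// event will trigger the update when it is handled).
    fn maybe_release_payment_complete(
        pending_events: &[(&str, Option<EventCompletionAction>)],
        closed_channel_update_id: &mut u64,
        htlc: SentHTLCId,
    ) -> Option<ChannelMonitorUpdate> {
        let action = EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate { htlc };
        let already_queued = pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action));
        if already_queued {
            return None;
        }
        // Closed channels keep a per-channel counter so regenerated updates stay ordered.
        *closed_channel_update_id += 1;
        Some(ChannelMonitorUpdate {
            update_id: *closed_channel_update_id,
            updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete { htlc }],
        })
    }

    fn main() {
        let mut update_id = 1;
        // No queued event references this HTLC, so an update is generated.
        let queued: Vec<(&str, Option<EventCompletionAction>)> = vec![("PaymentSent", None)];
        let update = maybe_release_payment_complete(&queued, &mut update_id, SentHTLCId(7));
        assert!(update.is_some());
        println!("{:?}", update);
    }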

lightning/src/chain/channelmonitor.rs

Lines changed: 1 addition & 0 deletions

@@ -4119,6 +4119,7 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
 		if updates.update_id == LEGACY_CLOSED_CHANNEL_UPDATE_ID || self.lockdown_from_offchain {
 			assert_eq!(updates.updates.len(), 1);
 			match updates.updates[0] {
+				ChannelMonitorUpdateStep::ReleasePaymentComplete { .. } => {},
 				ChannelMonitorUpdateStep::ChannelForceClosed { .. } => {},
 				// We should have already seen a `ChannelForceClosed` update if we're trying to
 				// provide a preimage at this point.

lightning/src/ln/chanmon_update_fail_tests.rs

Lines changed: 1 addition & 0 deletions

@@ -3870,6 +3870,7 @@ fn do_test_durable_preimages_on_closed_channel(
 	};
 	nodes[0].node.force_close_broadcasting_latest_txn(&chan_id_ab, &node_b_id, err_msg).unwrap();
 	check_closed_event(&nodes[0], 1, reason, false, &[node_b_id], 100000);
+	check_added_monitors(&nodes[0], 1);
 	let as_closing_tx = nodes[0].tx_broadcaster.txn_broadcasted.lock().unwrap().split_off(0);
 	assert_eq!(as_closing_tx.len(), 1);

lightning/src/ln/channelmanager.rs

Lines changed: 142 additions & 14 deletions

@@ -1267,6 +1267,12 @@ pub(crate) enum EventCompletionAction {
 		channel_funding_outpoint: Option<OutPoint>,
 		channel_id: ChannelId,
 	},
+
+	/// When a payment's resolution is communicated to the downstream logic via
+	/// [`Event::PaymentSent`] or [`Event::PaymentFailed`] we may want to mark the payment as
+	/// fully-resolved in the [`ChannelMonitor`], which we do via this action.
+	/// Note that this action will be dropped on downgrade to LDK prior to 0.2!
+	ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
 }
 impl_writeable_tlv_based_enum!(EventCompletionAction,
 	(0, ReleaseRAAChannelMonitorUpdate) => {
@@ -1279,6 +1285,7 @@ impl_writeable_tlv_based_enum!(EventCompletionAction,
 			ChannelId::v1_from_funding_outpoint(channel_funding_outpoint.unwrap())
 		})),
 	}
+	{1, ReleasePaymentCompleteChannelMonitorUpdate} => (),
 );
 
 /// The source argument which is passed to [`ChannelManager::claim_mpp_part`].
@@ -8013,7 +8020,17 @@ where
 				// rare cases where a MonitorUpdate is replayed after restart because a
 				// ChannelMonitor wasn't persisted after it was applied (even though the
 				// ChannelManager was).
-				// TODO
+				// For such cases, we also check that there's no existing pending event to
+				// complete this action already, which we let finish instead.
+				let action =
+					EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update);
+				let have_action = {
+					let pending_events = self.pending_events.lock().unwrap();
+					pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action))
+				};
+				if !have_action {
+					self.handle_post_event_actions([action]);
+				}
 			}
 		},
 		HTLCSource::PreviousHopData(HTLCPreviousHopData {
@@ -8642,19 +8659,34 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 		next_user_channel_id: Option<u128>, attribution_data: Option<AttributionData>,
 		send_timestamp: Option<Duration>,
 	) {
+		debug_assert_eq!(
+			startup_replay,
+			!self.background_events_processed_since_startup.load(Ordering::Acquire)
+		);
+		let htlc_id = SentHTLCId::from_source(&source);
 		match source {
 			HTLCSource::OutboundRoute {
 				session_priv, payment_id, path, bolt12_invoice, ..
 			} => {
-				debug_assert!(self.background_events_processed_since_startup.load(Ordering::Acquire),
+				debug_assert!(!startup_replay,
 					"We don't support claim_htlc claims during startup - monitors may not be available yet");
 				debug_assert_eq!(next_channel_counterparty_node_id, path.hops[0].pubkey);
-				let mut ev_completion_action =
+
+				let mut ev_completion_action = if from_onchain {
+					let release = PaymentCompleteUpdate {
+						counterparty_node_id: next_channel_counterparty_node_id,
+						channel_funding_outpoint: next_channel_outpoint,
+						channel_id: next_channel_id,
+						htlc_id,
+					};
+					Some(EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(release))
+				} else {
 					Some(EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
 						channel_funding_outpoint: Some(next_channel_outpoint),
 						channel_id: next_channel_id,
 						counterparty_node_id: path.hops[0].pubkey,
-					});
+					})
+				};
 				self.pending_outbound_payments.claim_htlc(
 					payment_id,
 					payment_preimage,
@@ -11372,12 +11404,18 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 					channel_id,
 				};
 				let reason = HTLCFailReason::from_failure_code(failure_reason);
+				let completion_update = Some(PaymentCompleteUpdate {
+					counterparty_node_id,
+					channel_funding_outpoint: funding_outpoint,
+					channel_id,
+					htlc_id: SentHTLCId::from_source(&htlc_update.source),
+				});
 				self.fail_htlc_backwards_internal(
 					&htlc_update.source,
 					&htlc_update.payment_hash,
 					&reason,
 					receiver,
-					None,
+					completion_update,
 				);
 			}
 		},
@@ -12864,8 +12902,62 @@ where
 				channel_id,
 				counterparty_node_id,
 			} => {
+				let startup_complete =
+					self.background_events_processed_since_startup.load(Ordering::Acquire);
+				debug_assert!(startup_complete);
 				self.handle_monitor_update_release(counterparty_node_id, channel_id, None);
 			},
+			EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(
+				PaymentCompleteUpdate {
+					counterparty_node_id,
+					channel_funding_outpoint,
+					channel_id,
+					htlc_id,
+				},
+			) => {
+				let per_peer_state = self.per_peer_state.read().unwrap();
+				let mut peer_state = per_peer_state
+					.get(&counterparty_node_id)
+					.map(|state| state.lock().unwrap())
+					.expect("Channels originating a payment resolution must have peer state");
+				let update_id = peer_state
+					.closed_channel_monitor_update_ids
+					.get_mut(&channel_id)
+					.expect("Channels originating a payment resolution must have a monitor");
+				*update_id += 1;
+
+				let update = ChannelMonitorUpdate {
+					update_id: *update_id,
+					channel_id: Some(channel_id),
+					updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+						htlc: htlc_id,
+					}],
+				};
+
+				let during_startup =
+					!self.background_events_processed_since_startup.load(Ordering::Acquire);
+				if during_startup {
+					let event = BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+						counterparty_node_id,
+						funding_txo: channel_funding_outpoint,
+						channel_id,
+						update,
+					};
+					self.pending_background_events.lock().unwrap().push(event);
+				} else {
+					handle_new_monitor_update!(
+						self,
+						channel_funding_outpoint,
+						update,
+						peer_state,
+						peer_state,
+						per_peer_state,
+						counterparty_node_id,
+						channel_id,
+						POST_CHANNEL_CLOSE
+					);
+				}
+			},
 		}
 	}
 }
@@ -16557,6 +16649,7 @@ where
 						monitor,
 						Some(htlc.payment_hash),
 					);
+					let htlc_id = SentHTLCId::from_source(&htlc_source);
 					match htlc_source {
 						HTLCSource::PreviousHopData(prev_hop_data) => {
 							let pending_forward_matches_htlc = |info: &PendingAddHTLCInfo| {
@@ -16614,6 +16707,15 @@ where
 						} => {
 							if let Some(preimage) = preimage_opt {
 								let pending_events = Mutex::new(pending_events_read);
+								let update = PaymentCompleteUpdate {
+									counterparty_node_id: monitor.get_counterparty_node_id(),
+									channel_funding_outpoint: monitor.get_funding_txo(),
+									channel_id: monitor.channel_id(),
+									htlc_id,
+								};
+								let mut compl_action = Some(
+									EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update)
+								);
 								// Note that we set `from_onchain` to "false" here,
 								// deliberately keeping the pending payment around forever.
 								// Given it should only occur when we have a channel we're
@@ -16622,15 +16724,6 @@ where
 								// generating a `PaymentPathSuccessful` event but regenerating
 								// it and the `PaymentSent` on every restart until the
 								// `ChannelMonitor` is removed.
-								let mut compl_action = Some(
-									EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
-										channel_funding_outpoint: Some(
-											monitor.get_funding_txo(),
-										),
-										channel_id: monitor.channel_id(),
-										counterparty_node_id: path.hops[0].pubkey,
-									},
-								);
 								pending_outbounds.claim_htlc(
 									payment_id,
 									preimage,
@@ -16642,6 +16735,41 @@ where
 									&pending_events,
 									&&logger,
 								);
+								// If the completion action was not consumed, then there was no
+								// payment to claim, and we need to tell the `ChannelMonitor`
+								// we don't need to hear about the HTLC again, at least as long
+								// as the PaymentSent event isn't still sitting around in our
+								// event queue.
+								let have_action = if compl_action.is_some() {
+									let pending_events = pending_events.lock().unwrap();
+									pending_events.iter().any(|(_, act)| *act == compl_action)
+								} else {
+									false
+								};
+								if !have_action && compl_action.is_some() {
+									let mut peer_state = per_peer_state
+										.get(&counterparty_node_id)
+										.map(|state| state.lock().unwrap())
+										.expect("Channels originating a preimage must have peer state");
+									let update_id = peer_state
+										.closed_channel_monitor_update_ids
+										.get_mut(channel_id)
+										.expect("Channels originating a preimage must have a monitor");
+									*update_id += 1;
+
+									pending_background_events.push(BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+										counterparty_node_id: monitor.get_counterparty_node_id(),
+										funding_txo: monitor.get_funding_txo(),
+										channel_id: monitor.channel_id(),
+										update: ChannelMonitorUpdate {
+											update_id: *update_id,
+											channel_id: Some(monitor.channel_id()),
+											updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+												htlc: htlc_id,
+											}],
+										},
+									});
+								}
 								pending_events_read = pending_events.into_inner().unwrap();
 							}
 						},

lightning/src/ln/functional_test_utils.rs

Lines changed: 15 additions & 0 deletions

@@ -2803,6 +2803,9 @@ pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM = CM>>(
 	expected_fee_msat_opt: Option<Option<u64>>, expect_per_path_claims: bool,
 	expect_post_ev_mon_update: bool,
 ) -> (Option<PaidBolt12Invoice>, Vec<Event>) {
+	if expect_post_ev_mon_update {
+		check_added_monitors(node, 0);
+	}
 	let events = node.node().get_and_clear_pending_events();
 	let expected_payment_hash = PaymentHash(
 		bitcoin::hashes::sha256::Hash::hash(&expected_payment_preimage.0).to_byte_array(),
@@ -3052,6 +3055,7 @@ pub struct PaymentFailedConditions<'a> {
 	pub(crate) expected_blamed_chan_closed: Option<bool>,
 	pub(crate) expected_mpp_parts_remain: bool,
 	pub(crate) retry_expected: bool,
+	pub(crate) from_mon_update: bool,
 }
 
 impl<'a> PaymentFailedConditions<'a> {
@@ -3062,6 +3066,7 @@ impl<'a> PaymentFailedConditions<'a> {
 			expected_blamed_chan_closed: None,
 			expected_mpp_parts_remain: false,
 			retry_expected: false,
+			from_mon_update: false,
 		}
 	}
 	pub fn mpp_parts_remain(mut self) -> Self {
@@ -3086,6 +3091,10 @@ impl<'a> PaymentFailedConditions<'a> {
 		self.retry_expected = true;
 		self
 	}
+	pub fn from_mon_update(mut self) -> Self {
+		self.from_mon_update = true;
+		self
+	}
 }
 
 #[cfg(any(test, feature = "_externalize_tests"))]
@@ -3195,7 +3204,13 @@ pub fn expect_payment_failed_conditions<'a, 'b, 'c, 'd, 'e>(
 	node: &'a Node<'b, 'c, 'd>, expected_payment_hash: PaymentHash,
 	expected_payment_failed_permanently: bool, conditions: PaymentFailedConditions<'e>,
 ) {
+	if conditions.from_mon_update {
+		check_added_monitors(node, 0);
+	}
 	let events = node.node.get_and_clear_pending_events();
+	if conditions.from_mon_update {
+		check_added_monitors(node, 1);
+	}
 	expect_payment_failed_conditions_event(
 		events,
 		expected_payment_hash,