
Commit 8383d2b

Generate new ReleasePaymentComplete monitor updates
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, then the `ChannelMonitor` gets persisted (e.g. due to a block update), and then the node crashes prior to persisting the `ChannelManager` again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post-closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` were replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it if `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it. Constantly re-adding a payment state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment), so `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, nor HTLCs which were resolved on chain with a timeout after `ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`, and they work great under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will never see payment resolution events for such an HTLC. Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete.

Note that we don't particularly care about non-payment HTLCs, as there is no re-hydration of state to do there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs` as there's nothing to do - we always attempt to replay the success/failure and figure out if it mattered based on whether there was still an HTLC to claim/fail.

Here we begin generating the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, updating functional tests for the new `ChannelMonitorUpdate`s where required.
1 parent 5a5c708 commit 8383d2b
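For orientation, these are the shapes the commit wires together, reconstructed from the diff below - the authoritative definitions of `PaymentCompleteUpdate` and the new `ChannelMonitorUpdateStep` variant were added earlier in this series, so treat this as a sketch rather than the actual source:

// Sketch only: the field set is inferred from the diff below. All types
// (PublicKey, OutPoint, ChannelId, SentHTLCId) are existing LDK internals.
pub(crate) struct PaymentCompleteUpdate {
	counterparty_node_id: PublicKey,
	channel_funding_outpoint: OutPoint,
	channel_id: ChannelId,
	htlc_id: SentHTLCId,
}

// The new update step tells a `ChannelMonitor` that the `ChannelManager` has
// durably processed an HTLC's resolution, so the monitor no longer needs to
// replay that resolution on startup.
enum ChannelMonitorUpdateStep {
	// ... existing variants ...
	ReleasePaymentComplete { htlc: SentHTLCId },
}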

9 files changed: 277 additions & 48 deletions

lightning/src/chain/channelmonitor.rs

Lines changed: 1 addition & 0 deletions
@@ -3956,6 +3956,7 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
 		if updates.update_id == LEGACY_CLOSED_CHANNEL_UPDATE_ID || self.lockdown_from_offchain {
 			assert_eq!(updates.updates.len(), 1);
 			match updates.updates[0] {
+				ChannelMonitorUpdateStep::ReleasePaymentComplete { .. } => {},
 				ChannelMonitorUpdateStep::ChannelForceClosed { .. } => {},
 				// We should have already seen a `ChannelForceClosed` update if we're trying to
 				// provide a preimage at this point.

lightning/src/ln/chanmon_update_fail_tests.rs

Lines changed: 1 addition & 0 deletions
@@ -3870,6 +3870,7 @@ fn do_test_durable_preimages_on_closed_channel(
 	};
 	nodes[0].node.force_close_broadcasting_latest_txn(&chan_id_ab, &node_b_id, err_msg).unwrap();
 	check_closed_event(&nodes[0], 1, reason, false, &[node_b_id], 100000);
+	check_added_monitors(&nodes[0], 1);
 	let as_closing_tx = nodes[0].tx_broadcaster.txn_broadcasted.lock().unwrap().split_off(0);
 	assert_eq!(as_closing_tx.len(), 1);

lightning/src/ln/channelmanager.rs

Lines changed: 140 additions & 13 deletions
@@ -1255,6 +1255,9 @@ pub(crate) enum EventCompletionAction {
 		channel_funding_outpoint: OutPoint,
 		channel_id: ChannelId,
 	},
+
+	/// Note that this action will be dropped on downgrade to LDK prior to 0.2!
+	ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
 }
 impl_writeable_tlv_based_enum!(EventCompletionAction,
 	(0, ReleaseRAAChannelMonitorUpdate) => {
@@ -1263,7 +1266,8 @@ impl_writeable_tlv_based_enum!(EventCompletionAction,
 		// Note that by the time we get past the required read above, channel_funding_outpoint will be
 		// filled in, so we can safely unwrap it here.
 		(3, channel_id, (default_value, ChannelId::v1_from_funding_outpoint(channel_funding_outpoint.0.unwrap()))),
-	}
+	},
+	{1, ReleasePaymentCompleteChannelMonitorUpdate} => (),
 );
 
 /// The source argument which is passed to [`ChannelManager::claim_mpp_part`].
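(An aside on the serialization change above: LDK's TLV encodings follow the BOLT "it's okay to be odd" convention, under which readers skip unknown odd-typed records but fail on unknown even-typed ones. Giving the new variant the odd type 1 is what yields the downgrade behavior noted in the doc comment - the action is silently dropped rather than breaking the read. A self-contained sketch of the rule, not LDK's actual macro expansion:)

// Illustration only: the even/odd rule LDK TLV readers apply to unknown records.
#[derive(Debug)]
enum DecodeError {
	UnknownRequiredFeature,
}

fn on_unknown_tlv_type(type_num: u64) -> Result<(), DecodeError> {
	if type_num % 2 == 1 {
		// Odd: optional. An older reader skips the record, which is why the new
		// action (type 1) is dropped on downgrade to LDK prior to 0.2.
		Ok(())
	} else {
		// Even: required. An older reader must fail to read the stream.
		Err(DecodeError::UnknownRequiredFeature)
	}
}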
@@ -7885,7 +7889,17 @@ where
 					// rare cases where a MonitorUpdate is replayed after restart because a
 					// ChannelMonitor wasn't persisted after it was applied (even though the
 					// ChannelManager was).
-					// TODO
+					// For such cases, we also check that there's no existing pending event to
+					// complete this action already, which we let finish instead.
+					let action =
+						EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update);
+					let have_action = {
+						let pending_events = self.pending_events.lock().unwrap();
+						pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action))
+					};
+					if !have_action {
+						self.handle_post_event_actions([action]);
+					}
 				}
 			},
 			HTLCSource::PreviousHopData(HTLCPreviousHopData {
@@ -8505,19 +8519,34 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 		next_user_channel_id: Option<u128>, attribution_data: Option<AttributionData>,
 		send_timestamp: Option<Duration>,
 	) {
+		debug_assert_eq!(
+			startup_replay,
+			!self.background_events_processed_since_startup.load(Ordering::Acquire)
+		);
+		let htlc_id = SentHTLCId::from_source(&source);
 		match source {
 			HTLCSource::OutboundRoute {
 				session_priv, payment_id, path, bolt12_invoice, ..
 			} => {
-				debug_assert!(self.background_events_processed_since_startup.load(Ordering::Acquire),
+				debug_assert!(!startup_replay,
 					"We don't support claim_htlc claims during startup - monitors may not be available yet");
 				debug_assert_eq!(next_channel_counterparty_node_id, path.hops[0].pubkey);
-				let mut ev_completion_action =
+
+				let mut ev_completion_action = if from_onchain {
+					let release = PaymentCompleteUpdate {
+						counterparty_node_id: next_channel_counterparty_node_id,
+						channel_funding_outpoint: next_channel_outpoint,
+						channel_id: next_channel_id,
+						htlc_id,
+					};
+					Some(EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(release))
+				} else {
 					Some(EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
 						channel_funding_outpoint: next_channel_outpoint,
 						channel_id: next_channel_id,
 						counterparty_node_id: path.hops[0].pubkey,
-					});
+					})
+				};
 				self.pending_outbound_payments.claim_htlc(
 					payment_id,
 					payment_preimage,
@@ -11141,12 +11170,18 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 						channel_id,
 					};
 					let reason = HTLCFailReason::from_failure_code(failure_reason);
+					let completion_update = Some(PaymentCompleteUpdate {
+						counterparty_node_id,
+						channel_funding_outpoint: funding_outpoint,
+						channel_id,
+						htlc_id: SentHTLCId::from_source(&htlc_update.source),
+					});
 					self.fail_htlc_backwards_internal(
 						&htlc_update.source,
 						&htlc_update.payment_hash,
 						&reason,
 						receiver,
-						None,
+						completion_update,
 					);
 				}
 			},
@@ -12677,13 +12712,67 @@ where
 					channel_id,
 					counterparty_node_id,
 				} => {
+					let startup_complete =
+						self.background_events_processed_since_startup.load(Ordering::Acquire);
+					debug_assert!(startup_complete);
 					self.handle_monitor_update_release(
 						counterparty_node_id,
 						channel_funding_outpoint,
 						channel_id,
 						None,
 					);
 				},
+				EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(
+					PaymentCompleteUpdate {
+						counterparty_node_id,
+						channel_funding_outpoint,
+						channel_id,
+						htlc_id,
+					},
+				) => {
+					let per_peer_state = self.per_peer_state.read().unwrap();
+					let mut peer_state = per_peer_state
+						.get(&counterparty_node_id)
+						.map(|state| state.lock().unwrap())
+						.expect("Channels originating a preimage must have peer state");
+					let update_id = peer_state
+						.closed_channel_monitor_update_ids
+						.get_mut(&channel_id)
+						.expect("Channels originating a preimage must have a monitor");
+					*update_id += 1;
+
+					let update = ChannelMonitorUpdate {
+						update_id: *update_id,
+						channel_id: Some(channel_id),
+						updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+							htlc: htlc_id,
+						}],
+					};
+
+					let during_startup =
+						!self.background_events_processed_since_startup.load(Ordering::Acquire);
+					if during_startup {
+						let event = BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+							counterparty_node_id,
+							funding_txo: channel_funding_outpoint,
+							channel_id,
+							update,
+						};
+						self.pending_background_events.lock().unwrap().push(event);
+					} else {
+						handle_new_monitor_update!(
+							self,
+							channel_funding_outpoint,
+							update,
+							peer_state,
+							peer_state,
+							per_peer_state,
+							counterparty_node_id,
+							channel_id,
+							POST_CHANNEL_CLOSE
+						);
+					}
+				},
 			}
 		}
 	}
@@ -16410,6 +16499,7 @@ where
 							monitor,
 							Some(htlc.payment_hash),
 						);
+						let htlc_id = SentHTLCId::from_source(&htlc_source);
 						match htlc_source {
 							HTLCSource::PreviousHopData(prev_hop_data) => {
 								let pending_forward_matches_htlc = |info: &PendingAddHTLCInfo| {
@@ -16467,6 +16557,15 @@ where
 							} => {
 								if let Some(preimage) = preimage_opt {
 									let pending_events = Mutex::new(pending_events_read);
+									let update = PaymentCompleteUpdate {
+										counterparty_node_id: monitor.get_counterparty_node_id(),
+										channel_funding_outpoint: monitor.get_funding_txo(),
+										channel_id: monitor.channel_id(),
+										htlc_id,
+									};
+									let mut compl_action = Some(
+										EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update)
+									);
 									// Note that we set `from_onchain` to "false" here,
 									// deliberately keeping the pending payment around forever.
 									// Given it should only occur when we have a channel we're
@@ -16475,13 +16574,6 @@ where
 									// generating a `PaymentPathSuccessful` event but regenerating
 									// it and the `PaymentSent` on every restart until the
 									// `ChannelMonitor` is removed.
-									let mut compl_action = Some(
-										EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
-											channel_funding_outpoint: monitor.get_funding_txo(),
-											channel_id: monitor.channel_id(),
-											counterparty_node_id: path.hops[0].pubkey,
-										},
-									);
 									pending_outbounds.claim_htlc(
 										payment_id,
 										preimage,
@@ -16493,6 +16585,41 @@ where
 										&pending_events,
 										&&logger,
 									);
+									// If the completion action was not consumed, then there was no
+									// payment to claim, and we need to tell the `ChannelMonitor`
+									// we don't need to hear about the HTLC again, at least as long
+									// as the PaymentSent event isn't still sitting around in our
+									// event queue.
+									let have_action = if compl_action.is_some() {
+										let pending_events = pending_events.lock().unwrap();
+										pending_events.iter().any(|(_, act)| *act == compl_action)
+									} else {
+										false
+									};
+									if !have_action && compl_action.is_some() {
+										let mut peer_state = per_peer_state
+											.get(&counterparty_node_id)
+											.map(|state| state.lock().unwrap())
+											.expect("Channels originating a preimage must have peer state");
+										let update_id = peer_state
+											.closed_channel_monitor_update_ids
+											.get_mut(channel_id)
+											.expect("Channels originating a preimage must have a monitor");
+										*update_id += 1;
+
+										pending_background_events.push(BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+											counterparty_node_id: monitor.get_counterparty_node_id(),
+											funding_txo: monitor.get_funding_txo(),
+											channel_id: monitor.channel_id(),
+											update: ChannelMonitorUpdate {
+												update_id: *update_id,
+												channel_id: Some(monitor.channel_id()),
+												updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+													htlc: htlc_id,
+												}],
+											},
+										});
+									}
 									pending_events_read = pending_events.into_inner().unwrap();
 								}
 							},

lightning/src/ln/functional_test_utils.rs

Lines changed: 15 additions & 0 deletions
@@ -2798,6 +2798,9 @@ pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM = CM>>(
 	expected_fee_msat_opt: Option<Option<u64>>, expect_per_path_claims: bool,
 	expect_post_ev_mon_update: bool,
 ) -> (Option<PaidBolt12Invoice>, Vec<Event>) {
+	if expect_post_ev_mon_update {
+		check_added_monitors(node, 0);
+	}
 	let events = node.node().get_and_clear_pending_events();
 	let expected_payment_hash = PaymentHash(
 		bitcoin::hashes::sha256::Hash::hash(&expected_payment_preimage.0).to_byte_array(),
@@ -3047,6 +3050,7 @@ pub struct PaymentFailedConditions<'a> {
 	pub(crate) expected_blamed_chan_closed: Option<bool>,
 	pub(crate) expected_mpp_parts_remain: bool,
 	pub(crate) retry_expected: bool,
+	pub(crate) from_mon_update: bool,
 }
 
 impl<'a> PaymentFailedConditions<'a> {
@@ -3057,6 +3061,7 @@ impl<'a> PaymentFailedConditions<'a> {
 			expected_blamed_chan_closed: None,
 			expected_mpp_parts_remain: false,
 			retry_expected: false,
+			from_mon_update: false,
 		}
 	}
 	pub fn mpp_parts_remain(mut self) -> Self {
@@ -3081,6 +3086,10 @@ impl<'a> PaymentFailedConditions<'a> {
 		self.retry_expected = true;
 		self
 	}
+	pub fn from_mon_update(mut self) -> Self {
+		self.from_mon_update = true;
+		self
+	}
 }
 
 #[cfg(any(test, feature = "_externalize_tests"))]
@@ -3190,7 +3199,13 @@ pub fn expect_payment_failed_conditions<'a, 'b, 'c, 'd, 'e>(
 	node: &'a Node<'b, 'c, 'd>, expected_payment_hash: PaymentHash,
 	expected_payment_failed_permanently: bool, conditions: PaymentFailedConditions<'e>,
 ) {
+	if conditions.from_mon_update {
+		check_added_monitors(node, 0);
+	}
 	let events = node.node.get_and_clear_pending_events();
+	if conditions.from_mon_update {
+		check_added_monitors(node, 1);
+	}
 	expect_payment_failed_conditions_event(
 		events,
 		expected_payment_hash,
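For illustration, a test that expects an HTLC failure to be finalized by one of the new monitor updates might use the flag like this (hypothetical call site - `nodes` and `payment_hash` are assumed from the usual test setup):

// Checks that no monitor update has been added before the failure events are
// processed, and that exactly one `ChannelMonitorUpdate` (the
// `ReleasePaymentComplete`) is added once they are.
let conditions = PaymentFailedConditions::new().from_mon_update();
expect_payment_failed_conditions(&nodes[0], payment_hash, false, conditions);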
