
Commit fa7df00

Process monitor update events in block_[dis]connected asynchronously
The instructions for `ChannelManagerReadArgs` indicate that you need to connect blocks on a newly-deserialized `ChannelManager` in a separate pass from the newly-deserialized `ChannelMonitors`, as the `ChannelManager` assumes the ability to update the monitors during block [dis]connected events, saying that users need to:

```
4) Reconnect blocks on your ChannelMonitors
5) Move the ChannelMonitors into your local chain::Watch.
6) Disconnect/connect blocks on the ChannelManager.
```

This is fine for `ChannelManager`'s purpose, but is very awkward for users. Notably, our new `lightning-block-sync` implemented on-load reconnection in the most obvious (and performant) way - connecting the blocks all at once, violating the `ChannelManagerReadArgs` API.

Luckily, the events in question really don't need to be processed with the same urgency as most channel monitor updates. The only two monitor updates which can occur in block_[dis]connected are either a) in block_connected, we identify a now-confirmed commitment transaction, closing one of our channels, or b) in block_disconnected, the funding transaction is reorganized out of the chain, making our channel no longer funded.

In the case of (a), sending a monitor update which broadcasts a conflicting holder commitment transaction is far from time-critical, though we should still ensure we do it. In the case of (b), we should try to broadcast our holder commitment transaction when we can, but doing so within a few minutes is fine on the scale of block mining anyway.

Note that in both cases we cannot simply move the logic to ChannelMonitor::block_[dis]connected, as this could result in us broadcasting a commitment transaction from ChannelMonitor, then revoking the now-broadcasted state, and only then receiving the block_[dis]connected event in the ChannelManager.

Thus, we move both events into an internal event queue and process them in timer_chan_freshness_every_min().
1 parent fd31d91 commit fa7df00
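To make the new ordering concrete, here is a rough sketch of the on-load sequence this change permits. It is illustrative only: `read_monitors_from_disk`, `read_manager_from_disk`, `blocks_missed_while_offline`, `replay_block_on_monitor`, and `chain_watch` are hypothetical placeholders for the embedder's own storage and chain-source code; only `block_connected` and `timer_chan_freshness_every_min` are the `ChannelManager` methods touched by this commit.

```rust
// Hypothetical resume-from-disk flow; helper names are placeholders, not LDK API.
fn resume_after_restart() {
	let mut monitors = read_monitors_from_disk();    // hypothetical: the serialized ChannelMonitors
	let channel_manager = read_manager_from_disk();  // hypothetical: built via ChannelManagerReadArgs

	// Steps 4-6 of the ChannelManagerReadArgs docs. With this commit their relative order no
	// longer matters, so blocks can be replayed on the monitors and the manager in one pass,
	// which is what lightning-block-sync already does.
	for (header, txdata, height) in blocks_missed_while_offline() { // hypothetical block source
		for monitor in monitors.iter_mut() {
			replay_block_on_monitor(monitor, &header, &txdata, height); // hypothetical wrapper
		}
		channel_manager.block_connected(&header, &txdata, height);
	}
	for monitor in monitors {
		chain_watch().watch(monitor); // hypothetical hand-off to your chain::Watch
	}

	// Any ChannelMonitorUpdates queued during the replay (the force-closures from cases (a)
	// and (b) above) are only applied once normal operation resumes and the timer fires.
	channel_manager.timer_chan_freshness_every_min();
}
```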

4 files changed: 103 additions & 7 deletions

lightning/src/ln/channel.rs

Lines changed: 5 additions & 0 deletions
@@ -4173,6 +4173,11 @@ impl<Signer: Sign> Channel<Signer> {
 	/// Also returns the list of payment_hashes for channels which we can safely fail backwards
 	/// immediately (others we will have to allow to time out).
 	pub fn force_shutdown(&mut self, should_broadcast: bool) -> (Option<(OutPoint, ChannelMonitorUpdate)>, Vec<(HTLCSource, PaymentHash)>) {
+		// Note that we MUST NOT generate a monitor update which does anything but indicate
+		// force-closure - we're called during initialization prior to the chain_monitor in the
+		// encompassing ChannelManager being fully configured in some cases. Thus, it's likely any
+		// monitor events we generate will be delayed in being processed!
+		// See the docs for `ChannelManagerReadArgs` for more.
 		assert!(self.channel_state != ChannelState::ShutdownComplete as u32);
 
 		// We go ahead and "free" any holding cell HTLCs or HTLCs we haven't yet committed to and

lightning/src/ln/channelmanager.rs

Lines changed: 95 additions & 7 deletions
@@ -333,6 +333,15 @@ pub(super) struct ChannelHolder<Signer: Sign> {
 	pub(super) pending_msg_events: Vec<MessageSendEvent>,
 }
 
+/// Events which we process internally, but which cannot be processed immediately at the
+/// generation site for some reason. They are handled in timer_chan_freshness_every_min, so may be
+/// processed with quite some time lag.
+enum BackgroundManagerEvent {
+	/// Handle a ChannelMonitorUpdate which closes a channel, broadcasting its current latest holder
+	/// commitment transaction.
+	ClosingMonitorUpdate((OutPoint, ChannelMonitorUpdate)),
+}
+
 /// State we hold per-peer. In the future we should put channels in here, but for now we only hold
 /// the latest Init features we heard from the peer.
 struct PeerState {
@@ -436,6 +445,7 @@ pub struct ChannelManager<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref,
 	per_peer_state: RwLock<HashMap<PublicKey, Mutex<PeerState>>>,
 
 	pending_events: Mutex<Vec<events::Event>>,
+	pending_background_events: Mutex<Vec<BackgroundManagerEvent>>,
 	/// Used when we have to take a BIG lock to make sure everything is self-consistent.
 	/// Essentially just when we're serializing ourselves out.
 	/// Taken first everywhere where we are making changes before any other locks.
@@ -793,6 +803,7 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 			per_peer_state: RwLock::new(HashMap::new()),
 
 			pending_events: Mutex::new(Vec::new()),
+			pending_background_events: Mutex::new(Vec::new()),
 			total_consistency_lock: RwLock::new(()),
 			persistence_notifier: PersistenceNotifier::new(),
 
@@ -1853,13 +1864,35 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 		events.append(&mut new_events);
 	}
 
+	/// Free the background events, generally called from timer_chan_freshness_every_min.
+	///
+	/// Exposed for testing to allow us to process events quickly without generating accidental
+	/// BroadcastChannelUpdate events in timer_chan_freshness_every_min.
+	///
+	/// Expects the caller to have a total_consistency_lock read lock.
+	pub(crate) fn process_background_events(&self) {
+		let mut background_events = Vec::new();
+		mem::swap(&mut *self.pending_background_events.lock().unwrap(), &mut background_events);
+		for event in background_events.drain(..) {
+			match event {
+				BackgroundManagerEvent::ClosingMonitorUpdate((funding_txo, update)) => {
+					// The channel has already been closed, so no use bothering to care about the
+					// monitor updating completing.
+					let _ = self.chain_monitor.update_channel(funding_txo, update);
+				},
+			}
+		}
+	}
+
 	/// If a peer is disconnected we mark any channels with that peer as 'disabled'.
 	/// After some time, if channels are still disabled we need to broadcast a ChannelUpdate
 	/// to inform the network about the uselessness of these channels.
 	///
 	/// This method handles all the details, and must be called roughly once per minute.
 	pub fn timer_chan_freshness_every_min(&self) {
 		let _persistence_guard = PersistenceNotifierGuard::new(&self.total_consistency_lock, &self.persistence_notifier);
+		self.process_background_events();
+
 		let mut channel_state_lock = self.channel_state.lock().unwrap();
 		let channel_state = &mut *channel_state_lock;
 		for (_, chan) in channel_state.by_id.iter_mut() {
@@ -1952,6 +1985,10 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 		//identify whether we sent it or not based on the (I presume) very different runtime
 		//between the branches here. We should make this async and move it into the forward HTLCs
 		//timer handling.
+
+		// Note that we MUST NOT end up calling methods on self.chain_monitor here - we're called
+		// from block_connected which may run during initialization prior to the chain_monitor
+		// being fully configured. See the docs for `ChannelManagerReadArgs` for more.
 		match source {
 			HTLCSource::OutboundRoute { ref path, .. } => {
 				log_trace!(self.logger, "Failing outbound payment HTLC with payment_hash {}", log_bytes!(payment_hash.0));
@@ -3166,6 +3203,9 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 {
 	/// Updates channel state based on transactions seen in a connected block.
 	pub fn block_connected(&self, header: &BlockHeader, txdata: &TransactionData, height: u32) {
+		// Note that we MUST NOT end up calling methods on self.chain_monitor here - we're called
+		// during initialization prior to the chain_monitor being fully configured in some cases.
+		// See the docs for `ChannelManagerReadArgs` for more.
 		let header_hash = header.block_hash();
 		log_trace!(self.logger, "Block {} at height {} connected", header_hash, height);
 		let _persistence_guard = PersistenceNotifierGuard::new(&self.total_consistency_lock, &self.persistence_notifier);
@@ -3217,9 +3257,7 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 					if let Some(short_id) = channel.get_short_channel_id() {
 						short_to_id.remove(&short_id);
 					}
-					// It looks like our counterparty went on-chain. We go ahead and
-					// broadcast our latest local state as well here, just in case its
-					// some kind of SPV attack, though we expect these to be dropped.
+					// It looks like our counterparty went on-chain. Close the channel.
 					failed_channels.push(channel.force_shutdown(true));
 					if let Ok(update) = self.get_channel_update(&channel) {
 						pending_msg_events.push(events::MessageSendEvent::BroadcastChannelUpdate {
@@ -3253,7 +3291,18 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 				!htlcs.is_empty() // Only retain this entry if htlcs has at least one entry.
 			});
 		}
-		for failure in failed_channels.drain(..) {
+		for mut failure in failed_channels.drain(..) {
+			// It looks like our counterparty went on-chain. We cannot broadcast our latest local
+			// state via monitor update (as Channel::force_shutdown tries to make us do) as we may
+			// still be in initialization, so we track the update internally and handle it when the
+			// user next calls timer_chan_freshness_every_min, guaranteeing we're running normally.
+			if let Some((funding_txo, update)) = failure.0.take() {
+				assert_eq!(update.updates.len(), 1);
+				if let ChannelMonitorUpdateStep::ChannelForceClosed { should_broadcast } = update.updates[0] {
+					assert!(should_broadcast);
+				} else { unreachable!(); }
+				self.pending_background_events.lock().unwrap().push(BackgroundManagerEvent::ClosingMonitorUpdate((funding_txo, update)));
+			}
 			self.finish_force_close_channel(failure);
 		}
 
@@ -3281,6 +3330,9 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 	/// If necessary, the channel may be force-closed without letting the counterparty participate
 	/// in the shutdown.
 	pub fn block_disconnected(&self, header: &BlockHeader) {
+		// Note that we MUST NOT end up calling methods on self.chain_monitor here - we're called
+		// during initialization prior to the chain_monitor being fully configured in some cases.
+		// See the docs for `ChannelManagerReadArgs` for more.
 		let _persistence_guard = PersistenceNotifierGuard::new(&self.total_consistency_lock, &self.persistence_notifier);
 		let mut failed_channels = Vec::new();
 		{
@@ -3305,7 +3357,19 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
 				}
 			});
 		}
-		for failure in failed_channels.drain(..) {
+		for mut failure in failed_channels.drain(..) {
+			// Channel::block_disconnected tells us to close if the funding transaction was
+			// un-confirmed due to a reorg. We cannot broadcast our latest local state via monitor
+			// update (as Channel::force_shutdown tries to make us do) as we may still be in
+			// initialization, so we track the update internally and handle it when the user next
+			// calls timer_chan_freshness_every_min, guaranteeing we're running normally.
+			if let Some((funding_txo, update)) = failure.0.take() {
+				assert_eq!(update.updates.len(), 1);
+				if let ChannelMonitorUpdateStep::ChannelForceClosed { should_broadcast } = update.updates[0] {
+					assert!(should_broadcast);
+				} else { unreachable!(); }
+				self.pending_background_events.lock().unwrap().push(BackgroundManagerEvent::ClosingMonitorUpdate((funding_txo, update)));
+			}
 			self.finish_force_close_channel(failure);
 		}
 		self.latest_block_height.fetch_sub(1, Ordering::AcqRel);
@@ -3913,6 +3977,17 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> Writeable f
 			event.write(writer)?;
 		}
 
+		let background_events = self.pending_background_events.lock().unwrap();
+		(background_events.len() as u64).write(writer)?;
+		for event in background_events.iter() {
+			match event {
+				BackgroundManagerEvent::ClosingMonitorUpdate((funding_txo, monitor_update)) => {
+					funding_txo.write(writer)?;
+					monitor_update.write(writer)?;
+				},
+			}
+		}
+
 		(self.last_node_announcement_serial.load(Ordering::Acquire) as u32).write(writer)?;
 
 		Ok(())
@@ -3931,8 +4006,11 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> Writeable f
 /// 3) Register all relevant ChannelMonitor outpoints with your chain watch mechanism using
 ///    ChannelMonitor::get_outputs_to_watch() and ChannelMonitor::get_funding_txo().
 /// 4) Reconnect blocks on your ChannelMonitors.
-/// 5) Move the ChannelMonitors into your local chain::Watch.
-/// 6) Disconnect/connect blocks on the ChannelManager.
+/// 5) Disconnect/connect blocks on the ChannelManager.
+/// 6) Move the ChannelMonitors into your local chain::Watch.
+///
+/// Note that the ordering of #4-6 is not of importance, however all three must occur before you
+/// call any other methods on the newly-deserialized ChannelManager.
 ///
 /// Note that because some channels may be closed during deserialization, it is critical that you
 /// always deserialize only the latest version of a ChannelManager and ChannelMonitors available to
@@ -4134,6 +4212,15 @@ impl<'a, Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref>
 			}
 		}
 
+		let background_event_count: u64 = Readable::read(reader)?;
+		let mut pending_background_events_read: Vec<BackgroundManagerEvent> = Vec::with_capacity(cmp::min(background_event_count as usize, MAX_ALLOC_SIZE/mem::size_of::<BackgroundManagerEvent>()));
+		for _ in 0..background_event_count {
+			match <u8 as Readable>::read(reader)? {
+				0 => pending_background_events_read.push(BackgroundManagerEvent::ClosingMonitorUpdate((Readable::read(reader)?, Readable::read(reader)?))),
+				_ => return Err(DecodeError::InvalidValue),
+			}
+		}
+
 		let last_node_announcement_serial: u32 = Readable::read(reader)?;
 
 		let channel_manager = ChannelManager {
@@ -4160,6 +4247,7 @@ impl<'a, Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref>
 			per_peer_state: RwLock::new(per_peer_state),
 
 			pending_events: Mutex::new(pending_events_read),
+			pending_background_events: Mutex::new(pending_background_events_read),
 			total_consistency_lock: RwLock::new(()),
 			persistence_notifier: PersistenceNotifier::new(),
 
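For completeness, a minimal sketch (assuming the embedder already drives `timer_chan_freshness_every_min` from a roughly once-per-minute timer, as its docs require) of how the queued `ClosingMonitorUpdate` events eventually reach the `chain::Watch`. The test helpers below instead call `process_background_events` directly.

```rust
// Illustrative embedder timer loop; `channel_manager` stands in for the deserialized ChannelManager.
// Each tick drains pending_background_events (applying any deferred force-close monitor updates
// via chain_monitor.update_channel) before the usual channel-freshness bookkeeping.
loop {
	std::thread::sleep(std::time::Duration::from_secs(60));
	channel_manager.timer_chan_freshness_every_min();
}
```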
lightning/src/ln/functional_test_utils.rs

Lines changed: 2 additions & 0 deletions
@@ -83,11 +83,13 @@ pub fn connect_block<'a, 'b, 'c, 'd>(node: &'a Node<'b, 'c, 'd>, block: &Block,
 	let txdata: Vec<_> = block.txdata.iter().enumerate().collect();
 	node.chain_monitor.chain_monitor.block_connected(&block.header, &txdata, height);
 	node.node.block_connected(&block.header, &txdata, height);
+	node.node.process_background_events();
 }
 
 pub fn disconnect_block<'a, 'b, 'c, 'd>(node: &'a Node<'b, 'c, 'd>, header: &BlockHeader, height: u32) {
 	node.chain_monitor.chain_monitor.block_disconnected(header, height);
 	node.node.block_disconnected(header);
+	node.node.process_background_events();
 }
 
 pub struct TestChanMonCfg {

lightning/src/ln/reorg_tests.rs

Lines changed: 1 addition & 0 deletions
@@ -207,6 +207,7 @@ fn test_unconf_chan() {
 		nodes[0].node.block_disconnected(&headers.pop().unwrap());
 	}
 	check_closed_broadcast!(nodes[0], false);
+	nodes[0].node.process_background_events(); // Required to free the pending background monitor update
 	check_added_monitors!(nodes[0], 1);
 	let channel_state = nodes[0].node.channel_state.lock().unwrap();
 	assert_eq!(channel_state.by_id.len(), 0);
