Commit 4703d4e

Do not require that no calls are made post-disconnect_socket
The only practical way to meet this requirement is to block disconnect_socket until any pending events are fully processed, leading to this trivial deadlock:

 * Thread 1: select() woken up due to a read event
 * Thread 2: Event processing causes a disconnect_socket call to fire while the PeerManager lock is held.
 * Thread 2: disconnect_socket blocks until the read event in thread 1 completes.
 * Thread 1: bytes are read from the socket and PeerManager::read_event is called, waiting on the lock still held by thread 2.

There isn't a trivial way to address this deadlock without simply making the final read_event call return immediately, which we do here.

This also implies that users can freely call event methods after disconnect_socket, but only so far as the socket descriptor is different from any later socket descriptor (ie until the file descriptor is re-used).
1 parent 2f6205b commit 4703d4e
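
The behaviour described in the commit message can be illustrated with a heavily reduced model. This is only a sketch: `ToyPeerManager`, `drop_peer`, the integer descriptors and the byte counter are hypothetical stand-ins, and `PeerHandleError` is re-declared here with just the one field visible in the diff. The point it shows is that an event arriving for a descriptor that is no longer tracked returns an error (or does nothing) instead of panicking, so a racing thread never has to be blocked.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Simplified stand-in for the library's error type (only the field seen in the diff).
#[derive(Debug, PartialEq)]
pub struct PeerHandleError {
    pub no_connection_possible: bool,
}

/// Hypothetical, reduced model of a peer manager: descriptors are plain integers
/// and the per-peer state is just a count of bytes received so far.
pub struct ToyPeerManager {
    peers: Mutex<HashMap<u32, usize>>,
}

impl ToyPeerManager {
    pub fn new() -> Self {
        ToyPeerManager { peers: Mutex::new(HashMap::new()) }
    }

    /// Start tracking a descriptor.
    pub fn new_connection(&self, descriptor: u32) {
        self.peers.lock().unwrap().insert(descriptor, 0);
    }

    /// Models the manager itself deciding to drop a peer, i.e. the point at which
    /// the real code would ask the user to disconnect the socket.
    pub fn drop_peer(&self, descriptor: u32) {
        self.peers.lock().unwrap().remove(&descriptor);
    }

    /// An unknown descriptor is treated as a late event from a racing thread and
    /// reported as an error rather than a panic.
    pub fn read_event(&self, descriptor: u32, data: &[u8]) -> Result<usize, PeerHandleError> {
        let mut peers = self.peers.lock().unwrap();
        match peers.get_mut(&descriptor) {
            None => Err(PeerHandleError { no_connection_possible: false }),
            Some(bytes_seen) => {
                *bytes_seen += data.len();
                Ok(*bytes_seen)
            }
        }
    }

    /// Disconnecting a descriptor that is already gone is a no-op.
    pub fn socket_disconnected(&self, descriptor: u32) {
        self.peers.lock().unwrap().remove(&descriptor);
    }
}
```

In this model, a `read_event` that loses the race with `drop_peer` simply gets an `Err` back, and a redundant `socket_disconnected` afterwards is harmless.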

1 file changed: +22 -15 lines changed

lightning/src/ln/peer_handler.rs

Lines changed: 22 additions & 15 deletions
@@ -194,11 +194,8 @@ pub trait SocketDescriptor : cmp::Eq + hash::Hash + Clone {
 	/// indicating that read events on this descriptor should resume. A resume_read of false does
 	/// *not* imply that further read events should be paused.
 	fn send_data(&mut self, data: &[u8], resume_read: bool) -> usize;
-	/// Disconnect the socket pointed to by this SocketDescriptor. Once this function returns, no
-	/// more calls to write_buffer_space_avail, read_event or socket_disconnected may be made with
-	/// this descriptor. No socket_disconnected call should be generated as a result of this call,
-	/// though races may occur whereby disconnect_socket is called after a call to
-	/// socket_disconnected but prior to socket_disconnected returning.
+	/// Disconnect the socket pointed to by this SocketDescriptor.
+	/// No [`PeerManager::socket_disconnected`] call need be generated as a result of this call.
 	fn disconnect_socket(&mut self);
 }

@@ -616,7 +613,12 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 	pub fn write_buffer_space_avail(&self, descriptor: &mut Descriptor) -> Result<(), PeerHandleError> {
 		let mut peers = self.peers.lock().unwrap();
 		match peers.peers.get_mut(descriptor) {
-			None => panic!("Descriptor for write_event is not already known to PeerManager"),
+			None => {
+				// This is most likely a simple race condition where the user found that the socket
+				// was writeable, then we told the user to `disconnect_socket()`, then they called
+				// this method. Return an error to make sure we get disconnected.
+				return Err(PeerHandleError { no_connection_possible: false });
+			},
 			Some(peer) => {
 				peer.awaiting_write_event = false;
 				self.do_attempt_write_data(descriptor, peer);
@@ -636,7 +638,6 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 	/// If Ok(true) is returned, further read_events should not be triggered until a send_data call
 	/// on this file descriptor has resume_read set (preventing DoS issues in the send buffer).
 	///
-	/// Panics if the descriptor was not previously registered in a new_*_connection event.
 	pub fn read_event(&self, peer_descriptor: &mut Descriptor, data: &[u8]) -> Result<bool, PeerHandleError> {
 		match self.do_read_event(peer_descriptor, data) {
 			Ok(res) => Ok(res),
@@ -664,7 +665,12 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 			let mut msgs_to_forward = Vec::new();
 			let mut peer_node_id = None;
 			let pause_read = match peers.peers.get_mut(peer_descriptor) {
-				None => panic!("Descriptor for read_event is not already known to PeerManager"),
+				None => {
+					// This is most likely a simple race condition where the user read some bytes
+					// from the socket, then we told the user to `disconnect_socket()`, then they
+					// called this method. Return an error to make sure we get disconnected.
+					return Err(PeerHandleError { no_connection_possible: false });
+				},
 				Some(peer) => {
 					assert!(peer.pending_read_buffer.len() > 0);
 					assert!(peer.pending_read_buffer.len() > peer.pending_read_buffer_pos);
@@ -1292,12 +1298,9 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D

 	/// Indicates that the given socket descriptor's connection is now closed.
 	///
-	/// This must only be called if the socket has been disconnected by the peer or your own
-	/// decision to disconnect it and must NOT be called in any case where other parts of this
-	/// library (eg PeerHandleError, explicit disconnect_socket calls) instruct you to disconnect
-	/// the peer.
-	///
-	/// Panics if the descriptor was not previously registered in a successful new_*_connection event.
+	/// This need only be called if the socket has been disconnected by the peer or your own
+	/// decision to disconnect it and may be skipped in any case where other parts of this library
+	/// (eg PeerHandleError, explicit disconnect_socket calls) instruct you to disconnect the peer.
 	pub fn socket_disconnected(&self, descriptor: &Descriptor) {
 		self.disconnect_event_internal(descriptor, false);
 	}
@@ -1306,7 +1309,11 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 		let mut peers = self.peers.lock().unwrap();
 		let peer_option = peers.peers.remove(descriptor);
 		match peer_option {
-			None => panic!("Descriptor for disconnect_event is not already known to PeerManager"),
+			None => {
+				// This is most likely a simple race condition where the user found that the socket
+				// was disconnected, then we told the user to `disconnect_socket()`, then they
+				// called this method. Either way we're disconnected, return.
+			},
 			Some(peer) => {
 				match peer.their_node_id {
 					Some(node_id) => {
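
The commit message notes that late event calls are only safe while the old descriptor compares unequal to any later one, which stops being true once a raw file descriptor number is reused. One way a caller might keep that property is sketched below. It is an assumption for illustration, not something this commit adds: `ToyDescriptor` and `NEXT_ID` are hypothetical names, and equality and hashing are keyed on a process-wide counter rather than on the fd.

```rust
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-wide counter handing out a fresh identity for every connection.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical descriptor: `fd` may be recycled by the OS, `id` never is.
#[derive(Clone, Debug)]
pub struct ToyDescriptor {
    pub fd: i32,
    id: u64,
}

impl ToyDescriptor {
    pub fn new(fd: i32) -> Self {
        ToyDescriptor { fd, id: NEXT_ID.fetch_add(1, Ordering::Relaxed) }
    }
}

// Equality and hashing deliberately ignore `fd`, so a stale descriptor can never
// be mistaken for a newer connection that happens to reuse the same fd number.
impl PartialEq for ToyDescriptor {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

impl Eq for ToyDescriptor {}

impl Hash for ToyDescriptor {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
    }
}
```

With a descriptor like this, a late call made with the old value misses the lookup and takes the new `None` branch, even if the OS has already handed the same fd number to a fresh connection.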
