Skip to content
This repository was archived by the owner on Jan 6, 2025. It is now read-only.

Fix deadlock on deserialization failure #116

Merged
merged 3 commits into from
Mar 5, 2024

Conversation

tnull
Copy link
Collaborator

@tnull tnull commented Feb 29, 2024

Previously, we introduced disconnecting a counterparty who sends us bogus messages that we're unable to parse. However, before we disconnect we also send out a last LSPSMessage::Invalid in the hopes that the
counterparty understands this.

However, when we added this logic we unfortunately overlooked that we lock the request_id_to_method_map Mutex for parsing the message, but also try to lock when the PeerHandler calls get_and_clear_pending_msg.

Here, we avoid the resulting deadlock by dropping the lock as soon as it's not required anymore after parsing.

In a second commit, we avoid unnecessary locking of request_ids_and_methods_map in get_and_clear_pending_msg and remove a potentially dangerous (if we ever were to fail serialization for some reason) unwrap.

Previously, we introduced disconnecting a counterparty who sends us
bogus messages that we're unable to parse. However, before we disconnect
we also send out a last `LSPSMessage::Invalid` in the hopes that the
counterparty understands this.

However, when we added this logic we unfortunately overlooked that we
lock the `request_id_to_method_map` `Mutex` for parsing the message, but
also try to lock when the `PeerHandler` calls
`get_and_clear_pending_msgs`.

Here, we avoid the resulting deadlock by dropping the lock as soon as
it's not required anymore after parsing.
.. in `get_and_clear_pending_msg`. We also remove a potentially
dangerous (if we ever were to fail serialization for some reason)
`unwrap`.
Disconnecting might be unfortunate if we have open channels with the
counterparty. If they send us bogus data, we now just add them to a
(currently unpersisted) ignore list.

We should in the future figure out when to drop them from this list, so
that peers have the chance to recover.
action: ErrorAction::IgnoreAndLog(Level::Trace),
});
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this respond to peer with an error or just log it on our end? We've run into issues where we just silently fail/ignore requests (when user hits our rate limits) and the wallet user experience is odd because it just hangs. This could introduce that experience?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but in this case it's really intended as it's a DoS protection if the user sends us bogus data. The alternative is disconnecting them which I'd even prefer, but we don't know if the user has any channels with us that still need to be kept operational. While this just adds the user to the ignorelist until restart, we def. want to remove them after a while in the future, especially once we start persisting things, as tracked here: https://github.com/lightningdevkit/lightning-liquidity/issues/117

@tnull tnull merged commit 8809ab1 into lightningdevkit:main Mar 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants