Skip to content

Somewhat clean up closure pipelines #3881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

TheBlueMatt
Copy link
Collaborator

Still have a few commits on top here but this ended up getting quite big so I'll do them in a followup. There's a pile of things to DRY things up and make small wins here or there, but the main two commits are:

Drop the (broken) ability to disable broadcast when FC'ing a chan

`ChannelManager` has two public methods to force-close a (single)
channel - `force_close_broadcasting_latest_txn` and
`force_close_without_broadcasting_txn`. The second exists to allow
a user who has restored a stale backup to (sort of) recover their
funds by starting, force-closing channels without broadcasting
their latest state, and only then reconnecting to peers (which
would otherwise cause a panic due to the stale state).

To my knowledge, no one has ever implemented this (fairly
half-assed) recovery flow, and more importantly its rather
substantially broken. `ChannelMonitor` has never tracked any
concept of whether a channel is stale, and if we connect blocks
towards an HTLC time-out it will immediately broadcast the stale
state anyway.

Finally, we really shouldn't encourage users to try to "recover"
from a stale state by running an otherwise-operational node on the
other end. Rather, we should formalize such a recovery flow by
having a `ChainMonitor` variant that takes a list of
`ChannelMonitor`s and claims available funds, dropping the
`ChannelManager` entirely.

While work continues towards that goal, having long-broken
functionality in `ChannelManager` is also not acceptable, so here
we simply remove the "without broadcasting" force-closure version.

Fixes #2875

and

Decide on close-broadcasting commitment txn based on channel state

In a previous commit, we removed the ability for users to pick
whether we will broadcast a commitment transaction on channel
closure. However, that doesn't mean that there is no value in never
broadcasting commitment transactions on channel closure. Rather, we
use it to avoid broadcasting transactions which we know cannot
confirm if the channel's funding transaction was not broadcasted.

Here we make this relationship more formal by splitting the
force-closure handling logic in `Channel` into the existing
`ChannelContext::force_shutdown` as well as a new
`ChannelContext::abandon_unfunded_chan`.
`ChannelContext::force_shutdown` is the only public method, but it
delegates to `abandon_unfunded_chan` based on the channel's state.

This has the nice side effect of avoiding commitment transaction
broadcasting when a batch open fails to get past the funding stage.

`ChannelManager::force_close_channel_with_peer` takes the peer's
node_id as a parameter and then returns it, presumably leftover
from before we stored channels indexed by peers.

Here we simply drop the return value.
When a channel is "coop" closed prior to it being funded, it is
closed immediately. Instead of reporting
`ClosureReason::*InitiatedCooperativeClosure` (which would be
somewhat misleading), we report
`ClosureReason::CounterpartyCoopClosedUnfundedCHannel` when our
counterparty does it. However, when we do it, we report
`ClosureReason::HolderForceClosed`, which is highly confusing given
the user did *not* call a force-closure method.

Instead, here, we add a
`ClosureReason::LocallyCoopClosedUnfundedChannel` to match the
`CounterpartyCoopClosedUnfundedCHannel` variant and use it.
`ClosureReason::HolderForceClosed` says explicitly in its docs
"Closure generated from `ChannelManager...`, called by the user."
However, we were using it extensively when we fail a channel due to
errors handling messages or generating our own responses.

Here, we convert these cases to the catch-all
`ClosureReason::ProcessingError` which exists to communicate errors
including a string that describes what happened.

This should substantially improve debug-ability in closure cases
by avoiding having to pull up logs.
Here we fix a few cases of swallowing appropriate `ClosureReason`s
and expand `ClosureReason::FundingTimedOut` to include when we
didn't get a channel funded in time, rather than using
`HolderForceClosed`.
In d17e759 we added the ability
for users to decide which message to send to peers when we
force-close a channel. Here we pipe that message back out through
the channel closure event via a new `message` field in
`ClosureReason::HolderForceClosed`.

This is nice to have but more importantly will be used in a coming
commit to simplify the internal interface to the force-closure
logic.
Removes totally unnecessary `#[inline]`s  on `MsgHandleErrInternal`
methods (LLVM will decide for us if it makes sense to inline
something, explicit `#[inline(always)]` belongs only on incredibly
performance-sensitive code and `#[inline]` belongs only on short
public methods that should be inlined into downstream crates) and
`rustfmt` `MsgHandleErrInternal::from_chan_no_close`.
`ChannelManager` has two public methods to force-close a (single)
channel - `force_close_broadcasting_latest_txn` and
`force_close_without_broadcasting_txn`. The second exists to allow
a user who has restored a stale backup to (sort of) recover their
funds by starting, force-closing channels without broadcasting
their latest state, and only then reconnecting to peers (which
would otherwise cause a panic due to the stale state).

To my knowledge, no one has ever implemented this (fairly
half-assed) recovery flow, and more importantly its rather
substantially broken. `ChannelMonitor` has never tracked any
concept of whether a channel is stale, and if we connect blocks
towards an HTLC time-out it will immediately broadcast the stale
state anyway.

Finally, we really shouldn't encourage users to try to "recover"
from a stale state by running an otherwise-operational node on the
other end. Rather, we should formalize such a recovery flow by
having a `ChainMonitor` variant that takes a list of
`ChannelMonitor`s and claims available funds, dropping the
`ChannelManager` entirely.

While work continues towards that goal, having long-broken
functionality in `ChannelManager` is also not acceptable, so here
we simply remove the "without broadcasting" force-closure version.

Fixes lightningdevkit#2875
While most of our closure logic mostly lives in
`locked_close_chan`, some important steps happen in
`convert_channel_err` and `handle_error` (mostly broadcasting a
channel closure `ChannelUpdate` and sending a relevant error
message to our peer).

While its fine to use `locked_close_chan` directly when we
manually write out the relevant extra steps, its nice to avoid it
to DRY up some of our closure logic, which we do here.
`Channel`'s new `FundingScope` exists to store both the channel's
live funding information as well as any in-flight splices. In order
to keep the patches which introduced and began using it simple, it
was exposed outside of `channel.rs` to `channelmanager.rs`,
breaking the `Channel` abstraction somewhat.

Here we remove one case of `FundingScope` being passed around
`channelmanager.rs` (in the hopes of eventually keeping it entirely
contained within `channel.rs`). Specifically, we remove the
`FundingScope` parameter from `locked_close_channel`.
In a previous commit, we removed the ability for users to pick
whether we will broadcast a commitment transaction on channel
closure. However, that doesn't mean that there is no value in never
broadcasting commitment transactions on channel closure. Rather, we
use it to avoid broadcasting transactions which we know cannot
confirm if the channel's funding transaction was not broadcasted.

Here we make this relationship more formal by splitting the
force-closure handling logic in `Channel` into the existing
`ChannelContext::force_shutdown` as well as a new
`ChannelContext::abandon_unfunded_chan`.
`ChannelContext::force_shutdown` is the only public method, but it
delegates to `abandon_unfunded_chan` based on the channel's state.

This has the nice side effect of avoiding commitment transaction
broadcasting when a batch open fails to get past the funding stage.
@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Jun 23, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

/// Also returns the list of payment_hashes for channels which we can safely fail backwards
/// immediately (others we will have to allow to time out).
/// Shuts down this channel (no more calls into this Channel may be made afterwards except
/// those explicitly stated to be alowed after shutdown, eg some simple getters).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo in the comment: alowed should be allowed.

Suggested change
/// those explicitly stated to be alowed after shutdown, eg some simple getters).
/// those explicitly stated to be allowed after shutdown, eg some simple getters).

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +3164 to +3166
/// For non-coop-close cases, you should generally prefer to call `convert_channel_err` and
/// [`handle_error`] instead (which delgate to this and [`ChannelManager::finish_close_channel`]),
/// as they ensure the relevant messages go out as well. In a coop close case, calling this
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a small typo in the documentation: delgate should be delegate.

Suggested change
/// For non-coop-close cases, you should generally prefer to call `convert_channel_err` and
/// [`handle_error`] instead (which delgate to this and [`ChannelManager::finish_close_channel`]),
/// as they ensure the relevant messages go out as well. In a coop close case, calling this
/// For non-coop-close cases, you should generally prefer to call `convert_channel_err` and
/// [`handle_error`] instead (which delegate to this and [`ChannelManager::finish_close_channel`]),
/// as they ensure the relevant messages go out as well. In a coop close case, calling this

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz June 23, 2025 01:58
@tnull tnull requested review from tnull and removed request for jkczyz June 23, 2025 06:12
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first pass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to document all these subtle API changes in a pending_changelog, as users might be leaning on a certain error variant.

@@ -335,6 +335,8 @@ pub enum ClosureReason {
/// [`ChannelManager::force_close_broadcasting_latest_txn`]: crate::ln::channelmanager::ChannelManager::force_close_broadcasting_latest_txn.
/// [`ChannelManager::force_close_without_broadcasting_txn`]: crate::ln::channelmanager::ChannelManager::force_close_without_broadcasting_txn.
broadcasted_latest_txn: Option<bool>,
/// XXX: What was passed to force_close!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this XXX still needs to be addressed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, sorry, I was somewhat rushing to get it up last night. Just needed to write docs here.

(2, HolderForceClosed) => { (1, broadcasted_latest_txn, option) },
(2, HolderForceClosed) => {
(1, broadcasted_latest_txn, option),
(3, message, required), // XXX: upgrade to empty string or whatever and document
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: unaddressed XXX. Probably could use a default_value with "Channel force-closed" here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this commit is touching a lot of test code now. Should the respective files be converted to rustfmt::skip first, so that you can remove them for the parts that you touch?

/// Some links printed in log lines are included here to check them during build (when run with
/// `cargo doc --document-private-items`):
/// [`super::channelmanager::ChannelManager::force_close_without_broadcasting_txn`] and
/// [`super::channelmanager::ChannelManager::force_close_all_channels_without_broadcasting_txn`].
#[rustfmt::skip]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanna remove this skip while you're at it?

@@ -4350,85 +4360,66 @@ where
/// `peer_msg` should be set when we receive a message from a peer, but not set when the
/// user closes, which will be re-exposed as the `ChannelClosed` reason.
#[rustfmt::skip]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how this commit interacts with LSPS2 'client-trusts-LSP' mode and other 'hosted channel' scenarios where we'd delay broadcasting the funding transaction. Probably won't make any material difference as we don't support them well currently anyways, but it's a bit odd that we'd then have no API that would allow the user to avoid broadcasting an FC spend for a funding transaction of which only the user would know that it was never broadcast.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I'm not really a fan of avoiding broadcast in such a case. If we aren't confident that the funding tx hasn't been broadcasted, I'd rather broadcast anyway and let the user suffer the rather minor inconvenience of getting a transaction broadcasted which cannot confirm rather than risk bugs leading to funds loss.

let pre_funding = matches!(self.channel_state, ChannelState::NegotiatingFunding(_));
let funded = matches!(self.channel_state, ChannelState::FundingNegotiated(_));
let awaiting_ready = matches!(self.channel_state, ChannelState::AwaitingChannelReady(_));
// TODO: allow pre-initial-monitor-storage but post-lock-in (is that a thing) closure?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you intend to address this in this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yes, this was done, just need to remove the TODO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants