Hold runtime lock during `stop` and reduce timeout values across the board #538
base: main
👋 Thanks for assigning @joostjager as a reviewer!
17372cd to 46285be
To make sure no odd behavior emerges when `stop`ping and `start`ing in quick succession, we now hold the runtime write lock until we're done shutting down.
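The locking pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: `Node`, `Runtime`, `NodeError`, and the 50 ms simulated shutdown are hypothetical stand-ins, and the real runtime is asynchronous.

```rust
use std::sync::RwLock;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the runtime a node owns while it is running.
struct Runtime;

impl Runtime {
    /// Pretends to wait for background tasks to wind down.
    fn shutdown(self) {
        thread::sleep(Duration::from_millis(50));
    }
}

#[derive(Debug, PartialEq)]
enum NodeError {
    AlreadyRunning,
    NotRunning,
}

/// Hypothetical node type; only the locking pattern matters here.
struct Node {
    runtime: RwLock<Option<Runtime>>,
}

impl Node {
    fn new() -> Self {
        Node { runtime: RwLock::new(None) }
    }

    fn start(&self) -> Result<(), NodeError> {
        // Blocks for as long as a concurrent `stop` still holds the write lock.
        let mut guard = self.runtime.write().unwrap();
        if guard.is_some() {
            return Err(NodeError::AlreadyRunning);
        }
        *guard = Some(Runtime);
        Ok(())
    }

    fn stop(&self) -> Result<(), NodeError> {
        let mut guard = self.runtime.write().unwrap();
        let runtime = guard.take().ok_or(NodeError::NotRunning)?;
        // The guard is still alive, so `start` cannot interleave with shutdown.
        runtime.shutdown();
        // The write lock is released only when `guard` goes out of scope here.
        Ok(())
    }
}
```

Because `stop` keeps the write guard until shutdown has finished, a `start` issued in quick succession waits for the lock instead of observing a half-stopped node.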
46285be to 8967a82
  // The timeout after which we abort a wallet syncing operation.
- pub(crate) const LDK_WALLET_SYNC_TIMEOUT_SECS: u64 = 30;
+ pub(crate) const LDK_WALLET_SYNC_TIMEOUT_SECS: u64 = 10;
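For readers unfamiliar with how such a constant is typically used, here is a rough sketch of aborting the wait for an operation after a timeout. It uses standard-library threads and channels so that it is self-contained; the helper names `run_with_timeout`, `sync_wallet_with_timeout`, and `TimedOut` are hypothetical, and the project itself is async, so its actual mechanism will differ.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Value taken from the constant above.
const LDK_WALLET_SYNC_TIMEOUT_SECS: u64 = 10;

/// Error returned when the caller stopped waiting for the operation.
#[derive(Debug, PartialEq)]
struct TimedOut;

/// Runs `op` on a helper thread and stops waiting after `timeout`.
/// In this sketch a panic inside `op` is also reported as `TimedOut`.
fn run_with_timeout<T, F>(timeout: Duration, op: F) -> Result<T, TimedOut>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out; ignore that.
        let _ = tx.send(op());
    });
    rx.recv_timeout(timeout).map_err(|_| TimedOut)
}

/// Hypothetical wrapper that applies the wallet sync timeout to a sync closure.
fn sync_wallet_with_timeout<F>(sync: F) -> Result<(), TimedOut>
where
    F: FnOnce() + Send + 'static,
{
    run_with_timeout(Duration::from_secs(LDK_WALLET_SYNC_TIMEOUT_SECS), sync)
}
```

Note that in this sketch the helper thread keeps running after the timeout fires; only the wait is abandoned, not the work itself.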
What changed so that huge values are 'hopefully' not required anymore? Did they remove that central mutex?
Side question: isn't it better to not have a timeout? I don't know if it is desired to have background processes running when `stop` returns.
For the BDK timeout, it also seems that `stop` returns `Ok()`, so also no indication other than log that something is wrong?
> What changed so that huge values are 'hopefully' not required anymore? Did they remove that central mutex?

Yes, they dropped that mutex that used to be held for the entire duration of syncing the wallet with BDK 1.0. Here, we're just making (late) accommodations, since the behavior has improved considerably since the upgrade.

> Side question: isn't it better to not have a timeout? I don't know if it is desired to have background processes running when `stop` returns.

Hmm, I'm not sure. In general we should never reach that timeout, even though we recently got some reports to the contrary. But not having a timeout at all might also lead users to just kill the process after some (even more random) time, which might have worse consequences.
> Yes, they dropped that mutex that used to be held for the entire duration of syncing the wallet with BDK 1.0. Here, we just do (late) accommodations since the behavior since the upgrade improved considerably.

Is there a rationale for the new timeout values? I'd imagine they should match BDK timeouts plus something? Or is the BDK timeout behavior complex and not reducible to a single value?

> Hmm, I'm not sure. In general we should never reach that timeout, even though we recently got some reports to the contrary. But not having a timeout at all might also lead users to just kill the process after some (even more random time), which might have worse consequences.

So when "our" timeout expires and the node is reported to be stopped, wouldn't the process then typically be terminated anyway?
> Is there a rationale for the new timeout values? I'd imagine they should match BDK timeouts plus something? Or is the BDK timeout behavior complex and not reducible to a single value?

Not quite sure what you mean by 'BDK timeout'? The individual electrum/esplora clients might have separate timeouts on a per-request basis, is that what you were referring to here?

> So when "our" timeout expires and the node is reported to be stopped, wouldn't the process then typically be terminated anyway?

Yes, but we'd have more control over what we require to finish before timing out. Although, indeed, if we were for some reason to get blocked in event processing and just time out and move on, we might miss the final persistence round of the BP on shutdown.
I meant whatever we are waiting for that is outside ldk-node.

> Although, indeed, if we're currently would for some reason get blocked in the events processing and just timeout and move on, we might miss the final persistence round of the BP on shutdown.

Doesn't this mean that we're better off / safer without the timeout? Or should the timeout be restricted to just the external processes and not the whole event handler?
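One way to picture the alternative raised in this question is a shutdown routine that bounds only the wait on external work and always runs the final persistence step, while reporting the timeout to the caller rather than returning a bare `Ok(())`. The sketch below is purely illustrative and is not what this PR implements; `stop_node`, `StopOutcome`, and the channel-based completion signal are hypothetical.

```rust
use std::sync::mpsc::Receiver;
use std::time::Duration;

/// How shutdown went, so callers can see that a timeout was hit.
#[derive(Debug, PartialEq)]
enum StopOutcome {
    Clean,
    ExternalTaskTimedOut,
}

/// Hypothetical shutdown routine.
/// `external_done` signals completion of work outside our control (e.g. a chain sync).
/// `persist` stands in for the final persistence round.
fn stop_node<P: FnOnce()>(
    external_done: Receiver<()>,
    timeout: Duration,
    persist: P,
) -> StopOutcome {
    // Only the wait on the external task is bounded by the timeout.
    // A dropped sender is treated the same as a timeout in this sketch.
    let finished = external_done.recv_timeout(timeout).is_ok();

    // Persistence runs unconditionally, whether or not the wait timed out.
    persist();

    if finished {
        StopOutcome::Clean
    } else {
        StopOutcome::ExternalTaskTimedOut
    }
}
```

The trade-off discussed in the thread still applies: if the persistence step itself can block, this variant could hang where a single overall timeout would not.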
Replied in threads
Previously, we had to configure enormous syncing timeouts as the BDK wallet syncing would hold a central mutex that could lead to large parts of event handling and syncing locking up. Here, we drop the configured timeouts considerably across the board, since such huge values are hopefully not required anymore.
8967a82 to d76160d
To make sure no odd behavior is emerging when `stop`ing and `start`ing in quick succession, we now keep the runtime write lock until we're done shutting down.

Also, we previously had to configure enormous syncing timeouts as the BDK wallet syncing would hold a central mutex that could lead to large parts of event handling and syncing locking up. Here, we drop the configured timeouts considerably across the board, since such huge values are hopefully not required anymore.