Hold runtime lock during `stop` and reduce timeout values across the board #538

tnull · 2025-05-13T13:07:50Z

To make sure no odd behavior emerges when `stop`ing and `start`ing in quick succession, we now keep the runtime write lock until we're done shutting down.

Also, we previously had to configure enormous syncing timeouts as the BDK wallet syncing would hold a central mutex that could lead to large parts of event handling and syncing locking up. Here, we drop the configured timeouts considerably across the board, since such huge values are hopefully not required anymore.
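
A minimal sketch of the locking pattern described above, using only the standard library. The `Node`, `Runtime`, and `NodeError` types here are hypothetical stand-ins rather than the actual ldk-node definitions; the point is only that `stop` acquires the write lock first and releases it after shutdown has finished, so a concurrent `start` cannot observe a half-stopped node.

```rust
use std::sync::RwLock;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the node's runtime handle.
struct Runtime;

#[derive(Debug, PartialEq)]
enum NodeError {
    AlreadyRunning,
    NotRunning,
}

struct Node {
    // `Some` while the node is running, `None` once it is stopped.
    runtime: RwLock<Option<Runtime>>,
}

impl Node {
    fn new() -> Self {
        Node { runtime: RwLock::new(None) }
    }

    fn start(&self) -> Result<(), NodeError> {
        // Blocks for as long as a concurrent `stop` still holds the write lock.
        let mut guard = self.runtime.write().unwrap();
        if guard.is_some() {
            return Err(NodeError::AlreadyRunning);
        }
        *guard = Some(Runtime);
        Ok(())
    }

    fn stop(&self) -> Result<(), NodeError> {
        // Acquire the write lock up front and keep it for the whole shutdown.
        let mut guard = self.runtime.write().unwrap();
        let runtime = guard.take().ok_or(NodeError::NotRunning)?;
        // Placeholder for the real shutdown work (stopping tasks, persisting state).
        thread::sleep(Duration::from_millis(50));
        drop(runtime);
        Ok(())
        // `guard` is dropped here, only after shutdown has completed.
    }
}

fn main() {
    let node = Node::new();
    node.start().unwrap();
    node.stop().unwrap();
    // Restarting right after `stop` returns is fine: shutdown has fully finished.
    node.start().unwrap();
}
```

Releasing the lock before the shutdown work would instead open a window in which `start` could run while the old background tasks are still winding down.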

ldk-reviews-bot · 2025-05-13T13:07:53Z

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

joostjager · 2025-05-13T13:51:11Z

src/config.rs


 // The timeout after which we abort a wallet syncing operation.
-pub(crate) const LDK_WALLET_SYNC_TIMEOUT_SECS: u64 = 30;
+pub(crate) const LDK_WALLET_SYNC_TIMEOUT_SECS: u64 = 10;


What changed so that huge values are 'hopefully' not required anymore? Did they remove that central mutex?

Side question: isn't it better to not have a timeout? I don't know if it is desired to have background processes running when stop returns.

For the BDK timeout, it also seems that `stop` returns `Ok()`, so there is no indication other than the log that something went wrong?

> What changed so that huge values are 'hopefully' not required anymore? Did they remove that central mutex?

Yes, they dropped that mutex that used to be held for the entire duration of syncing the wallet with BDK 1.0. Here, we're just making (late) accommodations, as the behavior has improved considerably since the upgrade.

> Side question: isn't it better to not have a timeout? I don't know if it is desired to have background processes running when stop returns.

Hmm, I'm not sure. In general we should never reach that timeout, even though we recently got some reports to the contrary. But not having a timeout at all might also lead users to just kill the process after some (even more random) time, which might have worse consequences.

> Yes, they dropped that mutex that used to be held for the entire duration of syncing the wallet with BDK 1.0. Here, we're just making (late) accommodations, as the behavior has improved considerably since the upgrade.

Is there a rationale for the new timeout values? I'd imagine they should match BDK timeouts plus something? Or is the BDK timeout behavior complex and not reducible to a single value?

> Hmm, I'm not sure. In general we should never reach that timeout, even though we recently got some reports to the contrary. But not having a timeout at all might also lead users to just kill the process after some (even more random) time, which might have worse consequences.

So when "our" timeout expires and the node is reported to be stopped, wouldn't the process then typically be terminated anyway?

> Is there a rationale for the new timeout values? I'd imagine they should match BDK timeouts plus something? Or is the BDK timeout behavior complex and not reducible to a single value?

Not quite sure what you mean by 'BDK timeout'? The individual electrum/esplora clients might have separate timeouts on a per-request basis, is that what you were referring to here?

> So when "our" timeout expires and the node is reported to be stopped, wouldn't the process then typically be terminated anyway?

Yes, but we'd have more control over what we require to finish before timing out. Although, indeed, if we were for some reason to get blocked in the events processing and just time out and move on, we might miss the final persistence round of the BP on shutdown.

I meant whatever we are waiting for that is outside ldk-node.

> Although, indeed, if we were for some reason to get blocked in the events processing and just time out and move on, we might miss the final persistence round of the BP on shutdown.

Doesn't this mean that we're better off / safer without the timeout? Or should the timeout be restricted to just the external processes and not the whole event handler?
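
To illustrate the second option raised in this thread (bounding only the wait on external work, while never skipping the final persistence round), here is a minimal standard-library sketch. The function `shutdown_with_timeout`, the `ShutdownOutcome` enum, and the channel-based completion signal are all assumptions made for the example and not part of ldk-node; it also returns the outcome to the caller rather than only logging it.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum ShutdownOutcome {
    Clean,
    TimedOut,
}

/// Waits up to `timeout` for a background task to signal completion,
/// then runs `persist` unconditionally and reports what happened.
fn shutdown_with_timeout<F: FnOnce()>(
    done_rx: mpsc::Receiver<()>,
    timeout: Duration,
    persist: F,
) -> ShutdownOutcome {
    let outcome = match done_rx.recv_timeout(timeout) {
        Ok(()) => ShutdownOutcome::Clean,
        // The sender was dropped, so the task has already exited.
        Err(RecvTimeoutError::Disconnected) => ShutdownOutcome::Clean,
        Err(RecvTimeoutError::Timeout) => ShutdownOutcome::TimedOut,
    };
    // The final persistence round runs even if the wait timed out.
    persist();
    outcome
}

fn main() {
    let (done_tx, done_rx) = mpsc::channel();
    // Simulated external work that finishes well within the timeout.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        let _ = done_tx.send(());
    });
    let outcome = shutdown_with_timeout(done_rx, Duration::from_secs(1), || {
        println!("final persistence round");
    });
    println!("{:?}", outcome);
}
```

With this shape, a caller can tell a timed-out shutdown apart from a clean one instead of receiving an unconditional success value.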

src/lib.rs

joostjager

Replied in threads

tnull force-pushed the 2025-05-add-shutdown-test branch from 17372cd to 46285be May 13, 2025 13:11

ldk-reviews-bot requested a review from joostjager May 13, 2025 13:18

Hold runtime lock during shutdown

6de3fa3

To make sure no odd behavior is emerging when `stop`ing and `start`ing in quick succession, we now keep the runtime write lock until we're done shutting down.

tnull force-pushed the 2025-05-add-shutdown-test branch from 46285be to 8967a82 May 13, 2025 13:31

tnull changed the title ~~Use separate runtime in stop and reduce timeout values across the board~~ Hold runtime lock during stop and reduce timeout values across the board May 13, 2025

joostjager reviewed May 13, 2025

tnull requested a review from joostjager May 14, 2025 12:16

joostjager reviewed May 14, 2025

tnull added 2 commits May 15, 2025 12:18

f comment

5a521c1

tnull force-pushed the 2025-05-add-shutdown-test branch from 8967a82 to d76160d May 15, 2025 10:18


