Fixed splice_locked message to include splice_txid field and other interop fixes #6
Interop testing with Eclair revealed an issue with remote funding key rotation. This searches for the funding output using the rotated remote funding pubkey instead of the current funding pubkey. Also update the variable name to make clearer what it represents.
Changelog-Changed: Interop fixes for compatibility with Eclair
I also noticed the clightning is not resending Should the spec be changed to always respond to a |
I've attached the logs for the test:
Looking into this 👀.
Core Lightning's current behavior is to send I believe you are correct that there is a missing second case where |
  " expects one; resending splice_lock");
  peer_write(peer->pps,
- take(towire_splice_locked(NULL, &peer->channel_id)));
+ take(towire_splice_locked(NULL, &peer->channel_id, &peer->channel->funding.txid)));
This should be &peer->splice_state->locked_txid -- however, this value is not preserved across channel restarts, so that needs to be addressed as well.
This is a bit of a heavy lift. I think this is the best way to make `splice_state->locked_txid` work over restarts.
- Add `{SQL("ALTER TABLE channel_funding_inflights ADD locked_onchain BOOL DEFAULT 0"), NULL},` to db.c at the end of the db table, to be able to mark an inflight as locked onchain
- inflight.c needs to add serialization of `is_locked`, with `inflight->is_locked = fromwire_bool(cursor, max);` in `fromwire_inflight` and `towire_bool(pptr, inflight->is_locked);` in `towire_inflight`
- Modify `channeld_updating_inflight` to include an `is_locked` parameter
- Update `handle_update_inflight` to take the `is_locked` parameter and update that value in its inflight
- Modify the `wallet_inflight_save` SQL query to save `is_locked` to the database
- Modify `wallet_channel_load_inflights` to load `is_locked` from the database
- Update the inflight copy code in the loop above `towire_channeld_init` to add `is_locked` to the `inf` copy
- Below `fromwire_channeld_init`, loop through each inflight in `peer->splice_state->inflights` and get the txid from the psbt of the one that `is_locked`, placing it in `peer->splice_state->locked_txid`. Additionally, it would be useful to assert that only 1 is locked, as two would be an error.

I thiiiiink that would do it.
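
The last step in the list above (find the one locked inflight and assert there is no second one) could look roughly like the sketch below. It is only an illustration: `struct txid`, `struct inflight`, and `find_locked_txid` are hypothetical stand-ins rather than Core Lightning's real types, and the txid is stored directly instead of being derived from the inflight's PSBT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real CLN types (assumption). */
struct txid {
	uint8_t bytes[32];
};

struct inflight {
	struct txid txid; /* in CLN this would come from the inflight's PSBT */
	bool is_locked;   /* the proposed "locked onchain" flag */
};

/* Return the txid of the single locked inflight, or NULL if none is
 * locked. Finding a second locked inflight is treated as a bug. */
static const struct txid *find_locked_txid(const struct inflight *inflights,
					   size_t n)
{
	const struct txid *locked = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!inflights[i].is_locked)
			continue;
		/* Two locked inflights would be an error. */
		assert(locked == NULL);
		locked = &inflights[i].txid;
	}
	return locked;
}
```

After a restart, the returned pointer (if not NULL) would be what gets copied into `peer->splice_state->locked_txid`.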
> This is a bit of a heavy lift. I think this is the best way to make splice_state->locked_txid work over restarts.
I can't really comment on whether that's the right approach. Do you need to use splice_state->locked_txid during reconnect? It should already be your latest funding tx once you have sent and received splice_locked. If it has reached acceptable depth then your node can resend splice_locked with the txid of the latest funding tx. You must already have saved that information after you exchanged splice_locked with your peer.
However, when you do a reconnect before exchanging splice_locked I can see that you'll need to save the txid of the funding tx negotiated for the splice. We save this information with the splice commitment along with the signatures exchanged.
> If it has reached acceptable depth then your node can resend splice_locked with the txid of the latest funding tx.
Yes you're right, peer->channel->funding.txid is correct in this case.
There needs to be a new separate case that uses peer->splice_state->locked_txid that I'm working on at the moment.
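
To make the two cases concrete, here is a minimal sketch of how the txid for a resent `splice_locked` might be chosen on reconnect. Everything in it is an assumption for illustration: `struct txid`, `txid_eq`, and `splice_locked_txid_to_resend` are hypothetical names, and the second branch is only a guess at what the "new separate case" could look like, since its exact condition is not spelled out in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CLN's txid type (assumption). */
struct txid {
	uint8_t bytes[32];
};

static bool txid_eq(const struct txid *a, const struct txid *b)
{
	return memcmp(a->bytes, b->bytes, sizeof(a->bytes)) == 0;
}

/* Pick the txid to put in a resent splice_locked, or NULL to send nothing.
 *  funding_txid:        our latest funding transaction
 *  locked_txid:         splice we locked but the peer has not (may be NULL)
 *  remote_next_funding: peer's next_funding_txid (may be NULL) */
static const struct txid *
splice_locked_txid_to_resend(const struct txid *funding_txid,
			     const struct txid *locked_txid,
			     const struct txid *remote_next_funding)
{
	if (!remote_next_funding)
		return NULL;

	/* Case discussed above: the splice is already our funding tx. */
	if (txid_eq(remote_next_funding, funding_txid))
		return funding_txid;

	/* Guessed second case: locked on our side, still inflight. */
	if (locked_txid && txid_eq(remote_next_funding, locked_txid))
		return locked_txid;

	return NULL;
}
```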
It sounds like we need to rethink the sending of |
I looked a bit more into the spec and our own tests. My understanding is that in the case of a reconnect the node that has sent, but not received, In the test I ran, during channel reestablishment Eclair will have set the From the Message Retransmission: If you don't have a test for this behavior, then I think that's what you need to add along with a fix to the handling of I'd be happy to retest this situation once you have a fix in place. |
I believe the current behavior matches the spec as written Lines 5211 to 5218 in 305c377
"No current inflight" means the inflight candidate has become the latest funding transaction in CLN. I believe the issue your test is hitting is if The spec probably needs to be updated to something like |
// - if `next_funding_txid` is set:
} else if (remote_next_funding) { /* No current inflight */
// - if `next_funding_txid` matches the latest funding transaction:
if (bitcoin_txid_eq(remote_next_funding,
&peer->channel->funding.txid)) {
status_info("We have no pending splice but peer"
" expects one; resending splice_lock");
// - MUST send `splice_locked`.
peer_write(peer->pps,
take(towire_splice_locked(NULL, &peer->channel_id)));
} The line Is actually redundant because there is no way to become the funding transaction without reaching acceptable depth. |
Do you have a copy of these logs? 👀
Commit dc3ae19 fixes the problem with CLN sending the shared outpoint with |
ddustin
left a comment
Awesome work! A few little things I would fix up, but otherwise a great fix. Looks like the main issue was not setting the shared_outpoint in splice_initiator_user_update, which makes sense since the point where the funding input is sent was recently moved around.
Great find!
status_info("ictx->shared_outpoint = %s",(ictx->shared_outpoint?"defined":"null"));
char txid_hex[65];
if (ictx->shared_outpoint && bitcoin_txid_to_hex(&(ictx->shared_outpoint->txid), txid_hex, sizeof(txid_hex))) {
	status_info("ictx->shared_outpoint->txid=%s, ictx->shared_outpoint->n=%d", txid_hex, ictx->shared_outpoint->n);
This should use fmt_bitcoin_txid instead of bitcoin_txid_to_hex
For example:
if (!bitcoin_txid_eq(&signed_psbt_txid, &current_psbt_txid))
	status_failed(STATUS_FAIL_INTERNAL_ERROR,
		      "Signed PSBT txid %s does not match"
		      " current_psbt_txid %s",
		      fmt_bitcoin_txid(tmpctx, &signed_psbt_txid),
		      fmt_bitcoin_txid(tmpctx, &current_psbt_txid));

	status_info("ictx->shared_outpoint->txid=%s, ictx->shared_outpoint->n=%d", txid_hex, ictx->shared_outpoint->n);
}
if (bitcoin_txid_to_hex(&(point.txid), txid_hex, sizeof(txid_hex))) {
	status_info("point.txid=%s, point.n=%d", txid_hex, point.n);
fmt_bitcoin_txid here as well
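
For readers unfamiliar with the pattern, the appeal of a formatter that returns a freshly allocated string is that call sites need no `char txid_hex[65]` buffer and no success check around the log call. The sketch below imitates that shape with plain `malloc`. It is not the real `fmt_bitcoin_txid`, which allocates from a tal context; `struct txid` and `fmt_txid_sketch` are hypothetical, and the reversed byte order follows the usual Bitcoin display convention for txids.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for CLN's txid type (assumption). */
struct txid {
	uint8_t bytes[32];
};

/* Return a malloc'd, NUL-terminated 64-character hex string for the
 * txid (caller frees), or NULL on allocation failure. Bytes are emitted
 * in reverse order, as txids are conventionally displayed. */
static char *fmt_txid_sketch(const struct txid *txid)
{
	static const char hexdigits[] = "0123456789abcdef";
	const size_t n = sizeof(txid->bytes);
	char *hex = malloc(n * 2 + 1);

	if (!hex)
		return NULL;
	for (size_t i = 0; i < n; i++) {
		uint8_t b = txid->bytes[n - 1 - i];
		hex[i * 2] = hexdigits[b >> 4];
		hex[i * 2 + 1] = hexdigits[b & 0x0f];
	}
	hex[n * 2] = '\0';
	return hex;
}
```

A call site then reduces to a single log call that passes the returned string to a `%s` specifier, followed by a `free` (or nothing at all when a temporary allocation context owns the string).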
if (bitcoin_txid_to_hex(&(point.txid), txid_hex, sizeof(txid_hex))) {
	status_info("point.txid=%s, point.n=%d", txid_hex, point.n);
}
status_info("here2");
I would cut this log statement
c73f22a to 8f8ef0e
I fixed the changes but you have editing turned off for the PR. I'll manually pull your commits over.
And add a check for new uses creeping in, since it got cut & paste
everywhere.
This means "this is a valid string, but truncate it to this many characters"
vs "%.*s" which means "only read this many characters of string":
```
['lightningd-3 2025-10-23T02:31:40.890Z **BROKEN** plugin-funder: Plugin marked as important, shutting down lightningd!']
--------------------------- Captured stderr teardown ---------------------------
#0 0x557da58ad1dc in printf_common(void*, char const*, __va_list_tag*) asan_interceptors.cpp.o
#1 0x557da5aff814 in json_out_addv /home/runner/work/lightning/lightning/ccan/ccan/json_out/json_out.c:239:11
#2 0x557da59740ce in plugin_logv /home/runner/work/lightning/lightning/plugins/libplugin.c:1777:2
#3 0x557da5969b6f in plugin_log /home/runner/work/lightning/lightning/plugins/libplugin.c:1934:2
#4 0x557da595c4f6 in datastore_del_success /home/runner/work/lightning/lightning/plugins/funder.c:161:2
#5 0x557da598b837 in handle_rpc_reply /home/runner/work/lightning/lightning/plugins/libplugin.c:1072:10
#6 0x557da598a4b0 in rpc_conn_read_response /home/runner/work/lightning/lightning/plugins/libplugin.c:1361:3
#7 0x557da5adbea5 in next_plan /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:60:9
#8 0x557da5ae06ff in do_plan /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:422:8
#9 0x557da5adfb58 in io_ready /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:439:10
#10 0x557da5aec2ce in io_loop /home/runner/work/lightning/lightning/ccan/ccan/io/poll.c:455:5
#11 0x557da59757ac in plugin_main /home/runner/work/lightning/lightning/plugins/libplugin.c:2409:3
#12 0x557da594fe23 in main /home/runner/work/lightning/lightning/plugins/funder.c:1723:2
#13 0x7f6572229d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#14 0x7f6572229e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#15 0x557da588b584 in _start (/home/runner/work/lightning/lightning/plugins/funder+0x10d584) (BuildId: 71ba63ab577fc6fa60573d3e8555f6db7d5c584d)
0x624000009d28 is located 0 bytes to the right of 7208-byte region [0x624000008100,0x624000009d28)
allocated by thread T0 here:
#0 0x557da590e7f6 in __interceptor_realloc (/home/runner/work/lightning/lightning/plugins/funder+0x1907f6) (BuildId: 71ba63ab577fc6fa60573d3e8555f6db7d5c584d)
#1 0x557da5b2149b in tal_resize_ /home/runner/work/lightning/lightning/ccan/ccan/tal/tal.c:755:13
#2 0x557da59f2032 in membuf_tal_resize /home/runner/work/lightning/lightning/common/utils.c:203:2
#3 0x557da5b03934 in membuf_prepare_space_ /home/runner/work/lightning/lightning/ccan/ccan/membuf/membuf.c:45:12
#4 0x557da59d4289 in jsonrpc_io_read_ /home/runner/work/lightning/lightning/common/jsonrpc_io.c:127:2
#5 0x557da598a635 in rpc_conn_read_response /home/runner/work/lightning/lightning/plugins/libplugin.c:1366:9
#6 0x557da5adbea5 in next_plan /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:60:9
#7 0x557da5ae06ff in do_plan /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:422:8
#8 0x557da5adfb58 in io_ready /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:439:10
#9 0x557da5aec2ce in io_loop /home/runner/work/lightning/lightning/ccan/ccan/io/poll.c:455:5
#10 0x557da59757ac in plugin_main /home/runner/work/lightning/lightning/plugins/libplugin.c:2409:3
#11 0x557da594fe23 in main /home/runner/work/lightning/lightning/plugins/funder.c:1723:2
#12 0x7f6572229d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
SUMMARY: AddressSanitizer: heap-buffer-overflow asan_interceptors.cpp.o in printf_common(void*, char const*, __va_list_tag*)
Shadow bytes around the buggy address:
0x0c487fff9350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff9360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff9370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff9380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff9390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c487fff93a0: 00 00 00 00 00[fa]fa fa fa fa fa fa fa fa fa fa
0x0c487fff93b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff93c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff93d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff93e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff93f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==26122==ABORTING
```
Signed-off-by: Rusty Russell <[email protected]>
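
As a small illustration of the `"%.*s"` form described in the commit message above: the precision is taken from an `int` argument, and `printf`-family functions read at most that many bytes, so the source does not need a NUL terminator within that length. The helper name `format_slice` is hypothetical and not part of libplugin.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the first `len` bytes of `buf` into `out`. `buf` does not need
 * to be NUL-terminated as long as it holds at least `len` bytes, because
 * the precision in "%.*s" caps how many bytes are read. */
static int format_slice(char *out, size_t outlen, const char *buf, size_t len)
{
	return snprintf(out, outlen, "%.*s", (int)len, buf);
}
```

This is the shape needed when logging a token that points into the middle of a larger receive buffer, which is the situation the trace above shows: the overflowing read lands immediately past the end of the reallocated JSON-RPC read buffer.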
After Eclair successfully sends `splice_locked` I saw this message in the eclair log:

Apparently `splice_locked` in clightning is missing the `splice_txid` field as described in the latest splice PR.

This PR includes the minimal changes I needed to continue my testing, but there are probably other things that should be done to check the `splice_txid` is correct, update tests, etc.