-
Notifications
You must be signed in to change notification settings - Fork 412
Fix a debug panic caused by receiving MPP parts after a failure #1266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a debug panic caused by receiving MPP parts after a failure #1266
Conversation
Note that these panics are debug-only (apparently for good reason!) |
Codecov Report
@@ Coverage Diff @@
## main #1266 +/- ##
=======================================
Coverage 90.50% 90.50%
=======================================
Files 71 71
Lines 39210 39258 +48
=======================================
+ Hits 35486 35531 +45
- Misses 3724 3727 +3
Continue to review full report at Codecov.
|
fff9f22
to
6d9668c
Compare
Rebased on the Payee -> PaymentParameters change. |
expect_pending_htlcs_forwardable_ignore!(nodes[1]); | ||
nodes[1].node.process_pending_htlc_forwards(); | ||
nodes[1].node.fail_htlc_backwards(&our_payment_hash); | ||
|
||
expect_pending_htlcs_forwardable_ignore!(nodes[1]); | ||
nodes[1].node.process_pending_htlc_forwards(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason expect_pending_htlcs_forwardable
shouldn't be used in both places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it currently calls process_pending_htlc_forwards
twice without checking for a new event, which we don't want here.
nodes[0].node.handle_update_fail_htlc(&nodes[1].node.get_our_node_id(), &fail_updates_1.update_fail_htlcs[0]); | ||
nodes[0].node.handle_update_fail_htlc(&nodes[1].node.get_our_node_id(), &fail_updates_1.update_fail_htlcs[1]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there two failed HTLCs because the second payment was sent over two paths? Or am I misunderstanding what's going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm? There were two send_payment
calls, so we have two HTLCs to fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My confusion again here on the testing scenario. I think what threw me was that nodes[1]
was failing a payment hash that it knew the preimage for. And I was thinking that it would only fail one HTLC since the first should have been sufficient. But I understand now that that would be incorrect behavior.
Then again, seeing this part of the commit message:
The solution is to avoid bulk-failing all pending HTLCs for a
payment. This feels like the right thing to do anyway - if a sender
accidentally sends an extra HTLC after a payment has ben fully
paid, we shouldn't fail the entire payment.
Makes me think only on path should be failed. What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, nevermind about this. I understand what's going on now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way I could clarify the test comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think just extending the last sentence to say something like "and that panic does not occur if the user manually fails the payment".
Added a further comment in the test. |
26a48c5
to
9cf0435
Compare
Prior to cryptographic payment secrets, when we process a received payment in `process_pending_htlc_fowards` we'd remove its entry from the `pending_inbound_payments` map and give the user a `PaymentReceived` event. Thereafter, if a second HTLC came in with the same payment hash, it would find no entry in the `pending_inbound_payments` map and be immediately failed in `process_pending_htlc_forwards`. Thus, each HTLC will either result in a `PaymentReceived` event or be failed, with no possibility for both. As of 8464875, we no longer materially have a pending-inbound-payments map, and thus more-than-happily accept a second payment with the same payment hash even if we just failed a previous one for having mis-matched payment data. This can cause an issue if the two HTLCs are received back-to-back, with the first being accepted as valid, generating a `PaymentReceived` event. Then, when the second comes in we'll hit the "total value {} ran over expected value" condition and fail *all* pending HTLCs with the same payment hash. At this point, we'll have a pending failure for both HTLCs, as well as a `PaymentReceived` event for the user. Thereafter, if the user attempts to fail the HTLC in response to the `PaymentReceived`, they'll get a debug panic at channel.rs:1657 'Tried to fail an HTLC that was already failed'. The solution is to avoid bulk-failing all pending HTLCs for a payment. This feels like the right thing to do anyway - if a sender accidentally sends an extra HTLC after a payment has ben fully paid, we shouldn't fail the entire payment. Found by the `chanmon_consistency` fuzz test.
9cf0435
to
be57e82
Compare
Squashed without changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACKing
CI failure is unrelated (tokio MSRV bump). |
Prior to cryptographic payment secrets, when we process a received
payment in
process_pending_htlc_fowards
we'd remove its entryfrom the
pending_inbound_payments
map and give the user aPaymentReceived
event.Thereafter, if a second HTLC came in with the same payment hash, it
would find no entry in the
pending_inbound_payments
map and beimmediately failed in
process_pending_htlc_forwards
.Thus, each HTLC will either result in a
PaymentReceived
event orbe failed, with no possibility for both.
As of 8464875, we no longer
materially have a pending-inbound-payments map, and thus
more-than-happily accept a second payment with the same payment
hash even if we just failed a previous one for having mis-matched
payment data.
This can cause an issue if the two HTLCs are received back-to-back,
with the first being accepted as valid, generating a
PaymentReceived
event. Then, when the second comes in we'll hitthe "total value {} ran over expected value" condition and fail
all pending HTLCs with the same payment hash. At this point,
we'll have a pending failure for both HTLCs, as well as a
PaymentReceived
event for the user.Thereafter, if the user attempts to fail the HTLC in response to
the
PaymentReceived
, they'll get a debug panic at channel.rs:1657'Tried to fail an HTLC that was already failed'.
The solution is to avoid bulk-failing all pending HTLCs for a
payment. This feels like the right thing to do anyway - if a sender
accidentally sends an extra HTLC after a payment has ben fully
paid, we shouldn't fail the entire payment.
Found by the
chanmon_consistency
fuzz test.