Skip to content

Ready for Review : Handle resolution in case of channel going onchain, pass state backward offchain #198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 13 commits into from

Conversation

ariard
Copy link

@ariard ariard commented Sep 29, 2018

The point here is taking onchain state from ChainWatchInterface, parsing it in ChannelManager and from that actualize still off-chain channels with other peers.
It's still early, some hacks, see comments, I will finish timeout case with test and then try to find better solutions on some parts.

@@ -240,6 +241,8 @@ pub struct ChannelManager {
channel_state: Mutex<ChannelHolder>,
our_network_key: SecretKey,

unsolved_htlcs: Mutex<Vec<[u8;32]>>,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsolved htlc is a subset of claimable_htlcs as we can't iter on them at the same time and call downstream claim_funds/fail_backwards with another lock on it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not move the HTLCPreviousHopData into the ChannelMonitor when the Channel force-closes? I'd be kinda OK with breaking privacy and doing a HashMap<[u8; 32], HTLCPreviousHopData> there because other similar issues are gonna exist anyway (see #168).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, I succeed to get around lock issues in ChannelManager block_connected, so I can get ride of unsolved_htlcs (in fact, real names should be unresolved htlc accordingly to BOLT 5..). Hopefully, will finish the other parts of PR tomorrow

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current patch doesn't appear to use unsolved_htlcs for anything? It's only ever written to.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mistake of mine, forget to clean it, should be gone as aed435a

res
};
//TODO: size of redeemScript is test-only right now
if tx.input.len() > 0 && tx.input[0].witness.len() == 5 && tx.input[0].witness[4].len() == 138 && payment_hash160 == tx.input[0].witness[4][69..89] {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really dumb heuristics to extract preimage from witness, if there is a simple way I'm really more than open to change !

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe somehow return them from ChannelMonitor instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I didn't change it because, here the match is done against claimable_htlcs from channel_manager, just having a wrapper to extract preimage in channel_monitor, I don't think I have access to funding_txo oupoint here to ask to the right channel_monitor, it's really inter-channels stuff for me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but a ChannelMonitor should know how to monitor for things that happened to its channel on chain, including an HTLC claim that we may not have any response to, but saw. Even the current implementation relies on the since-closed ChannelMonitor behaving correctly to register transactions for monitoring.

Copy link
Author

@ariard ariard Oct 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I obviously lay on ChannelMonitor for detection of revocations so will refactor with is_fulfilled, is_timeout, is_revoked_tx calls to ChannelMonitor, which will handle state detection ? Or do you think it's better to have states registered in a hashmap by ChannelMonitor at blocks parsing and just ChannelManager peeks them up with hash as a key ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you're asking here, but clearly ChannelMonitor knows the relevant information to figure out the preimages and then pass them back up to ChannelManager.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, more something like in the last commit I just added about the fail backward in case of revocation.

@@ -1124,19 +1124,19 @@ impl ChannelMonitor {
/// Attempts to claim any claimable HTLCs in a commitment transaction which was not (yet)
/// revoked using data in local_claimable_outpoints.
/// Should not be used if check_spend_revoked_transaction succeeds.
fn check_spend_local_transaction(&self, tx: &Transaction, _height: u32) -> Vec<Transaction> {
fn check_spend_local_transaction(&self, tx: &Transaction, _height: u32) -> (Vec<Transaction>, (Sha256dHash, Vec<TxOut>)) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to track outputs from our own commitment tx in case of other peer solving outputs on it and so us being able to timeout/success backward.

@@ -3193,15 +3232,15 @@ mod tests {
false
} else { true }
});
assert_eq!(res.len(), 2);
assert!(res.len() >= 2);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to check_spend_local_transaction, we have a block rescan and so same HTLC tx broadcast again

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems tests still pass if you change to assert!(res.len() == 2 || res.len() ==3); if == 3 { assert_eq!(res[1], res[2]); }

@@ -1065,7 +1065,7 @@ impl Channel {
}
}
if pending_idx == std::usize::MAX {
debug_assert!(false, "Unable to find a pending HTLC which matched the given HTLC ID");
//TODO: how to go around collisions ? debug_assert!(false, "Unable to find a pending HTLC which matched the given HTLC ID {}", htlc_id_arg);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and next, we have same payment_hash in claimable_htlcs inside duplicate_htlc_test, but we need to keep PreviousHopData infos.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just thinking maybe add a solution for this one : not caching PreviousHopdData if node is the origin one)

@TheBlueMatt
Copy link
Collaborator

This should be the last bit of #137, right? ie should be marked as "closes #137", right?

@ariard
Copy link
Author

ariard commented Sep 30, 2018

Yeah, apart of the key management thing for to_local output, that's the close of #137 (will do a last pass after it to be sure)

@ariard
Copy link
Author

ariard commented Oct 1, 2018

Add the timeout case, currently test should pass but in fact shouldn't. Seems to me that we lack the timeout htlc tx on received output, currently check_spend_remote_transaction doesn't differentiate and generate a HTLC Timeout tx do nothing. Will do it, and also more refinement of the whole PR (and more more checks in tests !)

@ariard
Copy link
Author

ariard commented Oct 2, 2018

Add the claiming of remote received htlc output in case of timeout, the case wasn't covered in check_send_remote_transaction. Still working on refinement.

@ariard
Copy link
Author

ariard commented Oct 6, 2018

Clean the commits, fix some weirds cases with right pruning of claimable_htlcs, rebased, harden the added tests, ready at least for a first pass I hope.

Note : we use accepted_htlc weight of 138 here, but production-setting seems to be 139 ? (thinking to add a constant with conditional compilation for testing)

Still lacking the fail backward on resolution by revoked commit tx, I'm still on it.

@@ -2943,17 +2943,19 @@ impl Channel {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

holding_cells are being drained here. Do we forgo ClaimHTLC(payment_preimage) here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in fact modifications of channel force_shutdown were due to first design with unsolved htlcs. I delete this stuff thank to better get around with locks in block_connected. So even if I reverted it to original state as of aed435a, I think that the case described by TODO is gone thanks to registering previous inbound htlcs in claimable_htlcs at process_pending_forward ?

let mut ripemd = Ripemd160::new();
ripemd.input(htlc_with_hash);
let mut payment_hash160 = [0; 20];
ripemd.result(&mut payment_hash160);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch, hashing per-pending-HTLC is gonna be really slow. I really think making ChannelMonitor do the detection here is the way to go - it knows the commitment tx txids so immediately knows which output-spends will contain an HTLC preimage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes seems also better to me after having implemented revocation case !

res
};
//TODO: size of redeemScript is test-only right now
if tx.input.len() > 0 && tx.input[0].witness.len() == 5 && tx.input[0].witness[4].len() == 138 && payment_hash160 == tx.input[0].witness[4][69..89] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you're asking here, but clearly ChannelMonitor knows the relevant information to figure out the preimages and then pass them back up to ChannelManager.

@ariard
Copy link
Author

ariard commented Oct 10, 2018

Refactored with suggested changes, effectively it's better, reducing number of lines.
Currently, tests don't pass due to test_no_existing_test_breakage in full_stack_target, will dig into it, but at first sight it's fine to update it.
Add const for script weight and handle test special cltv value.

@TheBlueMatt
Copy link
Collaborator

Yea, I like this version a lot better. One question I still have is if we actually want to go back to using claimable_htlcs to track all HTLCs (which was removed in #167) or if we want to just store the PreviousHopData pointers for the latest commitment transactions in ChannelMonitor and return them from is_resolving_output (or similar).

  • In not doing so we allow cross-payment same-hash HTLC claiming if the outbound HTLC is on-chain, which isn't ideal, but a privacy break using this costs going on-chain so I don't think its that concerning.
  • If we didn't do so, because we wouldn't want to keep around the PreviousHopData for all old commitment transactions we'd end up not being claim an inbound HTLC if the remote side broadcasts an old state. We can't lose money from this as we'd be able to claim the full tx value (as it's been revoked), but we don't get the free extra money.

In thinking about it, I kinda prefer option 2 because it means a little less tracking in ChannelManager, but if you want to keep it as-is I'm OK with that too, just means more complexity around making sure claimable_htlcs stays consistent.

@ariard
Copy link
Author

ariard commented Oct 10, 2018

So on the first point, seems to me that the privacy break as described in #168 is hard to play because if you have Alice -> Bob -> Caroll and want to learn if Bob is payee or not you have to force a channel closing between Bob and Caroll so here you have to be both Alice and Caroll (or attack them). But if so, well you already know that Bob isn't the payee, so doesn't seems to be valuable to exploit ?

Secondly, the case you described where we can claim the inbound HTLC only happen if as revoking an old state the remote node broadcast also the current commitment tx and claim inbound HTLC with a HTLC-Success, and I think it's covered by commits but yes effectively, not yet tested.

I prefer option 1 because it seems to me we avoid leaking htlc_id in watchtower (it already has short_channel_id..). Of course at the cost of more complexity, it took me times to ensure consistent remove of claimable_htlcs (spotted by duplicate_htlc_test) but I think it worth it.

}
payment_data = (Some(payment_preimage), payment_data.1);
}
if let Some(payment_hash) = payment_data.1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just payment_data.1.unwrap()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, done

{
let mut channel_state = Some(self.channel_state.lock().unwrap());
for tx in txn_matched {
if let Some(payments_data) = self.monitor.is_resolving_output(&tx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps return type of is_resolving_output can be just vec! . if None implies vec.len() == 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw awesome work, ariard! was studying carefully as I'm not familiar with on-chain transaction related code.

Copy link
Author

@ariard ariard Oct 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm started to just return vec!, but not sure, feel to me that current is more rustacean way of doing things?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! In fact it's easier to understand I think if you have both BOLT2 and BOLT5 well in head

@ariard
Copy link
Author

ariard commented Oct 12, 2018

Updated with yuntai comment at 07895d0

@TheBlueMatt
Copy link
Collaborator

Indeed, I'm not particularly worried about breaking privacy in the case of on-chain closure, its a legitimate attack for off-chain things, but it's not of massive concern otherwise.

As for your point about leaking htlc_id in the watchtower, I don't think that matters, ChannelMonitor will eventually grow into two modes - one for remote watching and one for local use (this is what the KeyStorage enum is eventually intended to be used for), so we can drop extraneous information when we serialize/deserialize them for watchtower sending.

As you point out, making sure claimable_htlcs stays consistent really scares me, especially when things are unlocked/relocked as we go. Have you tried implementing this with HTLCSource storage in the LocalSignedTx struct and adding equivalent fields for latest remote commitment tx?

@TheBlueMatt
Copy link
Collaborator

Oh, oops, we're gonna have to support at least a cut-down version of HTLCSource in ChannelMonitors on watchtowers anyway, obviously - if two of our channels hit the chain the watchtower needs to be able to pass the HTLC preimage from one channel to another. That probably implies a pretty big change to the structure here - logic in ChannelManager would have to be duplicated in ManyChannelMonitor. Part of that is gonna have to happen anyway (for on-chain-claims-to-off-chain-claims) but part hopefully as little as possible.

@ariard
Copy link
Author

ariard commented Oct 16, 2018

Ah yeah on-chain-to-on-chain claims, well we forget it until now... If so seems obvious we need to have inter-hop data into ChannelMonitor so I'll try to implement it with HTLCSource storage. It's gonna to undermine a little HTLCSource destination, maybe rename it if needed.

@ariard
Copy link
Author

ariard commented Nov 17, 2018

Just rebased on master, back on it soon

let commitment_txid = tx.txid();
if let &Some(ref local_tx) = &self.current_local_signed_commitment_tx {
if local_tx.txid == commitment_txid {
match self.key_storage {
KeyStorage::PrivMode { ref delayed_payment_base_key, ref latest_per_commitment_point, .. } => {
return self.broadcast_by_local_state(local_tx, latest_per_commitment_point, &Some(*delayed_payment_base_key));
return (self.broadcast_by_local_state(local_tx, latest_per_commitment_point, &Some(*delayed_payment_base_key)), (commitment_txid, tx.output.clone()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and the other three return-something spots) I guess this is fine for now, but technically we don't actually care about all the outputs we only care about the HTLC ones.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed a bit broadcast_by_local_state to avoid tracking non-htlc outputs, please review it again to be sure I broke nothing (test said no but well..)

@@ -3193,15 +3232,15 @@ mod tests {
false
} else { true }
});
assert_eq!(res.len(), 2);
assert!(res.len() >= 2);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems tests still pass if you change to assert!(res.len() == 2 || res.len() ==3); if == 3 { assert_eq!(res[1], res[2]); }

@@ -1883,7 +1889,7 @@ impl Channel {
/// waiting on this revoke_and_ack. The generation of this new commitment_signed may also fail,
/// generating an appropriate error *after* the channel state has been updated based on the
/// revoke_and_ack message.
pub fn revoke_and_ack(&mut self, msg: &msgs::RevokeAndACK, fee_estimator: &FeeEstimator) -> Result<(Option<msgs::CommitmentUpdate>, Vec<(PendingForwardHTLCInfo, u64)>, Vec<(HTLCSource, [u8; 32], HTLCFailReason)>, Option<msgs::ClosingSigned>, ChannelMonitor), ChannelError> {
pub fn revoke_and_ack(&mut self, msg: &msgs::RevokeAndACK, fee_estimator: &FeeEstimator) -> Result<(Option<msgs::CommitmentUpdate>, Vec<(PendingForwardHTLCInfo, u64)>, Vec<(HTLCSource, [u8; 32], HTLCFailReason)>, Option<msgs::ClosingSigned>, ChannelMonitor, OutPoint), ChannelError> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm kinda confused about this refactor, maybe we're out of sync, and ChannelMonitor's different modes need to be actually documented and complete, plus I'm sure #255 only confused it more. I think we should move the ChannelMonitor::funding_txo into KeyStorage::PrivMode (and call it Storage::Local and Storage::Watchtower or so) and then we can make it non-Option-al, which would remove the need for this weird handing around the funding_txo, no? Or am I confused as to why this is here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact I need to link HTLCSource to a backward ChannelMonitor, so I need funding_txo linked to a given HTLCSource which is itself linked by HTLC hash.
But wouldn't be surprised if there is an easier way ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe need deeper thoughts if we move funding_txo into KeysStorage we maybe don't want also to use it in SimpleManyChannelMonitor to index monitors (but afterwards SimpleManyChannelMonitor is just an implementation of ManyChannelMonitor, not necessarily the watchtower one)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SimpleManyChannelMonitor is intended really just for use for local channels, not for watchtowers, which need to index by a random public key which is used to sign updates from the client (as otherwise your counterparty could replace your monitor with bogus data), so I think we can rely on using the outpoint there. As for linking HTLCSource back to a ChannelMonitor, I think we can just eat the performance hit of storing the short_channel_id along with the funding_txo in the ChannelMonitor. That would let the current HTLCSource structure look up the ChannelMonitor and doing an iteration over all our ChannelMonitors in the case of resolving a on-chain <-> on-chain forward isn't really a big deal, I think.

@@ -119,27 +125,70 @@ pub struct SimpleManyChannelMonitor<Key> {
chain_monitor: Arc<ChainWatchInterface>,
broadcaster: Arc<BroadcasterInterface>,
pending_events: Mutex<Vec<events::Event>>,
htlc_sources: Mutex<HashMap<[u8; 32], (Key, HTLCPreviousHopData)>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, htlc_sources can grow unbounded, no? I dont think we need to track this outside of the ChannelMonitor itself or for historical transactions, though. Any attempt by our counterparty to broadcast a revoked state we can just treat as all-HTLCs-failed and fail backwards whatever HTLCs we have pending, and then we can just keep a set of HTLCPreviousHopData in each ChannelMonitor itself for the latest local and latest remote transactions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah dumb me it grow unbounded, not all tx are gonna to settle onchain obviously, should be doable to only keep last for latest {local, remote} commitment tx in ChannelMonitor. Just wait to be fixed on other issue on funding_txo, may change due to that.

@TheBlueMatt
Copy link
Collaborator

Oh, if possible, it'd be nice to have a test for duplicate-payment_hash failures:

A -> B -> C -> D where there are two identical payment_hash payments through the B->C hop. Then that channel hits chain and make sure that when HTLC-Timeout confirms for one output we only fail one backwards, not both, then C can even broadcast an HTLC-Success for the other and we'll claim that payment backwards.

@ariard
Copy link
Author

ariard commented Nov 23, 2018

Rebased, 2703ad1 add types for payment_hash, payment_preimages. Should avoid API misuses, I spotted a variable misnamed as txid instead of payment_hash.

Seen travis failures, will fix

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I got too eager on doing another round of review?

@@ -7945,9 +8284,9 @@ mod tests {
}
let revoked_htlc_txn = nodes[1].tx_broadcaster.txn_broadcasted.lock().unwrap();

assert_eq!(revoked_htlc_txn.len(), 2);
assert_eq!(revoked_htlc_txn[0], revoked_htlc_txn[2]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the len check instead of updating it, I dont want to end up in a state where we generate extra unchecked events/txn which are garbage because we dont test them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey mess this while rebasing, corrected

htlc_success_tx.input[0].witness.push(chan_utils::get_htlc_redeemscript_with_explicit_keys(htlc, &local_tx.a_htlc_key, &local_tx.b_htlc_key, &local_tx.revocation_key).into_bytes());

add_dynamic_output!(htlc_success_tx, 0);
res.push(htlc_success_tx);
}
watch_outputs.push(local_tx.tx.output[htlc.transaction_output_index as usize].clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I think you meant to put this on the next line (hence the bad indentation)...better question, why didn't the tests catch this, dont we need to watch for HTLC-Timeout confirmation as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm test didn't catch this because it wasn't covered explicitly by them, but correcting that broke test_static_spendable_outputs_justice_tx_revoked_htlc_timeout_tx and test_dynamic_spendable_outputs_local_htlc_timeout_tx, due to block rescan caused by detection extension

@@ -1281,9 +1397,10 @@ impl ChannelMonitor {
} else { (None, None) }
}

fn broadcast_by_local_state(&self, local_tx: &LocalSignedTx, per_commitment_point: &Option<PublicKey>, delayed_payment_base_key: &Option<SecretKey>) -> (Vec<Transaction>, Vec<SpendableOutputDescriptor>) {
fn broadcast_by_local_state(&self, local_tx: &LocalSignedTx, per_commitment_point: &Option<PublicKey>, delayed_payment_base_key: &Option<SecretKey>, commitment_txid: Sha256dHash) -> (Vec<Transaction>, Vec<SpendableOutputDescriptor>, (Sha256dHash, Vec<TxOut>)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically passing in the commitment_txid is redundant since we only get called with its == local_tx.txid (of course its also redundant to return it back since the caller knows the Sha256dHash from the last return pair is always the thing it passed in).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes corrected in something I hope smarter

@@ -1880,7 +1886,7 @@ impl Channel {
/// waiting on this revoke_and_ack. The generation of this new commitment_signed may also fail,
/// generating an appropriate error *after* the channel state has been updated based on the
/// revoke_and_ack message.
pub fn revoke_and_ack(&mut self, msg: &msgs::RevokeAndACK, fee_estimator: &FeeEstimator) -> Result<(Option<msgs::CommitmentUpdate>, Vec<(PendingForwardHTLCInfo, u64)>, Vec<(HTLCSource, [u8; 32], HTLCFailReason)>, Option<msgs::ClosingSigned>, ChannelMonitor), ChannelError> {
pub fn revoke_and_ack(&mut self, msg: &msgs::RevokeAndACK, fee_estimator: &FeeEstimator) -> Result<(Option<msgs::CommitmentUpdate>, Vec<(PendingForwardHTLCInfo, u64)>, Vec<(HTLCSource, PaymentHash, HTLCFailReason)>, Option<msgs::ClosingSigned>, ChannelMonitor, OutPoint), ChannelError> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait I thought we were gonna remove this return change, or am I confused?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored

@@ -97,7 +99,11 @@ pub trait ManyChannelMonitor: Send + Sync {
/// Implementor must also ensure that the funding_txo outpoint is registered with any relevant
/// ChainWatchInterfaces such that the provided monitor receives block_connected callbacks with
/// any spends of it.
fn add_update_monitor(&self, funding_txo: OutPoint, monitor: ChannelMonitor) -> Result<(), ChannelMonitorUpdateErr>;
fn add_update_monitor(&self, funding_txo: OutPoint, monitor: ChannelMonitor, pending_htlc_sources: Vec<(PaymentHash, (OutPoint, HTLCSource))>) -> Result<(), ChannelMonitorUpdateErr>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I thought we weren't doing the pending_htlc_sources change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored

@ariard
Copy link
Author

ariard commented Nov 26, 2018

Rebased, still need to add test, sorry will split first commit to ease review so don't bother a new one now !

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, guess was too eager on the re-review, #198 (comment) wasn't addressed, or did I miss something?

::util::macro_logger::DebugFundingOption(&funding_info)
},
Storage::Watchtower { .. } => {
unimplemented!();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah, no panics in log handlers, please, even if it shouldn't be called. Also this isnt really "log_funding_option" anymore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected

((*index_in_block as u64) << (2*8)) |
((txo_idx as u64) << (0*8)));
let short_channel_id = ((height as u64) << (5*8)) |
((*index_in_block as u64) << (2*8)) |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation broken here, extra space added.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected.

},
Storage::Watchtower { .. } => {
unimplemented!();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this the unimplemented branch? Shouldn't the Local.funding_txo == None branch be the ignored one (since its nonsense to be adding a local monitor without a funding txo). Though of course it shouldnt panic, it should return an Err in that case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add Err for Local.funding_txo == None, watch_all_txn for Watchtower

revocation_base_key: *revocation_base_key,
htlc_base_key: *htlc_base_key,
delayed_payment_base_key: *delayed_payment_base_key,
payment_base_key: *payment_base_key,
shutdown_pubkey: *shutdown_pubkey,
prev_latest_per_commitment_point: *latest_per_commitment_point,
latest_per_commitment_point: Some(local_keys.per_commitment_point),
funding_info: funding_info.clone(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if you make the funding_info ref mut, you can do a take() here, instead, avoiding the malloc implied by the Vec copy inside the Script. This is also true in a number of other places.

self.funding_txo = other.funding_txo.take();
}

self.key_storage = match self.key_storage {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dont think we need to support all these cases - at no point should we get called on a local monitor that doesn't have funding_info set, so that can just return an Err (which means the comparison can just go ahead as it was previously). Also, we should never get a call to merge a watchtower-stripped monitor with a local monitor, so that can just Err.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey simplify on this one, but maybe not as much as you was thinking to avoid breaking some tests

}
},
Storage::Watchtower { .. } => {
unimplemented!();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldnt this just return None?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

script.write(writer)?;
},
&None => {
// We haven't even been initialized...not sure why anyone is serializing us, but
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, now that we've converted to a fn returning Result, maybe just return an Err here. It really, really doesn't make sense to be serializing this, and we should never be exposing them publicly anyway. Could even do a debug_assert!(). Could then remove the "Technically this can..." comment in the deserializer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed as suggested.

writer.write_all(&byte_utils::be64_to_array(short_channel_id))?;
}
&None => {
writer.write_all(&[0; 1])?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just write out 8 bytes of 0s?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No 8 bytes of 0s fails serialization test, because you get Some(&[0;8]) instead of None at deserialization!

@@ -1413,7 +1417,7 @@ impl ChannelManager {
match forward_chan.send_htlc(forward_info.amt_to_forward, forward_info.payment_hash, forward_info.outgoing_cltv_value, htlc_source.clone(), forward_info.onion_packet.unwrap()) {
Err(_e) => {
let chan_update = self.get_channel_update(forward_chan).unwrap();
failed_forwards.push((htlc_source, forward_info.payment_hash, 0x1000 | 7, Some(chan_update)));
failed_forwards.push((htlc_source.clone(), forward_info.payment_hash, 0x1000 | 7, Some(chan_update)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this line seems stale, compiles fine without the "clone()" due to the continue on the next line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected

if pending_htlc_sources.len() > 0 {
for htlc_source in pending_htlc_sources.drain(..) {
match htlc_source.1 {
HTLCSource::PreviousHopData(source) => { htlc_sources.insert(htlc_source.0, source); },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this is still unbounded, no? I guess haven't finished up the last review?

@ariard
Copy link
Author

ariard commented Nov 28, 2018

Seen for #198 (comment), want to solve it while handling duplicate hash case. To solve this problem, a reliable way seems to me is to track commitment ouput index in HTLCSource for a given htlc_id, somewhere in process_pending_htlc_forward.
But if you come with a better idea, tell me will implement it tomorrow

@ariard
Copy link
Author

ariard commented Nov 30, 2018

Okay refactored thanks to inputs, add test in case of duplicate_payment_hash resolved differently.

Two notes:

  • see channelmanager, L1534, which error codes to return to previous node by processing node in case of forward channel unilateral closed by next node ? Seems not covered by bolt4, for now I return NODE|2 (temporary_node_failure), or change fail_htlc_backwards_internal with from_chain=true to avoid generating an error code ?
  • see channel, L1152, we may have block re-scan in ChannelMonitor which in this case generate spurrious HTLC update event. We can prune them in ChannelMonitor (currently done) but ChannelManager is only to know if updates have been handled, and should do only after they have been revoke_and_ack'ed. So for now, if ChannelManager get again a htlc update on an already Committed-->LocalRemoved htlc we discard it

new_storage_local!(our_funding_info.take(), our_short_channel_id.take())
}
} else {
new_storage_local!(funding_info.take(), short_channel_id.take())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm? I'm still really not convinced we need to support funding_info.is_none() in most places in ChannelMonitor. Specifically if I replace this line with a panic!() the tests still pass, and I don't think there should be a way to get to this anyway (it should return an Err as otherwise you could crash a watchtower, but the tests still pass fine without it, which could simplify this function).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, replaced by Err, in the current state, it's technically possible to hit this branch if user pass us a ChannelMonitor without funding_info set but agree it wouldn't make sense and not hit (I think) but our current code flow. Should go away with refactor in Channel as previously talked

@@ -751,7 +757,7 @@ impl Channel {
/// generated by the peer which proposed adding the HTLCs, and thus we need to understand both
/// which peer generated this transaction and "to whom" this transaction flows.
#[inline]
fn build_commitment_transaction(&self, commitment_number: u64, keys: &TxCreationKeys, local: bool, generated_by_local: bool, feerate_per_kw: u64) -> (Transaction, Vec<HTLCOutputInCommitment>) {
fn build_commitment_transaction(&self, commitment_number: u64, keys: &TxCreationKeys, local: bool, generated_by_local: bool, feerate_per_kw: u64) -> (Transaction, Vec<(HTLCOutputInCommitment, Option<HTLCSource>)>) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get away with &HTLCSource here to avoid a bunch of clone()s, see TheBlueMatt@02b088a

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, thanks cherry-picked!

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't have enough time to do a through look, but generally is looking really close now!

pub(super) htlc_id: u64,
pub(super) incoming_packet_shared_secret: [u8; 32],
pub(crate) short_channel_id: u64,
pub(crate) htlc_id: u64,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You dont need to expose htlc_id/incoming_packet_shared_secret - HTLCSource and HTLCPreviousHopData implement Writeable/Readable directly, so you also dont need to re-implement their serialization/deserialization manually in ChannelMonitor::write/read.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah damn has written superfluous stuff one more time! Corrected


/// Used by ChannelManager to get list of HTLC resolved onchain and which needed to be updated
/// with success or failure backward
fn fetch_pending_htlc_updated(&self) -> Vec<(PaymentHash, Option<PaymentPreimage>, Option<HTLCSource>)>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is a strange return type (especially leaking the internal-type HTLCSource) but we never actually want the user to care whats inside, maybe we want to pub(crate) struct HTLCUpdate(.....)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapped in HTLCUpdate, still with a htlc_source field, because even if HTLC is inbound the user may be interested in the HTLC onchain resolving

match htlc_source {
&HTLCSource::PreviousHopData(ref source) => {
if source.short_channel_id == our_short_channel_id {
monitor.provide_payment_preimage(&htlc.2, &htlc.1.unwrap());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, doing the update here really isnt sufficient, though getting it right seems nontrivial. We really need to also generate a monitor update call for the user to pass the updated monitor to the watchtower. In the case this upstream channel isnt already on-chain, this is also redundant work - the claim back from ChannelManager is sufficient. Maybe we should just have a call ChannelManager can make back into the ManyChannelMonitor saying "please update the monitor with id X and also update it at all watchtowers with the given payment_preimage"? Need to make provide_payment_preimage pub, then, but I think that's ok?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"In the case this upstream channel isn't already on-chain" but it's not gonna be called here if no payment_preimage was detected? Agree need to update watchtowers here, seems easy doable if we send back monitor id in HTLCUpdate, a likely call could be made just after fetch_pending_htlc_updated in ChannelManager block_connected, will try to write something like that

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't this duplicated code - if we get a preimage from an off-chain channel and have to claim it backwards to an on-chain channel ChannelManager needs to know how to find that ChannelMonitor and update it. If we are claiming on-chain to on-chain (ie right there) that's still the same from the claim-backwards PoV, so why not just keep that logic to ChannelManager (where it may be easier to update the monitor on watchtowers).

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking the first commit as #264 with a few tweaks applied to fix nits below.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review the tests in detail yet, but have a few comments. Note that the claiming in ChannelMonitor for off-chain-to-off-chain I think just needs to move to the two "TODO: There is probably a channel manager somewhere that needs to learn the preimage...". If you want to kick the can down the road on that one to a future PR to get this one moving, I think that's fine, just comment out the relevant tests with a TODO and we'll get it fixed up when we fix up the on-chain-to-off-chain claim stuff.

pub struct HTLCUpdate {
pub(super) payment_hash: PaymentHash,
pub(super) payment_preimage: Option<PaymentPreimage>,
pub(super) source: Option<HTLCSource>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can drop the Option<> here and debug_assert!(false) anywhere where we're trying to push something with a None HTLCSource to a claim (None HTLCSource implies it was an inbound HTLC, which only we should be able to claim unless they HTLC-Timeout'd cause we gave up on it it or we broadcasted an old state - in either case there isn't anything we can do and it should only occur if we screwed up or we never had the preimage to begin with).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know that None HTLCSource implies inbound HTLC but I voluntary let them because user may be interested in logging that a given inbound HTLC has been resolved even if right now we do nothing with them after fetch_pending_htlc_updated ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay remove Option<> and just log_trace them if we detect them in is_resolving_htlc_output

match htlc_source {
&HTLCSource::PreviousHopData(ref source) => {
if source.short_channel_id == our_short_channel_id {
monitor.provide_payment_preimage(&htlc.2, &htlc.1.unwrap());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't this duplicated code - if we get a preimage from an off-chain channel and have to claim it backwards to an on-chain channel ChannelManager needs to know how to find that ChannelMonitor and update it. If we are claiming on-chain to on-chain (ie right there) that's still the same from the claim-backwards PoV, so why not just keep that logic to ChannelManager (where it may be easier to update the monitor on watchtowers).

}
preimage = Some(payment_preimage);
}
htlc_updated.push((payment_data.0, preimage, payment_data.2.unwrap()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we at least detect if they took this output via a revocation signature and log that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe useful to know for debug that we're dysfunctional and broadcasting revoked local commitment tx. So added a commit on top which should do that, with assumption that witness[1] is 33-bytes revocation_pubkey. But obviously need to add test!

@TheBlueMatt
Copy link
Collaborator

Also, may be worth taking https://0bin.net/paste/2QD6vXt+SjMkArLe#hsGeu+WGCvqONIpXBUnabBXd4Uhs7hclvMWJb0Hq02X after "Add is_resolving_output in ChannelMonitor"

@ariard
Copy link
Author

ariard commented Dec 3, 2018

Updated "Add_is_resolving_output..." with 0bin stuff (was wandering if it was possible to get a HTLC solved in a different way after a reorg but seems to me that would do absolutely no sense because the only update translation I see is from HTLC-Success to a timeout tx by a dysfunctional remote peer?)

Antoine Riard and others added 13 commits December 5, 2018 21:09
Move PrivMode as Local, SigsMode as Watchtower

Cut funnding_txo from ChannelMonitor, move it inside Local

Rename log_funding_option as log_funding_info
Aims to detect onchain resolution of channel

Modify in consequence test_txn_broadcast to still pass
channel_monitor_network_test

Modify some tests due to block re-scan caused by
detections extensions
Insert it in current_local_signed_tx, prev_local_signed_tx,
remote_claimable_outpoints. For so get it provided by
Channel calls to provide_latest_{local,remote}_tx
Called in ChannelMonitor block_connected, returning
HTLCUpdate upstream via ManyChannelMonitor to
link htlcs between monitors. Used by ChannelManager to
fulfill/fail htlcs backwards accordingly

Send back HTLCUpdate for all pendingg htlc in
check_spend_remote_tx in case of revoked commitment tx

If spurrious HTLCUpdate are generated due to block re-scan
and htlc are already LocalRemoved, discard them in
channel get_update_*_htlc
Pass failure backward

Add test_htlc_on_chain_timeout
Harden test_htlc_on_chain_timeout with asserts, refactor useless tests
Refactor block_connected to ease output resolution

Add test_commitment_revoked_fail_backward

Close lightningdevkit#137
ease readability

Conditionnal compilation for weight of second one to handle test special
cltv values
Fix variable name as payment_hash instead of txid for index
of remote_hash_commitment_number in ChannelMonitor reader
Extend commitment_signed_dance with PaymentFailureNetworkUpdate
case, expecting ChannelClosed update
In case of broadcast of revoked local commitment tx, we may be
interested that we've screwed up
@TheBlueMatt
Copy link
Collaborator

Will take as #269.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants