-
Notifications
You must be signed in to change notification settings - Fork 693
Handle errors during sync #765
Handle errors during sync #765
Conversation
# This is the 1st commit message: fixes ethereum#760 ethereum#762 ethereum#737 # The commit message #2 will be skipped: # use eth-utils big endian integer utils # The commit message #3 will be skipped: # Fix IndexError when an empty bucket is encountered while looking up nodes # The commit message #4 will be skipped: # dirty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (As far as I can tell from my humble knowledge). Minor nitpick would be usage of n
and c
as variable names that I thing should be expanded to something more meaningful.
p2p/kademlia.py
Outdated
self.bond(n, cancel_token) | ||
for n | ||
in bootstrap_nodes | ||
if (n not in self.ping_callbacks and n not in self.pong_callbacks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly concerned about these changes because if there's ever an API (and we might have one already) that does not clean up those callback lists as it should, we'll have a memory leak that will be hard to track. More importantly, though, we cold end up in a situation where a node with which we fail to bond once will keep failing forever because we think there's a bond in progress with it.
Instead, I'd like to suggest we change PeerPool
so that it never runs multiple discovery lookups in parallel, as that, together with the fact that KademliaProtocol.lookup()
does not try to bond with peers more than once, should stop those AlreadyWaiting
errors. I've just created #767 for that
Would you be ok with shelving just the changes that check whether a node is already present in a callback list and trying #767 for a while to see if it works as I expect?
p2p/kademlia.py
Outdated
class CallbackLock: | ||
def __init__(self, | ||
callback: Callable, | ||
timeout: int=300) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about making the default here something like 2 * k_request_timeout
as by then we can be sure nobody should be waiting for the callback anyway?
@gsalgado if you have follow up review on the extra commits I'll address it in another PR |
fixes #760 #762 #737
replaces #764 #763 #761
What was wrong?
Errors that occur during sync
How was it fixed?
Add condition checks.