fix: ping race condition #144
Open
I have been guessing in the dark for a while, but this seems to fix the crash (see below) that happens in certain cases. I cannot reproduce it other than under extremely heavy load sustained for hours or sometimes days.
For scale: ~2000 active WebSocket connections, each with a 10-second ping interval, on 8 different servers generate 1-2 of those crashes per day.
I assumed it was a race condition, and a few people on the Discord agreed. These changes seem to fix it: I have not seen the crash since applying them.
Running this work on the EventLoop is a good approach anyway, even though I am not sure why the race conditions occur in the first place.
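To illustrate the general idea of confining state to one loop, here is a minimal sketch. It is not the code from this PR: `PingingConnection`, `setPingInterval`, and `scheduleCountSync` are hypothetical names, and a serial `DispatchQueue` stands in for the EventLoop so the example has no dependencies. In SwiftNIO the analogous calls would be `eventLoop.inEventLoop` and `eventLoop.execute { ... }`.

```swift
import Dispatch

// Hypothetical stand-in for a connection whose mutable state is owned by a
// single serial queue (playing the role of the EventLoop).
final class PingingConnection: @unchecked Sendable {
    private let loop: DispatchQueue
    private let loopKey = DispatchSpecificKey<Bool>()

    // Both properties must only be touched while running on `loop`.
    private var interval: Int?
    private var scheduleCount = 0

    init(loop: DispatchQueue) {
        self.loop = loop
        // Tag the queue so code can detect whether it is already running on it.
        loop.setSpecific(key: loopKey, value: true)
    }

    private var inLoop: Bool {
        DispatchQueue.getSpecific(key: loopKey) == true
    }

    // Safe to call from any thread: the mutation is always performed on `loop`.
    func setPingInterval(_ seconds: Int?) {
        if inLoop {
            applyPingInterval(seconds)
        } else {
            loop.async { self.applyPingInterval(seconds) }
        }
    }

    // Assumed to stand in for the real "schedule" step; runs only on `loop`.
    private func applyPingInterval(_ seconds: Int?) {
        interval = seconds
        scheduleCount += 1
    }

    // Reads the counter on the loop. Must not be called from the loop itself,
    // because a synchronous hop onto the current serial queue would deadlock.
    func scheduleCountSync() -> Int {
        loop.sync { scheduleCount }
    }
}
```

Without the hop in `setPingInterval`, concurrent callers would mutate `interval` and `scheduleCount` from several threads at once, which is the kind of unsynchronized access suspected here.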
One idea I had is that setting `pingInterval` triggers the schedule function, but is not guaranteed to be in the EventLoop. If channel is accessed at the same time from somewhere else, the race condition might occur. It's a guess though.

Original Crash Report: