-
Notifications
You must be signed in to change notification settings - Fork 691
Fix intermittency in test_timer::alarm_fires #1626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @blt |
31f09c9
to
5786b3b
Compare
// 2) Replace the old signal handler now that we've completed the test. If | ||
// the test fails this process panics, so the fact we might not get here | ||
// is okay. | ||
drop(timer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to address the issue on my NetBSD VM. I'm not entirely sure why, but I think it might be related to the following section of the timer_delete man page:
Pending notification events (signals) may or may not be delivered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a 10ms delay after dropping but before removing the signal handler seems to provide enough time for things to resolve on my VM. I'm unsure if there is a more deterministic way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. From the FreeBSD man page: "Pending signals for the deleted timer are cleared." and from OpenGroup: "The disposition of pending signals for the deleted timer is unspecified." So I guess OpenGroup is the lowest common denominator. How do we ensure that all pending signals are dealt with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried thread::yield_now()
without success, so I think your current timing solution is as good as we'll get for now.
I'm going to leave the tests running for a bit, and if they don't fail with the current timeout for a few minutes I'll merge this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, hmm. I suppose this implementation could destroy the timer, replace the old signal handler and then call sigpending
. If the result of sigpending
turns up empty the previously deleted timer didn't leave any trailing signals behind on systems that don't clean up. I'm not totally convinced that the OpenGroup wording doesn't allow for some unexpected ordering of operations in that proposed implementation, that said. The implementation that was merged up here makes good sense to me.
Apologies for not noticing this myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think sigpending
will work, because of this line from the man page:
Signals may be pending because they are currently
masked, or transiently before delivery (although the latter case is not
normally detectable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ha, I believe you are correct.
The test was disabling the signal handler before disabling the timer. Fix intermittent failures by: * Reversing the cleanup order. * Sleeping for a while before removing the signal handler, since POSIX does not guarantee that timer_delete will clear pending signals. Also, speed up the timer to make the test suite complete faster.
5786b3b
to
e5b9b97
Compare
bors r+ |
The test was disabling the signal handler before disabling the timer.
Fix intermittent failures but reversing the cleanup order.
Also, speed up the timer to make the test suite complete faster.