Flaky test TestSimulator::test_one_gaussian #5409

Closed
ricardoV94 opened this issue Jan 27, 2022 · 3 comments · Fixed by #5467

Comments

@ricardoV94
Member

This one failed a couple of times recently, including in https://github.com/pymc-devs/pymc/runs/4964000272?check_suite_focus=true

We need to check that behavior is still as expected and if so, just tweak the threshold.

@LukeLB
Contributor

LukeLB commented Feb 12, 2022

@ricardoV94 I've been taking a look at this. It looks like the test is failing due to random sampling error. I simulated 5000 examples of the failing assert statement, the results are shown in the plot below.
[Image: plot of the simulated results]
The results show that, across 5000 samples, about 0.1% of the tests are expected to fail with the 0.1 threshold currently in use. Changing the threshold to 0.2 would reduce the failure rate to <0.02% (from 5000 samples). So my suggestion is to change the threshold to 0.2 and also add a comment to the test explaining why it may still fail in the future (given enough runs, a value greater than 0.2 can still occur). What do you think?
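
For readers who want to reproduce this kind of estimate, here is a minimal sketch of the approach described above: run the asserted quantity many times and count how often it exceeds each candidate threshold. It is not the actual simulation used in this comment. `hypothetical_test_statistic` is an assumed stand-in (the absolute mean of standard normal draws) for whatever value `test_one_gaussian` asserts on, and `n_draws`, `n_runs`, and `seed` are illustrative parameters.

```python
import random
import statistics


def hypothetical_test_statistic(rng, n_draws=1000):
    """Assumed stand-in for the value the test compares to the threshold.

    Here: the absolute mean of n_draws standard normal samples. The real
    test computes something else; swap this function out to match it.
    """
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    return abs(statistics.fmean(draws))


def estimate_failure_rate(threshold, n_runs=5000, seed=0, n_draws=1000):
    """Fraction of simulated runs whose statistic exceeds the threshold."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    failures = sum(
        hypothetical_test_statistic(rng, n_draws) > threshold
        for _ in range(n_runs)
    )
    return failures / n_runs


if __name__ == "__main__":
    # Compare the thresholds discussed in this thread.
    for threshold in (0.1, 0.15, 0.2):
        rate = estimate_failure_rate(threshold)
        print(f"threshold={threshold}: estimated failure rate {rate:.4%}")
```

With 5000 runs, the smallest nonzero rate that can be observed is 1/5000 = 0.02%, which is why a result of zero failures is reported as "<0.02%" rather than as exactly zero.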

@ricardoV94
Member Author

@LukeLB thanks for investigating this. Changing the threshold seems fine, perhaps something like 0.15 is enough.

Do you want to open a PR?

@LukeLB
Contributor

LukeLB commented Feb 13, 2022

Sure thing, I'll make that change.
