Fix flakiness on StochasticDepth test #4758


Merged · 4 commits into pytorch:main on Oct 27, 2021

Conversation

@datumbox (Contributor) commented on Oct 27, 2021

Partially resolves #4506

cc @pmeier

@facebook-github-bot commented on Oct 27, 2021

💊 CI failures summary and remediations

As of commit 59c1c8b (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

@datumbox (Contributor, Author) left a comment:
Adding some highlights on the changes:

```diff
 @pytest.mark.parametrize("p", [0.2, 0.5, 0.8])
 @pytest.mark.parametrize("mode", ["batch", "row"])
-def test_stochastic_depth(self, mode, p):
+def test_stochastic_depth_random(self, seed, mode, p):
+    torch.manual_seed(seed)
```
@datumbox (Contributor, Author) commented:

We maintain the original test because it allows us to check that the different mode values operate as expected. Using p values in the interval (0, 1) is critical because for p=0 and p=1 the two modes behave the same. The mitigation for the flakiness here is to set the seed.
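To make the batch/row distinction concrete, here is a minimal sketch of a stochastic-depth operator in the spirit of the one under test. This is an illustrative helper (the name `stochastic_depth_sketch` is made up), not the actual torchvision implementation; it already includes the `survival_rate > 0` guard discussed later in this thread.

```python
import torch

def stochastic_depth_sketch(x, p, mode, training=True):
    # Hypothetical sketch of stochastic depth, following this thread's
    # description; not the exact torchvision.ops implementation.
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        return x
    survival_rate = 1.0 - p
    if mode == "row":
        # One independent keep/drop decision per sample in the batch.
        size = [x.shape[0]] + [1] * (x.ndim - 1)
    else:
        # "batch": a single decision shared by the whole batch, so for
        # p in (0, 1) the two modes produce visibly different outputs.
        size = [1] * x.ndim
    noise = torch.empty(size, dtype=x.dtype, device=x.device)
    noise = noise.bernoulli_(survival_rate)
    if survival_rate > 0.0:
        noise.div_(survival_rate)  # rescale kept activations
    return x * noise
```

As the comment above notes, at p=0 (keep everything) and p=1 (drop everything) the two branches produce identical results, which is why the randomized test needs p values strictly inside (0, 1).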

@NicolasHug (Member) commented:

On my laptop the test passed for 97/100 seeds, so there's a very small chance it will fail in the future (first failure on the 11th seed, lol).

I agree it's fine to leave as-is for now.

@datumbox (Contributor, Author) replied:

It's because I increased the p-value threshold to 1%. You'd expect false positives at around that rate now; hopefully we won't see them thanks to the seed setting.

```diff
@@ -1173,7 +1175,22 @@ def test_stochastic_depth(self, mode, p):
             num_samples += batch_size

         p_value = stats.binom_test(counts, num_samples, p=p)
-        assert p_value > 0.0001
+        assert p_value > 0.01
```
@datumbox (Contributor, Author) commented:

Significantly increased the threshold since we now only check 10 seeds. We can reduce it if flakiness continues.
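To see how the relaxed 0.01 threshold behaves, here is a small illustration using SciPy's `stats.binomtest` (the modern spelling of the `binom_test` call in the diff): drop counts near the expected rate yield large p-values and pass, while counts several standard deviations away collapse below even the old 0.0001 threshold. The specific counts below are chosen for illustration only.

```python
from scipy import stats

num_samples, p = 10_000, 0.5

# Counts exactly at the expected rate: the observed outcome is the most
# likely one under H0, so the two-sided p-value is at its maximum.
near = stats.binomtest(5_000, num_samples, p=p).pvalue

# Counts ~6 standard deviations away (sd = sqrt(n*p*(1-p)) = 50 here):
# essentially impossible under H0, so the p-value is vanishingly small.
far = stats.binomtest(5_300, num_samples, p=p).pvalue

assert near > 0.01      # passes the relaxed 1% threshold
assert far < 0.0001     # fails even the old 0.01% threshold
```

With only 10 fixed seeds checked per parametrization, a 1% threshold keeps the expected false-positive count well below one while still catching a genuinely miscalibrated drop rate.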

```python
@pytest.mark.parametrize("seed", range(10))
@pytest.mark.parametrize("p", (0, 1))
@pytest.mark.parametrize("mode", ["batch", "row"])
def test_stochastic_depth(self, seed, mode, p):
```
@datumbox (Contributor, Author) commented:

Adding an additional test with p=0 and p=1 to confirm it works as expected for the extreme values.

```diff
-    noise = noise.bernoulli_(survival_rate).div_(survival_rate)
+    noise = noise.bernoulli_(survival_rate)
+    if survival_rate > 0.0:
+        noise.div_(survival_rate)
```
@datumbox (Contributor, Author) commented:

This was actually a bug; the previous code produced NaNs! Though it isn't really something users will face: setting p=1 on the operator means you always drop the block (set it to 0). I'm not sure many users would want to do that, but it's worth fixing anyway.
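The NaNs come from the in-place division: with p=1 the survival rate is 0, so the old one-liner computed `bernoulli_(0.0)` (an all-zeros mask) followed by `div_(0.0)`, and 0/0 is NaN in floating point. A small reproduction, sketched here under that reading of the diff:

```python
import torch

# Old behavior: bernoulli_(0.) yields zeros, then div_(0.) turns every
# element into 0/0 = nan, which would propagate through the block.
noise = torch.zeros(4).bernoulli_(0.0).div_(0.0)
assert torch.isnan(noise).all()

# Fixed behavior: skip the division when survival_rate == 0, leaving a
# clean all-zeros mask that correctly drops the block.
fixed = torch.zeros(4).bernoulli_(0.0)
assert (fixed == 0).all()
```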

@NicolasHug (Member) left a review:

Thanks @datumbox!


@datumbox merged commit bbfda42 into pytorch:main on Oct 27, 2021
@datumbox deleted the tests/flaky_stochastic_depth branch on Oct 27, 2021
facebook-github-bot pushed a commit that referenced this pull request Nov 1, 2021
Summary:
* Fix flakiness on the TestStochasticDepth test.

* Fix minor bug when p=1.0

* Remove device and dtype setting.

Reviewed By: datumbox

Differential Revision: D32064694

fbshipit-source-id: 4107800cb6f8e56bcd85db31176afae394b86a21
cyyever pushed a commit to cyyever/vision that referenced this pull request Nov 16, 2021