Simplify EMA to use Pytorch's update_parameters #5284
Comments
A related issue proposing an EMA optimizer / functional optimizer in core: pytorch/pytorch#71683
I'm interested in this work.
@xiaohu2015 It's yours. :) It should be really straightforward, as we basically want to reuse what we upstreamed to Core. We will need to do a couple of dummy training runs with very little data to confirm that the EMA utility works as expected, but nothing as large-scale as training models.
Should we wait for the PyTorch release with the related changes?
The PyTorch 1.11 release?
We can use the PyTorch nightly that contains the change. Our release branch for the upcoming release is already cut, so your changes won't affect it.
PyTorch's AveragedModel handles buffers properly after pytorch/pytorch#71763. It can now be used directly instead of keeping a custom update_parameters() here.
cc @datumbox
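A minimal sketch of what the simplification could look like, assuming torch.optim.swa_utils.AveragedModel exposes the avg_fn and use_buffers arguments referenced by pytorch/pytorch#71763; the decay value and the toy model below are illustrative, not torchvision's actual settings.

```python
import torch
from torch.optim.swa_utils import AveragedModel

decay = 0.999  # illustrative EMA decay factor


def ema_avg_fn(averaged_param, model_param, num_averaged):
    # Exponential moving average: avg <- decay * avg + (1 - decay) * current
    return decay * averaged_param + (1.0 - decay) * model_param


model = torch.nn.Linear(10, 2)  # stand-in for the real model
# use_buffers=True also averages buffers (e.g. BatchNorm running stats)
ema_model = AveragedModel(model, avg_fn=ema_avg_fn, use_buffers=True)

# Inside the training loop, after each optimizer step:
ema_model.update_parameters(model)
```

With this, the EMA wrapper no longer needs to override update_parameters() just to handle buffers.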