
Simplify EMA to use PyTorch's update_parameters #5284


Closed
prabhat00155 opened this issue Jan 26, 2022 · 6 comments · Fixed by #5469

Comments

@prabhat00155
Contributor

prabhat00155 commented Jan 26, 2022

PyTorch's AveragedModel handles buffers properly after pytorch/pytorch#71763. It can now be used directly instead of maintaining a custom update_parameters() here.

cc @datumbox
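
A minimal sketch of what the simplification could look like (not the final torchvision implementation), assuming the use_buffers flag introduced by pytorch/pytorch#71763; the class and argument names below are illustrative:

```python
import torch
from torch.optim.swa_utils import AveragedModel


class ExponentialMovingAverage(AveragedModel):
    """Sketch of an EMA wrapper that defers all averaging to AveragedModel."""

    def __init__(self, model, decay, device="cpu"):
        def ema_avg(avg_param, param, num_averaged):
            # Standard exponential moving average update.
            return decay * avg_param + (1.0 - decay) * param

        # use_buffers=True (added in pytorch/pytorch#71763) makes
        # update_parameters() average buffers as well, so no override is needed.
        super().__init__(model, device=device, avg_fn=ema_avg, use_buffers=True)
```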

@vadimkantorov

A related issue proposing an EMA optimizer / functional optimizer in core: pytorch/pytorch#71683

@xiaohu2015
Contributor

I'm interested in this work.

@datumbox
Contributor

@xiaohu2015 It's yours. :)

It should be really straightforward, as we basically want to reuse what we upstreamed to Core. We will need to do a couple of dummy training runs with very little data to confirm that the EMA utility works as expected, but nothing as large-scale as retraining the models.
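
A dummy run along these lines could serve as the sanity check mentioned above; the model, data, and optimizer here are placeholders rather than torchvision's reference scripts:

```python
# A few optimizer steps on random data are enough to check that
# update_parameters() averages both parameters and buffers
# (e.g. BatchNorm running stats).
model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.BatchNorm1d(4))
ema_model = ExponentialMovingAverage(model, decay=0.999)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(10):
    x = torch.randn(16, 8)
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_model.update_parameters(model)  # inherited from AveragedModel
```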

@prabhat00155
Contributor Author

Should we wait for PyTorch's release with the related changes?

@xiaohu2015
Contributor

Should we wait for PyTorch's release with the related changes?

PyTorch 1.11?

@datumbox
Contributor

We can use the nightly of PyTorch that contains the change. Our release branch for the upcoming release is already cut, so your changes won't affect it.
