Initial version of classification reference scripts #819
Also log the learning rate
Identified a bug in the reporting of the results: they need to be reduced across all processes
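To illustrate the reporting fix described in this commit, here is a minimal sketch of reducing evaluation counts across processes before computing a metric. The function name `global_accuracy` and its parameters are hypothetical and not taken from the PR. It assumes each process has counted its own correct predictions and samples, and it falls back to the local values when no process group is initialized.

```python
import torch
import torch.distributed as dist


def global_accuracy(num_correct, num_samples, device="cpu"):
    """Return accuracy computed over all processes (hypothetical helper).

    With the NCCL backend the tensor must live on a GPU, so pass
    device="cuda" in that case; "cpu" works with the gloo backend.
    """
    stats = torch.tensor([num_correct, num_samples],
                         dtype=torch.float64, device=device)
    # Only reduce when a process group exists; otherwise the local
    # counts are already the global counts.
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(stats, op=dist.ReduceOp.SUM)
    correct, total = stats.tolist()
    return correct / total


if __name__ == "__main__":
    # Single-process run: no reduction happens.
    print(global_accuracy(30, 40))
```

Summing the raw counts, rather than averaging per-process accuracies, keeps the result correct even when processes see different numbers of samples.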
Reviewed the train script. Didn't review classification/utils.py.
def setup_for_distributed(is_master):
    """
The printing override seems fine, but the torch.save override seems pretty sketchy. Maybe consider having a utils.save_on_master that you call explicitly, rather than monkey-patching torch.save.
Sounds good.
This is a fundamental feature for distributed training, so it's better to have it right.
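The `utils.save_on_master` helper suggested in this review thread could look roughly like the sketch below. This is an illustration under assumptions, not necessarily the code that was merged: the helper names `get_rank` and `is_main_process` are hypothetical, and rank 0 is assumed to be the master process.

```python
import torch
import torch.distributed as dist


def get_rank():
    """Rank of this process, or 0 when not running distributed."""
    if dist.is_available() and dist.is_initialized():
        return dist.get_rank()
    return 0


def is_main_process():
    # Assumption: rank 0 acts as the master process.
    return get_rank() == 0


def save_on_master(*args, **kwargs):
    """Forward to torch.save, but only on the master process."""
    if is_main_process():
        torch.save(*args, **kwargs)
```

Calling `save_on_master(checkpoint, path)` explicitly keeps `torch.save` itself untouched, so other code that relies on it behaves the same on every rank.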
Codecov Report
@@           Coverage Diff           @@
##           master     #819   +/-   ##
=======================================
  Coverage   51.58%   51.58%
=======================================
  Files          34       34
  Lines        3342     3342
  Branches      536      536
=======================================
  Hits         1724     1724
  Misses       1486     1486
  Partials      132      132
This PR introduces the foundations for reference training/evaluation scripts for torchvision.
The idea is that all pre-trained models will have corresponding training scripts / command-line arguments, so that reproducing a trained model should be straightforward.
This is not the final version. I'll be merging this soon, and after adding segmentation and detection training/evaluation scripts, a lot of it will be refactored and included inside torchvision.