Add RegNet Architecture in TorchVision #4403

Merged · 42 commits · Sep 29, 2021

Commits
9d3f0b1
initial code
kazhang Sep 9, 2021
e797fca
add SqueezeExcitation
kazhang Sep 9, 2021
692fbaa
initial code
kazhang Sep 9, 2021
eb6fb9f
add SqueezeExcitation
kazhang Sep 9, 2021
88840c3
add SqueezeExcitation
kazhang Sep 10, 2021
8bde15a
regnet blocks, stems and model definition
kazhang Sep 14, 2021
56352a0
nit
kazhang Sep 14, 2021
ce181c3
add fc layer
kazhang Sep 14, 2021
a91c32b
use Callable instead of Enum for block, stem and activation
kazhang Sep 15, 2021
0d1601b
add regnet_x and regnet_y model build functions, add docs
kazhang Sep 15, 2021
59c5c7e
remove unused depth
kazhang Sep 16, 2021
33ad54e
use BN/activation constructor and ConvBNActivation
kazhang Sep 17, 2021
346aba7
add expected test pkl files
kazhang Sep 17, 2021
852593d
allow custom activation in SqueezeExcitation
kazhang Sep 19, 2021
e486307
use ReLU as the default activation
kazhang Sep 20, 2021
8cab2bb
initial code
kazhang Sep 9, 2021
12b9d72
add SqueezeExcitation
kazhang Sep 9, 2021
89fbb2b
initial code
kazhang Sep 9, 2021
df48903
add SqueezeExcitation
kazhang Sep 9, 2021
d71014c
add SqueezeExcitation
kazhang Sep 10, 2021
b440ae4
regnet blocks, stems and model definition
kazhang Sep 14, 2021
0dc5bc8
nit
kazhang Sep 14, 2021
e02d886
add fc layer
kazhang Sep 14, 2021
5a6c729
use Callable instead of Enum for block, stem and activation
kazhang Sep 15, 2021
48a6e36
add regnet_x and regnet_y model build functions, add docs
kazhang Sep 15, 2021
2dbcd6d
remove unused depth
kazhang Sep 16, 2021
baca24f
use BN/activation constructor and ConvBNActivation
kazhang Sep 17, 2021
233bdff
reuse SqueezeExcitation from efficientnet
kazhang Sep 20, 2021
0968d27
refactor RegNetParams into BlockParams
kazhang Sep 20, 2021
2417685
use nn.init, replace np with torch
kazhang Sep 20, 2021
f3b3e96
update README
kazhang Sep 20, 2021
e60e4da
construct model with stem, block, classifier instances
kazhang Sep 21, 2021
27da2c7
Revert "construct model with stem, block, classifier instances"
kazhang Sep 22, 2021
ddf5383
remove unused blocks
kazhang Sep 22, 2021
293073d
support scaled model
kazhang Sep 22, 2021
3957d5d
fuse into ConvBNActivation
kazhang Sep 22, 2021
208f045
make reset_parameters private
kazhang Sep 22, 2021
f78a27f
fix type errors
kazhang Sep 22, 2021
f59ea8c
fix for unit test
kazhang Sep 23, 2021
b0325b6
add pretrained weights for 6 variant models, update docs
kazhang Sep 29, 2021
7fc4948
Merge branch 'main' into models/regnet
kazhang Sep 29, 2021
3f74fa8
Merge branch 'main' into models/regnet
kazhang Sep 29, 2021
46 changes: 46 additions & 0 deletions docs/source/models.rst
@@ -37,6 +37,7 @@ architectures for image classification:
- `Wide ResNet`_
- `MNASNet`_
- `EfficientNet`_
- `RegNet`_

You can construct a model with random weights by calling its constructor:

@@ -65,6 +66,20 @@ You can construct a model with random weights by calling its constructor:
efficientnet_b5 = models.efficientnet_b5()
efficientnet_b6 = models.efficientnet_b6()
efficientnet_b7 = models.efficientnet_b7()
regnet_y_400mf = models.regnet_y_400mf()
regnet_y_800mf = models.regnet_y_800mf()
regnet_y_1_6gf = models.regnet_y_1_6gf()
regnet_y_3_2gf = models.regnet_y_3_2gf()
regnet_y_8gf = models.regnet_y_8gf()
regnet_y_16gf = models.regnet_y_16gf()
regnet_y_32gf = models.regnet_y_32gf()
regnet_x_400mf = models.regnet_x_400mf()
regnet_x_800mf = models.regnet_x_800mf()
regnet_x_1_6gf = models.regnet_x_1_6gf()
regnet_x_3_2gf = models.regnet_x_3_2gf()
regnet_x_8gf = models.regnet_x_8gf()
regnet_x_16gf = models.regnet_x_16gf()
regnet_x_32gf = models.regnet_x_32gf()

We provide pre-trained models, using the PyTorch :mod:`torch.utils.model_zoo`.
These can be constructed by passing ``pretrained=True``:
@@ -94,6 +109,12 @@ These can be constructed by passing ``pretrained=True``:
efficientnet_b5 = models.efficientnet_b5(pretrained=True)
efficientnet_b6 = models.efficientnet_b6(pretrained=True)
efficientnet_b7 = models.efficientnet_b7(pretrained=True)
regnet_y_400mf = models.regnet_y_400mf(pretrained=True)
regnet_y_800mf = models.regnet_y_800mf(pretrained=True)
regnet_y_8gf = models.regnet_y_8gf(pretrained=True)
regnet_x_400mf = models.regnet_x_400mf(pretrained=True)
regnet_x_800mf = models.regnet_x_800mf(pretrained=True)
regnet_x_8gf = models.regnet_x_8gf(pretrained=True)

Instancing a pre-trained model will download its weights to a cache directory.
This directory can be set using the `TORCH_MODEL_ZOO` environment variable. See
@@ -188,6 +209,12 @@ EfficientNet-B4 83.384 96.594
EfficientNet-B5 83.444 96.628
EfficientNet-B6 84.008 96.916
EfficientNet-B7 84.122 96.908
regnet_x_400mf 72.834 90.950
regnet_x_800mf 75.190 92.418
regnet_x_8gf 79.324 94.694
regnet_y_400mf 74.024 91.680
regnet_y_800mf 76.420 93.136
regnet_y_8gf 79.966 95.100
================================ ============= =============


@@ -204,6 +231,7 @@ EfficientNet-B7 84.122 96.908
.. _ResNeXt: https://arxiv.org/abs/1611.05431
.. _MNASNet: https://arxiv.org/abs/1807.11626
.. _EfficientNet: https://arxiv.org/abs/1905.11946
.. _RegNet: https://arxiv.org/abs/2003.13678

.. currentmodule:: torchvision.models

@@ -317,6 +345,24 @@ EfficientNet
.. autofunction:: efficientnet_b6
.. autofunction:: efficientnet_b7

RegNet
------------

.. autofunction:: regnet_y_400mf
.. autofunction:: regnet_y_800mf
.. autofunction:: regnet_y_1_6gf
.. autofunction:: regnet_y_3_2gf
.. autofunction:: regnet_y_8gf
.. autofunction:: regnet_y_16gf
.. autofunction:: regnet_y_32gf
.. autofunction:: regnet_x_400mf
.. autofunction:: regnet_x_800mf
.. autofunction:: regnet_x_1_6gf
.. autofunction:: regnet_x_3_2gf
.. autofunction:: regnet_x_8gf
.. autofunction:: regnet_x_16gf
.. autofunction:: regnet_x_32gf

Quantized Models
----------------

4 changes: 4 additions & 0 deletions hubconf.py
@@ -17,6 +17,10 @@
mnasnet1_3
from torchvision.models.efficientnet import efficientnet_b0, efficientnet_b1, efficientnet_b2, \
efficientnet_b3, efficientnet_b4, efficientnet_b5, efficientnet_b6, efficientnet_b7
from torchvision.models.regnet import regnet_y_400mf, regnet_y_800mf, \
regnet_y_1_6gf, regnet_y_3_2gf, regnet_y_8gf, regnet_y_16gf, regnet_y_32gf, \
regnet_x_400mf, regnet_x_800mf, regnet_x_1_6gf, regnet_x_3_2gf, regnet_x_8gf, \
regnet_x_16gf, regnet_x_32gf

# segmentation
from torchvision.models.segmentation import fcn_resnet50, fcn_resnet101, \
30 changes: 30 additions & 0 deletions references/classification/README.md
@@ -79,6 +79,36 @@ The weights of the B0-B4 variants are ported from Ross Wightman's [timm repo](ht

The weights of the B5-B7 variants are ported from Luke Melas' [EfficientNet-PyTorch repo](https://github.com/lukemelas/EfficientNet-PyTorch/blob/1039e009545d9329ea026c9f7541341439712b96/efficientnet_pytorch/utils.py#L562-L564).


### RegNet

#### Small models
```
torchrun --nproc_per_node=8 train.py\
--model $MODEL --epochs 100 --batch-size 128 --wd 0.00005 --lr=0.8\
--lr-scheduler=cosineannealinglr --lr-warmup-method=linear\
--lr-warmup-epochs=5 --lr-warmup-decay=0.1
```
Here `$MODEL` is one of `regnet_x_400mf`, `regnet_x_800mf`, `regnet_x_1_6gf`, `regnet_y_400mf`, `regnet_y_800mf` and `regnet_y_1_6gf`. Note that we used a learning rate of 0.4 for `regnet_y_400mf` to get the same Acc@1 as [the paper](https://arxiv.org/abs/2003.13678).

#### Medium models
```
torchrun --nproc_per_node=8 train.py\
--model $MODEL --epochs 100 --batch-size 64 --wd 0.00005 --lr=0.4\
--lr-scheduler=cosineannealinglr --lr-warmup-method=linear\
--lr-warmup-epochs=5 --lr-warmup-decay=0.1
```
Here `$MODEL` is one of `regnet_x_3_2gf`, `regnet_x_8gf`, `regnet_x_16gf`, `regnet_y_3_2gf` and `regnet_y_8gf`.

#### Large models
```
torchrun --nproc_per_node=8 train.py\
--model $MODEL --epochs 100 --batch-size 32 --wd 0.00005 --lr=0.2\
--lr-scheduler=cosineannealinglr --lr-warmup-method=linear\
--lr-warmup-epochs=5 --lr-warmup-decay=0.1
```
Here `$MODEL` is one of `regnet_x_32gf`, `regnet_y_16gf` and `regnet_y_32gf`.
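
The warmup-plus-cosine schedule used in all three commands can be approximated with stock PyTorch schedulers. This is a hand-rolled sketch of what the flags configure, not the reference `train.py` itself; the `nn.Linear` stand-in model and the 0.9 momentum are assumptions:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 10)  # stand-in for a RegNet
epochs, warmup_epochs = 100, 5

# --lr=0.8 --wd=0.00005 (momentum assumed at the script default of 0.9)
optimizer = optim.SGD(model.parameters(), lr=0.8, weight_decay=5e-5, momentum=0.9)

# --lr-warmup-method=linear --lr-warmup-epochs=5 --lr-warmup-decay=0.1:
# ramp linearly from 0.1 * lr up to lr over the first 5 epochs.
warmup = optim.lr_scheduler.LinearLR(
    optimizer, start_factor=0.1, total_iters=warmup_epochs)
# --lr-scheduler=cosineannealinglr: cosine decay over the remaining epochs.
cosine = optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=epochs - warmup_epochs)
scheduler = optim.lr_scheduler.SequentialLR(
    optimizer, [warmup, cosine], milestones=[warmup_epochs])

lrs = []
for _ in range(epochs):
    optimizer.step()  # one epoch of training would go here
    lrs.append(optimizer.param_groups[0]["lr"])
    scheduler.step()

print(lrs[0], max(lrs), lrs[-1])  # small start, 0.8 peak, near-zero finish
```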

## Mixed precision training
Automatic Mixed Precision (AMP) training on GPU for PyTorch can be enabled with the [NVIDIA Apex extension](https://github.com/NVIDIA/apex).

Binary file added test/expect/ModelTester.test_regnet_x_8gf_expect.pkl
Binary file added test/expect/ModelTester.test_regnet_y_8gf_expect.pkl
1 change: 1 addition & 0 deletions torchvision/models/__init__.py
@@ -9,6 +9,7 @@
from .mnasnet import *
from .shufflenetv2 import *
from .efficientnet import *
from .regnet import *
from . import segmentation
from . import detection
from . import video