Moving documentation of quantized models to the right place.

datumbox · datumbox · commit 5baebb462626 · 2021-01-29T15:04:15.000Z
diff --git a/docs/source/models.rst b/docs/source/models.rst
@@ -263,6 +263,53 @@ MNASNet
 .. autofunction:: mnasnet1_0
 .. autofunction:: mnasnet1_3
 
+Quantized Models
+----------------
+
+The following architectures provide support for INT8 quantized models. You can get
+a model with random weights by calling its constructor:
+
+.. code:: python
+
+    import torchvision.models as models
+    googlenet = models.quantization.googlenet()
+    inception_v3 = models.quantization.inception_v3()
+    mobilenet_v2 = models.quantization.mobilenet_v2()
+    mobilenet_v3_large = models.quantization.mobilenet_v3_large()
+    mobilenet_v3_small = models.quantization.mobilenet_v3_small()
+    resnet18 = models.quantization.resnet18()
+    resnet50 = models.quantization.resnet50()
+    resnext101_32x8d = models.quantization.resnext101_32x8d()
+    shufflenet_v2_x0_5 = models.quantization.shufflenet_v2_x0_5()
+    shufflenet_v2_x1_0 = models.quantization.shufflenet_v2_x1_0()
+    shufflenet_v2_x1_5 = models.quantization.shufflenet_v2_x1_5()
+    shufflenet_v2_x2_0 = models.quantization.shufflenet_v2_x2_0()
+
+Obtaining a pre-trained quantized model can be done with a few lines of code:
+
+.. code:: python
+
+    import torch
+    import torchvision.models as models
+    model = models.quantization.mobilenet_v2(pretrained=True, quantize=True).eval()
+    # run the model with quantized inputs and weights
+    out = model(torch.rand(1, 3, 224, 224))
+
+We provide pre-trained quantized weights for the following models:
+
+================================  =============  =============
+Model                             Acc@1          Acc@5
+================================  =============  =============
+MobileNet V2                      71.658         90.150
+MobileNet V3 Large                TODO           TODO
+ShuffleNet V2                     68.360         87.582
+ResNet 18                         69.494         88.882
+ResNet 50                         75.920         92.814
+ResNeXt 101 32x8d                 78.986         94.480
+Inception V3                      77.176         93.354
+GoogLeNet                         69.826         89.404
+================================  =============  =============
+
 
 Semantic Segmentation
 =====================
diff --git a/references/classification/README.md b/references/classification/README.md
@@ -74,27 +74,6 @@ python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py\
 ```
 
 ## Quantized
-### INT8 models
-We add INT8 quantized models to follow the quantization support added in PyTorch 1.3. 
-
-Obtaining a pre-trained quantized model can be obtained with a few lines of code:
-```
-model = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=True)
-model.eval()
-# run the model with quantized inputs and weights
-out = model(torch.rand(1, 3, 224, 224))
-```
-We provide pre-trained quantized weights for the following models:
-
-|       Model       |  Acc@1 |  Acc@5 |
-|:-----------------:|:------:|:------:|
-|    MobileNet V2   | 71.658 | 90.150 |
-|   ShuffleNet V2:  | 68.360 | 87.582 |
-|     ResNet 18     | 69.494 | 88.882 |
-|     ResNet 50     | 75.920 | 92.814 |
-| ResNext 101 32x8d | 78.986 | 94.480 |
-|    Inception V3   | 77.176 | 93.354 |
-|     GoogleNet     | 69.826 | 89.404 |
 
 ### Parameters used for generating quantized models:
 
@@ -106,6 +85,10 @@ For all post training quantized models (All quantized models except mobilenet-v2
 4. eval_batch_size: 128
 5. backend: 'fbgemm'
 
+```
+python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' --model='<model_name>'
+```
+
 For Mobilenet-v2, the model was trained with quantization aware training, the settings used are:
 1. num_workers: 16
 2. batch_size: 32
@@ -119,14 +102,17 @@ For Mobilenet-v2, the model was trained with quantization aware training, the se
 10. lr_step_size:30
 11. lr_gamma: 0.1
 
+```
+python -m torch.distributed.launch --nproc_per_node=8 --use_env train_quantization.py --model='mobilenet_v2'
+```
+
 Training converges at about 10 epochs.
 
 For post training quant, device is set to CPU. For training, the device is set to CUDA
 
 ### Command to evaluate quantized models using the pre-trained weights:
-For all quantized models:
+
 ```
-python references/classification/train_quantization.py  --data-path='imagenet_full_size/' \
-    --device='cpu' --test-only --backend='fbgemm' --model='<model_name>'
+python train_quantization.py --device='cpu' --test-only --backend='fbgemm' --model='<model_name>'
 ```
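
The accuracy table for the pre-trained quantized weights reports Acc@1 and Acc@5, that is, top-1 and top-5 accuracy as percentages. As a rough illustration of what those two numbers measure, here is a minimal pure-Python sketch. `topk_accuracy` is a hypothetical helper written for this note and is not part of `train_quantization.py`; the scores and targets are made-up toy values, not model outputs.

```python
from typing import Sequence


def topk_accuracy(
    scores: Sequence[Sequence[float]], targets: Sequence[int], k: int
) -> float:
    """Percentage of samples whose target class is among the k highest scores.

    Hypothetical helper for illustration only.
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices sorted by descending score, truncated to the best k.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in topk
    return 100.0 * hits / len(targets)


# Toy data: 4 samples, 6 classes (made-up values).
scores = [
    [0.10, 0.50, 0.20, 0.12, 0.05, 0.03],  # target 1: ranked first
    [0.30, 0.10, 0.40, 0.12, 0.05, 0.03],  # target 0: ranked second
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],  # target 5: ranked last
    [0.02, 0.08, 0.10, 0.20, 0.25, 0.35],  # target 5: ranked first
]
targets = [1, 0, 5, 5]

acc1 = topk_accuracy(scores, targets, k=1)
acc5 = topk_accuracy(scores, targets, k=5)
print(f"Acc@1 {acc1:.3f} Acc@5 {acc5:.3f}")  # Acc@1 50.000 Acc@5 75.000
```

Two of the four targets are ranked first, and three are within the best five, so the sketch prints 50% and 75%. A real evaluation would compute the same quantities over the full validation set, typically with batched tensor operations rather than Python loops.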