- Python: >= 3.10
- PyTorch: >= 1.11.0+cu11.3 (Install Here)
- Path: Modify `sys.path.append('/sjq/NeuroQuant/')` at the head of the Python files to your local path (see the sketch below).
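For example, the path fix at the top of an entry script could look like the following minimal sketch; the placeholder directory is illustrative and should be replaced with your own clone location.

```python
import sys

# Make the NeuroQuant repository importable from anywhere.
# Replace the placeholder with the directory you cloned into (placeholder, not the real path).
sys.path.append('/path/to/NeuroQuant/')
```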
NeuroQuant is a post-training quantization framework. Follow these steps to get started.
Since our method is applied post-training, you first need a full-precision (FP32) model as a baseline.
HNeRV-Bunny

```bash
CUDA_VISIBLE_DEVICES=0 python methods/regress.py --data_path bunny --vid Bunny --arch hnerv --outf HNeRV_Bunny_1280x640 --config configs/HNeRV/Bunny_1280x640_3M.yaml
```

NeRV-Bunny

```bash
CUDA_VISIBLE_DEVICES=0 python methods/regress.py --data_path bunny --vid Bunny --arch nerv --outf NeRV_Bunny_1280x640 --config configs/NeRV/Bunny_1280x640_3M.yaml
```

After obtaining the FP32 model, we perform layer-wise bit-width allocation to achieve mixed-precision quantization. Adjusting the total bit budget allows different rate-distortion trade-offs (see Section 3.1 in the paper).
- Define mixed-precision configurations for a given bit range.
- Select the optimal configuration using the proposed Omega sensitivity criterion.
- Optional: Integer Linear Programming or Genetic Algorithms can be applied using Omega as the scoring metric.
A toy example is provided in `methods/bit_assign.py`.
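For intuition, the selection loop can be sketched as below. This is a simplified stand-in rather than the code in `methods/bit_assign.py`: configurations are enumerated exhaustively under an average-bit budget, and a naive per-layer quantization-MSE proxy (`quant_error`, a hypothetical helper) takes the place of the Omega criterion from the paper.

```python
import itertools
import torch

def quant_error(w, bits):
    """Proxy sensitivity: MSE introduced by symmetric uniform quantization.
    This is NOT the Omega criterion; it only illustrates the scoring role."""
    qmax = 2 ** (bits - 1) - 1
    scale = torch.clamp(w.abs().max() / qmax, min=1e-8)
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return torch.mean((w - w_q) ** 2).item()

def assign_bits(weights, bit_choices=(4, 5, 6), avg_budget=5.0):
    """Exhaustively search per-layer bit-widths under an average-bit budget and
    return the configuration with the lowest total proxy sensitivity."""
    names = list(weights)
    best_cfg, best_score = None, float('inf')
    for cfg in itertools.product(bit_choices, repeat=len(names)):
        if sum(cfg) / len(cfg) > avg_budget:   # enforce the bit budget
            continue
        score = sum(quant_error(weights[n], b) for n, b in zip(names, cfg))
        if score < best_score:
            best_cfg, best_score = cfg, score
    return dict(zip(names, best_cfg))

# Toy usage with random "layers"; in practice the FP32 checkpoint would be loaded.
layers = {f'layer{i}': torch.randn(64, 64) for i in range(4)}
print(assign_bits(layers))
```

The exhaustive loop is only to show the score-and-select pattern; as noted above, ILP or genetic search can take its place when the configuration space is large.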
HNeRV-Bunny

```bash
CUDA_VISIBLE_DEVICES=0 python methods/bit_assign.py --data_path bunny \
--arch hnerv --vid Bunny --outf HNeRV_Bunny_1280x640 --config configs/HNeRV/Bunny_1280x640_3M.yaml \
--batch_size 2 --channel_wise --init max --mode omega \
--ckpt results/HNeRV_Bunny_1280x640/Bunny_e300_b1_lr0.0005_l2/Encoder_0.31M_Decoder_2.65M_Total_2.66M/epoch300.pth
```

NeRV-Bunny

```bash
CUDA_VISIBLE_DEVICES=0 python methods/bit_assign.py --data_path bunny \
--arch nerv --vid Bunny --outf NeRV_Bunny_1280x640 --config configs/NeRV/Bunny_1280x640_3M.yaml \
--batch_size 2 --channel_wise --init max --mode omega \
--ckpt results/NeRV_Bunny_1280x640/Bunny_e300_b1_lr0.0005_l2/Encoder_0.0M_Decoder_3.08M_Total_3.08M/epoch300.pth
```

Once the bit-widths are determined, calibrate the quantization parameters. Building on AdaRound and BRECQ, we explore INR-centric strategies (see Section 3.2 in the paper), including:
- Channel-wise quantization (a generic sketch follows this list)
- Network-wise calibration
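As a generic illustration of the channel-wise strategy (not the code in `quantization/quant_layer.py`), each output channel receives its own quantization scale, initialized here from the per-channel absolute maximum, similar in spirit to the `--init max` option used below.

```python
import torch

def quantize_channel_wise(w, bits=6):
    """Symmetric uniform quantization with one scale per output channel.
    `w` is a conv/linear weight of shape (out_channels, ...)."""
    qmax = 2 ** (bits - 1) - 1
    w_flat = w.reshape(w.shape[0], -1)
    # One scale per output channel, taken from that channel's absolute maximum.
    scale = torch.clamp(w_flat.abs().amax(dim=1, keepdim=True) / qmax, min=1e-8)
    w_q = torch.clamp(torch.round(w_flat / scale), -qmax - 1, qmax) * scale
    return w_q.reshape(w.shape)

# Channels with very different ranges benefit from per-channel scales.
w = torch.randn(8, 16, 3, 3) * torch.logspace(-2, 0, 8).view(8, 1, 1, 1)
print((w - quantize_channel_wise(w)).pow(2).mean())
```

Network-wise calibration, by contrast, optimizes the quantization parameters against the reconstruction error of the whole network's output rather than layer by layer or block by block, which is what `methods/calibrate_network.py` drives in the commands below.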
HNeRV-Bunny

```bash
CUDA_VISIBLE_DEVICES=0 python methods/calibrate_network.py --data_path bunny \
--arch hnerv --vid Bunny --outf HNeRV_Bunny_1280x640 --config configs/HNeRV/Bunny_1280x640_3M.yaml \
--batch_size 2 --channel_wise --init max --opt_mode mse --input_prob 1.0 --norm_p 2.0 --iters_w 21000 \
--hadamard --weight 0.01 --b_start 20 --b_end 2 --warmup 0.2 --lr 0.003 --precision 6 5 4 5 5 6 6 \
--ckpt results/HNeRV_Bunny_1280x640/Bunny_e300_b1_lr0.0005_l2/Encoder_0.31M_Decoder_2.65M_Total_2.66M/epoch300.pth
```

NeRV-Bunny

```bash
CUDA_VISIBLE_DEVICES=0 python methods/calibrate_network.py --data_path bunny \
--arch nerv --vid Bunny --outf NeRV_Bunny_1280x640 --config configs/NeRV/Bunny_1280x640_3M.yaml \
--batch_size 2 --channel_wise --init max --opt_mode mse --input_prob 1.0 --norm_p 2.0 --iters_w 21000 \
--hadamard --weight 0.01 --b_start 20 --b_end 2 --warmup 0.2 --lr 0.003 --precision 6 5 4 5 5 6 6 \
--ckpt results/NeRV_Bunny_1280x640/Bunny_e300_b1_lr0.0005_l2/Encoder_0.0M_Decoder_3.08M_Total_3.08M/epoch300.pth
```

Note: The Hadamard transform can increase runtime. You can disable it by removing `--hadamard`, or use `fast-hadamard-transform` in `quantization/quant_layer.py` for a faster CUDA implementation.
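For background, the Hadamard option rotates weights with an orthogonal Hadamard matrix so that large values are spread more evenly before quantization. The sketch below is a generic illustration, not the repository's implementation; a real pipeline would typically fold the rotation into adjacent layers rather than undoing it as done here, and the `fast-hadamard-transform` package replaces the dense matmul with an O(n log n) CUDA kernel.

```python
import torch

def hadamard_matrix(n, device=None, dtype=torch.float32):
    """Normalized Hadamard matrix of size n (a power of two) via Sylvester's
    construction; H @ H.T equals the identity."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    H = torch.ones(1, 1, device=device, dtype=dtype)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    return H / (n ** 0.5)

def hadamard_quant(w, bits=6):
    """Rotate, quantize per-tensor, rotate back. Spreading outliers across
    coordinates usually lowers the quantization error."""
    H = hadamard_matrix(w.shape[-1], device=w.device, dtype=w.dtype)
    w_rot = w @ H
    qmax = 2 ** (bits - 1) - 1
    scale = torch.clamp(w_rot.abs().max() / qmax, min=1e-8)
    w_q = torch.clamp(torch.round(w_rot / scale), -qmax - 1, qmax) * scale
    return w_q @ H.T   # undo the rotation (H is orthogonal)

w = torch.randn(128, 64)
print((w - hadamard_quant(w)).pow(2).mean())
```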
Finally, the quantized model is encoded into a bitstream. Any entropy codec or entropy model can be used, so this step is implementation-agnostic.
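As a codec-agnostic example (not part of the repository), the sketch below packs quantized integer weights and compresses them with Python's built-in LZMA; any arithmetic coder or learned entropy model could be substituted. A real bitstream would additionally record the per-layer bit-widths, tensor shapes, and quantization scales needed for decoding.

```python
import lzma
import numpy as np

def encode_layer(q_weights, bits):
    """Pack quantized integers into bytes and compress them losslessly.
    LZMA is only a stand-in for a proper entropy codec."""
    q = np.asarray(q_weights, dtype=np.int32)
    offset = 2 ** (bits - 1)                      # shift to an unsigned range
    dtype = np.uint8 if bits <= 8 else np.uint16
    return lzma.compress((q + offset).astype(dtype).tobytes())

def decode_layer(bitstream, shape, bits):
    offset = 2 ** (bits - 1)
    dtype = np.uint8 if bits <= 8 else np.uint16
    q = np.frombuffer(lzma.decompress(bitstream), dtype=dtype).astype(np.int32)
    return q.reshape(shape) - offset

# Round-trip check on random 6-bit symbols.
q = np.random.randint(-32, 32, size=(64, 64))
stream = encode_layer(q, bits=6)
assert np.array_equal(decode_layer(stream, q.shape, bits=6), q)
print(len(stream), "bytes")
```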
Based on the quick-start scripts above, we provide some training logs and sample results under `results/` for reference. Due to remote access, the training time shown in the logs is longer than that reported in the paper. The corresponding models can be found on Google Drive.
Modify configs and hyperparameters to explore additional experiments and results.
Tools for analyzing INR characteristics are included, as described in the paper.
See `draw/ReadMe.md` for details.
```bibtex
@inproceedings{shiquantizing,
  title={On Quantizing Neural Representation for Variable-Rate Video Coding},
  author={Shi, Junqi and Chen, Zhujia and Li, Hanfei and Zhao, Qi and Lu, Ming and Chen, Tong and Ma, Zhan},
  booktitle={The Thirteenth International Conference on Learning Representations}
}
```
