Object Detection with MQBench

This section describes how to quantize an object detection model using MQBench.

Getting Started

1. Clone the repositories.

git clone https://github.com/ModelTC/MQBench.git
# Clone the united-perception (UP) repository and install it.

2. Quantization-aware training (QAT).

# Prepare your float pretrained model.
cd united-perception/scripts
# Follow the prompts to set config in train_quant.sh.
sh dist_train_quant.sh

The united-perception repository provides several example QAT configs:

For retinanet-tensorrt:

float pretrained config file: configs/det/retinanet/retinanet-r18-improve.yaml
qat config file: configs/det/retinanet/retinanet-r18-improve_quant_trt.yaml

For yolox-tensorrt:

float pretrained config file: configs/det/retinanet/yolox_s_ret_a1_comloc.yaml
qat config file: configs/det/retinanet/yolox_s_ret_a1_comloc_quant_trt.yaml

For yolox-vitis:

float pretrained config file: configs/det/yolox/yolox_fpga.yaml
qat config file: configs/det/yolox/yolox_fpga_quant_vitis.yaml

Important settings in the config file:

deploy_backend: The deployment backend, chosen from those supported by MQBench.

ptq_only: If True, only PTQ is executed. If False, QAT is executed after PTQ calibration.

extra_qconfig_dict: The quantization config, chosen from those supported by MQBench.

leaf_module: Modules that torch.fx treats as leaves and does not trace into.

extra_quantizer_dict: Additional QAT modules to add.

resume_model: The path to your float pretrained model.

tocaffe_friendly: Setting this to true is recommended. It affects the exported ONNX model.
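
As a rough illustration of how these settings relate to each other, the sketch below mirrors them as a Python dictionary and derives the stages implied by ptq_only. The flat layout, the concrete values (the backend name, the observer and module names, the path) and the helper planned_stages are assumptions made for illustration; they are not copied from a united-perception config file.

```python
# Hypothetical mirror of the quantization settings described above.
# Every value is a placeholder chosen for illustration only.
quant_cfg = {
    "deploy_backend": "tensorrt",          # assumed backend name
    "ptq_only": False,                     # False: QAT follows PTQ calibration
    "extra_qconfig_dict": {                # assumed observer choices
        "w_observer": "MinMaxObserver",
        "a_observer": "EMAMinMaxObserver",
    },
    "leaf_module": ["Space2Depth"],        # assumed module that torch.fx must not enter
    "extra_quantizer_dict": {},            # extra QAT modules, left empty here
    "resume_model": "path/to/float_pretrained.pth",  # placeholder path
    "tocaffe_friendly": True,
}


def planned_stages(cfg):
    """Return the stages implied by ptq_only.

    PTQ calibration always runs; QAT is added only when ptq_only is False.
    Treating a missing ptq_only as False is an assumption of this sketch.
    """
    stages = ["ptq"]
    if not cfg.get("ptq_only", False):
        stages.append("qat")
    return stages


print(planned_stages(quant_cfg))
```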

3. Resume training during QAT.

cd united-perception/scripts
# Set resume_model in the config file to your model; the rest is handled automatically.
sh dist_train_quant.sh

4. Evaluate your quantized model.

cd united-perception/scripts
# Set resume_model in the config file to your model.
# Add -e to the command in train_quant.sh.
sh dist_train_quant.sh

Introduction to the UP-MQBench Project

The training code starts in united-perception/commands/train.py. The deployment code starts in united-perception/commands/quant_deploy.py.

When the runner type is set to quant in the config file, QuantRunner, defined in united-perception/runner/quant_runner.py, is executed.

First, build the float model in self.build_model().
Load the float pretrained model or the quantized model in self.load_ckpt().
Trace the model with torch.fx in self.quantize_model().
Set up the optimizer and lr scheduler in self.build_trainer().
Run PTQ and evaluation in self.calibrate().
Train in self.train().
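
The order of these steps can be sketched with a stand-in class. SketchQuantRunner and its run() method are hypothetical; the real QuantRunner does actual work in each method, whereas the stubs here only record that they were called. Skipping train() when ptq_only is set is an assumption based on the ptq_only description in the config section.

```python
class SketchQuantRunner:
    """Hypothetical skeleton of the runner flow; not the real QuantRunner."""

    def __init__(self, ptq_only=False):
        self.ptq_only = ptq_only
        self.calls = []  # records the order in which the steps run

    def build_model(self):
        self.calls.append("build_model")      # build the float model

    def load_ckpt(self):
        self.calls.append("load_ckpt")        # load float or quantized weights

    def quantize_model(self):
        self.calls.append("quantize_model")   # trace the model with torch.fx

    def build_trainer(self):
        self.calls.append("build_trainer")    # optimizer and lr scheduler

    def calibrate(self):
        self.calls.append("calibrate")        # PTQ and evaluation

    def train(self):
        self.calls.append("train")            # QAT training

    def run(self):
        self.build_model()
        self.load_ckpt()
        self.quantize_model()
        self.build_trainer()
        self.calibrate()
        if not self.ptq_only:  # assumption: PTQ-only runs stop after calibration
            self.train()


runner = SketchQuantRunner()
runner.run()
print(runner.calls)
```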

Important notes:

Your model should be split into a network and post-processing. torch.fx should only trace the network.

A quantized model should be saved with the key qat, as shown in self.save(). This key is used in self.resume_model_from_fp() and self.resume_model_from_quant().
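
One way to picture the role of the qat key is the sketch below, which tags a checkpoint dictionary on save and picks a resume path on load. The helpers make_checkpoint and choose_resume_method, the "model" key, and storing a boolean under "qat" are all assumptions for illustration; the actual layout is whatever self.save() writes.

```python
def make_checkpoint(state_dict, quantized):
    """Build a checkpoint dict; quantized models carry the 'qat' key (assumed boolean)."""
    ckpt = {"model": state_dict}  # 'model' is a hypothetical key name
    if quantized:
        ckpt["qat"] = True
    return ckpt


def choose_resume_method(ckpt):
    """Return the name of the resume method that matches the checkpoint type."""
    if ckpt.get("qat", False):
        return "resume_model_from_quant"
    return "resume_model_from_fp"


float_ckpt = make_checkpoint({"conv.weight": [0.1, 0.2]}, quantized=False)
quant_ckpt = make_checkpoint({"conv.weight": [0.1, 0.2]}, quantized=True)
print(choose_resume_method(float_ckpt))
print(choose_resume_method(quant_ckpt))
```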

EMA is disabled during QAT. If your checkpoint contains EMA state, the EMA state is loaded into the model, as shown in self.load_ckpt().
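
The EMA rule can be illustrated with a small selection helper. This is only a sketch: select_weights and the "ema" and "model" key names are hypothetical, and the real logic lives in self.load_ckpt().

```python
def select_weights(ckpt):
    """Prefer the EMA weights when the checkpoint carries them, else the plain weights."""
    ema_state = ckpt.get("ema")  # hypothetical key holding the EMA weights
    if ema_state is not None:
        return ema_state
    return ckpt["model"]         # hypothetical key holding the plain weights


with_ema = {"model": {"w": 1.0}, "ema": {"w": 0.9}}
without_ema = {"model": {"w": 1.0}}
print(select_weights(with_ema))
print(select_weights(without_ema))
```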

Be careful when your quantized model has extra learnable parameters. Check that the optimizer handles them, for example in united-perception/tasks/det/plugins/yolov5/utils/optimizer_helper.py. LSQ has been checked.
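
A simple check for this situation is to compare the trainable parameter names against the names the optimizer groups actually cover. The sketch below works on plain name lists; uncovered_parameters, the "names" field, and the example parameter names (including the quantizer scale) are assumptions for illustration, since a real PyTorch optimizer stores tensors rather than names.

```python
def uncovered_parameters(trainable_names, param_groups):
    """Return trainable parameter names that no optimizer group covers."""
    covered = set()
    for group in param_groups:
        covered.update(group["names"])  # 'names' is a hypothetical field
    return sorted(set(trainable_names) - covered)


# Hypothetical names: the last one stands for a learnable quantizer scale.
trainable = [
    "backbone.conv1.weight",
    "backbone.bn1.bias",
    "backbone.conv1.weight_fake_quant.scale",
]
groups = [
    {"names": ["backbone.conv1.weight"]},  # e.g. weights with decay
    {"names": ["backbone.bn1.bias"]},      # e.g. biases without decay
]

missing = uncovered_parameters(trainable, groups)
print(missing)  # any name listed here would never be updated by the optimizer
```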

When deploying a model, set self.model.deploy to True, as shown in united-perception/apis/quant_deploy.py. This removes redundant nodes from your model.