Object Detection with MQBench
This part introduces how to quantize an object detection model using MQBench.
Getting Started
1. Clone the repositories.
git clone https://github.com/ModelTC/MQBench.git
# Clone UP repository and install it.
2. Quantization-aware training (QAT).
# Prepare your float pretrained model.
cd united-perception/scripts
# Follow the prompts to set config in train_quant.sh.
sh dist_train_quant.sh
The united-perception repository provides several example QAT configs:
- For retinanet-tensorrt:
float pretrained config file: configs/det/retinanet/retinanet-r18-improve.yaml
qat config file: configs/det/retinanet/retinanet-r18-improve_quant_trt.yaml
- For yolox-tensorrt:
float pretrained config file: configs/det/yolox/yolox_s_ret_a1_comloc.yaml
qat config file: configs/det/yolox/yolox_s_ret_a1_comloc_quant_trt.yaml
- For yolox-vitis:
float pretrained config file: configs/det/yolox/yolox_fpga.yaml
qat config file: configs/det/yolox/yolox_fpga_quant_vitis.yaml
Important settings in the config file:
deploy_backend: The deployment backend to target; it must be one supported by MQBench.
ptq_only: If True, only PTQ is executed. If False, QAT is executed after PTQ calibration.
extra_qconfig_dict: The quantization config to use; it must be one supported by MQBench.
leaf_module: Modules that the torch.fx tracer should not trace into.
extra_quantizer_dict: Extra QAT modules to add.
resume_model: The path to your float pretrained model.
tocaffe_friendly: Setting this to true is recommended; it affects the exported ONNX model.
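As a rough illustration of how these settings relate to each other, the sketch below mirrors them in a plain Python dict and derives which stages would run from ptq_only. The flat layout, every value (backend spelling, observer names, checkpoint path), the REQUIRED_KEYS tuple and the plan_stages helper are assumptions made for illustration only. The real settings live in the YAML configs of united-perception, and their exact nesting should be taken from the example config files listed earlier.

```python
# Hypothetical, flattened view of the quantization settings described above.
# Key names come from this guide; all values are illustrative placeholders.
quant_cfg = {
    "deploy_backend": "tensorrt",        # assumed spelling of the backend name
    "ptq_only": False,                   # False: PTQ calibration, then QAT
    "extra_qconfig_dict": {              # assumed MQBench-style qconfig keys
        "w_observer": "MinMaxObserver",
        "a_observer": "EMAMinMaxObserver",
    },
    "leaf_module": [],                   # modules torch.fx must not trace into
    "extra_quantizer_dict": {},          # extra QAT modules, if any
    "resume_model": "path/to/float_pretrained.pth",  # placeholder path
    "tocaffe_friendly": True,
}

# Assumption: the minimum set of keys this sketch needs in order to plan a run.
REQUIRED_KEYS = ("deploy_backend", "ptq_only", "resume_model")


def plan_stages(cfg):
    """Return the stages implied by the config: PTQ only, or PTQ then QAT."""
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise KeyError(f"missing quantization settings: {missing}")
    return ["ptq"] if cfg["ptq_only"] else ["ptq", "qat"]


stages = plan_stages(quant_cfg)
```

With ptq_only set to False, as in the dict above, plan_stages reports PTQ calibration followed by QAT; setting it to True leaves only the PTQ stage.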
3. Resume training during QAT.
cd united-perception/scripts
# Just set resume_model in the config file to your model; the rest is handled automatically.
sh dist_train_quant.sh
4. Evaluate your quantized model.
cd united-perception/scripts
# Set resume_model in the config file to your model.
# Add -e to train_quant.sh.
sh dist_train_quant.sh
Introduction to the UP-MQBench Project
The training code starts in united-perception/commands/train.py. The deployment code starts in united-perception/commands/quant_deploy.py.
When you set the runner type to quant in the config file, QuantRunner, defined in united-perception/runner/quant_runner.py, is executed.
First, build your float model in self.build_model().
Load your float pretrained model or quantized model in self.load_ckpt().
Use torch.fx to trace your model in self.quantize_model().
Set your optimizer and lr scheduler in self.build_trainer().
Run PTQ calibration and evaluation in self.calibrate().
Train in self.train().
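The order of these stages can be pictured with a small stand-in class. This is only a sketch of the sequence listed above, not the actual QuantRunner: the class name SketchQuantRunner, the run() method, the calls list, and the choice to skip train() when ptq_only is set are all assumptions made for illustration. Each method just records that it was called.

```python
class SketchQuantRunner:
    """Stand-in that only records the order of the stages listed above."""

    def __init__(self, ptq_only=False):
        self.ptq_only = ptq_only
        self.calls = []  # names of the stages that ran, in order

    def build_model(self):
        self.calls.append("build_model")      # build the float model

    def load_ckpt(self):
        self.calls.append("load_ckpt")        # load float or quantized weights

    def quantize_model(self):
        self.calls.append("quantize_model")   # trace with torch.fx, insert quant nodes

    def build_trainer(self):
        self.calls.append("build_trainer")    # optimizer and lr scheduler

    def calibrate(self):
        self.calls.append("calibrate")        # PTQ calibration and evaluation

    def train(self):
        self.calls.append("train")            # QAT fine-tuning

    def run(self):
        """Hypothetical driver: run every stage, skipping QAT if ptq_only."""
        self.build_model()
        self.load_ckpt()
        self.quantize_model()
        self.build_trainer()
        self.calibrate()
        if not self.ptq_only:
            self.train()
        return self.calls


order = SketchQuantRunner().run()
```

Reading the recorded list back makes it easy to see that calibration always precedes training, and that a PTQ-only run stops after calibrate().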
Important notes:
Your model should be split into a network and post-processing. torch.fx should trace only the network.
The quantized model should be saved with the key qat, as shown in self.save(). This key is used in self.resume_model_from_fp() and self.resume_model_from_quant().
EMA is disabled during QAT. If your checkpoint has EMA state, the EMA state is loaded into the model, as shown in self.load_ckpt().
Be careful when your quantized model has extra learnable parameters. Check that the optimizer handles them, for example in united-perception/tasks/det/plugins/yolov5/utils/optimizer_helper.py. LSQ has been checked.
When deploying the model, self.model.deploy should be set to True, as shown in united-perception/apis/quant_deploy.py. This removes redundant nodes from your model.
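The notes on the qat key and on EMA state can be sketched with plain dictionaries. The checkpoint layout below (a "model" entry, an optional "ema" entry, and a boolean "qat" flag) and the helper names save_checkpoint, choose_resume_path and pick_weights are assumptions for illustration; the actual logic is in self.save() and self.load_ckpt() of the QuantRunner and may differ in detail.

```python
def save_checkpoint(state_dict, quantized):
    """Build a checkpoint dict, tagging quantized weights with a 'qat' key."""
    ckpt = {"model": dict(state_dict)}
    if quantized:
        ckpt["qat"] = True  # marks the checkpoint as coming from a quantized model
    return ckpt


def choose_resume_path(ckpt):
    """Name the resume routine that matches the checkpoint type."""
    if ckpt.get("qat"):
        return "resume_model_from_quant"
    return "resume_model_from_fp"


def pick_weights(ckpt):
    """Prefer EMA weights when present, since EMA is disabled during QAT."""
    if "ema" in ckpt:
        return ckpt["ema"]
    return ckpt["model"]


float_ckpt = save_checkpoint({"w": 1.0}, quantized=False)
quant_ckpt = save_checkpoint({"w": 1.0}, quantized=True)
```

Here choose_resume_path(float_ckpt) selects the float resume routine and choose_resume_path(quant_ckpt) selects the quantized one, while pick_weights falls back to the plain model weights whenever no EMA state is stored.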