
One-stage object detection knowledge distillation via adversarial learning.

Authors :
Dong, Na
Zhang, Yongqiang
Ding, Mingli
Xu, Shibiao
Bai, Yancheng
Source :
Applied Intelligence; Mar2022, Vol. 52 Issue 4, p4582-4598, 17p
Publication Year :
2022

Abstract

Impressive methods for object detection have been proposed based on convolutional neural networks (CNNs); however, they usually rely on computationally expensive deep networks to obtain such significant performance. Knowledge distillation has recently attracted much attention in image classification, since it allows compact models to reduce computation while preserving performance. Moreover, the best-performing deep neural networks often average the outputs of multiple networks, but the memory required to store these networks and the time required to execute them at inference prohibit their use in real-time applications. In this paper, we present a knowledge distillation method for one-stage object detection that can distill a variety of large, complex trained networks into a lightweight network. To transfer diverse knowledge from various trained one-stage detection networks, an adversarial learning strategy is employed as supervision: it guides and optimizes the lightweight student network to recover the knowledge of the teacher networks, while simultaneously training a discriminator module to distinguish teacher features from student features. The proposed method exhibits two predominant advantages: (1) the lightweight student model learns the teacher's knowledge, which contains richer discriminative information than a model trained from scratch; (2) inference is faster than with traditional multi-network ensemble methods. Extensive experiments on the PASCAL VOC and MS COCO datasets verify the effectiveness of the proposed method for one-stage object detection, which obtains 3.43%, 2.48%, and 5.78% mAP improvements for the vgg11-ssd, mobilenetv1-ssd-lite, and mobilenetv2-ssd-lite student networks on the PASCAL VOC 2007 dataset, respectively. Furthermore, with a multi-teacher ensemble, vgg11-ssd gains a remarkable 7.10% improvement.
[ABSTRACT FROM AUTHOR]
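The adversarial supervision described in the abstract pairs a feature-imitation objective with a discriminator that labels features as coming from the teacher or the student. The sketch below illustrates that loss structure only; the linear discriminator, feature dimensions, and balancing weight `lam` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(p, label):
    # binary cross-entropy of discriminator output p against a 0/1 label
    eps = 1e-7
    return -(label * np.log(p + eps) + (1 - label) * np.log(1 - p + eps)).mean()

# Stand-ins for feature maps from teacher and student backbones,
# flattened to vectors (batch of 4, 16-dim); purely illustrative.
f_teacher = rng.normal(size=(4, 16))
f_student = rng.normal(size=(4, 16))

# A toy linear discriminator D(f) = sigmoid(f @ w); the paper's
# discriminator module would be a learned network instead.
w = rng.normal(size=(16,))
d_teacher = sigmoid(f_teacher @ w)
d_student = sigmoid(f_student @ w)

# Discriminator objective: label teacher features 1, student features 0.
loss_d = bce(d_teacher, 1.0) + bce(d_student, 0.0)

# Student objective: an L2 imitation term pulling its features toward the
# teacher's, plus an adversarial term for fooling D into outputting 1.
loss_adv = bce(d_student, 1.0)
loss_imit = ((f_student - f_teacher) ** 2).mean()
lam = 0.1  # assumed balancing weight, not from the paper
loss_student = loss_imit + lam * loss_adv
```

In training, the two losses would be minimized alternately: `loss_d` updates the discriminator, while `loss_student` updates the student network, so the student's features are pushed toward being indistinguishable from the teacher's.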

Details

Language :
English
ISSN :
0924-669X
Volume :
52
Issue :
4
Database :
Complementary Index
Journal :
Applied Intelligence
Publication Type :
Academic Journal
Accession number :
155381335
Full Text :
https://doi.org/10.1007/s10489-021-02634-6