1. Single-shot bidirectional pyramid networks for high-quality object detection
- Author
Xiongwei Wu, Jianke Zhu, Daoxin Zhang, Steven C. H. Hoi, and Doyen Sahoo
- Subjects
0209 industrial biotechnology, Computer science, business.industry, Cognitive Neuroscience, Deep learning, Detector, Single shot, Pattern recognition, 02 engineering and technology, Pascal (programming language), Object detection, Computer Science Applications, 020901 industrial engineering & automation, Artificial Intelligence, Pyramid, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, computer.programming_language
- Abstract
Recent years have witnessed significant advances in deep-learning-based object detection. Despite being extensively explored, most existing detectors are designed to detect objects with relatively low-quality prediction of locations, i.e., they are often trained with the Intersection over Union (IoU) threshold set to 0.5. This can yield low-quality or even noisy detections. Designing high-quality object detectors with more precise localization (e.g. IoU > 0.5) remains an open challenge. In this paper, we propose a novel single-shot detection framework called Bidirectional Pyramid Networks (BPN) for high-quality object detection. It comprises two novel components: (i) a Bidirectional Feature Pyramid structure and (ii) Anchor Refinement (AR). The bidirectional feature pyramid structure uses semantically rich deep-layer features to enhance the quality of the shallow-layer features, and simultaneously uses the spatially rich shallow-layer features to enhance the quality of the deep-layer features, leading to a stronger representation of both small and large objects for high-quality detection. Our anchor refinement scheme gradually refines the quality of pre-designed anchors by learning multi-level regressors, giving more precise localization predictions. We performed extensive experiments on both the PASCAL VOC and MSCOCO datasets, and achieved the best performance among all single-shot detectors. The performance was especially superior in the regime of high-quality detection.
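To make the IoU threshold mentioned in the abstract concrete, the following is a minimal sketch of the standard IoU computation for two axis-aligned boxes. The function name `iou`, the box coordinates, and the stricter 0.75 threshold are illustrative assumptions, not values taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted 20 px to the right.
ground_truth = (0.0, 0.0, 100.0, 100.0)
prediction = (20.0, 0.0, 120.0, 100.0)

score = iou(ground_truth, prediction)
accepted_at_050 = score >= 0.5    # the common, lenient criterion
accepted_at_075 = score >= 0.75   # an assumed stricter criterion
print(f"IoU = {score:.3f}, IoU>=0.5: {accepted_at_050}, IoU>=0.75: {accepted_at_075}")
```

In this example the shifted box overlaps the ground truth with an IoU of about 0.667, so it counts as a correct detection at the 0.5 threshold but not at the stricter one. That gap is what the abstract refers to as the regime of high-quality detection.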
- Published
- 2020