301. GPro3D: Deriving 3D BBox from ground plane in monocular 3D object detection.
- Author
-
Yang, Fan, Xu, Xinhao, Chen, Hui, Guo, Yuchen, He, Yuwei, Ni, Kai, and Ding, Guiguang
- Subjects
-
*OBJECT recognition (Computer vision), *MONOCULARS, *COMPUTER vision
- Abstract
Monocular 3D object detection (M3OD) is extremely challenging because the problem is inherently ill-posed. The ground plane prior is a highly informative geometric cue in M3OD, yet most mainstream methods neglect it. This paper introduces an original M3OD framework that leverages the ground plane to directly derive the object's 3D Bounding Box (BBox) and 3D attributes geometrically. We identify and tackle three key factors that limit the applicability of the ground plane: projection point localization, ground plane tilt, and the lack of ground plane annotations. For projection point localization, we propose leveraging the car's explicit and salient wheel pixels, which are easier for the neural network to detect than the bottom vertices or the bottom center of the 3D BBox. To tackle ground plane tilt, we propose a vertical-edge-enhanced horizon line detection algorithm to precisely deduce the ground plane equation. Moreover, wheel pixel and horizon line pseudo-labels can be generated from M3OD labels alone, so the network can be trained without extra data or annotation cost. Extensive experiments demonstrate the effectiveness and superiority of our framework over previous methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
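The geometric idea the abstract relies on, that a pixel known to lie on the ground (such as a wheel contact point) can be lifted to 3D once the ground plane equation is known, can be illustrated with a generic ray-plane intersection. This is a minimal sketch under a standard pinhole camera model, not the paper's implementation: the function name `backproject_to_ground`, the intrinsics, and the 1.65 m camera height are all hypothetical values chosen for illustration, and the camera frame is assumed to have x pointing right, y down, and z forward.

```python
import numpy as np


def backproject_to_ground(pixel, K, plane_normal, plane_offset):
    """Intersect the camera ray through `pixel` with the plane n . X + d = 0.

    All quantities are expressed in the camera frame (x right, y down,
    z forward). Hypothetical helper for illustration only.
    """
    u, v = pixel
    # Direction of the viewing ray through the pixel, scaled so that z = 1.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    denom = plane_normal @ ray
    if abs(denom) < 1e-9:
        raise ValueError("ray is parallel to the ground plane")
    # Solve n . (t * ray) + d = 0 for the ray parameter t.
    t = -plane_offset / denom
    if t <= 0:
        raise ValueError("intersection lies behind the camera")
    return t * ray


# Assumed intrinsics: focal length 720 px, principal point (640, 360).
K = np.array([[720.0, 0.0, 640.0],
              [0.0, 720.0, 360.0],
              [0.0, 0.0, 1.0]])

# Assumed flat, untilted ground 1.65 m below the camera: y - 1.65 = 0.
ground_normal = np.array([0.0, 1.0, 0.0])
ground_offset = -1.65

# A hypothetical wheel contact pixel, 72 px below the principal point.
point = backproject_to_ground((640.0, 432.0), K, ground_normal, ground_offset)
print(point)
```

With these assumed values the pixel maps to a point roughly 16.5 m in front of the camera. A tilted ground plane, which the abstract addresses through horizon line detection, would enter this sketch only as a different `plane_normal` and `plane_offset`.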