MonoCD: Monocular 3D Object Detection with Complementary Depths
CVPR 2024
Abstract
Monocular 3D object detection has attracted widespread attention due to its
potential to accurately obtain object 3D localization from a single image at a
low cost. Depth estimation is an essential but challenging subtask of monocular
3D object detection due to the ill-posedness of 2D to 3D mapping. Many methods
explore multiple local depth clues such as object heights and keypoints and
then formulate the object depth estimation as an ensemble of multiple depth
predictions to mitigate the insufficiency of single-depth information. However,
the errors of existing multiple depths tend to have the same sign, which
hinders them from neutralizing each other and limits the overall accuracy of
the combined depth. To alleviate this problem, we propose to increase the
complementarity of depths with two novel designs. First, we add a new depth
prediction branch named complementary depth that utilizes global and efficient
depth clues from the entire image rather than the local clues to reduce the
correlation of depth predictions. Second, we propose to fully exploit the
geometric relations between multiple depth clues to achieve complementarity in
form. Benefiting from these designs, our method achieves higher
complementarity. Experiments on the KITTI benchmark demonstrate that our method
achieves state-of-the-art performance without introducing extra data. In
addition, complementary depth can also serve as a lightweight and plug-and-play
module to boost multiple existing monocular 3D object detectors. Code is
available at https://github.com/elvintanhust/MonoCD.
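The core motivation can be illustrated with a small numeric sketch (not the paper's code; the values and the plain averaging are hypothetical): when all depth cues err in the same direction, their ensemble keeps the bias, whereas complementary cues with opposite-sign errors cancel.

```python
# Illustrative sketch of the abstract's argument, with made-up numbers:
# averaging depth cues whose errors share a sign preserves the bias,
# while opposite-sign (complementary) errors neutralize each other.
true_depth = 30.0  # hypothetical ground-truth object depth in metres

# Local depth cues that all overestimate (same-sign errors):
same_sign = [31.0, 31.5, 32.0]
# Cues made complementary: errors of opposite sign around the truth:
complementary = [31.0, 31.5, 27.5]

def ensemble(preds):
    """Simple average, standing in for a weighted depth ensemble."""
    return sum(preds) / len(preds)

err_same = abs(ensemble(same_sign) - true_depth)      # bias remains: 1.5 m
err_comp = abs(ensemble(complementary) - true_depth)  # errors cancel: 0.0 m
print(err_same, err_comp)
```

MonoCD's complementary depth branch aims to produce exactly this second situation, using global image-level cues so its error sign is less correlated with the local cues.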