DA-BEV: Depth Aware BEV Transformer for 3D Object Detection

arXiv (2023)

Abstract
In this paper, we present DA-BEV, an implicit depth learning method for Transformer-based, camera-only 3D object detection in bird's-eye view (BEV). First, we propose a Depth-Aware Spatial Cross-Attention (DA-SCA) module that takes depth into account when querying image features to construct BEV features. Then, to make the BEV features more depth-aware, we introduce an auxiliary learning task, Depth-wise Contrastive Learning (DCL), which samples positive and negative BEV features along each ray connecting an object and a camera. DA-SCA and DCL jointly improve the BEV representation and make it more depth-aware. DA-BEV obtains a significant improvement (+2.8 NDS) on nuScenes val over the baseline method BEVFormer under the same setting. DA-BEV also achieves strong results of 60.0 NDS and 51.5 mAP on nuScenes test with a pre-trained VoVNet-99 backbone. We will release our code.
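The abstract says only that DCL samples positive and negative BEV features along each object-camera ray. It does not give the sampling rule. As a rough illustration of that idea, the sketch below samples evenly spaced BEV points along such a ray and labels those near the object's distance as positives. The function name `sample_ray_points`, the parameters `num_samples`, `max_range`, and `pos_radius`, and the distance-threshold rule are all hypothetical assumptions, not the authors' implementation.

```python
import math


def sample_ray_points(camera_xy, object_xy, num_samples=8,
                      max_range=50.0, pos_radius=2.0):
    """Sample BEV points along the ray from a camera through an object.

    Hypothetical labeling rule (an assumption, not from the paper):
    a point is positive if its distance from the camera is within
    `pos_radius` of the object's distance, and negative otherwise.
    Returns a list of (x, y, is_positive) tuples.
    """
    cx, cy = camera_xy
    ox, oy = object_xy
    dx, dy = ox - cx, oy - cy
    obj_dist = math.hypot(dx, dy)
    if obj_dist == 0.0:
        raise ValueError("object and camera must not coincide")

    # Unit direction of the ray in the BEV plane.
    ux, uy = dx / obj_dist, dy / obj_dist

    samples = []
    for i in range(num_samples):
        # Evenly spaced distances in (0, max_range].
        d = max_range * (i + 1) / num_samples
        is_positive = abs(d - obj_dist) <= pos_radius
        samples.append((cx + d * ux, cy + d * uy, is_positive))
    return samples


if __name__ == "__main__":
    for x, y, pos in sample_ray_points((0.0, 0.0), (10.0, 0.0), num_samples=5):
        print(f"({x:.1f}, {y:.1f}) positive={pos}")
```

In a full pipeline, the BEV feature at each sampled location would presumably feed a contrastive objective that separates positives from negatives. The abstract does not describe that loss, so it is left out here.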