Explicit Interaction for Fusion-Based Place Recognition
CoRR (2024)
Abstract
Fusion-based place recognition is an emerging technique that jointly uses
multi-modal perception data to recognize previously visited places in
GPS-denied scenarios for robots and autonomous vehicles. Recent fusion-based
place recognition methods combine multi-modal features implicitly.
While achieving remarkable results, they do not explicitly consider what each
individual modality affords in the fusion system. Therefore, the benefit of
multi-modal feature fusion may not be fully exploited. In this paper, we propose
a novel fusion-based network, dubbed EINet, to achieve explicit interaction of
the two modalities. EINet uses LiDAR ranges to supervise vision features so
that they remain robust over long time spans, and simultaneously uses camera
RGB data to improve the discriminability of LiDAR point clouds. In addition, we develop a new
benchmark for the place recognition task based on the nuScenes dataset. To
establish this benchmark for future research with comprehensive comparisons, we
introduce both supervised and self-supervised training schemes alongside
evaluation protocols. We conduct extensive experiments on the proposed
benchmark, and the experimental results show that our EINet exhibits better
recognition performance as well as solid generalization ability compared to the
state-of-the-art fusion-based place recognition approaches. Our open-source
code and benchmark are released at: https://github.com/BIT-XJY/EINet.