SiTPose: A Siamese Convolutional Transformer for Relative Camera Pose Estimation

Kai Leng, Cong Yang, Wei Sui, Jie Liu, Zhijun Li

2023 IEEE International Conference on Multimedia and Expo (ICME), 2023

Abstract
Relative Camera Pose Estimation (RCPE) aims to calculate the translation and rotation between two frames with overlapping regions, a task that is crucial to computer vision and robotics. This paper presents a novel siamese convolutional transformer model, SiTPose, that regresses relative camera pose directly. SiTPose is distinguished in three aspects: (1) With a cross-attention feature extractor and a compact transformer encoder, extreme rotation errors (> 150 degrees) are significantly reduced: from 9.7% with the state-of-the-art 8-Points to 1. on the 7Scenes dataset. (2) SiTPose is also robust to narrow-baseline cases (slight rotation angle and large translation between neighboring frames), whereas existing RCPE methods mainly focus on wide-baseline cases. (3) SiTPose can be flexibly extended to geometry-based vSLAM (namely SiT-SLAM) in a multi-threaded way to prevent tracking loss and scale ambiguity problems. Results on multiple datasets show that SiT-SLAM yields a marked improvement in robustness and localization accuracy in complex scenarios, e.g., the RMSE is reduced from 26.36 m with the classic ORBSLAM3 method to 6.94 m on KITTI-09.
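To make the quantities in the abstract concrete, the following is a minimal sketch of how a relative pose and a rotation error in degrees are commonly computed from two absolute camera poses. It is not the SiTPose model or the authors' evaluation code. It assumes camera-to-world poses (x_world = R x_cam + t), and the helper names `relative_pose`, `rotation_error_deg`, and `rot_z` are hypothetical, introduced only for illustration.

```python
import math

def matmul(a, b):
    # Product of two 3x3 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rot_z(deg):
    # Rotation about the z-axis by `deg` degrees (used only for the demo).
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def relative_pose(R1, t1, R2, t2):
    # Assumption: (R_i, t_i) are camera-to-world poses.
    # The returned pose maps points from frame 1 into frame 2:
    #   x2 = R_rel @ x1 + t_rel, with R_rel = R2^T R1, t_rel = R2^T (t1 - t2).
    R2t = transpose(R2)
    R_rel = matmul(R2t, R1)
    d = [t1[i] - t2[i] for i in range(3)]
    t_rel = [sum(R2t[i][k] * d[k] for k in range(3)) for i in range(3)]
    return R_rel, t_rel

def rotation_error_deg(R_est, R_gt):
    # Geodesic angle between two rotations, from the trace of R_est^T R_gt.
    R = matmul(transpose(R_est), R_gt)
    cos_angle = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

# Demo: frame 1 is rotated 10 degrees about z, frame 2 is rotated 40 degrees
# and shifted by 1 unit along x.
R_rel, t_rel = relative_pose(rot_z(10), [0.0, 0.0, 0.0],
                             rot_z(40), [1.0, 0.0, 0.0])
good_error = rotation_error_deg(rot_z(-30), R_rel)  # matches ground truth
bad_error = rotation_error_deg(rot_z(150), R_rel)   # an extreme error case
```

Under this convention, an estimate counts as an "extreme rotation error" in the abstract's sense when `rotation_error_deg` exceeds 150 degrees; in the demo, `bad_error` falls in that range while `good_error` is essentially zero.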
Keywords
Relative Pose Estimation, SLAM, Camera Pose Estimation, Cross Attention