Misaligned Visible-Thermal Object Detection: A Drone-based Benchmark and Baseline

Kechen Song, Xiaotong Xue, Hongwei Wen, Yingying Ji, Yunhui Yan, Qinggang Meng

IEEE Transactions on Intelligent Vehicles (2024)

Abstract

In recent years, multispectral object detection has achieved remarkable results thanks to its ability to fuse information from the visible and thermal modalities. However, existing visible-thermal datasets are built from manually aligned image pairs, which cannot fully represent real-world scenarios where image pairs are often misaligned. Existing visible-thermal detection methods likewise assume aligned data and are limited by the accuracy of registration. To address these issues, we propose DVTOD, a misaligned visible-thermal object detection dataset captured by drones. DVTOD covers 16 challenging attributes and 54 capture scenes. Furthermore, we introduce a cross-modal alignment detector (CMA-Det) for misaligned visible-thermal object detection. First, we design an alignment network that estimates the visible-to-thermal deformation field, which is used to correct the misalignment between corresponding visible and thermal features. Second, we propose a strategy called Object Search Rectification (OSR) to improve the robustness of feature alignment. Finally, to better remove the interference of complex backgrounds, we design a bi-directional feature correction fusion module (BFCFM) that calibrates the bimodal features by exploiting the correlation of channel and spatial information between the two modalities. CMA-Det outperforms existing methods on the DVTOD dataset and on two other visible-thermal object detection datasets. The dataset and code will be published at https://github.com/VDT-2048/DVTOD .
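The abstract does not say how the estimated deformation field is applied to the features. As a rough illustration only, and not the authors' CMA-Det implementation, the sketch below warps a single-channel feature map with a dense per-pixel offset field using bilinear interpolation. The function name `warp_feature_map`, the `(dy, dx)` offset layout, and the zero padding outside the map are all assumptions made for this example.

```python
import math


def warp_feature_map(feature, flow):
    """Warp a 2-D feature map with a dense deformation field (hypothetical helper).

    feature: H x W nested list of floats, e.g. one channel of visible features.
    flow:    H x W nested list of (dy, dx) offsets. The output at (y, x) samples
             the input at (y + dy, x + dx) using bilinear interpolation.
    Samples that fall outside the map are treated as zero (an assumption).
    """
    h, w = len(feature), len(feature[0])

    def at(y, x):
        # Zero padding for out-of-range coordinates.
        return feature[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy, sx = y + dy, x + dx
            y0, x0 = math.floor(sy), math.floor(sx)
            wy, wx = sy - y0, sx - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - wx) * at(y0, x0) + wx * at(y0, x0 + 1)
            bottom = (1 - wx) * at(y0 + 1, x0) + wx * at(y0 + 1, x0 + 1)
            out[y][x] = (1 - wy) * top + wy * bottom
    return out


# Toy example: a single activation at row 1, column 2 of a 3 x 3 map.
feature = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0],
           [0.0, 0.0, 0.0]]

# Every output pixel samples one pixel to its right, shifting content left.
shift_left = [[(0.0, 1.0)] * 3 for _ in range(3)]
warped = warp_feature_map(feature, shift_left)

# A half-pixel offset blends the two horizontal neighbours.
half_shift = [[(0.0, 0.5)] * 3 for _ in range(3)]
blended = warp_feature_map(feature, half_shift)
```

In a detector, a warp of this kind would be applied to every channel of the visible feature map so that it lines up with the thermal features before fusion; in practice it would normally be done with a differentiable sampling operator in a deep learning framework rather than with Python loops.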

Keywords

Multispectral object detection, visible-thermal dataset, cross-modal alignment, feature alignment
