Towards RGB-NIR Cross-modality Image Registration and Beyond
CoRR (2024)
Abstract
This paper focuses on the area of RGB (visible)-NIR (near-infrared)
cross-modality image registration, which is crucial for many downstream vision
tasks to fully leverage the complementary information present in visible and
infrared images. In this field, researchers face two primary challenges: the
absence of a correctly annotated benchmark with viewpoint variations for
evaluating RGB-NIR cross-modality registration methods and the problem of
inconsistent local features caused by the appearance discrepancy between
RGB-NIR cross-modality images. To address these challenges, we first present
the RGB-NIR Image Registration (RGB-NIR-IRegis) benchmark, which, for the first
time, enables fair and comprehensive evaluations for the task of RGB-NIR
cross-modality image registration. Evaluations of previous methods highlight
the significant challenges posed by our RGB-NIR-IRegis benchmark, especially on
RGB-NIR image pairs with viewpoint variations. To analyze the causes of this
unsatisfactory performance, we then design several metrics to reveal the
detrimental impact of inconsistent local features between visible and infrared images on
the model performance. This further motivates us to develop a baseline method
named Semantic Guidance Transformer (SGFormer), which utilizes high-level
semantic guidance to mitigate the negative impact of inconsistent local
features. Despite the simplicity of our motivation, extensive experimental
results show the effectiveness of our method.