Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization
CoRR (2024)
Abstract
Effectively and efficiently retrieving images from remote sensing databases
is a critical challenge in the realm of remote sensing big data. Utilizing
hand-drawn sketches as retrieval inputs offers intuitive and user-friendly
advantages, yet the potential of multi-level feature integration from sketches
remains underexplored, leading to suboptimal retrieval performance. To address
this gap, our study introduces a novel zero-shot, sketch-based retrieval method
for remote sensing images, leveraging multi-level, attention-guided
tokenization. This approach starts by employing multi-level self-attention
feature extraction to tokenize the query sketches, as well as self-attention
feature extraction to tokenize the candidate images. It then employs
cross-attention mechanisms to establish token correspondence between these two
modalities, facilitating the computation of sketch-to-image similarity. Our
method demonstrates superior retrieval accuracy over existing sketch-based
remote sensing image retrieval techniques, as evidenced by tests on four
datasets. Notably, it also exhibits robust zero-shot learning capabilities and
strong generalizability in handling unseen categories and novel remote sensing
data. The method's scalability can be further enhanced by the pre-calculation
of retrieval tokens for all candidate images in a database. This research
underscores the significant potential of multi-level, attention-guided
tokenization in cross-modal remote sensing image retrieval. The code and
dataset used in this study are publicly available at
https://github.com/Snowstormfly/Cross-modal-retrieval-MLAGT.
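
The retrieval step described in the abstract (cross-attention between sketch tokens and image tokens, followed by a similarity score, with image tokens computed ahead of time) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes tokens are already available as plain lists of floats, uses single-level tokens rather than the multi-level ones the method proposes, and scores each pair by the mean cosine similarity between a sketch token and its attention-weighted image counterpart. All function names and the toy `token_index` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def cross_attend(sketch_tokens, image_tokens):
    """For each sketch token (query), mix image tokens (keys/values)
    using scaled dot-product attention weights."""
    dim = len(sketch_tokens[0])
    scale = math.sqrt(dim)
    aligned = []
    for q in sketch_tokens:
        weights = softmax([dot(q, k) / scale for k in image_tokens])
        mixed = [sum(w * k[d] for w, k in zip(weights, image_tokens))
                 for d in range(dim)]
        aligned.append(mixed)
    return aligned


def sketch_to_image_similarity(sketch_tokens, image_tokens):
    # Assumed scoring rule: mean cosine similarity between each sketch
    # token and its attention-aligned image token.
    aligned = cross_attend(sketch_tokens, image_tokens)
    sims = [cosine(q, a) for q, a in zip(sketch_tokens, aligned)]
    return sum(sims) / len(sims)


def rank(sketch_tokens, token_index):
    """Rank candidate images by similarity to the query sketch."""
    scores = {name: sketch_to_image_similarity(sketch_tokens, tokens)
              for name, tokens in token_index.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical image tokens, pre-calculated once for the whole database.
token_index = {
    "river": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "airport": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}

# Hypothetical tokens extracted from a query sketch.
query = [[1.0, 0.05, 0.0], [0.8, 0.2, 0.0]]
ranking = rank(query, token_index)
```

Because `token_index` is built once and reused for every query, only the sketch has to be tokenized at query time, which is the scalability point the abstract makes about pre-calculating retrieval tokens.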