UAV-Ground Visual Tracking: A Unified Dataset and Collaborative Learning Approach.

Dengdi Sun, Leilei Cheng, Song Chen, Chenglong Li, Yun Xiao, Bin Luo

IEEE Trans. Circuits Syst. Video Technol. (2024)

Visual tracking from the ground view and from the UAV view has received increasing attention because of its wide range of practical applications. The two views provide strongly complementary descriptions of the target object: the ground view captures detailed appearance, while the UAV view captures global motion information. Combining them has the potential to make a tracking system more robust. However, no work has studied this problem in depth, and accurately combining ground-view and UAV-view information is challenging. To fill this gap, we propose a new computer vision task called UAV-Ground visual tracking. Given the lack of relevant data and methods, we first build a unified video dataset called UGVT, which contains 210 pairs of high-resolution UAV and ground video sequences with more than 204K frames in total and can serve as a comprehensive evaluation platform for relevant tracking methods. Second, based on this dataset, we propose a co-learning method called MvCL to fuse the information from the ground and UAV views. It first associates the same tracking target across the two views using a cross-attention operation and then fuses the complementary information of the two views. As a plug-and-play module based on the Transformer structure, the method can be flexibly embedded into different tracking frameworks. Extensive experiments on the newly created dataset demonstrate that the proposed method improves the robustness of the tracking system compared with 10 state-of-the-art tracking methods, and they indicate the promise and significance of UAV-Ground visual tracking research. The dataset is available at:
Visual tracking, Transformer, UAV and ground views, Benchmark dataset, Collaborative learning
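
The abstract describes associating the target across the two views with cross-attention and then fusing the complementary information. The following is only a minimal sketch of that general idea, not the MvCL module itself: it assumes single-head scaled dot-product attention with no learned projections and a simple residual sum as the fusion step. The names `cross_attention` and `fuse_views`, and the toy token values, are hypothetical and chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    Each query token (from one view) attends over the tokens of the
    other view. No learned projections are used (an assumption made
    to keep the sketch short).
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended


def fuse_views(ground_tokens, uav_tokens):
    """Hypothetical fusion: add the features attended from the other
    view back onto each view's own tokens (residual sum)."""
    ground_from_uav = cross_attention(ground_tokens, uav_tokens, uav_tokens)
    uav_from_ground = cross_attention(uav_tokens, ground_tokens, ground_tokens)
    fused_ground = [[g + a for g, a in zip(tok, att)]
                    for tok, att in zip(ground_tokens, ground_from_uav)]
    fused_uav = [[u + a for u, a in zip(tok, att)]
                 for tok, att in zip(uav_tokens, uav_from_ground)]
    return fused_ground, fused_uav


# Toy feature tokens: 3 ground-view tokens and 2 UAV-view tokens, dimension 2.
ground = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uav = [[1.0, 0.0], [0.0, 1.0]]
fused_ground, fused_uav = fuse_views(ground, uav)
```

Each view keeps its own token count after fusion, which is one reason a module of this shape can be dropped into an existing tracker without changing the downstream heads. A real implementation would add learned query, key, and value projections, multiple heads, and normalization.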