CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding
CoRR (2024)
Abstract
This paper introduces a novel approach named CrossVideo, which aims to
enhance self-supervised cross-modal contrastive learning in the field of point
cloud video understanding. Traditional supervised learning methods encounter
limitations due to data scarcity and challenges in label acquisition. To
address these issues, we propose a self-supervised learning method that
leverages the cross-modal relationship between point cloud videos and image
videos to acquire meaningful feature representations. Intra-modal and
cross-modal contrastive learning techniques are employed to facilitate
effective comprehension of point cloud video. We also propose a multi-level
contrastive approach for both modalities. Through extensive experiments, we
demonstrate that our method significantly surpasses previous state-of-the-art
approaches, and we conduct comprehensive ablation studies to validate the
effectiveness of our proposed designs.
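
To make the combination of intra-modal and cross-modal contrastive objectives concrete, the following is a minimal sketch using a generic InfoNCE loss. It is not the authors' implementation: the function names (`info_nce`, `combined_loss`), the temperature of 0.1, the symmetric cross-modal term, and the equal weighting of the two terms are all assumptions made for illustration, and the multi-level aspect of the method is not modeled. Features are assumed to be per-clip embeddings already produced by some point cloud video encoder and image video encoder.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] is pulled toward positives[i] and pushed away from every
    other positives[j] in the batch (in-batch negatives).
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def combined_loss(pc_view1, pc_view2, img_feats, temperature=0.1):
    """Hypothetical combination: intra-modal term plus cross-modal term.

    pc_view1, pc_view2: embeddings of two augmented views of the same
                        point cloud video clips (assumed inputs).
    img_feats:          embeddings of the paired image video clips.
    """
    # Intra-modal: two views of the same point cloud clip should agree.
    intra = info_nce(pc_view1, pc_view2, temperature)
    # Cross-modal: a point cloud clip should match its paired image clip,
    # averaged over both matching directions (an assumption of this sketch).
    cross = 0.5 * (
        info_nce(pc_view1, img_feats, temperature)
        + info_nce(img_feats, pc_view1, temperature)
    )
    return intra + cross  # equal weighting is an assumption


# Toy usage: two clips with 2-D embeddings.
pc_a = [[1.0, 0.0], [0.0, 1.0]]
pc_b = [[0.9, 0.1], [0.1, 0.9]]
img_aligned = [[1.0, 0.0], [0.0, 1.0]]
img_shuffled = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = combined_loss(pc_a, pc_b, img_aligned)
loss_shuffled = combined_loss(pc_a, pc_b, img_shuffled)
```

In this toy setup the loss is lower when each point cloud clip is paired with its matching image clip than when the pairing is shuffled, which is the behavior a cross-modal contrastive objective is meant to reward.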