Object-based (yet Class-agnostic) Video Domain Adaptation
CoRR (2023)
Abstract
Existing video-based action recognition systems typically require dense
annotation and struggle in environments where there is a significant distribution
shift relative to the training data. Current methods for video domain
adaptation typically fine-tune the model on a fully annotated subset of
target-domain data or align the representations of the two domains using
bootstrapping or adversarial learning. Inspired by the pivotal role of objects
in recent supervised object-centric action recognition models, we present
Object-based (yet Class-agnostic) Video Domain Adaptation (ODAPT), a simple yet
effective framework for adapting the existing action recognition systems to new
domains by utilizing a sparse set of frames with class-agnostic object
annotations in a target domain. Our model achieves a +6.5 increase when
adapting across kitchens in Epic-Kitchens and a +3.1 increase adapting between
Epic-Kitchens and the EGTEA dataset. ODAPT is a general framework that can also
be combined with previous unsupervised methods, offering a +5.0 boost when
combined with the self-supervised multi-modal method MMSADA and a +1.7 boost
when added to the adversarial-based method TA$^3$N on Epic-Kitchens.
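To make the high-level idea in the abstract concrete, below is a minimal sketch (not the authors' released code) of one way an adaptation step like this could look: a pre-trained action recognition backbone is fine-tuned with the usual supervised action loss on labelled source clips, plus an auxiliary class-agnostic objectness loss computed only on the sparse set of target-domain frames that carry object annotations. All module names, tensor shapes, and the loss weighting `lam` are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: adapting a pre-trained action recognizer to a target
# domain using sparse frames with class-agnostic object annotations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ObjectAdaptedRecognizer(nn.Module):
    """Action recognition backbone with an extra class-agnostic object head."""

    def __init__(self, backbone, num_actions, feat_dim=2048):
        super().__init__()
        self.backbone = backbone                      # assumed video encoder
        self.action_head = nn.Linear(feat_dim, num_actions)
        # Class-agnostic object head: predicts per-frame objectness maps,
        # supervised only on the sparsely annotated target frames.
        self.object_head = nn.Conv2d(feat_dim, 1, kernel_size=1)

    def forward(self, clips):
        # Assumed backbone outputs: pooled clip features (B, D) and a
        # spatial feature map (B, D, H, W).
        feats, fmap = self.backbone(clips)
        return self.action_head(feats), self.object_head(fmap)


def adaptation_step(model, source_batch, target_batch, opt, lam=1.0):
    """One optimization step: supervised action loss on labelled source clips
    plus a class-agnostic objectness loss on annotated target frames."""
    src_clips, src_labels = source_batch              # labelled source domain
    tgt_clips, tgt_obj_masks = target_batch           # binary masks from object boxes

    src_logits, _ = model(src_clips)
    _, tgt_obj_logits = model(tgt_clips)

    cls_loss = F.cross_entropy(src_logits, src_labels)
    obj_loss = F.binary_cross_entropy_with_logits(tgt_obj_logits, tgt_obj_masks)
    loss = cls_loss + lam * obj_loss

    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because the object head is class-agnostic, this kind of auxiliary supervision does not require the target annotations to share the source action or object vocabulary, which is consistent with the abstract's claim that the framework can be layered on top of unsupervised methods such as MMSADA or TA$^3$N.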