CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning
arXiv (2024)
Abstract
This paper explores the problem of continual learning (CL) of vision-language
models (VLMs) in open domains, where the models need to perform continual
updating and inference on a stream of datasets from diverse seen and unseen
domains with novel classes. Such a capability is crucial for various
applications in open environments, e.g., AI assistants, autonomous driving
systems, and robotics. Current CL studies mostly focus on closed-set scenarios
in a single domain with known classes. Large pre-trained VLMs like CLIP have
demonstrated superior zero-shot recognition ability, and a number of recent
studies leverage this ability to mitigate catastrophic forgetting in CL, but
they focus on closed-set CL in a single-domain dataset. Open-domain CL of large
VLMs is significantly more challenging due to 1) large class correlations and
domain gaps across the datasets and 2) the forgetting of zero-shot knowledge in
the pre-trained VLMs in addition to the knowledge learned from the newly
adapted datasets. In this work, we introduce a novel approach, termed CoLeCLIP,
that learns an open-domain CL model based on CLIP. It addresses these
challenges by jointly learning a set of task prompts and a cross-domain
class vocabulary. Extensive experiments on 11 domain datasets show that
CoLeCLIP outperforms state-of-the-art methods for open-domain CL under both
task- and class-incremental learning settings.