Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning
arXiv (2024)
Abstract
Continual learning aims to learn from a stream of continuously arriving data
with minimal forgetting of previously learned knowledge. While previous works
have explored the effectiveness of leveraging the generalizable knowledge from
pre-trained models in continual learning, existing parameter-efficient
fine-tuning approaches focus on the use of a predetermined or task-wise set of
adapters or prompts. However, these approaches still suffer from forgetting due
to task interference on jointly used parameters or restricted flexibility. The
reliance on a static model architecture may lead to the allocation of excessive
parameters that are not essential or, conversely, inadequate adaptation for
downstream tasks, given that the scale and distribution of incoming data are
unpredictable in continual learning. We propose Self-Expansion of pre-trained
models with Modularized Adaptation (SEMA), a novel fine-tuning approach that
automatically decides to reuse or add adapter modules on demand in continual
learning, depending on whether a distribution shift that cannot be handled by
the existing modules is detected at different representation levels. We
design each adapter module to consist of an adapter and a representation
descriptor, specifically implemented as an autoencoder. The representation
descriptor functions as a distributional shift indicator during training and
triggers adapter expansion. To make better use of the adapters, an expandable
weighting router is learned jointly to mix the adapter outputs. In comparisons
with vision-transformer-based continual learning adaptation methods, we
demonstrate that the proposed framework outperforms the state of the art
without memory rehearsal.
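The abstract describes an adapter-plus-descriptor design with on-demand expansion and a learned mixing router. The following is a minimal PyTorch-style sketch of that idea, not the authors' implementation: the class names, the reconstruction-error threshold `shift_threshold`, and the router-expansion details are illustrative assumptions drawn only from the abstract.

```python
# Minimal sketch (not the authors' code): one SEMA-style adapter module as described
# in the abstract -- a bottleneck adapter paired with an autoencoder "representation
# descriptor" whose reconstruction error can flag distribution shift, plus a simple
# router that mixes the outputs of all current adapters. All names and hyperparameters
# (AdapterModule, shift_threshold, etc.) are illustrative assumptions.
import torch
import torch.nn as nn


class AdapterModule(nn.Module):
    """Bottleneck adapter + autoencoder descriptor for one representation level."""

    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.adapter = nn.Sequential(
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim)
        )
        # Autoencoder descriptor: high reconstruction error on incoming features is
        # treated as a signal that this module does not cover the new distribution.
        self.descriptor = nn.Sequential(
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim)
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.adapter(h)

    def reconstruction_error(self, h: torch.Tensor) -> torch.Tensor:
        return ((self.descriptor(h) - h) ** 2).mean(dim=-1)


class MixtureOfAdapters(nn.Module):
    """Expandable set of adapters with a learned router that mixes their outputs."""

    def __init__(self, dim: int, shift_threshold: float = 0.5):
        super().__init__()
        self.dim = dim
        self.shift_threshold = shift_threshold  # assumed hyperparameter
        self.adapters = nn.ModuleList([AdapterModule(dim)])
        self.router = nn.Linear(dim, 1)  # one logit per adapter, expanded on demand

    def maybe_expand(self, h: torch.Tensor) -> bool:
        """Add a new adapter if no existing descriptor reconstructs h well."""
        errors = torch.stack([m.reconstruction_error(h).mean() for m in self.adapters])
        if errors.min() > self.shift_threshold:
            self.adapters.append(AdapterModule(self.dim))
            old = self.router
            self.router = nn.Linear(self.dim, len(self.adapters))
            with torch.no_grad():  # keep previously learned routing weights
                self.router.weight[: old.out_features].copy_(old.weight)
                self.router.bias[: old.out_features].copy_(old.bias)
            return True
        return False

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.router(h), dim=-1)            # [B, num_adapters]
        outputs = torch.stack([m(h) for m in self.adapters], -1)   # [B, dim, num_adapters]
        return h + (outputs * weights.unsqueeze(1)).sum(dim=-1)    # residual mixture


if __name__ == "__main__":
    moa = MixtureOfAdapters(dim=768)
    feats = torch.randn(8, 768)
    moa.maybe_expand(feats)   # may add a module if features look out of distribution
    out = moa(feats)          # mixture-weighted adapter output added to the input
    print(out.shape)          # torch.Size([8, 768])
```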