Which Pretrain Samples to Rehearse when Finetuning Pretrained Models?
CoRR (2024)
Abstract
Fine-tuning pretrained foundational models on specific tasks is now the de
facto approach for text and vision tasks. A known pitfall of this approach is
the forgetting of pretraining knowledge that happens during finetuning.
Rehearsing samples randomly from the pretrain dataset is a common approach to
alleviate such forgetting. However, we find that random mixing unintentionally
includes samples that are not (yet) forgotten, as well as samples that are
unlearnable by the model. We
propose a novel sampling scheme, mix-cd, that identifies and prioritizes
samples that actually face forgetting, which we call collateral damage. Since
directly identifying collateral damage samples is computationally expensive, we
propose a procedure to estimate the distribution of such samples by tracking
the statistics of finetuned samples. Our approach is lightweight, easy to
implement, and can be seamlessly integrated into existing models, offering an
effective means to retain pretrain performance without additional computational
costs.
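The abstract describes the mix-cd idea at a high level: flag pretrain samples that the pretrained model handled correctly but the finetuned model no longer does (collateral damage), estimate where such samples concentrate from tracked statistics rather than a full scan, and bias rehearsal sampling toward those regions. Below is a minimal Python sketch of that idea under stated assumptions; the Sample fields, the bucket-based pooling of statistics, and all helper names (is_collateral_damage, estimate_cd_rates, sample_rehearsal) are illustrative inventions, not the paper's actual implementation.

```python
"""A minimal sketch of collateral-damage-aware rehearsal in the spirit of
mix-cd. The bucketing scheme, helper names, and predictor interface are
illustrative assumptions, not the paper's actual method or API."""
import random
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List

@dataclass
class Sample:
    x: float          # input (stand-in for a real example)
    y: int            # label
    bucket: Hashable  # coarse region used to pool statistics (assumption)

Predictor = Callable[[float], int]  # any model exposed as x -> predicted label

def is_collateral_damage(s: Sample, pretrained: Predictor,
                         finetuned: Predictor) -> bool:
    # A pretrain sample is "collateral damage" if the pretrained model got it
    # right but the finetuned model no longer does. Samples the pretrained
    # model already missed are treated as unlearnable and not worth rehearsing.
    return pretrained(s.x) == s.y and finetuned(s.x) != s.y

def estimate_cd_rates(probe: List[Sample], pretrained: Predictor,
                      finetuned: Predictor) -> Dict[Hashable, float]:
    # Estimate the per-bucket collateral-damage rate from a small probe set,
    # avoiding an expensive check of every sample in the pretrain dataset.
    hits: Dict[Hashable, int] = defaultdict(int)
    totals: Dict[Hashable, int] = defaultdict(int)
    for s in probe:
        totals[s.bucket] += 1
        if is_collateral_damage(s, pretrained, finetuned):
            hits[s.bucket] += 1
    return {b: hits[b] / totals[b] for b in totals}

def sample_rehearsal(pool: List[Sample], cd_rates: Dict[Hashable, float],
                     k: int) -> List[Sample]:
    # Weight each pretrain sample by its bucket's estimated damage rate, so
    # rehearsal prioritizes regions that are actually being forgotten.
    weights = [cd_rates.get(s.bucket, 0.0) + 1e-6 for s in pool]
    return random.choices(pool, weights=weights, k=k)
```

The design point this sketch mirrors is the abstract's central claim: directly identifying collateral-damage samples is computationally expensive, so their distribution is estimated cheaply from tracked statistics, and rehearsal is then drawn in proportion to that estimate instead of uniformly at random.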