Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
CoRR (2023)
Abstract
Vision Foundation Models (VFMs) pretrained on massive datasets exhibit
impressive performance on various downstream tasks, especially with limited
labeled target data. However, due to their high inference compute cost, these
models cannot be deployed for many real-world applications. Motivated by this,
we ask the following important question, "How can we leverage the knowledge
from a large VFM to train a small task-specific model for a new target task
with limited labeled training data?", and propose a simple task-oriented
knowledge transfer approach as a highly effective solution to this problem. Our
experimental results on five target tasks show that the proposed approach
outperforms task-agnostic VFM distillation, web-scale CLIP pretraining,
supervised ImageNet pretraining, and self-supervised DINO pretraining by up to
11.6%. The proposed approach also demonstrates up to 9x, 4x and 15x reduction in pretraining
compute cost when compared to task-agnostic VFM distillation, ImageNet
pretraining and DINO pretraining, respectively, while outperforming them. We
also show that the dataset used for transferring knowledge has a significant
effect on the final target task performance, and introduce a
retrieval-augmented knowledge transfer strategy that uses web-scale image
retrieval to curate effective transfer sets.
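
The task-oriented knowledge transfer the abstract describes can be illustrated with a short sketch. Below is a minimal PyTorch-style version for a classification target task, assuming a three-stage recipe: fine-tune the VFM (with a task head) on the limited labeled target data, distill its task predictions into the small model on a transfer set, then fine-tune the student on the labeled data. The function and loader names, the temperature-scaled KL loss, and the hyperparameters are all illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of task-oriented knowledge transfer (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

def finetune(model, labeled_loader, epochs=5, lr=1e-4):
    """Supervised fine-tuning on the limited labeled target-task data."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in labeled_loader:
            loss = F.cross_entropy(model(images), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def distill(teacher, student, transfer_loader, epochs=10, lr=1e-3, T=2.0):
    """Task-oriented distillation: the student matches the fine-tuned
    teacher's softened task predictions on an (unlabeled) transfer set."""
    opt = torch.optim.AdamW(student.parameters(), lr=lr)
    teacher.eval()
    student.train()
    for _ in range(epochs):
        for images in transfer_loader:  # transfer set needs no labels
            with torch.no_grad():
                t_logits = teacher(images)
            s_logits = student(images)
            # KL divergence between temperature-softened distributions.
            loss = F.kl_div(
                F.log_softmax(s_logits / T, dim=-1),
                F.softmax(t_logits / T, dim=-1),
                reduction="batchmean",
            ) * T * T
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

# Usage with placeholder models/loaders:
#   vfm = finetune(vfm_with_task_head, labeled_loader)
#   student = distill(vfm, small_model, transfer_loader)
#   student = finetune(student, labeled_loader)  # final fine-tuning
```

Fine-tuning the teacher on the target task first is what makes the transfer "task-oriented" rather than task-agnostic: the student mimics task-relevant predictions instead of generic VFM features.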
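The retrieval-augmented transfer strategy can likewise be sketched: embed the target-task images and a large unlabeled web-scale pool with a shared image encoder (e.g. CLIP), then keep the pool images most similar to the target data as the transfer set. The cosine-similarity retrieval below is a hedged sketch; the feature tensors, the encoder choice, and the value of k are assumptions.

```python
# Hedged sketch of retrieval-augmented transfer-set curation.
import torch
import torch.nn.functional as F

def curate_transfer_set(target_feats, pool_feats, k=1000):
    """Return indices of the k pool images whose (cosine) similarity to
    their nearest target image is highest; these form the transfer set.
    target_feats: (n_target, d), pool_feats: (n_pool, d)."""
    t = F.normalize(target_feats, dim=-1)
    p = F.normalize(pool_feats, dim=-1)
    sims = p @ t.T                      # (n_pool, n_target) similarities
    best = sims.max(dim=1).values       # best target match per pool image
    return best.topk(k).indices         # top-k pool images to retrieve
```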