For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
CoRR (2024)
Abstract
Data augmentation has been pivotal in successfully training deep learning
models on classification tasks over the past decade. An important subclass of
data augmentation techniques, which includes both label smoothing and Mixup,
involves modifying not only the input data but also the input label during
model training. In this work, we analyze the role played by the label
augmentation aspect of such methods. We prove that linear models on linearly
separable data trained with label augmentation learn only the minimum variance
features in the data, while standard training (which includes weight decay) can
learn higher variance features. An important consequence of our results is
negative: label smoothing and Mixup can be less robust to adversarial
perturbations of the training data when compared to standard training. We
verify that our theory reflects practice via a range of experiments on
synthetic data and image classification benchmarks.
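To make the "label augmentation" idea concrete, below is a minimal NumPy sketch of how label smoothing and Mixup modify the training labels (not only the inputs). The function names and parameter defaults (alpha, beta_param) are illustrative assumptions chosen for this sketch, not values or code from the paper.

```python
import numpy as np

def smooth_labels(y_onehot, alpha=0.1):
    """Label smoothing: mix each one-hot label with the uniform distribution.

    alpha is the smoothing strength (a common default, assumed here;
    the paper does not prescribe a specific value).
    """
    num_classes = y_onehot.shape[-1]
    return (1.0 - alpha) * y_onehot + alpha / num_classes

def mixup(x1, y1, x2, y2, beta_param=1.0, rng=None):
    """Mixup: form a convex combination of two examples AND their labels.

    beta_param parameterizes the Beta distribution the mixing weight is
    drawn from (a standard choice, assumed here for illustration).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(beta_param, beta_param)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2  # the label itself is augmented
    return x, y

# Example: two one-hot labels for a 3-class problem
y_a = np.array([1.0, 0.0, 0.0])
y_b = np.array([0.0, 1.0, 0.0])
print(smooth_labels(y_a))  # ~[0.933, 0.033, 0.033]: no longer one-hot
x_mix, y_mix = mixup(np.ones(4), y_a, np.zeros(4), y_b)
print(y_mix)               # a soft label strictly between y_a and y_b
```

In both cases the model is trained against soft targets rather than one-hot labels; this shared label-augmentation aspect is the mechanism the paper analyzes.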