Benign overfitting in leaky ReLU networks with moderate input dimension
arXiv (2024)
Abstract
The problem of benign overfitting asks whether it is possible for a model to
perfectly fit noisy training data and still generalize well. We study benign
overfitting in two-layer leaky ReLU networks trained with the hinge loss on a
binary classification task. We consider input data that can be decomposed into
the sum of a common signal and a random noise component, which lie on mutually
orthogonal subspaces. We characterize conditions on the signal-to-noise ratio
(SNR) of the model parameters that give rise to benign versus non-benign, or
harmful, overfitting: in particular, if the SNR is high then benign overfitting
occurs; conversely, if the SNR is low then harmful overfitting occurs. We
attribute both benign and non-benign overfitting to an approximate
margin-maximization property and show that leaky ReLU networks trained on the
hinge loss with gradient descent (GD) satisfy this property. In contrast to
prior work, we do not require near-orthogonality conditions on the training
data: notably, for input dimension d and training sample size n, while prior
work shows asymptotically optimal error when d = Ω(n² log n), here we require
only d = Ω(n log(1/ϵ)) to obtain error within ϵ of optimal.
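The setting described in the abstract can be made concrete with a small sketch. The PyTorch snippet below builds the signal-plus-orthogonal-noise data model (x_i = y_i·μ + ξ_i with μ and ξ_i in orthogonal subspaces), trains a two-layer leaky ReLU network on the hinge loss with full-batch GD, and compares training error on the noisy labels against clean test error. All concrete choices (dimensions, signal strength, label-noise rate, width, leak parameter, learning rate, fixed second-layer signs) are illustrative assumptions of this sketch, not values or constructions taken from the paper.

```python
import torch

torch.manual_seed(0)

# Illustrative sizes (assumptions, not the paper's regime: d = Ω(n log(1/ϵ))).
n, d = 50, 2000
noise_rate = 0.1          # fraction of flipped training labels (hypothetical)

# Signal μ lies along the first coordinate; noise is confined to the
# orthogonal complement (coordinates 1..d-1), so the subspaces are orthogonal.
mu = torch.zeros(d)
mu[0] = 3.0               # signal strength (hypothetical)

y_clean = torch.randint(0, 2, (n,)) * 2 - 1           # labels in {-1, +1}
flip = torch.rand(n) < noise_rate
y = torch.where(flip, -y_clean, y_clean).float()      # noisy training labels

xi = torch.randn(n, d)
xi[:, 0] = 0.0            # noise restricted to the subspace orthogonal to μ
X = y_clean[:, None] * mu[None, :] + xi               # x_i = y_i·μ + ξ_i

# Two-layer leaky ReLU network f(x) = Σ_j a_j φ(⟨w_j, x⟩) with the
# second-layer signs a_j fixed (a common simplification; assumed here).
m, alpha = 64, 0.1        # width and leak parameter (hypothetical)
W = (0.01 * torch.randn(m, d)).requires_grad_()
a = torch.cat([torch.ones(m // 2), -torch.ones(m // 2)]) / m

def f(Z):
    return torch.nn.functional.leaky_relu(Z @ W.T, negative_slope=alpha) @ a

# Full-batch gradient descent on the hinge loss max(0, 1 - y f(x)).
lr = 0.1
for step in range(2000):
    loss = torch.clamp(1 - y * f(X), min=0).mean()
    if loss.item() == 0:  # all margins ≥ 1: the noisy data is interpolated
        break
    W.grad = None
    loss.backward()
    with torch.no_grad():
        W -= lr * W.grad

# Training error (typically 0 in this high-dimensional regime, i.e. the noisy
# labels are fit perfectly) versus clean test error on fresh samples.
with torch.no_grad():
    train_err = (torch.sign(f(X)) != y).float().mean()
    y_test = torch.randint(0, 2, (1000,)) * 2 - 1
    xi_t = torch.randn(1000, d)
    xi_t[:, 0] = 0.0
    X_test = y_test[:, None].float() * mu[None, :] + xi_t
    test_err = (torch.sign(f(X_test)) != y_test.float()).float().mean()
    print(f"train error: {train_err:.3f}, clean test error: {test_err:.3f}")
```

With these (assumed) settings the network fits every noisy training label while the clean test error stays near the noise floor, which is the benign-overfitting behavior the abstract attributes to the high-SNR regime; shrinking the signal strength relative to the noise pushes the sketch toward the harmful regime.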