SGD Biased towards Early Important Samples for Efficient Training

23rd IEEE International Conference on Data Mining (ICDM 2023)

Abstract
In deep learning, using larger training datasets usually leads to more accurate models. However, simply adding more but redundant data may be inefficient, as some training samples may be more informative than others. We propose to bias SGD (Stochastic Gradient Descent) towards samples that are found to be more important after a few training epochs, by sampling them more often for the rest of training. In contrast to the state of the art, our approach requires less computational overhead to estimate sample importance, as it computes estimates once during training using the prediction probabilities, and does not require that training be restarted. In the experimental evaluation, we see that our learning technique trains faster than the state of the art and can achieve higher test accuracy, especially when datasets are not well balanced. Lastly, results suggest that our approach has intrinsic balancing properties. Code is available at https://github.com/AlessioQuercia/sgd_biased.
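
The following is a minimal sketch of the idea as described in the abstract, not the authors' implementation (see the linked repository for that): after a few warm-up epochs, per-sample importance is estimated once from the model's prediction probabilities (a low probability on the true class marks a harder, more informative sample), and the sampler is then biased towards those samples for the remainder of training without restarting it. Names such as `warmup_epochs`, `total_epochs`, `estimate_importance`, and the use of PyTorch's `WeightedRandomSampler` are illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def estimate_importance(model, dataset, device="cpu", batch_size=256):
    """One pass over the training set: importance = 1 - p(true class).

    Hypothetical scoring rule based on prediction probabilities; the paper
    may define importance differently.
    """
    model.eval()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    scores = []
    with torch.no_grad():
        for x, y in loader:
            probs = torch.softmax(model(x.to(device)), dim=1)
            p_true = probs[torch.arange(len(y)), y.to(device)]
            scores.append(1.0 - p_true.cpu())
    return torch.cat(scores)

def train(model, train_set, optimizer, loss_fn,
          warmup_epochs=5, total_epochs=50, device="cpu"):
    # Uniform sampling during the warm-up epochs.
    loader = DataLoader(train_set, batch_size=128, shuffle=True)
    for epoch in range(total_epochs):
        if epoch == warmup_epochs:
            # Importance is estimated once; training continues without restart.
            weights = estimate_importance(model, train_set, device)
            sampler = WeightedRandomSampler(weights,
                                            num_samples=len(train_set),
                                            replacement=True)
            loader = DataLoader(train_set, batch_size=128, sampler=sampler)
        model.train()
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x.to(device)), y.to(device))
            loss.backward()
            optimizer.step()
```

Because harder samples are drawn more often with replacement, under-represented classes with lower prediction probabilities naturally receive more updates, which is consistent with the balancing behaviour the abstract reports on imbalanced datasets.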
Keywords
Deep Learning, Optimization, Hard Example Mining