Precise learning curves and higher-order scaling limits for dot-product kernel regression

arXiv (2023)

Abstract
As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves (LCs) that characterize how the prediction error depends on the number of samples is restricted to either large-sample asymptotics (m → ∞) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension (m ∝ d). There is a wide gulf between these two regimes, including all higher-order scaling relations m ∝ d^r, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the mean of the test error, bias, and variance, for data drawn uniformly from the sphere with isotropic random labels in the r-th order asymptotic scaling regime m → ∞ with m/d^r held constant. We observe a peak in the LC whenever m ≈ d^r/r! for any integer r, leading to multiple sample-wise descent and non-trivial behavior at multiple scales. We include a Colab notebook (available at: https://tinyurl.com/2nzym7ym) that reproduces the essential results of the paper.
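To make the setup concrete, here is a minimal sketch (not the paper's formulas or its Colab code) of kernel ridge regression with a dot-product kernel on data drawn uniformly from the sphere, tracing an empirical test error across sample sizes m. The specific kernel exp(<x, z>), the ridge strength, and the pure-noise label model are illustrative assumptions.

```python
# Minimal sketch: kernel ridge regression with a dot-product kernel on spherical data,
# evaluated at several sample sizes m to trace an empirical learning curve.
import numpy as np

def sample_sphere(m, d, rng):
    """Draw m points uniformly from the unit sphere S^{d-1}."""
    x = rng.standard_normal((m, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def dot_product_kernel(X, Z):
    """Example dot-product kernel k(x, z) = exp(<x, z>); any function of <x, z> would do."""
    return np.exp(X @ Z.T)

def krr_test_error(m, d, ridge=1e-3, n_test=500, seed=0):
    """Fit kernel ridge regression on m training points and return the mean squared test error."""
    rng = np.random.default_rng(seed)
    X_tr, X_te = sample_sphere(m, d, rng), sample_sphere(n_test, d, rng)
    # Random labels as a simplifying stand-in for the paper's isotropic label model.
    y_tr = rng.standard_normal(m)
    y_te = rng.standard_normal(n_test)
    K = dot_product_kernel(X_tr, X_tr)
    alpha = np.linalg.solve(K + ridge * np.eye(m), y_tr)
    y_hat = dot_product_kernel(X_te, X_tr) @ alpha
    return np.mean((y_te - y_hat) ** 2)

if __name__ == "__main__":
    d = 20
    # Sweep m across scales; the abstract predicts learning-curve peaks near m ≈ d^r / r!.
    for m in [d // 2, d, 2 * d, d**2 // 2, d**2]:
        print(f"m = {m:6d}  test MSE = {krr_test_error(m, d):.4f}")
```

Sweeping m from order d up to order d^2 gives a rough empirical analogue of the multiscale behavior described in the abstract; the linked Colab notebook contains the paper's own computations.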
Keywords
machine learning