Stochastic Canonical Correlation Analysis

Journal of Machine Learning Research (2019)

Abstract
We study the sample complexity of canonical correlation analysis (CCA), i.e., the number of samples needed to estimate the population canonical correlation and directions up to arbitrarily small error. Under mild assumptions on the data distribution, we show that to achieve epsilon-suboptimality in a properly defined measure of alignment between the estimated canonical directions and the population solution, it suffices to solve the empirical objective exactly with N(epsilon, Delta, gamma) samples, where Delta is the singular value gap of the whitened cross-covariance matrix and 1/gamma is an upper bound on the condition number of the auto-covariance matrices. Moreover, we can achieve the same learning accuracy by drawing the same order of samples and solving the empirical objective approximately with a stochastic optimization algorithm; this algorithm is based on shift-and-invert power iterations and only needs O(log(1/epsilon)) passes over the dataset. Finally, we show that, given an estimate of the canonical correlation, the streaming version of the shift-and-invert power iterations achieves the same learning accuracy with the same order of sample complexity while processing the data only once.
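The shift-and-invert idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the inner linear systems are solved exactly with `np.linalg.solve` instead of a stochastic solver, and the shift is fixed at an assumed value of 1.05 (any value above the top canonical correlation, which is at most 1) rather than located by a search procedure. The function name `shift_invert_cca`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def shift_invert_cca(X, Y, shift=1.05, reg=1e-6, iters=200, seed=0):
    """Top canonical pair via power iterations on (shift*B - A)^{-1} B.

    Assumptions (not from the paper): exact linear solves, a fixed shift
    above the top canonical correlation, and a small ridge term `reg`.
    """
    n, dx = X.shape
    dy = Y.shape[1]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Empirical auto- and cross-covariance matrices.
    Sxx = Xc.T @ Xc / n + reg * np.eye(dx)
    Syy = Yc.T @ Yc / n + reg * np.eye(dy)
    Sxy = Xc.T @ Yc / n
    # CCA as a generalized eigenproblem A w = rho B w with w = [u; v].
    A = np.block([[np.zeros((dx, dx)), Sxy], [Sxy.T, np.zeros((dy, dy))]])
    B = np.block([[Sxx, np.zeros((dx, dy))], [np.zeros((dy, dx)), Syy]])
    M = shift * B - A  # positive definite because shift > top correlation
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(dx + dy)
    for _ in range(iters):
        w = np.linalg.solve(M, B @ w)   # one shift-and-invert step
        w /= np.sqrt(w @ B @ w)         # renormalize in the B-norm
    u, v = w[:dx], w[dx:]
    u = u / np.sqrt(u @ Sxx @ u)        # unit variance for each projection
    v = v / np.sqrt(v @ Syy @ v)
    rho = u @ Sxy @ v                   # estimated canonical correlation
    return u, v, rho

# Synthetic two-view data sharing one latent signal z (illustrative only).
rng = np.random.default_rng(1)
n = 2000
z = rng.standard_normal(n)
X = np.column_stack([z + 0.3 * rng.standard_normal(n), rng.standard_normal((n, 2))])
Y = np.column_stack([rng.standard_normal((n, 2)), z + 0.3 * rng.standard_normal(n)])
u, v, rho = shift_invert_cca(X, Y)
print("estimated top canonical correlation:", rho)
```

The closer the shift is to the top canonical correlation, the faster the outer power iterations converge, but the linear systems become worse conditioned; this trade-off is why the inner solver matters when the solves are only approximate.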
Keywords
Canonical correlation analysis, sample complexity, shift-and-invert preconditioning, streaming CCA