On Learning Mixtures of Well-Separated Gaussians

2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS) (2017)

Abstract
We consider the problem of efficiently learning mixtures of a large number of spherical Gaussians when the components of the mixture are well separated. In the most basic form of this problem, we are given samples from a uniform mixture of k standard spherical Gaussians with means μ_1, ..., μ_k ∈ ℝ^d, and the goal is to estimate the means up to accuracy δ using poly(k, d, 1/δ) samples. In this work, we study the following question: what is the minimum separation needed between the means for solving this task? The best known algorithm, due to Vempala and Wang [JCSS 2004], requires a separation of roughly min{k, d}^{1/4}. On the other hand, Moitra and Valiant [FOCS 2010] showed that with separation o(1), exponentially many samples are required. We address the significant gap between these two bounds by showing the following results.
We show that with separation o(√(log k)), superpolynomially many samples are required. In fact, this holds even when the k means of the Gaussians are picked at random in d = O(log k) dimensions.
We show that with separation Ω(√(log k)), poly(k, d, 1/δ) samples suffice. Notice that the bound on the separation is independent of δ. This result is based on a new and efficient “accuracy boosting” algorithm that takes as input coarse estimates of the true means and, in time (and samples) poly(k, d, 1/δ), outputs estimates of the means up to arbitrarily good accuracy δ, assuming the separation between the means is Ω(min{√(log k), √d}) (independently of δ). The idea of the algorithm is to iteratively solve a “diagonally dominant” system of non-linear equations.
We also (1) present a computationally efficient algorithm in d = O(1) dimensions with only Ω(√d) separation, and (2) extend our results to the case that components might have different weights and variances.
These results together essentially characterize the optimal order of separation between components that is needed to learn a mixture of k spherical Gaussians with polynomial samples.
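The basic setting described in the abstract can be made concrete with a small sketch. The Python below draws samples from a uniform mixture of standard spherical Gaussians, measures the minimum separation between the means, and refines coarse estimates of the means. Note that the refinement shown is a single Lloyd (k-means) step used purely as a stand-in: it is not the paper's "accuracy boosting" algorithm, which the abstract describes only at a high level. All function names, the example means, the sample size, and the seed are assumptions chosen for illustration.

```python
import math
import random


def sample_uniform_mixture(means, n, rng):
    """Draw n points from a uniform mixture of unit-variance spherical Gaussians."""
    samples = []
    for _ in range(n):
        mu = rng.choice(means)  # uniform mixing weights
        samples.append([m + rng.gauss(0.0, 1.0) for m in mu])
    return samples


def min_separation(means):
    """Smallest Euclidean distance between any two distinct means."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(means)
        for b in means[i + 1:]
    )


def lloyd_refine(samples, estimates):
    """One Lloyd (k-means) step: assign each sample to its nearest estimate,
    then average. A stand-in for illustration, NOT the paper's algorithm."""
    k, d = len(estimates), len(estimates[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for x in samples:
        j = min(range(k), key=lambda i: math.dist(x, estimates[i]))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += x[t]
    # Keep the old estimate for any cluster that received no samples.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(estimates[j])
        for j in range(k)
    ]


# Hypothetical example: k = 4 means in d = 2 dimensions, far apart.
rng = random.Random(0)
true_means = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
samples = sample_uniform_mixture(true_means, 2000, rng)

# Coarse estimates: each true mean shifted by 1 in every coordinate.
coarse = [[m + 1.0 for m in mu] for mu in true_means]
refined = lloyd_refine(samples, coarse)

separation = min_separation(true_means)
coarse_error = max(math.dist(a, b) for a, b in zip(true_means, coarse))
refined_error = max(math.dist(a, b) for a, b in zip(true_means, refined))
print(separation, math.sqrt(math.log(len(true_means))))
print(coarse_error, refined_error)
```

The example uses a separation far larger than √(log k) so that the nearest-mean assignment is almost always correct. Near the √(log k) threshold that the abstract identifies, a naive averaging step like this one would not be expected to reach arbitrary accuracy δ.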
Keywords
mixtures of Gaussians, learning, clustering, parameter estimation, sample complexity, iterative algorithms