
Learning Mixtures of Gaussians Using Diffusion Models

CoRR (2024)

Abstract
We give a new algorithm for learning mixtures of k Gaussians (with identity covariance in ℝ^n) to TV error ε, with quasi-polynomial (O(n^{poly log((n+k)/ε)})) time and sample complexity, under a minimum weight assumption. Unlike previous approaches, most of which are algebraic in nature, our approach is analytic and relies on the framework of diffusion models. Diffusion models are a modern paradigm for generative modeling, which typically rely on learning the score function (the gradient of the log-pdf) along a process transforming a pure noise distribution, in our case a Gaussian, to the data distribution. Despite their dazzling performance in tasks such as image generation, there are few end-to-end theoretical guarantees that they can efficiently learn nontrivial families of distributions; we give some of the first such guarantees. We proceed by deriving higher-order Gaussian noise sensitivity bounds for the score functions of a Gaussian mixture to show that they can be learned inductively using piecewise polynomial regression (up to poly-logarithmic degree), and combine this with known convergence results for diffusion models. Our results extend to continuous mixtures of Gaussians where the mixing distribution is supported on a union of k balls of constant radius. In particular, this applies to Gaussian convolutions of distributions supported on low-dimensional manifolds, or more generally on sets with small covering number.
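The central object in the abstract is the score ∇ log p_t along the noising process. For a mixture of identity-covariance Gaussians under the Ornstein-Uhlenbeck (variance-preserving) process, the noised distribution is again such a mixture with means shrunk by e^{-t}, so the score has a closed form. Below is a minimal NumPy sketch of that closed form (illustrative only; the function name and setup are ours, not code from the paper):

```python
import numpy as np

def mixture_score(x, means, weights, t):
    """Exact score  grad log p_t(x)  for a mixture of identity-covariance
    Gaussians evolved under the Ornstein-Uhlenbeck noising process.

    Identity covariance is preserved along the process, so
        p_t = sum_i w_i * N(e^{-t} mu_i, I),
    and the score is a posterior-weighted pull toward the shrunk means.
    (Illustrative helper, not code from the paper.)
    """
    m = np.exp(-t) * means                         # shrunk means, shape (k, n)
    # log posterior responsibilities, stabilized log-sum-exp style
    log_r = np.log(weights) - 0.5 * np.sum((x - m) ** 2, axis=1)
    r = np.exp(log_r - log_r.max())
    r /= r.sum()                                   # responsibilities r_i(x)
    return r @ m - x                               # sum_i r_i(x) m_i  -  x
```

The learning problem is to approximate this map from samples alone; the higher-order noise sensitivity bounds mentioned above control how well a piecewise polynomial can do so.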

Key points: This paper proposes a new diffusion-model-based algorithm for learning mixtures of Gaussians with identity covariance, achieving quasi-polynomial time and sample complexity and providing theoretical guarantees under specific conditions.

Method: The algorithm learns the score function (the gradient of the log probability density) along a process that transforms a pure noise distribution (a Gaussian) into the data distribution.

Evaluation: The paper establishes the algorithm's effectiveness by deriving higher-order noise sensitivity bounds for the score functions of a Gaussian mixture, learning them with piecewise polynomial regression, and combining this with known convergence results for diffusion models. The analysis covers continuous Gaussian mixtures whose mixing distribution is supported on a union of k balls of constant radius, which applies to Gaussian convolutions of distributions on low-dimensional manifolds or on sets with small covering number. No specific dataset names are mentioned.
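To make the regression step concrete, here is a small, hypothetical 1-D illustration of denoising score matching with polynomial features, fit by plain least squares. It is a deliberate simplification of the paper's method (which uses piecewise polynomial regression learned inductively across noise levels, in ℝ^n); all names, dimensions, and parameters below are our own choices:

```python
import numpy as np

# Illustrative only: 1-D denoising score matching with a single global
# polynomial fit by least squares, standing in for the paper's piecewise
# polynomial regression.
rng = np.random.default_rng(0)

k, t, d, N = 3, 0.5, 6, 20_000
mus = np.array([-4.0, 0.0, 5.0])                      # component means (our choice)
a, sigma_t = np.exp(-t), np.sqrt(1 - np.exp(-2 * t))  # OU shrinkage and noise scale

x0 = rng.choice(mus, size=N) + rng.normal(size=N)     # samples from the mixture
z = rng.normal(size=N)
xt = a * x0 + sigma_t * z                             # noised samples at time t

# Denoising score matching: E[-z / sigma_t | x_t] = grad log p_t(x_t)
# (Tweedie's formula), so -z / sigma_t is a valid regression target.
Phi = np.vander(xt, d + 1, increasing=True)           # features 1, x, ..., x^d
coef, *_ = np.linalg.lstsq(Phi, -z / sigma_t, rcond=None)

def score_hat(x):
    """Polynomial estimate of the score at noise level t."""
    return np.vander(np.atleast_1d(x), d + 1, increasing=True) @ coef
```

The identity E[-z/σ_t | x_t] = ∇ log p_t(x_t) is what turns noisy sample pairs into a regression dataset for the score, which a sampler can then plug into the reverse diffusion process.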