SPEECH DEREVERBERATION USING VARIATIONAL AUTOENCODERS

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)

Abstract

This paper presents a statistical method for single-channel speech dereverberation using a variational autoencoder (VAE) for modelling the speech spectra. One popular approach for modelling speech spectra is non-negative matrix factorization (NMF), where learned clean speech spectral bases are used as a linear generative model for speech spectra. This work replaces this linear model with a powerful nonlinear deep generative model based on a VAE. Further, this paper formulates a unified probabilistic generative model of reverberant speech based on Gaussian and Poisson distributions. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the VAE and estimating the room impulse response for both probabilistic models. Evaluation results show the superiority of the proposed VAE-based models over the NMF-based counterparts.
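To make the "linear generative model" mentioned above concrete, the following is a minimal sketch of plain NMF on a toy non-negative matrix, where a spectrogram-like matrix V is approximated by the product of spectral bases W and activations H. It is not the paper's dereverberation algorithm and says nothing about the VAE or the room impulse response estimation. The function names (`kl_divergence`, `nmf_kl`), the toy data, and the choice of multiplicative updates for the generalized Kullback-Leibler cost (the cost that corresponds to maximum likelihood under a Poisson observation model) are assumptions made for illustration only.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence D(V || V_hat) between non-negative matrices."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def nmf_kl(V, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Illustrative NMF: approximate V (freq x frames) as W @ H.

    W holds non-negative spectral bases (freq x n_bases) and H holds
    non-negative activations (n_bases x frames). Uses the standard
    multiplicative updates for the generalized KL cost.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_frames)) + eps
    history = []
    for _ in range(n_iter):
        # Update activations with the bases held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update bases with the activations held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history


# Toy stand-in for a magnitude spectrogram: 16 frequency bins, 30 frames,
# generated from 4 hidden non-negative bases (hypothetical data).
rng = np.random.default_rng(1)
V = rng.random((16, 4)) @ rng.random((4, 30))
W, H, history = nmf_kl(V, n_bases=4)
print("KL cost, first vs. last iteration:", history[0], history[-1])
```

Because every frame of the approximation is a non-negative linear combination of the columns of W, the model is linear in the activations; the abstract's proposal is to replace this linear mapping with a nonlinear VAE decoder.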

Keywords

speech dereverberation, variational autoencoders, non-negative matrix factorization
