
Deep Neural Network Based Low-Latency Speech Separation with Asymmetric Analysis-Synthesis Window Pair

29th European Signal Processing Conference (EUSIPCO 2021), 2021

Abstract
Time-frequency masking and spectrum prediction computed with short symmetric windows are commonly used in low-latency deep neural network (DNN) based source separation. In this paper, we propose the use of an asymmetric analysis-synthesis window pair, which allows training with targets that have better frequency resolution while retaining the low latency during inference needed for real-time speech enhancement or assisted hearing applications. To assess our approach across various model types and datasets, we evaluate it with a speaker-independent deep clustering (DC) model and a speaker-dependent mask inference (MI) model. We report an improvement in separation performance of up to 1.5 dB in terms of source-to-distortion ratio (SDR) while maintaining an algorithmic latency of 8 ms.
Keywords
Monaural speaker separation,Low latency,Asymmetric windows,Deep clustering
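
The abstract does not spell out how the window pair is built, so the following is only a minimal sketch of one possible construction, not necessarily the one used in the paper. It pairs a long analysis window (slow square-root Hann rise, short square-root Hann fall) with a synthesis window that is nonzero only over the last two hops, chosen so that their product is a short periodic Hann window that overlap-adds to one. The sampling rate (8 kHz), frame length (256 samples), hop (32 samples), and all function names are illustrative assumptions. With overlap-add, the algorithmic latency is taken here to be the length of the synthesis window's support.

```python
import numpy as np

def periodic_hann(length):
    """Periodic Hann window; copies shifted by length/2 sum to one."""
    n = np.arange(length)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * n / length))

def make_asymmetric_pair(frame_len, hop):
    """Hypothetical helper: long analysis window, short synthesis window."""
    if frame_len <= 2 * hop:
        raise ValueError("frame_len must exceed 2 * hop for an asymmetric pair")
    long_hann = periodic_hann(2 * (frame_len - hop))
    short_hann = periodic_hann(2 * hop)
    rise_end = frame_len - hop        # analysis window peaks here
    syn_start = frame_len - 2 * hop   # synthesis window is zero before this

    analysis = np.empty(frame_len)
    analysis[:rise_end] = np.sqrt(long_hann[:rise_end])   # long rise
    analysis[rise_end:] = np.sqrt(short_hann[hop:])       # short fall

    synthesis = np.zeros(frame_len)
    # Divide out the analysis rise so that analysis * synthesis equals short_hann.
    synthesis[syn_start:rise_end] = (
        short_hann[:hop] / np.sqrt(long_hann[syn_start:rise_end])
    )
    synthesis[rise_end:] = np.sqrt(short_hann[hop:])
    return analysis, synthesis

def overlap_add_roundtrip(x, analysis, synthesis, hop):
    """Analysis, FFT, inverse FFT, synthesis, with no spectral modification."""
    frame_len = len(analysis)
    n_frames = 1 + (len(x) - frame_len) // hop
    y = np.zeros(len(x))
    for i in range(n_frames):
        start = i * hop
        spec = np.fft.rfft(analysis * x[start:start + frame_len])
        # A separation model would apply its mask to `spec` at this point.
        frame = np.fft.irfft(spec, n=frame_len)
        y[start:start + frame_len] += synthesis * frame
    return y, n_frames

# Assumed illustrative values, not taken from the paper.
SAMPLE_RATE = 8000   # Hz
FRAME_LEN = 256      # 32 ms analysis window
HOP = 32             # 4 ms hop, so the synthesis support is 64 samples

analysis, synthesis = make_asymmetric_pair(FRAME_LEN, HOP)
x = np.random.default_rng(0).standard_normal(4000)
y, n_frames = overlap_add_roundtrip(x, analysis, synthesis, HOP)

# Samples covered by both halves of the short product window.
lo = FRAME_LEN - HOP
hi = (n_frames - 1) * HOP + FRAME_LEN - HOP
max_err = np.max(np.abs(y[lo:hi] - x[lo:hi]))

# Latency in ms from the synthesis support of 2 * HOP samples.
latency_ms = 1000.0 * 2 * HOP / SAMPLE_RATE   # 8.0 under these assumed values
print(f"max reconstruction error: {max_err:.2e}, latency: {latency_ms} ms")
```

In this sketch the 32 ms analysis window sets the frequency resolution of the spectra the model sees, while the 8 ms synthesis support bounds how long each output sample has to wait, which is the trade-off the abstract describes.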