Microphone Array Channel Combination Algorithms for Overlapped Speech Detection

Interspeech 2022

Abstract
Overlapped speech occurs when multiple speakers are simultaneously active. This may lead to severe performance degradation in automatic speech processing systems such as speaker diarization. Overlapped speech detection (OSD) aims at detecting time segments in which several speakers are simultaneously active. Recent deep neural network architectures have shown impressive results in the close-talk scenario. However, performance tends to deteriorate in the context of distant speech. Microphone arrays are often considered under these conditions to record signals including spatial information. This paper investigates the use of the self-attention channel combinator (SACC) system as a feature extractor for OSD. This model is also extended in the complex space (cSACC) to improve the interpretability of the approach. Results show that distant OSD performance with self-attentive models gets closer to the near-field condition. A detailed analysis of the cSACC combination-weights is also conducted showing that the self-attention module focuses attention on the speakers' direction.
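To make the OSD task definition concrete, here is a minimal sketch of how reference overlap segments could be derived from speaker-turn annotations. It is not taken from the paper: the function name `overlapped_segments`, the `(start, end)` turn format in seconds, and the assumption that turns of the same speaker never overlap each other are all illustrative choices.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_time, end_time) in seconds


def overlapped_segments(speaker_turns: List[Segment],
                        min_speakers: int = 2) -> List[Segment]:
    """Return the time intervals in which at least `min_speakers` turns are active.

    Assumption (hypothetical annotation format): each turn belongs to one
    speaker, and turns of the same speaker do not overlap each other.
    """
    events = []
    for start, end in speaker_turns:
        if end <= start:
            continue  # skip empty or malformed turns
        events.append((start, 1))   # a speaker becomes active
        events.append((end, -1))    # a speaker becomes inactive

    # Sort by time; at equal times, ends (-1) come before starts (+1),
    # so turns that merely touch are not counted as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    result: List[Segment] = []
    active = 0
    overlap_start = 0.0
    for time, delta in events:
        previous = active
        active += delta
        if previous < min_speakers <= active:
            overlap_start = time  # overlap region opens
        elif active < min_speakers <= previous and time > overlap_start:
            if result and result[-1][1] == overlap_start:
                # Merge with the previous region when they are contiguous.
                result[-1] = (result[-1][0], time)
            else:
                result.append((overlap_start, time))
    return result


turns = [(0.0, 3.0), (2.0, 5.0), (4.5, 6.0), (6.0, 7.0)]
overlaps = overlapped_segments(turns)
print(overlaps)
```

An OSD system would be trained to predict such intervals directly from the audio. In the distant-speech setting discussed above, the input would be the multi-channel microphone-array signal rather than a close-talk recording.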
Keywords
overlapped speech detection, multi-microphone, distant speech, interpretability