DFR-ECAPA: Diffusion Feature Refinement for Speaker Verification Based on ECAPA-TDNN

Ya Gao, Wei Song, Xiaobing Zhao, Xiangchun Liu

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X (2024)

Abstract

Diffusion probabilistic models have gained significant recognition for their exceptional performance in generative image modeling. In speech processing, however, most diffusion-based studies focus on generative tasks such as speech synthesis and speech conversion, and few apply diffusion models to speaker verification. We investigate the integration of a diffusion model with the ECAPA-TDNN model. Using a dual-branch network architecture, the network further extracts and refines speaker embeddings under the guidance of the intermediate activations of a pre-trained DDPM. We propose two methods for fusing the features of the two branches, both of which improve performance. Our model also provides a new solution for semi-supervised cross-domain speaker verification. Experiments on VoxCeleb and CN-Celeb show that DFR-ECAPA outperforms the original ECAPA-TDNN by around 20%.
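The abstract does not say which two fusion methods are used, so the following is only an illustrative sketch of two common ways to combine features from two network branches: concatenation and a weighted sum. The function names, the `alpha` weight, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_concat(speaker_feat, diffusion_feat):
    """Hypothetical fusion A: concatenate the two branch features."""
    return l2_normalize(list(speaker_feat) + list(diffusion_feat))


def fuse_weighted_sum(speaker_feat, diffusion_feat, alpha=0.5):
    """Hypothetical fusion B: convex combination of same-sized features."""
    if len(speaker_feat) != len(diffusion_feat):
        raise ValueError("weighted-sum fusion needs equal feature sizes")
    mixed = [alpha * s + (1.0 - alpha) * d
             for s, d in zip(speaker_feat, diffusion_feat)]
    return l2_normalize(mixed)


def cosine_score(emb_a, emb_b):
    """Cosine similarity, the usual trial score in speaker verification."""
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)
    return sum(x * y for x, y in zip(a, b))


# Toy features standing in for the outputs of the two branches.
ecapa_branch = [0.2, 0.9, -0.1]
ddpm_branch = [0.4, 0.7, 0.0]
enrolled = fuse_weighted_sum(ecapa_branch, ddpm_branch, alpha=0.7)
test_utt = fuse_weighted_sum([0.25, 0.85, -0.05], [0.35, 0.75, 0.05], alpha=0.7)
score = cosine_score(enrolled, test_utt)
```

Concatenation keeps both feature sets intact but makes the embedding larger, while a weighted sum keeps the size fixed and requires both branches to produce features of the same dimension.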

Keywords

Speaker verification, Diffusion model, Feature fusion
