Robust Spoof Speech Detection Based on Multi-Scale Feature Aggregation and Dynamic Convolution

Haochen Wu, Jie Zhang, Zhentao Zhang, Wenting Zhao, Bin Gu, Wu Guo

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Abstract
Spoof speech detection (SSD) helps protect automatic speaker recognition systems against malicious attacks. However, the spoofed utterances generated by different text-to-speech and voice conversion algorithms are highly diverse, so SSD systems generalize poorly to unseen spoofing attacks. To address this problem, we integrate multi-scale feature aggregation (MFA) and dynamic convolution operations into the anti-spoofing framework to detect the different local and global artifacts of unseen spoofing attacks. The proposed framework mainly consists of eight stacked MFA blocks; in each block, a light-Res2Net module captures multi-scale features, and the convolutional kernel is generated dynamically from local and global statistical information of the inputs. Results on two benchmark datasets (ADD 2023 Fake Audio Detection and ASVspoof 2021 Logical Access) show that the proposed method outperforms existing state-of-the-art systems.
Keywords
Multi-scale feature aggregation, spoof speech detection, anti-spoofing, ADD 2023, ASVspoof 2021
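
The abstract names two ingredients: a Res2Net-style multi-scale split, and a convolution kernel generated from statistics of the input. The sketch below illustrates both ideas on 1-D signals in plain Python. It is not the authors' implementation: the abstract does not specify the light-Res2Net layout or how the kernel is generated, so this sketch assumes a generic scheme in which a softmax gate over global mean and standard deviation mixes a few candidate kernels (local statistics are omitted for brevity). All function names, `candidate_kernels`, and `gate_weights` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Signal = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_stats(x: Sequence[float]) -> Tuple[float, float]:
    """Global mean and (population) standard deviation of the input."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return mean, math.sqrt(var)


def dynamic_kernel(x, candidate_kernels, gate_weights) -> List[float]:
    """Mix candidate kernels with input-dependent attention weights.

    Assumption: each gate is a linear score (w_mean, w_std, bias) over
    the global statistics; the paper's actual generator may differ.
    """
    mean, std = global_stats(x)
    scores = [w_m * mean + w_s * std + b for (w_m, w_s, b) in gate_weights]
    attn = softmax(scores)
    width = len(candidate_kernels[0])
    return [
        sum(a * k[i] for a, k in zip(attn, candidate_kernels))
        for i in range(width)
    ]


def conv1d_same(x: Sequence[float], kernel: Sequence[float]) -> Signal:
    """Zero-padded 1-D cross-correlation; assumes an odd kernel width."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def dynamic_conv1d(x, candidate_kernels, gate_weights) -> Signal:
    """Convolve x with a kernel generated from x's own statistics."""
    return conv1d_same(x, dynamic_kernel(x, candidate_kernels, gate_weights))


def res2net_style_block(groups, candidate_kernels, gate_weights):
    """Res2Net-style hierarchy over channel groups.

    y_1 = x_1, y_2 = K(x_2), y_i = K(x_i + y_{i-1}) for i >= 3, so later
    groups see progressively larger receptive fields (multi-scale).
    """
    outputs = [list(groups[0])]  # first group passes through unchanged
    for idx, group in enumerate(groups[1:], start=2):
        if idx == 2:
            mixed = list(group)
        else:
            mixed = [a + b for a, b in zip(group, outputs[-1])]
        outputs.append(dynamic_conv1d(mixed, candidate_kernels, gate_weights))
    return outputs


if __name__ == "__main__":
    # Hypothetical candidates: an identity kernel and a smoothing kernel.
    kernels = [[0.0, 1.0, 0.0], [1 / 3, 1 / 3, 1 / 3]]
    # Hypothetical gate: higher input variance favours the identity kernel.
    gates = [(0.0, 5.0, 0.0), (0.0, 0.0, 0.0)]
    groups = [[0.0, 2.0, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0], [0.5, 1.5, 0.5, 1.5]]
    for out in res2net_style_block(groups, kernels, gates):
        print(out)
```

In this toy setup the kernel changes with the input: a flat signal has zero standard deviation, so the two candidates are mixed equally, while a high-variance signal pushes the mixture toward the identity kernel. A real system would learn the gate and the candidate kernels and operate on multi-channel spectro-temporal features rather than scalar sequences.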