Robust Spoof Speech Detection Based on Multi-Scale Feature Aggregation and Dynamic Convolution

Haochen Wu, Jie Zhang, Zhentao Zhang, Wenting Zhao, Bin Gu, Wu Guo

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Abstract

Spoof speech detection (SSD) can help to protect an automatic speaker recognition system against malicious attacks. However, the spoof utterances generated by different text-to-speech and voice conversion algorithms are highly diverse, so an SSD system tends to generalize poorly to unseen spoofing attacks. To address this problem, we integrate multi-scale feature aggregation (MFA) and dynamic convolution operations into the anti-spoofing framework to detect different local and global artifacts of unseen spoofing attacks. The proposed framework mainly contains eight stacked MFA blocks, where in each block the light-Res2Net module is used to capture multi-scale features and the convolutional kernel is dynamically generated by the local and global statistical information of the inputs. Results on two benchmark datasets (i.e., ADD 2023 Fake Audio Detection and ASVspoof 2021 Logical Access) show the superiority of the proposed method over existing state-of-the-art systems.
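The abstract does not say how the dynamic kernel is computed, so the following is only a rough sketch of the general idea of input-dependent convolution, not the authors' method. It assumes one common formulation: a small set of candidate kernels is mixed with softmax attention weights derived from input statistics. For brevity it uses only global statistics (mean and standard deviation) of a 1-D signal, whereas the paper uses both local and global statistics. All names (`dynamic_conv1d`, `gate_weights`, and so on) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_stats(x):
    """Mean and (population) standard deviation of the whole input."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return mean, math.sqrt(var)


def dynamic_kernel(x, candidate_kernels, gate_weights):
    """Mix candidate kernels with attention weights that depend on x.

    gate_weights[k] = (w_mean, w_std, bias) is a hypothetical linear gate
    that scores candidate kernel k from the global statistics of x.
    """
    mean, std = global_stats(x)
    scores = [wm * mean + ws * std + b for wm, ws, b in gate_weights]
    attn = softmax(scores)
    size = len(candidate_kernels[0])
    kernel = [
        sum(a * k[i] for a, k in zip(attn, candidate_kernels))
        for i in range(size)
    ]
    return kernel, attn


def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding (odd kernel size assumed)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def dynamic_conv1d(x, candidate_kernels, gate_weights):
    """Generate a kernel from the input, then convolve the input with it."""
    kernel, attn = dynamic_kernel(x, candidate_kernels, gate_weights)
    return conv1d_same(x, kernel), attn
```

In a trained model the candidate kernels and gate parameters would be learned; here they are plain lists supplied by the caller. The point of the design is that two inputs with different statistics are filtered by different effective kernels, which is one plausible way a detector could adapt to artifacts from unseen spoofing algorithms.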

Keywords

Multi-scale feature aggregation, spoof speech detection, anti-spoofing, ADD 2023, ASVspoof 2021
