A Large-Scale Evaluation of Speech Foundation Models

Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2024)

Abstract

The foundation model paradigm leverages a shared foundation model to achieve state-of-the-art (SOTA) performance for various tasks, requiring minimal downstream-specific modeling and data annotation. This approach has proven crucial in the field of Natural Language Processing (NLP). However, the speech processing community lacks a similar setup to explore the paradigm systematically. In this work, we establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the paradigm for speech. We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads. Combining our results with community submissions, we verify that the foundation model paradigm is promising for speech, and our multi-tasking framework is simple yet effective, as the best-performing foundation model shows competitive generalizability across most SUPERB tasks. For reproducibility and extensibility, we have developed a long-term maintained platform that enables deterministic benchmarking, allows for result sharing via an online leaderboard, and promotes collaboration through a community-driven benchmark database to support new development cycles. Finally, we conduct a series of analyses to offer an in-depth understanding of SUPERB and speech foundation models, including information flows across tasks inside the models, the correctness of the weighted-sum benchmarking protocol and the statistical significance and robustness of the benchmark.
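The framework described in the abstract (a frozen foundation model, a weighted sum over its outputs, and a lightweight task head) can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the weighted sum combines per-layer hidden states with softmax-normalized scalar weights, and that the head is a mean-pooling step followed by a linear layer; the function names `weighted_sum` and `linear_head`, the tensor layout, and the toy numbers are all hypothetical. Only the layer scores and the head parameters would be trained, while the layer features come from the frozen model.

```python
import math


def softmax(scores):
    """Normalize raw layer scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(layer_features, layer_scores):
    """Combine frozen per-layer features into one feature sequence.

    layer_features: nested list indexed as [layer][frame][dim] (assumed layout).
    layer_scores:   one trainable scalar per layer (assumption).
    Returns a nested list indexed as [frame][dim].
    """
    if len(layer_features) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    num_frames = len(layer_features[0])
    dim = len(layer_features[0][0])
    combined = [[0.0] * dim for _ in range(num_frames)]
    for weight, layer in zip(weights, layer_features):
        for t, frame in enumerate(layer):
            for d, value in enumerate(frame):
                combined[t][d] += weight * value
    return combined


def linear_head(frames, weight, bias):
    """Hypothetical lightweight head: mean-pool over time, then a linear layer."""
    dim = len(frames[0])
    pooled = [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]
    return [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weight, bias)
    ]


# Toy stand-in for a frozen model's outputs: 2 layers, 2 frames, 2 dimensions.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
combined = weighted_sum(features, layer_scores=[0.0, 0.0])
logits = linear_head(combined, weight=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
```

With equal scores the two layers are simply averaged. In a trained setup the scores would differ per task, so the learned weights give a rough indication of which layers a task draws on.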

Keywords

speech, foundation model, self-supervised learning, representation learning, task generalization, benchmark, evaluation
