Real and synthetic Punjabi speech datasets for automatic speech recognition

Data in Brief (2024)

Abstract
Automatic speech recognition (ASR) has been an active area of research. Training with large annotated datasets is key to developing robust ASR systems. However, most available datasets focus on high-resource languages like English, leaving a significant gap for low-resource languages. Punjabi is one of these: despite its large number of speakers, it lacks high-quality annotated datasets for accurate speech recognition. To address this gap, we introduce three labeled Punjabi speech datasets: Punjabi Speech (a real speech dataset) and Google-synth and CMU-synth (synthesized speech datasets). The Punjabi Speech dataset consists of read speech recordings captured in various environments, including both studio and open settings. The Google-synth dataset is synthesized using Google's Punjabi text-to-speech cloud services. The CMU-synth dataset is created using the Clustergen model available in the Festival speech synthesis system developed by CMU. These datasets aim to facilitate the development of accurate Punjabi speech recognition systems, bridging the resource gap for this important language. © 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Keywords
Automatic speech recognition, Low-resource languages, Speech dataset, Punjabi language