Articulatory Synthesis for Data Augmentation in Phoneme Recognition

Paul Konstantin Krug, Peter Birkholz, Branislav Gerazov, Daniel Rudolph van Niekerk, Anqi Xu, Yi Xu

Conference of the International Speech Communication Association (INTERSPEECH), 2022

Abstract

While numerous recent studies on automatic speech recognition describe data augmentation strategies based on time- or frequency-domain signal processing, few works address the artificial extension of training data sets using purely synthetic speech data. In this work, the German KIEL corpus was augmented with synthetic data generated with the state-of-the-art articulatory synthesizer VocalTractLab. It is shown that the additional synthetic data can lead to significantly better performance in single-phoneme recognition in certain cases, while in other cases performance can decrease, depending on the degree of acoustic naturalness of the synthetic phonemes. As a result, this work can potentially guide future studies to improve the quality of articulatory synthesis via the link between synthetic speech production and automatic speech recognition.

Keywords

automatic speech recognition, phoneme recognition, articulatory speech synthesis, data augmentation
