SYNTACC: Synthesizing Multi-Accent Speech By Weight Factorization

Tuan-Nam Nguyen, Ngoc-Quan Pham, Alexander Waibel

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract

Conventional multi-speaker text-to-speech (TTS) synthesis can generate speech in multiple voices, but it cannot generate speech in different accents. This limitation motivated us to develop SYNTACC (Synthesizing speech with accents), which adapts conventional multi-speaker TTS to produce multi-accent speech. Our method builds on the YourTTS model and introduces a novel multi-accent training mechanism. It decomposes each weight matrix into a shared component and an accent-dependent component. The shared component is initialized from the pretrained multi-speaker TTS model, while the accent-dependent component is factorized into vectors that form rank-1 matrices, which reduces the number of trainable parameters per accent. This weight factorization proves effective for fine-tuning SYNTACC on multi-accent datasets in low-resource conditions. The resulting SYNTACC model synthesizes speech not only in different voices but also in different accents.
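
The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the shared and accent-dependent components are combined, so the additive form below (shared matrix plus one rank-1 outer product per accent) is an assumption, not the authors' confirmed formulation. All names (`outer`, `accent_weight`, `accent_factors`, the accent labels) and the layer sizes are hypothetical and chosen only for illustration.

```python
import random


def outer(u, v):
    """Return the rank-1 matrix u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def accent_weight(shared, u, v):
    """Combine shared weights with a rank-1 accent-specific term.

    ASSUMPTION: the combination is additive (W = W_shared + u v^T);
    the abstract only states that each weight matrix has a shared and
    an accent-dependent component.
    """
    delta = outer(u, v)
    return [
        [s + d for s, d in zip(shared_row, delta_row)]
        for shared_row, delta_row in zip(shared, delta)
    ]


# Hypothetical layer size; in practice the shared matrix would be
# loaded from the pretrained multi-speaker TTS model.
rng = random.Random(0)
d_out, d_in = 4, 3
shared = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]

# One (u, v) vector pair per accent. Zero vectors leave the pretrained
# weights unchanged, which is one possible starting point for fine-tuning.
accent_factors = {
    "accent_a": ([0.0] * d_out, [0.0] * d_in),
    "accent_b": ([0.1, -0.2, 0.0, 0.3], [1.0, 0.5, -1.0]),
}

weights = {
    name: accent_weight(shared, u, v)
    for name, (u, v) in accent_factors.items()
}

# Parameter cost: a full per-accent matrix versus two vectors.
full_params = d_out * d_in        # 12 for this toy layer
per_accent_params = d_out + d_in  # 7 for this toy layer
```

In this sketch each additional accent costs `d_out + d_in` parameters per layer instead of `d_out * d_in`, which is the kind of per-accent saving the abstract attributes to the rank-1 factorization.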

Keywords

Speech synthesis, accent adaptation, multi-speaker TTS, weight factorization, weight decomposition
