Building and Training of a New Mexican Spanish Voice for Festival

Humberto Pérez Espinosa, Carlos Alberto Reyes García

MICAI'05: Proceedings of the 4th Mexican International Conference on Advances in Artificial Intelligence (2005)

Abstract

In this paper we describe the work done to build a new diphone-concatenation voice for the Spanish spoken in Mexico. The voice is compatible with the Festival text-to-speech synthesis system. In developing each module of the system, the particular features of Spanish were taken into account. We aim to enhance the naturalness of the synthesized voice by including a prosodic model. The model takes four prosodic factors into consideration: phrasing, accentuation, duration, and F0 contour. The duration and F0 prediction models were trained on natural speech corpora. We found the best prediction models by testing several machine learning methods on two different corpora. The paper describes the building and training process, as well as the results and their interpretation.

Keywords

different corpus, diphone concatenation, prosodic factor, F0 contour, new voice, speech synthesis system Festival, best prediction model, prosodic model, F0 prediction model, new Mexican Spanish voice, synthesized voice, prediction model, machine learning
