Voice Jitter Estimation Using High-Order Synchrosqueezing Operators

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2021)

Abstract
Voice jitter is defined as a random perturbation of the glottal cycle duration. It can be useful for voice parametrization, and its estimation usually depends on finding fiducial points in the signal. In this paper, a novel application of the Fourier-based high-order synchrosqueezing (FSSTN) operators to voice signals is introduced for voice jitter estimation without period segmentation. To this end, an innovative interpretation of the relative jitter formula in terms of the total variation of the sequence of periods is proposed. This allows us to derive an algorithm for jitter estimation that uses the (continuous) instantaneous fundamental frequency of the signal and its first derivative (chirp rate), which can be obtained, respectively, from the local complex frequency and the local complex modulation FSSTN operators. Numerical experiments using synthetic signals with known true jitter show that this novel approach yields results similar to those of another state-of-the-art method, PRAAT, for true jitter within the range [0.2%, 1.2%], and that it outperforms PRAAT for true jitter values in the range [1%, 15%]. The proposed method seems a promising tool for voice jitter estimation and constitutes a novel application of the high-order synchrosqueezing operators to voice signals, with potential impact on jitter modeling and on the clinical field.
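The link between relative jitter and total variation can be illustrated on a discrete sequence of periods. The sketch below is not the paper's FSSTN-based algorithm. It assumes the commonly used definition of relative (local) jitter, the mean absolute difference between consecutive periods divided by the mean period, and rewrites the numerator as the total variation of the period sequence. The function names and the sample period values are hypothetical.

```python
from typing import Sequence


def total_variation(periods: Sequence[float]) -> float:
    """Sum of absolute differences between consecutive periods."""
    return sum(abs(b - a) for a, b in zip(periods, periods[1:]))


def relative_jitter(periods: Sequence[float]) -> float:
    """Relative (local) jitter as a fraction, under the assumed definition:
    mean absolute consecutive period difference over the mean period."""
    n = len(periods)
    if n < 2:
        raise ValueError("need at least two periods")
    mean_abs_diff = total_variation(periods) / (n - 1)
    mean_period = sum(periods) / n
    return mean_abs_diff / mean_period


# Hypothetical glottal cycle durations in seconds (roughly 100 Hz voice).
periods = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]
print(f"relative jitter = {100 * relative_jitter(periods):.2f}%")
```

A discrete computation like this still requires the individual periods, which is the segmentation step the proposed method avoids by working from the continuous instantaneous fundamental frequency and chirp rate.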
Keywords
Chirp rate, synchrosqueezing transform, voice jitter, instantaneous frequency