Performance of Speech Recognition Algorithms in Musical Speech used for Speech-Language Pathology Rehabilitation.

MeMeA (2023)

Abstract

Musical speech in speech-language pathology rehabilitation is the production of speech following simple musical (rhythmic or melodic) patterns. This type of speech is used to facilitate speech processing in patients. In this study, we examined the performance of current automatic speech recognition (ASR) algorithms in recognizing normal and musical speech. From an initial list of 28 identified algorithms, 24 were excluded for reasons such as low accuracy, high computational cost, high price, difficulty of use, long runtime, or implementation problems. The four algorithms included were those from Amazon Web Services (AWS Transcribe), Google Speech Recognition, IBM Watson, and Rev AI. We ran the selected algorithms on 60 sentences recorded under four speech conditions (Melodic, Rhythmic, Regular Slow, and Regular Normal). All four algorithms recognized normal speech perfectly. The two algorithms with the best performance on musical speech (rhythmic and melodic) were AWS Transcribe and IBM Watson, both providing recognition accuracy above 98%. When a moderate level of white noise and reverberation was added to the stimuli, AWS Transcribe retained acceptable (> 70%) or satisfactory (> 95%) ASR performance. These results may guide the development of software that uses ASR to enable patients to undergo self-directed sessions of music-based speech-language rehabilitation, such as melodic intonation therapy for post-stroke aphasia. The ability to recognize musical speech makes it possible to compare a patient’s production with the corresponding target phrase and to provide feedback in the absence of a clinician. Given the recommended high intensity of treatment and the limited availability of speech-language pathologists, such software would be highly valuable to healthcare systems.
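
As an illustration of how a transcript might be compared with a target phrase, the following is a minimal sketch that scores an ASR transcript against the target sentence at the word level. The abstract does not state how recognition accuracy was computed, so the metric used here (one minus the word error rate, based on word-level edit distance) is an assumption, and the function names `normalize`, `word_edit_distance`, and `word_accuracy` are hypothetical.

```python
import string


def normalize(text):
    """Lowercase, strip ASCII punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two word lists
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(target, transcript):
    """Assumed metric: 1 - WER, clipped at 0 (not confirmed by the source)."""
    ref = normalize(target)
    hyp = normalize(transcript)
    if not ref:
        raise ValueError("target phrase must contain at least one word")
    errors = word_edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref))


if __name__ == "__main__":
    # Hypothetical target phrase and transcript, for illustration only.
    print(word_accuracy("Good morning, how are you?", "good morning are you"))
```

A rehabilitation application could compare such a score with a threshold to decide whether the patient's production matched the target phrase; any specific threshold would need to be chosen and validated clinically.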

Keywords

automatic speech recognition, speech-to-text, speech-language pathology, rehabilitation, music, aphasia
