The Development of Thai Real-time Captioning Service

Ananlada Chotimongkol, Woottipong Boonma, Nattanun Thatphithakkul, Dechawat Chuengjatupornchai

PROCEEDINGS OF THE 16TH INTERNATIONAL CONVENTION ON REHABILITATION ENGINEERING AND ASSISTIVE TECHNOLOGY, I-CREATE 2023 (2023)

Abstract
Real-time captioning is a challenging task, as an accurate text transcription must be produced within a few seconds after each word is uttered. We have explored several techniques that could be used to produce real-time captions for Thai. We have developed an online real-time captioning platform that supports three different transcription techniques: simultaneous typing, re-voicing, and Automatic Speech Recognition (ASR). We evaluated the quality of the captions produced by the three techniques on two aspects: accuracy and delay. The evaluation results reveal that simultaneous typing is suitable for a less formal speaking style and for speech with a lot of code-switching between Thai and English. As machine learning technology continues to improve, the ASR technique is a promising choice: it achieved the highest accuracy on more formal captioning tasks and required fewer human agents. Furthermore, the proposed two-stage Voice Activity Detection (VAD) scheme reduces the ASR transcription delay by a relative 33%.
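To make the "relative" delay reduction concrete, the following minimal sketch shows how such a figure is typically computed. The function name `relative_reduction` and the two delay values are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fractional reduction of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Hypothetical caption delays in seconds (illustrative only, not from the paper).
baseline_delay = 6.0   # assumed delay without the two-stage VAD scheme
improved_delay = 4.0   # assumed delay with the two-stage VAD scheme

reduction = relative_reduction(baseline_delay, improved_delay)
print(f"Relative delay reduction: {reduction:.0%}")
```

Under these assumed values the reduction is one third of the baseline delay, which is what a relative reduction of about 33% means, as opposed to an absolute drop of 33 percentage points or 33 seconds.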
Keywords
caption, transcription, deaf, hard of hearing, crowdsourcing, revoice, voice activity detection