
LUMINA: Linguistic unified multimodal Indonesian natural audio-visual dataset

Data in Brief (2024)

Abstract
The LUMINA (Linguistic Unified Multimodal Indonesian Natural Audio-Visual) Dataset is a carefully curated constrained audio-visual dataset designed to support research in the field of speech perception. Spoken exclusively in Indonesian, LUMINA contains high-quality audio-visual recordings featuring 14 native speakers, including 9 males and 5 females. Each speaker contributes approximately 1,000 sentences, producing a rich and diverse data collection. The recorded videos focus on facial recordings, capturing essential visual cues and expressions that accompany speech. This extensive dataset provides a valuable resource for understanding how humans perceive and process spoken language, paving the way for speech recognition and synthesis technology advancements. © 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/)
Keywords
Constrained audio-visual dataset, Lip reading, Speech synthesis, Face processing, Computer vision