Automatic Human Utility Evaluation of ASR Systems: Does WER Really Predict Performance?

14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013), Vols. 1-5 (2013)

Abstract
We propose an alternative evaluation metric to Word Error Rate (WER) for the decision audit task of meeting recordings, which exemplifies how to evaluate speech recognition within a legitimate application context. Using machine learning on an initial seed of human-subject experimental data, our alternative metric handily outperforms WER, which correlates very poorly with human subjects' success in finding decisions given ASR transcripts with a range of WERs.
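For context, WER is the word-level edit distance between a reference transcript and an ASR hypothesis, normalized by reference length; it is this metric that the abstract reports correlating poorly with task success. A minimal sketch of the standard computation (the function name `wer` and the example strings are illustrative, not from the paper):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference
    word count, via word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost, # substitution or match
            )
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("the meeting was recorded", "the meeting is recorded")` gives 0.25 (one substitution over four reference words). Note that WER weights every word equally, which is one reason it can diverge from human utility on a task like decision audit, where a few content words matter far more than fillers.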
Keywords
Automatic speech recognition, Evaluation, User study, Ecological validity