Comparing human and automatic speech recognition in simple and complex acoustic scenes.

Constantin Spille, Birger Kollmeier, Bernd T. Meyer

Computer Speech & Language (2018)

Abstract

• Automatic speech recognition and human listeners are compared in single-channel and spatial scenes.
• In single-channel scenes, ASR is on a par with normal-hearing listeners.
• In spatial scenes, there is a substantial human-machine gap of 12.3 dB.
• 5.3 dB of this gap can be attributed to poor localization and missing speaker-related features.

Key words

Human-machine comparison, Speech recognition threshold, Deep neural networks, Speech intelligibility prediction, Spatial scenes
