Real-Time Multimodal Cognitive Assistant for Emergency Medical Services
arXiv (2024)
Abstract
Emergency Medical Services (EMS) responders often work under time pressure,
cognitive overload, and inherent risk, conditions that demand critical
thinking and rapid decision-making. This
paper presents CognitiveEMS, an end-to-end wearable cognitive assistant system
that can act as a collaborative virtual partner engaging in the real-time
acquisition and analysis of multimodal data from an emergency scene and
interacting with EMS responders through Augmented Reality (AR) smart glasses.
CognitiveEMS processes the continuous streams of data in real-time and
leverages edge computing to provide assistance in EMS protocol selection and
intervention recognition. We address key technical challenges in real-time
cognitive assistance by introducing three novel components: (i) a Speech
Recognition model that is fine-tuned for real-world medical emergency
conversations using simulated EMS audio recordings, augmented with synthetic
data generated by large language models (LLMs); (ii) an EMS Protocol Prediction
model that combines state-of-the-art (SOTA) tiny language models with EMS
domain knowledge using graph-based attention mechanisms; (iii) an EMS Action
Recognition module which leverages multimodal audio and video data and protocol
predictions to infer the intervention/treatment actions taken by the responders
at the incident scene. Our results show that the speech recognition model
outperforms SOTA on conversational data (WER of 0.290 vs. 0.618). The protocol
prediction component also significantly outperforms SOTA (top-3 accuracy of
0.800 vs. 0.200), and action recognition achieves an accuracy of 0.727, while
maintaining an end-to-end latency of 3.78 s for protocol prediction on the edge
and 0.31 s on the server.
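
The abstract reports two metrics, word error rate (WER) and top-3 accuracy, without defining them. The sketch below shows how these are conventionally computed; it is not the authors' evaluation code. The function names, the example transcript, and the placeholder protocol labels are all hypothetical, and details such as text normalization before scoring are assumptions left out here.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    labels: Sequence[str],
    k: int = 3,
) -> float:
    """Fraction of samples whose true label is among the k highest-ranked predictions."""
    if len(ranked_predictions) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for ranked, label in zip(ranked_predictions, labels) if label in ranked[:k]
    )
    return hits / len(labels)


if __name__ == "__main__":
    # Hypothetical transcript pair: one dropped word out of four reference words.
    print(word_error_rate("patient is not breathing", "patient not breathing"))
    # Hypothetical ranked protocol labels for two incidents.
    ranked = [["A", "B", "C", "D"], ["D", "C", "B", "A"]]
    print(top_k_accuracy(ranked, ["C", "A"], k=3))
```

Under these definitions a lower WER is better (0 means a perfect transcript, and values above 1 are possible when the hypothesis has many insertions), whereas a higher top-3 accuracy is better.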