AliFuse: Aligning and Fusing Multi-modal Medical Data for Computer-Aided Diagnosis
CoRR (2024)
Abstract
Medical data collected for making a diagnostic decision are typically
multi-modal and provide complementary perspectives on a subject. A
computer-aided diagnosis system can take advantage of multi-modal inputs;
however, fusing such data effectively remains challenging and has attracted
considerable attention in medical research. In this paper, we propose a
transformer-based framework, called Alifuse, for aligning and fusing
multi-modal medical data. Specifically, we convert images and both
unstructured and structured text into vision and language tokens, and use
intramodal and intermodal attention mechanisms to learn holistic
representations of all imaging and non-imaging data for classification. We
apply Alifuse to classify Alzheimer's disease and obtain state-of-the-art
performance on five public datasets, outperforming eight baselines. The
source code will be released online at a later date.
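
The token-and-attention pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, whose details the abstract does not give. It assumes plain scaled dot-product attention without learned projections, residual connections, mean pooling, and a single linear classification head. The names `attention`, `fuse_and_classify`, and `w_cls`, the token counts, the embedding width, and the three-class output are all hypothetical, and random arrays stand in for real vision and language tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Scaled dot-product attention: queries (n, d), keys/values (m, d) -> (n, d).
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values

def fuse_and_classify(vision_tokens, language_tokens, w_cls):
    # Intramodal step (assumed form): self-attention within each modality.
    v = vision_tokens + attention(vision_tokens, vision_tokens, vision_tokens)
    t = language_tokens + attention(language_tokens, language_tokens, language_tokens)
    # Intermodal step (assumed form): each modality attends to the other.
    v_fused = v + attention(v, t, t)
    t_fused = t + attention(t, v, v)
    # Pool each token sequence, concatenate, and apply a linear head.
    pooled = np.concatenate([v_fused.mean(axis=0), t_fused.mean(axis=0)])
    return softmax(pooled @ w_cls)

# Hypothetical sizes: 8 vision tokens, 5 language tokens, width 16, 3 classes.
rng = np.random.default_rng(0)
d_model, n_classes = 16, 3
vision_tokens = rng.normal(size=(8, d_model))
language_tokens = rng.normal(size=(5, d_model))
w_cls = rng.normal(size=(2 * d_model, n_classes))

probs = fuse_and_classify(vision_tokens, language_tokens, w_cls)
print(probs.shape, float(probs.sum()))
```

The sketch only shows the data flow: the two token sequences may differ in length because attention needs only a shared embedding width, and the output is one probability vector per subject. A trained model would add learned query, key, and value projections, multiple heads, and stacked layers.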