Multiple Prompt Fusion for Zero-Shot Lesion Detection Using Vision-Language Models

MICCAI (5)(2023)

Abstract
The success of large-scale pre-trained vision-language models (VLMs) has opened a promising direction for transferring natural image representations to the medical domain by supplying a well-designed prompt that carries expert-level medical knowledge. However, a single prompt can rarely describe a medical lesion thoroughly enough to cover all of its attributes. In addition, models pre-trained on natural images fail to detect lesions precisely. To address these problems, fusing multiple prompts is vital for helping the VLM learn a more comprehensive alignment between the textual and visual modalities. In this paper, we propose an ensemble-guided fusion approach that leverages multiple statements when tackling the phrase grounding task for zero-shot lesion detection. Extensive experiments on three public medical image datasets across different modalities show improved detection accuracy, demonstrating the superiority of our method.
Keywords
Vision-language models, Lesion detection, Multiple prompts, Prompt fusion, Ensemble learning