A Recent Survey of Vision Transformers for Medical Image Segmentation

Asifullah Khan, Zunaira Rauf, Abdul Rehman Khan, Saima Rathore, Saddam Hussain Khan, Sahar Shah, Umair Farooq, Hifsa Asif, Aqsa Asif, Umme Zahoora, Rafi Ullah Khalil, Suleman Qamar, Umme Hani Asif, Faiza Babar Khan, Abdul Majid, Jeonghwan Gwak

CoRR (2023)

Abstract

Medical image segmentation plays a crucial role in various healthcare applications, enabling accurate diagnosis, treatment planning, and disease monitoring. In recent years, Vision Transformers (ViTs) have emerged as a promising technique for addressing the challenges of medical image segmentation. In medical images, structures are usually highly interconnected and globally distributed. ViTs use their multi-scale attention mechanism to model long-range relationships in the images. However, they lack image-related inductive bias and translational invariance, which can degrade their performance. Recently, researchers have proposed various ViT-based approaches that incorporate CNNs in their architectures, known as Hybrid Vision Transformers (HVTs), to capture local correlations in addition to the global information in the images. This survey provides a detailed review of recent advancements in ViTs and HVTs for medical image segmentation. Along with a categorization of ViT- and HVT-based medical image segmentation approaches, we present a detailed overview of their real-time applications across several medical image modalities. This survey may serve as a valuable resource for researchers, healthcare practitioners, and students seeking to understand state-of-the-art approaches to ViT-based medical image segmentation.
