AS-TransUnet: Combining ASPP and Transformer for Semantic Segmentation.

ICIRA (2)(2023)

Abstract
Semantic segmentation is the task of classifying each pixel in an image. Most recent semantic segmentation methods adopt a fully convolutional network (FCN) with an encoder-decoder architecture. The encoder extracts features, and the decoder takes the encoded features as input and decodes them into the final segmentation prediction. However, the convolutional kernels used for feature extraction are small, so the model can only use local information to understand the input image, which limits its initial receptive field. In addition, semantic segmentation requires fine details, such as contextual information, as well as semantic information. To address these problems, we introduce atrous spatial pyramid pooling (ASPP) into TransUnet, a model based on Transformers and U-Net, and call the resulting model AS-TransUnet. The spatial pyramid module enlarges the receptive field and captures multi-scale information. We also add an attention module to the decoder to help the model learn relevant features. To verify the performance and efficiency of the model, we conducted experiments on two common datasets and compared it with recent models. Experimental results demonstrate the superiority of the proposed model.
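To illustrate why a spatial pyramid of dilated (atrous) convolutions widens the receptive field, the following is a minimal 1-D sketch in plain Python. It is not the AS-TransUnet implementation: the function names (`effective_kernel_size`, `dilated_conv1d`, `aspp_1d`), the 1-D setting, the odd kernel size, and the dilation rates are all assumptions chosen for illustration, since the abstract does not specify them.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated cross-correlation.

    Assumes an odd kernel length so the kernel is centred on each position.
    """
    k = len(kernel)
    half = (k - 1) * dilation // 2  # offset from the centre to the first tap
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(k):
            idx = i + j * dilation - half
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += kernel[j] * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel dilated branches and stack their outputs per position.

    The dilation rates are hypothetical; the stacking plays the role of
    channel-wise concatenation in a 2-D ASPP module.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for d in (1, 2, 4):
        print(d, effective_kernel_size(3, d), dilated_conv1d(impulse, [1, 1, 1], d))
```

Each branch uses the same three weights, but a larger dilation spreads them over a wider span of the input, so stacking the branches gives every position features drawn from several scales at no extra parameter cost per branch.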
Keywords
semantic segmentation, ASPP, Transformer, AS-TransUnet