Vision transformer-based autonomous crack detection on asphalt and concrete surfaces

Elyas Asadi Shamsabadi,Chang Xu,Aravinda S. Rao,Tuan Nguyen,Tuan Ngo,Daniel Dias-da-Costa

Automation in Construction（2022）

引用 29|浏览28

暂无评分

摘要

Previous research has shown the high accuracy of convolutional neural networks (CNNs) in asphalt and concrete crack detection in controlled conditions. Yet, human-like generalisation remains a significant challenge for industrial applications where the range of conditions varies significantly. Given the intrinsic biases of CNNs, this paper proposes a vision transformer (ViT)-based framework for crack detection on asphalt and concrete surfaces. With transfer learning and the differentiable intersection over union (IoU) loss function, the encoder-decoder network equipped with ViT could achieve an enhanced real-world crack segmentation performance. Compared to the CNN-based models (DeepLabv3+ and U-Net), TransUNet with a CNN-ViT backbone achieved up to ~61% and ~3.8% better mean IoU on the original images of the respective datasets with very small and multi-scale crack semantics. Moreover, ViT assisted the encoder-decoder network to show a robust performance against various noisy signals where the mean Dice score attained by the CNN-based models significantly dropped (<10%).

查看译文

关键词

Crack detection,Deep learning,Vision transformer,Convolutional neural network,Human recognition system

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要