Modified ResNet-152 Network With Hybrid Pyramidal Pooling for Local Change Detection

IEEE Transactions on Artificial Intelligence (2023)

Abstract
Background subtraction is an essential step in many computer vision tasks. In this article, we detect local changes in challenging video scenes by exploring the capabilities of an encoder-decoder network that employs a modified ResNet-152 architecture with a multiscale feature extraction framework. The proposed encoder consists of a modified ResNet-152 network in which the first two blocks are frozen and the weights of the last blocks are learned through transfer learning. This encoder reduces the computational complexity of the proposed model and extracts both fine-scale and coarse-scale features. We propose a multiscale feature extraction (MFE) block that hybridizes a pyramidal pooling architecture (PPA) with several atrous convolutional layers; it takes the high-level features from the encoder and extracts features at various scales. The PPA in the MFE block preserves the maximum value in every pooling area, which retains the contextual relationships between pixels in complex video frames and helps the model handle various challenging scenes. The proposed decoder consists of stacked transposed convolution layers that learn a mapping from feature space to image space and predict a score map. A threshold is then applied to the score map to obtain binary class labels, background and foreground. Shortcut connections followed by global average pooling (GAP) carry the low-level feature coefficients from the encoder to the decoder to enhance the feature representation. The performance of the proposed scheme is validated by comparing it against thirty-one state-of-the-art techniques, and the results are corroborated both qualitatively and quantitatively. Further, the efficacy of the proposed algorithm is verified with an unseen video setup, where it is found to provide better performance.
Keywords
Background subtraction, deep neural network, multi-scale features extraction block
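
Two steps described in the abstract, max pooling over a pyramid of region sizes and thresholding the predicted score map into background/foreground labels, can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the bin sizes (1, 2, 4), and the 0.5 threshold are assumptions chosen for illustration, and the atrous convolutions, ResNet-152 encoder, and transposed-convolution decoder are omitted entirely.

```python
from typing import List, Sequence

Grid = List[List[float]]


def _ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def max_pool_to_bins(feature_map: Grid, bins: int) -> Grid:
    """Max-pool a 2D map into a bins x bins grid, keeping each region's maximum."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        # Rows covered by bin i; the ceiling on the upper edge ensures every row is used.
        r0, r1 = (i * height) // bins, _ceil_div((i + 1) * height, bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * width) // bins, _ceil_div((j + 1) * width, bins)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


def pyramid_max_pool(feature_map: Grid,
                     bin_sizes: Sequence[int] = (1, 2, 4)) -> List[Grid]:
    """Pool the same map at several scales (bin sizes are assumed, not from the paper)."""
    return [max_pool_to_bins(feature_map, b) for b in bin_sizes]


def threshold_score_map(score_map: Grid, threshold: float = 0.5) -> List[List[int]]:
    """Label each pixel 1 (foreground) if its score reaches the assumed threshold, else 0."""
    return [[1 if s >= threshold else 0 for s in row] for row in score_map]


if __name__ == "__main__":
    scores = [
        [0.1, 0.2, 0.3, 0.4],
        [0.5, 0.6, 0.7, 0.8],
        [0.9, 0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6, 0.7],
    ]
    for level in pyramid_max_pool(scores):
        print(level)
    print(threshold_score_map(scores))
```

Coarser bins summarize wider context while finer bins keep local detail, which is the intuition behind pooling at several scales. In a real network these operations would act on multi-channel tensors inside a deep learning framework rather than on nested lists.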