
Gated 3D-CNN for Action Recognition

Recent Challenges in Intelligent Information and Database Systems (2022)

Abstract
Human action recognition is an active field in computer vision. Most approaches build on well-developed image recognition algorithms that use convolutional neural networks (CNNs) or recurrent neural networks (RNNs). Action recognition is considered more challenging than image recognition because a video is a sequence of images that changes from frame to frame, so the model has to handle spatial and temporal information simultaneously. Recently proposed methods that use the two-stream fusion technique perform well on such tasks. However, these methods are computationally expensive and complex to build when learning the spatio-temporal dependencies of an action. This paper proposes a simple yet efficient deep neural network architecture, Gated 3D-CNN, which consists of 3D convolutional layers and gating modules that act like an LSTM model to learn spatial and temporal dependencies and attend to essential features. The proposed method first learns spatial and temporal features of actions through a 3D-CNN. Then, sigmoid-gated 3D convolution layers with local and global gating help direct attention to the essential features of the action. The proposed architecture is comparatively simple to implement and gives competitive performance on the UCF-101 dataset.
Keywords
3D-CNN, Attention mechanism, Action recognition, Gated 3D-CNN
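
The sigmoid gating idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture: the abstract does not specify layer sizes, channel counts, or how local and global gating differ, so the code below assumes a single channel, one feature kernel, and one gate kernel. The function names conv3d, sigmoid, and gated_conv3d are hypothetical and chosen only for illustration. It uses plain Python lists so that it runs without any deep learning library.

```python
import math


def conv3d(video, kernel):
    """Naive single-channel 'valid' 3D cross-correlation over a (T, H, W) volume."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                s = 0.0
                # Sum over the spatio-temporal neighbourhood (time, height, width).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            s += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_conv3d(video, feat_kernel, gate_kernel):
    """Assumed gating form: conv3d(video, feat_kernel) * sigmoid(conv3d(video, gate_kernel)).

    The gate lies in (0, 1) and scales each feature, so positions with a
    gate near 1 pass through and positions with a gate near 0 are suppressed.
    """
    features = conv3d(video, feat_kernel)
    gates = conv3d(video, gate_kernel)
    return [
        [
            [f * sigmoid(g) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(f_plane, g_plane)
        ]
        for f_plane, g_plane in zip(features, gates)
    ]
```

In this sketch, an all-zero gate kernel produces a gate of 0.5 everywhere, so the output is exactly half of the ungated features. A learned gate kernel would instead weight spatio-temporal positions unevenly, which is the attention-like behaviour the abstract refers to.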