Human action recognition by multiple spatial clues network

Neurocomputing (2022)

Abstract
Human actions can be recognized in still images because the whole image conveys an action through spatial clues such as human poses, action-specific parts, and global surroundings. To represent these spatial clues, recent methods rely on labor-intensive annotations to locate the human body and objects, and they are computationally intensive. To eliminate strong supervision, a Multiple Spatial Clues Network (MSCNet) is proposed to represent the spatial clues using only image-level action labels. Neither accurate manually annotated bounding boxes nor extra labeled datasets are required as additional supervision. First, MSCNet uses a spatial-attention module to generate spatial attention regions and detects the spatial clues with minimal supervision. Then, a spatial clues exploitation stage utilizes the learned spatial clues through three modules: the context module, the body + context module, and the body + semantics module. Experiments on three benchmark datasets demonstrate the effectiveness of the proposed MSCNet.
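The abstract does not specify how the spatial-attention module is built, so the following is only a generic sketch of the idea of deriving an attention region from a feature map without box annotations. The function names, the channel-mean scoring, the softmax normalization, and the 0.5 threshold ratio are all assumptions made for illustration; none of them is taken from MSCNet.

```python
import math

def spatial_attention(features):
    """Hypothetical helper: turn C feature channels (each H x W) into an
    H x W attention map whose entries sum to 1."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    # Assumption: score each location by its mean activation across channels.
    scores = [[sum(features[c][i][j] for c in range(channels)) / channels
               for j in range(width)] for i in range(height)]
    # Softmax over all spatial locations (peak subtracted for stability).
    peak = max(max(row) for row in scores)
    weights = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def attention_region(attention, ratio=0.5):
    """Hypothetical helper: inclusive bounding box (top, bottom, left, right)
    of the cells whose attention is at least ratio * max attention."""
    threshold = ratio * max(max(row) for row in attention)
    cells = [(i, j) for i, row in enumerate(attention)
             for j, value in enumerate(row) if value >= threshold]
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    return min(rows), max(rows), min(cols), max(cols)

# Toy input: 4 channels of size 6 x 6 with a strongly activated 2 x 2 patch.
features = [[[5.0 if 2 <= i <= 3 and 1 <= j <= 2 else 0.0
              for j in range(6)] for i in range(6)] for _ in range(4)]
att = spatial_attention(features)
box = attention_region(att)
print(box)
```

In a weakly supervised setting, a region obtained this way could stand in for an annotated bounding box when cropping body or context features; how MSCNet actually does this is described in the full paper, not here.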
Keywords
Human action recognition, Deep learning, Weakly supervised learning, Attention module