Two-Stream Network with 3D Common-Specific Framework for RGB-D Action Recognition

Xiaolei Qin, Yongxin Ge, Jinyuan Feng, Yida Chen, Liuwei Zhan, Xuchu Wang, Yuangan Wang

2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI) (2019)

Abstract

This paper presents TSN-3DCSF, a novel end-to-end network for RGB-D action recognition in video. Because depth information captures the relationships between different body parts, which is very helpful for action recognition, we use it together with RGB information as input to our framework. Although the characteristics of the two modalities are quite different, they carry consistent semantic information, so extracting their common and specific features is meaningful for action recognition. Unlike most works, which obtain temporal information from optical flow, our approach uses 3D convolution to build a layer that extracts temporal information and common-specific features simultaneously, which improves accuracy and reduces computation. Extensive experiments on three widely used RGB-D action datasets show that our method achieves performance comparable to state-of-the-art methods.
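
To make the role of 3D convolution concrete, the following is a minimal sketch in plain Python, not the TSN-3DCSF layer itself, whose configuration the abstract does not give. It assumes a single-channel clip indexed as clip[t][y][x]; the names conv3d_valid and temporal_diff and the toy clip are hypothetical and chosen only for illustration. It shows how a kernel that spans the time axis can respond to motion directly, without a separate optical-flow step.

```python
# Illustrative sketch only: a plain-Python "valid" 3D convolution
# (cross-correlation, as is conventional in deep learning) over a
# single-channel clip indexed as clip[t][y][x].
# This is NOT the TSN-3DCSF layer; all names and data are hypothetical.

def conv3d_valid(clip, kernel):
    """Slide kernel (kt x kh x kw) over clip (T x H x W), stride 1, no padding."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal and both spatial kernel axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical 2x1x1 kernel: next frame minus current frame, so the
# response is non-zero only where pixel values change over time.
temporal_diff = [[[-1.0]], [[1.0]]]

# Toy 3-frame, 2x2 clip: a bright pixel moves from (0, 0) to (0, 1)
# between frames 0 and 1, then stays still.
clip = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]

response = conv3d_valid(clip, temporal_diff)
```

In this toy case the first output frame is non-zero where the pixel moved and the second is all zeros because nothing changed. In a two-stream setting, the same kind of operation would presumably be applied to both the RGB and the depth clips, with learned rather than hand-set kernels.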

Keywords

common-specific, RGB-D, action recognition