Action Recognition Based on Dense Action Captioning

Chenglong Zhang, Lijuan Zhou, Changyong Niu

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing (2023)

Abstract
Visually similar actions are often easy to tell apart in a textual description, which creates an opportunity to use text to assist action recognition. This paper proposes a novel action recognition method based on dense action captioning. Because an action may comprise multiple temporally related sub-actions, the paper extends the idea of video captioning to actions: multiple descriptions are generated from an action video sequence, and each description corresponds to one sub-action sequence. To this end, temporal constraints are added to a dense video captioning model. With the generated descriptions, actions are recognized by a decision fusion strategy over both visual and textual representations, so that the classification of visually similar actions can be refined using the textual classification of the generated descriptions. The proposed method could also be applied to other action recognition models that rely only on visual representations. Experiments on the WorkoutUOW-18 and TAPOS datasets demonstrate the effectiveness of the proposed method for action captioning and classification.
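
The abstract does not state which fusion rule is used, so the following is only a minimal sketch of one common form of decision (late) fusion: a weighted average of the class probabilities produced by a visual classifier and by a text classifier run on the generated descriptions. The function names, the `text_weight` parameter, and the example probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_decisions(
    visual_probs: Sequence[float],
    text_probs: Sequence[float],
    text_weight: float = 0.5,  # hypothetical mixing weight, not from the paper
) -> List[float]:
    """Weighted average of two class-probability vectors (late fusion)."""
    if len(visual_probs) != len(text_probs):
        raise ValueError("both classifiers must score the same set of classes")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    return [
        (1.0 - text_weight) * v + text_weight * t
        for v, t in zip(visual_probs, text_probs)
    ]


def predict(probs: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Illustrative numbers only: classes 0 and 1 look alike visually,
# but the text classifier separates them clearly.
visual = [0.48, 0.46, 0.06]
textual = [0.15, 0.80, 0.05]
fused = fuse_decisions(visual, textual, text_weight=0.5)

visual_only_label = predict(visual)
fused_label = predict(fused)
```

In this made-up example the visual scores for the first two classes are nearly tied, and the textual scores shift the fused decision toward the second class, which is the kind of refinement the abstract describes for visually similar actions.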