Temporal Action Detection with Structured Segment Networks

International Journal of Computer Vision (2019)

Abstract
This paper addresses an important and challenging task: detecting the temporal intervals of actions in untrimmed videos. We present a framework called the structured segment network (SSN), which is built on temporal action proposals. SSN models the temporal structure of each action instance via a structured temporal pyramid. On top of the pyramid, we introduce a decomposed discriminative model comprising two classifiers, one for classifying actions and one for determining completeness. This allows the framework to effectively distinguish positive proposals from background or incomplete ones, leading to both accurate recognition and precise localization. These components are integrated into a unified network that can be trained efficiently in an end-to-end manner. Additionally, a simple yet effective temporal action proposal scheme, dubbed temporal actionness grouping, is devised to generate high-quality action proposals. We further study the importance of the decomposed discriminative model and discover a way to achieve similar accuracy using a single classifier, which is also complementary to the original SSN design. On two challenging benchmarks, THUMOS’14 and ActivityNet, our method remarkably outperforms previous state-of-the-art methods, demonstrating superior accuracy and strong adaptivity in handling actions with various temporal structures.
Keywords
Temporal action detection, Temporal action localization, Temporal action proposals, Human action recognition
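
The decomposed discriminative model described in the abstract pairs an action classifier with a completeness classifier, so that a proposal counts as a detection only when both agree. The following is a minimal illustrative sketch of that idea, not the authors' implementation. The abstract does not say how the two outputs are combined, so the product rule, the per-class sigmoid, the background class at index 0, and the name `score_proposal` are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def score_proposal(activity_logits, completeness_logits):
    """Combine the two classifier outputs into one score per action class.

    activity_logits: K + 1 logits; index 0 is background (assumption).
    completeness_logits: K logits, one per action class (assumption).
    Returns K scores, each the product of the class probability and the
    probability that the proposal covers a complete instance of that class.
    """
    class_probs = softmax(activity_logits)
    scores = []
    for k, c_logit in enumerate(completeness_logits):
        # Skip the background entry: action class k sits at index k + 1.
        scores.append(class_probs[k + 1] * sigmoid(c_logit))
    return scores


# Same action evidence, but the second proposal is judged incomplete
# for class 0, so its score for that class drops.
complete = score_proposal([0.0, 2.0, 0.0], [3.0, 3.0])
incomplete = score_proposal([0.0, 2.0, 0.0], [-3.0, 3.0])
```

Under these assumptions, a proposal that overlaps an action only partially can still look like the right class to the action classifier, and the completeness term is what suppresses it.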