A Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos.

2019 IEEE Winter Conference on Applications of Computer Vision (WACV)(2019)

引用 46|浏览197
暂无评分
摘要
Existing approaches for spatio-temporal action detection in videos are limited by the spatial extent and temporal duration of the actions. In this paper, we present a modular system for spatio-temporal action detection in untrimmed surveillance videos. We propose a two stage approach. The first stage generates dense spatio-temporal proposals using hierarchical clustering and temporal jittering techniques on frame-wise object detections. The second stage is a Temporal Refinement I3D (TRI-3D) network that performs action classification and temporal refinement on the generated proposals. The object detection-based proposal generation step helps in detecting actions occurring in a small spatial region of a video frame, while temporal jittering and refinement helps in detecting actions of variable lengths. Experimental results on an unconstrained surveillance action detection dataset - DIVA - show the effectiveness of our system. For comparison, the performance of our system is also evaluated on the THUMOS'14 temporal action detection dataset.
更多
查看译文
关键词
Videos,Proposals,Security,Object detection,Task analysis,Training,Automobiles
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要