Anchor-free temporal action localization via Progressive Boundary-aware Boosting

Information Processing & Management(2023)

引用 0|浏览56
暂无评分
摘要
Enormous untrimmed videos from the real world are difficult to analyze and manage. Temporal action localization algorithms can help us to locate and recognize human activity clips in untrimmed videos. Recently, anchor-free temporal action localization methods have gained increasing attention due to small computational costs and no complex hyperparameters of pre-set anchors. Although the performance has been significantly improved, most existing anchor-free temporal action localization methods still suffer from inaccurate action boundary predictions. In this paper, we want to alleviate the above problem through boundary refinement and temporal context aggregation. To this end, a novel Progressive Boundary-aware Boosting Network (PBBNet) is proposed for anchor-free temporal action localization. The PBBNet consists of three main modules: Temporal Context-aware Module (TCM), Instance-wise Boundary-aware Module (IBM), and Frame-wise Progressive Boundary-aware Module (FPBM). The TCM aggregates the temporal context information and provides features for the IBM and the FPBM. The IBM generates multi-scale video features to predict action results coarsely. Compared with IBM, the FPBM focuses on instance features corresponding to action predictions and uses more supervision information for boundary regression. Given action results from IBM, the FPBM uses a progressive boosting strategy to refine the boundary predictions multiple times with supervision from weak to strong. Extensive experiments on three benchmark datasets THUMOS14, ActivityNet-v1.3 and HACS show our PPBNet outperforms all existing anchor-free methods. Further, our PPBNet achieves state-of-the-art performance (72.5% mAP at tIoU = 0.5) on THUMOS14 dataset.
更多
查看译文
关键词
Temporal action localization,Anchor-free,Video understanding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要