BoxMask: Revisiting Bounding Box Supervision for Video Object Detection

Khurram Azeem Hashmi,Alain Pagani,Didier Stricker,Muhammad Zeshan Afzal

WACV (2023)

Abstract

We present a new, simple yet effective approach to improve video object detection. We observe that prior works rely on instance-level feature aggregation, which inherently neglects refined pixel-level representations and causes confusion among objects that share similar appearance or motion characteristics. To address this limitation, we propose BoxMask, which learns discriminative representations by incorporating class-aware pixel-level information. To supervise our method, we simply treat bounding box-level annotations as a coarse mask for each object. The proposed module can be effortlessly integrated into any region-based detector to boost detection. Extensive experiments on the ImageNet VID and EPIC-KITCHENS datasets demonstrate consistent and significant improvements when we plug our BoxMask module into numerous recent state-of-the-art methods. The code will be available at https://github.com/khurramHashmi/BoxMask.
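
The idea of treating box annotations as coarse class-aware masks can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `boxes_to_class_mask`, the `(x1, y1, x2, y2, class_id)` box layout with exclusive lower-right corner, the use of 0 for background, and the rule that later boxes overwrite earlier ones where they overlap are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2, class_id), with x2/y2 exclusive.
Box = Tuple[int, int, int, int, int]


def boxes_to_class_mask(height: int, width: int, boxes: List[Box]) -> List[List[int]]:
    """Rasterize boxes into a per-pixel class-ID map (0 = background)."""
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2, class_id in boxes:
        # Clip the box to the image bounds before filling.
        x1c, y1c = max(0, x1), max(0, y1)
        x2c, y2c = min(width, x2), min(height, y2)
        for y in range(y1c, y2c):
            for x in range(x1c, x2c):
                # Assumption: a later box overwrites an earlier one on overlap.
                mask[y][x] = class_id
    return mask


if __name__ == "__main__":
    # Two overlapping boxes with class IDs 1 and 2 on a 4x5 image.
    for row in boxes_to_class_mask(4, 5, [(0, 0, 2, 2, 1), (1, 1, 4, 3, 2)]):
        print(row)
```

A mask like this could serve as a coarse per-pixel target for a segmentation-style loss; how overlaps are resolved and how the mask is aligned to feature-map resolution are design choices the abstract does not specify.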

Keywords

revisiting bounding boxmask supervision, detection, video, object