Video Object Segmentation with Dynamic Query Modulation

2024 IEEE International Conference on Multimedia and Expo (ICME) (2024)

Abstract

Storing intermediate frame segmentations as memory for long-range context modeling, spatial-temporal memory-based methods have recently showcased impressive results in semi-supervised video object segmentation (SVOS). However, these methods face two key limitations: 1) relying on non-local pixel-level matching to read memory, resulting in noisy retrieved features for segmentation; 2) segmenting each object independently without interaction. These shortcomings make the memory-based methods struggle in similar object and multi-object segmentation. To address these issues, we propose a query modulation method, termed QMVOS. This method summarizes object features into dynamic queries and then treats them as dynamic filters for mask prediction, thereby providing high-level descriptions and object-level perception for the model. Efficient and effective multi-object interactions are realized through inter-query attention. Extensive experiments demonstrate that our method can bring significant improvements to the memory-based SVOS method and achieve competitive performance on standard SVOS benchmarks. The code is available at https://github.com/zht8506/QMVOS.
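
The three ideas named in the abstract (summarizing object features into queries, letting queries interact, and using queries as dynamic filters) can be illustrated with a toy sketch. This is not the QMVOS implementation: the function names are hypothetical, masked average pooling is an assumed way of summarizing object features, and the attention step omits the learned projections a real model would use.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    peak = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def summarize_query(features, mask):
    """Assumed summarization: masked average pooling of N pixel features
    (each a list of C floats) into one C-dimensional object query."""
    total = sum(mask)
    channels = len(features[0])
    if total == 0:
        return [0.0] * channels
    return [sum(m * f[c] for f, m in zip(features, mask)) / total
            for c in range(channels)]

def inter_query_attention(queries):
    """Plain scaled dot-product self-attention across object queries,
    with a residual connection and no learned projections (assumption)."""
    scale = math.sqrt(len(queries[0]))
    updated = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in queries])
        mixed = [sum(w * k[c] for w, k in zip(weights, queries))
                 for c in range(len(q))]
        updated.append([a + b for a, b in zip(q, mixed)])
    return updated

def predict_masks(features, queries):
    """Treat each query as a 1x1 dynamic filter: the dot product with a
    pixel feature is that object's logit at that pixel."""
    return [[dot(q, f) for f in features] for q in queries]

# Toy data: four pixels with two-channel features, two objects.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
prev_masks = [[1, 1, 0, 0], [0, 0, 1, 1]]  # masks from an earlier frame

queries = [summarize_query(features, m) for m in prev_masks]
queries = inter_query_attention(queries)
logits = predict_masks(features, queries)

# Assign each pixel to the object with the highest logit.
labels = [max(range(len(queries)), key=lambda k: logits[k][i])
          for i in range(len(features))]
print(labels)
```

Because every object gets its own query and the queries attend to one another before filtering, each mask prediction can take the other objects into account, which is the object-level interaction the abstract contrasts with independent per-object segmentation.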
