ℒ𝒪^2 net: Global–Local Semantics Coupled Network for scene-specific video foreground extraction with less supervision

Pattern Analysis and Applications (2023)

Abstract
Video foreground extraction has been widely applied in quantitative fields and attracts great attention all over the world. Nevertheless, the performance of such a method can easily degrade in visually cluttered environments. To tackle this problem, global semantics (e.g., background statistics) and local semantics (e.g., boundary areas) can be utilized to better distinguish foreground objects from a complex background. In this paper, we investigate how to effectively leverage these two kinds of semantics. For global semantics, two convolutional modules are designed to take advantage of data-level background priors and feature-level multi-scale characteristics, respectively; for local semantics, another module is further put forward to be aware of the semantic edges between foreground and background. The three modules are intertwined with each other, yielding a simple yet effective deep framework named gℒ𝒪bal–ℒ𝒪cal Semantics Coupled Network (ℒ𝒪^2 Net), which is end-to-end trainable in a scene-specific manner. Benefiting from the ℒ𝒪^2 Net, we achieve superior performance on multiple public datasets while requiring less supervision than several state-of-the-art methods.
Keywords
Video foreground extraction, Scene-specific training, Deep neural network, Semantic edge, Multi-scale features
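The abstract's core idea of coupling global semantics (background statistics) with local semantics (boundary evidence) can be illustrated with a minimal classical heuristic. The sketch below is an assumption-laden analogue, not the paper's network: it uses a per-pixel temporal median as the data-level background prior and a gradient magnitude as a stand-in for semantic edges; the function name and thresholds are hypothetical.

```python
import numpy as np

def foreground_mask(frames, frame, bg_thresh=30.0, edge_thresh=20.0):
    """Toy global-local foreground heuristic (illustrative only).

    frames: list of grayscale background frames (2-D uint8 arrays)
    frame:  grayscale query frame (2-D uint8 array)
    """
    # Global semantics: background statistics via a per-pixel temporal median.
    background = np.median(np.stack(frames).astype(float), axis=0)
    diff = np.abs(frame.astype(float) - background)
    global_cue = diff > bg_thresh

    # Local semantics: boundary areas via gradient magnitude of the frame.
    gy, gx = np.gradient(frame.astype(float))
    edge_cue = np.hypot(gx, gy) > edge_thresh

    # Coupling: confident global detections, plus weakly differing pixels
    # that are supported by local edge evidence.
    return global_cue | (edge_cue & (diff > 0.5 * bg_thresh))
```

A learned version replaces the median with a background-prior module and the gradient with an edge-aware module, trained end to end per scene as the abstract describes; the hand-set thresholds here are what the scene-specific training would implicitly absorb.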