Universal Relocalizer for Weakly Supervised Referring Expression Grounding

ACM Transactions on Multimedia Computing, Communications, and Applications (2024)

Abstract
This paper introduces the Universal Relocalizer, an approach to weakly supervised referring expression grounding. Our method localizes the target proposal that corresponds to a given query without requiring region-level annotations during training. To improve localization precision and enrich the semantic understanding of the target proposal, we devise three modules: a category module, a color module, and a spatial relationship module. The category and color modules assign category and color labels to region proposals, from which category and color scores are computed. The spatial relationship module incorporates spatial cues and yields a spatial score for each proposal, further improving localization accuracy. Combining the category, color, and spatial scores gives a refined grounding score for every proposal. Evaluations on the RefCOCO, RefCOCO+, and RefCOCOg datasets demonstrate the effectiveness of the Universal Relocalizer.
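The abstract does not say how the three scores are combined, so the following is only a minimal sketch under a stated assumption: the grounding score is taken to be a weighted sum of the category, color, and spatial scores, and the proposal with the highest grounding score is selected. The names `ProposalScores`, `grounding_score`, `select_proposal`, and the weight parameters are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProposalScores:
    """Per-proposal scores; how each is computed is outside this sketch."""
    category: float
    color: float
    spatial: float


def grounding_score(s: ProposalScores,
                    w_category: float = 1.0,
                    w_color: float = 1.0,
                    w_spatial: float = 1.0) -> float:
    # ASSUMPTION: a weighted sum. The paper may combine the scores differently.
    return w_category * s.category + w_color * s.color + w_spatial * s.spatial


def select_proposal(scores: Sequence[ProposalScores], **weights: float) -> int:
    """Return the index of the proposal with the highest grounding score."""
    if not scores:
        raise ValueError("at least one proposal is required")
    return max(range(len(scores)),
               key=lambda i: grounding_score(scores[i], **weights))


# Made-up scores for three region proposals, for illustration only.
example = [
    ProposalScores(category=0.9, color=0.1, spatial=0.2),
    ProposalScores(category=0.8, color=0.9, spatial=0.7),
    ProposalScores(category=0.2, color=0.3, spatial=0.9),
]
best = select_proposal(example)  # index 1 under equal weights
```

Raising one weight shifts the selection toward proposals that are strong on that cue. For instance, `select_proposal(example, w_spatial=10.0)` picks the third proposal, whose spatial score dominates.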