Self-enhanced training framework for referring expression grounding

2023 IEEE International Conference on Image Processing (ICIP 2023)

Abstract
Weakly-supervised referring expression grounding (REG) aims to locate the image region described by a query sentence, where the mapping between the referred region and the query is not available during training. Noting the significant gap between fully- and weakly-supervised approaches, we develop a Self-Enhanced Training (SET) framework in this paper. Specifically, we first train the network under a weakly-supervised setting. Then, the model outputs are collected, filtered according to their confidence scores, and used as pseudo-labels. Finally, with the help of these pseudo-labels, we fine-tune the model under a fully-supervised setting. The SET framework provides a simple way of generating pseudo-labels that bridge weak and full supervision. Experimental results demonstrate that a model trained with our SET framework outperforms existing traditional methods on the RefCOCO, RefCOCO+, and RefCOCOg datasets. The code is available at https://github.com/HTDL98/SET-framework.
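The pseudo-label filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Prediction` type, the `select_pseudo_labels` function, the rule of keeping one top-scoring box per query, and the threshold value of 0.8 are all assumptions made for illustration, since the abstract only states that outputs are filtered by confidence score.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Prediction:
    """One region predicted by the weakly-supervised model for a query (assumed structure)."""
    query_id: str
    box: Box
    score: float  # model confidence, assumed to lie in [0, 1]


def select_pseudo_labels(
    predictions: List[Prediction], threshold: float = 0.8
) -> Dict[str, Box]:
    """Keep the top-scoring box per query, then drop queries below the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    best: Dict[str, Prediction] = {}
    for pred in predictions:
        current = best.get(pred.query_id)
        if current is None or pred.score > current.score:
            best[pred.query_id] = pred
    # Only confident predictions become pseudo-labels for the supervised stage.
    return {qid: p.box for qid, p in best.items() if p.score >= threshold}


if __name__ == "__main__":
    preds = [
        Prediction("q1", (0.0, 0.0, 10.0, 10.0), 0.9),
        Prediction("q1", (5.0, 5.0, 20.0, 20.0), 0.4),
        Prediction("q2", (1.0, 1.0, 2.0, 2.0), 0.5),
    ]
    print(select_pseudo_labels(preds))
```

In this sketch, queries whose best prediction falls below the threshold receive no pseudo-label and would simply be left out of the fully-supervised fine-tuning stage; whether the actual framework handles them this way is not stated in the abstract.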
Keywords
Referring expression grounding, weakly-supervised, fully-supervised, pseudo-label