Self-enhanced training framework for referring expression grounding

Yitao Chen, Ruoyi Du, Kongming Liang, Zhanyu Ma

2023 IEEE International Conference on Image Processing, ICIP (2023)

Abstract

Weakly-supervised referring expression grounding (REG) aims to locate the image region described by a query sentence when the mapping between the referential region and the query is not available during training. Noting the significant gap between fully- and weakly-supervised approaches, we develop a Self-Enhanced Training (SET) framework in this paper. Specifically, we first train the network under a weakly-supervised setting. Then, the model outputs are collected, filtered according to their confidence scores, and used as pseudo-labels. Finally, with the help of these pseudo-labels, we tune the model under a fully-supervised setting. The SET framework provides a simple way of generating pseudo-labels that bridge weak and full supervision. Experimental results demonstrate that a model trained with our SET framework outperforms existing traditional methods on the RefCOCO, RefCOCO+, and RefCOCOg datasets. The code is available at https://github.com/HTDL98/SET-framework.
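The pseudo-label filtering step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the `PseudoLabel` type, the `collect_pseudo_labels` function, the fixed `threshold` of 0.5, and the example scores are all hypothetical, and the abstract does not say whether the filter is a fixed threshold, a top-k selection, or something else. The sketch assumes the weakly-supervised model yields one confidence score per region proposal for each query.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PseudoLabel:
    query_id: int      # index of the query sentence
    region_index: int  # index of the region proposal picked by the model
    confidence: float  # confidence score of that proposal


def collect_pseudo_labels(
    scores_per_query: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off, not a value from the paper
) -> List[PseudoLabel]:
    """Keep the top-scoring region of each query if it reaches the threshold."""
    labels: List[PseudoLabel] = []
    for query_id, scores in enumerate(scores_per_query):
        if not scores:
            continue  # this query has no region proposals
        best = max(range(len(scores)), key=lambda i: scores[i])
        if scores[best] >= threshold:
            labels.append(PseudoLabel(query_id, best, scores[best]))
    return labels


# Hypothetical scores: three queries, three region proposals each.
scores = [
    [0.10, 0.85, 0.05],  # confident prediction, proposal 1 is kept
    [0.40, 0.35, 0.25],  # ambiguous prediction, dropped by the filter
    [0.05, 0.15, 0.80],  # confident prediction, proposal 2 is kept
]
pseudo_labels = collect_pseudo_labels(scores, threshold=0.5)
```

In a pipeline of this shape, the retained (query, region) pairs would then serve as supervision targets for the fully-supervised tuning stage, while the discarded queries would simply be left out of that stage.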

Keywords

Referring expression grounding, weakly-supervised, fully-supervised, pseudo-label
