DeVLBert: Out-of-distribution Visio-Linguistic Pretraining with Causality

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2021)

Abstract
In this paper, we propose to investigate out-of-domain visio-linguistic pretraining, where the pretraining data distribution differs from that of the downstream data on which the pretrained model will be fine-tuned. Existing methods for this problem are purely likelihood-based, leading to spurious correlations that hurt generalization ability when the model is transferred to out-of-domain downstream tasks. By spurious correlation, we mean that the conditional probability of one token (object or word) given another can be high (due to dataset biases) without a robust (causal) relationship between them. To mitigate such dataset biases, we propose a Deconfounded Visio-Linguistic Bert framework, abbreviated as DeVLBert, to perform intervention-based learning. We borrow the idea of backdoor adjustment from the research field of causality and propose several neural-network-based architectures for Bert-style out-of-domain pretraining. The quantitative results on three downstream tasks, Image Retrieval (IR), Zero-shot IR, and Visual Question Answering, show the effectiveness of DeVLBert by boosting generalization ability.
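
A minimal sketch of the backdoor adjustment mentioned above, in standard causal notation; the confounder z and its prior P(z) are generic symbols from the causality literature, not quantities defined in this abstract. A purely likelihood-based model estimates the observational conditional

P(y \mid x) = \sum_{z} P(y \mid x, z)\, P(z \mid x),

which lets a confounder z (for example, a dataset bias toward frequently co-occurring objects) inflate the association between tokens x and y. Backdoor adjustment instead estimates the interventional distribution

P(y \mid \mathrm{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z),

where stratifying over z with its marginal prior P(z) blocks the confounding path, so the measured association between x and y reflects a causal effect rather than co-occurrence bias.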
Keywords
DeVLBert, causality, out-of-domain visio-linguistic pretraining, pretraining data distribution, likelihood-based method, spurious correlation, generalization ability, out-of-domain downstream tasks, conditional probability, dataset biases, causal, intervention-based learning, neural-network-based architectures, Bert-style out-of-domain pretraining, out-of-distribution visio-linguistic pretraining, deconfounded visio-linguistic Bert framework, image retrieval, zero-shot IR, visual question answering