Texture BERT for Cross-modal Texture Image Retrieval

Conference on Information and Knowledge Management(2022)

引用 2|浏览5
暂无评分
摘要
ABSTRACTWe propose Texture BERT, a model describing visual attributes of texture using natural language. To capture the rich details in texture images, we propose a group-wise compact bilinear pooling method, which represents the texture image by a set of visual patterns. The similarity between the texture image and the corresponding language description is determined by the cross-matching between the set of visual patterns from the texture image and the set of word features from the language description. We also exploit the self-attention transformer layers to provide the cross-modal context and enhance the effectiveness of matching. Our efforts achieve state-of-the-art accuracy on both text retrieval and image retrieval tasks, demonstrating the effectiveness of the proposed Texture BERT model in describing texture through natural language.
更多
查看译文
关键词
texture,image,cross-modal
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要