Texture BERT for Cross-modal Texture Image Retrieval

Conference on Information and Knowledge Management (2022)

Abstract

We propose Texture BERT, a model describing visual attributes of texture using natural language. To capture the rich details in texture images, we propose a group-wise compact bilinear pooling method, which represents the texture image by a set of visual patterns. The similarity between the texture image and the corresponding language description is determined by the cross-matching between the set of visual patterns from the texture image and the set of word features from the language description. We also exploit the self-attention transformer layers to provide the cross-modal context and enhance the effectiveness of matching. Our efforts achieve state-of-the-art accuracy on both text retrieval and image retrieval tasks, demonstrating the effectiveness of the proposed Texture BERT model in describing texture through natural language.
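
The abstract does not specify how the cross-matching score is computed, so the following is only an illustrative sketch of one common set-to-set scheme, not the authors' method. It assumes each word feature is matched to its most similar visual pattern by cosine similarity and that the per-word maxima are averaged. The function names `cosine` and `cross_match_score` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cross_match_score(visual_patterns, word_features):
    """Hypothetical set-to-set score (an assumption, not taken from the paper):
    match every word feature to its best visual pattern, then average."""
    if not visual_patterns or not word_features:
        raise ValueError("both sets must be non-empty")
    best_per_word = [
        max(cosine(word, pattern) for pattern in visual_patterns)
        for word in word_features
    ]
    return sum(best_per_word) / len(best_per_word)


# Toy 2-D features: two visual patterns and two candidate descriptions.
patterns = [[1.0, 0.0], [0.0, 1.0]]
matching_words = [[2.0, 0.0], [0.0, 3.0]]   # each word aligns with one pattern
unrelated_words = [[-1.0, -1.0]]            # points away from both patterns

matching_score = cross_match_score(patterns, matching_words)
unrelated_score = cross_match_score(patterns, unrelated_words)
```

In this toy setup the description whose word features align with the visual patterns receives the higher score, which is the ordering a retrieval system would rank by. A real system would use learned features, and the paper additionally applies self-attention transformer layers for cross-modal context, which this sketch omits.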

Keywords

texture, image, cross-modal
