A Multi-Head Self-Relation Network For Scene Text Recognition

2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)(2020)

引用 0|浏览7
暂无评分
摘要
The text embedded in scene images can be seen everywhere in our lives. However, recognizing text from natural scene images is still a challenge because of its diverse shapes and distorted patterns. Recently, advanced recognition networks generally treat scene text recognition as a sequence prediction task. Although achieving excellent performance, these recognition networks consider the feature map cells as independent individuals and update cells state without utilizing the information of their related cells. And the local receptive field of traditional convolutional neural network (CNN) makes a single cell that cannot cover the whole text region in an image. Due to these issues, the existing recognition networks cannot extract the global context information in a visual scene. To deal with the above problems, we propose a Multi-head Self-relation Network(MSRN) for scene text recognition in this paper. The MSRN consists of several multihead self-relation layers, which are designed for extracting the global context information of a visual scene. Then the information of the related cells can be fused by multi-head self-relation layer. Furthermore, experiments over several public datasets demonstrate that our proposed recognition network achieves superior performance on several benchmark datasets including IC03, IC13, IC15, SVT-Perspective.
更多
查看译文
关键词
scene text recognition,feature map cells,convolutional neural network,global context information,visual scene,multihead self-relation network,distorted patterns,local receptive field,CNN,MSRN
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要