Chinese Character Recognition with Radical-Structured Stroke Trees.

Machine Learning (2023)

Abstract
The rise of deep learning has driven rapid progress in Chinese character recognition. However, it remains a great challenge that test characters may follow a different distribution from that of the training dataset. Existing methods based on a single-level representation (character-level, radical-level, or stroke-level) may be either too sensitive to distribution changes (e.g., those induced by blurring, occlusion, and zero-shot problems) or too tolerant of one-to-many ambiguities. In this paper, we represent each Chinese character as a stroke tree organized according to its radical structure, so as to exploit the merits of both the radical and stroke levels. We propose a two-stage decomposition framework, in which a Feature-to-Radical Decoder decomposes each character into a radical sequence and a Radical-to-Stroke Decoder further decomposes each radical into its corresponding stroke sequence. The generated radical and stroke sequences are encoded as a radical-structured stroke tree (RSST), which is fed into a Tree-to-Character Translator that uses the proposed Weighted Edit Distance to match the closest candidate character in the RSST lexicon. We have conducted extensive experiments on various datasets, including handwritten, printed artistic, and scene character datasets. The experimental results demonstrate that the proposed method outperforms state-of-the-art single-level methods, by margins that increase as the distribution difference becomes more severe in the blurring, occlusion, and zero-shot scenarios. For example, compared with the previous SOTA method, our method improves performance by 1.74–7.58% in the handwritten character zero-shot settings.
Keywords
Chinese character recognition, Radical-structured stroke trees, Chinese character decomposition, Weighted edit distance
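
The final matching step described in the abstract, where a predicted RSST is compared against every lexicon entry and the closest candidate is returned, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tree is simplified to a list of radicals, each a list of stroke labels; the weights (`sub_cost`, `indel_cost`, `boundary_cost`) and the boundary marker `"|"` are hypothetical; and the function names are invented. The paper's actual Weighted Edit Distance may be defined differently.

```python
from typing import Dict, List

# Assumed simplification: an RSST is a list of radicals,
# and each radical is a list of stroke labels.
Tree = List[List[str]]

BOUNDARY = "|"  # hypothetical marker separating radicals in the flat sequence


def flatten(tree: Tree) -> List[str]:
    """Flatten a tree into one stroke sequence with radical boundary markers."""
    out: List[str] = []
    for i, radical in enumerate(tree):
        if i > 0:
            out.append(BOUNDARY)
        out.extend(radical)
    return out


def weighted_edit_distance(a: List[str], b: List[str],
                           sub_cost: float = 1.0,
                           indel_cost: float = 1.0,
                           boundary_cost: float = 2.0) -> float:
    """Levenshtein distance where edits touching a radical boundary cost more.

    The weighting scheme is an illustrative assumption, not the paper's.
    """
    def weight(token: str, base: float) -> float:
        return boundary_cost if token == BOUNDARY else base

    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [0.0]
    for y in b:
        prev.append(prev[-1] + weight(y, indel_cost))

    for x in a:
        cur = [prev[0] + weight(x, indel_cost)]
        for j, y in enumerate(b, start=1):
            if x == y:
                substitute = 0.0
            else:
                substitute = max(weight(x, sub_cost), weight(y, sub_cost))
            cur.append(min(prev[j] + weight(x, indel_cost),      # delete x
                           cur[j - 1] + weight(y, indel_cost),   # insert y
                           prev[j - 1] + substitute))            # substitute
        prev = cur
    return prev[-1]


def closest_character(pred: Tree, lexicon: Dict[str, Tree]) -> str:
    """Return the lexicon character whose tree is nearest to the prediction."""
    target = flatten(pred)
    return min(lexicon,
               key=lambda ch: weighted_edit_distance(target, flatten(lexicon[ch])))
```

With a toy lexicon (the stroke labels below are illustrative, not an authoritative decomposition), a prediction that drops one stroke from the second radical can still be mapped back to the intended two-radical character, because the missing stroke costs less than removing an entire radical and its boundary.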