A Two-Stage Method for Text Line Detection in Historical Documents

Tobias Grüning, Gundram Leifert, Tobias Strauß, Johannes Michael, Roger Labahn

International Journal on Document Analysis and Recognition (2019)

Abstract

This work presents a two-stage text line detection method for historical documents. Each detected text line is represented by its baseline. In the first stage, a deep neural network called ARU-Net assigns each pixel to one of three classes: baseline, separator, or other. The separator class marks the beginning and end of each text line. The ARU-Net can be trained from scratch with a manageably small number of manually annotated example images (fewer than 50), which is made possible by data augmentation strategies. The network predictions serve as input to the second stage, which performs a bottom-up clustering to build baselines. The method handles complex layouts as well as curved and arbitrarily oriented text lines, and it substantially outperforms current state-of-the-art approaches. For example, on the complex track of the cBAD: ICDAR2017 Competition on Baseline Detection, the F value increases from 0.859 to 0.922. The framework to train and run the ARU-Net is open source.
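
The two-stage idea can be illustrated with a small toy sketch. It is not the authors' implementation: the per-pixel class probabilities are hard-coded instead of coming from the ARU-Net, and the second stage is replaced by a plain connected-component grouping as a stand-in for the bottom-up clustering described in the abstract. All function and variable names (`label_pixels`, `cluster_baselines`, `one_hot`) are hypothetical.

```python
from collections import deque

# Class indices are an assumption made for this sketch only.
BASELINE, SEPARATOR, OTHER = 0, 1, 2


def label_pixels(probs):
    """Stage 1 post-processing: pick the most probable class per pixel.

    probs[y][x] is a tuple (p_baseline, p_separator, p_other).
    """
    return [[max(range(3), key=lambda c: px[c]) for px in row] for row in probs]


def cluster_baselines(labels):
    """Stage 2 stand-in: group 8-connected baseline pixels into lines.

    Separator and other pixels are never visited, so a separator
    between two runs of baseline pixels splits them into two lines.
    """
    height = len(labels)
    width = len(labels[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    lines = []
    for y in range(height):
        for x in range(width):
            if labels[y][x] != BASELINE or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            component = []
            while queue:
                cy, cx = queue.popleft()
                component.append((cx, cy))  # store as (x, y) points
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx]
                                and labels[ny][nx] == BASELINE):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            lines.append(sorted(component))
    return lines


def one_hot(cls):
    """Build a fake, fully confident probability triple for one pixel."""
    return tuple(1.0 if c == cls else 0.0 for c in range(3))


# Toy 3x7 "prediction": one row of baseline pixels cut by a separator.
middle = [BASELINE] * 3 + [SEPARATOR] + [BASELINE] * 3
toy_probs = [
    [one_hot(OTHER)] * 7,
    [one_hot(c) for c in middle],
    [one_hot(OTHER)] * 7,
]

toy_lines = cluster_baselines(label_pixels(toy_probs))
```

In this toy input the separator pixel splits the single row of baseline pixels into two groups, which mirrors the role the abstract gives the separator class: marking where one text line ends and the next begins.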

Keywords

Baseline detection, Text line detection, Layout analysis, Historical documents, U-Net, Pixel labeling
