Adoptive Thresholding and Geometric Features based Physical Layout Analysis of Scanned Arabic Books

Maitham A. Al-Dobais, Fahad Abdulrahman G. Alrasheed, Ghazanfar Latif, Loay Alzubaidi

2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR) (2018)

Abstract
In the digital age, developing an automated system to convert old printed books into digital form is a challenging task. In this paper, we propose a novel technique for recognizing scanned Arabic documents with both normal and complex layouts. The proposed algorithm is based on local adaptive thresholding and geometric features, which, to the best of the authors' knowledge, have not previously been applied to Arabic document image recognition based on Physical Layout Analysis (PLA). The proposed method was applied to a dataset of 90 images collected from 700 books from various publishers, containing a total of 1112 zones of three types: text, image, and graphic. The algorithm achieved promising results, with an overall average recognition rate of 86.71% for text and image block regions across all three sets. The proposed algorithm outperforms the techniques reported in previous literature.
Keywords
Physical Layout Analysis, Adoptive Thresholding, Scanned Arabic Books, Geometric Features, Segmentation, Classification