Digitization Of Romanian Printed Texts Of The 17th Century

PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE 'LINGUISTIC RESOURCES AND TOOLS FOR PROCESSING THE ROMANIAN LANGUAGE' (2016)

Abstract
Digitization of the historical and literary heritage is a priority area of the Digital Agenda for Europe, and the EU supports it through numerous European projects. In Moldova and Romania, old books were published mainly in the old Romanian Cyrillic script. This script took its definitive form in the 17th century; then, from 1830 until the official introduction of the Latin alphabet for Romanian in 1862, several transitional scripts mixing Cyrillic and Latin letters were used. In the 20th century, a different Cyrillic script was used in Bessarabia. We have previously described a technology for digitizing printed Romanian Cyrillic texts of the 18th-20th centuries. This work addresses the 17th century, the period when the first books in the Romanian language were printed. The typography of that period followed the conventions of Romanian and Slavonic Cyrillic manuscripts, such as placing some letters above the line and using letters as numerals. As a result, OCR of these texts with ABBYY FineReader encounters difficulties, whose nature and resolution are discussed.
Keywords
historical and cultural heritage, OCR, Romanian Cyrillic script of the 17th century