Annotation-Free Keyword Spotting in Historical Vietnamese Manuscripts Using Graph Matching.

S+SSPR(2022)

Abstract
Finding key terms in scanned historical manuscripts is invaluable for accessing our written cultural heritage. While keyword spotting (KWS) approaches based on machine learning achieve the best spotting results in the current state of the art, they are limited by the fact that annotated learning samples are needed to infer the writing style of a particular manuscript collection. In this paper, we propose an annotation-free KWS method that does not require any labeled handwriting sample but learns from a printed font instead. First, we train a deep convolutional character detection system on synthetic pages using printed characters. Afterwards, the structure of the detected characters is modeled by means of graphs and is compared with search terms using graph matching. We evaluate our method for spotting logographic Chu Nom characters on the newly introduced Kieu database, which contains 719 scanned pages of historical Vietnamese manuscripts of the famous Tale of Kieu. Our results show that search terms can be found with promising precision both when providing handwritten samples (query by example) and when providing printed characters (query by string).
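The graph-based comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the graphs are built or which matching algorithm is used, so everything below is an assumption made for illustration: each character is reduced to a list of 2D keypoint nodes (edges are ignored), and the dissimilarity is a simplified Hausdorff-style node matching. The names `hausdorff_style_distance`, `rank_candidates`, and `delete_cost` are hypothetical and are not taken from the paper.

```python
import math


def node_cost(p, q):
    """Substitution cost between two keypoint nodes: Euclidean distance."""
    return math.dist(p, q)


def hausdorff_style_distance(g1, g2, delete_cost=1.0):
    """Simplified, assumed graph dissimilarity on node sets only.

    Each node is either matched to its nearest node in the other graph
    (at half the substitution cost, since the match is counted from both
    sides) or deleted at `delete_cost`, whichever is cheaper.
    """
    def one_sided(a, b):
        total = 0.0
        for p in a:
            best = delete_cost
            for q in b:
                best = min(best, node_cost(p, q) / 2.0)
            total += best
        return total

    return one_sided(g1, g2) + one_sided(g2, g1)


def rank_candidates(query, candidates):
    """Return candidate labels sorted by increasing distance to the query."""
    scored = [
        (hausdorff_style_distance(query, nodes), label)
        for label, nodes in candidates.items()
    ]
    return [label for _, label in sorted(scored)]


if __name__ == "__main__":
    # Toy data: a query "character" and two detected candidates.
    query = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    candidates = {
        "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
        "b": [(5.0, 5.0)],
    }
    print(rank_candidates(query, candidates))
```

In this sketch the query can come from either a handwritten sample or a rendered printed character, which mirrors the query-by-example and query-by-string settings. A real system would also need to use edge structure and normalize for character size.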
Keywords
historical vietnamese manuscripts,graph matching,annotation-free