Data-Driven Oracle Bone Rejoining: A Dataset and Practical Self-Supervised Learning Scheme

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)

Abstract
Oracle Bone Inscriptions (OBI) is one of the oldest scripts in the world. The rejoining of Oracle Bone (OB) fragments is of vital importance to research on ancient scripts and history. Although significant progress has been achieved in the past decades, the rejoining work still relies heavily on domain knowledge and manual effort, and thus remains an inefficient and time-consuming process. Therefore, an automatic and practical algorithm/system for OB rejoining is of great value to the OBI community. To this end, we collect a real-world dataset for rejoining Oracle Bone fragments, namely OB-Rejoin, which consists of 998 OB rubbing images that suffer from low image quality, due to intrinsic underground erosion over time and extrinsic imaging conditions in the past. Moreover, a practical Self-Supervised Splicing Network, S3-Net, is proposed to rejoin the OB fragments based on the shape similarity of their borderlines. Specifically, we first transform the manually annotated borderline strokes of OB images into time-series-style shape representations, which are fed as input to a Generative Adversarial Network to augment positive pairs of rejoinable OBs for each OB fragment that does not have rejoinable counterparts. A Siamese network is then trained on the augmented data in a contrastive learning manner to retrieve the matching OB fragments of an unseen query from an OB fragment gallery. Experiments on the OB-Rejoin benchmark show that our data-driven approach outperforms two recent methods for time-series analysis. To demonstrate its practical potential, we deploy the proposed S3-Net method in real tests and ultimately discover dozens of new rejoinings missed by domain experts for decades.
Keywords
oracle bone rejoining, dataset, learning, data-driven, self-supervised
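
The abstract describes turning borderline strokes into time-series-style shape representations and retrieving matching fragments from a gallery. The sketch below illustrates only that general idea and is not the authors' S3-Net: it assumes a borderline is given as an (N, 2) array of points, uses a hand-crafted turning-angle sequence in place of the learned Siamese embedding, and ranks gallery curves by cosine similarity. All function names (`resample_polyline`, `turning_angle_series`, `rank_gallery`) and the toy curves are hypothetical.

```python
import numpy as np

def resample_polyline(points, n_samples=64):
    """Resample an (N, 2) polyline to n_samples points evenly spaced by arc length."""
    points = np.asarray(points, dtype=float)
    seg_len = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg_len)])
    targets = np.linspace(0.0, arc[-1], n_samples)
    x = np.interp(targets, arc, points[:, 0])
    y = np.interp(targets, arc, points[:, 1])
    return np.stack([x, y], axis=1)

def turning_angle_series(points, n_samples=64):
    """1-D sequence of turning angles; unchanged by translating or rotating the curve."""
    p = resample_polyline(points, n_samples)
    d = np.diff(p, axis=0)
    heading = np.unwrap(np.arctan2(d[:, 1], d[:, 0]))
    return np.diff(heading)

def rank_gallery(query, gallery, n_samples=64):
    """Rank gallery curves by cosine similarity of their turning-angle series to the query."""
    q = turning_angle_series(query, n_samples)
    scores = []
    for curve in gallery:
        s = turning_angle_series(curve, n_samples)
        denom = np.linalg.norm(q) * np.linalg.norm(s) + 1e-12  # guard against flat curves
        scores.append(float(q @ s / denom))
    order = [int(i) for i in np.argsort(scores)[::-1]]
    return order, scores

# Toy gallery (assumed data): three synthetic "borderlines".
t = np.linspace(0.0, 2.0 * np.pi, 200)
sine = np.stack([t, np.sin(t)], axis=1)
parabola = np.stack([t, 0.1 * (t - np.pi) ** 2], axis=1)
zigzag = np.stack([t, np.abs((t % 2.0) - 1.0)], axis=1)

# The query is the sine curve after a rotation and a shift.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
query = sine @ rot.T + np.array([5.0, -3.0])

order, scores = rank_gallery(query, [parabola, sine, zigzag])
print("ranking (best first):", order)
```

In this toy setup the rotated and shifted query should rank its own source curve first, because turning angles do not depend on position or orientation. A learned embedding, as in the abstract, would replace `turning_angle_series` with a trained network; how the authors encode borderlines is not specified here.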