Predicting incorrect mappings: a data-driven approach applied to DBpedia.
SAC 2018: Symposium on Applied Computing, Pau, France, April 2018
Abstract
DBpedia releases consist of more than 70 multilingual datasets that cover data extracted from different language-specific Wikipedia instances. The data extracted from those Wikipedia instances are transformed into RDF using mappings created by the DBpedia community. However, not all mappings are correct or consistent across the language-specific DBpedia datasets. Because the incorrect mappings are scattered among a large number of mappings, inspecting all of them manually to ensure their correctness is not feasible. The goal of this work is therefore to propose a data-driven method that detects incorrect mappings automatically by analyzing information from both instance data and ontological axioms. We propose a machine-learning-based approach to building a predictive model that can detect incorrect mappings. We evaluated different supervised classification algorithms for this task, and our best model achieves 93% accuracy. These results help us detect incorrect mappings and work toward a high-quality DBpedia.
Keywords
Linked Data, Data Quality, Mappings, DBpedia, Machine Learning