
Data relative currency repair and anomaly detection based on rules

Xuliang Duan, Zeyan Xiao, Yuhai Liu, Zhiyao Li, Qingsong Zhu, Songsong Lang

Expert Systems with Applications (2024)

Data currency is a temporal reference of the data: it reflects the degree to which the data is current with the world it models. Currency has a significant impact on the quality and value of the data. Once the timestamp of the data is lost or tampered with, an absolute and precise repair is difficult to perform. Building on data currency research, the basic currency rules were extended to support parallel rule extraction and incremental updating, which theoretically reduces the algorithm's time complexity from O(n - 1) to O(log n). In practical experiments, multithreading improved repair efficiency by up to 75.2% compared with single-threaded operation. To address the problems and requirements encountered in data cleaning, rule-based methods for relative currency repair and anomalous data detection were proposed, a relative currency repair algorithm was implemented, models for evaluating repair results were established, and the application of the repair algorithm to detecting data with abnormal currency was discussed. The experimental results and analysis show that the extended currency rules, which provide more valuable features, are feasible and practical, and that the relative currency repair algorithm can effectively perform data currency repair and anomaly detection.
Data currency, Currency rule, Currency repair, Parallel algorithm, Data cleaning, Data quality, Data mining, Anomaly detection