Distributed Anomalies Detection Using Isolation Forest and Spark.

International Conference on Computational Collective Intelligence (ICCCI) (2022)

Abstract
Anomaly detection is a major concern in applications such as industrial failure detection, cybersecurity, and transport. Different research communities have explored several approaches to it, including statistical methods, machine learning, and sketch-based techniques, in an increasingly challenging context. Because data is now generated in huge volumes and at ever higher speed, the response time of an algorithm and its ability to be distributed have become decisive criteria, alongside its accuracy in detecting anomalies. In this paper we focus on Isolation Forest, an unsupervised anomaly detection algorithm based on binary trees. It combines excellent accuracy with a very low execution time thanks to its linear complexity. In particular, we study the architecture of two solutions that distribute Isolation Forest on the Apache Spark framework. We then compare the performance of these two solutions on four commonly used real-world datasets.
Keywords
Anomalies detection, Isolation Forest, Distribution, Apache Spark
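
To make the binary-tree idea behind Isolation Forest concrete, the following is a minimal single-machine sketch in plain Python. It is not the paper's code and does not reproduce either of the Spark-based solutions the abstract refers to. It follows the commonly described form of the algorithm: each tree splits a random feature at a random threshold, and points that are isolated after few splits receive scores close to 1. All function names, the defaults of 100 trees and a subsample of 256 points, and the toy data are assumptions made for illustration.

```python
import math
import random


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split the points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # Simplification: stop if the chosen feature is constant in this node.
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, x, depth=0):
    """Depth at which x reaches a leaf, plus a correction for the leaf's size."""
    if "split" not in node:
        return depth + c(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(child, x, depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees trees, each on a random subsample (assumes at least 2 points)."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi


def anomaly_score(forest, x):
    """Score in (0, 1]; values near 1 mean x is isolated after few splits."""
    trees, psi = forest
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c(psi))


# Toy data (hypothetical): a Gaussian cluster around the origin plus one far point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((10.0, 10.0))

forest = fit_forest(data)
score_outlier = anomaly_score(forest, (10.0, 10.0))
score_inlier = anomaly_score(forest, (0.0, 0.0))
print(score_outlier, score_inlier)
```

Because every tree is built independently from its own subsample, tree construction is a natural unit of work to spread across workers; how the two solutions studied in the paper actually organize this on Spark is not stated in the abstract and is not shown here.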