Frequent Itemset Mining On Correlated Probabilistic Databases
Database and Expert Systems Applications (DEXA 2018), Part II (2018)
Abstract
The problem of mining frequent itemsets from uncertain data (uFIM) has attracted attention in recent years. Most work in this field assumes stochastic independence, an assumption that is clearly unjustified in many real-world applications of uFIM. To address this problem, we introduce a new general model for expressing dependencies in frequent itemset mining. We show that mining itemsets in the general model is NP-complete, but we give an efficient dynamic-programming algorithm for mining itemsets in a simplified version of this model. Our experimental results show that assuming independence in correlated data sets leads to substantially incorrect results.
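To make the independence assumption concrete, the following is a minimal sketch of the standard dynamic program for the probability that an itemset reaches a minimum support when every transaction contains it independently. It is not the algorithm proposed in the paper, whose dependency model is not described in the abstract. The function name `frequentness_probability`, the example probabilities, and the perfectly correlated toy case are all illustrative assumptions.

```python
def frequentness_probability(probs, minsup):
    """P(support >= minsup), ASSUMING each transaction contains the
    itemset independently with the given probability."""
    # dist[k] = probability that exactly k of the transactions
    # processed so far contain the itemset
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)  # itemset absent from this transaction
            nxt[k + 1] += q * p      # itemset present in this transaction
        dist = nxt
    return sum(dist[minsup:])


# Hypothetical data: four transactions, each containing the itemset
# with probability 0.5.
probs = [0.5, 0.5, 0.5, 0.5]
independent = frequentness_probability(probs, minsup=3)

# Toy correlated case (an assumption for illustration): all four
# transactions always agree, so the support is 4 with probability 0.5
# and 0 otherwise, which gives P(support >= 3) = 0.5.
perfectly_correlated = 0.5

print(independent, perfectly_correlated)
```

With identical per-transaction probabilities, the two cases give different frequentness probabilities for the same threshold. This small example shows how an independence assumption can misjudge whether an itemset is frequent when the data are correlated.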