Revisiting bound estimation of pattern measures: A generic framework.

Information Sciences (2016)

Abstract
It is widely recognized that constrained pattern mining helps capture a relatively large amount of semantics across different applications and thus increases the effectiveness of mining. One major challenge in this field is how to push the properties of pattern measures deep into the mining process to improve efficiency. The usual solution is to estimate the bound of a given pattern measure, PM, over all supersets of an itemset, X. However, most previous studies estimated the bounds for their proposed pattern measures individually, and a generic, unified framework applicable to any pattern measure has not been proposed. To this end, we revisit the problem of bound estimation and propose a general framework for it by summarizing the commonality among the estimation methods for different pattern measures. The basic idea is to maximize (or minimize) the measure by assigning arbitrary item labels to the items in the original supporting transactions. To balance bound tightness against computational efficiency, we also propose techniques that address this tradeoff and improve overall performance. As a case study, we apply this framework to two typical pattern measures: utility and occupancy. Additionally, we describe the application of our proposed techniques to other measures. The results of our extensive experimental evaluation on real and large synthetic datasets demonstrate the effectiveness of our proposed techniques.
Keywords
Bound estimation, Utility, Occupancy, Constrained pattern mining
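
To make the notion of "bounding a pattern measure over all supersets of an itemset X" concrete, the sketch below uses the classic transaction-weighted utility (TWU) bound from high-utility itemset mining. This is not the framework proposed in the paper, only a well-known special case of the same idea for the utility measure. The toy database and all function names (`utility`, `twu_upper_bound`, `can_prune`, `bound_holds`) are hypothetical and chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to the
# utility that item contributes in that transaction.
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
    {"a": 5, "b": 2, "d": 8},
]


def utility(itemset, transactions):
    """Exact utility: over every transaction containing the itemset,
    sum the utilities of the itemset's own items."""
    items = set(itemset)
    return sum(
        sum(t[i] for i in items)
        for t in transactions
        if items.issubset(t)
    )


def twu_upper_bound(itemset, transactions):
    """TWU: over every transaction containing the itemset, sum the
    utilities of ALL items in the transaction. Any superset of the
    itemset is supported by a subset of these transactions and uses
    only items inside them, so its utility cannot exceed this value
    (assuming non-negative utilities)."""
    items = set(itemset)
    return sum(sum(t.values()) for t in transactions if items.issubset(t))


def can_prune(itemset, transactions, min_utility):
    """If even the upper bound misses the threshold, no superset of
    the itemset can qualify, so the whole branch can be skipped."""
    return twu_upper_bound(itemset, transactions) < min_utility


def bound_holds(itemset, transactions):
    """Brute-force check that the bound dominates every superset."""
    base = set(itemset)
    universe = set().union(*transactions)
    others = sorted(universe - base)
    bound = twu_upper_bound(base, transactions)
    return all(
        utility(base | set(extra), transactions) <= bound
        for k in range(len(others) + 1)
        for extra in combinations(others, k)
    )


if __name__ == "__main__":
    print(utility({"a"}, TRANSACTIONS))
    print(twu_upper_bound({"a"}, TRANSACTIONS))
    print(can_prune({"a"}, TRANSACTIONS, min_utility=40))
    print(bound_holds({"a"}, TRANSACTIONS))
```

The bound here is loose but cheap to compute, which is one instance of the tightness-versus-efficiency tradeoff the abstract mentions: a tighter bound prunes more of the search space but typically costs more per itemset.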