
Learning Noise in Web Data Prior to Elimination

Springer

Abstract
This work examines how noise in web data is currently addressed. We establish that current approaches eliminate noise mainly on the basis of the structure and layout of web pages, i.e., they treat as noise any data that does not form part of the main content of the web page. However, not all data that forms part of the main web page is of interest to a user, and not all data considered noise is actually noise to a given user. Current research has not fully addressed how to distinguish useful data from noise while taking into account dynamic changes in user interests. We aim to justify the claim that it is important to learn noise prior to elimination, not only to decrease noise levels but also to reduce the loss of useful information that would otherwise be eliminated as noise. If the process of eliminating noise in web data is not user-driven, the web data made available to a user will not reflect their interests at the time of the request.
Keywords
Dynamic session identification, Noise web data learning, User interest, User profile, Web log data, Web usage mining