A Fused Multi-feature Based Co-training Approach for Document Clustering

2016 3rd International Conference on Information Science and Control Engineering (ICISCE) (2016)

Abstract
Document clustering is a popular topic in data mining and information retrieval. Most models and methods for this problem compute the similarity between pairs of documents represented either in the space of all terms or in a new feature space obtained by applying a topic modeling technique to a given corpus. In this paper, we regard these two ideas as clustering on term features and on semantic features, and we assume that the two can contribute to each other in clustering. We therefore propose a co-training approach for spectral clustering that takes both features into account. Experiments on four real-world datasets show the feasibility and efficacy of the proposed approach compared with a number of baseline methods.
Keywords
multi-feature, co-training, document clustering, spectral clustering