A Comment on “A Similarity Measure for Text Classification and Clustering”

IEEE Transactions on Knowledge and Data Engineering(2015)

Abstract
A similarity measure, namely the similarity measure for text processing (SMTP), was proposed by Lin et al. [1] for knowledge discovery on text collections. The measure considers three cases when measuring the similarity between a pair of documents, based on the presence or absence of features in the two documents. The first case covers features that appear in both documents, the second covers features that appear in only one document, and the third covers features that appear in neither document. The measure is considered ideal for finding the similarity between a pair of text documents on the basis of the presence or absence of features in them; however, on examining SMTP we found that the case of measuring similarity between a pair of similar documents is not covered. The objective of this work is to highlight this gap and propose a minor change that makes SMTP a complete similarity measure for knowledge discovery, in line with other standard similarity techniques.
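The three presence/absence cases described in the abstract can be illustrated with a minimal sketch. It only partitions a vocabulary into the three cases and does not compute the SMTP score, whose formula is not given here. The function name `partition_features`, the token-list document representation, and the explicit `vocabulary` argument are assumptions made for illustration, not part of the original work.

```python
from collections import Counter


def partition_features(doc_a, doc_b, vocabulary):
    """Split a vocabulary into the three presence/absence cases.

    Hypothetical helper for illustration only: documents are assumed to be
    lists of tokens, and a feature is "present" if it occurs at least once.
    """
    counts_a, counts_b = Counter(doc_a), Counter(doc_b)
    in_both, in_one, in_neither = [], [], []
    for feature in vocabulary:
        present_a = counts_a[feature] > 0
        present_b = counts_b[feature] > 0
        if present_a and present_b:
            in_both.append(feature)      # case 1: appears in both documents
        elif present_a or present_b:
            in_one.append(feature)       # case 2: appears in only one document
        else:
            in_neither.append(feature)   # case 3: appears in neither document
    return in_both, in_one, in_neither


both, one, neither = partition_features(
    ["text", "mining"],
    ["text", "cluster"],
    ["text", "mining", "cluster", "graph"],
)
```

With these example inputs, "text" falls in the first case, "mining" and "cluster" in the second, and "graph" in the third.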
Keywords
Measurement techniques, Text processing, Knowledge discovery, Classification