Performance Analysis of K-Means Seeding Algorithms

José Ortiz-Bejar, Eric S. Tellez, Mario Graff, Jesús Ortiz-Bejar, Jaime Cerda Jacobo, Alejandro Zamora-Mendez

2019 IEEE INTERNATIONAL AUTUMN MEETING ON POWER, ELECTRONICS AND COMPUTING (ROPEC 2019) (2019)

Abstract

K-Means is one of the most widely used clustering algorithms. However, because its optimization process is based on a greedy iterated gradient descent, K-Means is sensitive to the initial set of centers. It has been proved that a bad initial set of centroids can reduce cluster quality. Therefore, numerous initialization methods have been developed to prevent poor performance of K-Means clustering. Nonetheless, these initialization methods are usually validated using the Sum of Squared Errors (SSE) as the quality measure. In this study, we evaluate three state-of-the-art initialization methods with three different quality measures: SSE, the Silhouette Coefficient, and the Adjusted Rand Index. The analysis is carried out on seventeen benchmarks. We provide new insight into the performance of initialization methods that are traditionally overlooked; our results describe the high correlation between different initialization methods and fitness functions. These results may help to optimize K-Means for other topological structures beyond those covered by optimizing SSE with low effort.
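
The kind of comparison described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code: the abstract does not name the three initialization methods, so uniform random seeding and k-means++-style seeding are used here as stand-ins, and a small synthetic data set replaces the seventeen benchmarks. All function names (`random_seeding`, `kmeanspp_seeding`, `lloyd`, `evaluate`, and so on) are hypothetical.

```python
import random
from math import comb

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def random_seeding(points, k, rng):
    # Assumption: uniform random choice of k distinct points as a baseline.
    return rng.sample(points, k)

def kmeanspp_seeding(points, k, rng):
    # k-means++-style seeding: each new center is drawn with probability
    # proportional to its squared distance to the nearest chosen center.
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def lloyd(points, centers, iters=50):
    # Plain Lloyd iterations: assign to nearest center, then recompute means.
    labels = []
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j, c in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            # An empty cluster keeps its previous center.
            new_centers.append(tuple(sum(col) / len(members) for col in zip(*members))
                               if members else c)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def sse(points, labels, centers):
    # Sum of squared distances from each point to its assigned center.
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))

def silhouette(points, labels):
    # Mean silhouette coefficient with Euclidean distance, O(n^2).
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(sq_dist(p, q) ** 0.5)
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster overall
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def adjusted_rand_index(truth, pred):
    # Pair-counting ARI computed from the contingency table.
    cells, rows, cols = {}, {}, {}
    for t, p in zip(truth, pred):
        cells[(t, p)] = cells.get((t, p), 0) + 1
        rows[t] = rows.get(t, 0) + 1
        cols[p] = cols.get(p, 0) + 1
    index = sum(comb(c, 2) for c in cells.values())
    sum_r = sum(comb(c, 2) for c in rows.values())
    sum_c = sum(comb(c, 2) for c in cols.values())
    expected = sum_r * sum_c / comb(len(truth), 2)
    max_index = (sum_r + sum_c) / 2
    if max_index == expected:
        return 1.0
    return (index - expected) / (max_index - expected)

def make_blobs(rng, per_cluster=30):
    # Synthetic stand-in data: three well-separated 2-D Gaussian blobs.
    points, truth = [], []
    for label, (mx, my) in enumerate([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]):
        for _ in range(per_cluster):
            points.append((rng.gauss(mx, 0.5), rng.gauss(my, 0.5)))
            truth.append(label)
    return points, truth

def evaluate(seeding, points, truth, k, seed):
    # Run one seeding method followed by Lloyd, then report all three measures.
    labels, centers = lloyd(points, seeding(points, k, random.Random(seed)))
    return {"sse": sse(points, labels, centers),
            "silhouette": silhouette(points, labels),
            "ari": adjusted_rand_index(truth, labels)}

points, truth = make_blobs(random.Random(0))
for name, seeding in [("random", random_seeding), ("k-means++", kmeanspp_seeding)]:
    print(name, evaluate(seeding, points, truth, k=3, seed=1))
```

Reporting all three measures for every seeding method makes it possible to check whether a method that wins on SSE also wins on the Silhouette Coefficient or the Adjusted Rand Index. A single seed on one toy data set is only illustrative; a meaningful comparison would repeat the run over many seeds and data sets.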

Keywords

optimization process, greedy iterated gradient descent, lousy performance, K-Means clustering, SSE, quality measurement, quality measures, initialization methods, performance analysis, cluster algorithms, K-Means seeding algorithms
