High-density cluster core-based k-means clustering with an unknown number of clusters

Applied Soft Computing (2024)

Abstract
The k-means algorithm is simple and adaptable, but it requires the number of clusters to be chosen manually and is sensitive to the placement of the initial centroids. This paper introduces a framework that addresses both limitations. It combines a data-driven method for estimating the number of clusters with a robust initialization strategy based on high-density cluster cores, allowing k-means to run fully unsupervised and to perform well even when clusters overlap. A novel density-based technique is used to identify the cluster cores. Experiments on synthetic and real-world datasets show an average improvement of 15% in Adjusted Rand Index on datasets with overlapping clusters, surpassing state-of-the-art density-based clustering methods and traditional k-means. Because the method determines the number of clusters automatically and removes the influence of initial centroid placement on the outcome, its results are stable and consistent, addressing key limitations of conventional k-means. The practical applicability of the approach is demonstrated on image segmentation tasks.
Keywords
k-means, Data clustering, Unsupervised learning, Image segmentation
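
The abstract reports its gains in terms of the Adjusted Rand Index (ARI), which scores agreement between a predicted clustering and reference labels while correcting for chance. The following is a minimal pure-Python sketch of the standard pair-counting ARI formula, included only to make the metric concrete. It is not taken from the paper, and the function name `adjusted_rand_index` is an illustrative choice.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two labelings.

    Illustrative helper (hypothetical name), not the paper's code.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two points: the two labelings trivially agree.
        return 1.0

    # Contingency counts: how many points share each (true, predicted) pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of point pairs grouped together in both / in each labeling.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # upper bound on agreement

    if max_index == expected:
        # Degenerate case, e.g. both labelings put every point in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Cluster IDs are arbitrary, so a pure relabeling still scores perfectly.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # A partially correct clustering scores between 0 and 1.
    print(adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1]))
```

Because ARI is invariant to how cluster IDs are numbered, it can compare runs that find different numbers of clusters, which matters when the cluster count is estimated rather than fixed in advance.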