Profiling local optima in K-means clustering: developing a diagnostic technique |
| |
Authors: | Douglas Steinley |
| |
Affiliation: | Department of Psychological Sciences, University of Missouri-Columbia, Columbia, MO 65203, USA. steinleyd@missouri.edu |
| |
Abstract: | Using the cluster generation procedure proposed by D. Steinley and R. Henson (2005), the author investigated the performance of K-means clustering under the following scenarios: (a) different probabilities of cluster overlap; (b) different types of cluster overlap; (c) varying sample sizes, clusters, and dimensions; (d) different multivariate distributions of clusters; and (e) various multidimensional data structures. The results are evaluated in terms of the Hubert-Arabie adjusted Rand index, and several observations concerning the performance of K-means clustering are made. Finally, the article concludes with the proposal of a diagnostic technique indicating when the partitioning given by a K-means cluster analysis can be trusted. By combining the information from several observable characteristics of the data (number of clusters, number of variables, sample size, etc.) with the prevalence of unique local optima in several thousand implementations of the K-means algorithm, the author provides a method capable of guiding key data-analysis decisions. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
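The abstract refers to counting the unique local optima reached over many randomly initialized runs of K-means. The sketch below illustrates that general idea only; it is not the author's diagnostic, whose exact form is not given in this record. The function names (`kmeans_once`, `profile_local_optima`), the use of Lloyd's algorithm with random-observation starts, the choice to treat runs with the same rounded sum of squared errors (SSE) as the same optimum, and the synthetic three-cluster data are all assumptions made for illustration.

```python
import numpy as np


def kmeans_once(X, k, rng, max_iter=100):
    """One run of Lloyd's algorithm from a random start; returns (labels, sse)."""
    # Assumption: initialize centers with k distinct observations chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing: a local optimum
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    sse = ((X - centers[labels]) ** 2).sum()
    return labels, float(sse)


def profile_local_optima(X, k, n_starts=200, seed=0, decimals=6):
    """Run K-means n_starts times and tally the distinct SSE values reached."""
    rng = np.random.default_rng(seed)
    sses = np.array([kmeans_once(X, k, rng)[1] for _ in range(n_starts)])
    # Simplification: runs with the same rounded SSE count as one local optimum.
    unique, counts = np.unique(np.round(sses, decimals), return_counts=True)
    return unique, counts  # sorted by SSE, so index 0 is the best solution found


# Hypothetical data: three well-separated Gaussian clusters in two dimensions.
data_rng = np.random.default_rng(1)
X = np.vstack([
    data_rng.normal(loc=c, scale=0.3, size=(50, 2))
    for c in ([0.0, 0.0], [5.0, 5.0], [0.0, 5.0])
])

unique, counts = profile_local_optima(X, k=3)
print("distinct local optima:", len(unique))
print("share of runs reaching the best SSE:", counts[0] / counts.sum())
```

Under these assumptions, a small number of distinct optima together with a large share of runs reaching the lowest SSE would suggest a stable partition, while many distinct optima would suggest caution. Any threshold for "trustworthy" would have to come from the article itself.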