Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures |
| |
Authors: | Douglas Steinley; Michael J. Brusco |
| |
Institution: | (1) Department of Psychological Sciences, University of Missouri-Columbia, 210 McAlester Hall, Columbia, MO 65211, USA; (2) Florida State University, Tallahassee, FL, USA |
| |
Abstract: | Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range
of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e., random noise)
are included in the model. Furthermore, the distribution of the random noise greatly impacts the performance of nearly all
of the variable selection procedures. Overall, a variable selection technique based on a variance-to-range weighting procedure
coupled with the largest decreases in within-cluster sums of squares error performed the best. On the other hand, variable
selection methods used in conjunction with finite mixture models performed the worst. |
| |
Keywords: | cluster analysis; variable selection |
This article is indexed in SpringerLink and other databases. |
|