Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures |
| |
Authors: | Douglas Steinley; Michael J. Brusco |
| |
Affiliation: | (1) Department of Psychological Sciences, University of Missouri-Columbia, 210 McAlester Hall, Columbia, MO 65211, USA; (2) Florida State University, Tallahassee, FL, USA |
| |
Abstract: | Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e., random noise) are included in the model. Furthermore, the distribution of the random noise greatly impacts the performance of nearly all of the variable selection procedures. Overall, a variable selection technique based on a variance-to-range weighting procedure coupled with the largest decreases in within-cluster sums of squares error performed the best. On the other hand, variable selection methods used in conjunction with finite mixture models performed the worst. |
| |
Keywords: | cluster analysis; variable selection |
This article is indexed in SpringerLink and other databases. |
|