Imputation of missing categorical data by maximizing internal consistency
Authors:Stef van Buuren, Jan L. A. van Rijckevorsel
Affiliation:(1) TNO Institute of Preventive Health Care, PO Box 124, 2300 AC Leiden, The Netherlands
Abstract:This paper suggests a method to supplant missing categorical data by “reasonable” replacements. These replacements will maximize the consistency of the completed data as measured by Guttman's squared correlation ratio. The text outlines a solution of the optimization problem, describes relationships with the relevant psychometric theory, and studies some properties of the method in detail. The main result is that the average correlation should be at least 0.50 before the method becomes practical. At that point, the technique gives reasonable results up to 10–15% missing data.
Acknowledgments:We thank Anneke Bloemhoff of NIPG-TNO for compiling and making the Dutch Life Style Survey data available to use, and Chantal Houée and Thérèse Bardaine, IUT, Vannes, France, exchange students under the COMETT program of the EC, for computational assistance. We also thank Donald Rubin, the Editors and several anonymous reviewers for constructive suggestions.
Keywords:missing data, correlation ratio, optimal scaling
This article is indexed in SpringerLink and other databases.