Similar Documents
20 similar documents found.
1.
In Ordinary Least Squares regression, researchers often want to know whether a set of parameters differs from zero. With complete data, this can be tested with the gain in prediction test, hierarchical multiple regression, or an omnibus F test. However, in substantive research scenarios, missing data often exist. In the context of multiple imputation, one of the current state-of-the-art missing data strategies, there are several analogous multi-parameter tests of the joint significance of a set of parameters, and these multi-parameter test statistics can be referenced to various distributions to make statistical inferences. However, little is known about the performance of these tests, and virtually no research has compared their Type I error rates and statistical power in scenarios typical of behavioral science data (e.g., small to moderate samples). This paper uses Monte Carlo simulation to examine the performance of these multi-parameter test statistics for multiple imputation under a variety of realistic conditions. We provide a number of practical recommendations for substantive researchers based on the simulation results, and illustrate the calculation of these test statistics with an empirical example.
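One widely studied member of this family of pooled tests is the D1 statistic of Li, Raghunathan, and Rubin (1991). A minimal NumPy sketch, assuming the m per-imputation estimates and their within-imputation covariance matrices have already been computed (all array names are illustrative):

```python
# Sketch of the D1 pooled Wald test of H0: Q = q0 for k parameters
# across m imputations. `estimates` is an (m, k) array of per-imputation
# estimates; `covs` is an (m, k, k) array of their covariance matrices.
import numpy as np
from scipy import stats

def d1_test(estimates, covs, q0=None):
    m, k = estimates.shape
    q0 = np.zeros(k) if q0 is None else q0
    q_bar = estimates.mean(axis=0)            # pooled point estimate
    u_bar = covs.mean(axis=0)                 # average within-imputation covariance
    diffs = estimates - q_bar
    b = diffs.T @ diffs / (m - 1)             # between-imputation covariance
    # relative increase in variance due to nonresponse
    r = (1 + 1 / m) * np.trace(b @ np.linalg.inv(u_bar)) / k
    d1 = (q_bar - q0) @ np.linalg.solve(u_bar, q_bar - q0) / (k * (1 + r))
    # reference F distribution with small-sample df (Li et al., 1991)
    t = k * (m - 1)
    if t > 4:
        df2 = 4 + (t - 4) * (1 + (1 - 2 / t) / r) ** 2
    else:
        df2 = t * (1 + 1 / k) * (1 + 1 / r) ** 2 / 2
    return d1, stats.f.sf(d1, k, df2)
```

The companion D2 and D3 statistics referenced in this literature instead pool the per-imputation Wald chi-squares and likelihood-ratio statistics, respectively.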

2.
Multiple imputation under a two-way model with error is a simple and effective method that has been used to handle missing item scores in unidimensional test and questionnaire data. Extensions of this method to multidimensional data are proposed. A simulation study is used to investigate whether these extensions produce biased estimates of important statistics in multidimensional data, and to compare them with the lower benchmark of listwise deletion, with two-way with error, and with multivariate normal imputation. The new methods produce smaller bias in several psychometrically interesting statistics than the existing methods of two-way with error and multivariate normal imputation. One of these new methods is clearly preferable for handling missing item scores in multidimensional test data.

3.
The use of item responses from questionnaire data is ubiquitous in social science research. One side effect of using such data is that researchers must often account for item-level missingness. Multiple imputation is one of the most widely used missing data handling techniques. The traditional multiple imputation approach in structural equation modeling has a number of limitations. Motivated by Lee and Cai's approach, we propose an alternative method for conducting statistical inference from multiple imputation in categorical structural equation modeling. We examine the performance of our proposed method via a simulation study and illustrate it with an empirical data set.

4.
Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m > 1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997a) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from the Adolescent Alcohol Prevention Trial (Hansen & Graham, 1991).
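The "simple rules" mentioned here are Rubin's (1987) combining rules. A minimal sketch for a single scalar parameter, assuming the m point estimates and their squared standard errors are already in hand:

```python
# Sketch of Rubin's (1987) rules for pooling a scalar estimate across
# m imputations. `q` holds the m point estimates; `u` holds the m
# within-imputation variances (squared standard errors).
import numpy as np
from scipy import stats

def pool_scalar(q, u):
    q, u = np.asarray(q, float), np.asarray(u, float)
    m = len(q)
    q_bar = q.mean()                          # pooled point estimate
    u_bar = u.mean()                          # within-imputation variance
    b = q.var(ddof=1)                         # between-imputation variance
    t_var = u_bar + (1 + 1 / m) * b           # total variance
    df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2   # Rubin's df
    se = np.sqrt(t_var)
    p = 2 * stats.t.sf(abs(q_bar) / se, df)   # two-sided p-value vs. zero
    return q_bar, se, p
```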

5.
In the diagnostic evaluation of educational systems, self-reports are commonly used to collect data, both cognitive and orectic. For various reasons, some of the students' data are frequently missing from these self-reports. The main goal of this research is to compare the performance of different imputation methods for missing data in the context of the evaluation of educational systems. On an empirical database of 5,000 subjects, 72 conditions were simulated: three levels of missing data, three types of loss mechanisms, and eight methods of imputation. The levels of missing data were 5%, 10%, and 20%. The loss mechanisms were missing completely at random, moderately conditioned, and strongly conditioned. The eight imputation methods were: listwise deletion, replacement by the scale mean, by the item mean, by the subject mean, by the corrected subject mean, multiple regression, and the Expectation-Maximization (EM) algorithm with and without auxiliary variables. The results indicate that data recovery is more accurate when an appropriate combination of different methods is used. When a case is partially incomplete, the subject mean works very well, whereas for completely missing records, multiple imputation with the EM algorithm is recommended. This combination is especially recommended when the proportion of missing data is larger and the loss mechanism is more strongly conditioned. Lastly, the results are discussed and some future lines of research are outlined.
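A minimal sketch of the recommended combination, assuming item scores sit in a NumPy array with NaN marking missing entries; the step for completely missing records is stubbed with item means here, whereas the study itself recommends EM-based multiple imputation for that case:

```python
# Sketch of the combined strategy described above: impute a partially
# observed respondent with their own person (subject) mean, and fall back
# to a model-based imputation for fully missing rows (stubbed here with
# item means; the study recommends EM-based multiple imputation instead).
import numpy as np

def combined_impute(x):
    x = x.astype(float)                    # (n_subjects, n_items), NaN = missing
    item_means = np.nanmean(x, axis=0)
    for i, row in enumerate(x):
        miss = np.isnan(row)
        if miss.all():                     # completely missing record
            x[i, miss] = item_means[miss]  # placeholder for the EM/MI step
        elif miss.any():                   # partially missing record
            x[i, miss] = np.nanmean(row)   # person (subject) mean
    return x
```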

6.
Best practices for missing data management in counseling psychology
This article urges counseling psychology researchers to recognize and report how missing data are handled, because consumers of research cannot accurately interpret findings without knowing the amount and pattern of missing data or the strategies that were used to handle those data. Patterns of missing data are reviewed, and some of the common strategies for dealing with them are described. The authors provide an illustration in which data were simulated and evaluate 3 methods of handling missing data: mean substitution, multiple imputation, and full information maximum likelihood. Results suggest that mean substitution is a poor method for handling missing data, whereas both multiple imputation and full information maximum likelihood are recommended alternatives to this approach. The authors suggest that researchers fully consider and report the amount and pattern of missing data and the strategy for handling those data in counseling psychology research and that editors advise researchers of this expectation.

7.
The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmarks, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at random, or not missing at random. Cronbach's alpha, Loevinger's scalability coefficient H, and the item cluster solution from Mokken scale analysis of the complete data were compared with the corresponding results based on the data including imputed scores. The multiple imputation methods two-way with normally distributed errors, corrected item-mean substitution with normally distributed errors, and response function produced discrepancies in Cronbach's coefficient alpha, Loevinger's coefficient H, and the cluster solution from Mokken scale analysis that were smaller than the discrepancies produced by the upper benchmark, multivariate normal imputation.

8.
This article proposes a new procedure to test mediation in the presence of missing data by combining nonparametric bootstrapping with multiple imputation (MI). This procedure performs MI first and then bootstraps each imputed data set. The proposed procedure is more computationally efficient than the procedure that performs bootstrapping first and then MI for each bootstrap sample. The validity of the procedure is evaluated in a simulation study under different sample sizes, missing data mechanisms, missing data proportions, and distribution shapes. The results suggest that the proposed procedure performs comparably to the procedure that combines bootstrapping with full information maximum likelihood under most conditions. However, caution is needed when using this procedure to handle missing-not-at-random or nonnormal data.
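A minimal sketch of the MI-first ordering for a simple indirect effect a*b, assuming the m imputed data sets are already available as complete (n, 3) arrays with columns X, M, Y; pooling the bootstrap draws across imputations, as done here, is one simple variant rather than necessarily the authors' exact rule:

```python
# Sketch of the MI-first procedure: bootstrap within each already-imputed
# data set, then pool the bootstrap draws of the indirect effect a*b.
import numpy as np

def ols_coefs(y, *xs):
    # Least-squares coefficients of y on an intercept plus the given predictors.
    X = np.column_stack([np.ones(len(y)), *xs])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mi_boot_indirect(imputed, n_boot=1000, seed=0):
    rng = np.random.default_rng(seed)
    draws = []
    for data in imputed:                       # multiple imputation first ...
        n = len(data)
        for _ in range(n_boot):                # ... then bootstrap each imputed set
            s = data[rng.integers(0, n, n)]
            a = ols_coefs(s[:, 1], s[:, 0])[1]            # a path: M on X
            b = ols_coefs(s[:, 2], s[:, 0], s[:, 1])[2]   # b path: Y on X and M
            draws.append(a * b)
    return np.percentile(draws, [2.5, 97.5])   # pooled percentile interval
```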

9.
In sparse tables for categorical data, well-known goodness-of-fit statistics are not chi-square distributed. A consequence is that model selection becomes a problem. It has been suggested that a way out of this problem is the use of the parametric bootstrap. In this paper, the parametric bootstrap goodness-of-fit test is studied by means of an extensive simulation study; the Type I error rates and power of this test are studied under several conditions of sparseness. In the presence of sparseness, models were used that were likely to violate the regularity conditions. Besides bootstrapping the goodness-of-fit statistics usually used (full information statistics), corrected versions of these statistics and a limited information statistic are bootstrapped. These bootstrap tests were also compared to an asymptotic test using limited information. Results indicate that bootstrapping the usual statistics fails because these tests are too liberal, and that bootstrapping or asymptotically testing the limited information statistic works better with respect to Type I error and outperforms the other statistics by far in terms of statistical power. The properties of all tests are illustrated using categorical Markov models.
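A generic sketch of the parametric bootstrap p-value; the callables fit, simulate, and statistic are placeholders for whatever model and fit statistic are under study (e.g., a categorical Markov model), not a real library API:

```python
# Generic sketch of a parametric bootstrap goodness-of-fit test: fit the
# model, simulate from the fitted model, refit each simulated data set,
# and compare the observed statistic to its bootstrap distribution.
import numpy as np

def parametric_bootstrap_p(data, fit, simulate, statistic, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    theta_hat = fit(data)
    t_obs = statistic(data, theta_hat)
    t_boot = []
    for _ in range(n_boot):
        fake = simulate(theta_hat, len(data), rng)   # draw from the fitted model
        t_boot.append(statistic(fake, fit(fake)))    # refit, recompute statistic
    # bootstrap p-value: share of simulated statistics at least as extreme
    return (1 + np.sum(np.asarray(t_boot) >= t_obs)) / (n_boot + 1)
```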

10.
Incomplete or missing data is a common problem in almost all areas of empirical research. It is well known that simple and ad hoc methods such as complete case analysis or mean imputation can lead to biased and/or inefficient estimates. The method of maximum likelihood works well; however, when the missing data mechanism is not one of missing completely at random (MCAR) or missing at random (MAR), it too can result in incorrect inference. Statistical tests for MCAR have been proposed, but these are restricted to a certain class of problems. The idea of sensitivity analysis as a means to detect the missing data mechanism has been proposed in the statistics literature in conjunction with selection models where conjointly the data and missing data mechanism are modeled. Our approach is different here in that we do not model the missing data mechanism but use the data at hand to examine the sensitivity of a given model to the missing data mechanism. Our methodology is meant to raise a flag for researchers when the assumptions of MCAR (or MAR) do not hold. To our knowledge, no specific proposal for sensitivity analysis has been set forth in the area of structural equation models (SEM). This article gives a specific method for performing postmodeling sensitivity analysis using a statistical test and graphs. A simulation study is performed to assess the methodology in the context of structural equation models. This study shows success of the method, especially when the sample size is 300 or more and the percentage of missing data is 20% or more. The method is also used to study a set of real data measuring physical and social self-concepts in 463 Nigerian adolescents using a factor analysis model.

11.
Item response theory (IRT) is one of the modern educational and psychological measurement theories for objective measurement, and is widely used in the analysis of large-scale tests, where missing data are very common. Under the two-parameter logistic model (2PLM) in IRT, existing EM algorithms handle missing responses and missing abilities only under the missing-completely-at-random mechanism. This study derives an EM algorithm that ignores missing responses under the 2PLM, and proposes an EM algorithm for handling missing responses and missing abilities under the missing-at-random mechanism, as well as a multiple imputation method that accounts for the uncertainty in ability estimates and item responses. The results show that, across a range of missing data mechanisms, missing proportions, and test designs, the missing-response-ignoring EM algorithm and the multiple imputation method perform well.
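A minimal sketch of the 2PLM response probability, together with a person log-likelihood that simply skips missing responses, which is the "ignoring missing responses" device referred to above (array conventions are illustrative):

```python
# Sketch of the two-parameter logistic model (2PLM):
# P(X = 1 | theta) = 1 / (1 + exp(-a * (theta - b)))
import numpy as np

def p_correct(theta, a, b):
    """Probability of a correct response for ability theta,
    discrimination a, and difficulty b (a, b may be arrays over items)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def log_lik(theta, a, b, responses):
    """Person log-likelihood that skips missing responses (NaN),
    i.e., the likelihood is built from observed items only."""
    p = p_correct(theta, a, b)
    obs = ~np.isnan(responses)
    x = responses[obs]
    return np.sum(x * np.log(p[obs]) + (1 - x) * np.log(1 - p[obs]))
```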

12.
Inference methods for null hypotheses formulated in terms of distribution functions in general nonparametric factorial designs are studied. The methods can be applied to continuous, ordinal, or even ordered categorical data in a unified way, and are based only on ranks. In this set-up, Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal testing procedure for small to moderate sample sizes, while the latter is only an approximation that does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal-Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies support these theoretical findings. A real data set exemplifies its applicability.

13.
14.
Hierarchical classes models are quasi-order retaining Boolean decomposition models for N-way N-mode binary data. To fit these models to data, rationally started alternating least squares (or, equivalently, alternating least absolute deviations) algorithms have been proposed. Extensive simulation studies showed that these algorithms succeed quite well in recovering the underlying truth but frequently end in a local minimum. In this paper we evaluate whether or not this local minimum problem can be mitigated by means of two common strategies for avoiding local minima in combinatorial data analysis: simulated annealing (SA) and use of a multistart procedure. In particular, we propose a generic SA algorithm for hierarchical classes analysis and three different types of random starts. The effectiveness of the SA algorithm and the random starts is evaluated by reanalyzing data sets of previous simulation studies. The reported results support the use of the proposed SA algorithm in combination with a random multistart procedure, regardless of the properties of the data set under study. Eva Ceulemans is a post-doctoral fellow of the Fund for Scientific Research Flanders (Belgium). Iwin Leenen is a post-doctoral researcher of the Spanish Ministerio de Educación y Ciencia (programa Ramón y Cajal). The research reported in this paper was partially supported by the Research Council of K.U. Leuven (GOA/05/04).

15.
Reverse-scored items on assessment scales increase cognitive processing demands and may therefore lead to measurement problems for older adult respondents. In this study, the objective was to examine possible psychometric inadequacies of reverse-scored items on the Center for Epidemiologic Studies Depression Scale (CES-D) when used to assess ethnically diverse older adults. Using baseline data from a gerontologic clinical trial (n = 460), we tested the hypotheses that the reversed items on the CES-D (a) are less reliable than nonreversed items, (b) disproportionately lead to intraindividually atypical responses that are psychometrically problematic, and (c) evidence improved measurement properties when an imputation procedure based on the scale mean is used to replace atypical responses. In general, the results supported the hypotheses. Relative to nonreversed CES-D items, the 4 reversed items were less internally consistent, were associated with lower item-scale correlations, and were more often answered atypically at an intraindividual level. Further, the atypical responses were negatively correlated with responses to psychometrically sound nonreversed items that had similar content. The use of imputation to replace atypical responses enhanced the predictive validity of the set of reverse-scored items. Among older adult respondents, reverse-scored items are associated with measurement difficulties. It is recommended that appropriate correction procedures such as item readministration or statistical imputation be applied to reduce the difficulties.

16.
Missing data are a common issue in statistical analyses. Multiple imputation is a technique that has been applied in countless research studies and has a strong theoretical basis. Most of the statistical literature on multiple imputation has focused on unbounded continuous variables, with mostly ad hoc remedies for variables with bounded support. These approaches can be unsatisfactory when applied to bounded variables as they can produce misleading inferences. In this paper, we propose a flexible quantile-based imputation model suitable for distributions defined over singly or doubly bounded intervals. Proper support of the imputed values is ensured by applying a family of transformations with singly or doubly bounded range. Simulation studies demonstrate that our method is able to deal with skewness, bimodality, and heteroscedasticity and has superior properties as compared to competing approaches, such as log-normal imputation and predictive mean matching. We demonstrate the application of the proposed imputation procedure by analysing data on mathematical development scores in children from the Millennium Cohort Study, UK. We also show a specific advantage of our methods using a small psychiatric dataset. Our methods are relevant in a number of fields, including education and psychology.

17.
Ke-Hai Yuan, Psychometrika, 2009, 74(2), 233-256
When data are not missing at random (NMAR), the maximum likelihood (ML) procedure will not generate consistent parameter estimates unless the missing data mechanism is correctly modeled. Understanding the NMAR mechanism in a data set would allow one to better use the ML methodology. A survey or questionnaire may contain many items; certain items may be responsible for NMAR values in other items. The paper develops statistical procedures to identify the responsible items. By comparing ML estimates (MLEs), statistics are developed to test whether the MLEs change when items are excluded. The items that cause a significant change in the MLEs are responsible for the NMAR mechanism. The normal distribution is used for obtaining the MLEs; a sandwich-type covariance matrix is used to account for distribution violations. The class of nonnormal distributions within which the procedure is valid is provided. Both saturated and structural models are considered. Effect sizes are also defined and studied. The results indicate that more missing data in a sample does not necessarily imply more significant test statistics, owing to smaller effect sizes. Knowing the true population means and covariances, or the parameter values in structural equation models, may not make things easier either. The research was supported by NSF grant DMS04-37167 and the James McKeen Cattell Fund.

18.
Despite wide applications of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis including listwise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum likelihood (TS-ML) method. An R package bmem is developed to implement the four methods for mediation analysis with missing data in the structural equation modeling framework, and two real examples are used to illustrate the application of the four methods. The four methods are evaluated and compared under MCAR, MAR, and MNAR missing data mechanisms through simulation studies. Both MI and TS-ML perform well for MCAR and MAR data regardless of the inclusion of auxiliary variables and for AV-MNAR data with auxiliary variables. Although listwise deletion and pairwise deletion have low power and large parameter estimation bias in many studied conditions, they may provide useful information for exploring missing mechanisms.

19.
EM and beyond
The basic theme of the EM algorithm, to repeatedly use complete-data methods to solve incomplete-data problems, is also a theme of several more recent statistical techniques. These techniques—multiple imputation, data augmentation, stochastic relaxation, and sampling importance resampling—combine simulation techniques with complete-data methods to attack problems that are difficult or impossible for EM. A preliminary version of this article was the Keynote Address at the 1987 European Meeting of the Psychometric Society, June 24-26, 1987, in Enschede, The Netherlands. The author wishes to thank the editor and reviewers for helpful comments.
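A minimal EM sketch in exactly this spirit: the unobserved component labels of a two-component Gaussian mixture play the role of the missing data, and each M-step is an ordinary complete-data (weighted) estimate:

```python
# Sketch of EM for a two-component univariate Gaussian mixture. The E-step
# fills in the "missing" component labels with posterior probabilities;
# the M-step applies standard complete-data estimators, weighted by them.
import numpy as np
from scipy import stats

def em_two_gaussians(x, n_iter=100):
    mu = np.array([x.min(), x.max()])      # crude starting values
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        dens = np.array([pi[k] * stats.norm.pdf(x, mu[k], sigma[k])
                         for k in range(2)])
        resp = dens / dens.sum(axis=0)
        # M-step: weighted complete-data estimates
        nk = resp.sum(axis=1)
        pi = nk / len(x)
        mu = resp @ x / nk
        sigma = np.sqrt(((x[None, :] - mu[:, None]) ** 2 * resp).sum(axis=1) / nk)
    return mu, sigma, pi
```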

20.
Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.
