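陈楠 (Chen Nan), 刘红云 (Liu Hongyun). 《心理科学》 (Journal of Psychological Science), 2015, (2): 446-451
For latent growth models with missing-not-at-random data, this study compares two missing data handling methods that rest on different assumptions: the maximum likelihood (ML) method and the Diggle-Kenward selection model. A Monte Carlo simulation study compares the two methods on the accuracy of growth parameter estimates and of their standard error estimates, and considers the effects of sample size, the proportion of non-randomly missing data, and the proportion of randomly missing data. The results show that the Diggle-Kenward selection model, whose assumptions are met, generally yields more accurate parameter estimates than the ML method. For standard errors, the ML method underestimates to some degree, and the coverage rates of the resulting confidence intervals are also markedly lower than those of the Diggle-Kenward selection model.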

The past decade has seen a noticeable shift in missing data handling techniques that assume a missing at random (MAR) mechanism, where the propensity for missing data on an outcome is related to other analysis variables. Although MAR is often reasonable, there are situations where this assumption is unlikely to hold, leading to biased parameter estimates. One such example is a longitudinal study of substance use where participants with the highest frequency of use also have the highest likelihood of attrition, even after controlling for other correlates of missingness. There is a large body of literature on missing not at random (MNAR) analysis models for longitudinal data, particularly in the field of biostatistics. Because these methods allow for a relationship between the outcome variable and the propensity for missing data, they require a weaker assumption about the missing data mechanism. This article describes 2 classic MNAR modeling approaches for longitudinal data: the selection model and the pattern mixture model. To date, these models have been slow to migrate to the social sciences, in part because they required complicated custom computer programs. These models are now quite easy to estimate in popular structural equation modeling programs, particularly Mplus. The purpose of this article is to describe these MNAR modeling frameworks and to illustrate their application on a real data set. Despite their potential advantages, MNAR-based analyses are not without problems and also rely on untestable assumptions. This article offers practical advice for implementing and choosing among different longitudinal models.

Missing not at random (MNAR) modeling for non-ignorable missing responses usually assumes that the latent variable distribution is a bivariate normal distribution. Such an assumption is rarely verified and often employed as a standard in practice. Recent studies for “complete” item responses (i.e., no missing data) have shown that ignoring the nonnormal distribution of a unidimensional latent variable, especially a skewed or bimodal one, can yield biased estimates and misleading conclusions. However, how to deal with a bivariate nonnormal latent variable distribution in the presence of MNAR data has not been investigated. This article proposes to extend unidimensional empirical histogram and Davidian curve methods to simultaneously deal with nonnormal latent variable distribution and MNAR data. A simulation study is carried out to demonstrate the consequence of ignoring bivariate nonnormal distribution on parameter estimates, followed by an empirical analysis of “don’t know” item responses. The results presented in this article show that examining the assumption of bivariate nonnormal latent variable distribution should be considered as a routine for MNAR data to minimize the impact of nonnormality on parameter estimates.

Examinee‐selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non‐ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two‐dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non‐ignorable and to determine how to apply the new model to the data collected. Two follow‐up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non‐ignorable missing data were mistakenly treated as ignorable.

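Missing data are very common in longitudinal research. Using a Monte Carlo simulation study, this article examines how the Diggle-Kenward selection model and the ML method, which rest on different assumptions, differ in the accuracy of growth parameter estimates, and considers the effects of sample size, missing data proportion, the distributional shape of the target variable, and different missing data mechanisms. The results show that: (1) The missing data mechanism strongly affects the MAR-based ML method; under an MNAR mechanism, its estimates of the intercept mean and slope mean in the latent growth model (LGM) are not robust. (2) The Diggle-Kenward selection model is more susceptible to the degree of skewness of the target variable's distribution; sample size interacts with skewness, and the effect of skewness weakens when the sample size is larger. The ML method, by contrast, is only slightly affected by skewness, and only under the MNAR mechanism.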

This article proposes an approach to modelling partially cross‐classified multilevel data where some of the level‐1 observations are nested in one random factor and some are cross‐classified by two random factors. A simulation study compares the proposed approach with two other commonly used approaches, which treat the partially cross‐classified data as either fully nested or fully cross‐classified. Results show that the proposed approach demonstrates desirable performance in terms of parameter estimates and statistical inferences. Both the fully nested model and the fully cross‐classified model suffer from biased estimates of some variance components and biased statistical inferences for some fixed effects. Results also indicate that the proposed model is robust against cluster size imbalance.

Cross‐classified random effects modelling (CCREM) is a special case of multi‐level modelling where the units of one level are nested within two cross‐classified factors. Typically, CCREM analyses omit the random interaction effect of the cross‐classified factors. We investigate the impact of the omission of the interaction effect on parameter estimates and standard errors. Results from a Monte Carlo simulation study indicate that, for fixed effects, both the coefficient estimates and the accompanying standard error estimates are unbiased. For random effects, results are affected at level 2 but not at level 1 by the presence of an interaction variance and/or a correlation between the residuals of the level‐2 factors. Results from the analysis of the Early Childhood Longitudinal Study and the National Educational Longitudinal Study agree with those obtained from simulated data. We recommend that researchers attempt to include interaction effects of cross‐classified factors in their models.

Researchers conducting longitudinal studies with children or adults are inevitably confronted with problems of attrition and missing data. Missing data in longitudinal studies is frequently handled by excluding from analyses those cases for whom data are incomplete. This approach to missing data is not optimal. On the one hand, if data are missing at random, then dropping incomplete cases ignores information collected on those cases that could be used to improve estimates of population parameters (e.g., means, variances, covariances, and growth rates) and improve the power of significance tests of statistical hypotheses. On the other hand, if data are not missing at random, then dropping incomplete cases leads to biased parameter estimates and hypothesis tests that may be internally and externally invalid. This study uses three years of follow-up data from a longitudinal investigation of neuropsychological outcomes of cancer in children to demonstrate the problems presented by missing data in repeated measures designs and some solutions. In evaluating potential biasing effects of attrition, the study extends previous research on neuropsychological outcomes in pediatric cancer by inclusion of patients whose disease had relapsed, and by comparison of surviving and nonsurviving patients. Although the data presented have specific relevance to the study of neuropsychological outcome in pediatric cancer, the problems of missing data and the solutions presented are relevant to a wide variety of diseases and conditions of interest to researchers in child and adult neuropsychology.

A Monte Carlo study was used to compare four approaches to growth curve analysis of subjects assessed repeatedly with the same set of dichotomous items: A two‐step procedure first estimating latent trait measures using MULTILOG and then using a hierarchical linear model to examine the changing trajectories with the estimated abilities as the outcome variable; a structural equation model using modified weighted least squares (WLSMV) estimation; and two approaches in the framework of multilevel item response models, including a hierarchical generalized linear model using Laplace estimation, and Bayesian analysis using Markov chain Monte Carlo (MCMC). These four methods have similar power in detecting the average linear slope across time. MCMC and Laplace estimates perform relatively better on the bias of the average linear slope and corresponding standard error, as well as the item location parameters. For the variance of the random intercept, and the covariance between the random intercept and slope, all estimates are biased in most conditions. For the random slope variance, only Laplace estimates are unbiased when there are eight time points.

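宋枝璘 (Song Zhilin), 郭磊 (Guo Lei), 郑天鹏 (Zheng Tianpeng). 《心理学报》 (Acta Psychologica Sinica), 2022, 54(4): 426-440
Missing data occur frequently in testing, and cognitive diagnostic assessment is no exception; missing data can bias diagnostic results. First, a simulation study compared commonly used missing data handling methods under a variety of experimental conditions. The results show that: (1) Missing data reduce estimation accuracy; as the numbers of examinees and items decrease, the missing rate increases, and item quality declines, the PCCR of all methods falls, while the absolute Bias and the RMSE both rise. (2) For estimating item parameters, the EM method performs best, followed by MI; the FIML and ZR methods perform unstably. (3) For estimating examinees' knowledge states, EM and FIML perform best; MI and ZR perform unstably. Second, the performance of the different methods was further explored using PISA 2015 empirical data. Taking the simulation and empirical results together, the EM or FIML method is recommended for handling missing data.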


Drop out is a typical issue in longitudinal studies. When the missingness is non-ignorable, inference based on the observed data only may be biased. This paper is motivated by the Leiden 85+ study, a longitudinal study conducted to analyze the dynamics of cognitive functioning in the elderly. We account for dependence between longitudinal responses from the same subject using time-varying random effects associated with a heterogeneous hidden Markov chain. As several participants in the study drop out prematurely, we introduce a further random effect model to describe the missing data mechanism. The potential dependence between the random effects in the two equations (and, therefore, between the two processes) is introduced through a joint distribution specified via a latent structure approach. The application of the proposal to data from the Leiden 85+ study shows its effectiveness in modeling heterogeneous longitudinal patterns, possibly influenced by the missing data process. Results from a sensitivity analysis show the robustness of the estimates with respect to misspecification of the missing data mechanism. A simulation study provides evidence for the reliability of the inferential conclusions drawn from the analysis of the Leiden 85+ data.

Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when applied with traditional IRT models. Additionally, a nonimputation direct likelihood analysis has been shown to be an effective tool for handling missing observations in clustered data settings. This study investigates the performance of six simple imputation methods, which have been found to be useful in other IRT contexts, versus a direct likelihood analysis, in multilevel data from educational settings. Multilevel item response data were simulated on the basis of two empirical data sets, and some of the item scores were deleted, such that they were missing either completely at random or simply at random. An explanatory IRT model was used for modeling the complete, incomplete, and imputed data sets. We showed that direct likelihood analysis of the incomplete data sets produced unbiased parameter estimates that were comparable to those from a complete data analysis. Multiple-imputation approaches of the two-way mean and corrected item mean substitution methods displayed varying degrees of effectiveness in imputing data that in turn could produce unbiased parameter estimates. The simple random imputation, adjusted random imputation, item means substitution, and regression imputation methods seemed to be less effective in imputing missing item scores in multilevel data settings.

Observers in a two‐alternative forced‐choice (2AFC) detection task face the need to produce a response at random (a guess) on trials in which neither presentation appeared to display a stimulus. Observers could alternatively be instructed to use a ‘guess’ key on those trials, a key that would produce a random guess and would also record the resultant correct or wrong response as emanating from a computer‐generated guess. A simulation study shows that ‘denoising’ 2AFC data with information regarding which responses are a result of guesses yields estimates of detection threshold and spread of the psychometric function that are far more precise than those obtained in the absence of this information, and parallel the precision of estimates obtained with yes–no tasks running for the same number of trials. Simulations also show that partial compliance with the instructions to use the ‘guess’ key reduces the quality of the estimates, which nevertheless continue to be more precise than those obtained from conventional 2AFC data if the observers are still moderately compliant. An empirical study testing the validity of simulation results showed that denoised 2AFC estimates of spread were clearly superior to conventional 2AFC estimates and similar to yes–no estimates, but variations in threshold across observers and across sessions hid the benefits of denoising for threshold estimation. The empirical study also proved the feasibility of using a ‘guess’ key in addition to the conventional response keys defined in 2AFC tasks.

Moderation analysis is useful for addressing interesting research questions in social sciences and behavioural research. In practice, moderated multiple regression (MMR) models have been most widely used. However, missing data pose a challenge, mainly because the interaction term is a product of two or more variables and thus is a non-linear function of the involved variables. Normal-distribution-based maximum likelihood (NML) has been proposed and applied for estimating MMR models with incomplete data. When data are missing completely at random, moderation effect estimates are consistent. However, simulation results have found that when data in the predictor are missing at random (MAR), NML can yield inaccurate estimates of moderation effects when the moderation effects are non-null. Simulation studies are subject to the limitation of confounding systematic bias with sampling errors. Thus, the purpose of this paper is to analytically derive asymptotic bias of NML estimates of moderation effects with MAR data. Results show that when the moderation effect is zero, there is no asymptotic bias in moderation effect estimates with either normal or non-normal data. When the moderation effect is non-zero, however, asymptotic bias may exist and is determined by factors such as the moderation effect size, missing-data proportion, and type of missingness dependence. Our analytical results suggest that researchers should apply NML to MMR models with caution when missing data exist. Suggestions are given regarding moderation analysis with missing data.

Structural equation models (SEMs) have become widely used to determine the interrelationships between latent and observed variables in social, psychological, and behavioural sciences. As heterogeneous data are very common in practical research in these fields, the analysis of mixture models has received a lot of attention in the literature. An important issue in the analysis of mixture SEMs is the presence of missing data, in particular of data missing with a non‐ignorable mechanism. However, only a limited amount of work has been done in analysing mixture SEMs with non‐ignorable missing data. The main objective of this paper is to develop a Bayesian approach for analysing mixture SEMs with an unknown number of components and non‐ignorable missing data. A simulation study shows that Bayesian estimates obtained by the proposed Markov chain Monte Carlo methods are accurate and the Bayes factor computed via a path sampling procedure is useful for identifying the correct number of components, selecting an appropriate missingness mechanism, and investigating various effects of latent variables in the mixture SEMs. A real data set on a study of job satisfaction is used to demonstrate the methodology.

Missing values are a practical issue in the analysis of longitudinal data. Multiple imputation (MI) is a well‐known likelihood‐based method that has optimal properties in terms of efficiency and consistency if the imputation model is correctly specified. Doubly robust (DR) weighting‐based methods protect against misspecification bias if one of the models, but not necessarily both, for the data or the mechanism leading to missing data is correct. We propose a new imputation method that combines the simplicity of MI with the protection of the DR method. This method integrates MI and DR to protect against misspecification of the imputation model under a missing at random assumption. Our method avoids analytical complications of missing data particularly in multivariate settings, and is easy to implement in standard statistical packages. Moreover, the proposed method works very well with an intermittent pattern of missingness, where other DR methods cannot be used. Simulation experiments show that the proposed approach achieves improved performance when one of the models is correct. The method is applied to data from the fireworks disaster study, a randomized clinical trial comparing therapies in disaster‐exposed children. We conclude that the new method increases the robustness of imputations.

Results are described for a survey assessing prevalence of missing data and reporting practices in studies with missing data in a random sample of empirical research journal articles from the PsycINFO database for the year 1999, two years prior to the publication of a special section on missing data in Psychological Methods. Analysis indicates missing data problems were found in about one-third of the studies. Further, analytical methods and reporting practices varied widely for studies with missing data. One may consider these results as baseline data to assess progress as reporting standards evolve for studies with missing data. Some potential reporting standards are discussed.

Incomplete or missing data is a common problem in almost all areas of empirical research. It is well known that simple and ad hoc methods such as complete case analysis or mean imputation can lead to biased and/or inefficient estimates. The method of maximum likelihood works well; however, when the missing data mechanism is not one of missing completely at random (MCAR) or missing at random (MAR), it too can result in incorrect inference. Statistical tests for MCAR have been proposed, but these are restricted to a certain class of problems. The idea of sensitivity analysis as a means to detect the missing data mechanism has been proposed in the statistics literature in conjunction with selection models, in which the data and the missing data mechanism are modeled jointly. Our approach is different here in that we do not model the missing data mechanism but use the data at hand to examine the sensitivity of a given model to the missing data mechanism. Our methodology is meant to raise a flag for researchers when the assumptions of MCAR (or MAR) do not hold. To our knowledge, no specific proposal for sensitivity analysis has been set forth in the area of structural equation models (SEM). This article gives a specific method for performing postmodeling sensitivity analysis using a statistical test and graphs. A simulation study is performed to assess the methodology in the context of structural equation models. This study shows success of the method, especially when the sample size is 300 or more and the percentage of missing data is 20% or more. The method is also used to study a set of real data measuring physical and social self-concepts in 463 Nigerian adolescents using a factor analysis model.

This meta‐analysis comprised clinical trials exploring the effectiveness of counseling and psychotherapy in the treatment of depression in school‐age youth. Results were synthesized using a random effects model for mean difference and mean gain effect size estimates. No effects of moderating variables were evident. Counseling and psychotherapy are effective for treatment of depression in school‐age youth both at termination and follow‐up, and in school and nonschool settings.
