期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Methods for Mediation Analysis with Missing Data

Zhiyong Zhang Lijuan Wang 《Psychometrika》2013,78(1):154-184

Despite wide applications of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis including listwise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum likelihood (TS-ML) method. An R package bmem is developed to implement the four methods for mediation analysis with missing data in the structural equation modeling framework, and two real examples are used to illustrate the application of the four methods. The four methods are evaluated and compared under MCAR, MAR, and MNAR missing data mechanisms through simulation studies. Both MI and TS-ML perform well for MCAR and MAR data regardless of the inclusion of auxiliary variables and for AV-MNAR data with auxiliary variables. Although listwise deletion and pairwise deletion have low power and large parameter estimation bias in many studied conditions, they may provide useful information for exploring missing mechanisms. 相似文献

2.

A Class of Distribution-Free Models for Longitudinal Mediation Analysis

D. Gunzler W. Tang N. Lu P. Wu X. M. Tu 《Psychometrika》2014,79(4):543-568

Mediation analysis constitutes an important part of treatment study to identify the mechanisms by which an intervention achieves its effect. Structural equation model (SEM) is a popular framework for modeling such causal relationship. However, current methods impose various restrictions on the study designs and data distributions, limiting the utility of the information they provide in real study applications. In particular, in longitudinal studies missing data is commonly addressed under the assumption of missing at random (MAR), where current methods are unable to handle such missing data if parametric assumptions are violated. In this paper, we propose a new, robust approach to address the limitations of current SEM within the context of longitudinal mediation analysis by utilizing a class of functional response models (FRM). Being distribution-free, the FRM-based approach does not impose any parametric assumption on data distributions. In addition, by extending the inverse probability weighted (IPW) estimates to the current context, the FRM-based SEM provides valid inference for longitudinal mediation analysis under the two most popular missing data mechanisms; missing completely at random (MCAR) and missing at random (MAR). We illustrate the approach with both real and simulated data. 相似文献

3.

2PLM下缺失数据处理方法及其比较

汪文义宋丽红罗芬丁树良《心理科学》2016,39(6):1500-1507

项目反应理论(IRT)是用于客观测量的现代教育与心理测量理论之一,广泛用于缺失数据十分常见的大尺度测验分析。IRT中两参数逻辑斯蒂克模型(2PLM)下仅有完全随机缺失机制下缺失反应和缺失能力处理的EM算法。本研究推导2PLM下缺失反应忽略的EM 算法,并提出随机缺失机制下缺失反应和缺失能力处理的EM算法和考虑能力估计和作答反应不确定性的多重借补法。研究显示：在各种缺失机制、缺失比例和测验设计下,缺失反应忽略的EM算法和多重借补法表现理想。相似文献

4.

认知诊断缺失数据处理方法的比较：零替换、多重插补与极大似然估计法

宋枝璘郭磊郑天鹏《心理学报》2022,54(4):426-440

数据缺失在测验中经常发生, 认知诊断评估也不例外, 数据缺失会导致诊断结果的偏差。首先, 通过模拟研究在多种实验条件下比较了常用的缺失数据处理方法。结果表明：(1)缺失数据导致估计精确性下降, 随着人数与题目数量减少、缺失率增大、题目质量降低, 所有方法的PCCR均下降, Bias绝对值和RMSE均上升。(2)估计题目参数时, EM法表现最好, 其次是MI, FIML和ZR法表现不稳定。(3)估计被试知识状态时, EM和FIML表现最好, MI和ZR表现不稳定。其次, 在PISA2015实证数据中进一步探索了不同方法的表现。综合模拟和实证研究结果, 推荐选用EM或FIML法进行缺失数据处理。相似文献

5.

Missing not at random models for latent growth curve analyses

Enders CK 《心理学方法》2011,16(1):1-16

The past decade has seen a noticeable shift in missing data handling techniques that assume a missing at random (MAR) mechanism, where the propensity for missing data on an outcome is related to other analysis variables. Although MAR is often reasonable, there are situations where this assumption is unlikely to hold, leading to biased parameter estimates. One such example is a longitudinal study of substance use where participants with the highest frequency of use also have the highest likelihood of attrition, even after controlling for other correlates of missingness. There is a large body of literature on missing not at random (MNAR) analysis models for longitudinal data, particularly in the field of biostatistics. Because these methods allow for a relationship between the outcome variable and the propensity for missing data, they require a weaker assumption about the missing data mechanism. This article describes 2 classic MNAR modeling approaches for longitudinal data: the selection model and the pattern mixture model. To date, these models have been slow to migrate to the social sciences, in part because they required complicated custom computer programs. These models are now quite easy to estimate in popular structural equation modeling programs, particularly Mplus. The purpose of this article is to describe these MNAR modeling frameworks and to illustrate their application on a real data set. Despite their potential advantages, MNAR-based analyses are not without problems and also rely on untestable assumptions. This article offers practical advice for implementing and choosing among different longitudinal models. 相似文献

6.

Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances

Ke-Hai Yuan Mortaza Jamshidian Yutaka Kano 《Psychometrika》2018,83(2):425-442

Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures were developed via testing the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the population counterparts of the sample means and covariances of a given pattern of the observed data depend on the underlying structure that generates the data, and the normal-distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the underlying population distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used for testing the MCAR mechanism even when the population distribution is multivariate normal. 相似文献

7.

Full Information Maximum Likelihood Estimation for Latent Variable Interactions With Incomplete Indicators

Heining Cham Evgeniya Reshetnyak Barry Rosenfeld William Breitbart 《Multivariate behavioral research》2017,52(1):12-30

Researchers have developed missing data handling techniques for estimating interaction effects in multiple regression. Extending to latent variable interactions, we investigated full information maximum likelihood (FIML) estimation to handle incompletely observed indicators for product indicator (PI) and latent moderated structural equations (LMS) methods. Drawing on the analytic work on missing data handling techniques in multiple regression with interaction effects, we compared the performance of FIML for PI and LMS analytically. We performed a simulation study to compare FIML for PI and LMS. We recommend using FIML for LMS when the indicators are missing completely at random (MCAR) or missing at random (MAR) and when they are normally distributed. FIML for LMS produces unbiased parameter estimates with small variances, correct Type I error rates, and high statistical power of interaction effects. We illustrated the use of these methods by analyzing the interaction effect between advanced cancer patients’ depression and change of inner peace well-being on future hopelessness levels. 相似文献

8.

SEM with Missing Data and Unknown Population Distributions Using Two-Stage ML: Theory and Its Application

Ke-Hai Yuan Laura Lu 《Multivariate behavioral research》2013,48(4):621-652

This article provides the theory and application of the 2-stage maximum likelihood (ML) procedure for structural equation modeling (SEM) with missing data. The validity of this procedure does not require the assumption of a normally distributed population. When the population is normally distributed and all missing data are missing at random (MAR), the direct ML procedure is nearly optimal for SEM with missing data. When missing data mechanisms are unknown, including auxiliary variables in the analysis will make the missing data mechanism more likely to be MAR. It is much easier to include auxiliary variables in the 2-stage ML than in the direct ML. Based on most recent developments for missing data with an unknown population distribution, the article first provides the least technical material on why the normal distribution-based ML generates consistent parameter estimates when the missing data mechanism is MAR. The article also provides sufficient conditions for the 2-stage ML to be a valid statistical procedure in the general case. For the application of the 2-stage ML, an SAS IML program is given to perform the first-stage analysis and EQS codes are provided to perform the second-stage analysis. An example with open- and closed-book examination data is used to illustrate the application of the provided programs. One aim is for quantitative graduate students/applied psychometricians to understand the technical details for missing data analysis. Another aim is for applied researchers to use the method properly. 相似文献

9.

Abstract: Evaluation of Test Statistics for Robust Structural Equation Modeling With Nonnormal Missing Data

Xin Tong Zhiyong Zhang Ke-Hai Yuan 《Multivariate behavioral research》2013,48(6)

Traditional structural equation modeling (SEM) techniques have trouble dealing with incomplete and/or nonnormal data that are often encountered in practice. Yuan and Zhang (2011a) developed a two-stage procedure for SEM to handle nonnormal missing data and proposed four test statistics for overall model evaluation. Although these statistics have been shown to work well with complete data, their performance for incomplete data has not been investigated in the context of robust statistics.

Focusing on a linear growth curve model, a systematic simulation study is conducted to evaluate the accuracy of the parameter estimates and the performance of five test statistics including the naive statistic derived from normal distribution based maximum likelihood (ML), the Satorra-Bentler scaled chi-square statistic (RML), the mean- and variance-adjusted chi-square statistic (AML), Yuan-Bentler residual-based test statistic (CRADF), and Yuan-Bentler residual-based F statistic (RF). Data are generated and analyzed in R using the package rsem (Yuan & Zhang, 2011b).

Based on the simulation study, we can observe the following: (a) The traditional normal distribution-based method cannot yield accurate parameter estimates for nonnormal data, whereas the robust method obtains much more accurate model parameter estimates for nonnormal data and performs almost as well as the normal distribution based method for normal distributed data. (b) With the increase of sample size, or the decrease of missing rate or the number of outliers, the parameter estimates are less biased and the empirical distributions of test statistics are closer to their nominal distributions. (c) The ML test statistic does not work well for nonnormal or missing data. (d) For nonnormal complete data, CRADF and RF work relatively better than RML and AML. (e) For missing completely at random (MCAR) missing data, in almost all the cases, RML and AML work better than CRADF and RF. (f) For nonnormal missing at random (MAR) missing data, CRADF and RF work better than AML. (g) The performance of the robust method does not seem to be influenced by the symmetry of outliers. 相似文献

10.

Asymptotic bias of normal-distribution-based maximum likelihood estimates of moderation effects with data missing at random

Qian Zhang Ke-Hai Yuan Lijuan Wang 《The British journal of mathematical and statistical psychology》2019,72(2):334-354

Moderation analysis is useful for addressing interesting research questions in social sciences and behavioural research. In practice, moderated multiple regression (MMR) models have been most widely used. However, missing data pose a challenge, mainly because the interaction term is a product of two or more variables and thus is a non-linear function of the involved variables. Normal-distribution-based maximum likelihood (NML) has been proposed and applied for estimating MMR models with incomplete data. When data are missing completely at random, moderation effect estimates are consistent. However, simulation results have found that when data in the predictor are missing at random (MAR), NML can yield inaccurate estimates of moderation effects when the moderation effects are non-null. Simulation studies are subject to the limitation of confounding systematic bias with sampling errors. Thus, the purpose of this paper is to analytically derive asymptotic bias of NML estimates of moderation effects with MAR data. Results show that when the moderation effect is zero, there is no asymptotic bias in moderation effect estimates with either normal or non-normal data. When the moderation effect is non-zero, however, asymptotic bias may exist and is determined by factors such as the moderation effect size, missing-data proportion, and type of missingness dependence. Our analytical results suggest that researchers should apply NML to MMR models with caution when missing data exist. Suggestions are given regarding moderation analysis with missing data. 相似文献

11.

基于增长模型的非随机缺失数据处理:选择模型和极大似然方法

陈楠刘红云《心理科学》2015,(2):446-451

对含有非随机缺失数据的潜变量增长模型,为了考察基于不同假设的缺失数据处理方法:极大似然(ML)方法与DiggleKenward选择模型的优劣,通过Monte Carlo模拟研究,比较两种方法对模型中增长参数估计精度及其标准误估计的差异,并考虑样本量、非随机缺失比例和随机缺失比例的影响。结果表明,符合前提假设的Diggle-Kenward选择模型的参数估计精度普遍高于ML方法;对于标准误估计值,ML方法存在一定程度的低估,得到的置信区间覆盖比率也明显低于Diggle-Kenward选择模型。相似文献

12.

Identifying Variables Responsible for Data not Missing at Random

Ke-Hai Yuan 《Psychometrika》2009,74(2):233-256

When data are not missing at random (NMAR), maximum likelihood (ML) procedure will not generate consistent parameter estimates unless the missing data mechanism is correctly modeled. Understanding NMAR mechanism in a data set would allow one to better use the ML methodology. A survey or questionnaire may contain many items; certain items may be responsible for NMAR values in other items. The paper develops statistical procedures to identify the responsible items. By comparing ML estimates (MLE), statistics are developed to test whether the MLEs are changed when excluding items. The items that cause a significant change of the MLEs are responsible for the NMAR mechanism. Normal distribution is used for obtaining the MLEs; a sandwich-type covariance matrix is used to account for distribution violations. The class of nonnormal distributions within which the procedure is valid is provided. Both saturated and structural models are considered. Effect sizes are also defined and studied. The results indicate that more missing data in a sample does not necessarily imply more significant test statistics due to smaller effect sizes. Knowing the true population means and covariances or the parameter values in structural equation models may not make things easier either. The research was supported by NSF grant DMS04-37167, the James McKeen Cattell Fund. 相似文献

13.

Fisher transformations for correlations corrected for selection and missing data

Jorge L. Mendoza 《Psychometrika》1993,58(4):601-615

The validity of a test is often estimated in a nonrandom sample of selected individuals. To accurately estimate the relation between the predictor and the criterion we correct this correlation for range restriction. Unfortunately, this corrected correlation cannot be transformed using Fisher'sZ transformation, and asymptotic tests of hypotheses based on small or moderate samples are not accurate. We developed a Fisherr toZ transformation for the corrected correlation for each of two conditions: (a) the criterion data were missing due to selection on the predictor (the missing data were MAR); and (b) the criterion was missing at random, not due to selection (the missing data were MCAR). The twoZ transformations were evaluated in a computer simulation. The transformations were accurate, and tests of hypotheses and confidence intervals based on the transformations were superior to those that were not based on the transformations. 相似文献

14.

Tests of Homoscedasticity,Normality, and Missing Completely at Random for Incomplete Multivariate Data

Jamshidian M Jalal S 《Psychometrika》2010,75(4):649-674

Test of homogeneity of covariances (or homoscedasticity) among several groups has many applications in statistical analysis. In the context of incomplete data analysis, tests of homoscedasticity among groups of cases with identical missing data patterns have been proposed to test whether data are missing completely at random (MCAR). These tests of MCAR require large sample sizes n and/or large group sample sizes n _i, and they usually fail when applied to nonnormal data. Hawkins (Technometrics 23:105–110, 1981) proposed a test of multivariate normality and homoscedasticity that is an exact test for complete data when n _i are small. This paper proposes a modification of this test for complete data to improve its performance, and extends its application to test of homoscedasticity and MCAR when data are multivariate normal and incomplete. Moreover, it is shown that the statistic used in the Hawkins test in conjunction with a nonparametric k-sample test can be used to obtain a nonparametric test of homoscedasticity that works well for both normal and nonnormal data. It is explained how a combination of the proposed normal-theory Hawkins test and the nonparametric test can be employed to test for homoscedasticity, MCAR, and multivariate normality. Simulation studies show that the newly proposed tests generally outperform their existing competitors in terms of Type I error rejection rates. Also, a power study of the proposed tests indicates good power. The proposed methods use appropriate missing data imputations to impute missing data. Methods of multiple imputation are described and one of the methods is employed to confirm the result of our single imputation methods. Examples are provided where multiple imputation enables one to identify a group or groups whose covariance matrices differ from the majority of other groups. 相似文献

15.

Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data

Sik-Yum Lee 《Psychometrika》2006,71(3):541-564

A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis–Hastings algorithm is used to produce the joint Bayesian estimates of structural parameters, latent variables, parameters in the nonignorable missing model, as well as their standard errors estimates. A goodness-of-fit statistic for assessing the plausibility of the posited nonlinear structural equation model is introduced, and a procedure for computing the Bayes factor for model comparison is developed via path sampling. Results obtained with respect to different missing data models, and different prior inputs are compared via simulation studies. In particular, it is shown that in the presence of nonignorable missing data, results obtained by the proposed method with a nonignorable missing data model are significantly better than those that are obtained under the missing at random assumption. A real example is presented to illustrate the newly developed Bayesian methodologies. This research is fully supported by a grant (CUHK 4243/03H) from the Research Grant Council of the Hong Kong Special Administration Region. The authors are thankful to the editor and reviewers for valuable comments for improving the paper, and also to ICPSR and the relevant funding agency for allowing the use of the data. Requests for reprints should be sent to Professor S.Y. Lee, Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. 相似文献

16.

Pairwise likelihood estimation for confirmatory factor analysis models with categorical variables and data that are missing at random

Myrsini Katsikatsou Irini Moustaki Haziq Jamil 《The British journal of mathematical and statistical psychology》2022,75(1):23-45

Methods for the treatment of item non-response in attitudinal scales and in large-scale assessments under the pairwise likelihood (PL) estimation framework and under a missing at random (MAR) mechanism are proposed. Under a full information likelihood estimation framework and MAR, ignorability of the missing data mechanism does not lead to biased estimates. However, this is not the case for pseudo-likelihood approaches such as the PL. We develop and study the performance of three strategies for incorporating missing values into confirmatory factor analysis under the PL framework, the complete-pairs (CP), the available-cases (AC) and the doubly robust (DR) approaches. The CP and AC require only a model for the observed data and standard errors are easy to compute. Doubly-robust versions of the PL estimation require a predictive model for the missing responses given the observed ones and are computationally more demanding than the AC and CP. A simulation study is used to compare the proposed methods. The proposed methods are employed to analyze the UK data on numeracy and literacy collected as part of the OECD Survey of Adult Skills. 相似文献

17.

Missing data techniques for structural equation modeling 总被引：2，自引：0，他引：2

Allison PD 《Journal of abnormal psychology》2003,112(4):545-557

As with other statistical methods, missing data often create major problems for the estimation of structural equation models (SEMs). Conventional methods such as listwise or pairwise deletion generally do a poor job of using all the available information. However, structural equation modelers are fortunate that many programs for estimating SEMs now have maximum likelihood methods for handling missing data in an optimal fashion. In addition to maximum likelihood, this article also discusses multiple imputation. This method has statistical properties that are almost as good as those for maximum likelihood and can be applied to a much wider array of models and estimation methods. 相似文献

18.

Bias in longitudinal data analysis with missing data using typical linear mixed‐effects modelling and pattern‐mixture approach: An analytical illustration

下载免费PDF全文

Manshu Yang Lijuan Wang Scott E. Maxwell 《The British journal of mathematical and statistical psychology》2015,68(2):246-267

We analytically derive the fixed‐effects estimates in unconditional linear growth curve models by typical linear mixed‐effects modelling (TLME) and by a pattern‐mixture (PM) approach with random‐slope‐dependent two‐missing‐pattern missing not at random (MNAR) longitudinal data. Results showed that when the missingness mechanism is random‐slope‐dependent MNAR, TLME estimates of both the mean intercept and mean slope are biased because of incorrect weights used in the estimation. More specifically, the estimate of the mean slope is biased towards the mean slope for completers, whereas the estimate of the mean intercept is biased towards the opposite direction as compared to the estimate of the mean slope. We also discuss why the PM approach can provide unbiased fixed‐effects estimates for random‐coefficients‐dependent MNAR data but does not work well for missing at random or outcome‐dependent MNAR data. A small simulation study was conducted to illustrate the results and to compare results from TLME and PM. Results from an empirical data analysis showed that the conceptual finding can be generalized to other real conditions even when some assumptions for the analytical derivation cannot be met. Implications from the analytical and empirical results were discussed and sensitivity analysis was suggested for longitudinal data analysis with missing data. 相似文献

19.

Dual imputation model for incomplete longitudinal data

Shahab Jolani Laurence E. Frank Stef van Buuren 《The British journal of mathematical and statistical psychology》2014,67(2):197-212

Missing values are a practical issue in the analysis of longitudinal data. Multiple imputation (MI) is a well‐known likelihood‐based method that has optimal properties in terms of efficiency and consistency if the imputation model is correctly specified. Doubly robust (DR) weighing‐based methods protect against misspecification bias if one of the models, but not necessarily both, for the data or the mechanism leading to missing data is correct. We propose a new imputation method that captures the simplicity of MI and protection from the DR method. This method integrates MI and DR to protect against misspecification of the imputation model under a missing at random assumption. Our method avoids analytical complications of missing data particularly in multivariate settings, and is easy to implement in standard statistical packages. Moreover, the proposed method works very well with an intermittent pattern of missingness when other DR methods can not be used. Simulation experiments show that the proposed approach achieves improved performance when one of the models is correct. The method is applied to data from the fireworks disaster study, a randomized clinical trial comparing therapies in disaster‐exposed children. We conclude that the new method increases the robustness of imputations. 相似文献

20.

On structural equation modeling with data that are not missing completely at random 总被引：1，自引：0，他引：1

Bengt Muthén David Kaplan Michael Hollis 《Psychometrika》1987,52(3):431-462

A general latent variable model is given which includes the specification of a missing data mechanism. This framework allows for an elucidating discussion of existing general multivariate theory bearing on maximum likelihood estimation with missing data. Here, missing completely at random is not a prerequisite for unbiased estimation in large samples, as when using the traditional listwise or pairwise present data approaches. The theory is connected with old and new results in the area of selection and factorial invariance. It is pointed out that in many applications, maximum likelihood estimation with missing data may be carried out by existing structural equation modeling software, such as LISREL and LISCOMP. Several sets of artifical data are generated within the general model framework. The proposed estimator is compared to the two traditional ones and found superior.The research of the first author was supported by grant No. SES-8312583 from the National Science Foundation and by a Spencer Foundation grant. We wish to thank Chuen-Rong Chan for drawing the path diagram. 相似文献