期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

LGM模型中缺失数据处理方法的比较:ML方法与Diggle-Kenward选择模型

张杉杉陈楠刘红云《心理学报》2017,49(5)

追踪研究中缺失数据十分常见。本文通过Monte Carlo模拟研究,考察基于不同前提假设的Diggle-Kenward选择模型和ML方法对增长参数估计精度的差异,并考虑样本量、缺失比例、目标变量分布形态以及不同缺失机制的影响。结果表明:(1)缺失机制对基于MAR的ML方法有较大的影响,在MNAR缺失机制下,基于MAR的ML方法对LGM模型中截距均值和斜率均值的估计不具有稳健性。(2)DiggleKenward选择模型更容易受到目标变量分布偏态程度的影响,样本量与偏态程度存在交互作用,样本量较大时,偏态程度的影响会减弱。而ML方法仅在MNAR机制下轻微受到偏态程度的影响。相似文献

2.

贝叶斯题组随机效应模型的必要性及影响因素

刘玥刘红云《心理学报》2012,44(2):263-275

题组模型可以解决传统IRT模型由于题目间局部独立性假设违背时所导致的参数估计偏差。为探讨题组随机效应模型的适用范围, 采用Monte Carlo模拟研究, 分别使用2-PL贝叶斯题组随机效应模型(BTRM)和2-PL贝叶斯模型(BM)对数据进行拟合, 考虑了题组效应、题组长度、题目数量和局部独立题目比例的影响。结果显示：(1) BTRM不受题组效应和题组长度影响, BM对参数估计的误差随题组效应和题组长度增加而增加。(2) BTRM具有一定的普遍性, 且当题组效应大, 题组长, 题目数量大时使用该模型能减少估计误差, 但是当题目数量较小时, 两个模型得到的能力估计误差都较大。(3)当局部独立题目的比例较大时, 两种模型得到的参数估计差异不大。相似文献

3.

多阶段混合增长模型的方法及研究现状

王婧唐文清张敏强张文怡郭凯茵《心理科学进展》2017,(10):1696-1704

多阶段混合增长模型(PGMM)可对发展过程中的阶段性及群体异质性特征进行分析,在能力发展、行为发展及干预、临床心理等研究领域应用广泛。PGMM可在结构方程模型和随机系数模型框架下定义,通常使用基于EM算法的极大似然估计和基于马尔科夫链蒙特卡洛模拟的贝叶斯推断两种方法进行参数估计。样本量、测量时间点数、潜在类别距离等因素对模型及参数估计有显著影响。未来应加强PGMM与其它增长模型的比较研究;在相同或不同的模型框架下研究数据特征、类别属性等对参数估计方法的影响。相似文献

4.

多阶段混合增长模型的影响因素：距离与形态 总被引：1，自引：0，他引：1

刘源骆方刘红云《心理学报》2014,46(9):1400-1412

通过模拟研究, 考察潜类别距离和发展形态等因素对多阶段混合增长模型的模型选择和参数估计的影响：(1)潜类别距离越大, 模型选择和分类效果越好。(2)混合模型的选择, 应以一定样本量(至少200)为前提, 首先考虑BIC选出正确的分类模型, 再通过熵值、ARI等选择分类确定性较高的模型。(3)多阶段的发展形态对正确模型的选择和分类的确定性均有一定程度影响。(4)潜类别距离和样本量越大, 参数估计精度越高。(5)在判断分类准确性的指标中, ARI的选择更偏向于真实的模型。相似文献

5.

当结构假设和分布假设不满足时的验证性因子分析:稳健极大似然估计和贝叶斯估计的比较研究（英文）

《心理科学》2016,(5)

结构方程模型已被广泛应用于心理学、教育学、以及社会科学领域的统计分析中。结构方程模型分析中最常用的估计方法是基于正态分布的估计量,比如极大似然估计法。这些方法需要满足两个假设。第一,理论模型必须正确地反映变量与变量之间的关系,称为结构假设。第二,数据必须符合多元正态分布,称为分布假设。如果这些假设不满足,基于正态分布的估计量就有可能导致不正确的卡方指数、不正确的拟合度、以及有偏差的参数估计和参数估计的标准误。在实际应用中,几乎所有的理论模型都不能准确地解释变量与变量之间的关系,数据也常常呈非多元正态分布。为此,一些新的估计方法得以发展。这些方法要么在理论上不要求数据呈多元正态分布,要么对因数据呈非正态分布而导致的不正确结果进行纠正。当前较为流行的两种方法是稳健极大似然估计和贝叶斯估计。稳健极大似然估计是应用Satorra and Bentler(1994)的方法对不正确的卡方指数和参数估计的标准误进行调整,而参数估计和用极大似然方法得出的完全等同。贝叶斯估计方法则是基于贝叶斯定理,其要点是:参数的后验分布是由参数的先验分布和数据似然值相乘而得来。后验分布常用马尔科夫蒙特卡洛算法来进行模拟。对于稳健极大似然估计和贝叶斯估计这两种方法之间的优劣比较,先前的研究只局限于理论模型是正确的情境。而本研究则着重于理论模型是错误的情境,同时也考虑到数据呈非正态分布的情境。本研究所采用的模型是验证性因子模型,数据全部由计算机模拟而来。数据的生成取决于三个因素:8类因子结构,3种变量分布,和3组样本量。这三个因素产生72个模拟条件(72=8x3x3)。每个模拟条件下生成2000个数据组,每个数据组都拟合两个模型,一个是正确模型、一个是错误模型。每个模型都用两种估计方法来拟合:稳健极大似然估计法和贝叶斯估计方法。贝叶斯估计方法中所使用的先验分布是无信息先验分布。结果分析主要着重于模型拒绝率、拟合度、参数估计、和参数估计的标准误。研究的结果表明:在样本量充足的情况下,两种方法得出的参数估计非常相似。当数据呈非正态分布时,贝叶斯估计法比稳健极大似然估计法更好地拒绝错误模型。但是,当样本量不足且数据呈正态分布时,贝叶斯估计在拒绝错误模型和参数估计上几乎没有优势,甚至在一些条件下,比稳健极大似然法要差。相似文献

6.

多阶段增长模型的方法比较 总被引：1，自引：0，他引：1

刘源赵骞刘红云《心理学探新》2013,(5):415-422,450

多阶段增长模型（Piecewise Growth Modeling,PGM）可以解决发展趋势中具有转折点的情形,并且相对其他复杂的曲线增长模型,解释更简单.已有的统计方法主要通过多层线性模型和潜变量增长模型对多阶段模型进行估计.通过模拟研究,用HLM6.0和Mplus6.0对上述两种模型分别进行估计,结果发现在参数估计的精度上,两种估计方法没有差异,只是在犯一类错误的概率上后者略小.进一步通过对错误模型的探讨发现,在样本量小（n=50）,斜率变化小（△b=0.2）时,用线性模型拟合数据而非PGM所犯错误概率较小,整体拟合更佳.但随着样本的增加和斜率变化的增加,错误模型的犯错概率明显增大.故在实际应用中,为了能更好拟合数据,研究者应根据数据本身的情况选择恰当的模型. 相似文献

7.

2PL模型的两种马尔可夫蒙特卡洛缺失数据处理方法比较 总被引：1，自引：0，他引：1

曾莉辛涛张淑梅《心理学报》2009,41(3):276-282

马尔科夫蒙特卡洛（MCMC）是项目反应理论中处理缺失数据的一种典型方法。文章通过模拟研究比较了在不同被试人数,项目数,缺失比例下两种MCMC方法（M-H within Gibbs和DA-T Gibbs）参数估计的精确性,并结合了实证研究。研究结果表明,两种方法是有差异的,项目参数估计均受被试人数影响很大,受缺失比例影响相对更小。在样本较大缺失比例较小时,M-H within Gibbs参数估计的均方误差（RMSE）相对略小,随着样本数的减少或缺失比例的增加,DA-T Gibbs方法逐渐优于M-H within Gibbs方法相似文献

8.

当结构假设和分布假设不满足时的验证性因子分析：稳健极大似然法估计和贝叶斯估计的比较研究

梁莘娅杨艳云《心理科学》2016,39(5):1256-1267

结构方程模型已被广泛应用于心理学、教育学、以及社会科学领域的统计分析中。结构方程模型分析中最常用的估计方法是基于正态分布的估计量,比如极大似然估计法。这些方法需要满足两个假设。第一, 理论模型必须正确地反映变量与变量之间的关系,称为结构假设。第二,数据必须符合多元正态分布,称为分布假设。如果这些假设不满足,基于正态分布的估计量就有可能导致不正确的卡方指数、不正确的拟合度、以及有偏差的参数估计和参数估计的标准误。在实际应用中,几乎所有的理论模型都不能准确地解释变量与变量之间的关系, 数据也常常呈非多元正态分布。为此,一些新的估计方法得以发展。这些方法要么在理论上不要求数据呈多元正态分布,要么对因数据呈非正态分布而导致的不正确结果进行纠正。当前较为流行的两种方法是稳健极大似然估计和贝叶斯估计。稳健极大似然估计是应用 Satorra and Bentler (1994) 的方法对不正确的卡方指数和参数估计的标准误进行调整,而参数估计和用极大似然方法得出的完全等同。贝叶斯估计方法则是基于贝叶斯定理,其要点是：参数的后验分布是由参数的先验分布和数据似然值相乘而得来。后验分布常用马尔科夫蒙特卡洛算法来进行模拟。对于稳健极大似然估计和贝叶斯估计这两种方法之间的优劣比较,先前的研究只局限于理论模型是正确的情境。而本研究则着重于理论模型是错误的情境,同时也考虑到数据呈非正态分布的情境。本研究所采用的模型是验证性因子模型,数据全部由计算机模拟而来。数据的生成取决于三个因素：8 类因子结构,3 种变量分布,和3 组样本量。这三个因素产生72 个模拟条件（72=8x3x3）。每个模拟条件下生成2000 个数据组,每个数据组都拟合两个模型,一个是正确模型、一个是错误模型。每个模型都用两种估计方法来拟合：稳健极大似然估计法和贝叶斯估计方法。贝叶斯估计方法中所使用的先验分布是无信息先验分布。结果分析主要着重于模型拒绝率、拟合度、参数估计、和参数估计的标准误。研究的结果表明：在样本量充足的情况下,两种方法得出的参数估计非常相似。当数据呈非正态分布时,贝叶斯估计法比稳健极大似然估计法更好地拒绝错误模型。但是,当样本量不足且数据呈正态分布时,贝叶斯估计在拒绝错误模型和参数估计上几乎没有优势,甚至在一些条件下,比稳健极大似然法要差。相似文献

9.

多题多做测验模型及其应用

丁树良罗芬戴海琦朱玮《心理学报》2007,39(4):730-736

在IRT框架下,建立了0-1评分方式下单维双参数Logistic多题多做（MAMI）测验模型。与Spray给出的一题多做（MASI）模型相比,MAMI不仅模型更加精致,而且扩展了适用范围,参数估计方法也不同,采用EM算法求取项目参数。Monte Carlo模拟结果显示,应用MAMI测验模型与测验题量作相应增加的作法相比,两者给出的能力估计精度相同,但MAMI模型给出的项目参数估计精度更高。如果将MAMI测验模型与被试人数相应增加的作法相比,项目参数的估计精度相同,但MAMI给出的能力参数估计精度更高。这个发现表明,在一定条件下若允许修改答案,并采用累加式记分方式,纵使题量不变,也可使能力估计的精度相当于题量增加一倍的估计精度,而项目参数估计精度也会提高。这些发现不仅对技能评价和认知能力评价有参考价值,而且对数据的处理方式也有参考价值相似文献

10.

双因子模型MCAT中多级评分项目选题策略的比较

毛秀珍刘欢唐倩《心理科学》2019,(1):187-193

双因子模型假设测验考察一个一般因子和多个组因子,符合很多教育和心理测验的因素结构。“维度缩减”方法将参数估计中多维积分计算化简为多个迭代二维积分,是双因子模型的重要特征。本文针对考察多级评分项目的计算机化自适应测验,首先推导双因子等级反应模型下Fisher信息量的计算,然后推导“维度缩减”方法在项目选择方法中的应用,最后在低、中、高双因子模式题库中比较D-优化方法、后验加权Fisher信息D优化方法(PDO)、后验加权Kullback-Leibler方法(PKL)、连续熵(CEM)和互信息(MI)方法在能力估计的相关、均方根误差、绝对值偏差和欧氏距离的表现。模拟研究表明：(1)双因子模式越强,即一般因子和组因子在项目上的区分度的差异越小,一般因子估计精度降低,组因子估计精度增加,整体能力的估计精度提高;(2)相同实验条件下,连续熵方法的测量精度最高,PKL方法的能力估计精度最低,其它方法的测量精度没有显著差异。相似文献

11.

Missing not at random models for latent growth curve analyses

Enders CK 《心理学方法》2011,16(1):1-16

The past decade has seen a noticeable shift in missing data handling techniques that assume a missing at random (MAR) mechanism, where the propensity for missing data on an outcome is related to other analysis variables. Although MAR is often reasonable, there are situations where this assumption is unlikely to hold, leading to biased parameter estimates. One such example is a longitudinal study of substance use where participants with the highest frequency of use also have the highest likelihood of attrition, even after controlling for other correlates of missingness. There is a large body of literature on missing not at random (MNAR) analysis models for longitudinal data, particularly in the field of biostatistics. Because these methods allow for a relationship between the outcome variable and the propensity for missing data, they require a weaker assumption about the missing data mechanism. This article describes 2 classic MNAR modeling approaches for longitudinal data: the selection model and the pattern mixture model. To date, these models have been slow to migrate to the social sciences, in part because they required complicated custom computer programs. These models are now quite easy to estimate in popular structural equation modeling programs, particularly Mplus. The purpose of this article is to describe these MNAR modeling frameworks and to illustrate their application on a real data set. Despite their potential advantages, MNAR-based analyses are not without problems and also rely on untestable assumptions. This article offers practical advice for implementing and choosing among different longitudinal models. 相似文献

12.

Estimating Incremental Validity Under Missing Data

Dustin A. Fife Jorge L. Mendoza Christopher M. Berry 《Multivariate behavioral research》2017,52(2):164-177

A common form of missing data is caused by selection on an observed variable (e.g., Z). If the selection variable was measured and is available, the data are regarded as missing at random (MAR). Selection biases correlation, reliability, and effect size estimates when these estimates are computed on listwise deleted (LD) data sets. On the other hand, maximum likelihood (ML) estimates are generally unbiased and outperform LD in most situations, at least when the data are MAR. The exception is when we estimate the partial correlation. In this situation, LD estimates are unbiased when the cause of missingness is partialled out. In other words, there is no advantage of ML estimates over LD estimates in this situation. We demonstrate that under a MAR condition, even ML estimates may become biased, depending on how partial correlations are computed. Finally, we conclude with recommendations about how future researchers might estimate partial correlations even when the cause of missingness is unknown and, perhaps, unknowable. 相似文献

13.

认知诊断缺失数据处理方法的比较：零替换、多重插补与极大似然估计法

宋枝璘郭磊郑天鹏《心理学报》2022,54(4):426-440

数据缺失在测验中经常发生, 认知诊断评估也不例外, 数据缺失会导致诊断结果的偏差。首先, 通过模拟研究在多种实验条件下比较了常用的缺失数据处理方法。结果表明：(1)缺失数据导致估计精确性下降, 随着人数与题目数量减少、缺失率增大、题目质量降低, 所有方法的PCCR均下降, Bias绝对值和RMSE均上升。(2)估计题目参数时, EM法表现最好, 其次是MI, FIML和ZR法表现不稳定。(3)估计被试知识状态时, EM和FIML表现最好, MI和ZR表现不稳定。其次, 在PISA2015实证数据中进一步探索了不同方法的表现。综合模拟和实证研究结果, 推荐选用EM或FIML法进行缺失数据处理。相似文献

14.

Bias in longitudinal data analysis with missing data using typical linear mixed‐effects modelling and pattern‐mixture approach: An analytical illustration

下载免费PDF全文

Manshu Yang Lijuan Wang Scott E. Maxwell 《The British journal of mathematical and statistical psychology》2015,68(2):246-267

We analytically derive the fixed‐effects estimates in unconditional linear growth curve models by typical linear mixed‐effects modelling (TLME) and by a pattern‐mixture (PM) approach with random‐slope‐dependent two‐missing‐pattern missing not at random (MNAR) longitudinal data. Results showed that when the missingness mechanism is random‐slope‐dependent MNAR, TLME estimates of both the mean intercept and mean slope are biased because of incorrect weights used in the estimation. More specifically, the estimate of the mean slope is biased towards the mean slope for completers, whereas the estimate of the mean intercept is biased towards the opposite direction as compared to the estimate of the mean slope. We also discuss why the PM approach can provide unbiased fixed‐effects estimates for random‐coefficients‐dependent MNAR data but does not work well for missing at random or outcome‐dependent MNAR data. A small simulation study was conducted to illustrate the results and to compare results from TLME and PM. Results from an empirical data analysis showed that the conceptual finding can be generalized to other real conditions even when some assumptions for the analytical derivation cannot be met. Implications from the analytical and empirical results were discussed and sensitivity analysis was suggested for longitudinal data analysis with missing data. 相似文献

15.

SEM with Missing Data and Unknown Population Distributions Using Two-Stage ML: Theory and Its Application

Ke-Hai Yuan Laura Lu 《Multivariate behavioral research》2013,48(4):621-652

This article provides the theory and application of the 2-stage maximum likelihood (ML) procedure for structural equation modeling (SEM) with missing data. The validity of this procedure does not require the assumption of a normally distributed population. When the population is normally distributed and all missing data are missing at random (MAR), the direct ML procedure is nearly optimal for SEM with missing data. When missing data mechanisms are unknown, including auxiliary variables in the analysis will make the missing data mechanism more likely to be MAR. It is much easier to include auxiliary variables in the 2-stage ML than in the direct ML. Based on most recent developments for missing data with an unknown population distribution, the article first provides the least technical material on why the normal distribution-based ML generates consistent parameter estimates when the missing data mechanism is MAR. The article also provides sufficient conditions for the 2-stage ML to be a valid statistical procedure in the general case. For the application of the 2-stage ML, an SAS IML program is given to perform the first-stage analysis and EQS codes are provided to perform the second-stage analysis. An example with open- and closed-book examination data is used to illustrate the application of the provided programs. One aim is for quantitative graduate students/applied psychometricians to understand the technical details for missing data analysis. Another aim is for applied researchers to use the method properly. 相似文献

16.

Examining Nonnormal Latent Variable Distributions for Non-Ignorable Missing Data

Chen-Wei Liu 《应用心理检测》2021,45(3):159

Missing not at random (MNAR) modeling for non-ignorable missing responses usually assumes that the latent variable distribution is a bivariate normal distribution. Such an assumption is rarely verified and often employed as a standard in practice. Recent studies for “complete” item responses (i.e., no missing data) have shown that ignoring the nonnormal distribution of a unidimensional latent variable, especially skewed or bimodal, can yield biased estimates and misleading conclusion. However, dealing with the bivariate nonnormal latent variable distribution with present MNAR data has not been looked into. This article proposes to extend unidimensional empirical histogram and Davidian curve methods to simultaneously deal with nonnormal latent variable distribution and MNAR data. A simulation study is carried out to demonstrate the consequence of ignoring bivariate nonnormal distribution on parameter estimates, followed by an empirical analysis of “don’t know” item responses. The results presented in this article show that examining the assumption of bivariate nonnormal latent variable distribution should be considered as a routine for MNAR data to minimize the impact of nonnormality on parameter estimates. 相似文献

17.

2PLM下缺失数据处理方法及其比较

汪文义宋丽红罗芬丁树良《心理科学》2016,39(6):1500-1507

项目反应理论(IRT)是用于客观测量的现代教育与心理测量理论之一,广泛用于缺失数据十分常见的大尺度测验分析。IRT中两参数逻辑斯蒂克模型(2PLM)下仅有完全随机缺失机制下缺失反应和缺失能力处理的EM算法。本研究推导2PLM下缺失反应忽略的EM 算法,并提出随机缺失机制下缺失反应和缺失能力处理的EM算法和考虑能力估计和作答反应不确定性的多重借补法。研究显示：在各种缺失机制、缺失比例和测验设计下,缺失反应忽略的EM算法和多重借补法表现理想。相似文献

18.

Non‐ignorable missingness item response theory models for choice effects in examinee‐selected items

下载免费PDF全文

Chen‐Wei Liu Wen‐Chung Wang 《The British journal of mathematical and statistical psychology》2017,70(3):499-524

Examinee‐selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non‐ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two‐dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non‐ignorable and to determine how to apply the new model to the data collected. Two follow‐up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non‐ignorable missing data were mistakenly treated as ignorable. 相似文献

19.

Methods for Mediation Analysis with Missing Data

Zhiyong Zhang Lijuan Wang 《Psychometrika》2013,78(1):154-184

Despite wide applications of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis including listwise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum likelihood (TS-ML) method. An R package bmem is developed to implement the four methods for mediation analysis with missing data in the structural equation modeling framework, and two real examples are used to illustrate the application of the four methods. The four methods are evaluated and compared under MCAR, MAR, and MNAR missing data mechanisms through simulation studies. Both MI and TS-ML perform well for MCAR and MAR data regardless of the inclusion of auxiliary variables and for AV-MNAR data with auxiliary variables. Although listwise deletion and pairwise deletion have low power and large parameter estimation bias in many studied conditions, they may provide useful information for exploring missing mechanisms. 相似文献