20 similar articles found (search time: 0 ms)
1.
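Missing data are very common in longitudinal research. Using a Monte Carlo simulation study, this paper examines differences in the accuracy of growth parameter estimates between the Diggle-Kenward selection model and the maximum likelihood (ML) method, which rest on different assumptions, and considers the effects of sample size, proportion of missing data, distributional shape of the target variable, and missing data mechanism. The results show that: (1) the missing data mechanism has a large effect on the MAR-based ML method; under an MNAR mechanism, MAR-based ML estimates of the intercept mean and slope mean in the latent growth model (LGM) are not robust. (2) The Diggle-Kenward selection model is more susceptible to the skewness of the target variable's distribution; sample size interacts with skewness, and the effect of skewness weakens when the sample size is large. The ML method, by contrast, is only slightly affected by skewness, and only under the MNAR mechanism.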
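The MAR/MNAR distinction in the abstract above can be sketched with a small simulation. This is a minimal illustration, not the study's design: it assumes a logistic dropout model of the Diggle-Kenward type, in which the log-odds of dropping out at a wave depend on the previous and the current outcome. The function names, the growth parameters, and the coefficient values are all hypothetical.

```python
import numpy as np

def simulate_growth(n, n_waves, rng):
    """Linear growth data: y[i, t] = intercept_i + slope_i * t + noise (assumed values)."""
    intercept = rng.normal(10.0, 1.0, size=n)
    slope = rng.normal(1.0, 0.5, size=n)
    t = np.arange(n_waves)
    noise = rng.normal(0.0, 1.0, size=(n, n_waves))
    return intercept[:, None] + slope[:, None] * t + noise

def simulate_dropout(y, psi0, psi_prev, psi_curr, rng):
    """Apply monotone dropout to an (n, T) array of repeated measures.

    logit P(drop out at wave t) = psi0 + psi_prev * y[t-1] + psi_curr * y[t]
    psi_curr == 0: dropout depends only on observed data (MAR).
    psi_curr != 0: dropout depends on the value that goes missing (MNAR).
    """
    y = np.asarray(y, dtype=float)
    out = y.copy()
    n, n_waves = y.shape
    active = np.ones(n, dtype=bool)          # subjects still in the study
    for t in range(1, n_waves):              # wave 0 is always observed
        logit = psi0 + psi_prev * y[:, t - 1] + psi_curr * y[:, t]
        p_drop = 1.0 / (1.0 + np.exp(-logit))
        drop = active & (rng.random(n) < p_drop)
        active &= ~drop
        out[~active, t] = np.nan             # missing from dropout onward
    return out

rng = np.random.default_rng(0)
y_complete = simulate_growth(1000, 4, rng)
y_mar = simulate_dropout(y_complete, -3.0, 0.2, 0.0, rng)
y_mnar = simulate_dropout(y_complete, -3.0, 0.0, 0.2, rng)
print("missing rate per wave (MAR): ", np.isnan(y_mar).mean(axis=0))
print("missing rate per wave (MNAR):", np.isnan(y_mnar).mean(axis=0))
```

Under the MNAR setting the probability of dropout is tied to the very value that is lost, which is why an estimator that assumes MAR can be biased there.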
2.
Standard procedures for estimating item parameters in item response theory (IRT) ignore collateral information that may be available about examinees, such as their standing on demographic and educational variables. This paper describes circumstances under which collateral information about examinees may be used to make inferences about item parameters more precise, and circumstances under which it must be used to obtain correct inferences. This work was supported by Contract No. N00014-85-K-0683, project designation NR 150-539, from the Cognitive Science Program, Cognitive and Neural Sciences Division, Office of Naval Research. Reproduction in whole or in part is permitted for any purpose of the United States Government. We are indebted to Tim Davey, Eugene Johnson, and three anonymous referees for their comments on earlier versions of the paper.
3.
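Missing data occur frequently in testing, and cognitive diagnostic assessment is no exception; missing data can bias diagnostic results. First, a simulation study compared commonly used missing data handling methods under a variety of experimental conditions. The results show that: (1) missing data reduce estimation accuracy; as the numbers of examinees and items decrease, the missing rate increases, and item quality declines, the PCCR falls and the absolute bias and RMSE rise for all methods. (2) For item parameter estimation, the EM method performs best, followed by MI, while FIML and ZR perform unstably. (3) For estimating examinees' knowledge states, EM and FIML perform best, while MI and ZR perform unstably. Second, the performance of the methods was further explored using PISA 2015 empirical data. Based on the simulation and empirical results together, EM or FIML is recommended for handling missing data.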
4.
Heining Cham Evgeniya Reshetnyak Barry Rosenfeld William Breitbart 《Multivariate behavioral research》2017,52(1):12-30
Researchers have developed missing data handling techniques for estimating interaction effects in multiple regression. Extending to latent variable interactions, we investigated full information maximum likelihood (FIML) estimation to handle incompletely observed indicators for product indicator (PI) and latent moderated structural equations (LMS) methods. Drawing on the analytic work on missing data handling techniques in multiple regression with interaction effects, we compared the performance of FIML for PI and LMS analytically. We performed a simulation study to compare FIML for PI and LMS. We recommend using FIML for LMS when the indicators are missing completely at random (MCAR) or missing at random (MAR) and when they are normally distributed. FIML for LMS produces unbiased parameter estimates with small variances, correct Type I error rates, and high statistical power of interaction effects. We illustrated the use of these methods by analyzing the interaction effect between advanced cancer patients’ depression and change of inner peace well-being on future hopelessness levels.
5.
6.
Suppose a collection of standard tests is given to all subjects in a random sample, but a different new test is given to each group of subjects in nonoverlapping subsamples. A simple method is developed for displaying the information that the data set contains about the correlational structure of the new tests. This is possible to some extent, even though each subject takes only one new test. The method uses plausible values of the partial correlations among the new tests given the standard tests in order to generate plausible simple correlations among the new tests and plausible multiple correlations between composites of the new tests and the standard tests. The real data example included suggests that the method can be useful in practical problems.
7.
Oliver Lüdtke Alexander Robitzsch Stephen G. West 《Multivariate behavioral research》2020,55(3):361-381
When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a multivariate normal distribution, which is also the default in many statistical software packages. This distribution will in general be misspecified if predictors with missing data have nonlinear effects (e.g., x²) or are included in interaction terms (e.g., x·z). In the present article, we introduce a factored regression modeling approach for estimating regression models with missing data that is based on maximum likelihood estimation. In this approach, the model likelihood is factorized into a part that is due to the model of interest and a part that is due to the model for the incomplete predictors. In three simulation studies, we showed that the factored regression modeling approach produced valid estimates of interaction and nonlinear effects in regression models with missing values on categorical or continuous predictor variables under a broad range of conditions. We developed the R package mdmb, which facilitates a user-friendly application of the factored regression modeling approach, and present a real-data example that illustrates the flexibility of the software.
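The factorization described in the abstract above can be written out as a worked sketch. The notation is assumed for illustration and is not taken from the article: y is the outcome, x a continuous predictor with missing values, z a fully observed predictor, and the model of interest is taken to be a normal linear regression with an x·z interaction.

```latex
\begin{align*}
f(y_i, x_i \mid z_i;\, \theta, \gamma)
  &= \underbrace{f(y_i \mid x_i, z_i;\, \theta)}_{\text{model of interest}}
     \;\underbrace{f(x_i \mid z_i;\, \gamma)}_{\text{model for the incomplete predictor}}, \\
y_i \mid x_i, z_i &\sim N\!\left(\beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i,\; \sigma^2\right), \\
L_i(\theta, \gamma) &=
  \begin{cases}
    f(y_i \mid x_i, z_i;\, \theta)\, f(x_i \mid z_i;\, \gamma)
      & \text{if } x_i \text{ is observed}, \\[4pt]
    \displaystyle\int f(y_i \mid x, z_i;\, \theta)\, f(x \mid z_i;\, \gamma)\, dx
      & \text{if } x_i \text{ is missing}.
  \end{cases}
\end{align*}
```

Because the interaction enters only through the conditional model for y, the joint distribution of (y, x) given z need not be multivariate normal, which is the misspecification the abstract warns about. For a categorical x the integral would be replaced by a sum over its categories.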
8.
Gina L. Mazza Craig K. Enders Linda S. Ruehlman 《Multivariate behavioral research》2013,48(5):504-519
Often when participants have missing scores on one or more of the items comprising a scale, researchers compute prorated scale scores by averaging the available items. Methodologists have cautioned that proration may make strict assumptions about the mean and covariance structures of the items comprising the scale (Schafer & Graham, 2002; Graham, 2009; Enders, 2010). We investigated proration empirically and found that it resulted in bias even under a missing completely at random (MCAR) mechanism. To encourage researchers to forgo proration, we describe a full information maximum likelihood (FIML) approach to item-level missing data handling that mitigates the loss in power due to missing scale scores and utilizes the available item-level data without altering the substantive analysis. Specifically, we propose treating the scale score as missing whenever one or more of the items are missing and incorporating items as auxiliary variables. Our simulations suggest that item-level missing data handling drastically increases power relative to scale-level missing data handling. These results have important practical implications, especially when recruiting more participants is prohibitively difficult or expensive. Finally, we illustrate the proposed method with data from an online chronic pain management program.
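The two scoring rules contrasted in the abstract above can be sketched as follows. This is only an illustration of the data preparation step: the function names are hypothetical, and the FIML estimation with items as auxiliary variables, which would be run in structural equation modeling software, is not shown.

```python
import numpy as np

def prorated_score(items):
    """Proration: mean of the available items in each row.

    Rows with no observed items get NaN.
    """
    items = np.asarray(items, dtype=float)
    n_observed = np.sum(~np.isnan(items), axis=1)
    total = np.nansum(items, axis=1)
    # np.maximum avoids division by zero for rows with no observed items
    return np.where(n_observed > 0, total / np.maximum(n_observed, 1), np.nan)

def strict_score(items):
    """Scale score treated as missing (NaN) whenever any item is missing."""
    items = np.asarray(items, dtype=float)
    return items.mean(axis=1)  # NaN propagates through the mean

responses = np.array([
    [1.0, 2.0, 3.0],           # complete
    [4.0, np.nan, 2.0],        # one item missing
    [np.nan, np.nan, np.nan],  # all items missing
])
print(prorated_score(responses))
print(strict_score(responses))
```

The second rule discards the partial information in the scale score itself; in the approach the abstract describes, that information is recovered by including the individual items as auxiliary variables in the FIML analysis.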
9.
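This study used Monte Carlo simulation to examine the role of auxiliary variables when modeling missing data with full information maximum likelihood (FIML) estimation. Specifically, it examined how the mechanism and rate of joint missingness in the auxiliary and study variables, the strength of their correlation, the number of auxiliary variables, and the sample size affect the accuracy of parameter estimates. The results show that, when the auxiliary and study variables are jointly missing: (1) results are more prone to bias when the auxiliary variable is missing completely at random; (2) under the MAR-MAR combination, including a single auxiliary variable is beneficial, whereas under the MAR-MCAR or MAR-MNAR combination, including more than one auxiliary variable works better; (3) including auxiliary variables that correlate only weakly with the study variables is also beneficial.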
10.
Maria T. Barendse Yves Rosseel 《The British journal of mathematical and statistical psychology》2023,76(2):327-352
Pairwise maximum likelihood (PML) estimation is a promising method for multilevel models with discrete responses. Multilevel models take into account that units within a cluster tend to be more alike than units from different clusters. The pairwise likelihood is then obtained as the product of bivariate likelihoods for all within-cluster pairs of units and items. In this study, we investigate the PML estimation method with computationally intensive multilevel random intercept and random slope structural equation models (SEM) in discrete data. In pursuing this, we first reconsider the general ‘wide format’ (WF) approach for SEM models and then extend the WF approach with random slopes. In a small simulation study we determine the accuracy and efficiency of the PML estimation method by varying the sample size (250, 500, 1000, 2000), response scales (two-point, four-point), and data-generating model (mediation model with three random slopes, factor model with one and two random slopes). Overall, results show that the PML estimation method is capable of estimating computationally intensive random intercept and random slopes multilevel models in the SEM framework with discrete data and many (six or more) latent variables with satisfactory accuracy and efficiency. However, the condition with 250 clusters combined with a two-point response scale shows more bias.
11.
Dorothy T. Thayer 《Psychometrika》1983,48(2):293-297
Consider an old test X consisting of s sections and two new tests Y and Z similar to X consisting of p and q sections respectively. All subjects are given test X plus two variable sections from either test Y or Z. Different pairings of variable sections are given to each subsample of subjects. We present a method of estimating the covariance matrix of the combined test (X_1, ..., X_s, Y_1, ..., Y_p, Z_1, ..., Z_q) and describe an application of these estimation techniques to linear, observed-score, test equating. The author is indebted to Paul W. Holland and Donald B. Rubin for their encouragement and many helpful comments and suggestions that contributed significantly to the development of this paper. This research was supported by the Program Statistics Research Project of the ETS Research Statistics Group.
12.
This article compares a variety of imputation strategies for ordinal missing data on Likert scale variables (number of categories = 2, 3, 5, or 7) in recovering reliability coefficients, mean scale scores, and regression coefficients of predicting one scale score from another. The examined strategies include imputing using normal data models with naïve rounding/without rounding, using latent variable models, and using categorical data models such as discriminant analysis and binary logistic regression (for dichotomous data only), multinomial and proportional odds logistic regression (for polytomous data only). The results suggest that both the normal model approach without rounding and the latent variable model approach perform well for either dichotomous or polytomous data regardless of sample size, missing data proportion, and asymmetry of item distributions. The discriminant analysis approach also performs well for dichotomous data. Naïvely rounding normal imputations or using logistic regression models to impute ordinal data is not recommended, as either can potentially lead to substantial bias in all or some of the parameters.
13.
Guogen Shan Charles Bernick Sarah Banks 《The British journal of mathematical and statistical psychology》2018,71(1):60-74
This research was motivated by a clinical trial design for a cognitive study. The pilot study used a matched-pairs design in which some data are missing, specifically at the end of the study. Existing approaches to determine sample size are all based on asymptotic approaches (e.g., the generalized estimating equation (GEE) approach). When the sample size in a clinical trial is small to medium, these asymptotic approaches may not be appropriate for use due to the unsatisfactory Type I and II error rates. For this reason, we consider the exact unconditional approach to compute the sample size for a matched-pairs study with incomplete data. Recommendations are made for each possible missingness pattern by comparing the exact sample sizes based on three commonly used test statistics, with the existing sample size calculation based on the GEE approach. An example from a real surgeon-reviewers study is used to illustrate the application of the exact sample size calculation in study designs.
14.
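For latent growth models with data missing not at random, this study evaluates two missing data handling methods that rest on different assumptions: the maximum likelihood (ML) method and the Diggle-Kenward selection model. A Monte Carlo simulation study compares the two methods in the accuracy of the growth parameter estimates and of their standard error estimates, and considers the effects of sample size, the proportion of data missing not at random, and the proportion of data missing at random. The results show that the Diggle-Kenward selection model, when its assumptions are met, generally estimates parameters more accurately than the ML method; the ML method underestimates standard errors to some degree, and its confidence interval coverage is also clearly lower than that of the Diggle-Kenward selection model.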
15.
Chen-Wei Liu 《Applied Psychological Measurement》2021,45(3):159
Missing not at random (MNAR) modeling for non-ignorable missing responses usually assumes that the latent variable distribution is a bivariate normal distribution. Such an assumption is rarely verified and often employed as a standard in practice. Recent studies for “complete” item responses (i.e., no missing data) have shown that ignoring the nonnormal distribution of a unidimensional latent variable, especially a skewed or bimodal one, can yield biased estimates and misleading conclusions. However, dealing with a bivariate nonnormal latent variable distribution in the presence of MNAR data has not been looked into. This article proposes to extend unidimensional empirical histogram and Davidian curve methods to simultaneously deal with nonnormal latent variable distribution and MNAR data. A simulation study is carried out to demonstrate the consequence of ignoring bivariate nonnormal distribution on parameter estimates, followed by an empirical analysis of “don’t know” item responses. The results presented in this article show that examining the assumption of bivariate nonnormal latent variable distribution should be considered as a routine for MNAR data to minimize the impact of nonnormality on parameter estimates.
16.
Michael W. Browne 《Psychometrika》1988,53(4):585-589
Algebraic properties of the normal theory maximum likelihood solution in factor analysis regression are investigated. Two commonly employed measures of the within sample predictive accuracy of the factor analysis regression function are considered: the variance of the regression residuals and the squared correlation coefficient between the criterion variable and the regression function. It is shown that this within sample residual variance and within sample squared correlation may be obtained directly from the factor loading and unique variance estimates, without use of the original observations or the sample covariance matrix.
17.
Henk A. L. Kiers 《Psychometrika》1997,62(2):251-266
A general approach for fitting a model to a data matrix by weighted least squares (WLS) is studied. This approach consists of iteratively performing (steps of) existing algorithms for ordinary least squares (OLS) fitting of the same model. The approach is based on minimizing a function that majorizes the WLS loss function. The generality of the approach implies that, for every model for which an OLS fitting algorithm is available, the present approach yields a WLS fitting algorithm. In the special case where the WLS weight matrix is binary, the approach reduces to missing data imputation. This research has been made possible by a fellowship from the Royal Netherlands Academy of Arts and Sciences to the author.
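The binary-weight special case mentioned in the abstract above can be sketched as an iterative imputation loop. This is a minimal illustration under stated assumptions, not the article's algorithm for general weights: the model is taken to be a low-rank approximation, whose OLS fit is a truncated SVD, and the function names are hypothetical. Each pass fills the cells with weight 0 using the current model values and then refits by OLS, so the weighted loss never increases.

```python
import numpy as np

def low_rank_ols(X, rank):
    """OLS rank-`rank` approximation of X via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def wls_binary_fit(X, W, rank, n_iter=200):
    """Fit a low-rank model under binary weights (1 = observed, 0 = missing).

    Returns the fitted matrix and the WLS loss after each iteration.
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    X_obs = np.where(W == 1, X, 0.0)   # neutralize NaN in unobserved cells
    M = np.zeros_like(X_obs)           # starting model values
    losses = []
    for _ in range(n_iter):
        # Imputation step: keep observed cells, fill the rest from the model
        X_filled = np.where(W == 1, X_obs, M)
        # OLS step on the completed matrix
        M = low_rank_ols(X_filled, rank)
        losses.append(float(np.sum(W * (X_obs - M) ** 2)))
    return M, losses

rng = np.random.default_rng(0)
X_true = rng.normal(size=(8, 1)) @ rng.normal(size=(1, 5))  # rank-1 data
W = np.ones_like(X_true)
W[0, 0] = 0
W[3, 2] = 0
X_missing = X_true.copy()
X_missing[W == 0] = np.nan
M_hat, loss_trace = wls_binary_fit(X_missing, W, rank=1)
print("first and last WLS loss:", loss_trace[0], loss_trace[-1])
```

Any other model with an available OLS routine could be substituted for `low_rank_ols`; the loop itself does not depend on the choice of model.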
18.
19.
20.
Existing test statistics for assessing whether incomplete data represent a missing completely at random sample from a single population are based on a normal likelihood rationale and effectively test for homogeneity of means and covariances across missing data patterns. The likelihood approach cannot be implemented adequately if a pattern of missing data contains very few subjects. A generalized least squares rationale is used to develop parallel tests that are expected to be more stable in small samples. Three factors were varied for a simulation: number of variables, percent missing completely at random, and sample size. One thousand data sets were simulated for each condition. The generalized least squares test of homogeneity of means performed close to an ideal Type I error rate for most of the conditions. The generalized least squares test of homogeneity of covariance matrices and a combined test performed quite well also. Preliminary results on this research were presented at the 1999 Western Psychological Association convention, Irvine, CA, and in the UCLA Statistics Preprint No. 265 (http://www.stat.ucla.edu). The assistance of Ke-Hai Yuan and several anonymous reviewers is gratefully acknowledged.