1.

Multiple Imputation of Item Scores in Test and Questionnaire Data, and Influence on Psychometric Results

Joost R. van Ginkel L. Andries van der Ark Klaas Sijtsma 《Multivariate behavioral research》2013,48(2):387-414

The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmarks, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at random, or not missing at random. Cronbach's alpha, Loevinger's scalability coefficient H, and the item cluster solution from Mokken scale analysis of the complete data were compared with the corresponding results based on the data including imputed scores. The multiple-imputation methods, two-way with normally distributed errors, corrected item-mean substitution with normally distributed errors, and response function, produced discrepancies in Cronbach's coefficient alpha, Loevinger's coefficient H, and the cluster solution from Mokken scale analysis that were smaller than the discrepancies for the upper benchmark, multivariate normal imputation.
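
As an illustration of two quantities named in this abstract, the following is a minimal Python sketch of Cronbach's alpha and of one draw of two-way imputation with normally distributed errors. It assumes the commonly cited form of the two-way method (person mean plus item mean minus overall mean, plus a normal error whose variance is estimated from the observed residuals, rounded to the nearest admissible score). The function names, the `None` coding of missing scores, and the residual-variance denominator are assumptions made for this sketch, not details taken from the article.

```python
import random
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a complete persons-by-items score matrix."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def two_way_impute(scores, lo, hi, rng):
    """One imputed copy of `scores`; missing entries are coded as None.

    Assumes every person and every item has at least one observed score.
    """
    n_items = len(scores[0])
    observed = [(i, j, x) for i, row in enumerate(scores)
                for j, x in enumerate(row) if x is not None]
    overall = mean(x for _, _, x in observed)
    person_mean = [mean(x for x in row if x is not None) for row in scores]
    item_mean = [mean(row[j] for row in scores if row[j] is not None)
                 for j in range(n_items)]

    def tw(i, j):
        # Two-way prediction: person effect + item effect - overall mean.
        return person_mean[i] + item_mean[j] - overall

    # Error standard deviation estimated from residuals of observed scores
    # (the n - 1 denominator is an assumption of this sketch).
    resid = [x - tw(i, j) for i, j, x in observed]
    sd = (sum(r * r for r in resid) / (len(resid) - 1)) ** 0.5

    completed = []
    for i, row in enumerate(scores):
        new_row = []
        for j, x in enumerate(row):
            if x is None:
                draw = tw(i, j) + rng.gauss(0.0, sd)
                x = min(hi, max(lo, round(draw)))  # keep within score range
            new_row.append(x)
        completed.append(new_row)
    return completed


# Hypothetical 4 x 3 data set on a 1-5 rating scale.
data = [[1, 2, None], [3, None, 4], [2, 2, 3], [None, 4, 5]]
rng = random.Random(0)
alphas = [cronbach_alpha(two_way_impute(data, 1, 5, rng)) for _ in range(5)]
print(mean(alphas))  # alpha averaged over five imputed data sets
```

Repeating the draw several times and averaging the statistic, as in the last lines, is what turns the single-imputation step into a multiple-imputation procedure.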

2.

Simple imputation methods versus direct likelihood analysis for missing item scores in multilevel educational data

Kadengye DT Cools W Ceulemans E Van den Noortgate W 《Behavior research methods》2012,44(2):516-531

Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when applied with traditional IRT models. Additionally, a nonimputation direct likelihood analysis has been shown to be an effective tool for handling missing observations in clustered data settings. This study investigates the performance of six simple imputation methods, which have been found to be useful in other IRT contexts, versus a direct likelihood analysis, in multilevel data from educational settings. Multilevel item response data were simulated on the basis of two empirical data sets, and some of the item scores were deleted, such that they were missing either completely at random or simply at random. An explanatory IRT model was used for modeling the complete, incomplete, and imputed data sets. We showed that direct likelihood analysis of the incomplete data sets produced unbiased parameter estimates that were comparable to those from a complete data analysis. Multiple-imputation approaches of the two-way mean and corrected item mean substitution methods displayed varying degrees of effectiveness in imputing data that in turn could produce unbiased parameter estimates. The simple random imputation, adjusted random imputation, item means substitution, and regression imputation methods seemed to be less effective in imputing missing item scores in multilevel data settings.

3.

Missing data imputation and corrected statistics for large-scale behavioral databases

Courrieu P Rey A 《Behavior research methods》2011,43(2):310-330

This article presents a new methodology for solving problems resulting from missing data in large-scale item performance behavioral databases. Useful statistics corrected for missing data are described, and a new method of imputation for missing data is proposed. This methodology is applied to the Dutch Lexicon Project database recently published by Keuleers, Diependaele, and Brysbaert (Frontiers in Psychology, 1, 174, 2010), which allows us to conclude that this database fulfills the conditions of use of the method recently proposed by Courrieu, Brand-D’Abrescia, Peereman, Spieler, and Rey (2011) for testing item performance models. Two application programs in MATLAB code are provided for the imputation of missing data in databases and for the computation of corrected statistics to test models.

4.

Dual imputation model for incomplete longitudinal data

Shahab Jolani Laurence E. Frank Stef van Buuren 《The British journal of mathematical and statistical psychology》2014,67(2):197-212

Missing values are a practical issue in the analysis of longitudinal data. Multiple imputation (MI) is a well‐known likelihood‐based method that has optimal properties in terms of efficiency and consistency if the imputation model is correctly specified. Doubly robust (DR) weighting‐based methods protect against misspecification bias if one of the models, but not necessarily both, for the data or the mechanism leading to missing data is correct. We propose a new imputation method that captures the simplicity of MI and protection from the DR method. This method integrates MI and DR to protect against misspecification of the imputation model under a missing at random assumption. Our method avoids analytical complications of missing data particularly in multivariate settings, and is easy to implement in standard statistical packages. Moreover, the proposed method works very well with an intermittent pattern of missingness when other DR methods cannot be used. Simulation experiments show that the proposed approach achieves improved performance when one of the models is correct. The method is applied to data from the fireworks disaster study, a randomized clinical trial comparing therapies in disaster‐exposed children. We conclude that the new method increases the robustness of imputations.

5.

Multilevel multidimensional item response model with a multilevel latent covariate

Sun‐Joo Cho Brian Bottge 《The British journal of mathematical and statistical psychology》2015,68(3):410-433

In a pre‐test–post‐test cluster randomized trial, one of the methods commonly used to detect an intervention effect involves controlling pre‐test scores and other related covariates while estimating an intervention effect at post‐test. In many applications in education, the total post‐test and pre‐test scores, ignoring measurement error, are used as response variable and covariate, respectively, to estimate the intervention effect. However, these test scores are frequently subject to measurement error, and statistical inferences based on the model ignoring measurement error can yield a biased estimate of the intervention effect. When multiple domains exist in test data, it is sometimes more informative to detect the intervention effect for each domain than for the entire test. This paper presents applications of the multilevel multidimensional item response model with measurement error adjustments in a response variable and a covariate to estimate the intervention effect for each domain.

6.

Methods for handling missing data under the 2PLM and their comparison

Wang Wenyi, Song Lihong, Luo Fen, Ding Shuliang 《Journal of Psychological Science》2016,39(6):1500-1507

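Item response theory (IRT) is one of the modern educational and psychological measurement theories used for objective measurement, and it is widely applied in the analysis of large-scale tests, where missing data are very common. For the two-parameter logistic model (2PLM) in IRT, the only existing EM algorithm handles missing responses and missing abilities under the missing completely at random mechanism. This study derives an EM algorithm for the 2PLM that ignores missing responses, and proposes an EM algorithm for handling missing responses and missing abilities under the missing at random mechanism, as well as a multiple imputation method that accounts for uncertainty in both the ability estimates and the item responses. The results show that the EM algorithm ignoring missing responses and the multiple imputation method perform well across the missingness mechanisms, missing-data proportions, and test designs studied.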
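
To make the idea of "ignoring missing responses" under the 2PLM concrete, here is a minimal Python sketch. It shows only the standard 2PL response function and a person log-likelihood that sums over observed items; it is not the EM algorithm derived in the article. The function names, the `None` coding, and the item parameter values are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response: 1 / (1 + exp(-a(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of one response pattern; None marks a missing response.

    Missing responses are skipped, so they contribute nothing to the sum.
    """
    total = 0.0
    for x, (a, b) in zip(responses, items):
        if x is None:
            continue  # ignored: only observed items enter the likelihood
        p = p_correct(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


# Hypothetical (discrimination, difficulty) pairs for three items.
items = [(1.0, -0.5), (1.5, 0.0), (0.8, 1.0)]
print(log_likelihood(0.3, [1, None, 0], items))
```

Skipping the missing entries gives the same value as evaluating the likelihood on the observed items alone, which is the sense in which the missing responses are ignored.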

7.

Tests of Homoscedasticity, Normality, and Missing Completely at Random for Incomplete Multivariate Data

Jamshidian M Jalal S 《Psychometrika》2010,75(4):649-674

Test of homogeneity of covariances (or homoscedasticity) among several groups has many applications in statistical analysis. In the context of incomplete data analysis, tests of homoscedasticity among groups of cases with identical missing data patterns have been proposed to test whether data are missing completely at random (MCAR). These tests of MCAR require large sample sizes n and/or large group sample sizes n_i, and they usually fail when applied to nonnormal data. Hawkins (Technometrics 23:105–110, 1981) proposed a test of multivariate normality and homoscedasticity that is an exact test for complete data when n_i are small. This paper proposes a modification of this test for complete data to improve its performance, and extends its application to test of homoscedasticity and MCAR when data are multivariate normal and incomplete. Moreover, it is shown that the statistic used in the Hawkins test in conjunction with a nonparametric k-sample test can be used to obtain a nonparametric test of homoscedasticity that works well for both normal and nonnormal data. It is explained how a combination of the proposed normal-theory Hawkins test and the nonparametric test can be employed to test for homoscedasticity, MCAR, and multivariate normality. Simulation studies show that the newly proposed tests generally outperform their existing competitors in terms of Type I error rejection rates. Also, a power study of the proposed tests indicates good power. The proposed methods use appropriate missing data imputations to impute missing data. Methods of multiple imputation are described and one of the methods is employed to confirm the result of our single imputation methods. Examples are provided where multiple imputation enables one to identify a group or groups whose covariance matrices differ from the majority of other groups.

8.

Evaluation of Multi-parameter Test Statistics for Multiple Imputation

Yu Liu Craig K. Enders 《Multivariate behavioral research》2017,52(3):371-390

In ordinary least squares regression, researchers often are interested in knowing whether a set of parameters is different from zero. With complete data, this could be achieved using the gain in prediction test, hierarchical multiple regression, or an omnibus F test. However, in substantive research scenarios, missing data often exist. In the context of multiple imputation, one of the current state-of-the-art missing data strategies, there are several different analogous multi-parameter tests of the joint significance of a set of parameters, and these multi-parameter test statistics can be referenced to various distributions to make statistical inferences. However, little is known about the performance of these tests, and virtually no research study has compared the Type 1 error rates and statistical power of these tests in scenarios that are typical of behavioral science data (e.g., small to moderate samples, etc.). This paper uses Monte Carlo simulation techniques to examine the performance of these multi-parameter test statistics for multiple imputation under a variety of realistic conditions. We provide a number of practical recommendations for substantive researchers based on the simulation results, and illustrate the calculation of these test statistics with an empirical example.

9.

A simple and effective new method for Q-matrix refinement

Li Jia, Mao Xiuzhen, Wei Jia 《Acta Psychologica Sinica》2022,54(8):996-1008

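The correctness of the Q-matrix is an important factor affecting the accuracy of item parameter estimation and examinee classification. To address Q-matrix refinement, a simple and effective new method (ORDP) is first proposed. A simulation study then compares ORDP with existing methods (R, RMSEA, and HD) while varying the distribution of examinees' knowledge states, sample size (N), test length (L), Q-matrix error rate (M), item quality (Iq), and attribute hierarchy structure. The results show that: (1) when knowledge states follow a uniform distribution, ORDP performs best under all hierarchy structures; when knowledge states follow a multivariate normal distribution, RMSEA and ORDP do not differ markedly, and RMSEA is slightly better than ORDP except under the independent structure; (2) all methods refine the Q-matrix less well under the multivariate normal distribution than under the uniform distribution; (3) N, L, M, Iq, and attribute hierarchy structure all clearly affect the performance of the four methods; (4) in the refinement of Tatsuoka's (1984) fraction subtraction data, the Q-matrix refined with ORDP fits the data best.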

10.

A Proposed Number Correct Scoring Procedure Based on Classical True-Score Theory and Multidimensional Item Response Theory

《International Journal of Testing》2013,13(2):131-141

A hybrid procedure for number correct scoring is proposed. The proposed scoring procedure is based on both classical true-score theory (CTT) and multidimensional item response theory (MIRT). Specifically, the hybrid scoring procedure uses test item weights based on MIRT and the total test scores are computed based on CTT. Thus, what makes the hybrid scoring method attractive is that this method accounts for the dimensionality of the test items while test scores remain easy to compute. Further, the hybrid scoring does not require large sample sizes once the item parameters are known. Monte Carlo techniques were used to compare and contrast the proposed hybrid scoring method with three other scoring procedures. Results indicated that all scoring methods in this study generated estimated and true scores that were highly correlated. However, the hybrid scoring procedure had significantly smaller error variances between the estimated and true scores relative to the other procedures.

11.

Multiple Imputation for Multivariate Missing-Data Problems: A Data Analyst's Perspective

《Multivariate behavioral research》2013,48(4):545-571

Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m > 1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997a) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from the Adolescent Alcohol Prevention Trial (Hansen & Graham, 1991).
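
The "simple rules" for combining results across the m completed data sets are usually Rubin's (1987) rules for a scalar estimate. The following is a minimal Python sketch of them. The function name and the example numbers are hypothetical, and the degrees of freedom use the classic large-sample formula rather than any small-sample adjustment.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    if between > 0:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    else:
        df = float("inf")          # no between-imputation variability
    return q_bar, total, df


# Hypothetical estimates and squared standard errors from m = 3 imputations.
q_bar, total, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, total ** 0.5, df)  # estimate, pooled standard error, df
```

The pooled standard error is the square root of the total variance, and a t reference distribution with `df` degrees of freedom gives the p-value.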

12.

Information-matrix-based multiple-group DIF detection in cognitive diagnostic tests

Sun Xiaojian, Liu Yanlou, Wang Shimeng, Xin Tao, Song Naiqing, Zhou Man 《Journal of Psychological Science》2022,45(3):710-717

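Based on an improved Wald statistic, a DIF detection method designed for two groups is extended to differential item functioning (DIF) testing with multiple groups; the improved Wald statistic is obtained by computing the observed information matrix (Obs) and the empirical cross-product information matrix (XPD), respectively. A simulation study examined DIF detection with these two approaches and with the traditional computation under multiple groups. The results show that: (1) the Type I error rates of Obs and XPD are clearly lower than those of the traditional method, and under DINA model estimation they are close to the nominal level; (2) when the sample size and the DIF magnitude are large, Obs and XPD have roughly the same statistical power as the traditional Wald statistic.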

13.

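A comparison of two Markov chain Monte Carlo methods for handling missing data in the 2PL model (total citations: 1; self-citations: 0; citations by others: 1)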

Zeng Li, Xin Tao, Zhang Shumei 《Acta Psychologica Sinica》2009,41(3):276-282

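Markov chain Monte Carlo (MCMC) is a typical approach to handling missing data in item response theory. Through a simulation study, this article compares the accuracy of parameter estimation of two MCMC methods (M-H within Gibbs and DA-T Gibbs) under different numbers of examinees, numbers of items, and proportions of missing data, and complements the simulation with an empirical study. The results show that the two methods differ; for both, item parameter estimation is strongly affected by the number of examinees and relatively less by the proportion of missing data. When the sample is large and the missing proportion is small, the root mean square error (RMSE) of the M-H within Gibbs estimates is slightly smaller; as the sample size decreases or the missing proportion increases, DA-T Gibbs gradually outperforms M-H within Gibbs.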

14.

Limited‐information goodness‐of‐fit testing of hierarchical item factor models

Li Cai Mark Hansen 《The British journal of mathematical and statistical psychology》2013,66(2):245-276

In applications of item response theory, assessment of model fit is a critical issue. Recently, limited‐information goodness‐of‐fit testing has received increased attention in the psychometrics literature. In contrast to full‐information test statistics such as Pearson’s X² or the likelihood ratio G², these limited‐information tests utilize lower‐order marginal tables rather than the full contingency table. A notable example is Maydeu‐Olivares and colleagues’ M₂ family of statistics based on univariate and bivariate margins. When the contingency table is sparse, tests based on M₂ retain better Type I error rate control than the full‐information tests and can be more powerful. While in principle the M₂ statistic can be extended to test hierarchical multidimensional item factor models (e.g., bifactor and testlet models), the computation is non‐trivial. To obtain M₂, a researcher often has to obtain (many thousands of) marginal probabilities, derivatives, and weights. Each of these must be approximated with high‐dimensional numerical integration. We propose a dimension reduction method that can take advantage of the hierarchical factor structure so that the integrals can be approximated far more efficiently. We also propose a new test statistic that can be substantially better calibrated and more powerful than the original M₂ statistic when the test is long and the items are polytomous. We use simulations to demonstrate the performance of our new methods and illustrate their effectiveness with applications to real data.

15.

Sufficiency and Conditional Estimation of Person Parameters in the Polytomous Rasch Model

David Andrich 《Psychometrika》2010,75(2):292-308

Rasch models are characterised by sufficient statistics for all parameters. In the Rasch unidimensional model for two ordered categories, the parameterisation of the person and item is symmetrical and it is readily established that the total scores of a person and item are sufficient statistics for their respective parameters. In contrast, in the unidimensional polytomous Rasch model for more than two ordered categories, the parameterisation is not symmetrical. Specifically, each item has a vector of item parameters, one for each category, and each person only one person parameter. In addition, different items can have different numbers of categories and, therefore, different numbers of parameters. The sufficient statistic for the parameters of an item is itself a vector. In estimating the person parameters in presently available software, these sufficient statistics are not used to condition out the item parameters. This paper derives a conditional, pairwise, pseudo-likelihood and constructs estimates of the parameters of any number of persons which are independent of all item parameters and of the maximum scores of all items. It also shows that these estimates are consistent. Although Rasch’s original work began with equating tests using test scores, and not with items of a test, the polytomous Rasch model has not been applied in this way. Operationally, this is because the current approaches, in which item parameters are estimated first, cannot handle test data where there may be many scores with zero frequencies. A small simulation study shows that, when using the estimation equations derived in this paper, such a property of the data is no impediment to the application of the model at the level of tests. This opens up the possibility of using the polytomous Rasch model directly in equating test scores.

16.

Test equating: from IRT to MIRT

Xie Jing, Zhang Houcan 《Psychological Exploration》2009,29(5):67-71

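Equating, as a technical means of ensuring test fairness, has long been an important topic in test theory research. The development of MIRT has shown that items and tests are complex, and traditional unidimensional models can no longer meet the need to explore the relationship between persons and items or tests. Current MIRT equating research follows two main directions: one studies how multidimensional data affect IRT equating; the other studies the MIRT equating process by developing new computational methods and tools. The most important part of MIRT equating research is the study of equating methods and their implementation, where some progress has been made; the key consideration in this work is controlling the factors that contribute to equating error.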

17.

Imputation methods for missing data in educational diagnostic evaluation

Fernández-Alonso R Suárez-Álvarez J Muñiz J 《Psicothema》2012,24(1):167-175

In the diagnostic evaluation of educational systems, self-reports are commonly used to collect data, both cognitive and orectic. For various reasons, in these self-reports, some of the students' data are frequently missing. The main goal of this research is to compare the performance of different imputation methods for missing data in the context of the evaluation of educational systems. On an empirical database of 5,000 subjects, 72 conditions were simulated: three levels of missing data, three types of loss mechanisms, and eight methods of imputation. The levels of missing data were 5%, 10%, and 20%. The loss mechanisms were set at: Missing completely at random, moderately conditioned, and strongly conditioned. The eight imputation methods used were: listwise deletion, replacement by the mean of the scale, by the item mean, the subject mean, the corrected subject mean, multiple regression, and Expectation-Maximization (EM) algorithm, with and without auxiliary variables. The results indicate that the recovery of the data is more accurate when using an appropriate combination of different methods of recovering lost data. When a case is incomplete, the mean of the subject works very well, whereas for completely lost data, multiple imputation with the EM algorithm is recommended. The use of this combination is especially recommended when data loss is greater and its loss mechanism is more conditioned. Lastly, the results are discussed, and some future lines of research are analyzed.
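
Two of the simpler methods listed in this abstract, replacement by the subject mean and by the item mean, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the `None` coding of missing values are assumptions, and the corrected subject mean, regression, and EM variants from the study are not reproduced.

```python
from statistics import mean


def subject_mean_impute(scores):
    """Fill each missing score with that subject's mean over observed items.

    Rows with no observed score at all are returned unchanged, since a
    subject mean cannot be computed for them.
    """
    completed = []
    for row in scores:
        observed = [x for x in row if x is not None]
        if not observed:
            completed.append(list(row))
            continue
        fill = mean(observed)
        completed.append([fill if x is None else x for x in row])
    return completed


def item_mean_impute(scores):
    """Fill each missing score with the mean of that item over all subjects."""
    n_items = len(scores[0])
    item_means = []
    for j in range(n_items):
        column = [row[j] for row in scores if row[j] is not None]
        item_means.append(mean(column) if column else None)
    return [[item_means[j] if x is None else x for j, x in enumerate(row)]
            for row in scores]


# Hypothetical responses; the second subject has no observed data.
data = [[1, None, 3], [None, None, None], [4, 5, None]]
print(subject_mean_impute(data))
print(item_mean_impute(data))
```

The subject mean leaves a fully missing case untouched, which is one reason a different method is needed for completely lost data.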

18.

A comparison of item selection strategies with item exposure control in multidimensional computerized adaptive testing

Mao Xiuzhen, Wang Yating, Yang Rui 《Psychological Exploration》2019,(1):47-56

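This study examines the item selection performance of four item selection indices in multidimensional computerized adaptive testing (MCAT), with and without exposure control. The indices are the Bayesian D-optimality method, the posterior expected Kullback-Leibler method (KLP), the minimized error variance of the linear combination score with equal weight (V1), and the minimized error variance of the composite score with optimized weight (V2). The Restrictive Threshold (RT) and Restrictive Progressive (RPG) methods, developed for item exposure control in cognitive diagnostic CAT, and the Maximum Priority Index (MPI) method from unidimensional CAT are extended to MCAT. The simulation study shows that: (1) KLP, D-optimality, and V1 estimate domain scores accurately and recover ability better than V2; (2) although V1 and V2 improve item pool utilization relative to KLP and D-optimality, all four indices produce uneven item exposure rate distributions; (3) all three exposure control strategies greatly improve the uniformity of item exposure without noticeably reducing measurement precision; (4) MPI and RPG perform similarly in exposure control, and both outperform RT.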

19.

A comparison of multidimensional IRT equating methods under different anchor test designs

Liu Yue, Liu Hongyun 《Acta Psychologica Sinica》2013,45(4):466

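In practice, tests often have a multidimensional structure; if unidimensional IRT methods are still used for equating, the results will be inaccurate. For tests with a multidimensional structure, multidimensional IRT equating methods are therefore needed to transform the parameters. Based on the common-item design, this article uses a simulation study to examine the performance of several multidimensional IRT equating methods under different anchor test designs, considering six factors: test length, the ratio of the numbers of items on the two dimensions, anchor test length, anchor item selection strategy, the correlation between the two dimensions, and the difference in ability level between the equating groups. The multidimensional IRT equating methods compared are the mean/mean (MM) method, the mean/sigma (MS) method, the Stocking-Lord (SL) method, the Haebara (HB) method, and the least squares (LS) method. The results show that: (1) the SL, HB, and LS methods yield the smallest root mean square equating errors and perform stably across conditions; (2) the MM and MS methods show very large root mean square errors under nonequivalent-groups conditions; (3) anchor test design has no significant effect on the equating results of the SL, HB, and LS methods; (4) the SL, HB, and LS methods yield the smallest root mean square equating errors when the correlation between the two dimensions is high, the test and the anchor test are long, and the equating groups do not differ in ability level.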
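
For orientation, the mean/sigma idea can be sketched in its familiar unidimensional form. The Python code below is not the multidimensional procedure compared in the article; it only illustrates how a linear scale transformation is obtained from anchor-item difficulties. The function names and parameter values are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_from, b_to):
    """Slope A and intercept B placing the `b_from` scale onto the `b_to` scale.

    Both lists hold difficulty estimates of the same anchor items,
    calibrated separately on the two forms.
    """
    slope = pstdev(b_to) / pstdev(b_from)
    intercept = mean(b_to) - slope * mean(b_from)
    return slope, intercept


def transform_items(items, slope, intercept):
    """Rescale (a, b) pairs: a* = a / A and b* = A * b + B."""
    return [(a / slope, slope * b + intercept) for a, b in items]


# Hypothetical anchor-item difficulties on the two scales.
A, B = mean_sigma_constants([-1.0, 0.0, 1.0], [-1.0, 1.0, 3.0])
print(A, B)
print(transform_items([(1.2, -1.0), (0.8, 1.0)], A, B))
```

Because the constants depend only on the first two moments of the anchor difficulties, this kind of method is sensitive to estimation error in individual anchor items, whereas characteristic-curve methods such as Stocking-Lord and Haebara use the full response functions.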

20.

Alternative Multiple Imputation Inference for Categorical Structural Equation Modeling

Seungwon Chung Li Cai 《Multivariate behavioral research》2019,54(3):323-337

The use of item responses from questionnaire data is ubiquitous in social science research. One side effect of using such data is that researchers must often account for item level missingness. Multiple imputation is one of the most widely used missing data handling techniques. The traditional multiple imputation approach in structural equation modeling has a number of limitations. Motivated by Lee and Cai’s approach, we propose an alternative method for conducting statistical inference from multiple imputation in categorical structural equation modeling. We examine the performance of our proposed method via a simulation study and illustrate it with one empirical data set.