1.

涂冬波 蔡艳 戴海琦 丁树良 《心理学报》2011,43(11):1329-1340

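This study introduces multidimensional item response theory (MIRT), a frontier technique in modern measurement theory, and implements its parameter estimation with an MCMC algorithm. It then applies MIRT to Raven's Advanced Progressive Matrices to explore the concrete use of MIRT in psychological testing. The results show that: (1) the MIRT parameter estimation program written by the authors is basically feasible, and its estimation accuracy is comparable to or better than that reported in studies abroad; (2) under a completely randomized two-factor design crossing test dimensionality and sample size (2×3), the accuracy and stability of MIRT parameter estimates increase as the numbers of examinees and items increase, but both decrease as test dimensionality increases; (3) MIRT provides more precise and detailed information for analyzing psychological tests than unidimensional IRT (UIRT). MIRT offers important guidance and reference value for constructing, developing, and evaluating psychological tests, and is worth introducing and drawing on.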

2.

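Parameter Estimation for Compensatory Multidimensional Item Response Theory Models: An Ensemble of Generalized Regression Neural Networks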

王鹏 孟维璇 朱干成 张登浩 张利会 董一萱 司英栋 《心理学探新》2019,(3):244-249

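A generalized regression neural network (GRNN) approach is used to estimate the item parameters of a compensatory multidimensional item response theory (MIRT) model from small samples, in an attempt to address the large sample sizes required by traditional estimation methods. The compensatory two-parameter logistic MIRT model is set up as a two-dimensional model with dichotomous scoring. First, two-dimensional ability parameters, item parameter values, and an examinee response matrix are simulated. Second, each item's loadings on the first two factors obtained from a principal component analysis are taken as initial values for discrimination, and the item pass rate is taken as the initial value for difficulty; these initial values serve as the inputs to the neural network. One hundred neural networks are combined, and the mean of their outputs is taken as the MIRT item parameter estimate. Finally, a 2×2 set of experimental conditions (ability correlation: 0.3 and 0.7; estimation method: GRNN and MCMC) is used to compare the parameter recovery of GRNN and MCMC estimation. The results show that, with small samples, parameter estimates from the GRNN ensemble method are better than those from MCMC.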
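As a rough illustration of the first simulation step in the abstract above, the sketch below computes response probabilities under a compensatory two-parameter logistic MIRT model and simulates a dichotomous response matrix. The slope-intercept parameterization, the function names, and the parameter ranges are assumptions made for illustration and are not taken from the paper. Abilities are drawn independently here for simplicity, whereas the study manipulates the correlation between the two abilities.

```python
import math
import random


def m2pl_prob(theta, a, d):
    """Probability of a correct response under a compensatory M2PL model.

    Assumed parameterization: logit P = sum_k a[k] * theta[k] + d.
    """
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def simulate_responses(thetas, items, rng):
    """Simulate a persons-by-items matrix of 0/1 responses."""
    return [
        [1 if rng.random() < m2pl_prob(theta, a, d) else 0 for a, d in items]
        for theta in thetas
    ]


rng = random.Random(0)
# Hypothetical settings: 5 examinees, 3 items, 2 ability dimensions.
thetas = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(5)]
items = [
    ([rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)], rng.gauss(0, 1))
    for _ in range(3)
]
responses = simulate_responses(thetas, items, rng)
```

The model is called compensatory because only the weighted sum of abilities enters the logit: with equal discriminations, an examinee at (1, -1) has the same success probability as one at (0, 0).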

3.

Unidimensional Item Factor Analysis: A Comparison of CCFA and IRT Estimation Methods

刘红云 李美娟 骆方 李小山 《心理科学》2012,35(2):441-445

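When the observed indicator variables are dichotomous categorical data, traditional factor analysis methods no longer apply. The authors briefly review the categorical-data factor analysis model in the SEM framework and the model relating test items to latent ability in the IRT framework, and summarize the main parameter estimation methods used in the two frameworks. Two simulation studies compare the GLSc and MGLSc estimation methods in the SEM framework with the MML/EM estimation method in the IRT framework. The results show that: (1) of the three methods, GLSc yields the most biased parameter estimates, while MGLSc and MML/EM differ little; (2) the accuracy of all item parameter estimates improves as sample size increases; (3) the accuracy of item factor loading and difficulty estimates is affected by test length; (4) the accuracy of item factor loading and discrimination estimates is affected by the level of the population factor loadings (discrimination); (5) the distribution of thresholds across test items affects estimation accuracy, with item discrimination affected most; (6) overall, item parameter estimates are more accurate in the SEM framework than in the IRT framework. The article also offers some suggestions on issues to watch for when applying the two approaches in practice.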

4.

Factors Affecting the Estimation of Multidimensional Abilities and Their Correlation Matrix in MIRT Models

蔡艳 涂冬波 丁树良 《心理学探新》2014,34(5):426-430

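Multidimensional item response theory is valued by many researchers and practitioners for the inherent advantages of the model itself and for combining the strengths of factor analysis and item response theory. Building on earlier work, this study focuses on estimating the multidimensional abilities in MIRT and the correlation matrix among them. Using Monte Carlo simulation with the MCMC algorithm under a completely randomized three-factor design (4×3×3), it examines how three factors, namely the number of test dimensions, the size of the correlations between dimensions, and the number of test items, affect the estimation of MIRT abilities and their correlation matrix.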

5.

An MCMC Algorithm for Parameter Estimation of an Improved 3PL Model

《心理科学》2010,(5)

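This article first runs simulation experiments on IRT model parameter estimation with the Markov chain Monte Carlo (MCMC) algorithm and the EM algorithm and examines the estimation accuracy of the two algorithms. Then, based on an analysis of the estimation accuracy of the three-parameter logistic (3PL) model, it modifies the model and estimates its parameters. The results show that MCMC estimates IRT model parameters more accurately than EM throughout, with a more pronounced advantage for the 3PL model; with small samples, MCMC estimates 3PL parameters reasonably well, with accuracy slightly below that for the 2PL model; the low determinacy of 3PL item parameters is the main reason its accuracy is slightly below that of the 2PL model; and the improved model increases the determinacy of the item parameters and so yields better estimation accuracy.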
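For readers unfamiliar with the two models compared in this abstract, the following minimal sketch gives the standard 3PL item response function and shows the 2PL as its special case. The function names are hypothetical, and the logistic form without a scaling constant is an assumption. The abstract does not describe the improved model, so it is not sketched here.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL: a guessing floor c plus a logistic curve rescaled into (c, 1).

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def p_2pl(theta, a, b):
    """2PL is the special case of the 3PL with no guessing (c = 0)."""
    return p_3pl(theta, a, b, 0.0)
```

At `theta == b` the 2PL probability is 0.5, while the 3PL probability is `c + (1 - c) / 2`. The extra lower asymptote is one more quantity to pin down from the same responses, which is consistent with the abstract's remark about the lower determinacy of 3PL item parameters.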

6.

A Simulation Study of Measurement Invariance Testing for Ordinal Data and Its Influencing Factors

李冲 刘红云 《心理科学》2011,34(6):1482-1487

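The study introduces a model specification (LRV, the latent response variable model) and a parameter estimation method (WLSMV) for ordinal data, together with a measurement invariance test (DIFFTEST) built on them. A Monte Carlo simulation examines how total sample size, the ratio of group sample sizes, the degree of threshold difference, scale length, and other factors affect the performance of DIFFTEST in testing measurement invariance for ordinal data, as well as parameter recovery under WLSMV estimation. The results show that WLSMV recovers parameters well. The Type I error rate of DIFFTEST reaches an acceptable level, and DIFFTEST has good power with large samples, roughly equal group sizes, and large threshold differences. With the number of non-invariant items held constant, the power of DIFFTEST decreases as test length increases.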

7.

A New Method for IRT Model Parameter Estimation: The MCMC Algorithm (total citations: 1; self-citations: 0; citations by others: 1)

涂冬波 漆书青 蔡艳 戴海琦 丁树良 《心理科学》2008,31(1):177-180

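This study examines the implementation of the MCMC algorithm for IRT model parameter estimation and its estimation accuracy. Several conditions are simulated (few examinees and few items; moderate numbers of both; many examinees and many items; a varying number of items with the number of examinees and their parameters fixed; and a varying number of examinees with the number of items and their parameters fixed) to assess the accuracy of MCMC estimation for the two-parameter and three-parameter logistic models, and the results are compared with the Bilog program (E-M algorithm), an internationally used measurement program. The simulations show that MCMC can be used for IRT model parameter estimation under all of these conditions, that its estimates are more accurate than those of Bilog (E-M algorithm) throughout, and that it merits wider use.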
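To give a flavor of how MCMC estimation works in an IRT setting, the sketch below runs a random-walk Metropolis sampler for a single examinee's ability under the 2PL model. It is a deliberately reduced illustration and not the algorithm used in the paper: item parameters are treated as known, the prior is assumed to be standard normal, and all names and tuning values are hypothetical. A full sampler would also update the item parameters in turn.

```python
import math
import random


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_posterior(theta, responses, items):
    """Log N(0, 1) prior plus 2PL log-likelihood, up to an additive constant."""
    lp = -0.5 * theta * theta
    for x, (a, b) in zip(responses, items):
        z = a * (theta - b)
        lp += log_sigmoid(z) if x == 1 else log_sigmoid(-z)
    return lp


def sample_theta(responses, items, n_iter=2000, burn_in=500, step=1.0, seed=0):
    """Random-walk Metropolis draws of one examinee's ability."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, responses, items)
    draws = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, responses, items)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws
```

Averaging the retained draws gives a posterior-mean ability estimate. For example, with items `[(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]` given as `(a, b)` pairs, an all-correct response pattern yields a higher mean than an all-incorrect one.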

8.

Parameter Estimation and Performance Evaluation of the R-RUM

赵顶位 戴海琦 《心理学探新》2017,(3):231-236

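Cognitive diagnosis, a new measurement paradigm of the 21st century, is receiving growing attention in China and abroad. This article implements parameter estimation for the R-RUM with the MCMC algorithm and examines its performance through Monte Carlo simulation. The results show that: (1) the R-RUM parameter estimation method is feasible and reasonably accurate; (2) Q-matrix complexity and the level of the model parameters strongly affect estimation accuracy: as the r*_jk values and Q-matrix complexity increase, the accuracy of item and examinee parameter estimates gradually declines; (3) under certain conditions, the R-RUM shows some robustness.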

9.

A Comparison of Item Selection Strategies for Polytomous Items in Bifactor-Model MCAT

毛秀珍 刘欢 唐倩 《心理科学》2019,(1):187-193

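The bifactor model assumes that a test measures one general factor and several group factors, which matches the factor structure of many educational and psychological tests. The “dimension reduction” method, which simplifies the multidimensional integrals in parameter estimation into several iterated two-dimensional integrals, is an important feature of the bifactor model. For computerized adaptive testing with polytomously scored items, this article first derives the Fisher information under the bifactor graded response model, then derives how “dimension reduction” applies to item selection methods, and finally compares the D-optimality method, the posterior-weighted Fisher information D-optimality method (PDO), the posterior-weighted Kullback-Leibler method (PKL), the continuous entropy method (CEM), and the mutual information (MI) method in item banks with low, medium, and high bifactor patterns, in terms of the correlation, root mean square error, absolute bias, and Euclidean distance of ability estimates. The simulation shows that: (1) the stronger the bifactor pattern, that is, the smaller the difference between the discriminations of the general factor and the group factors on the items, the lower the estimation accuracy for the general factor, the higher the accuracy for the group factors, and the higher the accuracy for overall ability; (2) under the same conditions, the continuous entropy method has the highest measurement accuracy, PKL has the lowest ability estimation accuracy, and the other methods do not differ significantly.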

10.

Measurement Invariance Analysis of Longitudinal Data in the IRT Framework: The Case of a Cognitive Ability Test for Children Aged 4 to 5

欧阳湘子 田伟 辛涛 詹沛达 《心理科学》2016,39(3):606-613

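Using a cognitive ability test for children aged 4 to 5 as an example, this study explores how to analyze the measurement invariance of longitudinal data in the IRT framework. The analysis models are a between-item multidimensional item response theory model (between-item MIRT model) and a within-item multidimensional (within-item MIRT model) two-tier model. The participants were 882 children aged 48 months from across China, and the instrument was a self-developed cognitive ability test for children aged 4 to 5. Test-level and item-level analyses show that: (1) the approach used here for analyzing the measurement invariance of longitudinal data is sound and effective; (2) the test satisfies partial measurement invariance across the two time points, and its latent structure is stable; (3) both the discrimination and the difficulty parameters of the “orientation item” changed, and the difficulty parameters of four other items drifted; (4) children's cognitive ability develops rapidly overall between ages 4 and 5, with significant growth in ability.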

11.

Robustness of Parameter Estimation to Assumptions of Normality in the Multidimensional Graded Response Model

Chun Wang Shiyang Su David J. Weiss 《Multivariate behavioral research》2018,53(3):403-418

A central assumption that is implicit in estimating item parameters in item response theory (IRT) models is the normality of the latent trait distribution, whereas a similar assumption made in categorical confirmatory factor analysis (CCFA) models is the multivariate normality of the latent response variables. Violation of the normality assumption can lead to biased parameter estimates. Although previous studies have focused primarily on unidimensional IRT models, this study extended the literature by considering a multidimensional IRT model for polytomous responses, namely the multidimensional graded response model. Moreover, this study is one of few studies that specifically compared the performance of full-information maximum likelihood (FIML) estimation versus robust weighted least squares (WLS) estimation when the normality assumption is violated. The research also manipulated the number of nonnormal latent trait dimensions. Results showed that FIML consistently outperformed WLS when there were one or multiple skewed latent trait distributions. More interestingly, the bias of the discrimination parameters was non-ignorable only when the corresponding factor was skewed. Having other skewed factors did not further exacerbate the bias, whereas biases of boundary parameters increased as more nonnormal factors were added. The item parameter standard errors recovered well with both estimation algorithms regardless of the number of nonnormal dimensions.

12.

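A Comparison of Two Markov Chain Monte Carlo Methods for Handling Missing Data in the 2PL Model (total citations: 1; self-citations: 0; citations by others: 1)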

曾莉 辛涛 张淑梅 《心理学报》2009,41(3):276-282

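Markov chain Monte Carlo (MCMC) is a typical approach to handling missing data in item response theory. Through a simulation study, this article compares the parameter estimation accuracy of two MCMC methods (M-H within Gibbs and DA-T Gibbs) across different numbers of examinees, numbers of items, and proportions of missing data, and supplements the comparison with an empirical study. The results show that the two methods differ. For both, item parameter estimates are strongly affected by the number of examinees and less affected by the proportion of missing data. With larger samples and smaller missing proportions, M-H within Gibbs gives a slightly smaller root mean square error (RMSE); as the sample shrinks or the missing proportion grows, DA-T Gibbs gradually outperforms M-H within Gibbs.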
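Simulation studies such as this one typically judge estimation accuracy by comparing estimates with the true generating values. The sketch below shows two common recovery criteria, bias and RMSE, under the assumption that they are averaged over items within one replication; the abstract does not say exactly how the study aggregates them, and the function names are hypothetical.

```python
import math


def bias(true_values, estimates):
    """Mean signed error: positive values indicate overestimation."""
    return sum(e - t for t, e in zip(true_values, estimates)) / len(true_values)


def rmse(true_values, estimates):
    """Root mean square error between estimates and true values."""
    squared = sum((e - t) ** 2 for t, e in zip(true_values, estimates))
    return math.sqrt(squared / len(true_values))
```

Bias can be near zero while RMSE is large when errors in opposite directions cancel, which is why both are usually reported together.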

13.

Latent variables should remain as such: Evidence from a Monte Carlo study

Karina Rdz-Navarro 《The Journal of general psychology》2013,140(4):417-442

Use of subject scores as manifest variables to assess the relationship between latent variables produces attenuated estimates. This has been demonstrated for raw scores from classical test theory (CTT) and factor scores derived from factor analysis. Conclusions on scores have not been sufficiently extended to item response theory (IRT) theta estimates, which are still recommended for estimation of relationships between latent variables. This is because IRT estimates appear to have preferable properties compared to CTT, while structural equation modeling (SEM) is often advised as an alternative to scores for estimation of the relationship between latent variables. The present research evaluates the consequences of using subject scores as manifest variables in regression models to test the relationship between latent variables. Raw scores and three methods for obtaining theta estimates were used and compared to latent variable SEM modeling. A Monte Carlo study was designed by manipulating sample size, number of items, type of test, and magnitude of the correlation between latent variables. Results show that, despite the advantage of IRT models in other areas, estimates of the relationship between latent variables are always more accurate when SEM models are used. Recommendations are offered for applied researchers.

14.

Development and Application of a Polytomous IRT Model Incorporating Response Time

汪大勋 郭莹莹 《心理学探新》2022,(3)

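Most current IRT models that incorporate response time apply only to 0-1 scored data, which greatly limits the practical use of IRT response-time models. Building on the traditional dichotomous response-time IRT model, this article develops a polytomous response-time model. Within a hierarchical modeling framework, the generalized partial credit model (GPCM) and a lognormal model are used to build a polytomous IRT model incorporating response time (denoted JRT-GPCM here), and a fully Bayesian MCMC algorithm is used to estimate the parameters of the new model. To verify the feasibility of the newly developed JRT-GPCM and its use in practice, two studies were conducted: Study 1 is a simulation experiment, and Study 2 applies the new model to the Neuroticism subscale of a Big Five personality measure. Study 1 shows that the JRT-GPCM is estimated accurately and is fairly robust. Study 2 shows that examinees' latent trait is positively correlated with response speed to some degree, and the results support the “distance-difficulty hypothesis” proposed by Ferrando and Lorenzo-Seva (2007), namely that the farther an examinee's latent trait is from an item's difficulty threshold, the more time the examinee spends responding to the item. In sum, this study provides new methodological support for extending the use of response-time information in psychological and educational measurement.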
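The response part of the model named in this abstract is the generalized partial credit model. As a minimal sketch, the function below computes GPCM category probabilities in the usual step-parameter form; the function name and parameterization are assumptions, and the lognormal response-time component and the hierarchical link between the two parts are not sketched because the abstract gives no details of them.

```python
import math


def gpcm_probs(theta, a, thresholds):
    """Category probabilities for scores 0..m under the GPCM.

    thresholds holds the m step parameters b_1..b_m; a is the discrimination.
    """
    # Cumulative logits: 0 for category 0, then running sums of a * (theta - b_v).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

With a single step parameter the model reduces to the 2PL, and when `theta` equals every step parameter all categories are equally likely.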

15.

Latent growth curve analysis with dichotomous items: Comparing four approaches

Feifei Ye 《The British journal of mathematical and statistical psychology》2016,69(1):43-61

A Monte Carlo study was used to compare four approaches to growth curve analysis of subjects assessed repeatedly with the same set of dichotomous items: A two-step procedure first estimating latent trait measures using MULTILOG and then using a hierarchical linear model to examine the changing trajectories with the estimated abilities as the outcome variable; a structural equation model using modified weighted least squares (WLSMV) estimation; and two approaches in the framework of multilevel item response models, including a hierarchical generalized linear model using Laplace estimation, and Bayesian analysis using Markov chain Monte Carlo (MCMC). These four methods have similar power in detecting the average linear slope across time. MCMC and Laplace estimates perform relatively better on the bias of the average linear slope and corresponding standard error, as well as the item location parameters. For the variance of the random intercept, and the covariance between the random intercept and slope, all estimates are biased in most conditions. For the random slope variance, only Laplace estimates are unbiased when there are eight time points.

16.

A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis

Michael C. Edwards 《Psychometrika》2010,75(3):474-497

Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show that these methods can be implemented in a flexible way which requires minimal technical sophistication on the part of the end user. After providing an overview of item factor analysis and MCMC, results from several examples (simulated and real) will be discussed. The bulk of these examples focus on models that are problematic for current “gold-standard” estimators. The results demonstrate that it is possible to obtain accurate parameter estimates using MCMC in a relatively user-friendly package.

17.

Simple imputation methods versus direct likelihood analysis for missing item scores in multilevel educational data

Kadengye DT, Cools W, Ceulemans E, Van den Noortgate W 《Behavior research methods》2012,44(2):516-531

Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when applied with traditional IRT models. Additionally, a nonimputation direct likelihood analysis has been shown to be an effective tool for handling missing observations in clustered data settings. This study investigates the performance of six simple imputation methods, which have been found to be useful in other IRT contexts, versus a direct likelihood analysis, in multilevel data from educational settings. Multilevel item response data were simulated on the basis of two empirical data sets, and some of the item scores were deleted, such that they were missing either completely at random or simply at random. An explanatory IRT model was used for modeling the complete, incomplete, and imputed data sets. We showed that direct likelihood analysis of the incomplete data sets produced unbiased parameter estimates that were comparable to those from a complete data analysis. Multiple-imputation approaches of the two-way mean and corrected item mean substitution methods displayed varying degrees of effectiveness in imputing data that in turn could produce unbiased parameter estimates. The simple random imputation, adjusted random imputation, item means substitution, and regression imputation methods seemed to be less effective in imputing missing item scores in multilevel data settings.

18.

IRT Scoring Models for Multidimensional Forced-Choice Tests

刘娟 郑蝉金 李云川 连旭 《心理科学进展》2022,30(6):1410-1428

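Forced-choice (FC) tests are widely used in noncognitive testing because they can control the response biases associated with the traditional Likert method. The traditional scoring of forced-choice tests, however, yields ipsative data, which have long been criticized as unsuitable for comparisons between individuals. In recent years, the development of several forced-choice IRT models has allowed researchers to obtain near-normative data from forced-choice tests, renewing the interest of researchers and practitioners in forced-choice IRT models. This review first classifies and introduces six of the more mainstream forced-choice IRT models according to the decision model and item response model they adopt. It then compares and summarizes the models in terms of model construction and parameter estimation methods. Next, it reviews applied research in three areas: parameter invariance testing, computerized adaptive testing (CAT), and validity studies. Finally, it proposes that future research go deeper in four directions: model extension, parameter invariance testing, forced-choice CAT, and validity studies.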