首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
刘红云  骆方  王玥  张玉 《心理学报》2012,44(1):121-132
作者简要回顾了SEM框架下分类数据因素分析(CCFA)模型和MIRT框架下测验题目和潜在能力的关系模型, 对两种框架下的主要参数估计方法进行了总结。通过模拟研究, 比较了SEM框架下WLSc和WLSMV估计方法与MIRT框架下MLR和MCMC估计方法的差异。研究结果表明:(1) WLSc得到参数估计的偏差最大, 且存在参数收敛的问题; (2)随着样本量增大, 各种项目参数估计的精度均提高, WLSMV方法与MLR方法得到的参数估计精度差异很小, 大多数情况下不比MCMC方法差; (3)除WLSc方法外, 随着每个维度测验题目的增多参数估计的精度逐渐增高; (4)测验维度对区分度参数和难度参数的影响较大, 而测验维度对项目因素载荷和阈值的影响相对较小; (5)项目参数的估计精度受项目测量维度数的影响, 只测量一个维度的项目参数估计精度较高。另外文章还对两种方法在实际应用中应该注意的问题提供了一些建议。  相似文献   

2.
涂冬波  蔡艳  戴海琦  丁树良 《心理学报》2011,43(11):1329-1340
本研究介绍并引进了现代测量理论中的前沿技术—— 多维项目反应理论, 采用MCMC算法实现了其参数估计; 并将MIRT应用于瑞文高级推理测验, 以探讨MIRT在心理测验中的具体应用。研究结果表明:(1)本研究自主编制的MIRT参数估计程序基本可行, 其估计的精度与国外研究结论相当甚至更好。(2)在测验维度和样本容量两因素完全随机实验设计下(2×3), 随着被试和题目样本容量的增加, MIRT参数估计的精度越高且估计的稳定性越强; 但随着测验维度的增加, MIRT参数估计精度和稳定性均随之降低。(3)MIRT对心理测验的分析比UIRT能提供更为精确和细致的信息。它对心理测验的编制、开发及评价具有重要的指导和参考价值, 值得引进及借鉴。  相似文献   

3.
2PL模型的两种马尔可夫蒙特卡洛缺失数据处理方法比较   总被引:1,自引:0,他引:1  
曾莉  辛涛  张淑梅 《心理学报》2009,41(3):276-282
马尔科夫蒙特卡洛(MCMC)是项目反应理论中处理缺失数据的一种典型方法。文章通过模拟研究比较了在不同被试人数,项目数,缺失比例下两种MCMC方法(M-H within Gibbs和DA-T Gibbs)参数估计的精确性,并结合了实证研究。研究结果表明,两种方法是有差异的,项目参数估计均受被试人数影响很大,受缺失比例影响相对更小。在样本较大缺失比例较小时,M-H within Gibbs参数估计的均方误差(RMSE)相对略小,随着样本数的减少或缺失比例的增加,DA-T Gibbs方法逐渐优于M-H within Gibbs方法  相似文献   

4.
Hooker, Finkelman, and Schwartzman (Psychometrika, 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to occur leads to the undesirable possibility of a subject’s best answer being detrimental to them. This paper considers the existence of paradoxical results in tests composed of item bundles when compensatory models are used. We demonstrate that paradoxical results can occur when bundle effects are modeled as nuisance parameters for each subject. However, when these nuisance parameters are modeled as random effects, or used in a Bayesian analysis, it is possible to design tests comprised of many short bundles that avoid paradoxical results and we provide an algorithm for doing so. We also examine alternative models for handling dependence between item bundles and show that using fixed dependency effects is always guaranteed to avoid paradoxical results.  相似文献   

5.
多维项目反应理论等级反应模型   总被引:2,自引:0,他引:2  
杜文久  肖涵敏 《心理学报》2012,44(10):1402-1407
基于因子分析和单维项目反应理论的多维项目反应理论是测量理论的新发展方向之一。但是, 多维项目反应理论仍处于不成熟的发展阶段, 多数研究也只是以二级评分为主。本文首先介绍了逻辑斯蒂形式的多维等级反应模型, 并以二维等级反应模型为例, 分析了模型的数学函数图像及其性质。然后, 推导出了多维等级反应模型的项目信息函数, 并结合实例进行了讨论。进一步地, 本文阐述了使用联合极大似然估计和马尔科夫链蒙特卡洛方法估计多维等级反应模型参数的思想。最后, 指出了一些有待研究的问题。  相似文献   

6.
Factor analysis models have played a central role in formulating conceptual models in personality and personality assessment, as well as in empirical examinations of personality measurement instruments. Yet, the use of item-level data presents special problems for factor analysis, applications. In this article, we review recent developments in factor analysis that are appropriate for the type of item-level data often collected in personality. Included in this review are discussions of how these developments have been addressed in the context of two different (but formally related) statistical models item response theory (IRT: Hambleton, Swaminathan, & Rogers, 1991) and structural, equation modeling (Bollen 1989) for item-level data. We also discuss the relevance of item scaling in the context of these models. Using the restandardization data for the Minnesota Multiphasic Personality Inventory-2 Scale (cf. Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989), we show brief examples of the utility of these approaches to address basic questions about responses to personality scale items regarding: (a) scale, dimensionality and general item properties, (b) the "appropriateness" of the observed responses, and (c) differential item functioning across subsamples. implications for analyses of personality item-level data in the IRT and factor analytic traditions are discussed.  相似文献   

7.
A plausibles-factor solution for many types of psychological and educational tests is one that exhibits a general factor ands − 1 group or method related factors. The bi-factor solution results from the constraint that each item has a nonzero loading on the primary dimension and at most one of thes − 1 group factors. This paper derives a bi-factor item-response model for binary response data. In marginal maximum likelihood estimation of item parameters, the bi-factor restriction leads to a major simplification of likelihood equations and (a) permits analysis of models with large numbers of group factors; (b) permits conditional dependence within identified subsets of items; and (c) provides more parsimonious factor solutions than an unrestricted full-information item factor analysis in some cases. Supported by the Cognitive Science Program, Office of Naval Research, Under grant #N00014-89-J-1104. We would like to thank Darrell Bock for several helpful suggestions.  相似文献   

8.
运用广义回归神经网络(GRNN)方法对小样本多维项目反应理论(MIRT)补偿性模型的项目参数进行估计,尝试解决传统参数估计方法样本数量要求较大的问题。MIRT双参数Logistic补偿模型被设置为二级计分的二维模型。首先,模拟二维能力参数、项目参数值与考生作答矩阵。其次,把通过主成分分析得到的前两个因子在每个题目上的载荷作为区分度的初始值以及题目通过率作为难度的初始值,这两个指标的初始值作为神经网络的输入。集成100个神经网络,其输出值的均值作为MIRT的项目参数估计值。最后,设置2×2种(能力相关水平:0.3和0.7; 两种估计方法:GRNN和MCMC方法)实验处理,对GRNN和MCMC估计方法的返真性进行比较。结果表明,小样本的情况下,基于GRNN集成方法的参数估计结果优于MCMC方法。  相似文献   

9.
In multidimensional item response theory (MIRT), it is possible for the estimate of a subject’s ability in some dimension to decrease after they have answered a question correctly. This paper investigates how and when this type of paradoxical result can occur. We demonstrate that many response models and statistical estimates can produce paradoxical results and that in the popular class of linearly compensatory models, maximum likelihood estimates are guaranteed to do so. In light of these findings, the appropriateness of multidimensional item response methods for assigning scores in high-stakes testing is called into question.  相似文献   

10.
Item factor analysis: current approaches and future directions   总被引:2,自引:0,他引:2  
The rationale underlying factor analysis applies to continuous and categorical variables alike; however, the models and estimation methods for continuous (i.e., interval or ratio scale) data are not appropriate for item-level data that are categorical in nature. The authors provide a targeted review and synthesis of the item factor analysis (IFA) estimation literature for ordered-categorical data (e.g., Likert-type response scales) with specific attention paid to the problems of estimating models with many items and many factors. Popular IFA models and estimation methods found in the structural equation modeling and item response theory literatures are presented. Following this presentation, recent developments in the estimation of IFA parameters (e.g., Markov chain Monte Carlo) are discussed. The authors conclude with considerations for future research on IFA, simulated examples, and advice for applied researchers.  相似文献   

11.
Cognitive diagnosis models of educational test performance rely on a binary Q‐matrix that specifies the associations between individual test items and the cognitive attributes (skills) required to answer those items correctly. Current methods for fitting cognitive diagnosis models to educational test data and assigning examinees to proficiency classes are based on parametric estimation methods such as expectation maximization (EM) and Markov chain Monte Carlo (MCMC) that frequently encounter difficulties in practical applications. In response to these difficulties, non‐parametric classification techniques (cluster analysis) have been proposed as heuristic alternatives to parametric procedures. These non‐parametric classification techniques first aggregate each examinee's test item scores into a profile of attribute sum scores, which then serve as the basis for clustering examinees into proficiency classes. Like the parametric procedures, the non‐parametric classification techniques require that the Q‐matrix underlying a given test be known. Unfortunately, in practice, the Q‐matrix for most tests is not known and must be estimated to specify the associations between items and attributes, risking a misspecified Q‐matrix that may then result in the incorrect classification of examinees. This paper demonstrates that clustering examinees into proficiency classes based on their item scores rather than on their attribute sum‐score profiles does not require knowledge of the Q‐matrix, and results in a more accurate classification of examinees.  相似文献   

12.
Maximum likelihood and Bayesian ability estimation in multidimensional item response models can lead to paradoxical results as proven by Hooker, Finkelman, and Schwartzman (Psychometrika 74(3): 419–442, 2009): Changing a correct response on one item into an incorrect response may produce a higher ability estimate in one dimension. Furthermore, the conditions under which this paradox arises are very general, and may in fact be fulfilled by many of the multidimensional scales currently in use. This paper tries to emphasize and extend the generality of the results of Hooker et al. by (1) considering the paradox in a generalized class of IRT models, (2) giving a weaker sufficient condition for the occurrence of the paradox with relations to an important concept of statistical association, and by (3) providing some additional specific results for linearly compensatory models with special emphasis on the factor analysis model.  相似文献   

13.
Measurement models, such as factor analysis and item response theory models, are commonly implemented within educational, psychological, and behavioral science research to mitigate the negative effects of measurement error. These models can be formulated as an extension of generalized linear mixed models within a unifying framework that encompasses various kinds of multilevel models and longitudinal models, such as partially nonlinear latent growth models. We introduce the R package PLmixed, which implements profile maximum likelihood estimation to estimate complex measurement and growth models that can be formulated within the general modeling framework using the existing R package lme4 and function optim. We demonstrate the use of PLmixed through two examples before concluding with a brief overview of other possible models.  相似文献   

14.
一种多级评分的认知诊断模型:P-DINA模型的开发   总被引:2,自引:2,他引:0  
涂冬波  蔡艳  戴海琦  丁树良 《心理学报》2010,42(10):1011-1020
当前绝大多数认知诊断计量模型仅适用于0-1评分数据资料, 大大限制了认知诊断在实际中的应用, 也限制了认知诊断的进一步推广和发展。本文对具有较好发展前景的DINA模型进行拓展, 开发出适合多种评分(含0-1二级评分和多级评分)数据资料的P-DINA模型, 同时采用MCMC算法实现模型参数的估计, 并对该模型性能进行研究。结果表明:(1)本文开发的P-DINA模型无论是在无结构型属性层级关系下还是在结构型属性层级关系下, 参数估计的精度均较高, 参数估计的稳健性较强, 说明开发的P-DINA模型基本合理、可行。(2)P-DINA模型可采用MCMC算法实现参数估计, 且参数估计的精度较高。(3)整体来看, 无结构型属性层级关系和结构型属性层级关系下, P-DINA模型在项目参数的估计精度上两者基本相当; 但在被试属性判准率(MMR和PMR)上无结构型属性层级关系表现的稍差一些。(4)无结构型属性阶层关系下:模型诊断的属性个数越多, 参数 估计的精度越差、属性诊断的正确率(MMR和PMR)越低, 但参数 的估计精度越好; 若想保证属性模式判准率在80%以上, 建议诊断的属性个数不宜超过7个。总之, 本研究为拓展认知诊断在教育学和心理学中的应用提供了一种新方法、新模型。  相似文献   

15.
A Monte Carlo study was used to compare four approaches to growth curve analysis of subjects assessed repeatedly with the same set of dichotomous items: A two‐step procedure first estimating latent trait measures using MULTILOG and then using a hierarchical linear model to examine the changing trajectories with the estimated abilities as the outcome variable; a structural equation model using modified weighted least squares (WLSMV) estimation; and two approaches in the framework of multilevel item response models, including a hierarchical generalized linear model using Laplace estimation, and Bayesian analysis using Markov chain Monte Carlo (MCMC). These four methods have similar power in detecting the average linear slope across time. MCMC and Laplace estimates perform relatively better on the bias of the average linear slope and corresponding standard error, as well as the item location parameters. For the variance of the random intercept, and the covariance between the random intercept and slope, all estimates are biased in most conditions. For the random slope variance, only Laplace estimates are unbiased when there are eight time points.  相似文献   

16.
Computerized adaptive testing under nonparametric IRT models   总被引:1,自引:0,他引:1  
Nonparametric item response models have been developed as alternatives to the relatively inflexible parametric item response models. An open question is whether it is possible and practical to administer computerized adaptive testing with nonparametric models. This paper explores the possibility of computerized adaptive testing when using nonparametric item response models. A central issue is that the derivatives of item characteristic Curves may not be estimated well, which eliminates the availability of the standard maximum Fisher information criterion. As alternatives, procedures based on Shannon entropy and Kullback–Leibler information are proposed. For a long test, these procedures, which do not require the derivatives of the item characteristic eurves, become equivalent to the maximum Fisher information criterion. A simulation study is conducted to study the behavior of these two procedures, compared with random item selection. The study shows that the procedures based on Shannon entropy and Kullback–Leibler information perform similarly in terms of root mean square error, and perform much better than random item selection. The study also shows that item exposure rates need to be addressed for these methods to be practical. The authors would like to thank Hua Chang for his help in conducting this research.  相似文献   

17.
Woods CM 《心理学方法》2006,11(3):253-270
Popular methods for fitting unidimensional item response theory (IRT) models to data assume that the latent variable is normally distributed in the population of respondents, but this can be unreasonable for some variables. Ramsay-curve IRT (RC-IRT) was developed to detect and correct for this nonnormality. The primary aims of this article are to introduce RC-IRT less technically than it has been described elsewhere; to evaluate RC-IRT for ordinal data via simulation, including new approaches for model selection; and to illustrate RC-IRT with empirical examples. The empirical examples demonstrate the utility of RC-IRT for real data, and the simulation study indicates that when the latent distribution is skewed, RC-IRT results can be more accurate than those based on the normal model. Along with a plot of candidate curves, the Hannan-Quinn criterion is recommended for model selection.  相似文献   

18.
A key challenge for cognitive psychology is the investigation of mental representations, such as object categories, subjective probabilities, choice utilities, and memory traces. In many cases, these representations can be expressed as a non-negative function defined over a set of objects. We present a behavioral method for estimating these functions. Our approach uses people as components of a Markov chain Monte Carlo (MCMC) algorithm, a sophisticated sampling method originally developed in statistical physics. Experiments 1 and 2 verified the MCMC method by training participants on various category structures and then recovering those structures. Experiment 3 demonstrated that the MCMC method can be used estimate the structures of the real-world animal shape categories of giraffes, horses, dogs, and cats. Experiment 4 combined the MCMC method with multidimensional scaling to demonstrate how different accounts of the structure of categories, such as prototype and exemplar models, can be tested, producing samples from the categories of apples, oranges, and grapes.  相似文献   

19.
多维题组效应Rasch模型   总被引:2,自引:0,他引:2  
首先, 本文诠释了“题组”的本质即一个存在共同刺激的项目集合。并基于此, 将题组效应划分为项目内单维题组效应和项目内多维题组效应。其次, 本文基于Rasch模型开发了二级评分和多级评分的多维题组效应Rasch模型, 以期较好地处理项目内多维题组效应。最后, 模拟研究结果显示新模型有效合理, 与Rasch题组模型、分部评分模型对比研究后表明:(1)测验存在项目内多维题组效应时, 仅把明显的捆绑式题组效应进行分离而忽略其他潜在的题组效应, 仍会导致参数的偏差估计甚或高估测验信度; (2)新模型更具普适性, 即便当被试作答数据不存在题组效应或只存在项目内单维题组效应, 采用新模型进行测验分析也能得到较好的参数估计结果。  相似文献   

20.
Psychometric models for item-level data are broadly useful in psychology. A recurring issue for estimating item factor analysis (IFA) models is low-item endorsement (item sparseness), due to limited sample sizes or extreme items such as rare symptoms or behaviors. In this paper, I demonstrate that under conditions characterized by sparseness, currently available estimation methods, including maximum likelihood (ML), are likely to fail to converge or lead to extreme estimates and low empirical power. Bayesian estimation incorporating prior information is a promising alternative to ML estimation for IFA models with item sparseness. In this article, I use a simulation study to demonstrate that Bayesian estimation incorporating general prior information improves parameter estimate stability, overall variability in estimates, and power for IFA models with sparse, categorical indicators. Importantly, the priors proposed here can be generally applied to many research contexts in psychology, and they do not impact results compared to ML when indicators are not sparse. I then apply this method to examine the relationship between suicide ideation and insomnia in a sample of first-year college students. This provides an important alternative for researchers who may need to model items with sparse endorsement.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号