首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 62 毫秒
1.
项目反应理论中参数估计程序的实现,一直是研究现代测量理论的学者们关注的问题。该研究从理论上探讨了多级模型参数估计的实现途径,并模拟了5批不同数量及分布情形的项目及被试参数,生成相应的原始得分矩阵,对自编程序及国际流行的相关程序进行了严格的比较校验,验证结果证明本程序具有精确、稳定的性能,并且发现被试量太少将影响参数估计的精确性及稳定性  相似文献   

2.
Item response theory (IRT) is supplanting classical test theory as the basis for measures development. This study demonstrated the utility of IRT for evaluating DSM-IV diagnostic criteria. Data on alcohol, cannabis, and cocaine symptoms from 372 adult clinical participants interviewed with the Composite International Diagnostic Interview--Expanded Substance Abuse Module (CIDI-SAM) were analyzed with Mplus (B. Muthen & L. Muthen, 1998) and MULTILOG (D. Thissen, 1991) software. Tolerance and legal problems criteria were dropped because of poor fit with a unidimensional model. Item response curves, test information curves, and testing of variously constrained models suggested that DSM-IV criteria in the CIDI-SAM discriminate between only impaired and less impaired cases and may not be useful to scale case severity. IRT can be used to study the construct validity of DSM-IV diagnoses and to identify diagnostic criteria with poor performance.  相似文献   

3.
The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.  相似文献   

4.
When categorical ordinal item response data are collected over multiple timepoints from a repeated measures design, an item response theory (IRT) modeling approach whose unit of analysis is an item response is suitable. This study proposes a few longitudinal IRT models and illustrates how a popular compensatory multidimensional IRT model can be utilized to formulate such longitudinal IRT models, which permits an investigation of ability growth at both individual and population levels. The equivalence of an existing multidimensional IRT model and those longitudinal IRT models is also elaborated so that one can make use of an existing multidimensional IRT model to implement the longitudinal IRT models.  相似文献   

5.
This article reviews Newton procedures for the analysis of mean and covariance structures that may be functions of parameters that enter a model nonlinearly. The kind of model considered is a mixed-effects model that is conditionally linear with regard to its parameters. This means parameters entering the model nonlinearly must be fixed, whereas those entering linearly may vary across individuals. This framework encompasses several models, including hierarchical linear models, linear and nonlinear factor analysis models, and nonlinear latent curve models. A full maximum-likelihood estimation procedure is described. Mx, a statistical software package often used to estimate structural equation models, is considered for estimation of these models. An example with Mx syntax is provided.  相似文献   

6.
应用项目反应理论对《中国士兵人格问卷》的项目分析   总被引:4,自引:0,他引:4  
采用项目反应理论(IRT)对《中国士兵人格问卷》进行项目分析。计算机呈现中国士兵人格问卷(CSPQ)对100,523名适龄男性青年进行测验,随机抽取2676名任一维度标准分均低于70的定为合格组;将任一维度大于70分并经专业人员访谈不合格的274名定为不合格组;从精神病院抽取男性年龄相当的221名缓解期精神分裂症患者定为精神病组,并完成CSPQ测验。运用基于IRT的双参数Logistic模型进行分析;结果发现,区分度参数超过区间(0.30,4.00)的条目删除前后,被试的能力值与标准分均存在显著相关;精神病组的测验分数经IRT分析,图形曲线与不合格组有高度吻合。研究结果说明,在测验精度基本相同的条件下,应用IRT可以减少施测条目,提高测验效率,可在一定程度上更精确地区分被试的特质水平  相似文献   

7.
迫选(forced-choice, FC)测验由于可以控制传统李克特方法带来的反应偏差, 被广泛应用于非认知测验中, 而迫选测验的传统计分方式会产生自模式数据, 这种数据由于不适合于个体间的比较, 一直备受批评。近年来, 多种迫选IRT模型的发展使研究者能够从迫选测验中获得接近常模性的数据, 再次引起了研究者与实践人员对迫选IRT模型的兴趣。首先, 依据所采纳的决策模型和题目反应模型对6种较为主流的迫选IRT模型进行分类和介绍。然后, 从模型构建思路、参数估计方法两个角度对各模型进行比较与总结。其次, 从参数不变性检验、计算机化自适应测验(computerized adaptive testing, CAT)和效度研究3个应用研究方面进行述评。最后提出未来研究可以在模型拓展、参数不变性检验、迫选CAT测验和效度研究4个方向深入。  相似文献   

8.
项目反应理论(IRT)是用于客观测量的现代教育与心理测量理论之一,广泛用于缺失数据十分常见的大尺度测验分析。IRT中两参数逻辑斯蒂克模型(2PLM)下仅有完全随机缺失机制下缺失反应和缺失能力处理的EM算法。本研究推导2PLM下缺失反应忽略的EM 算法,并提出随机缺失机制下缺失反应和缺失能力处理的EM算法和考虑能力估计和作答反应不确定性的多重借补法。研究显示:在各种缺失机制、缺失比例和测验设计下,缺失反应忽略的EM算法和多重借补法表现理想。  相似文献   

9.
Bayes modal estimation in item response models   总被引:1,自引:0,他引:1  
This article describes a Bayesian framework for estimation in item response models, with two-stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one-, two-, and three-parameter binary logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data.This research was supported by a grant from the Spencer Foundation, Chicago, IL. Comments and suggestions on earlier drafts by Charles Lewis, Frederic Lord, Rosenbaum, James Ramsey, Hiroshi Watanabe, the editor, and two anonymous referees are gratefully acknowledged.  相似文献   

10.
涂冬波  蔡艳  戴海琦  丁树良 《心理科学》2011,34(5):1189-1194
IRT中的计量模型较多,不同计量模型适合不同特点的数据资料,实际工作者应根据实际情况选择适当的IRT模型来分析数据。我国是个考试、测评大国,测评的题型丰富多样,在实际应用IRT时,一个模型往往很难反应所有数据资料本身的特点,这时可考虑应用多个IRT模型(即“混合模型”)来分析,以达到对数据的最佳拟合。本文对混合模型的思想方法及原理、参数估计的实现、以及模型性能进行了研究,发现:(1)本文自主开发的混合模型参数估计程序Mix_Tu具有较高的返真性,且与国际知名测量软件Parscale相当。(2)在“项目异常”情况下,Mix_Tu程序对参数b和c的估计受数据异常程度的影响要大于Parscale程序,而对参数a的估计受数据异常程度的影响要小于Parscale程序,而在参数theta上两个程序相当。(3)在“被试异常”情况下,Mix_Tu程序对所有参数的估计受数据异常程度的影响均要小于Parscale程序,Mix_Tu程序表现的更为稳健。  相似文献   

11.
迫选测验的传统计分方式会产生自模式数据, 不能进行传统的信效度检验、因素分析和方差分析等。近年来研究者提出了一些基于项目反应理论的计分模型, 如瑟斯顿IRT模型和MUPP模型等, 它们可以规避自模式数据的弊端。瑟斯顿IRT模型方便进行参数估计, 模型定义灵活; 而MUPP模型的拓展性较差, 参数估计的方法有待提高。另一方面, 已有研究者基于MUPP模型开发了一些抗作假的迫选测验, 而瑟斯顿IRT模型距离这种应用还比较远。此外, 两个模型的适用性和有效性都有待更多的实证研究来检验。  相似文献   

12.
For testlet response data, traditional item response theory (IRT) models are often not appropriate due to local dependence presented among items within a common testlet. Several testlet‐based IRT models have been developed to model examinees' responses. In this paper, a new two‐parameter normal ogive testlet response theory (2PNOTRT) model for dichotomous items is proposed by introducing testlet discrimination parameters. A Bayesian model parameter estimation approach via a data augmentation scheme is developed. Simulations are conducted to evaluate the performance of the proposed 2PNOTRT model. The results indicated that the estimation of item parameters is satisfactory overall from the viewpoint of convergence. Finally, the proposed 2PNOTRT model is applied to a set of real testlet data.  相似文献   

13.
Log-Multiplicative Association Models as Item Response Models   总被引:1,自引:0,他引:1  
Log-multiplicative association (LMA) models, which are special cases of log-linear models, have interpretations in terms of latent continuous variables. Two theoretical derivations of LMA models based on item response theory (IRT) arguments are presented. First, we show that Anderson and colleagues (Anderson &; Vermunt, 2000; Anderson &; Böckenholt, 2000; Anderson, 2002), who derived LMA models from statistical graphical models, made the equivalent assumptions as Holland (1990) when deriving models for the manifest probabilities of response patterns based on an IRT approach. We also present a second derivation of LMA models where item response functions are specified as functions of rest-scores. These various connections provide insights into the behavior of LMA models as item response models and point out philosophical issues with the use of LMA models as item response models. We show that even for short tests, LMA and standard IRT models yield very similar to nearly identical results when data arise from standard IRT models. Log-multiplicative association models can be used as item response models and do not require numerical integration for estimation.  相似文献   

14.
Various different item response theory (IRT) models can be used in educational and psychological measurement to analyze test data. One of the major drawbacks of these models is that efficient parameter estimation can only be achieved with very large data sets. Therefore, it is often worthwhile to search for designs of the test data that in some way will optimize the parameter estimates. The results from the statistical theory on optimal design can be applied for efficient estimation of the parameters.A major problem in finding an optimal design for IRT models is that the designs are only optimal for a given set of parameters, that is, they are locally optimal. Locally optimal designs can be constructed with a sequential design procedure. In this paper minimax designs are proposed for IRT models to overcome the problem of local optimality. Minimax designs are compared to sequentially constructed designs for the two parameter logistic model and the results show that minimax design can be nearly as efficient as sequentially constructed designs.  相似文献   

15.
当观测指标变量为二分分类数据时,传统的因素分析方法不再适用。作者简要回顾了SEM框架下的分类数据因素分析模型和IRT框架下的测验题目和潜在能力的关系模型,并对两种框架下主要采用的参数估计方法进行了总结。通过两个模拟研究,比较了SEM框架下GLSc和MGLSc估计方法与IRT框架下MML/EM估计方法的差异。研究结果表明:(1)三种方法中,GLSc得到参数估计的偏差最大,MGLSc和MML/EM估计方法相差不大;(2)随着样本量增大,各种项目参数估计的精度均提高;(3)项目因素载荷和难度估计的精度受测验长度的影响;(4)项目因素载荷和区分度估计的精度受总体因素载荷(区分度)高低的影响;(5)测验项目中阈值的分布会影响参数估计的精度,其中受影响最大的是项目区分度。(6)总体来看,SEM框架下的项目参数估计精度较IRT框架下项目参数估计的精度高。此外,文章还将两种方法在实际应用中应该注意的问题提供了一些建议。  相似文献   

16.
Although Thurstonian models provide an attractive representation of choice behavior, they have not been extensively used in ranking applications since only recently efficient estimation methods for these models have been developed. These, however, require the use of special-purpose estimation programs, which limits their applicability. Here we introduce a formulation of Thurstonian ranking models that turns an idiosyncratic estimation problem into an estimation problem involving mean and covariance structures with dichotomous indicators. Well-known standard solutions for the latter can be readily applied to this specific problem, and as a result any Thurstonian model for ranking data can be fitted using existing general purpose software for mean and covariance structure analysis. Although the most popular programs for covariance structure analysis (e.g., LISREL and EQS) cannot be presently used to estimate Thurstonian ranking models, other programs such as MECOSA already exist that can be straightforwardly used to estimate these models.This paper is based on the author's doctoral dissertation. Ulf Böckenholt was my advisor. The author is indebted to Ulf Böckenholt for his comments on a previous version of this paper and to Gerhard Arminger for his extensive support on the use of MECOSA. The final stages of this research took place while the author was at the Department of Statistics and Econometrics, Universidad Carlos III de Madrid. Conversations with my colleague there, Adolfo Hernández, helped to greatly improve this paper.  相似文献   

17.
题组作为众多测验中的一种常见题型,由于项目间存在一定程度的依赖性而违背了局部独立性假设,若用项目反应模型进行参数估计将会出现较大的偏差.题组反应理论将被试与题组的交互作用纳入到模型中,解决了项目间相依性的问题.笔者对题组反应理论的发展、基本原理及其相关研究进行了综述,并将其应用在中学英语考试中.与项目反应理论相对比,结果发现:(1)题组反应模型与项目反应模型在各参数估计值的相关系数较强,尤其是能力参数和难度参数;(2)在置信区间宽度的比较上,题组反应模型在各个参数上均窄于项目反应模型,即题组反应模型的估计精度优于项目反应模型.  相似文献   

18.
This article analyzes latent variable models from a cognitive psychology perspective. We start by discussing work by Tuerlinckx and De Boeck (2005), who proved that a diffusion model for 2-choice response processes entails a 2-parameter logistic item response theory (IRT) model for individual differences in the response data. Following this line of reasoning, we discuss the appropriateness of IRT for measuring abilities and bipolar traits, such as pro versus contra attitudes. Surprisingly, if a diffusion model underlies the response processes, IRT models are appropriate for bipolar traits but not for ability tests. A reconsideration of the concept of ability that is appropriate for such situations leads to a new item response model for accuracy and speed based on the idea that ability has a natural zero point. The model implies fundamentally new ways to think about guessing, response speed, and person fit in IRT. We discuss the relation between this model and existing models as well as implications for psychology and psychometrics.  相似文献   

19.
项目反应理论是测量被试潜在特质的现代测量理论, 潜在类别分析是基于模型的潜在特质分类技术。混合项目反应理论将项目反应理论与潜在类别分析相结合, 能够同时对被试分类并量化其潜在特质。在阐述混合项目反应理论概念、原理的基础上, 介绍了MRM、mNRM和mPCM等几种常见混合模型及其参数估计方法, 并从心理与行为特征分类、项目功能差异检测、测验效度评价等方面评述了其在心理测验中的应用发展轨迹。  相似文献   

20.
A Monte Carlo study was used to compare four approaches to growth curve analysis of subjects assessed repeatedly with the same set of dichotomous items: A two‐step procedure first estimating latent trait measures using MULTILOG and then using a hierarchical linear model to examine the changing trajectories with the estimated abilities as the outcome variable; a structural equation model using modified weighted least squares (WLSMV) estimation; and two approaches in the framework of multilevel item response models, including a hierarchical generalized linear model using Laplace estimation, and Bayesian analysis using Markov chain Monte Carlo (MCMC). These four methods have similar power in detecting the average linear slope across time. MCMC and Laplace estimates perform relatively better on the bias of the average linear slope and corresponding standard error, as well as the item location parameters. For the variance of the random intercept, and the covariance between the random intercept and slope, all estimates are biased in most conditions. For the random slope variance, only Laplace estimates are unbiased when there are eight time points.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号