首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 187 毫秒
1.
GIRM(Generalizability in Item Response Modeling)是一种将概化理论GT和项目反应理论IRT相结合后计算概化理论中方差分量的一种方法.当GIRM方法下θp和βi的抽样分布与GIRM方法中的MCMC先验分布一致时,GIRM方法对方差分量估计具有较高的准确性.为了进一步检验GIRM方法对IRT参数分布形态的敏感性,研究在将MCMC先验分布固定的情况下,探讨不同IRT参数分布形态下GIRM方法的适用性,并将所得结果与传统GT方法相比较.结果表明:(1)在各种参数分布形态下,采用GIRM方法估计IRT模型的参数是可行的;(2)GIRM方法在被试能力参数为标准正态分布时对σ2(p)估计的准确性高于传统GT方法,但在均匀分布和偏态分布下略差于传统GT方法;(3) GIRM方法在题目难度参数为偏态分布情况下对σ2(i)的估计准确性显著差于传统GT方法;(4)两种方法对于σ2(pie)估计的准确性在任何参数分布形态下都大致相当,优劣并无统一规律.  相似文献   

2.
余嘉元 《心理学报》2002,34(5):80-86
运用联结主义中的级连相关模型对于小样本条件下的连续记分项目反应理论 (IRT)模型的项目参数和被试能力进行了估计。一组被试对于一组项目的反应矩阵作为级连相关模型的输入 ,这组被试的能力θ或该组项目的参数a、b和c作为该模型的输出 ,对神经网络进行训练使之具备了估计θ,a ,b或c的能力。计算机模拟的实验表明 ,如果测验中有少量项目取自于题库 ,就可以运用联结主义方法对IRT参数和被试能力进行较好的估计  相似文献   

3.
余嘉元 《心理学报》1990,23(2):95-100
本研究运用蒙特卡罗模拟方法。在80种不同的实验条件下产生被试的反应矩阵,再分别用信号检测理论(SDT)和项目反应理论(IRT)中的Rasch模型、三参数逻辑斯谛模型对被试的能力进行估计,其结果表明,用这三种不同方法所得到的能力估计值均与能力的真实值有较高的相关。  相似文献   

4.
项目反应理论(IRT)是用于客观测量的现代教育与心理测量理论之一,广泛用于缺失数据十分常见的大尺度测验分析。IRT中两参数逻辑斯蒂克模型(2PLM)下仅有完全随机缺失机制下缺失反应和缺失能力处理的EM算法。本研究推导2PLM下缺失反应忽略的EM 算法,并提出随机缺失机制下缺失反应和缺失能力处理的EM算法和考虑能力估计和作答反应不确定性的多重借补法。研究显示:在各种缺失机制、缺失比例和测验设计下,缺失反应忽略的EM算法和多重借补法表现理想。  相似文献   

5.
经典测验理论和项目反应理论对题目分析的对比研究   总被引:1,自引:1,他引:0  
李伟明  陈富国 《心理学报》1987,20(3):94-100
项目反应理论(Item Resporse Theory IRT)被一些人认为是当代心理测量三大发展方向之一。在某些方面,它比经典测验理论(Classical TestTheory CTT)更为合理。本文利用1986年上海市师范类高等院校招生考试的理科数学试卷中选择题的实际考试资料,先用CTT计算有关题目指数,再用IRT对题目绘制题目特征曲线ICC(Item Charicteriotic Curve)并按Logstic模型对题目参数作出了估计,从而对题目分析结果作了对比研究。最后,本文对IRT与CTT在我国的运用、提出了若干看法。  相似文献   

6.
陈平  辛涛 《心理学报》2011,43(7):836-850
项目的增补对认知诊断计算机化自适应测验(CD-CAT)题库的开发与维护至关重要。借鉴单维项目反应理论(IRT)中联合极大似然估计方法(JMLE)的思路, 提出联合估计算法(JEA), 仅依赖被试在旧题和新题上的作答反应联合地、自动地估计新题的属性向量和新题的项目参数。研究结果表明:当项目参数相对较小且样本量相对较大时, JEA算法在新题属性向量和新题项目参数估计精度方面表现不错; 而且样本大小、项目参数大小以及项目参数初值都影响着JEA算法的表现。  相似文献   

7.
本文首先用马尔科夫链蒙特卡洛(MCMC)算法和EM算法进行IRT模型参数估计模拟实验,并探讨了两种算法的参数估计精度,然后在分析三参数Logistic(3PL)模型参数估计精度的基础上改进模型并对其进行参数估计。结果表明,MCMC算法估计IRT模型的参数精度均优于EM算法,并且MCMC算法在估计3PL模型参数方面具有更明显的优势;在样本量较小的情况下,MCMC算法能较好地估计3PL模型参数,估计精度略低于2PL模型;3PL模型的项目参数确定性低是参数估计精度略低于2PL模型的主要原因;采用改进模型可以提高项目参数的确定性,进而得到更优的参数估计精度。  相似文献   

8.
近二十年以来,考试理论(Testing Theories)的研究取得了长足进展,这种进展表现在两个方面一方面,在上个世纪六十年代由Lord提出的项目反应理论(Item Response Theory,IRT)得到了很大的扩展,出现了多维度项目反应理论(multi-dimensional IRT)、非参数项目反应理论(Nonparametric IRT)以及认知诊断理论(Cognitively Diagnostic Theory)等;另一方面,项目反应理论在考试实践中得到了广泛的应用,使考试实践产生了革命性的变化,出现了计算机自适应考试(Computerized Adaptive Testing,CAT).  相似文献   

9.
测验垂直等值是指将测试同一心理特质的不同水平的测验转换到同一个分数量尺上的过程。IRT与MIRT是实现垂直等值的主要方法。IRT无需假设被试的能力分布, 参数估计不依赖于样本, 是构建垂直量表的有效方法, 但测验不满足单维假设时其应用受到限制。MIRT结合IRT和因素分析的特点对IRT进行了拓展, 可更有效估计多维测验的项目参数和被试能力参数, 在垂直等值中有重要应用。已有研究主要探讨IRT和MIRT在垂直等值应用中的适用性、标定方法和参数估计方法, 比较研究两种方法的特性。未来研究应纳入更多变量条件进行比较研究, 拓展方法的应用。  相似文献   

10.
学习(潜能)和(行为水平)变化的多维Rasch模型(MRMLC)是一种常见的动态评估项目反应理论(IRT)模型。本文根据该模型的基本特征提出了一次性估计和分步估计两种能力参数估计方法。并且采用蒙特卡罗计算机模拟研究对这两种估计方法进行了比较。模拟研究结果表明,一次性估计法比分步估计法的准确性和稳定性要好。  相似文献   

11.
丁树良  罗芬  戴海琦  朱玮 《心理学报》2007,39(4):730-736
在IRT框架下,建立了0-1评分方式下单维双参数Logistic多题多做(MAMI)测验模型。与Spray给出的一题多做(MASI)模型相比,MAMI不仅模型更加精致,而且扩展了适用范围,参数估计方法也不同,采用EM算法求取项目参数。Monte Carlo模拟结果显示,应用MAMI测验模型与测验题量作相应增加的作法相比,两者给出的能力估计精度相同,但MAMI模型给出的项目参数估计精度更高。如果将MAMI测验模型与被试人数相应增加的作法相比,项目参数的估计精度相同,但MAMI给出的能力参数估计精度更高。这个发现表明,在一定条件下若允许修改答案,并采用累加式记分方式,纵使题量不变,也可使能力估计的精度相当于题量增加一倍的估计精度,而项目参数估计精度也会提高。这些发现不仅对技能评价和认知能力评价有参考价值,而且对数据的处理方式也有参考价值  相似文献   

12.
谭青蓉  汪大勋  罗芬  蔡艳  涂冬波 《心理学报》2021,53(11):1286-1300
项目增补(Item Replenishing)对认知诊断计算机自适应测验(CD-CAT)题库的维护有着至关重要的作用, 而在线标定是一种重要的项目增补方式。基于数据挖掘中特征选择(Feature Selection)的思路, 提出一种高效的基于熵的信息增益的在线标定方法(记为IGEOCM), 该方法利用被试在新旧题上的作答联合估计新题的Q矩阵和项目参数。研究采用Monte Carlo模拟实验验证所开发新方法的效果, 并同时与已有的在线标定方法SIE、SIE-R-BIC和RMSEA-N进行比较。结果表明:新开发的IGEOCM在各实验条件下均具有较好的项目标定精度和项目估计效率, 且整体上优于已有的SIE等方法; 同时, IGEOCM标定新题所需的时间低于SIE等方法。总之, 研究为CD-CAT题库中项目的增补提供了一种更为高效、准确的方法。  相似文献   

13.
设计项目参数、被试得分已知的测验情境,在两、三、四参数Logistic加权模型下进行能力估计,发现被试得分等级之间的能力步长存在着均匀的步长间距,被试得分能较好的反映多级记分的分数加权作用。两参数Logistic加权模型下会出现被试参数估计扰动现象,猜测现象会导致能力高估现象,失误现象会导致能力低估现象;三参数Logistic加权模型c型下能力高估现象未出现或不明显;三参数Logistic加权模型γ型下能力低估现象未出现或不明显;四参数Logistic加权模型下被试能力高估现象和低估现象都未出现或不明显,四参数Logistic加权模型是被试能力稳健性估计较好的方法。  相似文献   

14.
目前参数估计多采用统计方法,存在耗时长、要求被试样本容量大和项目数多等缺点。本文将BP神经网络和降维法相结合,对GRM的项目参数和考生能力参数进行估计。蒙特卡洛模拟结果显示:(1)不管是人多题少还是题多人少,该网络设计下的参数估计精度都较高;(2)可以应用到多个不同等级评分的参数估计中,甚至是超过15个等级的项目参数,估计精度也较高,这是其他参数估计方法所不可比拟的;(3)运行的时长和统计估计方法相比大大缩减。  相似文献   

15.
简小珠  戴步云  戴海琦 《心理学报》2016,48(12):1625-1630
试题难度、试题考查重要性程度加权是多级记分试题的两个基本属性, 因而在IRT项目特征函数中需用不同参数来表示。以往多级记分模型用多个难度参数来描述多级记分试题的难度, 不能有效的表达多级记分试题的分数权重作用。从多级记分试题的分数加权作用角度, 本文提出Logistic加权模型并论述了理论构建思想。在Logistic加权模型下对项目参数估计的EM算法进行推导并编写了相应的参数估计程序。在Logistic加权模型下进行测验模拟, 发现项目参数估计的模拟返真性能良好。  相似文献   

16.
This study investigates using response times (RTs) with item responses in a computerized adaptive test (CAT) setting to enhance item selection and ability estimation and control for differential speededness. Using van der Linden’s hierarchical framework, an extended procedure for joint estimation of ability and speed parameters for use in CAT is developed following van der Linden; this is called the joint expected a posteriori estimator (J-EAP). It is shown that the J-EAP estimate of ability and speededness outperforms the standard maximum likelihood estimator (MLE) of ability and speededness in terms of correlation, root mean square error, and bias. It is further shown that under the maximum information per time unit item selection method (MICT)—a method which uses estimates for ability and speededness directly—using the J-EAP further reduces average examinee time spent and variability in test times between examinees above the resulting gains of this selection algorithm with the MLE while maintaining estimation efficiency. Simulated test results are further corroborated with test parameters derived from a real data example.  相似文献   

17.
Information functions are used to find the optimum ability levels and maximum contributions to information for estimating item parameters in three commonly used logistic item response models. For the three and two parameter logistic models, examinees who contribute maximally to the estimation of item difficulty contribute little to the estimation of item discrimination. This suggests that in applications that depend heavily upon the veracity of individual item parameter estimates (e.g. adaptive testing or text construction), better item calibration results may be obtained (for fixed sample sizes) from examinee calibration samples in which ability is widely dispersed.This work was supported by Contract No. N00014-83-C-0457, project designation NR 150-520, from Cognitive Science Program, Cognitive and Neural Sciences Division, Office of Naval Research and Educational Testing Service through the Program Research Planning Council. Reproduction in whole or in part is permitted for any purpose of the United States Government. The author wishes to acknowledge the invaluable assistance of Maxine B. Kingston in carrying out this study, and to thank Charles Lewis for his many insightful comments on earlier drafts of this paper.  相似文献   

18.
Parameters of the two‐parameter logistic model are generally estimated via the expectation–maximization (EM) algorithm by the maximum‐likelihood (ML) method. In so doing, it is beneficial to estimate the common prior distribution of the latent ability from data. Full non‐parametric ML (FNPML) estimation allows estimation of the latent distribution with maximum flexibility, as the distribution is modelled non‐parametrically on a number of (freely moving) support points. It is generally assumed that EM estimation of the two‐parameter logistic model is not influenced by initial values, but studies on this topic are unavailable. Therefore, the present study investigates the sensitivity to initial values in FNPML estimation. In contrast to the common assumption, initial values are found to have notable influence: for a standard convergence criterion, item discrimination and difficulty parameter estimates as well as item characteristic curve (ICC) recovery were influenced by initial values. For more stringent criteria, item parameter estimates were mainly influenced by the initial latent distribution, whilst ICC recovery was unaffected. The reason for this might be a flat surface of the log‐likelihood function, which would necessitate setting a sufficiently tight convergence criterion for accurate recovery of item parameters.  相似文献   

19.
In real testing, examinees may manifest different types of test‐taking behaviours. In this paper we focus on two types that appear to be among the more frequently occurring behaviours – solution behaviour and rapid guessing behaviour. Rapid guessing usually happens in high‐stakes tests when there is insufficient time, and in low‐stakes tests when there is lack of effort. These two qualitatively different test‐taking behaviours, if ignored, will lead to violation of the local independence assumption and, as a result, yield biased item/person parameter estimation. We propose a mixture hierarchical model to account for differences among item responses and response time patterns arising from these two behaviours. The model is also able to identify the specific behaviour an examinee engages in when answering an item. A Monte Carlo expectation maximization algorithm is proposed for model calibration. A simulation study shows that the new model yields more accurate item and person parameter estimates than a non‐mixture model when the data indeed come from two types of behaviour. The model also fits real, high‐stakes test data better than a non‐mixture model, and therefore the new model can better identify the underlying test‐taking behaviour an examinee engages in on a certain item.  相似文献   

20.
Bayes modal estimation in item response models   总被引:1,自引:0,他引:1  
This article describes a Bayesian framework for estimation in item response models, with two-stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one-, two-, and three-parameter binary logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data.This research was supported by a grant from the Spencer Foundation, Chicago, IL. Comments and suggestions on earlier drafts by Charles Lewis, Frederic Lord, Rosenbaum, James Ramsey, Hiroshi Watanabe, the editor, and two anonymous referees are gratefully acknowledged.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号