1.

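Xiao Wei, Miao Danmin, Zhu Ningning, Zhang Qinghua. Journal of Psychological Science (《心理科学》) 2006, 29(2): 389-391

Data from the Combined Raven's Test, collected from 354 young men of varying ability levels, were analyzed with BILOG-MG 3.0 using marginal maximum likelihood estimation under the three-parameter logistic (3PL) model. The results show that most items of the Combined Raven's Test fit the 3PL model (6 items do not). The test information function peaks between -3 and -2 on the difficulty scale, with a peak value of 16.82. Eighteen items have information function peaks below 0.2. The discrimination parameters of all 72 items exceed 0.5, which is satisfactory. The difficulty parameters are low throughout: the vast majority are below 0, and the highest is only 1.01. Item difficulty is determined mainly by the level of operation required. The pseudo-guessing parameters range from 0.07 to 0.24. Taken together, the analysis indicates that the Combined Raven's Test measures the intelligence of normal young adults with poor precision.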

2.

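Selection and Application of Testlet Models in a High School English Reading Test

Ma Jie, Liu Hongyun. Journal of Psychological Science (《心理科学》) 2018, (6): 1374-1381

Using empirical data from a high school English reading test, this study compares the two-parameter logistic model (2PL-IRT) with two-parameter logistic testlet models (2PL-TRT) that include different numbers of testlets, in order to examine how the number of testlets affects parameter estimation and model fit. The results show that: (1) under the 2PL-IRT model, ability estimates are substantially biased for examinees whose ability lies between -1.50 and 0.50; (2) treating testlets whose testlet effect exceeds 0.50 as locally independent items leads to underestimation of the discrimination parameters of some items and overestimation of the difficulty parameters of most items; (3) the larger the testlet effect, the greater the bias in item parameter estimates when the testlet's items are treated as locally independent.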

3.

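Building an Item Bank for a Figural Reasoning Test Using Item Response Theory

Xiao Wei, Miao Danmin, Zhu Ningning, Zhang Qinghua. Acta Psychologica Sinica (《心理学报》) 2006, 38(6): 934-940

The authors wrote 235 figural reasoning items. Using an anchor-test equating design, with the 72 items of the Combined Raven's Test as anchor items, they tested 1,733 males at ability levels ranging from junior high school to university. The data were analyzed with BILOG-MG 3.0 (marginal maximum likelihood estimation) under the three-parameter logistic model. After removing items that fit the model poorly and items whose maximum information was below 0.3, an item bank of 181 items was established. The item bank can be used to screen out young enlistment applicants with low intelligence.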

4.

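Explanatory Item Response Theory Models: Theory and Application

Chen Guanyu, Chen Ping. Advances in Psychological Science (《心理科学进展》) 2019, 27(5): 937-950

Explanatory Item Response Theory Models (EIRTM) are item response theory (IRT) models built on generalized linear mixed models and nonlinear mixed models. EIRTM allow predictor variables to be added directly to an IRT model and can thereby address a variety of measurement problems. The article first introduces the relevant concepts and parameter estimation methods of EIRTM, then shows how EIRTM can be used to handle item position effects, test mode effects, differential item functioning, local person dependence, and local item dependence, next illustrates their use with a worked example, and finally discusses the limitations and application prospects of EIRTM.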

5.

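Item Information in the Graded Response Model of Item Response Theory (Total citations: 7; self-citations: 1; citations by others: 6)

Luo Zhaosheng, Ouyang Xuelian, Qi Shuqing, Dai Haiqi, Ding Shuliang. Acta Psychologica Sinica (《心理学报》) 2008, 40(11): 1212-1220

The information function is an important concept in item response theory and is highly valuable for item and test analysis and for guiding test construction. Its use is especially central in computerized adaptive testing, where it has received the most attention. However, research on the properties of the information function for polytomously scored items remains scarce. This study simulated examinee trait-level parameters and item parameters: 121 trait-level points were generated for the examinees, and four batches of item parameters with different discrimination values were generated, each batch containing 126 items with different combinations of difficulty-level parameters and each item having 5 difficulty levels. The analysis shows that the trait level at which a graded response model item provides maximum information corresponds to a group of several adjacent difficulty levels of that item, rather than to a single difficulty level, and not necessarily to all difficulty levels. The authors call this regularity "dominance of adjacent difficulty levels". The finding offers important guidance for test quality analysis and test construction, including the construction of computerized adaptive tests.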
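The item information function discussed in this abstract can be sketched as follows, assuming Samejima's graded response model in its usual logistic form. The function name `grm_item_info`, the discrimination value, the five threshold values, and the grid from -3 to 3 in steps of 0.05 (which happens to give 121 points) are all hypothetical choices for illustration and are not the study's simulation design.

```python
import math


def grm_item_info(theta, a, thresholds):
    """Item information for a graded response model item.

    thresholds: strictly increasing difficulty-level parameters b_1 < ... < b_m.
    Cumulative probabilities P*_k = logistic(a * (theta - b_k)) are padded
    with 1 (lowest boundary) and 0 (highest boundary).
    """
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]

    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]  # probability of scoring in category k
        # Derivative of the category probability with respect to theta.
        dp_k = a * (cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1]))
        if p_k > 0.0:
            info += dp_k ** 2 / p_k
    return info


# Hypothetical item: three closely spaced thresholds and two isolated ones.
a = 1.5
thresholds = [-2.0, -1.5, -1.0, 1.0, 2.5]

# Assumed trait grid: 121 equally spaced points from -3 to 3.
thetas = [-3.0 + 0.05 * i for i in range(121)]
peak_theta = max(thetas, key=lambda t: grm_item_info(t, a, thresholds))

print("number of trait points:", len(thetas))
print("trait level with maximum information:", peak_theta)
```

Changing the spacing of the thresholds and re-running the grid search is a simple way to explore how the location of maximum information relates to groups of neighboring difficulty levels.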

6.

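An ERP Study of Figure Item Memory and Location Source Retrieval (Total citations: 1; self-citations: 0; citations by others: 1)

Nie Aiqing, Guo Chunyan, Shen Mowei. Acta Psychologica Sinica (《心理学报》) 2007, 39(1): 50-57

Event-related potentials (ERPs) were used to study the temporal and spatial distribution of old/new effects in figure item memory and location source retrieval in university students. After studying figures presented on the left (or right) side of the screen, participants saw a test item (a studied figure or a new figure) at the center of the screen and completed two kinds of test: one required judging whether the item had been studied; the other was a source test (exclusion task) in which test items studied on a given side were judged as targets and all other test items as non-targets. The results show that the old/new effect for location source retrieval has a wider scalp distribution and a longer duration than that for item recognition; that, compared with previous studies using the same paradigm, the location source retrieval effect in this study has a wider scalp distribution; and that the old/new effects for non-target old figures and for target figures differ in the degree of scalp activation. These results indicate that location source retrieval activates more brain regions than item recognition, which is consistent with the dual-process model; that the experimental paradigm and the perceptual characteristics of the source jointly modulate the temporal and spatial distribution of old/new effects in source memory; and that the level of consciousness has some influence on the old/new effects of source information retrieval.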

7.

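Estimating Item Parameters of Multidimensional Tests: A Comparison of SEM and MIRT Approaches

Liu Hongyun, Luo Fang, Wang Yue, Zhang Yu. Acta Psychologica Sinica (《心理学报》) 2012, 44(1): 121-132

The authors briefly review the factor analysis model for categorical data (CCFA) within the SEM framework and the models relating test items to latent ability within the MIRT framework, and summarize the main parameter estimation methods under each framework. A simulation study compares the WLSc and WLSMV estimators under the SEM framework with the MLR and MCMC estimators under the MIRT framework. The results show that: (1) WLSc yields the largest bias in parameter estimates and has convergence problems; (2) as sample size increases, the precision of all item parameter estimates improves, WLSMV and MLR differ very little in precision, and in most cases they are no worse than MCMC; (3) except for WLSc, precision increases as the number of items per dimension increases; (4) the number of test dimensions has a relatively large effect on discrimination and difficulty parameters and a relatively small effect on item factor loadings and thresholds; (5) the precision of item parameter estimates is affected by the number of dimensions an item measures, and items that measure only one dimension are estimated more precisely. The article also offers some suggestions on issues to consider when applying the two approaches in practice.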

8.

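Test Mode Effect: Sources, Detection, and Applications

Chen Ping, Dai Yi, Huang Yingshi. Advances in Psychological Science (《心理科学进展》) 2023, (10): 1966-1980

Test Mode Effect (TME) refers to the difference in test functioning that arises when the same test is administered in different modes. TME can affect test fairness, selection criteria, and test equating, so detecting it accurately and interpreting it properly is important. The article systematically reviews the sources of TME, its detection (including experimental designs and detection methods), and research findings, giving a comprehensive picture of the methodology of TME research. Further interpreting TME models, extending the test modes considered in TME research, and applying TME findings to large-scale educational assessment programs in China are all important future directions for the field.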

9.

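The Influence of Test Paradigm on Location Source Retrieval: An Event-Related Potential Study

Nie Aiqing, Guo Chunyan, Shen Mowei. Acta Psychologica Sinica (《心理学报》) 2011, 43(5): 473-482

Using event-related potentials, two experiments examined how different test paradigms (the exclusion paradigm and the three-key paradigm) affect the scalp distribution associated with retrieval of figure location source, in order to test whether the earlier conclusion that test paradigm has little effect on the ERPs of color source retrieval also holds for other source types. The results show that, relative to item recognition, the positive-going old/new effect associated with location source retrieval in Experiment 1 has a wider scalp distribution, whereas Experiment 2 yields a late negative-going old/new effect reflecting location source retrieval. The spatiotemporal scalp distribution of old/new effects associated with location source retrieval therefore differs between the two test paradigms. These results indicate that, unlike in color source studies, the test paradigm clearly modulates the neural mechanisms associated with location source retrieval.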

10.

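Item Parameter Drift: Conceptual Clarification and Related Research

Advances in Psychological Science (《心理科学进展》) 2015, (10)

Item Parameter Drift (IPD) refers to changes in anchor item parameter values across successive testing occasions or test levels. The concept differs substantively from Differential Item Functioning (DIF). Current IPD research covers five aspects: the actual existence of IPD, its causes, detection methods, its effects on linking results, and strategies for handling extreme anchor items. Cross-sectional IPD research needs further and more comprehensive study, in particular on the contextual applicability of DIF detection methods to IPD detection, the development of detection methods specific to IPD, and the development of modified linking procedures. Longitudinal IPD research requires systematic, in-depth exploration.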

11.

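Application of IRT and MIRT in Vertical Equating of Tests

Wang Yi, Tang Wenqing, Liu Jing, Zhang Minqiang, Li Ming, Li Guangming. Advances in Psychological Science (《心理科学进展》) 2014, 22(5): 881-888

Vertical equating is the process of placing tests that measure the same psychological trait at different levels onto a common score scale. IRT and MIRT are the main methods for vertical equating. IRT does not require an assumption about the examinee ability distribution, and its parameter estimates do not depend on the sample, which makes it an effective method for constructing vertical scales; however, its application is limited when a test violates the unidimensionality assumption. MIRT extends IRT by combining features of IRT and factor analysis, can estimate the item and ability parameters of multidimensional tests more effectively, and has important applications in vertical equating. Existing research mainly examines the applicability, calibration methods, and parameter estimation methods of IRT and MIRT in vertical equating, and compares the properties of the two methods. Future research should include more variable conditions in comparative studies and extend the application of these methods.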

12.

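Application of the Item Characteristic Curve Equating Method under the Graded Response Model in a Large-Scale Examination (Total citations: 2; self-citations: 1; citations by others: 1)

Zhou Jun, Ou Dongming, Xu Shuyuan, Dai Haiqi, Qi Shuqing. Acta Psychologica Sinica (《心理学报》) 2005, 37(6): 832-838

In the Economics Professional Qualification Examination, one of the largest qualification examinations in China, the item characteristic curve equating method under the graded response model of item response theory was applied with an anchor-test equating design, in order to ensure comparability across years, to build an item bank, and to prepare for computerized adaptive testing. Item and ability parameters from four years of examination data were equated, and an item bank for the economics profession was successfully assembled. On this basis, equating was used to compare the cut scores of examination papers from different years, providing empirical evidence for setting the passing standard and ensuring the fairness of the examination.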

13.

Some standard errors in item response theory (Total citations: 2; self-citations: 0; citations by others: 2)

David Thissen, Howard Wainer. Psychometrika 1982, 47(4): 397-412

The mathematics required to calculate the asymptotic standard errors of the parameters of three commonly used logistic item response models is described and used to generate values for some common situations. It is shown that the maximum likelihood estimation of a lower asymptote can wreak havoc with the accuracy of estimation of a location parameter, indicating that if one needs to have accurate estimates of location parameters (say for purposes of test linking/equating or computerized adaptive testing) the sample sizes required for acceptable accuracy may be unattainable in most applications. It is suggested that other estimation methods be used if the three parameter model is applied in these situations. The research reported here was supported, in part, by contract #F41689-81-6-0012 from the Air Force Human Resources Laboratory to McFann-Gray & Associates, Benjamin A. Fairbank, Jr., Principal Investigator. Further support of Wainer's effort was supplied by the Educational Testing Service, Program Statistics Research Project.

14.

Specifying optimum examinees for item parameter estimation in item response theory

Martha L. Stocking. Psychometrika 1990, 55(3): 461-475

Information functions are used to find the optimum ability levels and maximum contributions to information for estimating item parameters in three commonly used logistic item response models. For the three and two parameter logistic models, examinees who contribute maximally to the estimation of item difficulty contribute little to the estimation of item discrimination. This suggests that in applications that depend heavily upon the veracity of individual item parameter estimates (e.g. adaptive testing or test construction), better item calibration results may be obtained (for fixed sample sizes) from examinee calibration samples in which ability is widely dispersed. This work was supported by Contract No. N00014-83-C-0457, project designation NR 150-520, from Cognitive Science Program, Cognitive and Neural Sciences Division, Office of Naval Research and Educational Testing Service through the Program Research Planning Council. Reproduction in whole or in part is permitted for any purpose of the United States Government. The author wishes to acknowledge the invaluable assistance of Maxine B. Kingston in carrying out this study, and to thank Charles Lewis for his many insightful comments on earlier drafts of this paper.
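The trade-off described in this abstract can be illustrated with a small sketch for the two-parameter logistic model only. It assumes the standard per-examinee Fisher information for each item parameter, namely the squared derivative of the response probability divided by P(1 - P), and looks only at the diagonal terms for discrimination and difficulty. The function names and the item values are hypothetical and are not taken from the paper.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def examinee_contributions(theta, a, b):
    """Information one examinee at theta contributes to estimating a and b.

    Uses dP/da = (theta - b) * P * Q and dP/db = -a * P * Q, so the
    diagonal Fisher information terms are (theta - b)^2 * P * Q for a
    and a^2 * P * Q for b.
    """
    p = p_2pl(theta, a, b)
    w = p * (1.0 - p)
    info_a = (theta - b) ** 2 * w
    info_b = a ** 2 * w
    return info_a, info_b


# Hypothetical item.
a, b = 1.2, 0.0

grid = [x / 10.0 for x in range(-40, 41)]
best_for_a = max(grid, key=lambda t: examinee_contributions(t, a, b)[0])
best_for_b = max(grid, key=lambda t: examinee_contributions(t, a, b)[1])

print("ability most informative about discrimination:", best_for_a)
print("ability most informative about difficulty:", best_for_b)
print("contributions at theta = b:", examinee_contributions(b, a, b))
```

In this sketch an examinee located exactly at the item difficulty contributes the most to estimating b and nothing to estimating a, while examinees well away from b are the most informative about a, which is why a widely dispersed calibration sample helps.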

15.

Consequences of Ignoring Guessing Effects on Measurement Invariance Analysis

Ismail Cuhadar, Yanyun Yang, Insu Paek. Applied Psychological Measurement (《应用心理检测》) 2021, 45(4): 283

Pseudo-guessing parameters are present in item response theory applications for many educational assessments. When sample size is not sufficiently large, the guessing parameters may be ignored from the analysis. This study examines the impact of ignoring pseudo-guessing parameters on measurement invariance analysis, specifically, on item difficulty, item discrimination, and mean and variance of ability distribution. Results show that when non-zero guessing parameters are ignored from the measurement invariance analysis, item discrimination estimates tend to decrease particularly for more difficult items, and item difficulty estimates decrease unless the items are highly discriminating and difficult. As the guessing parameter increases, the size of the decrease in item discrimination and difficulty tends to increase, and the estimated mean and variance of ability distribution tend to be inaccurate. When two groups have heterogeneous ability distributions, ignoring the guessing parameter affects the reference group and the focal group differently. Implications of result findings are discussed.

16.

Evaluation on types of invariance in studying extreme response bias with an IRTree approach

Minjeong Jeon, Paul De Boeck. The British Journal of Mathematical and Statistical Psychology 2019, 72(3): 517-537

In recent years, item response tree (IRTree) approaches have received increasing attention in the response style literature for their ability to partial out response style latent variables as well as associated item parameters. When an IRTree approach is adopted to measure extreme response styles, directional and content invariance could be assumed at the latent variable and item parameter levels. In this study, we propose to evaluate the empirical validity of these invariance assumptions by employing a general IRTree model with relaxed invariance assumptions. This would allow us to examine extreme response biases, beyond extreme response styles. With three empirical applications of the proposed evaluation, we find that relaxing some of the invariance assumptions improves the model fit, which suggests that not all assumed invariances are empirically supported. Specifically, at the latent variable level, we find reasonable evidence for directional invariance but mixed evidence for content invariance, although we also find that estimated correlations between content-specific extreme response latent variables are high, hinting at the potential presence of a general extreme response tendency. At the item parameter level, we find no directional or content invariance for thresholds and no content invariance for slopes. We discuss how the variant item parameter estimates obtained from a general IRTree model can offer useful insight to help us understand response bias related to extreme responding measured within the IRTree framework.

17.

Unidimensional Item Factor Analysis: A Comparison of CCFA and IRT Estimation Methods

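Liu Hongyun, Li Meijuan, Luo Fang, Li Xiaoshan. Journal of Psychological Science (《心理科学》) 2012, 35(2): 441-445

When the observed indicator variables are dichotomous, traditional factor analysis methods no longer apply. The authors briefly review the factor analysis model for categorical data within the SEM framework and the models relating test items to latent ability within the IRT framework, and summarize the main parameter estimation methods under each framework. Two simulation studies compare the GLSc and MGLSc estimators under the SEM framework with the MML/EM estimator under the IRT framework. The results show that: (1) of the three methods, GLSc yields the largest bias in parameter estimates, while MGLSc and MML/EM differ little; (2) the precision of all item parameter estimates improves as sample size increases; (3) the precision of item factor loading and difficulty estimates is affected by test length; (4) the precision of item factor loading and discrimination estimates is affected by the magnitude of the population factor loadings (discrimination); (5) the distribution of thresholds across test items affects estimation precision, with item discrimination affected most; (6) overall, item parameter estimates under the SEM framework are more precise than those under the IRT framework. The article also offers some suggestions on issues to consider when applying the two approaches in practice.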

18.

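A Review of the Development and Application of Multilevel IRT

Liu Hui, Jian Xiaozhu, Zhang Minqiang, Xiong Yuexin. Advances in Psychological Science (《心理科学进展》) 2012, 20(4): 627-632

Hierarchical linear modeling is an advanced statistical method for data with a hierarchical structure, and item response theory is a modern measurement theory for measuring examinee ability precisely. Multilevel item response theory combines the two by nesting an item response model within a hierarchical linear model, which allows item parameters and ability parameters at different levels to be estimated and yields more precise estimates of regression coefficients and error variance. The authors outline the development of multilevel item response theory, review its applications in psychological and educational measurement, including differential item functioning, test equating, and school effectiveness research, summarize its value, and discuss future research trends.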

19.

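Estimating Item Parameters and Examinee Ability in a Continuous-Response IRT Model Based on Connectionism (Total citations: 5; self-citations: 0; citations by others: 5)

Yu Jiayuan. Acta Psychologica Sinica (《心理学报》) 2002, 34(5): 80-86

The cascade-correlation model from connectionism was used to estimate the item parameters and examinee abilities of a continuous-response item response theory (IRT) model under small-sample conditions. The response matrix of a group of examinees to a group of items served as the input of the cascade-correlation model, and the abilities θ of those examinees or the parameters a, b, and c of those items served as its output; the neural network was trained so that it could estimate θ, a, b, or c. Computer simulation experiments show that if a small number of items in the test are drawn from an item bank, the connectionist method can estimate IRT parameters and examinee abilities well.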

20.

A semi-parametric within-subject mixture approach to the analyses of responses and response times

Dylan Molenaar, Maria Bolsinova, Jeroen K. Vermunt. The British Journal of Mathematical and Statistical Psychology 2018, 71(2): 205-228

In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to hardly produce false positives and parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach.