1.

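程小扬, 丁树良 《心理科学》2011,34(4):965-969

Abstract: In computerized adaptive testing (CAT), a-stratified item selection is an efficient and secure strategy for dichotomous (0-1) scoring models, but it cannot be applied directly to polytomous scoring models, because each item has several difficulty/step parameters. The information function integrates all item parameters and the ability parameter well, but the maximum-information selection strategy compromises test security. This paper proposes a variable-weighting item selection strategy built on a function tied to item information: the function is proportional to the information and inversely proportional to a power of the discrimination parameter, so that it both integrates all item parameters and achieves the effect of a-stratification. A Monte Carlo comparison under the GPCM shows that the new strategy performs better overall than previously reported results.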
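The abstract does not give the weighting function itself, so the following is only a minimal sketch of the idea it describes: item information under the GPCM divided by a power of the discrimination parameter. The function names, the item bank layout `(a, steps)`, the scaling constant D = 1, and the way `power` is supplied are all assumptions made for illustration, not the authors' specification.

```python
import math

def gpcm_probs(theta, a, steps):
    """Category probabilities P_0..P_m under the GPCM (scaling constant D = 1)."""
    logits = [0.0]  # category 0 has an empty sum in the exponent
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def gpcm_information(theta, a, steps):
    """Item information: a^2 times the variance of the category score."""
    probs = gpcm_probs(theta, a, steps)
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return a * a * (second - mean * mean)

def weighted_index(theta, a, steps, power):
    """Hypothetical index: information divided by a power of discrimination."""
    return gpcm_information(theta, a, steps) / (a ** power)

def select_item(theta, bank, used, power):
    """Pick the unused item (a, steps) with the largest weighted index."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: weighted_index(theta, *bank[i], power))
```

With `power = 0` the index reduces to plain maximum-information selection. Larger powers penalize highly discriminating items, which is one way to hold them back early in a test, as a-stratification does.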

2.

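Item Selection Strategies for Computerized Adaptive Testing under the Graded Response Model  Total citations: 7 (self-citations: 3, citations by others: 4)

陈平, 丁树良, 林海菁, 周婕 《心理学报》2006,38(3):461-467

Item selection strategies in computerized adaptive testing (CAT) have long attracted attention from researchers in China and abroad, yet selection strategies for polytomously scored CAT have rarely been reported. This study uses computer simulation to examine four item selection strategies for CAT under the Graded Response Model. The results show that the strategy matching the category difficulty values to the current ability estimate receives the highest overall evaluation; that adding a "shadow item bank" to the selection strategy markedly improves the evenness of item usage; and that different item parameter distributions and different ability estimation methods both affect the CAT evaluation indices.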
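As a rough illustration of difficulty matching under the Graded Response Model, the sketch below computes category probabilities and picks the unused item whose mean threshold lies closest to the current ability estimate. Using the mean of the thresholds as the matching target is an assumption, since the abstract does not say how the category difficulties are combined. The names `grm_probs` and `match_item` are likewise hypothetical.

```python
import math

def grm_probs(theta, a, thresholds):
    """Category probabilities under the GRM; thresholds must be ascending."""
    # Cumulative probabilities P*(score >= k), padded with 1 and 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

def match_item(theta, bank, used):
    """Pick the unused item (a, thresholds) whose mean threshold is nearest theta."""
    def distance(i):
        _, thresholds = bank[i]
        return abs(sum(thresholds) / len(thresholds) - theta)
    candidates = [i for i in range(len(bank)) if i not in used]
    return min(candidates, key=distance)
```

Because this rule ignores discrimination, it tends to spread item usage more evenly than maximum-information selection does.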

3.

Item Selection Strategies and Initial Item Selection Methods for Cognitive Diagnostic CAT

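涂冬波, 蔡艳, 戴海琦 《心理科学》2013,36(2):469-474

Cognitive diagnostic computerized adaptive testing (CD_CAT) combines the basic theory and methods of cognitive diagnosis with computerized adaptive testing and is a new area in modern measurement. Item selection strategies in computerized adaptive testing (CAT) have long attracted attention from researchers in China and abroad, but selection strategies for CD_CAT have rarely been reported, and methods for choosing the initial items even less so. This study uses computer simulation to examine five item selection strategies and two initial item selection methods for CD_CAT under the HO-DINA model. The results show that both the initial item selection method and the item selection strategy affect the accuracy of examinee diagnosis and the precision of ability estimation. Overall, of the two initial item selection methods, the "T-matrix method" proposed in this study outperforms the traditional random method; of the five item selection strategies, SL_GDI performs best; and among the combinations, the "T-matrix method" paired with SL_GDI is the best.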

4.

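A Comparison of Item Selection Strategies for Computerized Adaptive Testing Based on the GPCM  Total citations: 1 (self-citations: 0, citations by others: 1)

刘珍, 丁树良, 林海菁 《心理学报》2008,40(5):618-625

Item selection strategy is an important topic in Computerized Adaptive Testing (CAT) research; its quality bears directly on the reliability, validity, and security of a test. Much CAT research and application rests on the dichotomous (0-1) scoring model, and selection strategies for polytomously scored CAT have rarely been reported. Although GRM-based CAT research has been carried out in China, no research on GPCM-based CAT has yet been reported. Using computer simulation, this paper compares four item selection strategies for CAT under the Generalized Partial Credit Model (GPCM) across a variety of conditions. The results show that when examinee ability is normally distributed, the performance of a selection strategy depends strongly on the distribution of the item step parameters: (1) when the item step parameters all follow a normal distribution, the strategy that matches ability to the item step parameters performs best; (2) when the item step parameters all follow a uniform distribution, the strategy that matches ability to the mean of the item step parameters performs best.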

5.

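A CD-CAT Item Selection Method That Accommodates Strategy, Cognitive State, and Ability

戴步云, 张敏强, 黎光明, 汪新光, 胡姗 《心理科学》2018,(2):459-465

When a CD-CAT must simultaneously diagnose an examinee's problem-solving strategy and cognitive state and assess the examinee's overall ability, item selection has to serve all three measurement goals. The multiple-strategy Shannon entropy (MSSHE) index is combined with Fisher information in two different ways, yielding a DWI index for multiple-strategy settings (MSDWI) and a two-step method that uses MSSHE first and Fisher information afterwards. Based on the multiple-strategy RRUM (MS-RRUM), these two methods are compared with random item selection in simulations with different numbers of attributes. The results show that with 4 or 6 attributes, the two-step method performs best on all three criteria: strategy classification accuracy, cognitive state classification accuracy, and ability estimation.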

6.

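Development of a Multiple-Strategy RRUM and Its Item Selection Method for CD-CAT

戴步云, 张敏强, 焦璨, 黎光明, 朱华伟, 张文怡 《心理学报》2015,47(12):1511-1519

In cognitive diagnosis settings with multiple problem-solving strategies, each Q-matrix is used to represent one strategy, which extends the single-strategy RRUM to a multiple-strategy RRUM (MS-RRUM). For CD-CAT using MS-RRUM, a MAP parameter estimation method and a multiple-strategy Shannon entropy (MSSHE) item selection method suited to multiple-strategy settings were then developed. MSSHE was compared with random item selection across different numbers of attributes and different test lengths. MSSHE yielded significantly higher classification accuracy than random selection for both examinees' strategies and cognitive states, and both accuracy rates were satisfactory. Strategy diagnosis in CD-CAT was thereby achieved.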
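The abstract does not spell out the MSSHE index, so the sketch below shows only the ordinary single-strategy Shannon entropy rule it builds on: choose the item that minimizes the expected entropy of the posterior over latent classes. The representation of an item as a list of success probabilities, one per latent class, and the function names are assumptions. In a multiple-strategy setting the latent classes would presumably be strategy and attribute-pattern pairs, but that extension is not shown here.

```python
import math

def entropy(dist):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def expected_entropy(posterior, p_correct):
    """Expected posterior entropy after observing a 0/1 response to one item.

    posterior[c] is the current probability of latent class c;
    p_correct[c] is P(correct response | class c) for the candidate item.
    """
    total = 0.0
    for x in (0, 1):
        joint = [pi * (pc if x == 1 else 1.0 - pc)
                 for pi, pc in zip(posterior, p_correct)]
        marginal = sum(joint)  # P(response = x)
        if marginal > 0:
            total += marginal * entropy([j / marginal for j in joint])
    return total

def select_she(posterior, bank, used):
    """Pick the unused item with the smallest expected posterior entropy."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return min(candidates, key=lambda i: expected_entropy(posterior, bank[i]))
```

An item that every class answers with the same probability leaves the posterior unchanged, so its expected entropy equals the current entropy and it is never preferred over an item that separates the classes.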

7.

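Dual-Objective CD-CAT Item Selection Strategies Based on Item Discrimination

何洁, 毛秀珍, 唐倩, 王霞 《心理科学》2022,(1):204-212

For dual-objective CD-CAT, six item discrimination indices (discriminating power D, general discrimination index GDI, odds ratio OR, the 2PL discrimination a, attribute discrimination index ADI, and cognitive diagnostic discrimination index CDI) are each combined with the IPA method to obtain new item selection strategies. A simulation study compares their performance and also examines how well discrimination-based stratification controls item exposure. The results show that all the new methods clearly improve the classification accuracy of knowledge states and the precision of ability estimation, and that stratified selection substantially improves item bank utilization in every case. Overall, OR weighting significantly improves measurement precision, and OR-stratified selection significantly improves the evenness of item exposure while maintaining measurement precision.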

8.

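Item Selection Strategies for Computerized Adaptive Testing with an Exposure Factor

程小扬, 丁树良, 严深海, 朱隆尹 《心理学报》2011,43(2):203-212

In computerized adaptive testing (CAT) research, devising item selection strategies that are both efficient and secure is a standing goal. Selecting items by the maximum item information criterion (MIC) gives high test efficiency and accurate ability estimates, but items are used very unevenly, which compromises test security. The a-stratified method (a-STR) improves test security by controlling item exposure rates, but it may slightly reduce test efficiency, and it cannot adjust for discrimination within each stratum. In view of the strengths and weaknesses of these two strategies, this paper improves MIC and a-STR for dichotomously (0-1) scored CAT by introducing an exposure factor, automatically adjusting the influence of discrimination in stages, and improving selection accuracy, which yields two new classes of item selection strategies. Computer simulations show that the new selection methods perform quite well.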
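For orientation, here is a minimal sketch of the two baseline strategies the abstract contrasts, assuming a 2PL model and an item bank stored as `(a, b)` pairs. It does not implement the paper's exposure factor or its staged adjustment of discrimination, which the abstract does not specify. All function names are illustrative.

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta (D = 1)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_mic(theta, bank, used):
    """Maximum information criterion: most informative unused item."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: info_2pl(theta, *bank[i]))

def build_strata(bank, n_strata):
    """Sort item indices by discrimination a and cut them into strata."""
    order = sorted(range(len(bank)), key=lambda i: bank[i][0])
    size = math.ceil(len(order) / n_strata)
    return [order[k:k + size] for k in range(0, len(order), size)]

def select_a_str(theta, bank, used, strata, stage):
    """a-stratified selection: within the current stratum, match b to theta."""
    candidates = [i for i in strata[stage] if i not in used]
    return min(candidates, key=lambda i: abs(bank[i][1] - theta))
```

MIC keeps returning the same high-discrimination items whenever ability estimates are similar, whereas a-STR restricts early stages to low-discrimination strata, which is the efficiency and security trade-off the abstract describes.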

9.

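An Overview of the Classification of CAT Item Selection Strategies

简小珠, 戴海琦, 张敏强, 彭春妹 《心理学探新》2014,34(5):446-451

Item selection is the key step in the computerized adaptive testing (CAT) process. A selection strategy aims to reach high measurement precision while also controlling item exposure rates and meeting other test goals. Based on their underlying principles and subsequent development, this paper groups the many CAT item selection strategies into five families: the Fisher function family, the K-L information (K-LI) function family, the α-stratified family, the Bayesian family, and the b-matching family. It further subdivides these strategies by test goal (measurement precision, item exposure control, content balancing, and multiple constraints) and summarizes how to choose among CAT item selection strategies.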

10.

CD-CAT Item Selection Strategies That Balance Test Efficiency and Item Bank Usage

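汪文义, 丁树良, 宋丽红 《心理科学》2014,37(1):212-216

Existing item selection strategies in CD-CAT emphasize test efficiency and pay too little attention to item bank usage. To address this, two new selection strategies, KLED and RHA, are introduced under the DINA model, and HA is also examined by simulation. The results show that PWKL and KLED have an advantage only in test efficiency; that stratifying KLED by attribute vector improves item bank usage somewhat, and that KLED generalizes more readily than ED to other diagnostic models with explicit expressions; and that HA, RHA, and RP-PWKL balance test efficiency and item bank usage well, although RP-PWKL requires setting a maximum exposure rate threshold for items. The two new selection methods have some practical value in both fixed-length and variable-length CD-CAT.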
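To make the PWKL baseline mentioned above concrete, the sketch below computes the DINA response probability and the posterior-weighted Kullback-Leibler index for a candidate item. The item layout `(q, slip, guess)` and the function names are assumptions, and slip and guess must lie strictly between 0 and 1. The KLED, HA, RHA, and RP-PWKL strategies are not shown, because the abstract does not define them.

```python
import math

def dina_prob(alpha, q, slip, guess):
    """P(correct) under DINA: 1 - slip if all required attributes are mastered."""
    mastered = all(a >= k for a, k in zip(alpha, q))
    return 1.0 - slip if mastered else guess

def pwkl(item, alpha_hat, patterns, posterior):
    """Posterior-weighted KL index of one item at the current estimate alpha_hat."""
    q, slip, guess = item
    p_hat = dina_prob(alpha_hat, q, slip, guess)
    total = 0.0
    for alpha, weight in zip(patterns, posterior):
        p = dina_prob(alpha, q, slip, guess)
        # KL divergence between the response distributions at alpha_hat and alpha.
        kl = (p_hat * math.log(p_hat / p)
              + (1.0 - p_hat) * math.log((1.0 - p_hat) / (1.0 - p)))
        total += weight * kl
    return total

def select_pwkl(bank, used, alpha_hat, patterns, posterior):
    """Pick the unused item with the largest PWKL index."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates,
               key=lambda i: pwkl(bank[i], alpha_hat, patterns, posterior))
```

Because the index depends only on how well an item separates the current estimate from competing patterns, the same few items tend to win repeatedly, which is consistent with the abstract's remark that PWKL is strong only on efficiency.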

11.

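A Dynamic Comprehensive Item Selection Strategy for Polytomously Scored Computerized Adaptive Testing  Total citations: 1 (self-citations: 0, citations by others: 1)

罗芬, 丁树良, 王晓庆 《心理学报》2012,44(3):400-412

Polytomous scoring provides more information about examinees and is one direction in the development of computerized adaptive testing, in which item selection strategy is a central research topic. For the polytomous Graded Response Model, this paper uses the idea of interval estimation to improve several recently proposed item selection strategies, and it extends the dichotomous b-STR and a-STR to polytomous scoring in order to improve the maximum-information selection strategy. Monte Carlo simulations show that, while matching or approaching the measurement precision of the original strategies, some of the proposed strategies effectively reduce test length and others greatly reduce item exposure rates.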

12.

Consequences of Ignoring Guessing Effects on Measurement Invariance Analysis

Ismail Cuhadar, Yanyun Yang, Insu Paek 《Applied Psychological Measurement》2021,45(4):283

Pseudo-guessing parameters are present in item response theory applications for many educational assessments. When the sample size is not sufficiently large, the guessing parameters may be omitted from the analysis. This study examines the impact of ignoring pseudo-guessing parameters on measurement invariance analysis, specifically on item difficulty, item discrimination, and the mean and variance of the ability distribution. Results show that when non-zero guessing parameters are ignored in the measurement invariance analysis, item discrimination estimates tend to decrease, particularly for more difficult items, and item difficulty estimates decrease unless the items are highly discriminating and difficult. As the guessing parameter increases, the size of the decrease in item discrimination and difficulty tends to increase, and the estimated mean and variance of the ability distribution tend to be inaccurate. When two groups have heterogeneous ability distributions, ignoring the guessing parameter affects the reference group and the focal group differently. Implications of the findings are discussed.
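For readers unfamiliar with the pseudo-guessing parameter, the sketch below writes out the standard three-parameter logistic (3PL) item response function with D = 1. Setting `c = 0` gives the 2PL model that results when guessing is ignored. The function name and the default argument are illustrative choices, not taken from the study.

```python
import math

def prob_3pl(theta, a, b, c=0.0):
    """P(correct) under the 3PL model; c is the pseudo-guessing lower asymptote."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

With `c > 0`, even very low-ability examinees succeed with probability close to `c`, so a model fitted with `c = 0` has to absorb that floor through its discrimination and difficulty estimates.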

13.

Adaptive reasoning: Componential and eye movement analysis of geometric analogy performance

Charles E. Bethell-Fox, David F. Lohman, Richard E. Snow 《Intelligence》1984,8(3):205-238

The present study explored individual differences in performance of a geometric analogies task. Whereas past studies employed true/false or two-alternative items, the present research included four-alternative items and studied eye movements and confidence judgements for each item performance as well as latency and error. Item difficulty proved to be a function of an interaction between the number of response alternatives and the number of elements in items, especially for subjects lower in fluid-analytic reasoning ability. Results were interpreted using two hypothesized performance strategies: constructive matching and response elimination. The less efficient of these, response elimination, seemed to be used more by lower ability subjects on more difficult items. While two previous theories resemble one or the other of these strategies, neither alone seems to capture the complexity of adaptive problem solving. It appears that a comprehensive theory should incorporate strategy shifting as a function of item difficulty and subject ability. Componential models, based in part on past research, revealed that a justification component was activated and deactivated depending upon the nature of the analogy being solved. In addition, two new components, spatial inference and spatial application, were identified as important on some items, suggesting that different geometric analogy items invoke different cognitive processing components. Thus, a comprehensive theory should also describe component activation and deactivation.

14.

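Effects of Item Difficulty and Point Value on Self-Paced Study Time

李伟健, 蔡任娜, 陈海德, 汪磊, 王敏敏 《心理科学》2013,36(6):1363-1368

This study examined the effects of item difficulty and point value on self-paced study time and the mechanism underlying study time allocation. Experiments 1a and 1b tested the effects of item difficulty and point value, respectively, on self-paced study time and found that learners tended to allocate more study time to difficult or high-value items. Experiment 2 set up two conditions: "difficult 1-point items, medium 5-point items, easy 5-point items" and "difficult 1-point items, medium 1-point items, easy 5-point items". In the former, self-paced study time for difficult 1-point items and medium 5-point items was significantly longer than for easy 5-point items; in the latter, self-paced study time for difficult 1-point items was significantly longer than for medium 1-point items and easy 5-point items. This indicates that learners weigh trade-offs during self-paced study.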

15.

A Comparison of Using the Fixed Common-Precalibrated Parameter Method and the Matched Characteristic Curve Method for Linking Multiple-Test Items

《International Journal of Testing》2013,13(3):267-293

A linking design typically consists of a data collection procedure together with an item linking procedure that places item parameters calibrated from multiple test forms onto a common scale. This study considered 2 potentially useful item response theory linking designs. The first one is characterized by selecting a single set of common items across all multiple test forms, the precalibrated item parameters of which are kept fixed while the unknown parameters of the other items are being estimated. This linking design will be referred to as the fixed common-precalibrated item parameter design. However, data collected under this design could also be analyzed by the characteristic curve method, which constituted an alternative linking procedure. In this study, the relative merits of the 2 linking designs were examined with respect to their robustness against 3 manipulated conditions, namely, when the common items have imprecise estimates, when there is a noticeable difference in the average item difficulty between the common and the noncommon items, and when the examinees are heterogeneous in terms of their abilities. A parameter recovery study was conducted to achieve this purpose. The results indicated that both linking designs were capable of producing accurate linking of items and equivalent estimation of ability parameters under the 3 conditions. When the 2 designs were actually utilized in the development of an item bank, it was found that both linking designs produced quite consistent solutions despite minor differences on some item and ability estimates. Conditions under which one linking design is preferred over the other are also provided in the Discussion section of this article.

16.

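Item Selection and Its Eye-Movement Characteristics in a Difficulty-Value Trade-off Situation

王志伟, 姜英杰 《心理科学》2019,(4):854-860

Using behavioral and eye-tracking methods, this study set up a difficulty-value trade-off situation to examine how learners choose study items when study time is limited, and how that choice process unfolds. The results showed that (1) participants more often chose the items with the highest expected score rather than the highest-valued or the easiest items, and (2) the process of choosing study items did not involve computing the expected score of each item. These results indicate that learners' item selection is based not on difficulty or value alone but on a trade-off between the two, favoring the item with the highest expected score. This process is inconsistent with the predictions of compensatory decision theories and consistent with those of non-compensatory theories.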

17.

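Nonparametric Cognitive Diagnostic Computerized Adaptive Testing Using Item Option Information

孙小坚, 郭磊 《心理学报》2022,54(9):1137-1150

The response options of multiple-choice items can provide additional diagnostic information. To make full use of option information, this study proposes two nonparametric item selection strategies and variable-length termination rules for handling multiple-choice option information in cognitive diagnostic computerized adaptive testing (CD-CAT). Simulation results show that (1) under fixed-length conditions, the two nonparametric strategies achieve higher overall classification accuracy than parametric strategies; (2) the two nonparametric strategies use the item bank more evenly than parametric strategies; (3) the nonparametric strategies achieve higher classification accuracy under the two new variable-length termination rules; and (4) both nonparametric strategies are suitable for multiple-choice CD-CAT, and users may choose either for test analysis.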