Similar literature
 20 similar articles found (search time: 31 ms)
1.
He Yinhong, Chen Ping 《Psychometrika》2020,85(1):35-55

Maintaining the item bank is essential for continuously implementing adaptive tests. Calibrating new items online provides an opportunity to efficiently replenish the operational item bank. In this study, a new optimal design for online calibration (referred to as D-c) is proposed by incorporating the idea of the original D-optimal design into the reformed D-optimal design proposed by van der Linden and Ren (Psychometrika 80:263–288, 2015) (denoted as the D-VR design). To deal with the dependence of the design criteria on the unknown item parameters of new items, Bayesian versions of the locally optimal designs (e.g., D-c and D-VR) are put forward by adding prior information about the new items. In the simulation implementation of the locally optimal designs, five calibration sample sizes were used to obtain different levels of estimation precision for the initial item parameters, and two approaches were used to obtain the prior distributions in the Bayesian optimal designs. Results showed that the D-c design performed well and retired a smaller number of new items than the D-VR design at almost all levels of examinee sample size; the Bayesian version of D-c using the prior obtained from the operational items worked better than the version using the default priors in BILOG-MG and PARSCALE; and Bayesian optimal designs generally outperformed locally optimal designs when the initial item parameters of the new items were poorly estimated.


2.
Multidimensional computerized adaptive testing (MCAT) has received increasing attention over the past few years in educational measurement. As in all other formats of CAT, item replenishment is an essential part of MCAT item bank maintenance and management, which governs retiring overexposed or obsolete items over time and replacing them with new ones. Moreover, the calibration precision of the new items directly affects the estimation accuracy of examinees’ ability vectors. In unidimensional CAT (UCAT) and cognitive diagnostic CAT, online calibration techniques have been developed to calibrate new items effectively. However, there has been very little discussion of online calibration in MCAT in the literature. Thus, this paper proposes new online calibration methods for MCAT based upon some popular methods used in UCAT. Three representative methods, Method A, the ‘one EM cycle’ method and the ‘multiple EM cycles’ method, are generalized to MCAT. Three simulation studies were conducted to compare the three new methods by manipulating three factors (test length, item bank design, and level of correlation between coordinate dimensions). The results showed that all the new methods were able to recover the item parameters accurately, and the adaptive online calibration designs showed some improvements compared to the random design under most conditions.

3.
张雪琴  毛秀珍  李佳 《心理科学进展》2020,28(11):1970-1978
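Item replenishment is an important means of building and maintaining an item bank, and calibrating the parameters of new items is a key part of item replenishment. Online calibration designs and online calibration methods, which address how new items are administered and how their parameters are estimated, respectively, are the core techniques for item replenishment in computerized adaptive testing (CAT). This review clarifies how online calibration designs and online calibration methods have developed, and describes and evaluates their characteristics, relationships, and performance. Future work should further study online calibration design based on other information indices, explore online calibration methods based on joint estimation and error correction, strengthen research on online calibration techniques for cognitive diagnostic CAT and multidimensional CAT, and carry out in-depth empirical studies of item replenishment methods.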

4.
Ul Hassan Mahmood, Miller Frank 《Psychometrika》2019,84(4):1101-1128

Item calibration is a technique to estimate characteristics of questions (called items) for achievement tests. In computerized tests, item calibration is an important tool for maintaining, updating and developing new items for an item bank. To efficiently sample examinees with specific ability levels for this calibration, we use optimal design theory, assuming that the probability of answering correctly follows an item response model. Locally optimal unrestricted designs usually have only a few design points for ability. In practice, it is hard to sample examinees from a population with these specific ability levels due to unavailability or limited availability of examinees. To counter this problem, we use the concept of optimal restricted designs and show that this concept fits item calibration naturally. We prove an equivalence theorem needed to verify optimality of a design. Locally optimal restricted designs provide intervals of ability levels for optimal calibration of an item. When assuming a two-parameter logistic model, several scenarios with D-optimal restricted designs are presented for calibration of a single item and simultaneous calibration of several items. These scenarios show that the naive way to sample examinees around unrestricted design points is not optimal.
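To illustrate the locally D-optimal unrestricted design that the abstract above refers to, the following Python sketch searches for the symmetric two-point design that maximizes the determinant of the Fisher information matrix for the two-parameter logistic (2PL) item parameters (a, b). The function names, the restriction to symmetric equal-weight designs, and the grid search are assumptions made for this sketch; they are not taken from the paper, and the restricted (interval) designs the paper studies are not implemented here.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def info_matrix(theta, a, b):
    """Fisher information about (a, b) from one response at ability theta."""
    p = p_2pl(theta, a, b)
    w = p * (1.0 - p)
    d = theta - b
    return ((w * d * d, -w * a * d), (-w * a * d, w * a * a))


def design_det(points, a, b):
    """Determinant of the average information matrix over equally weighted points."""
    n = len(points)
    m = [[0.0, 0.0], [0.0, 0.0]]
    for theta in points:
        im = info_matrix(theta, a, b)
        for r in range(2):
            for c in range(2):
                m[r][c] += im[r][c] / n
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def best_symmetric_design(a, b, step=0.001, z_max=4.0):
    """Grid search over z = a * (theta - b) for the design {b - z/a, b + z/a}."""
    best_z, best_det = None, -1.0
    for k in range(1, int(round(z_max / step)) + 1):
        z = k * step
        det = design_det((b - z / a, b + z / a), a, b)
        if det > best_det:
            best_z, best_det = z, det
    return best_z, (b - best_z / a, b + best_z / a)


if __name__ == "__main__":
    z, points = best_symmetric_design(a=1.2, b=0.5)
    print("optimal z:", round(z, 3), "design points:", points)
```

Under these assumptions the search lands near a(θ − b) ≈ ±1.54, that is, response probabilities of roughly 0.18 and 0.82, which agrees with the classical D-optimal result for logistic models. Examinees at exactly these two ability levels are rarely available in practice, which is the motivation the abstract gives for restricted designs.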


5.
谭青蓉  汪大勋  罗芬  蔡艳  涂冬波 《心理学报》2021,53(11):1286-1300
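Item replenishing is crucial to maintaining the item bank of cognitive diagnostic computerized adaptive testing (CD-CAT), and online calibration is an important way of replenishing items. Drawing on the idea of feature selection in data mining, the study proposes an efficient online calibration method based on entropy-based information gain (denoted IGEOCM), which uses examinees' responses to both new and operational items to jointly estimate the Q-matrix and item parameters of the new items. Monte Carlo simulations were used to verify the performance of the new method and to compare it with the existing online calibration methods SIE, SIE-R-BIC, and RMSEA-N. The results show that IGEOCM achieved good item calibration accuracy and item estimation efficiency under all experimental conditions and was, overall, better than SIE and the other existing methods; IGEOCM also required less time to calibrate new items than SIE and the other methods. In sum, the study provides a more efficient and accurate method for replenishing items in CD-CAT item banks.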

6.
陈平 《心理学报》2016,48(9):1184-1198
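Because of its many advantages, online calibration is widely used to calibrate new items in computerized adaptive testing (CAT). Method A is the most straightforward and algorithmically simplest CAT online calibration method, but it has an obvious theoretical flaw: it treats ability estimates as true abilities during calibration. The error-correction ideas of the full functional maximum likelihood estimator (FFMLE) and of the estimator that exploits the consequences of sufficiency (ECSE) are incorporated into Method A (the new methods are denoted FFMLE-Method A and ECSE-Method A) to correct for ability estimation error in theory and thereby overcome this flaw. Simulation results show that: (1) under most experimental conditions, the two new methods improve calibration precision overall relative to Method A, with the largest improvement on short tests of 10 items; (2) when the CAT is short or of medium length (10 or 20 items), the two new methods perform very close to MEM, the best-performing method, and when the test is long (30 items), ECSE-Method A performs best overall and better than MEM; (3) the larger the sample size, the higher the calibration precision of all methods.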
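The basic idea of Method A described above can be sketched in a few lines of Python: ability estimates are plugged in as if they were true abilities, and the new item's parameters are then obtained by ordinary maximum likelihood. The sketch assumes a 2PL item and uses Newton-Raphson on the logistic-regression form logit P = b0 + b1·θ, so that a = b1 and b = −b0/b1. The function names and the simulation helper are hypothetical, and the FFMLE and ECSE corrections discussed in the abstract are not implemented.

```python
import math
import random


def method_a_2pl(theta_hat, responses, n_iter=25):
    """Estimate (a, b) of one new 2PL item, treating theta_hat as error-free."""
    b0, b1 = 0.0, 1.0  # intercept and slope of the logit
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, y in zip(theta_hat, responses):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * t)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * t    # gradient w.r.t. slope
            h00 += w             # entries of the (negative) Hessian
            h01 += w * t
            h11 += w * t * t
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b1, -b0 / b1  # discrimination a, difficulty b


def simulate_responses(thetas, a, b, rng):
    """Hypothetical helper: draw 0/1 responses from a 2PL item."""
    out = []
    for t in thetas:
        p = 1.0 / (1.0 + math.exp(-a * (t - b)))
        out.append(1 if rng.random() < p else 0)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    thetas = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    ys = simulate_responses(thetas, a=1.2, b=0.3, rng=rng)
    print(method_a_2pl(thetas, ys))
```

In this demonstration the true abilities are passed in, so the estimates should be close to the generating values. In an operational CAT the inputs would be ability estimates that contain measurement error, and ignoring that error is exactly the flaw the abstract attributes to Method A.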

7.
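Estimation of item parameters for raw items in computerized adaptive testing
The security of computerized adaptive testing (CAT) faces new challenges, and small item banks are especially at risk. How to build a large, high-quality item bank has therefore become a very important topic in CAT research. Current approaches to building CAT item banks have several problems, such as high cost and poor confidentiality; in particular, equating techniques are complex, and the repeated use of anchor items can easily lead to item leakage. If items whose parameters have not yet been estimated (raw items) could be inserted during CAT administration and their parameters estimated at the same time, the value for building large, high-quality CAT item banks would be self-evident. This paper studies the problem under the 1PLM and 2PLM, proposes a new method for online estimation of raw items, and derives a formula for the initial iteration value of the discrimination parameter a. The results show that, in both the simulation study and the empirical study, the number of times a raw item is answered affects the item parameter estimates, and the more examinees answer a raw item, the higher the precision of its parameter estimates.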

8.
陈平  辛涛 《心理学报》2011,43(6):710-724
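Item replenishment is essential to item bank maintenance in cognitive diagnostic computerized adaptive testing (CD-CAT). In traditional CAT, online calibration methods are often used to estimate the item parameters of new items. Until now, however, no paper on online calibration in CD-CAT has been published. This study lays the analytical groundwork for generalizing three representative online calibration methods from traditional CAT (Method A, OEM, and MEM) to CD-CAT (CD-Method A, CD-OEM, and CD-MEM), and compares the three methods by simulation. The results show that CD-Method A recovers item parameters better than the other two methods, and that the adaptive calibration design improves item parameter recovery relative to the random calibration design.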

9.
Item calibration is an essential issue in modern psychological and educational testing based on item response theory. Due to the popularity of computerized adaptive testing, methods to efficiently calibrate new items have become more important than they were when paper-and-pencil test administration was the norm. Many calibration processes have been proposed and discussed from both theoretical and practical perspectives. Among them, online calibration may be one of the most cost-effective. In this paper, under a variable-length computerized adaptive testing scenario, we integrate the methods of adaptive design, sequential estimation, and measurement error models to solve online item calibration problems. The proposed sequential estimate of item parameters is shown to be strongly consistent and asymptotically normally distributed with a prechosen accuracy. Numerical results show that the proposed method is very promising in terms of both estimation accuracy and efficiency. The results of using calibrated items to estimate the latent trait levels are also reported.

10.
Replenishing item pools for on-line ability testing requires innovative and efficient data collection designs. By generating local D-optimal designs for selecting individual examinees, and consistently estimating item parameters in the presence of error in the design points, sequential procedures are efficient for on-line item calibration. The estimation error in the on-line ability values is accounted for with an item parameter estimate studied by Stefanski and Carroll. Locally D-optimal n-point designs are derived using the branch-and-bound algorithm of Welch. In simulations, the overall sequential designs appear to be considerably more efficient than random seeding of items. This report was prepared under the Navy Manpower, Personnel, and Training R&D Program of the Office of the Chief of Naval Research under Contract N00014-87-0696. The authors wish to acknowledge the valuable advice and consultation given by Ronald Armstrong, Charles Davis, Bradford Sympson, Zhaobo Wang, Ing-Long Wu and three anonymous reviewers.

11.
Methods of cognitive diagnostic computerized adaptive testing (CD-CAT) under higher-order cognitive diagnosis models have been developed to simultaneously provide estimates of the attribute mastery statuses of examinees for formative assessment and estimates of a latent continuous trait for overall summative evaluation. In a typical CD-CAT environment, examinees are often subject to a time limit, and the examinees’ response times (RTs) for specific test items can be routinely recorded by custom-made programs. Because examinees are individually administered tailored sets of test items from the item pool, they may experience different levels of speededness during testing and different levels of risk of running out of time. In this study, RTs were considered during the item-selection procedure to control the test speededness and the RTs were treated as useful information for improving latent trait estimation in CD-CAT under the higher-order deterministic input, noisy ‘and’ gate (DINA) model. A modified posterior-weighted Kullback–Leibler (PWKL) method that maximizes the item information per time unit and a shadow-test method that assembles a provisional test subject to a specified time constraint were developed. Two simulation studies were conducted to assess the effects of the proposed methods on the quality of CD-CAT for fixed- and variable-length exams. The results show that, compared with the traditional PWKL method, the proposed methods preserve a lower risk of running out of time while ensuring satisfactory attribute estimation and providing more accurate estimates of the latent trait and speed parameters. Finally, several suggestions for future research are proposed.

12.
A nonparametric technique based on the Hamming distance is proposed in this research by recognizing that once the attribute vector is known, or correctly estimated with high probability, one can determine the item-by-attribute vectors for new items undergoing calibration. We consider the setting where Q is known for a large item bank, and the q-vectors of additional items are estimated. The method is studied in simulation under a wide variety of conditions, and is illustrated with the Tatsuoka fraction subtraction data. A consistency theorem is developed giving conditions under which nonparametric Q calibration can be expected to work.
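The Hamming-distance idea in the abstract above can be sketched as follows: given examinees' (estimated) attribute vectors and their responses to a new item, each candidate q-vector implies a set of ideal responses, and the candidate whose ideal responses differ least from the observed ones is chosen. The sketch assumes conjunctive (DINA-type) ideal responses and an exhaustive search over all nonzero q-vectors; both choices, and all function names, are assumptions made for illustration and may differ from the paper's actual procedure.

```python
from itertools import product


def ideal_response(alpha, q):
    """Conjunctive ideal response: 1 only if every required attribute is mastered."""
    return int(all(a >= r for a, r in zip(alpha, q)))


def estimate_q(alphas, responses):
    """Return the nonzero q-vector minimizing Hamming distance to the responses."""
    k = len(alphas[0])
    best_q, best_dist = None, None
    for q in product((0, 1), repeat=k):
        if not any(q):
            continue  # an item must measure at least one attribute
        dist = sum(
            abs(y - ideal_response(alpha, q))
            for alpha, y in zip(alphas, responses)
        )
        if best_dist is None or dist < best_dist:
            best_q, best_dist = q, dist
    return best_q, best_dist


if __name__ == "__main__":
    # All 8 attribute patterns for K = 3, answering an item with q = (1, 0, 1)
    # without slips or guesses.
    alphas = list(product((0, 1), repeat=3))
    ys = [ideal_response(alpha, (1, 0, 1)) for alpha in alphas]
    print(estimate_q(alphas, ys))
```

With noise-free responses from every attribute pattern, the true q-vector is recovered with distance 0. With real data, slips, guesses, and misclassified attribute vectors make the minimum distance positive, which is why the abstract's consistency conditions matter.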

13.
Computerized classification testing (CCT) commonly chooses items maximizing information at the cut score, which yields the most information for decision-making. However, a corollary problem is that all examinees will be given the same set of items, resulting in a high test overlap rate and unbalanced item bank usage, which threatens test security. Another pivotal issue for CCT is time control. Since both extremely long response times (RTs) and large RT variability across examinees intensify time-induced anxiety, it is crucial to reduce the number of examinees exceeding the time limit and the differences between examinees' test-taking times. To satisfy these practical needs, this paper proposes the novel idea of stage adaptiveness to tailor the item selection process to the decision-making requirement in each step and generate fresh insight into the existing response time selection method. Results indicate that balanced item usage as well as short and stable test times across examinees can be achieved via the new methods.

14.
There has recently been much interest in computerized adaptive testing (CAT) for cognitive diagnosis. While there exist various item selection criteria and different asymptotically optimal designs, these are mostly constructed based on the asymptotic theory assuming the test length goes to infinity. In practice, with limited test lengths, the desired asymptotic optimality may not always apply, and there are few studies in the literature concerning the optimal design of finite items. Related questions, such as how many items we need in order to be able to identify the attribute pattern of an examinee and what types of initial items provide the optimal classification results, are still open. This paper aims to answer these questions by providing non-asymptotic theory of the optimal selection of initial items in cognitive diagnostic CAT. In particular, for the optimal design, we provide necessary and sufficient conditions for the Q-matrix structure of the initial items. The theoretical development is suitable for a general family of cognitive diagnostic models. The results not only provide a guideline for the design of optimal item selection procedures, but also may be applied to guide item bank construction.

15.
This study investigates using response times (RTs) with item responses in a computerized adaptive test (CAT) setting to enhance item selection and ability estimation and control for differential speededness. Using van der Linden’s hierarchical framework, an extended procedure for joint estimation of ability and speed parameters for use in CAT is developed following van der Linden; this is called the joint expected a posteriori estimator (J-EAP). It is shown that the J-EAP estimate of ability and speededness outperforms the standard maximum likelihood estimator (MLE) of ability and speededness in terms of correlation, root mean square error, and bias. It is further shown that under the maximum information per time unit item selection method (MICT), a method which uses estimates for ability and speededness directly, using the J-EAP further reduces average examinee time spent and variability in test times between examinees above the resulting gains of this selection algorithm with the MLE while maintaining estimation efficiency. Simulated test results are further corroborated with test parameters derived from a real data example.
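The maximum information per time unit criterion named in the abstract above can be sketched as a ratio: each candidate item's Fisher information at the current ability estimate divided by the examinee's expected response time on that item. The sketch assumes a 2PL response model and a lognormal response-time model in which ln T is normal with mean β − τ and variance 1/α², giving E[T] = exp(β − τ + 1/(2α²)). The dictionary keys, function names, and parameter values are hypothetical, and the J-EAP estimator itself is not implemented; the ability and speed estimates are simply passed in.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def expected_time(tau, alpha, beta):
    """Expected response time under the assumed lognormal model.

    ln T ~ Normal(beta - tau, 1 / alpha**2), so
    E[T] = exp(beta - tau + 1 / (2 * alpha**2)).
    """
    return math.exp(beta - tau + 1.0 / (2.0 * alpha * alpha))


def select_item(theta_hat, tau_hat, items, administered=()):
    """Index of the unused item with the largest information per expected time."""
    best_idx, best_ratio = None, -1.0
    for idx, item in enumerate(items):
        if idx in administered:
            continue
        ratio = item_information(theta_hat, item["a"], item["b"]) / expected_time(
            tau_hat, item["alpha"], item["beta"]
        )
        if ratio > best_ratio:
            best_idx, best_ratio = idx, ratio
    return best_idx


if __name__ == "__main__":
    items = [
        {"a": 1.0, "b": 0.0, "alpha": 2.0, "beta": 4.0},  # informative but slow
        {"a": 1.0, "b": 0.0, "alpha": 2.0, "beta": 3.0},  # same information, faster
        {"a": 1.5, "b": 0.0, "alpha": 2.0, "beta": 5.0},  # most informative, slowest
    ]
    print(select_item(theta_hat=0.0, tau_hat=0.0, items=items))
```

In this toy pool the second item is selected even though the third carries the most information, because its expected time is much shorter. That trade-off is what lets the criterion reduce testing time, and it also shows why the quality of the ability and speed estimates fed into it matters.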

16.
Interim assessment occurs throughout instruction to provide feedback about what students know and have achieved. Different from the currently available cognitive diagnostic computerized adaptive testing (CD-CAT) design that focuses on assessment at a single time point, the authors discuss several designs of interim CD-CAT that are suitable in the learning context. The interim CD-CAT differs from the currently available CD-CAT designs primarily because students’ mastery profile (i.e., skills mastery) changes due to learning, and new attributes are added periodically. Moreover, hierarchies exist among attributes taught sequentially and such information could be used during item selection. Two specific designs are considered: The first one is when new attributes are taught in Stage II, but the student mastery status of the previously taught attributes stays the same. The second design is when new attributes are taught and previously taught attributes can also be further learned or forgotten in Stage II. For both designs, the authors propose an individual prior, which considers a person’s learning history and population learning model, to start an interim CD-CAT. Simulation results show that the Stage II CD-CAT using the individual prior outperforms the methods using population priors. The GDINA (generalized deterministic inputs, noisy, “and” gate) diagnostic index (GDI) is extended to accommodate item hierarchies, and analytic results are provided to further illustrate the types of items that are most popular during item selection. As the first study that focuses on the application of CD-CAT in a learning context, the methods and results presented herein show the great promise of using CD-CAT to monitor learning.

17.
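Applying the OMST online assembly mode, this paper proposes an adaptive grouped cognitive diagnostic test (CD-AMGT). Because the prerequisite relation among knowledge states is a partial order and forms a lattice, the supremum and infimum in the lattice of the current knowledge state estimates are used to bound the possible range of the examinee's true knowledge state, and the next group of items is assembled accordingly. Within each group, items are selected with the PWKL or SHE strategy to balance diagnostic accuracy, efficiency, and security. Simulation experiments show that, compared with PWKL and SHE, when item types are rich, CD-AMGT shows substantial advantages in uniformity of item bank usage and in computation time, at the cost of a slight decrease in classification accuracy.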

18.
孙小坚  郭磊 《心理学报》2022,54(9):1137-1150
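The response options of multiple-choice items can provide additional diagnostic information. To make full use of option information, the study proposes two nonparametric item selection strategies and variable-length termination rules for handling multiple-choice option information in cognitive diagnostic computerized adaptive testing (CD-CAT). The simulation results show that: (1) under fixed-length conditions, the classification accuracy of the two nonparametric item selection strategies was, overall, higher than that of the parametric item selection strategies; (2) the two nonparametric strategies produced more balanced item bank usage than the parametric strategies; (3) the nonparametric strategies achieved higher classification accuracy under the two new variable-length termination rules; (4) both nonparametric strategies are suitable for multiple-choice CD-CAT, and users may choose either one for test analysis.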

19.
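Computer-based tests can record examinees' response times (RTs) to test items. As an important source of auxiliary information, RT is of considerable value for test development and management, especially in computerized adaptive testing (CAT). This paper briefly introduces and comments on applications of RT to item selection in CAT and analyzes the practical feasibility of these techniques. Finally, it discusses current problems in applying RT to CAT item selection and directions for further research.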

20.
The response time-based concealed information test can reveal when a person recognizes a relevant item among other, irrelevant items, based on comparatively slower responding. Thereby, if a person is concealing the knowledge about the relevance of this item (e.g., recognizing it as a murder weapon), this deception can be revealed. A recent study, conducted online and using a between-subject design, introduced a significantly enhanced version by including additional items in the task. While this modified version outperformed the original version, it also resulted in a much higher rate of participant dropouts (i.e., participants leaving the experiment's website without completing the task). The grave implication is that the perceived enhancement is perhaps merely due to selective attrition. Therefore, the current experiment replicates the original one, but using a within-subject design. The results show that there is a large enhancement even when selective attrition is prevented.
