Similar literature
 20 similar documents found (search time: 31 ms)
1.
Some standard errors in item response theory   Total citations: 2, self-citations: 0, citations by others: 2
The mathematics required to calculate the asymptotic standard errors of the parameters of three commonly used logistic item response models is described and used to generate values for some common situations. It is shown that the maximum likelihood estimation of a lower asymptote can wreak havoc with the accuracy of estimation of a location parameter, indicating that if one needs to have accurate estimates of location parameters (say for purposes of test linking/equating or computerized adaptive testing) the sample sizes required for acceptable accuracy may be unattainable in most applications. It is suggested that other estimation methods be used if the three parameter model is applied in these situations. The research reported here was supported, in part, by contract #F41689-81-6-0012 from the Air Force Human Resources Laboratory to McFann-Gray & Associates, Benjamin A. Fairbank, Jr., Principal Investigator. Further support of Wainer's effort was supplied by the Educational Testing Service, Program Statistics Research Project.
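To make the abstract's point concrete, the sketch below approximates asymptotic standard errors for the item parameters of a three-parameter logistic item. It is an illustration, not the paper's computation: it assumes abilities are known and standard normal, uses a simple grid quadrature, and the function names (`irf_3pl`, `item_information`, `standard_errors`) and any parameter values are hypothetical.

```python
import math


def irf_3pl(theta, a, b, c):
    """Three-parameter logistic item response function."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(a, b, c, n_examinees, grid=401, lo=-6.0, hi=6.0):
    """Expected Fisher information for (a, b, c).

    Assumption: abilities are N(0, 1) and treated as known; the expectation
    over ability is approximated on an evenly spaced grid.
    """
    step = (hi - lo) / (grid - 1)
    info = [[0.0] * 3 for _ in range(3)]
    total_weight = 0.0
    for k in range(grid):
        theta = lo + k * step
        weight = math.exp(-0.5 * theta * theta)
        total_weight += weight
        logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        p = c + (1.0 - c) * logistic
        # Partial derivatives of P with respect to a, b, and c.
        grad = [
            (1.0 - c) * logistic * (1.0 - logistic) * (theta - b),
            -(1.0 - c) * a * logistic * (1.0 - logistic),
            1.0 - logistic,
        ]
        scale = weight / (p * (1.0 - p))
        for i in range(3):
            for j in range(3):
                info[i][j] += scale * grad[i] * grad[j]
    return [[n_examinees * v / total_weight for v in row] for row in info]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * u for v, u in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def standard_errors(a, b, c, n_examinees, estimate_c=True):
    """Asymptotic SEs of (a, b, c), or of (a, b) when c is held fixed."""
    info = item_information(a, b, c, n_examinees)
    if not estimate_c:
        info = [row[:2] for row in info[:2]]
    cov = invert(info)
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]
```

Comparing `standard_errors(1.0, 0.0, 0.2, 1000)` with `standard_errors(1.0, 0.0, 0.2, 1000, estimate_c=False)` shows how much the standard error of the location parameter `b` grows once the lower asymptote `c` has to be estimated as well.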

2.
The conventional method of measuring ability, which is based on items with assumed true parameter values obtained from a pretest, is compared to a Bayesian method that deals with the uncertainties of such items. Computational expressions are presented for approximating the posterior mean and variance of ability under the three-parameter logistic (3PL) model. A 1987 American College Testing Program (ACT) math test is used to demonstrate that the standard practice of using maximum likelihood or empirical Bayes techniques may seriously underestimate the uncertainty in estimated ability when the pretest sample is only moderately large. This work was partially supported under contract No. N00014-85-K-0113, NR150-535, from the Cognitive Science Program, Office of Naval Research. The authors wish to thank Mark D. Reckase for providing the ACT data used in the illustration and two referees, Associate Editor and Editor for helpful suggestions.

3.
Standard procedures for estimating item parameters in item response theory (IRT) ignore collateral information that may be available about examinees, such as their standing on demographic and educational variables. This paper describes circumstances under which collateral information about examinees may be used to make inferences about item parameters more precise, and circumstances under which it must be used to obtain correct inferences. This work was supported by Contract No. N00014-85-K-0683, project designation NR 150-539, from the Cognitive Science Program, Cognitive and Neural Sciences Division, Office of Naval Research. Reproduction in whole or in part is permitted for any purpose of the United States Government. We are indebted to Tim Davey, Eugene Johnson, and three anonymous referees for their comments on earlier versions of the paper.

4.
A plausible s-factor solution for many types of psychological and educational tests is one that exhibits a general factor and s − 1 group or method-related factors. The bi-factor solution results from the constraint that each item has a nonzero loading on the primary dimension and at most one of the s − 1 group factors. This paper derives a bi-factor item-response model for binary response data. In marginal maximum likelihood estimation of item parameters, the bi-factor restriction leads to a major simplification of likelihood equations and (a) permits analysis of models with large numbers of group factors; (b) permits conditional dependence within identified subsets of items; and (c) provides more parsimonious factor solutions than an unrestricted full-information item factor analysis in some cases. Supported by the Cognitive Science Program, Office of Naval Research, under grant #N00014-89-J-1104. We would like to thank Darrell Bock for several helpful suggestions.

5.
Replenishing item pools for on-line ability testing requires innovative and efficient data collection designs. By generating local D-optimal designs for selecting individual examinees, and consistently estimating item parameters in the presence of error in the design points, sequential procedures are efficient for on-line item calibration. The estimating error in the on-line ability values is accounted for with an item parameter estimate studied by Stefanski and Carroll. Locally D-optimal n-point designs are derived using the branch-and-bound algorithm of Welch. In simulations, the overall sequential designs appear to be considerably more efficient than random seeding of items. This report was prepared under the Navy Manpower, Personnel, and Training R&D Program of the Office of the Chief of Naval Research under Contract N00014-87-0696. The authors wish to acknowledge the valuable advice and consultation given by Ronald Armstrong, Charles Davis, Bradford Sympson, Zhaobo Wang, Ing-Long Wu and three anonymous reviewers.

6.
Various item response theory (IRT) models can be used in educational and psychological measurement to analyze test data. One of the major drawbacks of these models is that efficient parameter estimation can only be achieved with very large data sets. Therefore, it is often worthwhile to search for designs of the test data that in some way will optimize the parameter estimates. The results from the statistical theory on optimal design can be applied for efficient estimation of the parameters. A major problem in finding an optimal design for IRT models is that the designs are only optimal for a given set of parameters, that is, they are locally optimal. Locally optimal designs can be constructed with a sequential design procedure. In this paper minimax designs are proposed for IRT models to overcome the problem of local optimality. Minimax designs are compared to sequentially constructed designs for the two parameter logistic model and the results show that minimax design can be nearly as efficient as sequentially constructed designs.

7.
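Testlets are a common item format in many tests. Because items within a testlet are dependent to some degree, they violate the local independence assumption, and estimating parameters with a standard item response model can produce substantial bias. Testlet response theory incorporates the interaction between examinee and testlet into the model, which resolves the problem of dependence among items. The authors review the development, basic principles, and related research of testlet response theory and apply it to a secondary school English examination. Compared with item response theory, the results show that: (1) the parameter estimates from the testlet response model and the item response model are strongly correlated, especially the ability and difficulty parameters; (2) the confidence intervals under the testlet response model are narrower than those under the item response model for every parameter, that is, the testlet response model yields more precise estimates.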
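As a sketch of what incorporating the examinee-by-testlet interaction can look like, one widely used two-parameter formulation adds a person-specific testlet effect to the logistic model. The notation here is an assumption for illustration and may differ from the parameterization used in the reviewed paper: $a_i$ and $b_i$ are the discrimination and difficulty of item $i$, $\theta_j$ is the ability of examinee $j$, and $\gamma_{j\,d(i)}$ is the effect for examinee $j$ on the testlet $d(i)$ that contains item $i$.

```latex
P(y_{ij} = 1 \mid \theta_j)
  = \frac{1}{1 + \exp\!\left[-a_i\left(\theta_j - b_i - \gamma_{j\,d(i)}\right)\right]}
```

Setting every $\gamma_{j\,d(i)}$ to zero recovers the ordinary two-parameter logistic model, which is why ignoring a nonzero testlet effect amounts to treating dependent items as locally independent.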

8.
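An idealized test and set of examinee responses are constructed. When ability is estimated under the one- and two-parameter models, the first and second misfit phenomena occur. Adding the c parameter effectively corrects the first misfit phenomenon, but the second remains and a third appears. Adding the γ parameter effectively corrects the second misfit phenomenon, but the first remains and a fourth appears. Adding both c and γ effectively corrects all four misfit phenomena. Finally, the measurement meaning of the c and γ parameters is summarized.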
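The abstract does not define c and γ. A common reading, assumed here, is that c is a lower asymptote (guessing) and γ an upper asymptote (slipping) of a four-parameter logistic curve. The sketch below, with hypothetical function names, shows how an upper asymptote below 1 softens the penalty when an able examinee misses an easy item.

```python
import math


def irf_4pl(theta, a, b, c=0.0, gamma=1.0):
    """Four-parameter logistic curve; c and gamma are assumed to be the
    lower and upper asymptotes. c=0 and gamma=1 give the 2PL model."""
    if not 0.0 <= c < gamma <= 1.0:
        raise ValueError("require 0 <= c < gamma <= 1")
    return c + (gamma - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(responses, theta, items):
    """Log-likelihood of 0/1 responses at ability theta.

    items is a list of (a, b, c, gamma) tuples, one per response.
    """
    total = 0.0
    for u, (a, b, c, gamma) in zip(responses, items):
        p = irf_4pl(theta, a, b, c, gamma)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total
```

For example, a wrong answer to an easy item at `theta = 4.0` has a far lower log-likelihood with `gamma = 1.0` than with `gamma = 0.95`, so the single slip pulls the ability estimate down less under the second setting.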

9.
The Dutch Identity: A new tool for the study of item response models   Total citations: 1, self-citations: 0, citations by others: 1
The Dutch Identity is a useful way to reexpress the basic equations of item response models that relate the manifest probabilities to the item response functions (IRFs) and the latent trait distribution. The identity may be exploited in several ways. For example: (a) to suggest how item response models behave for large numbers of items: they are approximate submodels of second-order loglinear models for 2^J tables; (b) to suggest new ways to assess the dimensionality of the latent trait: principal components analysis of matrices composed of second-order interactions from loglinear models; (c) to give insight into the structure of latent class models; and (d) to illuminate the problem of identifying the IRFs and the latent trait distribution from sample data. This research was supported in part by contract number N00014-87-K-0730 from the Cognitive Science Program of the Office of Naval Research. I realized the usefulness of the identity in Theorem 1 while lecturing in the Netherlands during October, 1986. Because this was in no small part due to the stimulating psychometric atmosphere there, I call the result the Dutch Identity.

10.
Item response curves for a set of binary responses are studied from a Bayesian viewpoint of estimating the item parameters. For the two-parameter logistic model with normally distributed ability, restricted bivariate beta priors are used to illustrate the computation of the posterior mode via the EM algorithm. The procedure is illustrated by data from a mathematics test. This work was supported under Contract No. N00014-85-K-0113, NR 150-535, from Personnel and Training Research Programs, Psychological Sciences Division, Office of Naval Research. The authors wish to thank Mark D. Reckase for providing the ACT data used in the illustration and Michael J. Soltys for computational assistance. They also wish to thank the editor and four anonymous reviewers for many valuable suggestions.

11.
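Item response theory is a modern measurement theory for measuring examinees' latent traits, and latent class analysis is a model-based technique for classifying latent traits. Mixture item response theory combines the two, so that examinees can be classified and their latent traits quantified at the same time. After explaining the concepts and principles of mixture item response theory, this paper introduces several common mixture models, including MRM, mNRM, and mPCM, together with their parameter estimation methods, and reviews the development of their applications in psychological testing from the perspectives of classifying psychological and behavioral characteristics, detecting differential item functioning, and evaluating test validity.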

12.
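Tu Dongbo (涂冬波), Cai Yan (蔡艳), Dai Haiqi (戴海琦), Ding Shuliang (丁树良). Acta Psychologica Sinica (《心理学报》), 2011, 43(11): 1329-1340
This study introduces multidimensional item response theory (MIRT), a frontier technique in modern measurement theory, and implements its parameter estimation with an MCMC algorithm. MIRT is then applied to Raven's Advanced Progressive Matrices to explore its concrete use in psychological testing. The results show that: (1) the MIRT parameter estimation program written for this study is basically feasible, and its estimation accuracy is comparable to or even better than that reported in international studies; (2) under a completely randomized two-factor (2×3) design crossing test dimensionality and sample size, the accuracy and stability of MIRT parameter estimates increase as the numbers of examinees and items increase, but both decrease as the number of test dimensions increases; (3) MIRT provides more precise and detailed information about psychological tests than unidimensional IRT (UIRT). It offers important guidance and reference value for constructing, developing, and evaluating psychological tests, and is worth adopting.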

13.
14.
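Jian Xiaozhu (简小珠), Dai Buyun (戴步云), Dai Haiqi (戴海琦). Acta Psychologica Sinica (《心理学报》), 2016, 48(12): 1625-1630
Item difficulty and the weight reflecting an item's importance are two basic attributes of polytomously scored items, so they need to be represented by different parameters in the IRT item characteristic function. Previous polytomous models describe the difficulty of a polytomous item with multiple difficulty parameters and cannot effectively express the weighting role of its score. From the perspective of score weighting, this paper proposes the Logistic weighted model and explains the ideas behind its construction. The EM algorithm for item parameter estimation under the Logistic weighted model is derived and a corresponding estimation program is written. Simulated tests under the Logistic weighted model show good recovery of the item parameters.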

15.
The paper addresses and discusses whether the tradition of accepting point-symmetric item characteristic curves is justified by uncovering the inconsistent relationship between the difficulties of items and the order of maximum likelihood estimates of ability. This inconsistency is intrinsic in models that provide point-symmetric item characteristic curves, and in this paper focus is put on the normal ogive model for observation. It is also questioned if in the logistic model the sufficient statistic has forfeited the rationale that is appropriate to the psychological reality. It is observed that the logistic model can be interpreted as the case in which the inconsistency in ordering the maximum likelihood estimates is degenerated. The paper proposes a family of models, called the logistic positive exponent family, which provides asymmetric item characteristic curves. A model in this family has a consistent principle in ordering the maximum likelihood estimates of ability. The family is divided into two subsets each of which has its own principle, and includes the logistic model as a transition from one principle to the other. Rationale and some illustrative examples are given.
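A minimal sketch of the asymmetry described above, assuming the usual form of the logistic positive exponent family in which a logistic curve is raised to a positive power. The exponent name `xi` and the helper `symmetry_gap` are illustrative choices, not notation taken from the abstract.

```python
import math


def irf_lpe(theta, a, b, xi):
    """Logistic positive exponent curve: a logistic function raised to the
    power xi > 0. With xi = 1 it reduces to the ordinary logistic model."""
    if xi <= 0:
        raise ValueError("xi must be positive")
    return (1.0 / (1.0 + math.exp(-a * (theta - b)))) ** xi


def symmetry_gap(a, b, xi, delta):
    """P(b + delta) + P(b - delta) - 1.

    This is zero for every delta when the curve is point-symmetric about
    (b, 0.5), as the logistic curve is; a nonzero value shows that the
    curve is not symmetric about that point.
    """
    return irf_lpe(b + delta, a, b, xi) + irf_lpe(b - delta, a, b, xi) - 1.0
```

With `xi = 1` the gap vanishes; with `xi` above or below 1 it becomes negative or positive respectively, which is one way to see the two subsets of the family with the logistic model as the transition between them.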

16.
It is often considered desirable to have the same ordering of the items by difficulty across different levels of the trait or ability. Such an ordering is an invariant item ordering (IIO). An IIO facilitates the interpretation of test results. For dichotomously scored items, earlier research surveyed the theory and methods of an invariant ordering in a nonparametric IRT context. Here the focus is on polytomously scored items, and both nonparametric and parametric IRT models are considered. The absence of the IIO property in two nonparametric polytomous IRT models is discussed, and two nonparametric models are discussed that imply an IIO. A method is proposed that can be used to investigate whether empirical data imply an IIO. Furthermore, only two parametric polytomous IRT models are found to imply an IIO. These are the rating scale model (Andrich, 1978) and a restricted rating scale version of the graded response model (Muraki, 1990). Well-known models, such as the partial credit model (Masters, 1982) and the graded response model (Samejima, 1969), do not imply an IIO.

17.
Constant latent odds-ratios models and the Mantel-Haenszel null hypothesis   Total citations: 1, self-citations: 0, citations by others: 1
In the present paper, a new family of item response theory (IRT) models for dichotomous item scores is proposed. Two basic assumptions define the most general model of this family. The first assumption is local independence of the item scores given a unidimensional latent trait. The second assumption is that the odds-ratios for all item-pairs are constant functions of the latent trait. Since the latter assumption is characteristic of the whole family, the models are called constant latent odds-ratios (CLORs) models. One nonparametric special case and three parametric special cases of the general CLORs model are shown to be generalizations of the one-parameter logistic Rasch model. For all CLORs models, the total score (the unweighted sum of the item scores) is shown to be a sufficient statistic for the latent trait. In addition, conditions under the general CLORs model are studied for the investigation of differential item functioning (DIF) by means of the Mantel-Haenszel procedure. This research was supported by the Dutch Organization for Scientific Research (NWO), grant number 400-20-026.
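For readers unfamiliar with the Mantel-Haenszel procedure mentioned above, the sketch below computes the standard Mantel-Haenszel common odds-ratio estimate across total-score strata. It illustrates the general estimator only and is not code from the paper; the function name and the layout of the input tuples are assumptions.

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds-ratio estimate.

    tables holds one (a, b, c, d) tuple per total-score stratum, assumed
    to mean: reference group correct, reference group incorrect,
    focal group correct, focal group incorrect.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0.0:
        raise ValueError("estimate undefined: every b*c term is zero")
    return numerator / denominator
```

A value of 1 corresponds to the Mantel-Haenszel null hypothesis of no DIF, that is, equal odds of a correct response for the two groups within every score stratum.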

18.
Bayes modal estimation in item response models   Total citations: 1, self-citations: 0, citations by others: 1
This article describes a Bayesian framework for estimation in item response models, with two-stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one-, two-, and three-parameter binary logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data. This research was supported by a grant from the Spencer Foundation, Chicago, IL. Comments and suggestions on earlier drafts by Charles Lewis, Frederic Lord, Rosenbaum, James Ramsey, Hiroshi Watanabe, the editor, and two anonymous referees are gratefully acknowledged.

19.
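For dual-objective CD-CAT, six item discrimination indices (discrimination power D, general discrimination index GDI, odds ratio OR, the 2PL discrimination parameter a, attribute discrimination index ADI, and cognitive diagnostic discrimination index CDI) are each combined with the IPA method to obtain new item selection strategies. A simulation study compares their performance and also examines how stratification by discrimination controls item exposure. The results show that all the new methods clearly improve both the classification accuracy of knowledge states and the precision of ability estimates, and that all the stratified selection methods substantially improve item pool utilization. Overall, OR weighting significantly improves measurement precision, and OR-stratified selection significantly improves the uniformity of item exposure while maintaining measurement precision.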

20.
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item and ability parameters. Simulated data sets were analyzed via two joint and two marginal Bayesian estimation procedures. The marginal Bayesian estimation procedures yielded consistently smaller root mean square differences than the joint Bayesian estimation procedures for item and ability estimates. As the sample size and test length increased, the four Bayes procedures yielded essentially the same result. The authors wish to thank the Editor and anonymous reviewers for their insightful comments and suggestions.
