43 results found; search took 15 ms
1.
2.
Dorothy T. Thayer 《Psychometrika》1983,48(2):293-297
Consider an old test X consisting of s sections and two new tests Y and Z similar to X consisting of p and q sections respectively. All subjects are given test X plus two variable sections from either test Y or Z. Different pairings of variable sections are given to each subsample of subjects. We present a method of estimating the covariance matrix of the combined test (X_1, ..., X_s, Y_1, ..., Y_p, Z_1, ..., Z_q) and describe an application of these estimation techniques to linear, observed-score, test equating. The author is indebted to Paul W. Holland and Donald B. Rubin for their encouragement and many helpful comments and suggestions that contributed significantly to the development of this paper. This research was supported by the Program Statistics Research Project of the ETS Research Statistics Group.
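As a rough illustration of how an estimated section covariance matrix can feed linear observed-score equating, the sketch below sums the covariance entries to get each total-score variance and then matches means and standard deviations. The function names, covariance values, and means are hypothetical and are not taken from the paper, and the paper's estimation procedure itself is not reproduced here.

```python
import math

def total_score_variance(cov):
    """Variance of a sum of section scores: the sum of every covariance entry."""
    return sum(sum(row) for row in cov)

def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Map score x on test X to the Y scale by matching means and SDs."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)

# Hypothetical section covariance matrices (illustrative values only).
cov_x = [[5.0, 2.0], [2.0, 7.0]]
cov_y = [[10.0, 4.0], [4.0, 18.0]]

sd_x = math.sqrt(total_score_variance(cov_x))
sd_y = math.sqrt(total_score_variance(cov_y))

# Hypothetical total-score means of 20 (test X) and 25 (test Y).
equated = linear_equate(24.0, 20.0, sd_x, 25.0, sd_y)
print(equated)
```

With these assumed inputs, a score one standard deviation above the mean on X maps to one standard deviation above the mean on Y.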
3.
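Development of a Vocabulary Comprehension Ability Scale for Junior High School  Cited by: 4 (self-citations: 2, citations by others: 2)
Using item response theory, vocabulary comprehension tests were developed for each junior high school grade, comprising 143 multiple-choice vocabulary items. After repeated pretesting and large-scale formal administration, the scales of the three tests were confirmed to fit the 2PL model: items with good item characteristic curve fit made up more than 90% of all items, and the unidimensionality of ability was also confirmed. After equating, the mean discrimination for each grade was 0.61 (first year), 0.59 (second year), and 0.55 (third year), and the mean difficulty was -1.61, -1.30, and -0.56, respectively.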
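For readers unfamiliar with the 2PL model named in this abstract, the following minimal sketch evaluates its item characteristic curve. It assumes the plain logistic metric (no 1.7 scaling constant), which the abstract does not specify, and it plugs in the reported first-year mean discrimination and difficulty purely as illustrative parameter values for a single hypothetical item.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response at ability theta.

    a is the item discrimination, b the item difficulty.
    Assumes the logistic metric without the D = 1.7 constant.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Illustrative item using the first-year mean values from the abstract.
a, b = 0.61, -1.61
for theta in (-3.0, b, 0.0, 3.0):
    print(f"theta={theta:+.2f}  P(correct)={p_2pl(theta, a, b):.3f}")
```

The probability equals 0.5 exactly when ability matches the difficulty, and it rises with ability at a rate governed by the discrimination.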
4.
Zhonghua Zhang 《Applied Psychological Measurement》2021,45(5):331
In this study, the delta method was applied to estimate the standard errors of true score equating when using the characteristic curve methods with the generalized partial credit model under the common-item nonequivalent groups equating design. Simulation studies were further conducted to compare the performance of the delta method with that of the bootstrap method and the multiple imputation method. The results indicated that the standard errors produced by the delta method were very close to the criterion empirical standard errors as well as those yielded by the bootstrap method and the multiple imputation method under all the manipulated conditions.
5.
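This study examines the effects of bandwidth selection method, sample size, number of items, equating design, and data simulation approach on IRT observed-score kernel equating. Data were generated under two simulation approaches, and local and global evaluation indices were computed. Under the random groups design, the bandwidth selection methods performed similarly, and examinee sample size and number of items had little effect. Under the nonequivalent groups design, the penalty method and Silverman's rule of thumb performed best; increasing the number of items reduced both percent relative error and random error, while increasing the sample size raised percent relative error and lowered random error. The data simulation approach can affect the evaluation of equating. Future work should focus on the systematic evaluation of equating.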
6.
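In practice, tests often have a multidimensional structure, and equating them with unidimensional IRT methods yields inaccurate results. For tests with a multidimensional structure, multidimensional IRT equating methods are therefore needed to transform the parameters. Based on the common-item design, this simulation study examined the performance of several multidimensional IRT equating methods under different anchor test designs, considering six factors: test length, the ratio of the numbers of items on the two dimensions, anchor test length, anchor item selection strategy, the correlation between the two dimensions, and the difference in ability level between the equating groups. The methods compared were the mean/mean (MM) method, the mean/sigma (MS) method, the Stocking-Lord (SL) method, the Haebara (HB) method, and the least squares (LS) method. The results showed that: (1) the SL, HB, and LS methods produced the smallest root mean square equating errors and performed fairly stably across conditions; (2) the MM and MS methods showed very large root mean square errors under nonequivalent-group conditions; (3) anchor test design had no significant effect on the equating results of the SL, HB, and LS methods; (4) the SL, HB, and LS methods produced their smallest root mean square equating errors when the correlation between the two dimensions was high, the test and the anchor test were long, and the equating groups did not differ in ability level.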
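To make the mean/sigma (MS) idea concrete, here is a minimal sketch of the simpler unidimensional version, not the multidimensional variant the study evaluates. It derives the linear linking coefficients A and B from anchor-item difficulties estimated on two scales. The function names and the difficulty values are hypothetical.

```python
import statistics

def mean_sigma_coefficients(b_source, b_target):
    """Unidimensional mean/sigma linking: target = A * source + B.

    b_source and b_target hold difficulty estimates of the same
    anchor items on the source scale and on the target scale.
    """
    A = statistics.pstdev(b_target) / statistics.pstdev(b_source)
    B = statistics.mean(b_target) - A * statistics.mean(b_source)
    return A, B

def rescale_item(a, b, A, B):
    """Place a source-scale item (a, b) on the target scale."""
    return a / A, A * b + B

# Hypothetical anchor-item difficulties (illustrative values only).
b_source = [-1.0, 0.0, 1.0]
b_target = [-1.5, 0.5, 2.5]

A, B = mean_sigma_coefficients(b_source, b_target)
print(A, B)
print(rescale_item(1.2, 0.0, A, B))
```

Because MS relies only on the first two moments of the anchor difficulties, it is sensitive to poorly estimated items, which is one common motivation for characteristic curve methods such as SL and HB.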
7.
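Equating is receiving growing attention from testing organizations and measurement researchers, and the advantages of item response theory (IRT) equating in particular have given them confidence. However, many have overlooked the effect, and its magnitude, that the shape of the examinee ability distribution may have on equating results. This study focuses on how the results of item parameter equating under dichotomous IRT models differ across ability distribution shapes, and examines the error that examinee sampling bias may introduce into item characteristic curve equating. The results show that the shape of the ability distribution significantly affects the coefficients of item parameter equating. In particular, the skewness of the ability distribution has a significant linear correlation with the intercept of the equating equation, whereas changes in the shape of the distribution have no clear effect on the slope.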
8.
In test operations using IRT (item response theory), items are included in a test before being used to rate subjects, and the response data are used to estimate their item parameters. However, this method of test operation may lead to item content leakage, which can make adequate test operation difficult. To address this problem, Ozaki and Toyoda (2005, 2006) developed item difficulty parameter estimation methods that use paired comparison data on the difficulty of items as judged by raters familiar with the field. In the present paper, an improved method of item difficulty parameter estimation is developed. In this new method, an item for which the difficulty parameter is to be estimated is compared with multiple items simultaneously, from the perspective of their difficulty. This is not a one-to-one comparison but a one-to-many comparison. In the comparisons, raters are informed that items selected from an item pool are ordered according to difficulty. The order will provide insight to improve the accuracy of judgment.
9.
A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are compared through the analysis of data sets famous in the equating literature. Also, the classical percentile-rank, linear, and mean equating models are each proven to be a special case of a Bayesian model under a highly-informative choice of prior distribution.
10.