1.

Luo Lian 《心理学探新》 (Psychological Exploration) 2008,28(2):69-74

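This article introduces a new equating method, kernel equating. It first describes how kernel equating was developed, its main features, and its five steps (presmoothing, estimating the score probabilities, continuization, equating, and computing the standard error of equating). It then explains how kernel equating differs from traditional observed-score equating methods, and closes with an evaluation of the method.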
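
The continuization and equating steps named in the abstract above can be sketched in Python as follows. This is a minimal illustration, not the article's implementation. It assumes a Gaussian kernel, treats the score probabilities `r` and `s` as already presmoothed, fixes the bandwidths `h_x` and `h_y` by hand instead of selecting them, assumes non-degenerate score distributions, and leaves out the standard error of equating. All function and parameter names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def moments(scores, probs):
    """Mean and variance of a discrete score distribution."""
    mean = sum(p * x for x, p in zip(scores, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(scores, probs))
    return mean, var


def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution."""
    mean, var = moments(scores, probs)
    # Shrinkage factor: keeps the mean and variance of the discrete
    # distribution unchanged after smoothing with bandwidth h.
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * xj - (1.0 - a) * mean) / (a * h))
        for xj, p in zip(scores, probs)
    )


def kernel_quantile(u, scores, probs, h, tol=1e-10):
    """Invert the continuized CDF by bisection (it is strictly increasing)."""
    lo = min(scores) - 10.0 * (h + 1.0)
    hi = max(scores) + 10.0 * (h + 1.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, scores, probs, h) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def kernel_equate(x, x_scores, r, y_scores, s, h_x=0.6, h_y=0.6):
    """Map a Form X score to the Form Y scale: G_h^{-1}(F_h(x))."""
    u = kernel_cdf(x, x_scores, r, h_x)
    return kernel_quantile(u, y_scores, s, h_y)
```

As a sanity check on the sketch, equating a form to itself should return the input score, and equating to a form whose scores are all shifted by a constant should return the input plus that constant.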

2.

Wu Rui, Ding Shuliang, Gan Dengwen 《心理学报》 (Acta Psychologica Sinica) 2010,42(3):434-442

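Testlets appear more and more often in all kinds of examinations. Equating a test that contains testlets with a standard IRT model can distort the equating results, because the local dependence within testlets is ignored. To address this problem, we equated with the testlet-based 2PTM model and the IRT characteristic curve method, and ran extensive Monte Carlo simulations under several conditions, using the error of the estimated equating coefficients as the criterion and the Wilcoxon signed-rank test as the basis for comparison. The results show that the testlet model 2PTM, which accounts for local dependence, yields smaller equating error than the 2PLM in the great majority of cases, and the difference is statistically significant. In addition, 2PTM equating was carried out under six different equating criteria, and the relative merits of the criteria under different conditions were evaluated.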

3.

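A New Equating Criterion and a Discussion of Its Scope of Application (total citations: 3, self-citations: 0, citations by others: 3)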

Ding Shuliang, Xiong Jianhua, Luo Fen, Wu Rui, Gan Xiaofang, Tu Bai 《心理学报》 (Acta Psychologica Sinica) 2005,37(5):674-680

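Inspired by hypothesis-testing methods, this article derives a new equating method based on item response theory: the square root equating criterion. It has several features. The probabilities of a correct and an incorrect response both appear in its definition and cannot be substituted for each other; it extends very easily from the dichotomous (0-1) scoring version to a polytomous version; and it can be viewed as a weighted form of the Haebara criterion. Using the error of the estimated equating coefficients as the criterion and the Wilcoxon signed-rank test as the basis for comparison, extensive Monte Carlo simulations reveal an interesting phenomenon: the range in which an equating method works well depends on both the precision of the item parameter estimates and the range of the equating coefficient A, but not on the range of the other equating coefficient, B. When item parameter precision is high or moderate and A lies between 0.9 and 1.3, the new method usually has smaller estimation error than the Stocking-Lord and Haebara methods, and the difference is statistically significant; when item parameter precision is low, the new method is superior for A anywhere from 1.0 to 2.0.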

4.

The Effects of Anchor Item Type and Equating Estimation Method on Equating

Dai Haiqi, Liu Qihui 《心理学报》 (Acta Psychologica Sinica) 2002,34(4):37-40

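The anchor test nonequivalent groups design is a very important equating design. The study shows that, under this design, the item type used in the anchor test that serves as the equating medium affects the equating results; the method used to estimate the equating relationship also affects the results; and there is a clear interaction between item type and estimation method. The study concludes that, at the current level of item-writing and scoring technology, an anchor test made up entirely of objective items is best, and that, with the anchor test length fixed, the frequency estimation method is the best choice for estimating the equating relationship.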

5.

The Probability Distribution Equating Method and Its Application

Ding Shuliang, Wu Rui, Zhang Jielan, Xiong Jianhua 《心理学报》 (Acta Psychologica Sinica) 2008,40(1):101-108

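Within the framework of item response theory, and drawing on the existing literature, a method for developing new test equating criteria is proposed: many criteria can be seen as derived by transforming the probability distributions of responses to the anchor items. On this basis, the connection between two well-known equating criteria, the Haebara method and the Stocking-Lord method, is revealed, and a new equating criterion, the cosine criterion, is derived. A series of Monte Carlo simulation studies was carried out to examine the behavior of the cosine criterion. The results show that the cosine criterion outperforms both the Haebara and Stocking-Lord methods under the polytomous GPCM, whereas under the GRM and the 2PLM it performs worse than Haebara but comparably to Stocking-Lord. This finding suggests that whether an equating criterion is appropriate depends not only on the range in which the equating coefficients fall but, even more closely, on the item response function (IRF).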
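
For readers unfamiliar with the two established criteria named in this abstract, the sketch below writes both as loss functions of the equating coefficients A and B under the 2PL model. It is an illustration under stated assumptions, not the authors' code: the loss is evaluated on a fixed grid of ability values `thetas`, only the new-to-base direction is used (Haebara's original criterion also includes the reverse direction), and the logistic metric has no 1.7 scaling constant. The cosine criterion proposed in the article is not shown because the abstract does not give its definition. All names are hypothetical.

```python
import math


def p2pl(theta, a, b):
    """2PL item response function (logistic metric, no 1.7 constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def transform(items, A, B):
    """Move (a, b) pairs onto the base scale: a -> a / A, b -> A * b + B."""
    return [(a / A, A * b + B) for a, b in items]


def haebara_loss(base_items, new_items, A, B, thetas):
    """Square the item-level differences first, then sum over items and thetas."""
    moved = transform(new_items, A, B)
    return sum(
        (p2pl(t, a0, b0) - p2pl(t, a1, b1)) ** 2
        for t in thetas
        for (a0, b0), (a1, b1) in zip(base_items, moved)
    )


def stocking_lord_loss(base_items, new_items, A, B, thetas):
    """Sum over items first (test characteristic curves), then square."""
    moved = transform(new_items, A, B)
    total = 0.0
    for t in thetas:
        tcc_base = sum(p2pl(t, a, b) for a, b in base_items)
        tcc_new = sum(p2pl(t, a, b) for a, b in moved)
        total += (tcc_base - tcc_new) ** 2
    return total
```

The only structural difference between the two functions is the order of squaring and summing over items. In practice either loss would be handed to a numerical optimizer over (A, B); with error-free anchor item parameters both are zero at the true coefficients.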

6.

Test Equating: From IRT to MIRT

Xie Jing, Zhang Houcan 《心理学探新》 (Psychological Exploration) 2009,29(5):67-71

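As a technical means of ensuring test fairness, equating has long been an important topic in test theory research. The development of MIRT has shown that items and tests are complex, and traditional unidimensional models can no longer meet the need to examine the relationship between persons and items or tests. Current MIRT equating research follows two main approaches: one studies how multidimensional data affect IRT equating; the other studies the MIRT equating process by developing new computational methods and tools. The core of MIRT equating research is the study of equating methods and their implementation, where some progress has been made; the most important consideration in this work is controlling the factors that contribute to error.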

7.

Kernel Equating: An Observed-Score Equating Framework

Wang Shaojie, Zhang Minqiang, Li Tuoyu, Liang Zhengyan 《心理科学进展》 (Advances in Psychological Science) 2020,28(5):855-870

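The kernel equating process consists of presmoothing, estimating the score probabilities, continuization, equating, and evaluating the equating results. The method combines the advantages of linear and equipercentile equating, and each step is highly extensible and inclusive. Smoothing and continuization reduce random equating error, and concepts specific to the method, such as the standard error of equating difference, provide reliable tools for evaluating results. Factors such as the continuization method and the bandwidth selection method can affect its performance, and new methods based on kernel equating offer fresh perspectives for the development of equating. Future work could address the expansion and refinement of the kernel equating framework, updates to the process, and the combination and comparison of equating methods.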

8.

A Comparative Study of Equating Methods for the Senior High School Graduation Examination

Zhang Guangxu, Yang Zhiming 《心理学探新》 (Psychological Exploration) 1999,(4)

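This study combined a random equivalent groups design with an anchor test. It first verified that the two random equivalent groups did not differ significantly in mean, variance, or distribution shape. The equated scores from the random groups design then served as the criterion for assessing the error of the other equating methods, and the errors of three linear equating methods under the anchor test design (with different population weights) were compared in order to select the equating method and population weight best suited to the senior high school graduation examination (huikao). The study found that the Tucker observed-score linear equating method should be used for this examination, with population weight W1 = 1.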
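
The role of the population weight in the Tucker method recommended above can be made concrete with a short Python sketch. It follows the standard textbook form of Tucker observed-score linear equating for an anchor test design, not the study's own program. Population 1 takes Form X and population 2 takes Form Y, both take the anchor V, and `w1` is the weight given to population 1 in the synthetic population. The function and variable names are hypothetical, and population (divide-by-n) moments are assumed.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Population covariance; cov(u, u) is the population variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def tucker_linear(x1, v1, y2, v2, w1=1.0):
    """Return a function that maps Form X scores onto the Form Y scale.

    x1, v1: total and anchor scores in population 1 (took Form X)
    y2, v2: total and anchor scores in population 2 (took Form Y)
    w1:     synthetic-population weight for population 1 (w2 = 1 - w1)
    """
    w2 = 1.0 - w1
    # Regression slopes of total score on anchor score in each population.
    g1 = cov(x1, v1) / cov(v1, v1)
    g2 = cov(y2, v2) / cov(v2, v2)
    d_mean = mean(v1) - mean(v2)
    d_var = cov(v1, v1) - cov(v2, v2)
    # Synthetic-population means and variances of X and Y.
    mu_x = mean(x1) - w2 * g1 * d_mean
    mu_y = mean(y2) + w1 * g2 * d_mean
    var_x = cov(x1, x1) - w2 * g1 ** 2 * d_var + w1 * w2 * g1 ** 2 * d_mean ** 2
    var_y = cov(y2, y2) + w1 * g2 ** 2 * d_var + w1 * w2 * g2 ** 2 * d_mean ** 2
    slope = (var_y / var_x) ** 0.5
    return lambda x: slope * (x - mu_x) + mu_y
```

With `w1 = 1` the synthetic population is simply the group that took Form X, so the Form X moments are used as observed and only the Form Y moments are adjusted through the anchor.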

9.

New Equating Methods Derived from Statistical Tests and an Examination of Their Performance

Xiong Jianhua, Ding Shuliang, Lei Ningning 《心理学探新》 (Psychological Exploration) 2007,27(1):70-74

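Inspired by Berkson's use of testing methods to estimate unknown parameters, this article derives three new methods for obtaining equating coefficients from three goodness-of-fit statistics: the square root criterion (SQRTcrit), the symmetric relative entropy criterion (SREcrit), and the weighted criterion (Wcrit), which is a weighted form of the Haebara criterion. Although the three multinomial goodness-of-fit tests are asymptotically equivalent when the two distributions being tested are very close, Monte Carlo simulations show that the three new equating methods behave differently when used to obtain equating coefficients. The differences are closely related to the size of the random error, that is, to the precision of the item parameter estimates, and also to the range of the equating coefficient A.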

10.

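A Study of Score Equating for the College English Test Band 4 and Band 6 (total citations: 5, self-citations: 0, citations by others: 5)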

Zhu Zhengcai 《心理学报》 (Acta Psychologica Sinica) 2005,37(2):280-284

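Several problems in the existing score equating model for the College English Test Band 4 and Band 6 are analyzed in depth, and a new solution is proposed, based on an anchor item design and the two-parameter IRT model. Its main points are: (1) replacing the original Rasch model with the two-parameter logistic model to improve item-level model fit; (2) replacing the original common-examinee equating design with a common-item design, which resolves the difficulty of controlling the motivation of the equating examinees under the common-examinee design; (3) building a dedicated equating item bank and completing the pretesting and parameter calibration of its anchor items in a single pass, which resolves the error accumulation in the original equating model; because anchor items are comparatively easy to keep secure, the dedicated item bank is also of great importance for the reliability of the equating results; (4) testing the new score equating scheme in an equating computation on real examination data, which produced a satisfactory result.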

11.

A Bayesian Nonparametric Approach to Test Equating

George Karabatsos, Stephen G. Walker 《Psychometrika》2009,74(2):211-232

A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are compared through the analysis of data sets famous in the equating literature. Also, the classical percentile-rank, linear, and mean equating models are each proven to be a special case of a Bayesian model under a highly-informative choice of prior distribution.
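
For orientation, two of the classical models named in this abstract, mean equating and linear equating, can be written in a few lines of Python for an equivalent groups design. This is a generic textbook sketch with hypothetical function names, using population standard deviations; it does not reproduce the Bayesian model of the article.

```python
from statistics import mean, pstdev


def mean_equate(x, x_scores, y_scores):
    """Mean equating: shift x by the difference between the form means."""
    return x - mean(x_scores) + mean(y_scores)


def linear_equate(x, x_scores, y_scores):
    """Linear equating: match both the mean and the standard deviation."""
    slope = pstdev(y_scores) / pstdev(x_scores)
    return slope * (x - mean(x_scores)) + mean(y_scores)
```

Mean equating is the special case of linear equating in which the two forms are assumed to have equal standard deviations.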

12.

Standard Errors of Kernel Equating: Accounting for Bandwidth Estimation

Kseniia Marcq, Björn Andersson 《Applied Psychological Measurement》2022,46(3):200

In standardized testing, equating is used to ensure comparability of test scores across multiple test administrations. One equipercentile observed-score equating method is kernel equating, where an essential step is to obtain continuous approximations to the discrete score distributions by applying a kernel with a smoothing bandwidth parameter. When estimating the bandwidth, additional variability is introduced which is currently not accounted for when calculating the standard errors of equating. This poses a threat to the accuracy of the standard errors of equating. In this study, the asymptotic variance of the bandwidth parameter estimator is derived and a modified method for calculating the standard error of equating that accounts for the bandwidth estimation variability is introduced for the equivalent groups design. A simulation study is used to verify the derivations and confirm the accuracy of the modified method across several sample sizes and test lengths as compared to the existing method and the Monte Carlo standard error of equating estimates. The results show that the modified standard errors of equating are accurate under the considered conditions. Furthermore, the modified and the existing methods produce similar results which suggest that the bandwidth variability impact on the standard error of equating is minimal.

13.

Asymptotic standard errors of IRT observed-score equating methods

Haruhiko Ogasawara 《Psychometrika》2003,68(2):193-211

A method of the IRT observed-score equating using chain equating through a third test without equating coefficients is presented with the assumption of the three-parameter logistic model. The asymptotic standard errors of the equated scores by this method are obtained using the results given by M. Liou and P.E. Cheng. The asymptotic standard errors of the IRT observed-score equating method using a synthetic examinee group with equating coefficients, which is a currently used method, are also provided. Numerical examples show that the standard errors by these observed-score equating methods are similar to those by the corresponding true score equating methods except in the range of low scores. The author is indebted to Michael J. Kolen for access to the real data used in this article and anonymous reviewers for their corrections and suggestions on this work.

14.

The reliability of linearly equated tests

Daniel O. Segall 《Psychometrika》1994,59(3):361-375

15.

A Comparative Study of Four CTT-Based Equating Methods under the Anchor Test Nonequivalent Groups Design

Jiao Liya, Xin Tao 《心理发展与教育》 (Psychological Development and Education) 2006,22(1):97-102

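Using data collected under the anchor test nonequivalent groups design, four equating methods based on classical test theory were compared. The data came from the TIMSS 1999 database; the standard error of equating and cross-validation were both used as criteria for comparing the methods, and the experimental data were analyzed with the CIPE program. The results show that, for the equating situation set up in this study, linear equating outperforms equipercentile equating: the Tucker linear method is somewhat better than the Levine observed-score linear method, the Braun-Holland linear method should not be used, and the frequency estimation equipercentile method has relatively large equating error and is likewise not recommended.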

16.

Analytic smoothing for equipercentile equating under the common item nonequivalent populations design (total citations: 1, self-citations: 0, citations by others: 1)

Michael J. Kolen, David Jarjoura 《Psychometrika》1987,52(1):43-59

A cubic spline method for smoothing equipercentile equating relationships under the common item nonequivalent populations design is described. Statistical techniques based on bootstrap estimation are presented that are designed to aid in choosing an equating method/degree of smoothing. These include: (a) asymptotic significance tests that compare no equating and linear equating to equipercentile equating; (b) a scheme for estimating total equating error and for dividing total estimated error into systematic and random components. The smoothing technique and statistical procedures are explored and illustrated using data from forms of a professional certification test.

17.

A Comparative Study of 15 Test Equating Methods (total citations: 20, self-citations: 2, citations by others: 18)

Xie Xiaoqing 《心理学报》 (Acta Psychologica Sinica) 2000,32(2):217-222

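This study experimentally compared 4 equating methods based on classical test theory and 11 equating methods based on item response theory. The data came from operational HSK administrations, and a relatively reliable evaluation criterion was used. The results show that in some cases equating is not the best choice; that some IRT methods are feasible for item bank construction; and that, at least for the HSK data, IRT parameter-transformation equating methods have relatively large error and are not recommended, whether the model has one, two, or three parameters and whether the ms or the mm method is used.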