
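The LP Method and Its Comparison with Three Commonly Used DIF Detection Methods
Cite as: 余跃 杜文久 周娟 秦菊香. The LP Method and Its Comparison with Three Commonly Used DIF Detection Methods [J]. 心理科学, 2016, 39(3): 720-726
Authors: 余跃 杜文久 周娟 秦菊香
Affiliations: 1. School of Mathematics and Statistics, Southwest University (西南大学); 2. Chengdu Shishi Shuangnan Experimental School (成都石室双楠实验学校);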
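Abstract: Based on item response theory, this study proposes a new method for detecting DIF that has high power and a low Type I error rate: the LP method (Likelihood Procedure). The method is introduced using DIF testing of items under the 2PLM as an example. Its effectiveness is examined by comparison with three commonly used DIF detection methods (the MH method, Lord's chi-square test, and Raju's area measure), and the possible effects of sample size, test length, the difference between the ability distributions of the focal and reference groups, DIF size, and other related factors are also explored. The simulation study leads to the following conclusions: (1) the LP method is more sensitive and more robust than the MH method and Lord's chi-square method; (2) the LP method is more reasonable than Raju's area measure; (3) the power of the LP method increases as the examinee sample size or the DIF size increases; (4) when the reference and focal groups do not differ in ability, the power of the LP method is higher under all conditions than when the two groups differ in ability; (5) the LP method has good power for both uniform and non-uniform DIF, and its power is higher for uniform DIF than for non-uniform DIF. The LP method can be readily extended to multidimensional and polytomously scored items.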

Keywords: differential item functioning  item response theory  LP method (Likelihood Procedure)  MH method (Mantel-Haenszel Procedure)  Lord's chi-square test  Raju's area measure
Received: 2015-07-20
Revised: 2015-11-29

A New Method: LP and Its Comparison with Three Commonly Used DIF Detection Procedures
Abstract: With the development of psychometrics and the wide application of psychological and educational tests, test fairness has become a concern for educators and psychologists, and differential item functioning (DIF) is being studied in increasing depth. DIF detection is widely used in routine item analysis, and a number of detection methods have been developed, such as the Mantel-Haenszel (MH) procedure, Standardization (STND), the Simultaneous Item Bias Procedure (SIBTEST), the Likelihood Ratio (LR) test, Lord's chi-square, Raju's area measures, and the MIMIC method. Most of these, however, suffer from either low power or a high Type I error rate, so a more effective DIF detection method is needed. The LP (Likelihood Procedure) method proposed in this paper is an IRT-based DIF detection method, illustrated here with item-level DIF detection under the two-parameter logistic model (2PLM). The performance of LP was compared with that of the MH method, Lord's chi-square, and Raju's area measure. DIF size, test length, sample size, and the difference between the ability distributions of the focal and reference groups were also considered. The three levels of DIF size were 0.3, 0.5, and 0.8. The two levels of test length were 40 and 100. The three levels of sample size were 500, 1000, and 2000 examinees. Two ability-distribution conditions were used: in one, both groups follow the standard normal distribution; in the other, the reference group follows the standard normal distribution while the focal group follows a normal distribution with mean -1 and standard deviation 1. In this simulation study, data were generated using the two-parameter logistic model.
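The data-generation step described above can be sketched roughly as follows. This is a minimal illustration only, not the authors' code: the abstract does not give the item parameters, so the pool drawn by `make_items` is hypothetical, as are all function names, the seed, the choice to omit the 1.7 scaling constant, and the reading of "500 examinees" as a per-group size. The example shows one uniform-DIF item (difficulty shifted by 0.5 for the focal group) on a 40-item test, with focal abilities drawn from N(-1, 1).

```python
import math
import random

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumes no 1.7 scaling constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def make_items(n_items, rng):
    """Hypothetical item pool: a ~ U(0.5, 2.0), b ~ N(0, 1).
    The paper's actual parameter values are not stated in the abstract."""
    return [(rng.uniform(0.5, 2.0), rng.gauss(0.0, 1.0)) for _ in range(n_items)]

def add_uniform_dif(items, dif_index, dif_size):
    """Return a copy of the item list in which one item's difficulty
    is shifted by dif_size (the focal-group version of the test)."""
    shifted = list(items)
    a, b = shifted[dif_index]
    shifted[dif_index] = (a, b + dif_size)
    return shifted

def simulate_group(n_examinees, items, ability_mean, rng):
    """Draw abilities from N(ability_mean, 1) and return a 0/1 response matrix
    (one row per examinee, one column per item)."""
    responses = []
    for _ in range(n_examinees):
        theta = rng.gauss(ability_mean, 1.0)
        row = [1 if rng.random() < p_correct(theta, a, b) else 0
               for a, b in items]
        responses.append(row)
    return responses

rng = random.Random(2016)  # arbitrary seed, for reproducibility only
ref_items = make_items(40, rng)
focal_items = add_uniform_dif(ref_items, dif_index=0, dif_size=0.5)
reference = simulate_group(500, ref_items, ability_mean=0.0, rng=rng)
focal = simulate_group(500, focal_items, ability_mean=-1.0, rng=rng)
```

A non-uniform DIF item could be produced the same way by changing the discrimination `a` instead of the difficulty `b`; a full study would repeat this over many replications for each combination of conditions.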
For each DIF item, the difficulty value or the discrimination value in the focal group is greater than that in the reference group. Each condition contains six DIF items in total, covering both uniform and non-uniform DIF, with items corresponding to each of the three true DIF values. The simulation study yields the following results: (1) LP has high power and a low, stable Type I error rate. (2) Overall, the power of LP is higher than that of Lord's chi-square method and far higher than that of the Mantel-Haenszel (MH) method; the Type I error rate of LP is lower than that of Lord's chi-square method, and when the test length is 100 the Type I error rate of the MH method falls far outside the stable range. (3) LP is no better than Raju's area measure in power, but the Type I error rate of the latter is so high that it exceeds 0.1 and falls far outside the stable range under a variety of conditions. Generally speaking, LP has the following advantages: (1) LP is more sensitive and more stable than MH. (2) LP is more reasonable than Raju's area measure for detecting DIF. (3) The power of LP increases with the examinee sample size and with the true DIF value. (4) The power of LP is lower when the focal and reference groups differ in ability than when they have the same ability. (5) The power of LP is high for both uniform and non-uniform DIF, and higher for the former. Finally, LP is applicable not only to the two-parameter logistic model but also to the one-parameter and three-parameter logistic models. In addition, it can easily be extended to multidimensional and polytomously scored items.
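For orientation, the MH procedure used as a comparison baseline can be sketched as below. This is the textbook Mantel-Haenszel chi-square with continuity correction and the common odds ratio, not the LP method (whose details the abstract does not give) and not the authors' implementation. The function name, the `"R"`/`"F"` group labels, and the use of a supplied matching score are assumptions made for illustration.

```python
import math
from collections import defaultdict

def mantel_haenszel(item, group, match):
    """Mantel-Haenszel DIF statistic for one studied item.

    item  : 0/1 scores on the studied item
    group : "R" (reference) or "F" (focal) for each examinee
    match : matching score for each examinee (e.g. total test score)
    Returns (chi-square with continuity correction, p-value, common odds ratio).
    """
    # Per stratum: [A, B, C, D] = [ref correct, ref wrong, focal correct, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for x, g, s in zip(item, group, match):
        idx = (0 if x == 1 else 1) + (0 if g == "R" else 2)
        tables[s][idx] += 1

    sum_a = sum_e = sum_var = num = den = 0.0
    for a, b, c, d in tables.values():
        t = a + b + c + d
        n_r, n_f, m1, m0 = a + b, c + d, a + c, b + d
        if t < 2 or n_r == 0 or n_f == 0:
            continue  # stratum carries no between-group information
        sum_a += a
        sum_e += n_r * m1 / t                              # E(A_k)
        sum_var += n_r * n_f * m1 * m0 / (t * t * (t - 1))  # Var(A_k)
        num += a * d / t
        den += b * c / t

    if sum_var == 0:
        return float("nan"), float("nan"), float("nan")
    chi2 = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_var
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, df = 1
    alpha = num / den if den > 0 else float("inf")
    return chi2, p_value, alpha

# Tiny hand-checkable example: a single score stratum with
# 8 of 10 reference and 4 of 10 focal examinees answering correctly.
item = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6
group = ["R"] * 10 + ["F"] * 10
match = [10] * 20
chi2, p_value, alpha = mantel_haenszel(item, group, match)
# chi2 is approximately 1.781 and alpha is approximately 6.0
```

In a simulation of the kind summarized above, power would be estimated as the proportion of replications in which a true DIF item is flagged (for example, `p_value < 0.05`), and the Type I error rate as the proportion in which a DIF-free item is flagged.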
Keywords: Differential Item Functioning (DIF)   Item Response Theory (IRT)   LP (Likelihood Procedure)   the Mantel-Haenszel (MH) method   Lord's Chi-Square   Raju's Area Measures
This article is indexed in CNKI and other databases.