Similar literature
1.
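Two classes of methods for detecting differential item functioning: a comparison of CFA and IRT   Total citations: 2, self-citations: 0, citations by others: 2
Both confirmatory factor analysis (CFA) and item response theory (IRT) offer methods for identifying differential item functioning (DIF). Focusing on unidimensional, polytomously scored items, this article describes the CFA-based and IRT-based methods for detecting DIF and compares the two.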

2.
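A fairness analysis of HSK scores for Chinese ethnic minority examinees and foreign examinees   Total citations: 3, self-citations: 0, citations by others: 3
Using differential item functioning (DIF) theory, this study analyzes item responses for two groups of HSK examinees, foreigners and members of China's domestic ethnic minorities, to examine whether any HSK items disadvantage one of the groups. Specifically, the Mantel-Haenszel (MH) and SIBTEST methods are used to detect DIF, and standardized discrete analysis together with SIBTEST item-bundle analysis is used to distinguish genuine from spurious DIF and to look for its causes. The data analysis shows that Form A of the HSK (Elementary-Intermediate) contains some items that exhibit DIF between foreign examinees and China's domestic ethnic minority examinees.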
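
The MH statistic named in the abstract above can be illustrated with a minimal Python sketch. The abstract gives no formulas or data, so the function names, the input layout (one 2x2 table per matched score level), and the example counts are assumptions made purely for illustration. The sketch computes the Mantel-Haenszel common odds ratio and its conversion to the ETS delta scale (MH D-DIF = -2.35 ln alpha).

```python
from math import log


def mantel_haenszel_odds_ratio(strata):
    """Return the Mantel-Haenszel common odds ratio.

    `strata` is an iterable of tuples (hypothetical layout):
    (ref_correct, ref_wrong, focal_correct, focal_wrong),
    one tuple per matched total-score level.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_correct, ref_wrong, focal_correct, focal_wrong in strata:
        n = ref_correct + ref_wrong + focal_correct + focal_wrong
        if n == 0:
            continue  # an empty score level carries no information
        numerator += ref_correct * focal_wrong / n
        denominator += ref_wrong * focal_correct / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return numerator / denominator


def mh_d_dif(alpha):
    """Convert the common odds ratio to the ETS delta scale."""
    return -2.35 * log(alpha)


# Illustrative counts only, not data from the study.
example_strata = [(30, 10, 20, 20), (40, 10, 30, 20)]
alpha = mantel_haenszel_odds_ratio(example_strata)
delta = mh_d_dif(alpha)
```

An odds ratio above 1 (a negative delta) means the item favors the reference group after matching on total score; values near 1 (delta near 0) suggest little DIF.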

3.
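Detecting differential item functioning with the mean and covariance structure model   Total citations: 1, self-citations: 0, citations by others: 1
This article explains the rationale and procedure for using the multi-group mean and covariance structure (MACS) model to detect uniform and non-uniform differential item functioning (DIF) in polytomous items, illustrates the procedure by detecting DIF in a moral self-concept scale, and evaluates the method. Compared with item response theory, MACS uses modification indices in a systematic, iterative way to detect DIF and provides multiple fit indices for jointly evaluating model fit; compared with standard confirmatory factor analysis, MACS can detect not only non-uniform DIF but also uniform DIF. Using MACS to detect DIF is a method worth recommending.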

4.
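Passage-based reading tests occupy an increasingly important place in language arts examinations and language proficiency testing. A passage-based reading test is a typical testlet-based test, so it must be analyzed with statistical methods that can handle testlet effects. Testing for differential item functioning (DIF) likewise requires a matching DIF detection method. The DIF detection methods currently able to handle testlet effects mainly comprise adapted testlet DIF detection methods and DIF detection methods based on testlet response models; because the latter are cumbersome to implement, they remain at the stage of theoretical discussion. This study introduces adapted testlet DIF detection methods and their effect size indices into DIF detection for passage-based reading tests, which addresses the problem of detecting and measuring DIF in such tests; the effect size indices provide an important basis for deciding how to handle testlet items that show DIF. The study first applies a non-testlet DIF detection method and an adapted testlet DIF detection method to one test paper, and the comparison demonstrates the necessity and advantages of testlet DIF detection; it then applies four representative adapted testlet DIF detection methods to a reading achievement test. The results indicate that, for passage-based reading tests, DIF detection methods that can handle testlet effects have considerable advantages over traditional DIF detection methods.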

5.
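A passage-based reading test is a typical testlet-based test, and testing for differential item functioning (DIF) requires a matching DIF detection method. DIF detection based on testlet response models can genuinely handle testlet effects and provides a DIF effect measure for every item in a testlet, which gives it a theoretical advantage among testlet DIF detection methods; the main method in use is the Rasch testlet DIF detection method. This study introduces the Rasch testlet DIF detection method into DIF detection for passage-based reading tests and applies it to a reading achievement test. The results show that items with significant DIF effects appeared in some sub-dimensions of the test's content dimension and ability dimension, and the study offers suggestions, from the standpoint of test fairness, for further revising and constructing the test. The study further compares the results of the Rasch testlet DIF detection method with those of DIF detection based on the traditional Rasch model and of adapted testlet DIF detection methods; the comparison demonstrates the necessity and advantages of testlet DIF detection. The results indicate that, for passage-based reading tests, testlet DIF detection methods that genuinely handle testlet effects have greater theoretical advantages and matter more for constructing reading tests and improving their quality.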

6.
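The Rosenberg Self-Esteem Scale (RSES) was administered to 425 university students, and the Rasch model from item response theory was used to analyze item indices and test for DIF. The results show that the RSES is unidimensional, with a reliability of 0.84. Except for item 8, the items show good fit, and the scale is best suited to distinguishing individuals with medium and lower levels of self-esteem. The DIF analysis found DIF on items 1 and 5, with male students showing higher self-esteem than female students. Compared with classical test theory, applying the Rasch model to the RSES has advantages, and the analysis provides a basis for further refining and using the scale.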

7.
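Wang Zhuoran (王卓然), Guo Lei (郭磊), Bian Yufang (边玉芳). Acta Psychologica Sinica, 2014, 46(12): 1923-1932
Detecting differential item functioning (DIF) is an important issue in cognitive diagnostic tests. This study first introduces the logistic regression (LR) method into DIF detection for cognitive diagnostic tests and then compares its performance with that of the MH method and the Wald test. The comparison also examines the effects of the matching variable, DIF type, DIF magnitude, and number of examinees. The results show that: (1) in DIF detection for cognitive diagnostic tests, the LR method has relatively high power and a relatively low Type I error rate; (2) when detecting DIF in cognitive diagnostic tests, the LR method is unaffected by the cognitive diagnostic method used; (3) the LR method can effectively distinguish uniform from non-uniform DIF, with relatively high power and a relatively low Type I error rate; (4) using knowledge states as the matching variable yields satisfactory power and Type I error rates; (5) the larger the DIF and the more examinees, the higher the statistical power, while the Type I error rate is unaffected.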
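
For orientation, the LR approach to DIF is commonly written as a logistic model with a group term and an interaction term. The abstract above does not state its model, so the form below is the widely used textbook formulation rather than the authors' exact specification, and the symbols (the item response u, the matching variable x, the group indicator g, and the coefficients) are assumptions introduced here.

```latex
\[
\ln \frac{P(u = 1 \mid x, g)}{1 - P(u = 1 \mid x, g)}
  = \beta_0 + \beta_1 x + \beta_2 g + \beta_3 (x \cdot g)
\]
```

Under this formulation, uniform DIF corresponds to $\beta_2 \neq 0$ with $\beta_3 = 0$, and non-uniform DIF corresponds to $\beta_3 \neq 0$, which is why the method can separate the two types.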

8.
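Based on item response theory, this study proposes a new DIF detection method with high power and a low Type I error rate, the LP method (Likelihood Procedure), and introduces it using DIF testing of items under the 2PLM as an example. The study assesses the effectiveness of the LP method by comparing it with three commonly used DIF detection methods (the MH method, Lord's chi-square test, and Raju's area measure) and also examines how factors such as sample size, test length, the difference between the ability distributions of the focal and reference groups, and DIF magnitude may affect its effectiveness. The simulation study leads to the following conclusions: (1) the LP method is more sensitive and more robust than the MH method and Lord's chi-square method; (2) the LP method is more reasonable than Raju's area measure; (3) the power of the LP method increases as the sample size or the DIF magnitude increases; (4) when the reference and focal groups do not differ in ability, the power of the LP method under all conditions is higher than when they do differ; (5) the LP method has good power for both uniform and non-uniform DIF, and its power is higher for uniform than for non-uniform DIF. The LP method can readily be extended and applied to multidimensional and polytomously scored items.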
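
As background for the 2PLM mentioned above, the standard two-parameter logistic item response function is shown below. The abstract does not spell it out, so the notation (discrimination $a_i$, difficulty $b_i$, ability $\theta$) is the conventional one and is assumed here; some presentations also include a scaling constant $D \approx 1.7$ in the exponent.

```latex
\[
P_i(\theta) = \frac{1}{1 + \exp\left[-a_i(\theta - b_i)\right]}
\]
```

In this framing, an item shows DIF when its parameters differ between the reference and focal groups: a difference in $b_i$ alone gives uniform DIF, while a difference in $a_i$ gives non-uniform DIF.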

9.
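Application of differential item functioning in the analysis of cross-cultural personality questionnaires   Total citations: 2, self-citations: 0, citations by others: 2
Cao Yiwei (曹亦薇). Acta Psychologica Sinica, 2003, 35(1): 120-126
Using the IRT graded model, this study investigated differential item functioning (DIF) on the "environmental sensitivity" scale of the SHIBA brief personality inventory in Chinese and Japanese samples. The study found that: (1) a large proportion of the scale's items (3/4) show DIF; (2) DIF is related to item content and thresholds but has little to do with the magnitude of discrimination; (3) among the DIF items, the Japanese group's characteristic curves are more coherent than the Chinese group's. The study uses the DIF results to make a new attempt at cross-cultural personality comparison and closes by proposing new topics for deepening DIF research.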

10.
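A matrix-sampling test contains multiple booklets, so the total score on a single booklet cannot be used directly as the matching variable for DIF detection. Using simulated data, this study first applies both the IRT_Δb method and a modified LR method, in which examinee ability estimated from an IRT model serves as the matching variable, to DIF detection in a matrix-sampling test and analyzes the effectiveness of the two methods and the factors that influence it. It then derives classification criteria for the IRT_Δb method from the existing DIF classification criteria for the LR method, and finally validates the results with empirical data. The results show that, in matrix-sampling tests, both the IRT_Δb method and the modified LR method distinguish well between items with different amounts of DIF, and that sample size, the proportion of DIF items in a booklet, and the true ability difference between examinee groups all have considerable effects on the power, Type I error rate, and classification results of both methods.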

11.
This study investigated differential item functioning (DIF) mechanisms in the context of differential testlet effects across subgroups. Specifically, we investigated DIF manifestations when the stochastic ordering assumption on the nuisance dimension in a testlet does not hold. DIF hypotheses were formulated analytically using a parametric marginal item response function approach and compared with empirical DIF results from a unidimensional item response theory approach. The comparisons were made in terms of type of DIF (uniform or non-uniform) and direction (whether the focal or reference group was advantaged). In general, the DIF hypotheses were supported by the empirical results, showing the usefulness of the parametric approach in explaining DIF mechanisms. Both analytical predictions of DIF and the empirical results provide insights into conditions where a particular type of DIF becomes dominant in a specific DIF direction, which is useful for the study of DIF causes.

12.
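A comparative study of three commonly used DIF detection methods   Total citations: 6, self-citations: 1, citations by others: 5
After giving a new, stricter definition of DIF and describing three commonly used DIF detection methods in detail, this study empirically compares the three methods using the 75 multiple-choice items of the 1999 National College Entrance Examination English paper. The results show that the MH and SIBTEST methods are more sensitive than the STND method; that the MH and SIBTEST methods agree closely in the items they flag; that the SIBTEST method performs well and can be the first choice in practice; and that a sample size of about 1,000 is appropriate for DIF detection.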
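
To make the STND (standardization) method concrete, here is a minimal Python sketch of the standardized p-difference index. The abstract gives no formula, so this follows the common definition (a weighted average of the focal-minus-reference difference in proportion correct at each matched score level, weighted by focal-group counts); the function name, input layout, and example counts are assumptions for illustration only.

```python
def std_p_dif(levels):
    """Return the standardized p-difference (STD P-DIF) for one item.

    `levels` is an iterable of tuples (hypothetical layout):
    (focal_n, focal_correct, ref_n, ref_correct),
    one tuple per matched total-score level.
    Weights are the focal-group counts at each level.
    """
    weighted_sum = 0.0
    total_weight = 0
    for focal_n, focal_correct, ref_n, ref_correct in levels:
        if focal_n == 0 or ref_n == 0:
            continue  # a level needs examinees from both groups to compare
        p_focal = focal_correct / focal_n
        p_ref = ref_correct / ref_n
        weighted_sum += focal_n * (p_focal - p_ref)
        total_weight += focal_n
    if total_weight == 0:
        raise ValueError("no score level contains both groups")
    return weighted_sum / total_weight


# Illustrative counts only, not data from the study.
example_levels = [(10, 5, 20, 12), (30, 24, 40, 36)]
index = std_p_dif(example_levels)
```

A negative index means that focal-group examinees answer the item correctly less often than matched reference-group examinees; an index near 0 suggests negligible DIF.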

13.
Standardized tests are used widely in comparative studies of clinical populations, either as dependent or control variables. Yet, one cannot always be sure that the test items measure the same constructs in the groups under study. In the present work, 460 participants with intellectual disability of undifferentiated etiology and 488 typical children were tested using Raven's Colored Progressive Matrices (RCPM). Data were analyzed using binomial logistic regression modeling designed to detect differential item functioning (DIF). Results showed that 12 items out of 36 function differentially between the two groups, but only 2 items exhibit at least moderate DIF. Thus, a very large majority of the items have identical discriminative power and difficulty levels across the two groups. It is concluded that RCPM can be used with confidence in studies comparing participants with and without intellectual disability. In addition, it is suggested that methods for investigating internal bias of tests used in cross-cultural, cross-linguistic or cross-gender comparisons should also be regularly employed in studies of clinical populations, particularly in the field of developmental disability, to show the absence of systematic measurement error (i.e. DIF) affecting item responses.

14.
Differential item functioning (DIF), referring to between-group variation in item characteristics above and beyond the group-level disparity in the latent variable of interest, has long been regarded as an important item-level diagnostic. The presence of DIF impairs the fit of the single-group item response model being used, and calls for either model modification or item deletion in practice, depending on the mode of analysis. Methods for testing DIF with continuous covariates, rather than categorical grouping variables, have been developed; however, they are restrictive in parametric forms, and thus are not sufficiently flexible to describe complex interactions among latent variables and covariates. In the current study, we formulate the probability of endorsing each test item as a general bivariate function of a unidimensional latent trait and a single covariate, which is then approximated by a two-dimensional smoothing spline. The accuracy and precision of the proposed procedure are evaluated via Monte Carlo simulations. If anchor items are available, we propose an extended model that simultaneously estimates item characteristic functions (ICFs) for anchor items, ICFs conditional on the covariate for non-anchor items, and the latent variable density conditional on the covariate, all using regression splines. A permutation DIF test is developed, and its performance is compared to the conventional parametric approach in a simulation study. We also illustrate the proposed semiparametric DIF testing procedure with an empirical example.

15.
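Liu Hongyun (刘红云), Luo Fang (骆方). Acta Psychologica Sinica, 2008, 40(1): 92-100
The authors briefly introduce multilevel item response models, discuss the relationship between multilevel item response theory and conventional item response theory, derive the relationship between the parameters of multilevel item response models and those of conventional item response models, and discuss extensions of multilevel item response models. Using a real example, they apply a multilevel item response model to analyze the characteristics of test items, to test the effects of individual-level and group-level predictors on the ability parameter, and to analyze differential item functioning. Finally, the article discusses the strengths and weaknesses of multilevel item response theory models.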

16.
This report documents relationships between differential item functioning (DIF) identification and: (1) item-trait association, and (2) scale multidimensionality in personality assessment. Applying Zumbo's (1999) logistic regression model [Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense], DIF effect size is found to become increasingly inflated as investigated item associations with trait scores decrease. Similar patterns were noted for the influence of scale multidimensionality on DIF identification. Individuals who investigate DIF in personality assessment applications are provided with estimates regarding the impact of the magnitude of item and trait association and scale multidimensionality on DIF occurrence and effect size. The results emphasize the importance of excluding investigated items in focal trait identification prior to conducting DIF analyses and reporting item and scale psychometric properties in DIF reports.

17.
Recent work reframes direct effects of covariates on items in mixture models as differential item functioning (DIF) and shows that, when present in the data but omitted from the fitted latent class model, DIF can lead to overextraction of classes. However, less is known about the effects of DIF on model performance, including parameter bias, classification accuracy, and distortion of class-specific response profiles, once the correct number of classes is chosen. First, we replicate and extend prior findings relating DIF to class enumeration using a comprehensive simulation study. In a second simulation study using the same parameters, we show that, while the performance of LCA is robust to the misspecification of DIF effects, it is degraded when DIF is omitted entirely. Moreover, the robustness of LCA to omitted DIF differs widely based on the degree of class separation. Finally, simulation results are contextualized by an empirical example.

18.