Similar literature
 20 similar records found (search time: 15 ms)
1.
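Based on item response theory, this study proposes a new method for detecting DIF that has high power and a low Type I error rate: the LP method (Likelihood Procedure), introduced here using DIF testing of items under the 2PLM as an example. The effectiveness of the LP method is examined by comparing it with three commonly used DIF detection methods (the MH method, Lord's chi-square test, and Raju's area measure), and the study also explores how sample size, test length, differences between the ability distributions of the focal and reference groups, DIF magnitude, and other related factors may affect its effectiveness. The simulation study yields the following conclusions: (1) the LP method is more sensitive and more robust than the MH method and Lord's chi-square method; (2) the LP method is more reasonable than Raju's area measure; (3) the power of the LP method increases as the examinee sample size or the DIF magnitude increases; (4) when the reference and focal groups do not differ in ability, the power of the LP method under all conditions is higher than when the two groups differ in ability; (5) the LP method has good power for both uniform and non-uniform DIF, with higher power for uniform DIF than for non-uniform DIF. The LP method can be readily extended and applied to multidimensional and polytomously scored items.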
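
Entry 1 refers to the 2PLM and to the distinction between uniform and non-uniform DIF. The following is a minimal sketch of that distinction, assuming the standard two-parameter logistic form without a scaling constant. The function names and all parameter values are hypothetical and are not taken from the study, and the sketch does not implement the LP method itself.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model.

    theta: examinee ability, a: discrimination, b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def icc_gap(theta_grid, ref, focal):
    """Reference-minus-focal difference in the item characteristic curve.

    ref and focal are (a, b) tuples for the same item, estimated per group.
    """
    return [p_2pl(t, *ref) - p_2pl(t, *focal) for t in theta_grid]


grid = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Uniform DIF (illustrative values): same discrimination, different
# difficulty, so the gap keeps one sign over the whole ability range.
uniform = icc_gap(grid, ref=(1.2, 0.0), focal=(1.2, 0.5))

# Non-uniform DIF (illustrative values): different discrimination,
# same difficulty, so the curves cross and the gap changes sign at theta = b.
nonuniform = icc_gap(grid, ref=(1.2, 0.0), focal=(0.6, 0.0))
```

In this sketch a gap that keeps one sign corresponds to uniform DIF and a gap that changes sign corresponds to non-uniform DIF, which is why area-based and likelihood-based procedures can behave differently for the two cases.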

2.
Item response theory was used to address gender bias in interest measurement. A differential item functioning (DIF) technique, SIBTEST, and DIMTEST for dimensionality were applied to the items of the six General Occupational Theme (GOT) and 25 Basic Interest (BI) scales in the Strong Interest Inventory. A sample of 1860 women and 1105 men was used. The scales were not unidimensional and contained both primary and minor dimensions. Gender-related DIF was detected in two-thirds of the items. Item type (i.e., occupations, activities, school subjects, types of people) did not differ in DIF. A sex-type dimension was found to influence the responses of men and women differently. When the biased items were removed from the GOT scales, gender differences favoring men were reduced in the R and I scales but gender differences favoring women remained in the A and S scales. Implications for the development, validation and use of interest measures are discussed.

3.
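Application of differential item functioning in the analysis of cross-cultural personality questionnaires   Total citations: 2, self-citations: 0, citations by others: 2
曹亦薇 (Cao Yiwei), 《心理学报》 (Acta Psychologica Sinica), 2003, 35(1): 120-126
Using the graded model of IRT, this study investigated differential item functioning (DIF) between Chinese and Japanese respondents on the "environmental sensitivity" scale of the SHIBA brief personality inventory. The findings were: (1) a large proportion of the scale's items (3/4) showed DIF; (2) DIF was related to item content and thresholds but had little relation to the magnitude of discrimination; (3) among the DIF items, the characteristic curves of the Japanese group showed stronger coherence than those of the Chinese group. The study uses the DIF results to make a new attempt at cross-cultural personality comparison. Finally, new topics for deepening DIF research are proposed.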

4.
Standardized tests are used widely in comparative studies of clinical populations, either as dependent or control variables. Yet, one cannot always be sure that the test items measure the same constructs in the groups under study. In the present work, 460 participants with intellectual disability of undifferentiated etiology and 488 typical children were tested using Raven's Colored Progressive Matrices (RCPM). Data were analyzed using binomial logistic regression modeling designed to detect differential item functioning (DIF). Results showed that 12 items out of 36 function differentially between the two groups, but only 2 items exhibit at least moderate DIF. Thus, a very large majority of the items have identical discriminative power and difficulty levels across the two groups. It is concluded that RCPM can be used with confidence in studies comparing participants with and without intellectual disability. In addition, it is suggested that methods for investigating internal bias of tests used in cross-cultural, cross-linguistic or cross-gender comparisons should also be regularly employed in studies of clinical populations, particularly in the field of developmental disability, to show the absence of systematic measurement error (i.e. DIF) affecting item responses.

5.
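A parametric-method study of DIF detection in economic law test items   Total citations: 2, self-citations: 1, citations by others: 1
Based on Samejima's graded response model (GRM) in item response theory and supported by the MULTILOG software, this study applied a parametric detection method to conduct a DIF analysis of the 21 items in the economic law section of one subject paper from a national qualification examination in a given year. The results were as follows: one item showed gender DIF, four items showed ethnicity DIF, and one item showed DIF by type of work. Item 68 showed uniform DIF at the ethnicity level, and item 64 showed both ethnicity DIF and type-of-work DIF. Drawing on an analysis of the item statistics and response curves and on discussion with experts, the article concludes by analyzing several possible causes of these DIF findings.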

6.
7.
A method for analyzing test item responses is proposed to examine differential item functioning (DIF) in multiple-choice items through a combination of the usual notion of DIF for correct/incorrect responses and information about DIF contained in each of the alternatives. The proposed method uses incomplete latent class models to examine whether DIF is caused by the attractiveness of the alternatives, difficulty of the item, or both. DIF with respect to either known or unknown subgroups can be tested by a likelihood ratio test that is asymptotically distributed as a chi-square random variable.

8.
The 20-item Toronto Alexithymia Scale (TAS-20) is a self-report questionnaire designed to measure the three components of alexithymia: difficulty identifying feelings in the self (DIF), difficulty describing feelings (DDF), and externally orientated thinking (EOT). We examined the scale’s psychometric properties in Australian nonclinical (N = 428) and psychiatric (N = 156) samples. In terms of factorial validity, confirmatory factor analyses found the traditional 3-factor correlated model (DIF, DDF, EOT) to be the best and most parsimonious solution, but it did not reach adequate levels of goodness-of-fit in either sample. Several EOT items loaded poorly on their intended factor, and a reverse-scored item method factor was present; the factor structure of the scale was invariant across both samples. A higher-order factor model (with a single higher-order factor) was slightly inferior to the correlated models, but still tenable. The total scale score and DIF and DDF subscales displayed sound internal consistency, but the EOT subscale did not. We conclude that the TAS-20 has, for the most part, adequate psychometric properties, though interpretation should focus only on the total scale score and DIF and DDF subscales; we recommend the EOT subscale score not be used. Implications for clinical use and future revision of the scale are discussed.

9.
This study investigated gender-based differential item functioning (DIF) in science literacy items included in the Program for International Student Assessment (PISA) 2012. Prior research has suggested the presence of such DIF in large-scale surveys. Our study extends the empirical literature by examining gender-based DIF differences at the country level in order to gain a better overall picture of how cultural and national differences affect the occurrence of uniform and nonuniform DIF. Our statistical results indicate the existence of widespread gender-based DIF in PISA, with estimates of the percentage of potentially biased items ranging between 2 and 44% (M = 16, SD = 9.9). Our reliance on nationally representative country samples allows these findings to have wide applicability.

10.
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response patterns. It also yielded more DIF items with larger effect sizes and more consistent item response patterns by substantive aspects (e.g., reading comprehension processes and cognitive complexity of items). Based on our findings, we suggest empirically evaluating the homogeneity assumption in international assessments because international populations cannot be assumed to have homogeneous item response patterns. Otherwise, differences in response patterns within these populations may be under-detected when conducting manifest DIF analyses. Detecting differences in item responses across international examinee populations has implications for the generalizability and meaningfulness of DIF findings as they apply to heterogeneous examinee subgroups.

11.
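Differential item functioning (DIF) is an important basis for building test fairness, and DIF research is directly related to test validity. After briefly reviewing how DIF was first proposed, this paper focuses on how to use logistic regression to detect uniform and non-uniform DIF, and demonstrates by example that six items of the learning adaptability test (AAT) exhibit differential item functioning with respect to gender.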
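
Entry 11 describes detecting uniform and non-uniform DIF with logistic regression. The abstract does not give the model, so the following is only a sketch of the commonly used nested-model formulation. The symbols are assumptions introduced here: $u$ is the scored item response, $X$ the matching variable (for example the total test score), and $G$ the group indicator.

```latex
% Full model with group and interaction terms
\operatorname{logit} P(u = 1 \mid X, G)
  = \beta_0 + \beta_1 X + \beta_2 G + \beta_3 (X \cdot G)

% Uniform DIF:     \beta_2 \neq 0 and \beta_3 = 0   (group effect constant across X)
% Non-uniform DIF: \beta_3 \neq 0                   (group effect varies with X)

% Likelihood ratio statistic comparing a reduced model (0) with a fuller model (1)
G^2 = -2 \left( \ln L_0 - \ln L_1 \right) \sim \chi^2_{\mathrm{df}},
\qquad \mathrm{df} = \text{number of added parameters}
```

Under this formulation, adding $\beta_2$ to the matching-only model tests uniform DIF, and adding $\beta_3$ on top of that tests non-uniform DIF, each with one degree of freedom.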

12.
The aim of this study was to determine whether the items from a reading comprehension test in European Portuguese function differently across students from rural and urban areas, which biases the test validity and the equity in assessment. The sample was composed of 653 students from second, third and fourth grades. The presence of differential item functioning (DIF) was analysed using logistic regression and the Mantel–Haenszel procedure. Although 17 items were flagged with DIF, only five items showed non-negligible DIF in all effect-size measures. The evidence of invariance across students with rural or urban backgrounds for most of the items supports the validity of the test though the five identified items should be further investigated.
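
Entry 12 mentions the Mantel–Haenszel procedure and effect-size measures. A minimal sketch of the Mantel–Haenszel common odds ratio and its transformation to the ETS delta scale follows. The function name and the example counts are hypothetical and are not data from the study, and the abstract does not say which effect-size measures were actually used.

```python
import math


def mantel_haenszel_dif(strata):
    """Mantel-Haenszel common odds ratio and ETS delta for one item.

    strata: one (A, B, C, D) tuple per matched score level, where
      A = reference group correct,  B = reference group incorrect,
      C = focal group correct,      D = focal group incorrect.
    Returns (alpha_mh, delta_mh) with delta_mh = -2.35 * ln(alpha_mh).
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    if num == 0.0 or den == 0.0:
        raise ValueError("odds ratio undefined for these counts")
    alpha = num / den
    delta = -2.35 * math.log(alpha)
    return alpha, delta


# Hypothetical counts for two score levels (illustration only).
example = [(40, 10, 30, 20), (30, 20, 20, 30)]
alpha_mh, delta_mh = mantel_haenszel_dif(example)
```

An odds ratio above 1 (negative delta) means the item favors the reference group after matching on score, and a value near 1 (delta near 0) indicates little DIF.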

13.
The present study examined the psychometric properties of a universal screening instrument called the Emotional and Behavioral Screener (EBS), which is designed to identify students exhibiting emotional and behavioral problems. The primary purposes of this study were to assess the measurement invariance of EBS items between Caucasian and African-American students and to assess the impact of differential item functioning (DIF) on EBS scores. The sample consisted of 946 elementary students from throughout the U.S. The findings suggested that EBS items exhibited small to negligible levels of DIF, and that DIF did not significantly impact EBS scores. The results supported the EBS as a universal screening instrument that is fair in measuring the emotional and behavioral risk of elementary students. Research limitations and implications for school professionals are discussed.

14.
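This paper applies the multidimensional testlet response model (MTRM) to differential item functioning (DIF) testing in multidimensional testlet-based tests. Through a simulation study and an applied study, it examines the accuracy, effectiveness, and influencing factors of the MTRM in DIF testing, and compares it with the multidimensional random coefficients multinomial logit model (MRCMLM), which ignores testlet effects. The results show that: (1) as sample size increases, the MTRM's detection rate for valid DIF values rises and its error rate falls, and its results are more stable across conditions; (2) compared with the MRCMLM, the MTRM-based DIF testing model has a higher detection rate and is less affected by other factors; (3) when testlet effects in the test are small, the results of the MTRM and the MRCMLM differ little, but the MTRM fits better.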

15.
This study used an ideal point response model to examine the extent to which applicants and incumbents differ when responding to personality items. It was hypothesized that applicants' responses would exhibit less folding at high trait levels than incumbents' responses. We used sample data from applicants (N=1,509) and incumbents (N=1,568) who completed the 16 Personality Questionnaire Select. Differential item (DIF) and test functioning (DTF) analyses were conducted using the generalized graded unfolding model, which is based on ideal point model assumptions. Out of the 90 items, 50 showed DIF; however, only 11 were in the hypothesized direction. DTF was significant for 3 of the 12 scales; 2 were in the hypothesized direction.

16.
Body Image, 2014, 11(3): 206-209
Many widely used measures of body image were developed using all-female samples and thus may not adequately capture the male experience of body dissatisfaction. The current study examined differential item functioning (DIF) in three commonly-used measures of body image: The Body Shape Questionnaire (N = 590, 39.7% male), the Body Dissatisfaction subscale of the Eating Disorders Inventory (N = 529, 44.6% male), and the Shape and Weight Concern subscales of the Eating Disorders Examination Questionnaire (N = 1116, 43.5% male). Participants completed a series of measures evaluating body image and eating pathology. Results evidenced statistically significant DIF in several of the items; one item met criteria for clinically significant DIF. While most items did not evidence clinically elevated levels of DIF, additional evaluation is necessary in order to determine overall quality of the measures in terms of capturing the experience of male body image concerns.

17.
A recent study of the Five Facet Mindfulness Questionnaire reported high levels of differential item functioning (DIF) for 18 of its 39 items in meditating and nonmeditating samples that were not demographically matched. In particular, meditators were more likely to endorse positively worded items whereas nonmeditators were more likely to deny negatively worded (reverse-scored) items. The present study replicated these analyses in demographically matched samples of meditators and nonmeditators (n = 115 each) and found that evidence for DIF was minimal. There was little or no evidence for differential relationships between positively and negatively worded items for meditators and nonmeditators. Findings suggest that DIF based on items' scoring direction is not problematic when the Five Facet Mindfulness Questionnaire is used to compare demographically similar meditators and nonmeditators.

18.
This study investigated whether the linguistic complexity of items leads to gender differential item functioning (DIF) on mathematics assessments. Two forms of a mathematics test were developed. The first form consisted of algebra items based on mathematical expressions, terms, and equations. In the second form, the same items were written as word problems without changing their contents and solutions. The test forms were given to a sample of 671 sixth-grade students from 10 middle schools in Turkey. The tests were administered to the students with a 4-week interval. Explanatory item response modeling and logistic regression approaches were used to examine gender DIF. Several word problems were flagged as having gender DIF in favor of female examinees, whereas mathematically expressed forms of the same items did not function differently across male and female examinees. The verbal content of word problems seems to influence the way males and females respond to items.

19.
This report documents relationships between differential item functioning (DIF) identification and: (1) item–trait association, and (2) scale multidimensionality in personality assessment. Applying Zumbo's logistic regression model [Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.], DIF effect size is found to become increasingly inflated as investigated item associations with trait scores decrease. Similar patterns were noted for the influence of scale multidimensionality on DIF identification. Individuals who investigate DIF in personality assessment applications are provided with estimates regarding the impact of the magnitude of item and trait association and scale multidimensionality on DIF occurrence and effect size. The results emphasize the importance of excluding investigated items in focal trait identification prior to conducting DIF analyses and reporting item and scale psychometric properties in DIF reports.

20.
Usually, methods for detection of differential item functioning (DIF) compare the functioning of items across manifest groups. However, the manifest groups with respect to which the items function differentially may not necessarily coincide with the true source of the bias. It is expected that DIF detection under a model that includes a latent DIF variable is more sensitive to this source of bias. In a simulation study, it is shown that a mixture item response theory model, which includes a latent grouping variable, performs better in identifying DIF items than DIF detection methods using manifest variables only. The difference between manifest and latent DIF detection increases as the correlation between the manifest variable and the true source of the DIF becomes smaller. Different sample sizes, relative group sizes, and significance levels are studied. Finally, an empirical example demonstrates the detection of heterogeneity in a minority sample using a latent grouping variable. Manifest and latent DIF detection methods are applied to a Vocabulary test of the General Aptitude Test Battery (GATB).
