Similar literature
 20 similar documents found.
1.
Cross-cultural researchers have not used cultural dimensions to predict when differential item functioning (DIF) in attitude survey items is likely to occur. Predictive hypotheses for items related to supervision on a global corporate survey were developed based on 3 of Hofstede's (1991a) dimensions. In some cases, greater DIF was found on hypothesized items between countries differing on cultural dimensions. Implications for the use of this framework and DIF in examining multinational employee opinion surveys are discussed.

2.
The cross-cultural equivalence of a multinational employee opinion survey was examined using multiple-groups covariance structure analysis to examine 4 scales in 4 countries. Cultural and linguistic influences were considered by assessing equivalence across 2 pairs of countries having the same language but different cultures (U.S. and Australia, Mexico and Spain) and across countries differing in culture and language (U.S. and Mexico). The measure was equivalent across U.S. and Australian samples only. Analyses indicated items that were the source of lack of invariance. One cause explored was translation problems. Practical issues in assessing measurement equivalence in employee opinion surveys are discussed.

3.
This study proposes a multiple-group cognitive diagnosis model to account for the fact that students in different groups may use distinct attributes or use the same attributes but in different manners (e.g., conjunctive, disjunctive, and compensatory) to solve problems. Based on the proposed model, this study systematically investigates the performance of the likelihood ratio (LR) test and Wald test in detecting differential item functioning (DIF). A forward anchor item search procedure was also proposed to identify a set of anchor items with invariant item parameters across groups. Results showed that the LR and Wald tests with the forward anchor item search algorithm produced better calibrated Type I error rates than the ordinary LR and Wald tests, especially when items were of low quality. A set of real data were also analyzed to illustrate the use of these DIF detection procedures.

4.
As research continues to document differences in the prevalence of mental health problems such as depression across racial/ethnic groups, the issue of measurement equivalence becomes increasingly important to address. The Mood and Feelings Questionnaire (MFQ) is a widely used screening tool for child and adolescent depression. This study applied a differential item functioning (DIF) framework to data from a sample of 6th and 8th grade students in the Seattle Public School District (N = 3,593) to investigate the measurement equivalence of the MFQ. Several items in the MFQ were found to have DIF, but this DIF was associated with negligible individual- or group-level impact. These results suggest that differences in MFQ scores across groups are unlikely to be caused by measurement non-equivalence.

5.
This research used logistic regression to model item responses from a popular 360-degree-for-development survey used in a leadership development programme given to middle and upper level European managers in Brussels. The survey contained 106 items on 16 scales. The model used gender of ratee and rater group to identify items that exhibited differential item functioning (DIF). The rater groups were self, boss, peer, and direct report. The sample consisted of 356 survey families, where a survey family was a matched set of four surveys: one self, one boss, one peer, and one direct report. The sample contained 88% male and 12% female raters and 1424 surveys in total. The procedure for flagging items exhibiting differential functioning used effect size computed from Wald chi-square statistics rather than statistical significance, resulting in fewer flagged items. One item exhibited rating anomalies due to the gender of the ratee; 55 items exhibited DIF attributable to rater group. The apparent effect of the DIF was small for each item. An examination of the maximum likelihood parameter estimates suggested the rater group DIF was the result of either hierarchical complexity or organizational contingency. The DIF due to gender conformed to prior expectations of gender-related stereotypical interpretations. This research further suggested that DIF due to environmental complexity or organizational contingency could be a naturally occurring phenomenon in some 360-degree assessments, and that the interpretation of some 360-degree feedback may need to allow for the possibility of such DIF.

6.
Differential item functioning (DIF) assessment is key in score validation. When DIF is present, scores may not accurately reflect the construct of interest for some groups of examinees, leading to incorrect conclusions from the scores. Given rising immigration and the increased reliance of educational policymakers on cross-national assessments such as the Programme for International Student Assessment, Trends in International Mathematics and Science Study, and Progress in International Reading Literacy Study (PIRLS), DIF with regard to native language is of particular interest in this context. However, given differences in language and cultures, assuming similar cross-national DIF may lead to mistaken assumptions about the impact of immigration status and native language on test performance. The purpose of this study was to use model-based recursive partitioning (MBRP) to investigate uniform DIF in PIRLS items across European nations. Results demonstrated that DIF based on mother's language was present for several items on a PIRLS assessment, but that the patterns of DIF were not the same across all nations.

7.
This study investigated the equivalence of different types of informants, such as children (or early adolescents) and parents, in evaluating child externalizing and internalizing problems. We applied a polytomous item response theory (IRT) model for the Strengths and Difficulties Questionnaire (SDQ). We obtained responses to three subscales (Conduct Problems, Hyperactivity/Inattention, and Emotional Symptoms) from 541 elementary school students aged 10–12 years, fathers for 233 students, mothers for 275 students, and the homeroom teachers for 524 students. Expected values on the individual items, calculated from the discrimination and threshold parameters, were compared among students, fathers, and mothers as an investigation of differential item functioning (DIF) or differential informant functioning. Assessments of either externalizing or internalizing problems were mostly equivalent between fathers and mothers, and most items for externalizing problems functioned equally between students and parents, whereas items for internalizing problems showed DIF between them. The IRT analysis also showed that the intervals of response categories varied across items, particularly for the conduct problems items “fight” and “steal,” and that positively worded items showed an extremely low threshold.

8.
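Based on improved Wald statistics, this study extends DIF detection methods designed for two groups to tests of differential item functioning (DIF) across multiple groups. The improved Wald statistics are obtained by computing the observed information matrix (Obs) and the empirical cross-product information matrix (XPD), respectively. A simulation study compared these two approaches with the traditional computation for DIF detection across multiple groups. The results showed that (1) the Type I error rates of Obs and XPD were clearly lower than those of the traditional method, and under DINA model estimation the Type I error rates of Obs and XPD were close to the nominal level; and (2) when the sample size and the amount of DIF were large, Obs and XPD had roughly the same statistical power as the traditional Wald statistic.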

9.
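A preliminary study of differential item functioning in a Chinese vocabulary test   Total citations: 6 (self-citations: 1; citations by others: 5)
Cao Yiwei (曹亦薇), Zhang Houcan (张厚粲). Acta Psychologica Sinica (《心理学报》), 1999, 32(4): 460-467
This paper used two different methods to detect DIF in the 36 vocabulary items of an operational Chinese vocabulary test. For a sample of more than 1,400 students in the third year of junior middle school, comparisons were made between male and female students and between urban and suburban students. In the male-female analysis, 7 items showing uniform DIF were detected. In the urban-suburban comparison, 7 items were identified as DIF items by both methods, of which 5 showed uniform DIF and 2 showed nonuniform DIF. The paper also discusses possible sources of DIF.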

10.

Differential item functioning (DIF) statistics were computed using items from the Peabody Individual Achievement Test (PIAT)-Reading Comprehension subtest for children within each age group (ages 7 through 12). The pattern of observed DIF items was determined by comparing each cohort across age groups. Differences related to race and gender were also identified within each cohort. Characteristics of DIF items were identified based on sentence length, vocabulary frequency, and density of a sentence. DIF items were more frequently associated with short sentences than with long sentences. This study explored the potential limitation in the longitudinal use of items in an adaptive test.

11.
Because of the practical, theoretical, and legal implications of differential item functioning (DIF) for organizational assessments, studies of measurement equivalence are a necessary first step before scores can be compared across individuals from different groups. However, commonly recommended criteria for evaluating results from these analyses have several important limitations. The present study proposes an effect size index for confirmatory factor analytic (CFA) studies of measurement equivalence to address 1 of these limitations. The application of this index is illustrated with personality data from American English, Greek, and Chinese samples. Results showed a range of nonequivalence across these samples, and these differences were linked to the observed effects of DIF on the outcomes of the assessment (i.e., group-level mean differences and adverse impact).

12.
Much has been stated in the popular press about the effects of the events of 9/11/01 on employee attitudes about work. This study examined a large sample (N = 70,671) of employees of a multinational manufacturer whose annual employee survey data collection was interrupted by the events. After demonstrating measurement equivalence across time and countries, changes in attitudes pre- and post-9/11 were examined. Only negligible differences were found in Job Satisfaction, Supervisor Evaluation, Stress, and Organizational Commitment to Diversity for U.S. employees or for employees worldwide. Demographic differences in response to events were not found. Implications for understanding effects of stressful external events on employee perceptions of work are discussed.

13.
We investigated measurement equivalence in two antisocial behavior scales (i.e., one scale for adolescents and a second scale for young adults) by examining differential item functioning (DIF) for respondents from single-parent (n = 109) and two-parent families (n = 447). Even though one item in the scale for adolescents and two items in the scale for young adults showed significant DIF, the two scales exhibited non-significant differential test functioning (DTF). Both uniform and nonuniform DIF were investigated and examples of each type were identified. Specifically, uniform DIF was exhibited in the adolescent scale whereas nonuniform DIF was shown in the young adult scale. Implications of DIF results for assessment of antisocial behavior, along with strengths and limitations of the study, are discussed.

14.
This research provides an example of testing for differential item functioning (DIF) using multiple indicator multiple cause (MIMIC) structural equation models. True/False items on five scales of the Schedule for Nonadaptive and Adaptive Personality (SNAP) were tested for uniform DIF in a sample of Air Force recruits with groups defined by gender and ethnicity. Uniform DIF exists when an item is more easily endorsed for one group than the other, controlling for group mean differences on the variable under study. Results revealed significant DIF for many SNAP items and some effects were quite large. Differentially-functioning items can produce measurement bias and should be either deleted or modeled as if separate items were administered to different groups. Future research should aim to determine whether the DIF observed here holds for other samples.
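As a hedged illustration of the uniform DIF definition in the abstract above, one common MIMIC parameterization can be written as follows. The notation is an assumption made here for illustration and is not taken from the study: $\eta$ is the latent trait, $z$ is a group indicator (for example, 0 or 1), and $y_j^{*}$ is the latent response underlying True/False item $j$.

```latex
\begin{align}
\eta &= \gamma z + \zeta, \\
y_j^{*} &= \lambda_j \eta + \beta_j z + \varepsilon_j .
\end{align}
```

Under these assumptions, $\gamma$ absorbs the group mean difference on the trait, and item $j$ shows uniform DIF when the direct effect $\beta_j$ differs from zero, that is, when group membership shifts endorsement of the item even among respondents with the same $\eta$.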

15.
Meta-analyses on job crafting reveal that while approach-oriented job crafting (e.g., increasing job resources or challenging job demands) relates positively to employee performance, avoidance-oriented job crafting (e.g., decreasing hindering job demands) has either non-significant or negative implications for employee functioning. However, the joint effects of approach and avoidance job crafting remain an underdeveloped area of research. We administered a three-week diary survey among 87 employees to test interaction effects of approach and avoidance job crafting on employee (other-referenced and past-referenced) work performance and employability. Results revealed that decreasing hindering job demands related positively to other-referenced performance when increasing social job resources was higher than employees’ average, and to past-referenced performance when increasing structural job resources was higher than employees’ average. Also, decreasing hindering job demands related negatively with employability only at lower levels of increasing challenging job demands, while the relationship was non-significant at higher levels of increasing challenging demands. These results indicate that considering job crafting strategies in tandem adds to our understanding of their role for employee functioning.

16.
The Strengths and Difficulties Questionnaire (SDQ) is one of the most widely used measures of young people’s mental health difficulties in research and clinical decision-making. Although the SDQ is available in both paper and computer survey formats, cross-format equivalences have yet to be established. The current study aimed to assess the measure’s equivalence across paper- and computer-based survey formats in a community-based school setting. The study examined self-reported measures completed by a matched sample of 11–14 year olds in secondary schools in England (589 completed the paper version; 589 the online version). Analyses demonstrate that the factor structure, although it did not vary by survey format, resulted in poorly fitting models, limiting the use of model-based invariance testing. Results indicate that the measure does not operate similarly across different formats, with scale-level mean differences observed for the hyperactivity scale, which also affects the total difficulties score, with higher scores seen in the paper version. Responses to the impact supplement were also influenced by survey format, with higher impact in specific domains disclosed on the computer-based measure. Item-level differential item functioning was observed for four items in the measure, two of them from the prosocial scale, where the DIF is large enough to affect the scale (DTF, ν² = 0.14). The inconsistency across survey formats highlights the need for more assessment of influences of different survey formats on young people, their perceived privacy and their mental health disclosures via different media. The findings also highlight the potential confounding effect of format when different methods of data collection are used, with a potentially substantive impact on cross-sample comparisons and within-child clinical review.

17.
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response patterns. It also yielded more DIF items with larger effect sizes and more consistent item response patterns by substantive aspects (e.g., reading comprehension processes and cognitive complexity of items). Based on our findings, we suggest empirically evaluating the homogeneity assumption in international assessments because international populations cannot be assumed to have homogeneous item response patterns. Otherwise, differences in response patterns within these populations may be under-detected when conducting manifest DIF analyses. Detecting differences in item responses across international examinee populations has implications for the generalizability and meaningfulness of DIF findings as they apply to heterogeneous examinee subgroups.

18.
We discuss the use of cognitive interviewing with bilinguals as an integral part of cross-cultural adaptation of personality questionnaires. The aim is to maximize semantic equivalence to increase the likelihood of items maintaining the intended structure and meaning in the target language. We refer to this part of adaptation as semantic enhancement, and integrate cognitive interviewing within it as a tool for scrutinizing translations, the connotative meaning, and the psychological impact of items across languages. During the adaptation of a work-based personality questionnaire from English to Arabic, Chinese (Mandarin), and Spanish, we cognitively interviewed 12 bilingual participants about 136 items in different languages (17% of all items), of which 67 were changed. A content analysis categorizing the reasons for amending items elicited 11 errors that affect 2 identified forms of semantic equivalence. We provide the resultant coding scheme as a framework for designing cognitive interviewing protocols and propose a procedure for implementing them. We discuss implications for theory and practice.

19.
To date, the statistical software designed for assessing differential item functioning (DIF) with Mantel-Haenszel procedures has employed the following statistics: the Mantel-Haenszel chi-square statistic, the generalized Mantel-Haenszel test and the Mantel test. These statistics permit detecting DIF in dichotomous and polytomous items, although they limit the analysis to two groups. In contrast, this article describes a new approach (and the related software) that, using the generalized Mantel-Haenszel statistic proposed by Landis, Heyman, and Koch (1978), permits DIF assessment in multiple groups, both for dichotomous and polytomous items. The program is free of charge and is available in the following languages: Spanish, English and Portuguese.
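For orientation, the classic two-group, dichotomous-item Mantel-Haenszel procedure that the abstract above generalizes can be sketched in a few lines. This is a minimal illustration, not the software described in the article: the function names and the tuple layout are assumptions made here, and the multiple-group generalized statistic of Landis, Heyman, and Koch is not implemented.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels (hypothetical helper).

    Each stratum is a tuple (a, b, c, d) of counts for one score level:
    a = reference group correct, b = reference group incorrect,
    c = focal group correct,     d = focal group incorrect.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no stratum has b * c > 0")
    return numerator / denominator


def mh_delta(alpha):
    """Rescale the odds ratio to the delta metric, -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)


# One score level in which the item favors the reference group.
example_alpha = mantel_haenszel_odds_ratio([(30, 10, 20, 20)])
example_delta = mh_delta(example_alpha)
```

An odds ratio near 1 (delta near 0) suggests no uniform DIF on the item; values above 1 indicate that, at matched score levels, the reference group has higher odds of a correct response.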

20.
Usually, methods for detection of differential item functioning (DIF) compare the functioning of items across manifest groups. However, the manifest groups with respect to which the items function differentially may not necessarily coincide with the true source of the bias. It is expected that DIF detection under a model that includes a latent DIF variable is more sensitive to this source of bias. In a simulation study, it is shown that a mixture item response theory model, which includes a latent grouping variable, performs better in identifying DIF items than DIF detection methods using manifest variables only. The difference between manifest and latent DIF detection increases as the correlation between the manifest variable and the true source of the DIF becomes smaller. Different sample sizes, relative group sizes, and significance levels are studied. Finally, an empirical example demonstrates the detection of heterogeneity in a minority sample using a latent grouping variable. Manifest and latent DIF detection methods are applied to a Vocabulary test of the General Aptitude Test Battery (GATB).
