Similar articles
 20 similar articles found (search time: 31 ms)
1.
The study of prediction bias is important, and research over the last five decades has examined whether test scores differentially predict academic or employment performance. Previous studies used ordinary least squares (OLS) to assess whether groups differ in intercepts and slopes. This study shows that OLS yields inaccurate inferences for prediction bias hypotheses. This paper builds upon the criterion-predictor factor model by demonstrating the effect of selection, measurement error, and measurement bias on prediction bias studies that use OLS. The range-restricted criterion-predictor factor model is used to compute Type I error and power rates associated with using regression to assess prediction bias hypotheses. In short, OLS is not capable of testing hypotheses about group differences in latent intercepts and slopes. Additionally, a theorem is presented which shows that researchers should not employ hierarchical regression to assess intercept differences with selected samples.

2.
The statistical literature on bias in psychological testing distinguishes at least two forms of bias: measurement bias and predictive bias. Measurement bias concerns group differences in the relationship between a test and the latent variable to be measured. Predictive bias concerns group differences in the relationship between a test and an external criterion. How are these two forms of bias related? For example, if a test is unbiased in the predictive sense, does this fact support the hypothesis that the test is unbiased in the measurement sense? A theorem is given that describes the conditions under which measurement invariance (lack of bias) is consistent with predictive invariance for the linear case. Paradoxically, these two forms of invariance are shown to be inconsistent under realistic conditions. This duality or inconsistency is illustrated in simulated data. The implications of the duality for group differences research are illustrated in real data involving gender and ethnic differences on the SAT. The phenomenon of duality may force a reinterpretation of common empirical findings of test criterion regression slope invariance, and of invariance in test validities. Other implications are discussed.

3.
Wu W, Lu Y, Tan F, Yao S, Steca P, Abela JR, Hankin BL. Assessment, 2012, 19(4): 506-516
This study tested the measurement invariance of the Children's Depression Inventory (CDI) and compared its factor variances/covariances and latent means between Chinese and Italian children. Multigroup confirmatory factor analysis of the original five factors identified by Kovacs revealed that full measurement invariance did not hold. Further analysis showed that 4 of 21 factor loadings, 14 of 26 intercepts, and 12 of 26 item errors were noninvariant. Tests of factor variance and covariance invariance revealed significant differences between the Chinese and Italian samples. The latent factor mean comparison suggested no significant difference across the two groups. Nevertheless, the finding of partial metric and scalar invariance suggested that observed mean differences on the CDI items cannot be fully explained by the mean differences in the latent factor. These results suggest that researchers and practitioners should exercise caution when gauging the size of the true national population differences in depressive symptoms among Italian and Chinese children as assessed via the CDI. In addition to providing needed evidence on the use of the CDI in Italian and Chinese children specifically, the methods used in this research can serve more generally as an example for other cross-cultural assessment research to test structural equivalence and measurement invariance of scales and to determine why it is important to do so.

4.
It is argued that analyses of subgroup differences utilizing a bivariate correlation strategy do not provide an adequate examination of test fairness. An analysis of differential prediction, which involves the slopes and intercepts of regression lines, provides more complete coverage of the test fairness issue, since the overall regression line determines the way in which a test is used for prediction. While subgroup correlation coefficients yield information concerning the slopes and intercepts, means and standard deviations must also be examined. A moderated multiple regression strategy is recommended as an alternative to separate analyses by subgroups. An ordered step-up regression procedure is presented which is more encompassing than the bivariate strategies, while avoiding inherent problems associated with subgroup coding in multiple regression.
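The moderated regression idea in the abstract above can be sketched as a sequence of nested OLS models. This is only an illustration under stated assumptions: the abstract does not give the exact steps of its step-up procedure, so the ordering used here (predictor only, then group, then predictor-by-group interaction) is one common choice, and the data, the helper names `solve`, `sse`, and `f_increment`, and the dummy coding of group are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def sse(columns, y):
    """Residual sum of squares from an OLS fit of y on the given predictor columns."""
    p = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
           for i in range(p)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * col[k] for b, col in zip(beta, columns))) ** 2
               for k, yi in enumerate(y))


def f_increment(sse_reduced, sse_full, df_added, n, p_full):
    """F statistic for the terms added when moving from the reduced to the full model."""
    return ((sse_reduced - sse_full) / df_added) / (sse_full / (n - p_full))


# Hypothetical data: two groups share a slope of 0.5 but differ in intercept by 2.
x = list(range(10)) * 2
g = [0] * 10 + [1] * 10                      # dummy-coded subgroup (an assumption)
noise = [0.1 * (-1) ** i for i in range(10)] + [-0.1 * (-1) ** i for i in range(10)]
y = [1 + 0.5 * xi + 2 * gi + e for xi, gi, e in zip(x, g, noise)]
ones = [1.0] * len(y)
xg = [xi * gi for xi, gi in zip(x, g)]       # predictor-by-group interaction term

n = len(y)
sse1 = sse([ones, x], y)                     # step 1: common regression line
sse2 = sse([ones, x, g], y)                  # step 2: add group (intercept difference)
sse3 = sse([ones, x, g, xg], y)              # step 3: add interaction (slope difference)

f_intercept = f_increment(sse1, sse2, 1, n, 3)
f_slope = f_increment(sse2, sse3, 1, n, 4)
print(f"F for intercept difference: {f_intercept:.1f}")
print(f"F for slope difference:     {f_slope:.2f}")
```

With these made-up data the intercept step yields a very large F and the slope step a small one, which is the pattern a single moderated model is meant to separate without running two subgroup analyses.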

5.
Practical and theoretical issues are discussed for (a) testing the comparability, or measurement equivalence, of psychological constructs and (b) detecting possible sociocultural differences on the constructs in cross-cultural research designs. Specifically, strong factorial invariance (Meredith, 1993) of each variable's loading and intercept (mean-level) parameters implies that constructs are fundamentally the same in each sociocultural group, and thus comparable. Under this condition, hypotheses about the nature of sociocultural differences and similarities can be confidently and meaningfully tested among the constructs' moments in each sociocultural sample. Some of the issues involved in making such tests are reviewed and explicated within the framework of multiple-group mean and covariance structures analyses.

6.
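Using a life satisfaction scale as an example, this study applied confirmatory factor analysis to examine measurement invariance between web-based and traditional paper-and-pencil test administration in the Chinese cultural context. The results showed weak invariance between the web-based and paper-and-pencil tests, that is, the two share the same unit of measurement; however, only partial strong invariance and partial strict invariance held between them, so the effect of the administration setting on the results cannot be ignored. The study indicates that a properly designed web-based test is reliable, and it also suggests that testing measurement invariance is essential whenever a test is used in different settings.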

7.
Measurement invariance, factor analysis and factorial invariance   Cited by: 31 (self-citations: 0; citations by others: 31)
Several concepts are introduced and defined: measurement invariance, structural bias, weak measurement invariance, strong factorial invariance, and strict factorial invariance. It is shown that factorial invariance has implications for (weak) measurement invariance. Definitions of fairness in employment/admissions testing and salary equity are provided and it is argued that strict factorial invariance is required for fairness/equity to exist. Implications for item and test bias are developed and it is argued that item or test bias probably depends on the existence of latent variables that are irrelevant to the primary goal of test constructors. Presidential address delivered at the Annual Meeting of the Psychometric Society in Berkeley, California, June 18–20, 1993.

8.
Background/Objective: The Orgasm Rating Scale (ORS) assesses the subjective orgasm experience in the context of a sexual relationship. It is composed of four dimensions attributed to the orgasm (Affective, Sensory, Intimacy, and Rewards). The purpose is to analyse the factorial invariance of the ORS across groups, to examine the metric equivalence across sex, and to present the standard scores. Method: A total of 1,472 Spanish adults (715 men and 757 women) were evaluated. They were distributed across age groups (18-34, 35-49, and 50 years old and older). Factorial invariance across different groups and the differential functioning of the items across sex were analyzed, internal consistency was examined, and the standard scores were developed. Results: The structure of the ORS showed strict measurement invariance across sex, relationship status, sexual orientation, and education level. It also reached scalar measurement invariance across age range and duration of the relationship. Some items showed differential functioning between sexes. Conclusions: The Spanish version of the ORS is invariant across different groups at a factorial level, and it shows equivalence across sex in most of its items at a metric level. The standard scores allow a more accurate assessment of the subjective orgasm experience in the context of a sexual relationship.

9.
Measurement invariance of the WISC-IV second-order factorial structure between normative and clinical samples was investigated using WISC-IV core subtests and a total of 1100 children aged 6-16. Multi-group higher order analysis of mean and covariance structure (MG-MACS) models were used to analyze these data. Results supported measurement invariance. Only Coding and Comprehension subtest intercepts varied slightly between groups. The hypothesized WISC-IV factor model described the data well. Factor patterns, first- and second-order factor loadings, intercepts, residual variances of measured subtests, and disturbances of first-order factors of the WISC-IV were generally invariant. Results suggested that WISC-IV index scores and subtests have the same meaning for children in both normative and clinical groups.

10.
Little is known about factorial invariance and latent mean differences across sex and age in trait emotional intelligence (EI). The purpose of this study was to examine whether the measurement structure underlying trait EI is equivalent across sex and age groups. The sample consisted of 2919 teenagers, youths and adults. To investigate this question, measurement and structural equivalence, as well as the equality of latent means across sex and age, were tested. The multi-group confirmatory factor analysis results revealed that configural, metric, scalar and structural invariance held across sex and age samples. Findings regarding the latent mean differences across sex and age groups are discussed with reference to recent and past findings.

11.
A psychological measurement model provides an explicit definition of (a) the theoretical and (b) the numerical relationships between observed scores and the latent variables that underlie the observed scores. Examination of the metric invariance of a measurement model involves testing the hypothesis that all components of the model relating observed scores to latent variables are equal across groups. The assumption of metric invariance is necessary for simple interpretation of scores. Establishing metric invariance also has implications for interpretation of convergent and divergent validity and patterns of deficit or disability. In this study the equivalence of the measurement model derived from the U.S. Wechsler Adult Intelligence Scale-III standardization sample was compared with a heterogeneous neurosciences sample in Australia. A pattern of strict metric invariance was observed across samples. These results provide evidence of the generality of the model underlying measurement of cognitive abilities.

12.
Sex and ethnic group differences were examined on the operational composites and tests used to select applicants for US Air Force officer commissioning programmes and for pilot training. Results showed that large mean score differences in applicant samples were substantially reduced among the pilot trainees. Despite differences in test performance, there was no evidence of differential validity for groups. When group differences in predicted pilot training completion rate were observed, performance was overestimated for the minority group relative to the majority group. When regression equations were adjusted for unreliability of the predictors, the observed differences in intercepts were reduced or eliminated. No prediction bias was observed against the minority groups.
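Why an adjustment for predictor unreliability can shrink intercept differences may be easier to see with numbers. The sketch below is not the procedure used in the study above, which the abstract does not describe; it assumes the classical disattenuation rule (observed slope divided by predictor reliability), a reliability of 0.8 that is known and equal in both groups, and entirely hypothetical group means. The function name `corrected_line` is likewise an illustration.

```python
def corrected_line(slope_obs, mean_x, mean_y, reliability):
    """Disattenuate an observed regression slope and recompute the intercept.

    Assumes classical test theory: measurement error in the predictor shrinks
    the observed slope by a factor equal to the predictor's reliability.
    """
    slope = slope_obs / reliability
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values. Both groups follow the same latent line Y = 1 + 0.5 * T,
# but the groups differ in their mean predictor score.
reliability = 0.8
slope_obs = 0.5 * reliability  # the attenuated slope seen within each group
groups = {
    "majority": {"mean_x": 10.0, "mean_y": 6.0},
    "minority": {"mean_x": 8.0, "mean_y": 5.0},
}

observed = {}
corrected = {}
for name, stats in groups.items():
    # Intercept implied by the observed (attenuated) slope.
    observed[name] = stats["mean_y"] - slope_obs * stats["mean_x"]
    # Intercept after adjusting the slope for unreliability.
    corrected[name] = corrected_line(
        slope_obs, stats["mean_x"], stats["mean_y"], reliability
    )[1]

gap_observed = observed["majority"] - observed["minority"]
gap_corrected = corrected["majority"] - corrected["minority"]
print(f"intercept gap with observed slope:  {gap_observed:.3f}")
print(f"intercept gap with corrected slope: {gap_corrected:.3f}")
```

In this constructed case the groups share one latent regression line, so the intercept gap seen with the attenuated slope disappears once the slope is corrected. Real data need not behave this cleanly.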

13.
Tracy L. Tylka. Body Image, 2013, 10(3): 415-418
Considered a measure of positive body image, the Body Appreciation Scale (BAS; Avalos et al., 2005) assesses acceptance of, favorable opinions toward, and respect for the body. Although the BAS was originally developed for and psychometrically examined with women, researchers are administering it to men and making gender comparisons. However, tests of measurement equivalence/invariance are needed to determine whether the BAS operates similarly for women and men. Therefore, in the present study, the BAS's cross-gender configural, factor loading, and intercept invariance was examined among 930 college women and men. The BAS demonstrated measurement equivalence/invariance between women and men, suggesting that gender comparisons can be made with confidence. Additional evidence was accrued for the convergent validity of the male version of the BAS, as it was related to men's dissatisfaction with muscularity, body fat, and height. These findings reinforce the structural and construct integrity of the BAS.

14.
For 25 years psychologists have measured systematic measurement bias in terms of regression lines. According to this traditional approach a test is an unbiased predictor of a criterion for all subgroups if all subgroups have identical Y regression lines (i.e., identical slopes and identical Y intercepts). This paper shows that the traditional model is fundamentally incorrect and identical Y regression lines are not expected to occur with an unbiased test in a testing situation in which one group scores lower than another group on both the test and criterion. This is the case even if the test is perfectly reliable. The traditional model for measuring bias actually results in a consistent error or bias against groups which score lower than average on both the test and criterion. In practice this bias operates against minority groups. Tests now thought to be unbiased or even biased in favor of minority groups may in fact be biased against minority groups. A new model of test bias, which is based solely on measurement principles, is briefly introduced. In this model unbiased tests produce groups with identical test-criterion common-factor axes having a slope of S_YC/S_XC and with each axis intersecting the group centroids.

15.
This study examines differential prediction of WIAT achievement scores based on WISC-III FSIQ in white as compared with African American and Hispanic children, and in females as compared with males. A procedure which allows simultaneous comparisons of slopes and intercepts across groups is employed. The results are consistent with previous research findings in supporting the general absence of bias in predicting achievement from IQ.

16.
This study explored factor structure and measurement and structural invariance of the MSCEIT V2.0 across two age groups: 258 young (18-31 years) and 262 older adults (32-79 years). Results supported a three-factor solution reflecting the Experiential Emotional Intelligence area, and Understanding Emotions and Managing Emotions branches. There was evidence of measurement invariance of factor structure and factor loadings, and partial support for invariance of the intercepts. Comparisons of latent factor means suggested that older adults have significantly higher mean scores on two of the three factors: Understanding and Managing Emotions. Implications of the invariance tests and latent means analyses are discussed.

17.
The analysis of measurement invariance of latent constructs is important in research across groups, or across time. By establishing whether factor loadings, intercepts and residual variances are equivalent in a factor model that measures a latent concept, we can assure that comparisons that are made on the latent variable are valid across groups or time. Establishing measurement invariance involves running a set of increasingly constrained structural equation models, and testing whether differences between these models are significant. This paper provides a step-by-step guide to analysing measurement invariance.
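The sequence of increasingly constrained models described above is often compared with chi-square difference tests. The following sketch shows only that comparison step, not the model fitting: the chi-square and degrees-of-freedom values are hypothetical placeholders for output from a structural equation modeling program, the function names `chi2_sf` and `compare_nested` are illustrative, and the plain difference test assumes ordinary maximum-likelihood chi-squares (robust estimators generally call for a scaled difference instead).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate.

    Uses the power series for the regularized lower incomplete gamma function,
    which is adequate for the moderate values typical of difference tests.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total and n < 10000:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def compare_nested(fits, alpha=0.05):
    """Test each model against the less constrained model that precedes it."""
    results = []
    for (_, chi0, df0), (name1, chi1, df1) in zip(fits, fits[1:]):
        d_chi, d_df = chi1 - chi0, df1 - df0
        p = chi2_sf(d_chi, d_df)
        # The added equality constraints are retained when fit does not worsen significantly.
        results.append((name1, d_chi, d_df, p, p >= alpha))
    return results


# Hypothetical fit statistics: (model, chi-square, degrees of freedom).
fits = [
    ("configural", 210.4, 100),  # same factor structure, all parameters free
    ("metric", 222.1, 110),      # factor loadings constrained equal
    ("scalar", 260.9, 120),      # loadings and intercepts constrained equal
    ("strict", 275.0, 132),      # plus residual variances constrained equal
]

results = compare_nested(fits)
for name, d_chi, d_df, p, holds in results:
    verdict = "retained" if holds else "rejected"
    print(f"{name:8s} d_chi2={d_chi:6.1f} d_df={d_df:3d} p={p:.4f} -> {verdict}")
```

With these placeholder numbers the loading constraints are retained and the intercept constraints are rejected, which is the situation in which researchers typically move to a partial scalar invariance model.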

18.
Given the clinical usefulness of the CFQ-BI (Cognitive Fusion Questionnaire-Body Image; the only existing measure to assess the body-image-related cognitive fusion), the present study aimed to confirm its one-factor structure, to verify its measurement invariance between clinical and non-clinical samples, to analyze its internal consistency and sensitivity to detect differences between samples, as well as to explore the incremental and convergent validities of the CFQ-BI scores in Brazilian samples. This was a cross-sectional study, which was conducted in clinical (women with overweight or obesity in treatment for weight loss) and non-clinical samples (women from the general population). The one-factor structure was confirmed showing factorial measurement invariance across clinical and non-clinical samples. The CFQ-BI scores presented an excellent internal consistency, were able to discriminate clinical and non-clinical samples, and were positively associated with binge eating severity, general cognitive fusion, and psychological inflexibility. Furthermore, body-image-related cognitive fusion scores (CFQ-BI) presented incremental validity over a general measure of cognitive fusion in the prediction of binge eating symptoms. This study demonstrated that CFQ-BI is a short scale with reliable and robust scores in Brazilian samples, presenting incremental and convergent validities, measurement invariance, and sensitivity to detect differences between clinical and non-clinical groups of women, enabling comparative studies between them.

19.
Measurement bias refers to systematic differences across subpopulations in the relation between observed test scores and the latent variable underlying the test scores. In the absence of such bias, members of different subpopulations with the same score on the latent variable can be expected to have the same observed test score. Measurement invariance is therefore one of the key issues in psychological testing. It has been established that strict factorial invariance (SFI) with respect to a selection variable V almost certainly implies weak measurement invariance with respect to V: given SFI, means and variances of observed scores do not depend on V. It is shown that this result can be extended. SFI in groups derived by selection on V has implications not only for V but also for potentially biasing variables W, if W and the selection variable V, and/or W and the factor underlying the observed test scores, are statistically dependent. Given SFI with respect to V and prior knowledge concerning these dependencies, it is not necessary to measure and model variables W in order to exclude them as potentially biasing variables if the investigation focuses on groups selected on V.

20.
Measurement invariance (lack of bias) of a manifest variable Y with respect to a latent variable W is defined as invariance of the conditional distribution of Y given W over selected subpopulations. Invariance is commonly assessed by studying subpopulation differences in the conditional distribution of Y given a manifest variable Z, chosen to substitute for W. A unified treatment of conditions that may allow the detection of measurement bias using statistical procedures involving only observed or manifest variables is presented. Theorems are provided that give conditions for measurement invariance, and for invariance of the conditional distribution of Y given Z. Additional theorems and examples explore the Bayes sufficiency of Z, stochastic ordering in W, local independence of Y and Z, exponential families, and the reliability of Z. It is shown that when Bayes sufficiency of Z fails, the two forms of invariance will often not be equivalent in practice. Bayes sufficiency holds under Rasch model assumptions, and in long tests under certain conditions. It is concluded that bias detection procedures that rely strictly on observed variables are not in general diagnostic of measurement bias, or the lack of bias. Preparation of this article was supported in part by PSC-CUNY grant #661282 to Roger E. Millsap.
