Measurement invariance, factor analysis and factorial invariance   Total citations: 31, self-citations: 0, citations by others: 31
Several concepts are introduced and defined: measurement invariance, structural bias, weak measurement invariance, strong factorial invariance, and strict factorial invariance. It is shown that factorial invariance has implications for (weak) measurement invariance. Definitions of fairness in employment/admissions testing and salary equity are provided and it is argued that strict factorial invariance is required for fairness/equity to exist. Implications for item and test bias are developed and it is argued that item or test bias probably depends on the existence of latent variables that are irrelevant to the primary goal of test constructers. Presidential address delivered at the Annual Meeting of the Psychometric Society in Berkeley, California, June 18–20, 1993.

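Measurement invariance in psychological measurement: research and an example   Total citations: 1, self-citations: 0, citations by others: 1
Liu Jun, Wu Weiku. 《心理科学》 (Psychological Science), 2005, 28(1): 170-174, 169
In psychometric research, measurement invariance (also called measurement equivalence) is a difficult problem in scale stability and receives particular attention in comparative studies. Structural equation modeling is widely used because it is powerful at capturing the different forms of invariance. This study discusses the various forms of measurement invariance and demonstrates how to evaluate them with structural equation modeling.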

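In behavioral science research, testing the measurement invariance of an instrument is a prerequisite for comparing groups. Multigroup confirmatory factor analysis (multigroup CFA) is currently widely used for this purpose, but its cross-group equality constraints are overly strict, which limits it considerably in practice. The Bayesian approximate measurement invariance approach builds on the Bayesian framework to relax the strict constraints that traditional multigroup CFA places on cross-group differences; it avoids the problems of the traditional method and has high practical value. The article describes the principles and advantages of the Bayesian approximate measurement invariance approach in detail and uses an empirical example to show the analysis procedure in Mplus.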

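Taking the Satisfaction With Life Scale as an example, confirmatory factor analysis was used to examine measurement invariance between web-based and traditional paper-and-pencil administration in the Chinese cultural context. The results showed weak invariance between the two modes, that is, the web-based and paper-and-pencil tests share the same unit of measurement. However, only partial strong invariance and partial strict invariance held, so the effect of the administration setting on the results cannot be ignored. The study indicates that a properly designed web-based test is reliable, and it also suggests that testing measurement invariance is essential when a test is used in different settings.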

Specification search problems refer to two important but under-addressed issues in testing for factorial invariance: how to select proper reference indicators and how to locate specific non-invariant parameters. In this study, we propose a two-step procedure to solve these issues. Step 1 is to identify a proper reference indicator using the Bayesian structural equation modeling approach. An item is selected if it is associated with the highest likelihood to be invariant across groups. Step 2 is to locate specific non-invariant parameters, given that a proper reference indicator has already been selected in Step 1. A series of simulation analyses show that the proposed method performs well under a variety of data conditions, and optimal performance is observed under conditions of large magnitude of non-invariance, low proportion of non-invariance, and large sample sizes. We also provide an empirical example to demonstrate the specific procedures to implement the proposed method in applied research. The importance and influences are discussed regarding the choices of informative priors with zero mean and small variances. Extensions and limitations are also pointed out.

Pseudo-guessing parameters are present in item response theory applications for many educational assessments. When sample size is not sufficiently large, the guessing parameters may be omitted from the analysis. This study examines the impact of ignoring pseudo-guessing parameters on measurement invariance analysis, specifically, on item difficulty, item discrimination, and mean and variance of ability distribution. Results show that when non-zero guessing parameters are ignored in the measurement invariance analysis, item discrimination estimates tend to decrease, particularly for more difficult items, and item difficulty estimates decrease unless the items are highly discriminating and difficult. As the guessing parameter increases, the size of the decrease in item discrimination and difficulty tends to increase, and the estimated mean and variance of ability distribution tend to be inaccurate. When two groups have heterogeneous ability distributions, ignoring the guessing parameter affects the reference group and the focal group differently. Implications of the findings are discussed.
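
The abstract above does not state the model, so the following is only a minimal sketch that assumes the standard three-parameter logistic (3PL) form without a scaling constant. The function name `p_correct` and all parameter values are hypothetical. It shows why dropping a non-zero guessing parameter matters most for difficult items answered by low-ability examinees.

```python
import math

def p_correct(theta, a, b, c=0.0):
    """Probability of a correct response under an assumed 3PL model.

    theta: ability, a: discrimination, b: difficulty, c: pseudo-guessing.
    With c = 0 the expression reduces to the two-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical difficult item (b = 1.5) and a low-ability examinee.
theta, a, b, c = -2.0, 1.2, 1.5, 0.2
with_guessing = p_correct(theta, a, b, c)   # bounded below by c
without_guessing = p_correct(theta, a, b)   # approaches 0 instead
```

Under these illustrative values the model with guessing keeps the success probability above 0.2, while the model without it predicts a probability near zero, so estimation has to absorb the gap through the other item parameters.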

Growing international research interest in negative-leadership behaviors prompts the need to examine whether measures of ineffective leadership developed in the United States are equivalent across countries outside the United States. B. J. Tepper's (2000) abusive supervision measure has been used widely inside and outside the United States and merits research attention on its construct equivalence across different cultural settings. The authors conducted a series of multigroup confirmatory factor analyses to investigate the measurement equivalence of this measure across Taiwan (N = 256) and the United States (N = 389). Configural invariance was established, suggesting that both U.S. and Taiwanese samples perceive abusive supervision as a single-factor concept. Furthermore, the establishment of partial metric invariance and partial scalar invariance suggests that the abusive supervision measure is applicable to cross-cultural comparisons in latent means, construct variance, construct covariances, and unstandardized path coefficients, with the caution that workers from different cultures calibrate their responses differently when answering some items.

The Dirty Dozen (Jonason & Webster, 2010) is a frequently used concise version of the Dark Triad to measure three socially aversive personality traits: Machiavellianism, psychopathy, and narcissism. The present study aims to assess measurement invariance of the Dutch version of the Dirty Dozen measure across gender in a large city-based representative adult sample in Belgium (N = 1587). Multi-group first-order confirmatory factor analysis for categorical indicators was utilized. In addition, unique associations between Dirty Dozen traits, trait self-control, and acceptance of illegitimate norms were examined in a series of structural equation models. Results indicated that the internal consistency of the Dirty Dozen subscales was good for Machiavellianism (α = 0.80) and narcissism (α = 0.80), but modest for psychopathy (α = 0.64). The hypothesized three correlated factors model with separate factors for Machiavellianism, psychopathy, and narcissism provided a poor fit for men and women. Invariance testing across gender showed evidence for weak invariance only, indicating that the underlying latent factors are measured the same way with the same metric in the two populations. However, we were not able to establish strong measurement invariance. Observed group differences should be interpreted with caution. Furthermore, Machiavellianism and psychopathy were strongly associated with trait self-control in both men and women. Strong correlations were found between acceptance of illegitimate norms and the Dirty Dozen traits Machiavellianism and psychopathy, but not narcissism.

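Zheng Xianliang, Gu Haigen, Zhao Bihua. 《心理科学》 (Psychological Science), 2011, 34(5): 1195-1200
Compared with first-order factor models, second-order factor models have several advantages, but testing their measurement equivalence is more complex and requires seven successive levels of tests: configural equivalence, first-order weak equivalence, second-order weak equivalence, first-order strong equivalence, second-order strong equivalence, second-order strict equivalence, and first-order strict equivalence. A more stringent level can be tested only after the level below it is satisfied. A mean and covariance structure (MACS) model was used to test the measurement equivalence of the second-order factor model of the internet altruistic behavior scale for college students (IABSU). The results show that the IABSU has full first-order and second-order strict equivalence across regions.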
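
The ordering constraint in the abstract above (a stricter level is tested only after the level below it holds) can be sketched as a simple walk through the hierarchy. This is an illustrative sketch only: the level labels follow the abstract, while `highest_level_supported` and the pass/fail inputs are hypothetical and stand in for whatever fit criterion an analysis actually uses.

```python
# The seven levels, ordered from least to most stringent as listed above.
LEVELS = [
    "configural",
    "first-order weak",
    "second-order weak",
    "first-order strong",
    "second-order strong",
    "second-order strict",
    "first-order strict",
]

def highest_level_supported(passed):
    """Return the most stringent level reached before the first failure.

    `passed` maps a level name to True/False (hypothetical outcomes of the
    model comparison at that level). Returns None if even the configural
    model is not supported.
    """
    reached = None
    for level in LEVELS:
        if not passed.get(level, False):
            break  # stricter levels are not tested once one level fails
        reached = level
    return reached
```

For example, if every level up to second-order weak equivalence passes and first-order strong equivalence fails, the function returns "second-order weak" and ignores any results supplied for stricter levels.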

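Li Chong, Liu Hongyun. 《心理科学》 (Psychological Science), 2011, 34(6): 1482-1487
This study introduces model construction (LRV, the latent response variable model) and parameter estimation (WLSMV) for ordinal data, together with the measurement invariance test built on them (DIFFTEST). A Monte Carlo simulation examined how total sample size, the ratio of group sample sizes, the magnitude of threshold differences, and scale length affect the performance of DIFFTEST in testing measurement invariance for ordinal data, as well as parameter recovery under WLSMV estimation. The results show that WLSMV recovers parameters well. The Type I error rate of DIFFTEST reached an acceptable level, and DIFFTEST had good power with large samples, roughly equal group sizes, and large threshold differences. With the number of noninvariant items held constant, the power of DIFFTEST decreased as test length increased.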

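Population invariance of equating under different definitions of parallel tests   Total citations: 1, self-citations: 0, citations by others: 1
Population invariance is an important assumption of equating: the equating function is the same for different examinee subpopulations. This study presents a theoretical analysis and a simulation study of the population invariance of linear equating under different definitions of parallel tests. In the simulation, the REMSD index was computed with six different weighting schemes. The results show that for strictly parallel tests the REMSD index was larger when reliability was low, and that subpopulation mean differences and reliability differences interacted clearly in their effect on REMSD. REMSD was largest when the expectation weights were equal and smallest when the score weights were proportional to subpopulation size. Finally, the results are discussed and recommendations are given on the choice of REMSD weights and on further research.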

In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic classification models (DCMs). DCMs are a newer class of psychometric models that are designed to classify examinees according to levels of categorical latent traits. We examined the invariance property for general DCMs using the log-linear cognitive diagnosis model (LCDM) framework. We conducted a simulation study to examine the degree to which theoretical invariance of LCDM classifications and item parameter estimates can be observed under various sample and test characteristics. Results illustrated that LCDM classifications and item parameter estimates show clear invariance when adequate model-data fit is present. To demonstrate the implications of this important property, we conducted additional analyses to show that using pre-calibrated tests to classify examinees provided consistent classifications across calibration samples with varying mastery profile distributions and across tests with varying difficulties.

In this study, we introduce an interval estimation approach based on Bayesian structural equation modeling to evaluate factorial invariance. For each tested parameter, the size of noninvariance with an uncertainty interval (i.e., highest density interval [HDI]) is assessed via Bayesian parameter estimation. By comparing the most credible values (i.e., 95% HDI) with a region of practical equivalence (ROPE), the Bayesian approach allows researchers to (1) support the null hypothesis of practical invariance, and (2) examine the practical importance of the noninvariant parameter. Compared to the traditional likelihood ratio test, simulation results suggested that the proposed Bayesian approach could offer additional insight into evaluating factorial invariance, thus leading to more informative conclusions. We provide an empirical example to demonstrate the procedures necessary to implement the proposed method in applied research. The importance of and influences on the choice of an appropriate ROPE are discussed.
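
The HDI-versus-ROPE comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes posterior draws of a between-group parameter difference are already available, the ROPE bounds of ±0.05 are a hypothetical choice, and the three-way decision rule (inside, outside, overlapping) is the commonly used convention rather than something stated in the abstract.

```python
import math

def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the posterior draws."""
    s = sorted(samples)
    n = len(s)
    k = min(n, max(1, math.ceil(mass * n)))  # number of draws inside the interval
    # Slide a window of k consecutive draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]

def rope_decision(samples, rope=(-0.05, 0.05), mass=0.95):
    """Compare the HDI of a group difference with a ROPE (assumed bounds)."""
    lo, hi = hdi(samples, mass)
    if rope[0] <= lo and hi <= rope[1]:
        return "practically invariant"   # HDI falls entirely inside the ROPE
    if hi < rope[0] or lo > rope[1]:
        return "noninvariant"            # HDI falls entirely outside the ROPE
    return "undecided"                   # HDI and ROPE partly overlap
```

Unlike a significance test, this rule can return support for practical invariance, which is the first advantage the abstract names. The conclusion depends directly on the ROPE width, which is why the choice of ROPE needs justification.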

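Measurement invariance is very important in psychometric applications of self-report questionnaires and scales and is a precondition for cross-group comparison. The sequence of invariance models comprises the unconstrained multigroup confirmatory factor analysis (Mgroup), configural invariance (M1), loading invariance (M2), intercept invariance (M3), strict invariance (M4), invariance of factor variances and covariances (M5), and latent mean invariance (M6). Using the Satisfaction With Life Scale (SWLS) as an example, 1,343 college students (aged 17-25 years, 20.01 ± 1.53) were grouped by whether they had urgent matters to attend to (no vs. yes), feelings while responding (positive vs. negative emotion), noise level (no noise vs. noise), response time (long vs. short), gender (male vs. female), and household registration (non-agricultural vs. agricultural), and full factorial invariance of the SWLS was tested across these groups. The results show that: (1) invariance held across the urgent-matters groups (Δχ2 = 0.49 to 10.59, p > 0.05); (2) invariance across feelings while responding held partially, with invariance failing for models M5 and M6 (Δχ2(1) = 3.96 and 20.89, p < 0.05); (3) invariance across noise levels held partially, with invariance failing for models M3 and M4 (Δχ2(4) = 14.75, Δχ2(5) = 23.91, p < 0.05); (4) invariance across response time did not hold (Δχ2 = 11.01 to 41.95, all p < 0.05); (5) invariance across gender held partially, with invariance failing for model M4 (Δχ2(5) = 64.40, p < 0.05); (6) invariance across household registration held partially, with invariance failing for model M6 (Δχ2(1) = 11.49, p < 0.05).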
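
The chi-square difference statistics reported above can be checked against the 0.05 criterion with a short sketch. It uses only the Python standard library; `chi2_sf` is a hypothetical helper that assumes a positive integer number of degrees of freedom, and the dictionary labels are illustrative names for four of the reported values.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                 # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))      # Q(1/2, y)
    while a < df / 2.0:
        # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

# (delta chi-square, delta df) pairs as reported in the abstract above.
reported = {
    "noise, M3": (14.75, 4),
    "noise, M4": (23.91, 5),
    "gender, M4": (64.40, 5),
    "household registration, M6": (11.49, 1),
}
p_values = {name: chi2_sf(x, df) for name, (x, df) in reported.items()}
rejected = [name for name, p in p_values.items() if p < 0.05]
```

All four comparisons fall below 0.05, which agrees with the abstract's conclusion that invariance fails at those steps. The chi-square difference test grows more sensitive as sample size increases, so with N = 1,343 small departures can already be flagged.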

The Self-Other Differentiation Scale (Olver, Aries, & Batgos, 1989) is a self-report instrument assessing the experience of a separate sense of self from others. The authors aimed to examine its dimensionality, reliability, and measurement invariance across gender. It was completed by 348 participants (48% men) from 17 to 30 years old in Study 1, 348 participants (40% men) from 18 to 28 years old in Study 2, and 1,068 participants (49% men) from 17 to 28 years old in Study 3. The results supported the hypothesis of just one factor underlying the scale; they also showed an appropriate internal consistency and a partial measurement invariance across gender. Results also showed evidence for a 10-item version of the scale. Globally, the Self-Other Differentiation Scale can be considered a good scale for assessing an individual's differentiation of their own sense of self from others.

Examining the influence of culture on personality and its unbiased assessment is the main subject of cross-cultural personality research. Recent large-scale studies exploring personality differences across cultures share substantial methodological and psychometric shortcomings that render it difficult to differentiate between method and trait variance. One prominent example is the implicit assumption of cross-cultural measurement invariance in personality questionnaires. In the rare instances where measurement invariance across cultures was tested, scalar measurement invariance, which is required for unbiased mean-level comparisons of personality traits, did not hold. In this article, we present an item sampling procedure, ant colony optimization, which can be used to select item sets that satisfy multiple psychometric requirements including model fit, reliability, and measurement invariance. We constructed short scales of the IPIP-NEO-300 for a group of countries that are culturally similar (USA, Australia, Canada, and UK) as well as a group of countries with distinct cultures (USA, India, Singapore, and Sweden). In addition to examining factor mean differences across countries, we provide recommendations for cross-cultural research in general. From a methodological perspective, we demonstrate ant colony optimization's versatility and flexibility as an item sampling procedure to derive measurement invariant scales for cross-cultural research. © 2020 The Authors. European Journal of Personality published by John Wiley & Sons Ltd on behalf of European Association of Personality Psychology

The Experiences in Close Relationships-Revised (ECR-R) is widely used to assess romantic attachment dimensions. Investigating cultural limitations in its applicability is imperative. This study aims to examine the instrument's: (1) factor structure in two large and normative samples of Greek (N = 1706, M age = 16.16; SD = 2.16; 49.7% male) and Cypriot (N = 1279; M age = 15.54; SD = 0.65; 44.9% male) adolescents; (2) measurement invariance between these groups, accounting for potential gender and age effects. Results supported the two-factor structure and indicated partial invariance of the constructs between Greek and Cypriot adolescents. Findings support limitations in the use of instruments adapted for Greece in Cyprus.
