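Wen Zhonglin (温忠麟), Hau Kit-Tai (侯杰泰). Acta Psychologica Sinica (《心理学报》), 2008, 40(1): 119-124
After Hu and Bentler (1998, 1999) recommended cutoff values for structural equation model fit indices on the basis of simulation studies, their recommendations drew considerable criticism and skepticism. Since then, research on fit indices has no longer focused on proposing new cutoff criteria. The article by Guo Qingke et al., "Performance of Fit Indices in Different Conditions and the Selection of Cut-off Values" (《不同条件下拟合指数的表现及临界值的选择》), follows Hu and Bentler's approach and recommends new cutoff criteria based on a simulation study. This paper aims to expose the error in that approach. Using a simple Z test, it shows that the critical value of a test cannot be determined through simulation. By classifying the many misspecified models of one particular true model, it shows that the gap between a true model and its misspecified models in structural equation analysis is too varied to be represented by simulating a single pair of true and false models. The paper discusses the nature of statistical testing and the logic of setting critical values, and also addresses the angles from which structural equation models should be tested and evaluated.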
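The Z-test argument in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the two-sided form of the test, and the example values (sigma = 1, n = 50) are assumptions chosen for illustration. The critical value follows from the significance level alone, while the power (and hence any simulated "optimal" cutoff) depends on how far the particular alternative lies from the null.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def z_critical(alpha: float) -> float:
    """Two-sided critical value of a Z test at significance level alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def z_power(delta: float, sigma: float, n: int, alpha: float) -> float:
    """Probability of rejecting H0: mu = mu0 when the true mean is mu0 + delta."""
    crit = z_critical(alpha)
    shift = delta * sqrt(n) / sigma  # mean of the Z statistic under the alternative
    upper_tail = 1 - STD_NORMAL.cdf(crit - shift)
    lower_tail = STD_NORMAL.cdf(-crit - shift)
    return upper_tail + lower_tail


if __name__ == "__main__":
    # The critical value is fixed by alpha; no simulation is involved.
    print(f"critical value at alpha = .05: {z_critical(0.05):.3f}")
    # The power changes with every hypothetical alternative (delta values are illustrative).
    for delta in (0.0, 0.1, 0.3, 0.5):
        print(f"delta = {delta:.1f}  power = {z_power(delta, 1.0, 50, 0.05):.3f}")
```

Under these assumptions, a cutoff tuned to separate the null from one chosen alternative would simply move whenever a different alternative is simulated.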

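Wen Han (温涵), Liang Yunsi (梁韵斯). Journal of Psychological Science (《心理科学》), 2015, (4): 987-994
Fit index testing is an important step in evaluating structural equation models (SEM). Comparing SEM with traditional regression models from the perspective of covariance structure analysis makes it easy to see why SEM needs fit indices. The paper reveals the essence of several currently popular fit index tests: a test based on a chi-square-based absolute fit index (such as RMSEA) amounts to resetting the significance level of the chi-square test (to something other than the usual .05), while a test based on a relative fit index (such as NNFI and CFI) amounts to specifying, relative to the null model, the proportion to which the mean square (the ratio of chi-square to degrees of freedom) must be reduced; once NNFI exceeds its cutoff, reporting and testing CFI is unnecessary. Based on these results, some convenient and practical recommendations for fit testing are offered.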
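The relations described in this abstract can be made concrete with a small sketch. It uses the commonly cited textbook formulas for RMSEA, NNFI (TLI), and CFI, not code from the paper; the function names, the N - 1 convention in RMSEA, and the example chi-square values are assumptions for illustration only.

```python
from math import sqrt


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA from the model chi-square, using the N - 1 convention (some software uses N)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def nnfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """NNFI (TLI): relative drop in mean square (chi2/df) from the null model.

    Assumes the null-model mean square exceeds 1, so the denominator is nonzero.
    """
    ms_m = chi2_m / df_m  # mean square of the target model
    ms_b = chi2_b / df_b  # mean square of the null (baseline) model
    return (ms_b - ms_m) / (ms_b - 1.0)


def cfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """CFI: relative drop in estimated noncentrality from the null model."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m)
    return 1.0 if d_b == 0 else 1.0 - d_m / d_b


if __name__ == "__main__":
    # Hypothetical values: target model chi2 = 100 on 50 df, null model chi2 = 1000 on 66 df, N = 200.
    print(f"RMSEA = {rmsea(100, 50, 200):.3f}")
    print(f"NNFI  = {nnfi(100, 50, 1000, 66):.3f}")
    print(f"CFI   = {cfi(100, 50, 1000, 66):.3f}")
```

When the model chi-square exceeds its degrees of freedom, these formulas give 1 - NNFI = (1 - CFI) × df_B / df_M. Because the null model has at least as many degrees of freedom as the target model, NNFI cannot exceed CFI in that case, which is consistent with the abstract's remark that CFI need not be checked once NNFI passes the cutoff.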

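Using a design different from that of Hu & Bentler, a simulation study found that, among the 8 indices they recommended, NNFI, CFI, and IFI yielded nearly the lowest α and β error rates at all sample sizes, with optimal cutoffs all around 0.95, which makes them good indices. Mc and Gamma Hat, however, performed poorly and are therefore not recommended. The 2-index strategy they proposed was shown to reduce α and β error rates and is especially worth trying when the sample size is small.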

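Structural equation model testing: fit indices and chi-square criteria   Total citations: 175, self-citations: 15, citations by others: 175
Discusses the 7 fit index criteria recommended by Hu and Bentler (1998, 1999) for testing structural equation models, with a fairly detailed review of the history, characteristics, and performance of these 7 indices. Points out the shortcomings of their single-index and 2-index criteria based on these indices. Proposes a chi-square criterion at a very low significance level and, partially replicating their simulation examples, compares this criterion with the 7 index criteria; the results show that the new chi-square criterion outperforms 6 of them and is comparable to the remaining one. Finally, briefly explains how fit indices should be examined for model testing and model comparison.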
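A chi-square criterion at a very low significance level can be sketched as follows. The abstract does not state the significance level or the computation used, so everything here is an assumption: the alpha of 0.0001 is a hypothetical value, the function names are invented for illustration, and the critical value comes from the Wilson-Hilferty normal approximation (chosen only to keep the sketch free of third-party dependencies) rather than from an exact chi-square quantile.

```python
from statistics import NormalDist


def chi2_critical_wh(df: int, alpha: float) -> float:
    """Approximate upper-tail chi-square critical value (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(1 - alpha)  # upper-tail standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def reject_model(chi2: float, df: int, alpha: float) -> bool:
    """Reject the model when its chi-square exceeds the critical value at level alpha."""
    return chi2 > chi2_critical_wh(df, alpha)


if __name__ == "__main__":
    df = 50
    # Lowering alpha (hypothetical levels) raises the critical value, making rejection harder.
    for alpha in (0.05, 0.001, 0.0001):
        print(f"alpha = {alpha}: critical chi2 is about {chi2_critical_wh(df, alpha):.1f}")
    print(reject_model(100.0, df, 0.05))
```

For exact quantiles a statistics library would be preferable; the approximation is adequate here only to show how lowering alpha shifts the rejection threshold.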

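This paper introduces three aspects of evaluating how well a model built in structural equation modeling fits the data: overall fit, internal fit, and cross-validation. Overall fit is assessed mainly by combining fit indices; when a fit index lies at the cutoff, other test results and the theoretical basis of the model should also be consulted to reach an overall judgment. Internal fit is examined from four angles: item quality, test reliability, average variance extracted, and validity. In multi-sample analysis, cross-validation can be divided into loose, moderate, and tight replication; if the analysis is limited to the study sample, it can be carried out with ECVI.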

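Yu Jiayuan (余嘉元). Acta Psychologica Sinica (《心理学报》), 1994, 27(2): 219-224
To explore the conditions under which the linear logistic test model (LLTM) fits and their relation to the homogeneity of solution strategies, participants were asked to compare the magnitudes of two powers with negative integer exponents. The data from all participants taken together fit neither the Rasch model nor the LLTM. After participants were divided into groups by solution strategy, the data within a strategy group fit the Rasch model; for the LLTM, however, some items within a strategy group fit well while others fit poorly. This result indicates that homogeneity of solution strategy is a necessary but not a sufficient condition for LLTM fit.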

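Wang Zhao (王昭), Guo Qingke (郭庆科), Han Dan (韩丹). Journal of Psychological Science (《心理科学》), 2012, 35(5): 1225-1232
Person-fit indices are a new method for examining aberrant score patterns in psychological tests. The study examined the effects of 8 fit indices (G, C, MCI, U3, U, W, ECI6, and L) on the reliability and validity of the Eysenck Personality Questionnaire, as well as the correlation of each index with the number of inconsistently answered positively and negatively worded items. The results showed that the reliability and validity of the test improved markedly after removing various proportions of poorly fitting individuals. Person-fit statistics (PFS) could also identify acquiescence response bias in personality tests. Among the indices, l produced the best improvement in test reliability and validity.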

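Tu Dongbo (涂冬波), Zhang Xin (张心), Cai Yan (蔡艳), Dai Haiqi (戴海琦). Journal of Psychological Science (《心理科学》), 2014, 37(1): 205-211
This paper introduces χ² and G², two statistics commonly used in IRT for data-model fit testing, into the field of cognitive diagnosis. It discusses the feasibility and detection performance of these two statistics in data-model fit testing for cognitive diagnosis, as well as their practical application, offering a reference and methodological support for researchers and practitioners. The study found that: (1) in data-model fit testing for cognitive diagnosis, the Type I and Type II error rates of the χ² and G² statistics were both below 5%, indicating that both statistics can effectively detect item misfit and can be used for data-model fit testing in cognitive diagnosis; (2) factors such as test length, sample size, and number of cognitive attributes all affect the detection performance of the χ² and G² statistics; (3) in terms of the two types of error rates, the χ² statistic outperforms the G² statistic; (4) both statistics can effectively detect item misfit caused by misspecified attributes, so they have some promise for detecting attribute misspecification.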

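Wang Yang (王阳), Wen Zhonglin (温忠麟), Fu Yuanshu (付媛姝). Advances in Psychological Science (《心理科学进展》), 2020, 28(11): 1961-1969
Commonly used fit indices for structural equation models have certain limitations: χ² takes the traditional null hypothesis as the target hypothesis and therefore cannot confirm a model, while descriptive fit indices such as RMSEA and CFI lack inferential statistical properties. Equivalence testing effectively remedies these problems. The paper first explains how equivalence testing evaluates the fit of a single model and how it differs from null hypothesis testing, then introduces how equivalence testing is used to analyze measurement invariance, and then uses empirical data to demonstrate the performance of equivalence testing in single-model evaluation and measurement invariance testing, comparing it with traditional model evaluation methods.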

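A scale is said to have measurement equivalence when, in using it for measurement, the relationship between the observed variables and the latent trait is the same across the groups being compared. In particular, individuals from different groups who have equal scores on the latent trait should also have equal scores on the observed variables. A measurement instrument must satisfy measurement equivalence before between-group differences can be compared. This paper first clarifies the concept of measurement equivalence and its research history, then explains the importance of measurement equivalence and the need to analyze it, then discusses the 5 conditions that measurement equivalence must satisfy in structural equation modeling, and finally lists the fit indices used to judge model quality.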

This study examines the unscaled and scaled root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker–Lewis index (TLI) of diagonally weighted least squares (DWLS) and unweighted least squares (ULS) estimators in structural equation modeling with ordered categorical data. We show that the number of categories and threshold values for categorization can unappealingly impact the DWLS unscaled and scaled fit indices, as well as the ULS scaled fit indices in the population, given that analysis models are misspecified and that the threshold structure is saturated. Consequently, a severely misspecified model may be considered acceptable, depending on how the underlying continuous variables are categorized. The corresponding CFI and TLI are less dependent on the categorization than RMSEA but are less sensitive to model misspecification in general. In contrast, the number of categories and threshold values do not impact the ULS unscaled fit indices in the population.

A new type of nonnormality correction to the RMSEA has recently been developed, which has several advantages over existing corrections. In particular, the new correction adjusts the sample estimate of the RMSEA for the inflation due to nonnormality, while leaving its population value unchanged, so that established cutoff criteria can still be used to judge the degree of approximate fit. A confidence interval (CI) for the new robust RMSEA based on the mean-corrected (“Satorra-Bentler”) test statistic has also been proposed. Follow-up work has provided the same type of nonnormality correction for the CFI (Brosseau-Liard & Savalei, 2014). These developments have recently been implemented in lavaan. This note has three goals: a) to show how to compute the new robust RMSEA and CFI from the mean-and-variance corrected test statistic; b) to offer a new CI for the robust RMSEA based on the mean-and-variance corrected test statistic; and c) to caution that the logic of the new nonnormality corrections to RMSEA and CFI is most appropriate for the maximum likelihood (ML) estimator, and cannot easily be generalized to the most commonly used categorical data estimators.

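Effects of motivation and situation on academic help-seeking among children with different levels of self-control   Total citations: 6, self-citations: 0, citations by others: 6
Zheng Xinjun (郑信军). Journal of Psychological Science (《心理科学》), 2000, 23(1): 80-83
This experiment used primary school children as participants to study the academic help-seeking behavior of children with different levels of self-control under different motivational orientations and situational conditions. The results showed that: (1) a solitary problem-solving situation without self-esteem pressure led to more help-seeking than a group-pressure situation; (2) children with low self-control engaged in more executive help-seeking than children with high self-control, but mainly in the solitary situation without self-esteem pressure; (3) in the group-pressure situation, ego-involved children showed more help-seeking than task-involved children; (4) in the solitary situation without self-esteem pressure, task-involved children, compared with ego-involved children, showed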

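An analysis of the "category order" in the Chu bamboo-slip text Kongzi Shilun (《孔子诗论》)   Total citations: 1, self-citations: 0, citations by others: 1
Through an analysis of the Chu bamboo-slip text Kongzi Shilun (《孔子诗论》), combined with an examination of how the text of the Shijing (《诗经》) took shape, the paper concludes that the categories of the Shijing and their order are Feng (《风》), Xiaoya (《小雅》), Daya (《大雅》), and Song (《颂》), and clarifies the textual relationships among Confucius, the Kongzi Shilun, and the Mao Shi (《毛诗》). The compilation of the Shijing was a gradual process; its classification is of long standing and has never been altered. The "category order" of the Shi likewise took shape historically, arising naturally as the text was edited. In arranging the Shijing, Confucius did not work from any presupposed "ordering" principle; his so-called "deletion of poems" amounted only to the necessary, ordinary collation of ancient texts. The occasional passages in the excavated Kongzi Shilun that reverse the "category order" of the Shi are also unremarkable: they do not show that Confucius compiled a text whose order was the opposite of the traditional one, they do not negate the historical continuity of the Mao Shi text, and still less do they show that the "reversed category order" in the Kongzi Shilun carries some deeper, unfathomable meaning.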

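Li Meijuan (李美娟), Liu Yue (刘玥), Liu Hongyun (刘红云). Acta Psychologica Sinica (《心理学报》), 2020, 52(4): 528-540
While completing computer-based dynamic tests, students generate large amounts of time-stamped process data. Based on 139,990 data records from 3,196 students in 5 countries (regions) on a traffic problem-solving task in PISA 2012, this study extends the multilevel mixture IRT (MMixIRT) model to explore the class characteristics of problem-solving process strategies. The results show that the model can not only analyze, on the basis of behavior sequences, the typical features of strategy use among students from different countries (regions) when solving the problem, but also provide individual-level ability estimates. The extended MMixIRT model can be used to analyze the characteristics of process data.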

The cross-cultural validity of the Child Behavior Checklist for Ages 2-3 (CBCL/2-3) was tested in three Dutch samples of children referred to mental health services, from the general population, and from a twin study. Six scales were derived from factor analyses and labeled Oppositional, Aggressive, and Overactive, which constituted a broadband Externalizing grouping; Withdrawn/Depressed and Anxious, which constituted a broadband Internalizing grouping; and Sleep Problems. Internal consistencies of the scales, their test-retest reliabilities, interparent agreement, discriminative power, predictive relations with problem ratings 2 years later, and relations to other instruments designed to measure general development and behavior problems were adequate, and highly comparable to psychometric properties in American samples. It was concluded that across languages and cultures behavioral/emotional problems of young preschoolers may be adequately assessed with the CBCL/2-3.

To investigate the nature of the task-stimulus interaction in tachistoscopic recognition of kana and kanji, right-handed normal subjects performed two phonological tasks and two visual tasks. In the phonological tasks, the subjects compared the members of a pair of kana or kanji appearing in the right or left visual field on the basis of phonological identity; while in the visual tasks, they compared the members of a pair of kana or kanji on the basis of visual identity. The results showed a significant Visual Field × Task interaction as well as a significant Task × Stimulus interaction, indicating that both the type of stimuli and the nature of task demands contribute importantly to the determination of visual field asymmetry and hence the relative participation of each hemisphere.

Objectives: The purpose of this study was to create and provide validity evidence for the Processes of Change in Psychological Skills Training Questionnaire (PCPSTQ). Design: The current study used a cross-sectional research design. Methods: Five hundred fifty-nine NCAA Division I, professional, and Olympic level athletes participated in the current study. To create the PCPSTQ, an initial pool of 114 items was generated by the research team. After a content validity process, 65 items were retained for analysis. Exploratory structural equation modeling was used as an analytic strategy to identify the most appropriate factor structure for the PCPSTQ. Decisions about the most appropriate model were made using multiple fit indices. To examine the construct validity of the PCPSTQ, a series of one-way ANOVAs were conducted to examine differences in processes of change use across stage of change. Results: In the current study, validity evidence provided support for a 7-factor process of change measure (χ² = 325.84, p < .001; Comparative Fit Index = .971; Tucker Lewis Index = .945; Root Mean Square Error of Approximation = .037; Standardized Root Mean Square Residual = .020). Results also supported the construct validity of the scale as a significant difference in process of change use across stage of change was reported for all seven processes. Conclusions: Results of the current study support the factor structure and construct validity of the PCPSTQ. It appears that the processes of behavior change reported across multiple behavior change domains might also be viable for sport psychology professionals.

