The root mean square error of approximation (RMSEA) is a popular fit index in structural equation modeling (SEM). Typically, RMSEA is computed using the normal theory maximum likelihood (ML) fit function. Under nonnormality, the uncorrected sample estimate of the ML RMSEA tends to be inflated. Two robust corrections to the sample ML RMSEA have been proposed, but the theoretical and empirical differences between the 2 have not been explored. In this article, we investigate the behavior of these 2 corrections. We show that the virtually unknown correction due to Li and Bentler (2006; "Robust statistical tests for evaluating the hypothesis of close fit of misspecified mean and covariance structural models," UCLA Statistics Preprint #506, Los Angeles: University of California), which we label the sample-corrected robust RMSEA, is a consistent estimate of the population ML RMSEA yet drastically reduces bias due to nonnormality in small samples. On the other hand, the popular correction implemented in several SEM programs, which we label the population-corrected robust RMSEA, has poor properties because it estimates a quantity that decreases with increasing nonnormality. We recommend the use of the sample-corrected RMSEA with nonnormal data and its wide implementation.
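
The difference between the two corrections can be sketched numerically. The code below is a minimal illustration, not the authors' implementation: it assumes the usual point-estimate form of RMSEA with N - 1 in the denominator, assumes the population-corrected version plugs in the Satorra-Bentler scaled statistic T_ML / c, and assumes the sample-corrected version subtracts c * df from the unscaled T_ML. The function names and the input values are hypothetical.

```python
import math


def rmsea_ml(t_ml, df, n):
    """Uncorrected sample RMSEA from the ML chi-square statistic."""
    return math.sqrt(max(t_ml - df, 0.0) / (df * (n - 1)))


def rmsea_population_corrected(t_ml, df, n, c):
    """Assumed form: RMSEA computed from the scaled statistic T_ML / c."""
    t_scaled = t_ml / c
    return math.sqrt(max(t_scaled - df, 0.0) / (df * (n - 1)))


def rmsea_sample_corrected(t_ml, df, n, c):
    """Assumed form: only the expected value under the null (c * df) is corrected."""
    return math.sqrt(max(t_ml - c * df, 0.0) / (df * (n - 1)))


# Hypothetical inputs: T_ML = 60, df = 20, N = 201, scaling constant c = 1.5
uncorrected = rmsea_ml(60, 20, 201)
population = rmsea_population_corrected(60, 20, 201, 1.5)
sample = rmsea_sample_corrected(60, 20, 201, 1.5)
print(round(uncorrected, 3), round(population, 3), round(sample, 3))
```

With these made-up inputs the three values are roughly 0.100, 0.071, and 0.087. Under the assumed formulas the sample-corrected value equals the population-corrected value multiplied by the square root of c, which is one way to see why the population-corrected version shrinks as nonnormality (and hence c) grows.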


The brief Francis scale of religiosity (Francis-5) has shown acceptable psychometric performance in Colombian adolescents. However, a confirmatory factor analysis (CFA) has not been performed. The objective of this research was to conduct a CFA of the Francis-5 in a sample of students from Santa Marta, Colombia. A validation study was performed. Participants were 350 students between 10 and 17 years old; 54% were female. A CFA was performed to test the dimensionality of five-item and four-item versions of the scale. The authors calculated five goodness-of-fit indices (chi-square, RMSEA, CFI, TLI, and SMSR). The Francis-5 showed the following goodness-of-fit coefficients: chi-square = 18.5, df = 5, p = .002, RMSEA = .09 (90% CI .05 to .13), CFI = .99, TLI = .99, and SMSR = .01. The version with the best goodness-of-fit coefficients was the Francis-5 without the fourth item ('Prayer helps me a lot'), with chi-square = .55, df = 2, p = .76, RMSEA = .00 (90% CI .00 to .07), CFI = 1.00, TLI = 1.00, and SMSR = .01. In conclusion, the dimensionality of the Francis-5 is questionable; the version without item 4 fits the data better.
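
As a plausibility check, the reported RMSEA values can be recomputed from the chi-square, its degrees of freedom, and the sample size. The arithmetic below assumes the common point-estimate formula with N - 1 in the denominator and N = 350; the abstract does not state which software or formula was used, so this is an illustration rather than the authors' computation.

```latex
\[
\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2 - df,\ 0)}{df\,(N - 1)}}
\]
\[
\text{Five items:}\quad
\sqrt{\frac{18.5 - 5}{5 \times 349}} = \sqrt{\frac{13.5}{1745}} \approx .088 \approx .09
\]
\[
\text{Four items:}\quad
\chi^2 = .55 < df = 2 \;\Rightarrow\; \mathrm{RMSEA} = 0
\]
```

Both results agree with the rounded values in the abstract (.09 and .00).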

Survey data often contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. With typical nonnormally distributed data in practice, a rescaled statistic T_RML proposed by Satorra and Bentler was recommended in the literature of SEM. However, T_RML has been shown to be problematic when the sample size N is small and/or the number of variables p is large. There does not exist a reliable test statistic for SEM with small N or large p, especially with nonnormally distributed data. Following the principle of Bartlett correction, this article develops empirical corrections to T_RML so that the mean of the empirically corrected statistics approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics control type I errors reasonably well even when N is smaller than 2p, where T_RML may reject the correct model 100% of the time even for normally distributed data. The application of the empirically corrected statistics is illustrated via a real data example.

This research is concerned with two topics in assessing model fit for categorical data analysis. The first topic involves the application of a limited-information overall test, introduced in the item response theory literature, to structural equation modeling (SEM) of categorical outcome variables. Most popular SEM test statistics assess how well the model reproduces estimated polychoric correlations. In contrast, limited-information test statistics assess how well the underlying categorical data are reproduced. Here, the recently introduced C2 statistic of Cai and Monroe (2014) is applied. The second topic concerns how the root mean square error of approximation (RMSEA) fit index can be affected by the number of categories in the outcome variable. This relationship creates challenges for interpreting RMSEA. While the two topics initially appear unrelated, they may conveniently be studied in tandem since RMSEA is based on an overall test statistic, such as C2. The results are illustrated with an empirical application to data from a large-scale educational survey.

The root mean square error of approximation (RMSEA) and the comparative fit index (CFI) are two widely applied indices to assess fit of structural equation models. Because these two indices are viewed positively by researchers, one might presume that their values would yield comparable qualitative assessments of model fit for any data set. When RMSEA and CFI offer different evaluations of model fit, we argue that researchers are likely to be confused and potentially make incorrect research conclusions. We derive the necessary as well as the sufficient conditions for inconsistent interpretations of these indices. We also study inconsistency in results for RMSEA and CFI at the sample level. Rather than indicating that the model is misspecified in a particular manner or that there are any flaws in the data, the two indices can disagree because (a) they evaluate, by design, the magnitude of the model's fit function value from different perspectives; (b) the cutoff values for these indices are arbitrary; and (c) the meaning of “good” fit and its relationship with fit indices are not well understood. In the context of inconsistent judgments of fit using RMSEA and CFI, we discuss the implications of using cutoff values to evaluate model fit in practice and to design SEM studies.
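
A small numerical sketch can show how the two indices come to disagree. It assumes the standard sample formulas (RMSEA from the model chi-square alone, CFI from the model chi-square relative to a baseline model) and the conventional but arbitrary cutoffs of .06 and .95; the input values and function names are hypothetical and are not taken from the article.

```python
import math


def rmsea(t_model, df_model, n):
    """Sample RMSEA: misfit per degree of freedom, scaled by N - 1."""
    return math.sqrt(max(t_model - df_model, 0.0) / (df_model * (n - 1)))


def cfi(t_model, df_model, t_base, df_base):
    """Sample CFI: model noncentrality relative to the baseline model's."""
    num = max(t_model - df_model, 0.0)
    den = max(t_model - df_model, t_base - df_base, 0.0)
    return 1.0 if den == 0.0 else 1.0 - num / den


# Hypothetical case: modest model misfit, but a baseline model that
# is itself not far from fitting (weakly correlated variables).
r = rmsea(80, 40, 501)
c = cfi(80, 40, 400, 55)
print(f"RMSEA = {r:.3f} (below .06: {r < 0.06})")
print(f"CFI   = {c:.3f} (at least .95: {c >= 0.95})")
```

Here RMSEA is about 0.045 while CFI is about 0.884, so the same model would be labeled acceptable by one cutoff and poor by the other. The disagreement arises only because CFI depends on the baseline model's misfit and RMSEA does not, which matches point (a) in the abstract.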

A great deal of educational and social data arises from cluster sampling designs where clusters involve schools, classrooms, or communities. A mistake that is sometimes encountered in the analysis of such data is to ignore the effect of clustering and analyse the data as if it were based on a simple random sample. This typically leads to an overstatement of the precision of results and too liberal conclusions about precision and statistical significance of mean differences. This paper gives simple corrections to the test statistics that would be computed in an analysis of variance if clustering were (incorrectly) ignored. The corrections are multiplicative factors depending on the total sample size, the cluster size, and the intraclass correlation structure. For example, the corrected F statistic has Fisher's F distribution with reduced degrees of freedom. The corrected statistic reduces to the F statistic computed by ignoring clustering when the intraclass correlations are zero. It reduces to the F statistic computed using cluster means when the intraclass correlations are unity, and it is in between otherwise. A similar adjustment to the usual statistic for testing a linear contrast among group means is described.

This study investigated the sensitivity of common fit indices (i.e., RMSEA, CFI, TLI, SRMR-W, and SRMR-B) for detecting misspecified multilevel SEMs. The design factors for the Monte Carlo study were numbers of groups in between-group models (100, 150, and 300), group size (10, 20, 30, and 60), intra-class correlation (low, medium, and high), and the types of model misspecification (Simple and Complex). The simulation results showed that CFI, TLI, and RMSEA could only identify the misspecification in the within-group model. Additionally, CFI, TLI, and RMSEA were more sensitive to misspecification in pattern coefficients while SRMR-W was more sensitive to misspecification in factor covariance. Moreover, TLI outperformed both CFI and RMSEA in terms of the hit rates of detecting the within-group misspecification in factor covariance. On the other hand, SRMR-B was the only fit index sensitive to misspecification in the between-group model and more sensitive to misspecification in factor covariance than misspecification in pattern coefficients. Finally, we found that the influence of ICC on the performance of targeted fit indices was trivial.

Empirical research has identified two distinct item clusters used to measure perceived behavioral control (PBC) labeled self-efficacy and controllability. Self-efficacy has been reported as the superior predictor of intention in all research efforts with these PBC factors. However, we argue that the conception of these item clusters originates from backward theorizing and hypothesize that controllability items are better indicators of the originally conceived PBC construct, while self-efficacy taps the measurement domains of both PBC and intention. Therefore, the purpose of this study was to investigate factor distinction with PBC and intention items across disparate samples of 302 undergraduate students and 267 cancer survivors in the exercise domain. Results found the only model with acceptable fit across populations was that of controllability and intention (χ²(8) = 6.91, p = 0.55; RMSEA = 0.00; CFI = 1.00 for undergraduates and χ²(8) = 26.51, p < 0.01; RMSEA = 0.09; CFI = 0.97 for cancer survivors) supporting our hypothesis that the controllability concept, but not the self-efficacy factor, has a clean measurement distinction from intention. Further, self-efficacy items showed factor complexity between PBC and intention across both samples, suggesting that self-efficacy may have been reported as superior to controllability in predicting intention due to measurement redundancy rather than meaningful causal influence.

A scaled difference test statistic $\tilde{T}_d$ that can be computed from standard software of structural equation models (SEM) by hand calculations was proposed in Satorra and Bentler (Psychometrika 66:507–514, 2001). The statistic $\tilde{T}_d$ is asymptotically equivalent to the scaled difference test statistic $\bar{T}_d$ introduced in Satorra (Innovations in Multivariate Statistical Analysis: A Festschrift for Heinz Neudecker, pp. 233–247, 2000), which requires more involved computations beyond standard output of SEM software. The test statistic $\tilde{T}_d$ has been widely used in practice, but in some applications it is negative due to negativity of its associated scaling correction. Using the implicit function theorem, this note develops an improved scaling correction leading to a new scaled difference statistic $\bar{T}_d$ that avoids negative chi-square values.
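
The hand calculation behind the 2001 statistic, and the way it can turn negative, can be sketched as follows. The code assumes the widely used form of that calculation: the difference of the unscaled ML statistics divided by a scaling correction built from each model's degrees of freedom and scaling constant. It does not implement the improved correction developed in the note, which needs an additional model run. All names and numbers are hypothetical.

```python
def sb_scaled_difference(t0, df0, c0, t1, df1, c1):
    """Scaled difference test for nested models (2001 hand calculation).

    t0, df0, c0: unscaled ML chi-square, df, and scaling constant of the
                 more restricted model.
    t1, df1, c1: the same quantities for the less restricted model.
    Returns (scaled difference statistic, scaling correction cd).
    """
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    return (t0 - t1) / cd, cd


# Hypothetical well-behaved case: cd is positive.
td_ok, cd_ok = sb_scaled_difference(120.0, 50, 1.20, 90.0, 45, 1.25)

# Hypothetical problem case: a larger c1 makes cd negative, and the
# "chi-square" difference becomes negative even though t0 > t1.
td_bad, cd_bad = sb_scaled_difference(120.0, 50, 1.10, 90.0, 45, 1.30)

print(cd_ok, td_ok)
print(cd_bad, td_bad)
```

In the first case the correction is 0.75 and the scaled difference is 40 on 5 degrees of freedom. In the second the correction is about -0.7, giving a scaled difference of about -42.9, which is the situation the improved scaling correction is designed to avoid.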

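Wen Han (温涵), Liang Yunsi (梁韵斯) 《心理科学》 (Journal of Psychological Science) 2015,(4):987-994
Testing fit indices is an important step in evaluating structural equation models (SEM). Comparing SEM with traditional regression models from the perspective of covariance structure analysis makes it easy to understand why SEM needs fit indices. The article reveals the essence of several currently popular fit index tests: a test based on a chi-square-based absolute fit index (e.g., RMSEA) in essence resets the significance level of the chi-square test (to a level different from the usual .05), whereas a test based on a relative fit index (e.g., NNFI and CFI) in essence sets, relative to the null model, the proportion to which the mean square (the ratio of chi-square to degrees of freedom) must be reduced. Once NNFI exceeds its critical value, reporting and testing CFI is unnecessary. Based on these results, some convenient and practical recommendations for fit testing are offered.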

Previous studies have shown that parental attachment is associated with higher levels of posttraumatic growth (PTG) and resilience in individuals who have experienced traumatic events. The present study investigated perceived social support as one pathway through which parental attachment is related to PTG and resilience among Chinese adolescents who have experienced trauma. Participants were 443 Chinese adolescents who had experienced a severe tornado a year prior to this study. The results showed that our model fitted the data well [χ²/df = 2.847, comparative fit index (CFI) = 0.970, TLI = 0.963, root mean square error of approximation (RMSEA) (90% CI) = 0.065 (0.056–0.073)] and revealed that perceived social support partially mediated the relationships between parental attachment and both PTG and resilience. The clinical implications and limitations of our research, and recommendations for future research are discussed in this paper.

This simulation study investigates the performance of three test statistics, T1, T2, and T3, used to evaluate structural equation model fit under nonnormal data conditions. T1 is the well-known mean-adjusted statistic of Satorra and Bentler. T2 is the mean-and-variance adjusted statistic of Satterthwaite type where the degrees of freedom is manipulated. T3 is a recently proposed version of T2 that does not manipulate degrees of freedom. Discrepancies between these statistics and their nominal chi-square distribution in terms of errors of Type I and Type II are investigated. All statistics are shown to be sensitive to increasing kurtosis in the data, with Type I error rates often far off the nominal level. Under excess kurtosis true models are generally over-rejected by T1 and under-rejected by T2 and T3, which have similar performance in all conditions. Under misspecification there is a loss of power with increasing kurtosis, especially for T2 and T3. The coefficient of variation of the nonzero eigenvalues of a certain matrix is shown to be a reliable indicator for the adequacy of these statistics.

Goodness-of-fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square, but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's (1984) asymptotically distribution-free method and Satorra and Bentler's (1988, 1994) mean scaling statistic were developed under the presumption of nonnormality in the factors and errors. This article finds a new application for them in the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra and Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent, and Bibby's (1980) study of students tested for their ability in 5 content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.

A gender-based model has been designed to study the relationships that exist among self-concept dimensions and some health-promoting behaviours (consumption of healthy food and participation in sports) and health-risk behaviours (consumption of tobacco, alcohol, cannabis and unhealthy food). The model was employed on a representative sample of 1,038 adolescents from the Valencian Community, aged between 15 and 18 years old (528 girls and 510 boys, M age = 16.3; SD = .92). Path analysis was carried out with the maximum likelihood method in the LISREL VIII program. The results show the model's good fit to the data with regard to both the boys (χ²/df = 2.57; RMSR = .04; RMSEA = 0.5; GFI = .98; NNFI = .91; CFI = .97; CN = 350.10) and the girls (χ²/df = 3.28; RMSR = .04; RMSEA = 0.6; GFI = .98; NNFI = .87; CFI = .95; CN = 284.42). For the two sexes, behavioural conduct, social acceptance and close friendship emerged as good predictors of health-risk behaviours. Athletic competence had an indirect influence on health behaviours, with participation in sports being a mediating variable in that relationship.

A variety of indices are commonly used to assess model fit in structural equation modeling. However, fit indices obtained from the normal theory maximum likelihood fit function are affected by the presence of nonnormality in the data. We present a nonnormality correction for 2 commonly used incremental fit indices, the comparative fit index and the Tucker-Lewis index. This correction uses the Satorra-Bentler scaling constant to modify the sample estimate of these fit indices but does not affect the population value. We argue that this type of nonnormality correction is superior to the correction that changes the population value of the fit index implemented in some software programs. In a simulation study, we demonstrate that our correction performs well across a variety of sample sizes, model types, and misspecification types.

Traditional structural equation modeling (SEM) techniques have trouble dealing with incomplete and/or nonnormal data that are often encountered in practice. Yuan and Zhang (2011a) developed a two-stage procedure for SEM to handle nonnormal missing data and proposed four test statistics for overall model evaluation. Although these statistics have been shown to work well with complete data, their performance for incomplete data has not been investigated in the context of robust statistics.

Focusing on a linear growth curve model, a systematic simulation study is conducted to evaluate the accuracy of the parameter estimates and the performance of five test statistics including the naive statistic derived from normal distribution based maximum likelihood (ML), the Satorra-Bentler scaled chi-square statistic (RML), the mean- and variance-adjusted chi-square statistic (AML), Yuan-Bentler residual-based test statistic (CRADF), and Yuan-Bentler residual-based F statistic (RF). Data are generated and analyzed in R using the package rsem (Yuan & Zhang, 2011b).

Based on the simulation study, we can observe the following: (a) The traditional normal distribution-based method cannot yield accurate parameter estimates for nonnormal data, whereas the robust method obtains much more accurate model parameter estimates for nonnormal data and performs almost as well as the normal distribution-based method for normally distributed data. (b) With the increase of sample size, or the decrease of missing rate or the number of outliers, the parameter estimates are less biased and the empirical distributions of test statistics are closer to their nominal distributions. (c) The ML test statistic does not work well for nonnormal or missing data. (d) For nonnormal complete data, CRADF and RF work relatively better than RML and AML. (e) For missing completely at random (MCAR) missing data, in almost all the cases, RML and AML work better than CRADF and RF. (f) For nonnormal missing at random (MAR) missing data, CRADF and RF work better than AML. (g) The performance of the robust method does not seem to be influenced by the symmetry of outliers.

Ayala Cohen 《Psychometrika》1986,51(3):379-391
A test is proposed for the equality of the variances of k ≥ 2 correlated variables. Pitman's test for k = 2 reduces the null hypothesis to zero correlation between their sum and their difference. Its extension, eliminating nuisance parameters by a bootstrap procedure, is valid for any correlation structure between the k normally distributed variables. A Monte Carlo study for several combinations of sample sizes and number of variables is presented, comparing the level and power of the new method with previously published tests. Some nonnormal data are included, for which the empirical level tends to be slightly higher than the nominal one. The results show that our method is close in power to the asymptotic tests which are extremely sensitive to nonnormality, yet it is robust and much more powerful than other robust tests. This research was supported by the fund for the promotion of research at the Technion.
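
The reduction used by Pitman's test for k = 2 follows from a one-line covariance identity. The derivation below is standard; the closing t statistic is the usual test of zero correlation under bivariate normality and is given here as an assumption about the form of the test, since the abstract does not spell it out. The symbols S, D, and r_{SD} are introduced only for this illustration.

```latex
\[
S = X + Y, \qquad D = X - Y
\]
\[
\operatorname{Cov}(S, D)
  = \operatorname{Var}(X) - \operatorname{Cov}(X, Y)
    + \operatorname{Cov}(Y, X) - \operatorname{Var}(Y)
  = \sigma_X^2 - \sigma_Y^2
\]
\[
\sigma_X^2 = \sigma_Y^2 \iff \rho_{SD} = 0
\]
\[
t = \frac{r_{SD}\,\sqrt{n - 2}}{\sqrt{1 - r_{SD}^2}} \sim t_{n-2}
\quad \text{under } H_0 \text{ (bivariate normal data)}
\]
```

Because the cross-covariance terms cancel, the identity holds whatever the correlation between X and Y, which is why the test remains valid for correlated variables.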


We examined the mediating role of health literacy in the relationships between participant demographic characteristics and health information recall. Baseline data from two studies that focused on hypertensive adults (N = 1190; M = 62.28 years, SD = 11.98; 35.5% female; 45.9% African-American) were analyzed. The final model, which adjusted for recruitment site, indicated that financial status, race, and education were indirectly related to health information recall through health literacy. Increasing education was also directly related to better health information recall. Increasing age was not related to health literacy, but was related to poorer health information recall. The final model fit the data very well, χ²(3) = 0.69, p = .36, RMSEA = .000 (90% CI = .000 to .024), CFI = 1.00. The results suggest that health literacy might be one of the mechanisms underlying the relationships between participant demographic characteristics and poor health outcomes due to inaccurate recall of instructions.

The data obtained from one‐way independent groups designs are typically non‐normal in form and rarely equally variable across treatment populations (i.e. population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e. the analysis of variance F test) typically provides invalid results (e.g. too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non‐normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test and the Welch test, applied either to the usual least squares estimators of central tendency and variability or to robust estimators (i.e. trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non‐normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non‐normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non‐normal.

Burnout has been recognized as an important stress‐related problem. Researchers have been troubled by some of the psychometric limitations of the questionnaires developed to evaluate burnout. This study was designed to assess the factor structure of the Spanish Burnout Inventory in a sample of 548 Brazilian public administration employees. This instrument comprises 20 items distributed in four dimensions: enthusiasm toward the job (5 items), psychological exhaustion (4 items), indolence (6 items), and guilt (5 items). The factor structure was examined through confirmatory factor analysis. To assess the factorial validity of the Spanish Burnout Inventory, four alternative models were tested. The four‐factor model obtained an adequate data fit for the sample, χ²(164) = 514.358, p < .001, RMSEA = .062, GFI = .910, NFI = .915, CFI = .940, and AIC = 606.358. The results showed that the four‐factor model of the Spanish Burnout Inventory possesses adequate psychometric properties in the Brazilian cultural context. Implications for future research and practice are also discussed.
