Similar literature
1.
Nonnormality of univariate data has been extensively examined previously (Blanca et al., Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(2), 78–84, 2013; Micceri, Psychological Bulletin, 105(1), 156, 1989). However, less is known about the potential nonnormality of multivariate data, although multivariate analysis is commonly used in psychological and educational research. Using univariate and multivariate skewness and kurtosis as measures of nonnormality, this study examined 1,567 univariate distributions and 254 multivariate distributions collected from authors of articles published in Psychological Science and the American Educational Research Journal. We found that 74 % of univariate distributions and 68 % of multivariate distributions deviated from normal distributions. In a simulation study using typical values of skewness and kurtosis that we collected, we found that the resulting type I error rates were 17 % in a t-test and 30 % in a factor analysis under some conditions. Hence, we argue that it is time to routinely report skewness and kurtosis along with other summary statistics such as means and variances. To facilitate future reporting of skewness and kurtosis, we provide a tutorial on how to compute univariate and multivariate skewness and kurtosis using SAS, SPSS, R and a newly developed Web application.
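
As a rough illustration of the quantities this abstract refers to, the following Python sketch computes moment-based univariate skewness and excess kurtosis, together with Mardia's multivariate skewness and kurtosis. It is not the authors' tutorial code. The function names are hypothetical, Mardia's measures are an assumption (the abstract does not name the multivariate definitions), and the simple divide-by-n estimators used here may differ from the bias-corrected values that statistical packages report by default.

```python
def central_moment(xs, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def skewness(xs):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


def invert(a):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mardia(rows):
    """Return (b1, b2): Mardia's multivariate skewness and kurtosis.

    rows is a list of observations, each a list of p values.
    Under multivariate normality b2 is expected to be near p * (p + 2).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Maximum-likelihood covariance matrix (divide by n).
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(p)]
           for a in range(p)]
    inv = invert(cov)
    # z_i = S^-1 (x_i - mean), computed once per observation.
    z = [[sum(inv[a][b] * c[b] for b in range(p)) for a in range(p)]
         for c in centered]

    def d(i, j):
        return sum(centered[i][a] * z[j][a] for a in range(p))

    b1 = sum(d(i, j) ** 3 for i in range(n) for j in range(n)) / n ** 2
    b2 = sum(d(i, i) ** 2 for i in range(n)) / n
    return b1, b2


if __name__ == "__main__":
    sample = [[2.0, 1.0], [3.5, 2.0], [1.0, 0.5], [6.0, 4.5], [2.5, 1.5]]
    print(skewness([row[0] for row in sample]))
    print(mardia(sample))
```

The double loop in `mardia` is quadratic in the number of observations, which is acceptable for a small illustration but would need vectorizing for large data sets.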

2.
Many models for multivariate data analysis can be seen as special cases of the linear dynamic or state space model. Contrary to the classical approach to linear dynamic systems analysis, in which high-dimensional exact solutions are sought, the model presented here is developed from a social science framework where low-dimensional approximate solutions are preferred. Borrowing concepts from the theory of mixture distributions, the linear dynamic model can be viewed as a multi-layered regression model, in which the output variables are imprecise manifestations of an unobserved continuous process. An additional layer of mixing makes it possible to incorporate non-normal as well as ordinal variables. Using the EM-algorithm, we find estimates of the unknown model parameters, simultaneously providing stability estimates. The model is very general and cannot be well estimated by other estimation methods. We illustrate the applicability of the obtained procedure through an example with generated data.

3.
When conducting robustness research where the focus of attention is on the impact of non-normality, the marginal skewness and kurtosis are often used to set the degree of non-normality. Monte Carlo methods are commonly applied to conduct this type of research by simulating data from distributions with skewness and kurtosis constrained to pre-specified values. Although several procedures have been proposed to simulate data from distributions with these constraints, no corresponding procedures have been applied for discrete distributions. In this paper, we present two procedures based on the principles of maximum entropy and minimum cross-entropy to estimate the multivariate observed ordinal distributions with constraints on skewness and kurtosis. For these procedures, the correlation matrix of the observed variables is not specified but depends on the relationships between the latent response variables. With the estimated distributions, researchers can study robustness not only focusing on the levels of non-normality but also on the variations in the distribution shapes. A simulation study demonstrates that these procedures yield excellent agreement between specified parameters and those of estimated distributions. A robustness study concerning the effect of distribution shape in the context of confirmatory factor analysis shows that shape can affect the robust \(\chi^2\) and robust fit indices, especially when the sample size is small, the data are severely non-normal, and the fitted model is complex.

4.
This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the distributions of the factors are modeled nonparametrically through a dynamic hierarchical Dirichlet process prior. A Markov chain Monte Carlo algorithm is developed for fitting the model, and the methodology is exemplified through a study of the dynamics of public attitudes toward science and technology in the United States over the period 1992–2001.

5.
Explaining group-level outcomes from individual-level predictors requires aggregating the individual-level scores to the group level and correcting the group-level estimates for measurement errors in the aggregated scores. However, for discrete variables it is not clear how to perform the aggregation and correction. It is shown how stepwise latent class analysis can be used to do this. First, a latent class model is estimated in which the scores on a discrete individual-level predictor are used to construct group-level latent classes. Second, this latent class model is used to aggregate the individual-level predictor by assigning the groups to the latent classes. Third, a group-level analysis is performed in which the aggregated measures are related to the remaining group-level variables while correcting for the measurement error in the class assignments. This stepwise approach is introduced in a multilevel mediation model with a single individual-level mediator, and compared to existing methods in a simulation study. We also show how a mediation model with multiple group-level latent variables can be used with multiple individual-level mediators and this model is applied to explain team productivity (group level) as a function of job control (individual level), job satisfaction (individual level), and enriched job design (group level).

6.
Generalized structured component analysis (GSCA) is a component-based approach to structural equation modelling, which adopts components of observed variables as proxies for latent variables and examines directional relationships among latent and observed variables. GSCA has been extended to deal with a wider range of data types, including discrete, multilevel or intensive longitudinal data, as well as to accommodate a greater variety of complex analyses such as latent moderation analysis, the capturing of cluster-level heterogeneity, and regularized analysis. To date, however, there has been no attempt to generalize the scope of GSCA into the Bayesian framework. In this paper, a novel extension of GSCA, called BGSCA, is proposed that estimates parameters within the Bayesian framework. BGSCA can be more attractive than the original GSCA for various reasons. For example, it can infer the probability distributions of random parameters, account for error variances in the measurement model, provide additional fit measures for model assessment and comparison from the Bayesian perspectives, and incorporate external information on parameters, which may be obtainable from past research, expert opinions, subjective beliefs or knowledge on the parameters. We utilize a Markov chain Monte Carlo method, the Gibbs sampler, to update the posterior distributions for the parameters of BGSCA. We conduct a simulation study to evaluate the performance of BGSCA. We also apply BGSCA to real data to demonstrate its empirical usefulness.

7.
An algorithm described by Graybill (1969) factors a population correlation matrix, R, into an upper and lower triangular matrix, T and T′, such that R = T′T. The matrix T is used to generate multivariate data sets from a multinormal distribution. When this algorithm is used to generate data for nonnormal distributions, however, the sample correlations are systematically biased downward. We describe an iterative technique that removes this bias by adjusting the initial correlation matrix, R, factored by the Graybill algorithm. The method is illustrated by simulating a multivariate study by Mihal and Barrett (1976). Large-N simulations indicate that the iterative technique works: multivariate data sets generated with this approach successfully model both the univariate distributions of the individual variables and their multivariate structure (as assessed by intercorrelation and regression analyses).
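
The factor-and-generate step in this abstract can be sketched as follows. The abstract does not reproduce Graybill's procedure, so this sketch assumes a standard Cholesky factorization as a stand-in: it also yields an upper triangular T with R = T′T. The function names are hypothetical. Only the multinormal case is shown; the authors' iterative correction for nonnormal data is not implemented here.

```python
import math
import random


def cholesky_upper(r):
    """Return upper triangular T with R = T'T (R must be positive definite)."""
    n = len(r)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                diag = r[i][i] - s
                if diag <= 0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(diag)
            else:
                low[i][j] = (r[i][j] - s) / low[j][j]
    # T is the transpose of the lower triangular factor.
    return [[low[j][i] for j in range(n)] for i in range(n)]


def simulate(r, n_rows, seed=0):
    """Draw n_rows observations whose population correlation matrix is R."""
    rng = random.Random(seed)
    t = cholesky_upper(r)
    p = len(r)
    data = []
    for _ in range(n_rows):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # independent N(0, 1)
        # Row vector x = z T, so cov(x) = T' I T = R.
        data.append([sum(z[k] * t[k][j] for k in range(p)) for j in range(p)])
    return data


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    target = [[1.0, 0.5], [0.5, 1.0]]
    rows = simulate(target, 5000, seed=42)
    print(correlation([row[0] for row in rows], [row[1] for row in rows]))
```

Applying a nonlinear transformation to each column of the simulated data, to obtain nonnormal marginals, changes the correlations between columns. That change is the bias the abstract's iterative adjustment of R is meant to remove.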

8.
Current practice in structural modeling of observed continuous random variables is limited to representation systems for first and second moments (e.g., means and covariances), and to distribution theory based on multivariate normality. In psychometrics the multinormality assumption is often incorrect, so that statistical tests on parameters, or model goodness of fit, will frequently be incorrect as well. It is shown that higher order product moments yield important structural information when the distribution of variables is arbitrary. Structural representations are developed for generalizations of the Bentler-Weeks, Jöreskog-Keesling-Wiley, and factor analytic models. Some asymptotically distribution-free efficient estimators for such arbitrary structural models are developed. Limited information estimators are obtained as well. The special case of elliptical distributions that allow nonzero but equal kurtoses for variables is discussed in some detail. The argument is made that multivariate normal theory for covariance structure models should be abandoned in favor of elliptical theory, which is only slightly more difficult to apply in practice but specializes to the traditional case when normality holds. Many open research areas are described.

9.
This study compared 50 Irish and 50 American graduate and undergraduate psychology and counseling students on the ways they rated feelings of love, anger, and guilt on a semantic differential. A 2 × 2 × 2 multivariate analysis of variance was used in which the independent variables were class, gender, and country and the dependent variables were the semantic differential scales used for this research. It was found that the undergraduate students from Ireland evaluated the concepts of anger and guilt more positively than did the undergraduate students from the United States.

10.
We present a hierarchical Bayes approach to modeling parameter heterogeneity in generalized linear models. The model assumes that there are relevant subpopulations and that within each subpopulation the individual-level regression coefficients have a multivariate normal distribution. However, class membership is not known a priori, so the heterogeneity in the regression coefficients becomes a finite mixture of normal distributions. This approach combines the flexibility of semiparametric, latent class models that assume common parameters for each subpopulation and the parsimony of random effects models that assume normal distributions for the regression parameters. The number of subpopulations is selected to maximize the posterior probability of the model being true. Simulations are presented which document the performance of the methodology for synthetic data with known heterogeneity and number of subpopulations. An application is presented concerning preferences for various aspects of personal computers.

11.
The increasing use of ordinal variables in different fields has led to the introduction of new statistical methods for their analysis. The performance of these methods needs to be investigated under a number of experimental conditions. Procedures to simulate from ordinal variables are then required. In this article, we deal with simulation from multivariate ordinal random variables. We propose a new procedure for generating samples from ordinal random variables with a prespecified correlation matrix and marginal distributions. Its features are examined and compared with those of its main competitors. A software implementation in R is also provided along with examples of its application.

12.
Correspondence analysis leads to a graphical representation of the associations between categories of the row and column variables of a contingency table. Greenacre's (1988) formulation of joint correspondence analysis is a multivariate extension which finds the optimal joint display of contingency tables between all pairs of variables in a set. Greenacre presented a discrepancy function and an alternating least squares algorithm for its minimization. Boik (1996) presented an alternative algorithm, also of the alternating least squares type, for minimizing the same discrepancy function. In this paper, a noniterative procedure, not based on the minimization of any discrepancy function, is described.

13.
This short note emphasizes the need for multivariate analysis when multiple correlated dependent variables are used in a study. The use of multivariate analysis and the consequences of not using it are illustrated in relation to a previously published study that used self-concept subscales as dependent variables.

14.
Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In this article, we propose a broad class of semiparametric Bayesian SEMs, which allow mixed categorical and continuous manifest variables while also allowing the latent variables to have unknown distributions. In order to include typical identifiability restrictions on the latent variable distributions, we rely on centered Dirichlet process (CDP) and CDP mixture (CDPM) models. The CDP will induce a latent class model with an unknown number of classes, while the CDPM will induce a latent trait model with unknown densities for the latent traits. A simple and efficient Markov chain Monte Carlo algorithm is developed for posterior computation, and the methods are illustrated using simulated examples and several applications.

15.
Mixture analysis of count data has become increasingly popular among researchers of substance use, behavioral analysis, and program evaluation. However, this increase in popularity seems to have occurred along with adoption of some conventions in model specification based on arbitrary heuristics that may impact the validity of results. Findings from a systematic review of recent drug and alcohol publications suggested count variables are often dichotomized or misspecified as continuous normal indicators in mixture analysis. Prior research suggests that misspecifying skewed distributions of continuous indicators in mixture analysis introduces bias, though the consequences of this practice when applied to count indicators have not been studied. The present work describes results from a simulation study examining bias in mixture recovery when count indicators are dichotomized (median split; presence vs. absence), ordinalized, or the distribution is misspecified (continuous normal; incorrect count distribution). All distributional misspecifications and methods of categorizing resulted in greater bias in parameter estimates and recovery of class membership relative to specifying the true distribution, though dichotomization appeared to improve class enumeration accuracy relative to all other specifications. Overall, results demonstrate the importance of accurately modeling count indicators in mixture analysis, as misspecification and categorizing data can distort study outcomes.

16.
A multinormal partial credit model for factor analysis of polytomously scored items with ordered response categories is derived using an extension of the Dutch Identity (Holland in Psychometrika 55:5–18, 1990). In the model, latent variables are assumed to have a multivariate normal distribution conditional on unweighted sums of item scores, which are sufficient statistics. Attention is paid to maximum likelihood estimation of item parameters, multivariate moments of latent variables, and person parameters. It is shown that the maximum likelihood estimates can be found without the use of numerical integration techniques. More general models are discussed which can be used for testing the model, and it is shown how models with different numbers of latent variables can be tested against each other. In addition, multi-group extensions are proposed, which can be used for testing both measurement invariance and latent population differences. Models and procedures discussed are demonstrated in an empirical data example.

17.
Concise formulas for the standard errors of component loading estimates
Concise formulas for the asymptotic standard errors of component loading estimates were derived. The formulas cover the cases of principal component analysis for unstandardized and standardized variables with orthogonal and oblique rotations. The formulas can be used under any distributions for observed variables as long as the asymptotic covariance matrix for sample covariances/correlations is available. The estimated standard errors in numerical examples were shown to be equivalent to those by the methods using information matrices. The author is indebted to anonymous reviewers for the corrections and suggestions on this study, which have led to improvements of earlier versions of this article.

18.
This paper extends the biplot technique to canonical correlation analysis and redundancy analysis. The plot of structure correlations is shown to be optimal for displaying the pairwise correlations between the variables of the one set and those of the second. The link between multivariate regression and canonical correlation analysis/redundancy analysis is exploited for producing an optimal biplot that displays a matrix of regression coefficients. This plot can be made from the canonical weights of the predictors and the structure correlations of the criterion variables. An example is used to show how the proposed biplots may be interpreted.

19.
A review of model-selection criteria is presented, with a view toward showing their similarities. It is suggested that some problems treated by sequences of hypothesis tests may be more expeditiously treated by the application of model-selection criteria. Consideration is given to application of model-selection criteria to some problems of multivariate analysis, especially the clustering of variables, factor analysis and, more generally, describing a complex of variables.

20.
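The Strengths and Difficulties Questionnaire (SDQ) was administered to a sample of 2,288 left-behind children, and an exploratory latent class analysis of their emotional and behavioral problems revealed clearly distinct subgroups. Fit statistics supported a three-class model. Based on the conditional probabilities of each class on the questionnaire items, the three classes were labeled the "adjustment difficulty group," the "behavioral impulsivity group," and the "well-adjusted group," accounting for 32%, 41%, and 27% of the full sample, respectively. Further analysis showed that, relative to the well-adjusted group, the adjustment difficulty and behavioral impulsivity groups had significant gender and grade-level effects: both groups contained a larger proportion of boys and a larger proportion of primary school students.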
