Similar Articles
Found 20 similar articles (search time: 15 ms)
1.
To deal with missing data that arise due to participant nonresponse or attrition, methodologists have recommended an “inclusive” strategy where a large set of auxiliary variables are used to inform the missing data process. In practice, the set of possible auxiliary variables is often too large to include in full. We propose using principal components analysis (PCA) to reduce the number of possible auxiliary variables to a manageable number. A series of Monte Carlo simulations compared the performance of the inclusive strategy with eight auxiliary variables (inclusive approach) to the PCA strategy using just one principal component derived from the eight original variables (PCA approach). We examined the influence of four independent variables (magnitude of correlations, rate of missing data, missing data mechanism, and sample size) on parameter bias, root mean squared error, and confidence interval coverage. Results indicate that the PCA approach yields unbiased parameter estimates and potentially greater accuracy than the inclusive approach. We conclude that using the PCA strategy to reduce the number of auxiliary variables is an effective and practical way to reap the benefits of the inclusive strategy in the presence of many possible auxiliary variables.
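A minimal sketch of the PCA approach described above, assuming eight complete auxiliary variables in an array `aux` (hypothetical data; in practice the auxiliary variables may themselves contain missing values, which must be handled before or within the PCA step):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
aux = rng.normal(size=(500, 8))        # hypothetical auxiliary variables

# Standardize, then extract the first principal component; this single
# score would replace the eight variables as the auxiliary variable in
# the missing-data model (e.g., FIML or multiple imputation).
aux_std = StandardScaler().fit_transform(aux)
pc1 = PCA(n_components=1).fit_transform(aux_std).ravel()
```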

2.
The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal variables and that it can handle and discover nonlinear relationships between variables. Nonlinear PCA can also deal with variables at their appropriate measurement level; for example, it can treat Likert-type scales ordinally instead of numerically. Every observed value of a variable can be referred to as a category. During the analysis, nonlinear PCA converts every category to a numeric value, in accordance with the variable's analysis level, using optimal quantification. The authors discuss how optimal quantification is carried out, what analysis levels are, which decisions have to be made when applying nonlinear PCA, and how the results can be interpreted. The strengths and limitations of the method are discussed, and an example applying nonlinear PCA to empirical data using the program CATPCA (J. J. Meulman, W. J. Heiser, & SPSS, 2004) is provided.
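A heavily simplified sketch of the optimal-quantification idea for a single component, treating every variable as nominal (CATPCA itself supports ordinal and spline analysis levels, multiple components, and other refinements; `X_cat` is a hypothetical integer-coded data matrix):

```python
import numpy as np

def nonlinear_pca_1d(X_cat, n_iter=100):
    """Alternate between component scores and nominal category
    quantifications (a HOMALS-style reciprocal-averaging sketch).
    Assumes each variable has more than one observed category."""
    n, m = X_cat.shape
    z = np.random.default_rng(0).normal(size=n)   # component scores
    Q = np.zeros((n, m))                          # quantified data
    for _ in range(n_iter):
        z = (z - z.mean()) / z.std()
        for j in range(m):
            col = np.empty(n)
            for c in np.unique(X_cat[:, j]):
                mask = X_cat[:, j] == c
                col[mask] = z[mask].mean()        # optimal quantification
            Q[:, j] = (col - col.mean()) / col.std()
        z = Q.mean(axis=1)                        # update scores
    return Q, z
```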

3.
Ordinal data occur frequently in the social sciences. When applying principal component analysis (PCA), however, such data are often treated as numeric, implying linear relationships between the variables at hand; alternatively, non-linear PCA is applied, where the obtained quantifications are sometimes hard to interpret. Non-linear PCA for categorical data, also called optimal scoring/scaling, constructs new variables by assigning numerical values to categories such that the proportion of variance in those new variables that is explained by a predefined number of principal components (PCs) is maximized. We propose a penalized version of non-linear PCA for ordinal variables that is a smoothed intermediate between standard PCA on category labels and non-linear PCA as used so far. The new approach is by no means limited to monotonic effects, and it offers both better interpretability of the non-linear transformation of the category labels and better performance on validation data than either unpenalized non-linear PCA or standard linear PCA. In particular, an application of penalized optimal scaling to ordinal data from the International Classification of Functioning, Disability and Health (ICF) is provided.

4.
The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the components. In this paper, we adapt the variable neighborhood search (VNS) paradigm to develop two heuristics for variable selection in PCA. The performances of these heuristics were compared to those obtained by a branch-and-bound algorithm, as well as forward stepwise, backward stepwise, and tabu search heuristics. In the first experiment, which considered candidate pools of 18 to 30 variables, the VNS heuristics matched the optimal subset obtained by the branch-and-bound algorithm more frequently than their competitors. In the second experiment, which considered candidate pools of 54 to 90 variables, the VNS heuristics provided better solutions than their competitors for a large percentage of the test problems. An application to a real-world data set is provided to demonstrate the importance of variable selection in the context of PCA.
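For orientation, a sketch of the forward stepwise baseline the heuristics are compared against (not the VNS heuristic itself): greedily add the variable that most increases the proportion of variance explained by the first k components of the selected subset. `X` is a hypothetical cases-by-variables matrix.

```python
import numpy as np

def explained(X, idx, k):
    """Proportion of variance in the subset explained by its top-k PCs."""
    sub = X[:, idx]
    sub = (sub - sub.mean(axis=0)) / sub.std(axis=0)
    s = np.linalg.svd(sub, compute_uv=False) ** 2   # PC variances * n
    return s[:k].sum() / s.sum()

def forward_select(X, n_select, k=2):
    remaining, chosen = list(range(X.shape[1])), []
    while len(chosen) < n_select:
        best = max(remaining, key=lambda j: explained(X, chosen + [j], k))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```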

5.
Exploratory factor analysis (EFA) is a popular statistical technique used in communication research. Although EFA and principal components analysis (PCA) are different techniques, PCA is often employed incorrectly to reveal latent constructs (i.e., factors) of observed variables, which is the purpose of EFA. PCA is more appropriate for reducing measured variables into a smaller set of variables (i.e., components) while retaining as much of the total variance in the measured variables as possible. Furthermore, the popular use of varimax rotation raises some concerns about the relationships among the factors that researchers claim to discover. This paper discusses the distinct purposes of PCA and EFA, uses two data sets as examples to highlight the differences in results between the procedures, and reviews the use of each technique in three major communication journals: Communication Monographs, Human Communication Research, and Communication Research.
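A minimal illustration of the conceptual difference using hypothetical data: PCA weights are chosen to retain total variance, whereas factor analysis (here sklearn's FactorAnalysis as a stand-in for EFA) models only the variance the variables share.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
X[:, :3] += rng.normal(size=(300, 1))   # inject a common factor

pca = PCA(n_components=2).fit(X)
efa = FactorAnalysis(n_components=2).fit(X)
print(pca.components_)   # component weights (total variance)
print(efa.components_)   # factor loadings (shared variance only)
```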

6.
Four applications of permutation tests to the single-mediator model are described and evaluated in this study. Permutation tests work by rearranging data in many possible ways in order to estimate the sampling distribution for the test statistic. The four applications to mediation evaluated here are the permutation test of ab, the permutation joint significance test, and the noniterative and iterative permutation confidence intervals for ab. A Monte Carlo simulation study was used to compare these four tests with the four best available tests for mediation found in previous research: the joint significance test, the distribution-of-the-product test, and the percentile and bias-corrected bootstrap tests. We compared the different methods on Type I error, power, and confidence interval coverage. The noniterative permutation confidence interval for ab was the best performer among the new methods. It successfully controlled Type I error, had power nearly as good as the most powerful existing methods, and had better coverage than any existing method. The iterative permutation confidence interval for ab had lower power than some existing methods, but it performed better than any other method in terms of coverage. The permutation confidence interval methods are recommended when estimating a confidence interval is a primary concern. SPSS and SAS macros that estimate these confidence intervals are provided.
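A hedged sketch of one permutation test of ab (here the null is imposed by permuting X, which forces a = 0; the four procedures evaluated in the study differ in detail, and the authors' SPSS and SAS macros are the authoritative implementations):

```python
import numpy as np

def ab_estimate(x, m, y):
    a = np.polyfit(x, m, 1)[0]                      # a path: M on X
    X2 = np.column_stack([x, m, np.ones_like(x)])
    b = np.linalg.lstsq(X2, y, rcond=None)[0][1]    # b path: Y on M given X
    return a * b

def perm_test_ab(x, m, y, n_perm=5000, seed=0):
    rng = np.random.default_rng(seed)
    obs = ab_estimate(x, m, y)
    null = np.array([ab_estimate(rng.permutation(x), m, y)
                     for _ in range(n_perm)])
    return obs, float(np.mean(np.abs(null) >= abs(obs)))  # two-sided p
```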

7.
It is well known that when data are nonnormally distributed, a test of the significance of Pearson's r may inflate Type I error rates and reduce power. Statistics textbooks and the simulation literature provide several alternatives to Pearson's correlation. However, the relative performance of these alternatives has been unclear. Two simulation studies were conducted to compare 12 methods, including Pearson, Spearman's rank-order, transformation, and resampling approaches. With most sample sizes (n ≥ 20), Type I and Type II error rates were minimized by transforming the data to a normal shape prior to assessing the Pearson correlation. Among transformation approaches, a general purpose rank-based inverse normal transformation (i.e., transformation to rankit scores) was most beneficial. However, when samples were both small (n ≤ 10) and extremely nonnormal, the permutation test often outperformed other alternatives, including various bootstrap tests.
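A sketch of the rank-based inverse normal (rankit) transformation the simulations favored, applied before the Pearson test (hypothetical skewed data):

```python
import numpy as np
from scipy import stats

def rankit(x):
    """Map values to normal quantiles of (rank - 1/2) / n."""
    return stats.norm.ppf((stats.rankdata(x) - 0.5) / len(x))

x, y = np.random.default_rng(2).exponential(size=(2, 50))
r, p = stats.pearsonr(rankit(x), rankit(y))
```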

8.
Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate normality. For nonlinear PCA, however, standard options for establishing stability are not provided. The authors use the nonparametric bootstrap procedure to assess the stability of nonlinear PCA results, applied to empirical data. They use confidence intervals for the variable transformations and confidence ellipses for the eigenvalues, the component loadings, and the person scores. They discuss the balanced version of the bootstrap, bias estimation, and Procrustes rotation. To provide a benchmark, the same bootstrap procedure is applied to linear PCA on the same data. On the basis of the results, the authors advise using at least 1,000 bootstrap samples, using Procrustes rotation on the bootstrap results, examining the bootstrap distributions along with the confidence regions, and merging categories with small marginal frequencies to reduce the variance of the bootstrap results.
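A sketch of the bootstrap benchmark for linear PCA loadings (the same resampling scheme underlies the nonlinear case); sign reflection toward the original solution is used here as a minimal stand-in for the full Procrustes rotation the authors recommend:

```python
import numpy as np
from sklearn.decomposition import PCA

def bootstrap_loading_cis(X, n_comp=2, n_boot=1000, seed=0):
    rng = np.random.default_rng(seed)
    ref = PCA(n_comp).fit(X).components_
    reps = []
    for _ in range(n_boot):
        Xb = X[rng.integers(0, len(X), size=len(X))]   # resample cases
        L = PCA(n_comp).fit(Xb).components_
        signs = np.sign(np.sum(L * ref, axis=1, keepdims=True))
        reps.append(L * signs)           # reflect toward the reference
    return np.percentile(np.array(reps), [2.5, 97.5], axis=0)  # 95% CIs
```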

9.
We present a new application of sampled permutation testing to examine whether two sequential associations are different within a single dyad (e.g., a teacher and a student). A Monte Carlo simulation with the same (i.e., 100 vs. 100) or a different (100 vs. 400) number of event pairs was used to simulate designs that use time-based (typically producing equal-length comparisons) and event-based (typically producing different-length comparisons) data, respectively. For these pairs of simulated data streams, we compared the Type I error rates and the kappa for agreement on significance decisions, using the sampled permutation tests and the more traditional asymptotic log linear analysis. The results provide the first evidence relevant to evaluating the accuracy of log linear analysis and sampled permutation testing for the purpose of comparing sequential associations within a single dyad.
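A hedged sketch of the pooling-and-reassignment logic for comparing two sequential associations within one dyad: each association is summarized from its (antecedent, consequent) event pairs (here by Yule's Q on the 2x2 pair table), and the null distribution of the difference comes from randomly reassigning pooled pairs to the two associations. The study's sampled permutation procedure differs in detail.

```python
import numpy as np

def yules_q(pairs):                         # pairs: (n, 2) binary array
    a = np.sum((pairs[:, 0] == 1) & (pairs[:, 1] == 1))
    b = np.sum((pairs[:, 0] == 1) & (pairs[:, 1] == 0))
    c = np.sum((pairs[:, 0] == 0) & (pairs[:, 1] == 1))
    d = np.sum((pairs[:, 0] == 0) & (pairs[:, 1] == 0))
    return (a * d - b * c) / (a * d + b * c)    # assumes a*d + b*c > 0

def perm_diff(pairs1, pairs2, n_perm=2000, seed=0):
    rng = np.random.default_rng(seed)
    pooled, n1 = np.vstack([pairs1, pairs2]), len(pairs1)
    obs = yules_q(pairs1) - yules_q(pairs2)
    null = []
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        null.append(yules_q(pooled[idx[:n1]]) - yules_q(pooled[idx[n1:]]))
    return obs, float(np.mean(np.abs(null) >= abs(obs)))
```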

10.
Many studies yield multivariate multiblock data, that is, multiple data blocks that all involve the same set of variables (e.g., the scores of different groups of subjects on the same set of variables). The question then arises whether the same processes underlie the different data blocks. To explore the structure of such multivariate multiblock data, component analysis can be very useful. Specifically, 2 approaches are often applied: principal component analysis (PCA) on each data block separately and different variants of simultaneous component analysis (SCA) on all data blocks simultaneously. The PCA approach yields a different loading matrix for each data block and is thus not useful for discovering structural similarities. The SCA approach may fail to yield insight into structural differences, since the obtained loading matrix is identical for all data blocks. We introduce a new generic modeling strategy, called clusterwise SCA, that comprises the separate PCA approach and SCA as special cases. The key idea behind clusterwise SCA is that the data blocks form a few clusters, where data blocks that belong to the same cluster are modeled with SCA and thus have the same structure, and different clusters have different underlying structures. In this article, we use the SCA variant that imposes equal average cross-products constraints (ECP). An algorithm for fitting clusterwise SCA-ECP solutions is proposed and evaluated in a simulation study. Finally, the usefulness of clusterwise SCA is illustrated by empirical examples from eating disorder research and social psychology.

11.
We discuss and contrast 2 methods for investigating the dimensionality of data from tests and questionnaires: the popular principal components analysis (PCA) and the more recent Mokken scale analysis (MSA; Mokken, 1971). First, we discuss the theoretical similarities and differences between both methods. Then, we use both methods to analyze data collected by means of Larson and Chastain's (1990) Self-Concealment Scale (SCS). We present the different results and highlight the instances in which the methods complement one another so as to obtain a stronger result than would be obtained using only 1 method. Finally, we discuss the implications of the results for the dimensionality of the SCS and provide recommendations for both the further development of the SCS and the future use of PCA and MSA in personality research.

12.
The use of principal components analysis (PCA) for the study of evoked-response data may be complicated by variations from one trial to another in the latency of underlying brain events. Such variation can come from either random intra- and intersubject variability or from the effects of independent variables that are manipulated between conditions. The effect of such variability is investigated by simulation of these latency-varying events and by analysis of evoked responses in a behavioral task, the Sternberg memory search task, which is well known to generate variation in the latency of brain events. The results of PCA of within-subjects differences in these two situations are plausibly related to underlying stages of information processing, and the technique may augment reaction time data by providing information on the time of occurrence as well as the duration of stages of information processing.

13.
Cluster bias refers to measurement bias with respect to the clustering variable in multilevel data. The absence of cluster bias implies absence of bias with respect to any cluster-level (level 2) variable. The variables that possibly cause the bias do not have to be measured to test for cluster bias. Therefore, the test for cluster bias serves as a global test of measurement bias with respect to any level 2 variable. However, the validity of the global test depends on the Type I and Type II error rates of the test. We compare the performance of the test for cluster bias with the restricted factor analysis (RFA) test, which can be used if the variable that leads to measurement bias is measured. It appeared that the RFA test has considerably more power than the test for cluster bias. However, the false positive rates of the test for cluster bias were generally around the expected values, while the RFA test showed unacceptably high false positive rates in some conditions. We conclude that even if no significant cluster bias is found, significant bias with respect to a level 2 violator can still be detected with an RFA model. Although the test for cluster bias is less powerful, an advantage of the test is that the cause of the bias does not need to be measured, or even known.

14.
Process factor analysis (PFA) is a latent variable model for intensive longitudinal data that combines P-technique factor analysis and time series analysis. A goodness-of-fit test for PFA is currently unavailable. In this paper, we propose a parametric bootstrap method for assessing model fit in PFA. We illustrate the test with an empirical data set in which 22 participants rated their affect every day over a period of 90 days. We also explore the Type I error and power of the parametric bootstrap test with simulated data.
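The parametric bootstrap logic, sketched generically (the paper applies it to PFA; `fit_model`, `fit_stat`, and `simulate` are hypothetical stand-ins for the PFA estimator, its fit statistic, and data generation from the fitted model):

```python
import numpy as np

def parametric_bootstrap_p(data, fit_model, fit_stat, simulate,
                           n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    model = fit_model(data)
    t_obs = fit_stat(model, data)
    t_boot = []
    for _ in range(n_boot):
        sim = simulate(model, rng)       # data from the fitted model
        t_boot.append(fit_stat(fit_model(sim), sim))
    # Proportion of bootstrap statistics at least as extreme as observed
    return float(np.mean(np.array(t_boot) >= t_obs))
```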

15.
When the categories of the independent variable in an analysis of variance are quantitative, it is more informative to evaluate the trends in the treatment means than to simply compare differences among the treatment means. A permutation alternative to the conventional F test is shown to possess significant advantages when analyzing trend among quantitative treatments in a one-way analysis of variance. An example with and without an extreme data point illustrates the effectiveness of the permutation alternative for the analysis of trend when homogeneity of variance is compromised.
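A sketch of the permutation alternative for linear trend: score each observation with its centered treatment level, use the linear-contrast sum as the statistic, and shuffle observations across groups to build the null distribution (hypothetical data layout: a list of group arrays plus their quantitative levels).

```python
import numpy as np

def perm_trend_test(groups, levels, n_perm=10000, seed=0):
    y = np.concatenate(groups)
    lv = np.concatenate([np.full(len(g), v)
                         for g, v in zip(groups, levels)])
    c = lv - lv.mean()                      # linear contrast weights
    obs = np.sum(c * y)
    rng = np.random.default_rng(seed)
    null = np.array([np.sum(c * rng.permutation(y))
                     for _ in range(n_perm)])
    return obs, float(np.mean(np.abs(null) >= abs(obs)))
```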

16.
Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for zero intercept and unit slope (or simply for unit slope when regression is forced through the origin). This paper shows that this approach renders highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and on subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a reanalysis of published data from a signal detection theory experiment in which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradicts the conclusions drawn through inadequate application of regression analyses.
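A sketch of the two omnibus checks for a pair of repeated measures x and y: a paired t test of equal means, and the Pitman-Morgan test of equal variances, which tests whether x + y and x - y are uncorrelated (the paper's full procedure, including the subject-by-subject analyses, goes beyond this sketch).

```python
import numpy as np
from scipy import stats

x, y = np.random.default_rng(4).normal(size=(2, 60))  # hypothetical scores

t_mean, p_mean = stats.ttest_rel(x, y)     # H0: equal means
r, p_var = stats.pearsonr(x + y, x - y)    # H0: equal variances
```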

17.
We compare development and learning of the visual control of movement from an ecological perspective. It is argued that although the constraints imposed upon development and learning are vastly different, both are best characterised as a change towards the use of more useful and specifying optic variables. Implicit learning, in which awareness is drawn away from movement execution, is most appropriate for accomplishing this change in optic variable use, although its contribution to development is more contentious. Alternatively, learning can also be affected by explicit processes. We propose that explicit learning would typically invoke vision-for-perception processes instead of the designated vision-for-action processes. It is for this reason that, after explicit learning, performance is more easily compromised in the face of pressure or disorders. We present a way to deal with the issue of explicit learning during infancy.

18.
A particular strategy for investigating effects resulting from a MANOVA is proposed. The strategy involves multiple two-group multivariate analyses. The two groups result from considering multivariate pairwise group contrasts or multivariate complex group contrasts. Assuming a given two-group analysis yields real effects, the resultant single linear discriminant function (LDF) may be studied. A rationale based on a transformation of LDF weights, due to V. Y. Urbakh, is recommended for assessing the relative contributions of the variables. The analysis strategy is described in detail and illustrated with real data sets.
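A sketch of one step of the strategy: fit the single LDF for one pairwise group contrast and inspect the weights (Urbakh's transformation of the weights is not implemented here; scaling by the variable standard deviations is shown as a common, simpler rescaling).

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 1, (40, 4)),    # group 1 (hypothetical)
               rng.normal(0.8, 1, (40, 4))])   # group 2 (hypothetical)
g = np.repeat([0, 1], 40)

lda = LinearDiscriminantAnalysis().fit(X, g)
raw_w = lda.coef_.ravel()            # raw LDF weights
std_w = raw_w * X.std(axis=0)        # SD-scaled weights for comparison
```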

19.
The standard methods for decomposition and analysis of evoked potentials are bandpass filtering, identification of peak amplitudes and latencies, and principal component analysis (PCA). We discuss the limitations of these and other approaches and introduce wavelet packet analysis. We then propose the "single-channel wavelet packet model," a new approach in which a unique decomposition is achieved using prior time-frequency information and differences in the responses of the components to changes in experimental conditions. Orthogonal sets of wavelet packets allow a parsimonious time-frequency representation of the components. The method allows energy in some wavelet packets to be shared among two or more components, so the components are not necessarily orthogonal. Both the single-channel wavelet packet model and PCA require constraints to achieve a unique decomposition. In PCA, however, the constraints are defined by mathematical convenience and may be unrealistic; in the single-channel wavelet packet model, they are based on prior scientific knowledge. We give an application of the method to auditory evoked potentials recorded from cats. The good frequency resolution of wavelet packets allows us to separate superimposed components in these data. Our present approach yields estimates of component waveforms and of the effects of experimental conditions on the amplitude of the components. We discuss future extensions that will provide confidence intervals and p values, allow for latency changes, and represent multichannel data.
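A minimal sketch of the wavelet packet decomposition underlying the model, using PyWavelets on a hypothetical single-channel evoked potential (the full single-channel wavelet packet model adds component definitions, energy sharing, and condition effects on top of this):

```python
import numpy as np
import pywt

ep = np.random.default_rng(5).normal(size=256)   # hypothetical potential

wp = pywt.WaveletPacket(data=ep, wavelet='db4',
                        mode='symmetric', maxlevel=4)
nodes = wp.get_level(4, order='freq')            # time-frequency atoms
energy = {n.path: float(np.sum(np.asarray(n.data) ** 2)) for n in nodes}
```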
