Similar Articles (20 results)
1.
Empirical Type I error and power rates were estimated for (a) the doubly multivariate model, (b) the Welch-James multivariate solution developed by Keselman, Carriere and Lix (1993) using Johansen's (1980) results, and (c) the multivariate version of the modified Brown-Forsythe (1974) procedure. The performance of these procedures was investigated by testing within-blocks sources of variation in a multivariate split-plot design containing unequal covariance matrices. The results indicate that the doubly multivariate model did not provide effective Type I error control. The Welch-James procedure provided robust and powerful tests of the within-subjects main effect, but liberal tests of the interaction effect. The modified Brown-Forsythe procedure provided robust tests of both within-subjects main and interaction effects, especially when the design was balanced or when group sizes and covariance matrices were positively paired.

2.
This work compares the sensitivity of five modern analytical techniques for detecting effects in a partially repeated measures design when the assumptions of the traditional ANOVA approach are not met: the mixed-model approach fitted with the SAS Proc Mixed module, the Bootstrap-F approach, the Brown-Forsythe multivariate approach, the Welch-James multivariate approach, and the Welch-James multivariate approach with robust estimators. Previously, Livacic-Rojas, Vallejo and Fernández found that these methods are comparable in terms of their Type I error rates. The results obtained suggest that the mixed model approach, as well as the Brown-Forsythe and Welch-James approaches, satisfactorily controlled the Type II error rates corresponding to the main effects of the measurement occasions under most of the conditions assessed.

3.
The goal of this study was to investigate the performance of Hall’s transformation of the Brunner-Dette-Munk (BDM) and Welch-James (WJ) test statistics and Box-Cox’s data transformation in factorial designs when normality and variance homogeneity assumptions were violated separately and jointly. On the basis of unweighted marginal means, we performed a simulation study to explore the operating characteristics of the methods proposed for a variety of distributions with small sample sizes. Monte Carlo simulation results showed that when data were sampled from symmetric distributions, the error rates of the original BDM and WJ tests were scarcely affected by the lack of normality and homogeneity of variance. In contrast, when data were sampled from skewed distributions, the original BDM and WJ rates were not well controlled. Under such circumstances, the results clearly revealed that Hall’s transformation of the BDM and WJ tests provided generally better control of Type I error rates than did the same tests based on Box-Cox’s data transformation. Among all the methods considered in this study, we also found that Hall’s transformation of the BDM test yielded the best control of Type I errors, although it was often less powerful than either of the WJ tests when both approaches reasonably controlled the error rates.

4.
A new procedure analogous to the analysis of variance (ANOVA), called the bisquare-weighted ANOVA (bANOVA), is described. When a traditional ANOVA is calculated, using samples from a distribution with heavy tails, the Type I error rates remain in check, but the Type II error rates increase, relative to those across samples from a normal distribution. The bANOVA is robust with respect to deviations from a normal distribution, maintaining high power with normal and heavy-tailed distributions alike. The more popular rank ANOVA (rANOVA) is also described briefly. However, the rANOVA is not as robust to large deviations from normality as is the bANOVA, and it generates high Type I error rates when applied to three-way designs.

5.
Repeated measures analyses of variance are the method of choice in many studies from experimental psychology and the neurosciences. Data from these fields are often characterized by small sample sizes, high numbers of factor levels of the within-subjects factor(s), and nonnormally distributed response variables such as response times. For a design with a single within-subjects factor, we investigated Type I error control in univariate tests with corrected degrees of freedom, the multivariate approach, and a mixed-model (multilevel) approach (SAS PROC MIXED) with Kenward–Roger’s adjusted degrees of freedom. We simulated multivariate normal and nonnormal distributions with varied population variance–covariance structures (spherical and nonspherical), sample sizes (N), and numbers of factor levels (K). For normally distributed data, as expected, the univariate approach with Huynh–Feldt correction controlled the Type I error rate with only very few exceptions, even if sample sizes as low as three were combined with high numbers of factor levels. The multivariate approach also controlled the Type I error rate, but it requires N ≥ K. PROC MIXED often showed acceptable control of the Type I error rate for normal data, but it also produced several liberal or conservative results. For nonnormal data, all of the procedures showed clear deviations from the nominal Type I error rate in many conditions, even for sample sizes greater than 50. Thus, none of these approaches can be considered robust if the response variable is nonnormally distributed. The results indicate that both the variance heterogeneity and covariance heterogeneity of the population covariance matrices affect the error rates.

6.
The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non‐normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann–Yuen test of equivalence, which utilizes trimmed means, with that of the previously recommended Schuirmann and Schuirmann–Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann–Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann–Welch tests, and the power of the Schuirmann–Yuen was substantially greater than that of the Schuirmann or Schuirmann–Welch tests when distributions were skewed or outliers were present. The Schuirmann–Yuen test is recommended for assessing clinical significance with normative comparisons.
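
The Schuirmann-style equivalence tests above all follow the two one-sided tests (TOST) logic: declare equivalence only when the group difference is significantly above the lower bound and significantly below the upper bound. A minimal sketch of a Schuirmann–Yuen-type TOST built on SciPy's trimmed (Yuen–Welch) t-test; the equivalence bounds, 20% trimming, and simulated data here are illustrative assumptions, not the article's settings, and `scipy.stats.ttest_ind` needs SciPy ≥ 1.7 for the `trim` argument:

```python
import numpy as np
from scipy import stats

def tost_yuen(a, b, low, high, trim=0.2):
    """Two one-sided Yuen trimmed-mean tests (a Schuirmann-Yuen-type TOST).

    Declares the groups equivalent when the trimmed-mean difference is
    significantly above `low` AND significantly below `high`.  Returns the
    TOST p-value (the larger of the two one-sided p-values).
    """
    # H0: diff <= low  vs  H1: diff > low  (shift a so the bound becomes 0)
    p_lower = stats.ttest_ind(a - low, b, equal_var=False,
                              trim=trim, alternative="greater").pvalue
    # H0: diff >= high vs  H1: diff < high
    p_upper = stats.ttest_ind(a - high, b, equal_var=False,
                              trim=trim, alternative="less").pvalue
    return max(p_lower, p_upper)

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 60)
b = rng.normal(0.0, 1.0, 60)
p = tost_yuen(a, b, low=-0.5, high=0.5)   # illustrative bounds of +/-0.5
```

A small TOST p-value supports equivalence; a group shifted well outside the bounds yields a p-value near 1.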

7.
One approach to the analysis of repeated measures data allows researchers to model the covariance structure of the data rather than presume a certain structure, as is the case with conventional univariate and multivariate test statistics. This mixed-model approach was evaluated for testing all possible pairwise differences among repeated measures marginal means in a Between-Subjects x Within-Subjects design. Specifically, the authors investigated Type I error and power rates for a number of simultaneous and stepwise multiple comparison procedures using SAS (1999) PROC MIXED in unbalanced designs when normality and covariance homogeneity assumptions did not hold. J. P. Shaffer's (1986) sequentially rejective step-down and Y. Hochberg's (1988) sequentially acceptive step-up Bonferroni procedures, based on an unstructured covariance structure, had superior Type I error control and power to detect true pairwise differences across the investigated conditions.

8.
We propose a simple modification of Hochberg's step‐up Bonferroni procedure for multiple tests of significance. The proposed procedure is always more powerful than Hochberg's procedure for more than two tests, and is more powerful than Hommel's procedure for three and four tests. A numerical analysis of the new procedure indicates that its Type I error is controlled under independence of the test statistics, at a level equal to or just below the nominal Type I error. Examination of various non‐null configurations of hypotheses shows that the modified procedure has a power advantage over Hochberg's procedure which increases with the number of false hypotheses.
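
The baseline Hochberg procedure that the abstract modifies can be stated in a few lines: order the p-values, step up from the largest, and reject every hypothesis at or below the first ordered p-value that clears its Bonferroni-style threshold α/(m − k + 1). A sketch of the unmodified procedure (not the proposed modification; the function name is ours):

```python
def hochberg(pvals, alpha=0.05):
    """Hochberg's step-up Bonferroni procedure.

    Returns a boolean rejection decision for each hypothesis,
    in the original input order.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])   # indices, ascending p
    reject = [False] * m
    # Step up from the largest p-value: find the largest k with
    # p_(k) <= alpha / (m - k + 1), then reject H_(1)..H_(k).
    for k in range(m, 0, -1):
        if pvals[order[k - 1]] <= alpha / (m - k + 1):
            for i in order[:k]:
                reject[i] = True
            break
    return reject
```

For example, with p-values (0.2, 0.01) only the smaller one is rejected, since 0.2 fails its threshold of 0.05 but 0.01 ≤ 0.05/2.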

9.
A one-way random effects model for trimmed means
The random effects ANOVA model plays an important role in many psychological studies, but the usual model suffers from at least two serious problems. The first is that even under normality, violating the assumption of equal variances can have serious consequences in terms of Type I errors or significance levels, and it can affect power as well. The second and perhaps more serious concern is that even slight departures from normality can result in a substantial loss of power when testing hypotheses. Jeyaratnam and Othman (1985) proposed a method for handling unequal variances, under the assumption of normality, but no results were given on how their procedure performs when distributions are nonnormal. A secondary goal in this paper is to address this issue via simulations. As will be seen, problems arise with both Type I errors and power. Another secondary goal is to provide new simulation results on the Rust-Fligner modification of the Kruskal-Wallis test. The primary goal is to propose a generalization of the usual random effects model based on trimmed means. The resulting test of no differences among J randomly sampled groups has certain advantages in terms of Type I errors, and it can yield substantial gains in power when distributions have heavy tails and outliers. This last feature is very important in applied work because recent investigations indicate that heavy-tailed distributions are common. Included is a suggestion for a heteroscedastic Winsorized analog of the usual intraclass correlation coefficient.
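
The building blocks of such trimmed-mean methods are easy to illustrate: trimming discards a fixed fraction of each tail, while Winsorizing pulls the tail values in to the nearest retained value. A small sketch contrasting the ordinary mean with a 20% trimmed mean and a 20% Winsorized mean on invented data containing one outlier (not data from the paper):

```python
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize

x = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 100.])  # one extreme outlier

mean_x = x.mean()                        # dragged upward by the outlier
tmean_x = stats.trim_mean(x, 0.2)        # drops the 2 lowest and 2 highest
xw = np.asarray(winsorize(x, limits=(0.2, 0.2)))  # tails pulled in, not dropped
wmean_x = xw.mean()                      # Winsorized mean
```

Here the ordinary mean is 14.5, while both robust summaries sit at 5.5, the center of the bulk of the data; Winsorized variances built the same way are what heteroscedastic trimmed-mean tests use for their standard errors.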

10.
Experience with real data indicates that psychometric measures often have heavy-tailed distributions. This is known to be a serious problem when comparing the means of two independent groups because heavy-tailed distributions can have a serious effect on power. Another problem that is common in some areas is outliers. This paper suggests an approach to these problems based on the one-step M-estimator of location. Simulations indicate that the new procedure provides very good control over the probability of a Type I error even when distributions are skewed, have different shapes, and the variances are unequal. Moreover, the new procedure has considerably more power than Welch's method when distributions have heavy tails, and it compares well to Yuen's method for comparing trimmed means. Wilcox's median procedure has about the same power as the proposed procedure, but Wilcox's method is based on a statistic that has a finite sample breakdown point of only 1/n, where n is the sample size. Comments on other methods for comparing groups are also included.
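
For orientation, a one-step M-estimator of location can be sketched as a single Newton-style step away from the median, scaled by the normalized MAD. This sketch uses Huber's psi with tuning constant 1.28, a common textbook choice; the paper's exact estimator may differ in its psi function and constants:

```python
import numpy as np

def one_step_m(x, c=1.28):
    """One-step M-estimator of location with Huber's psi (a generic sketch).

    Starts from the median, scales residuals by the normalized MAD, and
    takes a single step: mu0 + MADN * sum(psi(u)) / #(|u| <= c).
    """
    x = np.asarray(x, dtype=float)
    mu0 = np.median(x)
    madn = 1.4826 * np.median(np.abs(x - mu0))   # consistent for normal sigma
    if madn == 0:
        return mu0
    u = (x - mu0) / madn
    psi = np.clip(u, -c, c)                      # Huber psi: linear, then flat
    n_inside = np.sum(np.abs(u) <= c)            # psi'(u) = 1 inside, 0 outside
    return mu0 + madn * psi.sum() / n_inside
```

Because extreme observations enter only through the clipped psi, a single outlier moves the estimate very little, unlike the sample mean.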

11.
Adverse impact evaluations often call for evidence that the disparity between groups in selection rates is statistically significant, and practitioners must choose which test statistic to apply in this situation. To identify the most effective testing procedure, the authors compared several alternate test statistics in terms of Type I error rates and power, focusing on situations with small samples. Significance testing was found to be of limited value because of low power for all tests. Among the alternate test statistics, the widely-used Z-test on the difference between two proportions performed reasonably well, except when sample size was extremely small. A test suggested by G. J. G. Upton (1982) provided slightly better control of Type I error under some conditions but generally produced results similar to the Z-test. Use of the Fisher Exact Test and Yates's continuity-corrected chi-square test are not recommended because of overly conservative Type I error rates and substantially lower power than the Z-test.
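
The Z-test on the difference between two proportions referred to here is the standard pooled-variance version without continuity correction. A minimal stdlib-only sketch (the function and argument names are ours):

```python
import math

def two_prop_z(sel1, n1, sel2, n2):
    """Z-test on the difference between two independent proportions
    (pooled standard error, no continuity correction).

    Returns the z statistic and its two-sided p-value.
    """
    p1, p2 = sel1 / n1, sel2 / n2
    p = (sel1 + sel2) / (n1 + n2)                 # pooled selection rate
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p from the standard normal CDF, Phi(z) = (1 + erf(z/sqrt(2)))/2
    p_two_sided = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_two_sided
```

For example, selection rates of 60/100 versus 40/100 give z ≈ 2.83, significant at conventional levels, whereas identical rates give z = 0.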

12.
It is well known that when data are nonnormally distributed, a test of the significance of Pearson's r may inflate Type I error rates and reduce power. Statistics textbooks and the simulation literature provide several alternatives to Pearson's correlation. However, the relative performance of these alternatives has been unclear. Two simulation studies were conducted to compare 12 methods, including Pearson, Spearman's rank-order, transformation, and resampling approaches. With most sample sizes (n ≥ 20), Type I and Type II error rates were minimized by transforming the data to a normal shape prior to assessing the Pearson correlation. Among transformation approaches, a general purpose rank-based inverse normal transformation (i.e., transformation to rankit scores) was most beneficial. However, when samples were both small (n ≤ 10) and extremely nonnormal, the permutation test often outperformed other alternatives, including various bootstrap tests.
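
The rank-based inverse normal (rankit) transformation the study recommends replaces each score with the normal quantile of its rescaled rank; Pearson's r is then computed on the transformed scores of both variables. A sketch using the rankit convention (rank − 0.5)/n, with ties receiving averaged ranks:

```python
import numpy as np
from scipy import stats

def rankit(x):
    """Rank-based inverse normal (rankit) transformation.

    Replaces each value by norm.ppf((rank - 0.5) / n), producing
    approximately normal scores regardless of the original shape.
    """
    x = np.asarray(x, dtype=float)
    ranks = stats.rankdata(x)                 # average ranks for ties
    return stats.norm.ppf((ranks - 0.5) / len(x))

# Usage: correlate the transformed scores instead of the raw data,
# e.g. stats.pearsonr(rankit(a), rankit(b)) for two samples a, b.
```

Because the transformation is monotone, it preserves the ordering of the data while normalizing the marginal shape.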

13.
We examined nine adaptive methods of trimming, that is, methods that empirically determine when data should be trimmed and the amount to be trimmed from the tails of the empirical distribution. Over the 240 empirical values collected for each method investigated, in which we varied the total percentage of data trimmed, sample size, degree of variance heterogeneity, pairing of variances and group sizes, and population shape, one method resulted in exceptionally good control of Type I errors. However, under less extreme cases of non‐normality and variance heterogeneity a number of methods exhibited reasonably good Type I error control. With regard to the power to detect non‐null treatment effects, we found that the choice among the methods depended on the degree of non‐normality and variance heterogeneity. Recommendations are offered.

14.
Research problems that require a non‐parametric analysis of multifactor designs with repeated measures arise in the behavioural sciences. There is, however, a lack of available procedures in commonly used statistical packages. In the present study, a generalization of the aligned rank test for the two‐way interaction is proposed for the analysis of the typical sources of variation in a three‐way analysis of variance (ANOVA) with repeated measures. It can be implemented in the usual statistical packages. Its statistical properties are tested by using simulation methods with two sample sizes (n = 30 and n = 10) and three distributions (normal, exponential and double exponential). Results indicate substantial increases in power for non‐normal distributions in comparison with the usual parametric tests. Similar levels of Type I error for both parametric and aligned rank ANOVA were obtained with non‐normal distributions and large sample sizes. Degrees‐of‐freedom adjustments for Type I error control in small samples are proposed. The procedure is applied to a case study with 30 participants per group where it detects gender differences in linguistic abilities in blind children not shown previously by other methods.

15.
Manolov R, Arnau J, Solanas A, Bono R. Psicothema, 2010, 22(4), 1026-1032
The present study evaluates the performance of four methods for estimating regression coefficients used to make statistical decisions about intervention effectiveness in single-case designs. Ordinary least square estimation is compared to two correction techniques dealing with general trend and a procedure that eliminates autocorrelation whenever it is present. Type I error rates and statistical power are studied for experimental conditions defined by the presence or absence of treatment effect (change in level or in slope), general trend, and serial dependence. The results show that empirical Type I error rates do not approach the nominal ones in the presence of autocorrelation or general trend when ordinary and generalized least squares are applied. The techniques controlling trend show lower false alarm rates, but prove to be insufficiently sensitive to existing treatment effects. Consequently, the use of the statistical significance of the regression coefficients for detecting treatment effects is not recommended for short data series.

16.
Experiments often produce a hit rate and a false alarm rate in each of two conditions. These response rates are summarized into a single-point sensitivity measure such as d', and t tests are conducted to test for experimental effects. Using large-scale Monte Carlo simulations, we evaluate the Type I error rates and power that result from four commonly used single-point measures: d', A', percent correct, and gamma. We also test a newly proposed measure called gammaC. For all measures, we consider several ways of handling cases in which false alarm rate = 0 or hit rate = 1. The results of our simulations indicate that power is similar for these measures but that the Type I error rates are often unacceptably high. Type I errors are minimized when the selected sensitivity measure is theoretically appropriate for the data.
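
For context, d' is the difference of inverse-normal-transformed hit and false-alarm rates, and rates of exactly 0 or 1 must be adjusted before the transformation or d' becomes infinite. A sketch using one common convention, adding 0.5 to each response count (the abstract compares several such handling rules; this particular choice is an assumption on our part, not the study's recommendation):

```python
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' from raw counts, with 0.5 added to each cell so that
    hit/false-alarm rates of 0 or 1 never reach the infinite tails
    of the inverse normal transformation.
    """
    hr = (hits + 0.5) / (hits + misses + 1)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hr) - norm.ppf(far)
```

Chance performance (equal hit and false-alarm rates) gives d' = 0; a perfect 18/20 hits against 2/20 false alarms gives d' around 2.4 even though the raw rates never hit the 0/1 boundary.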

17.
Categorical moderators are often included in mixed-effects meta-analysis to explain heterogeneity in effect sizes. An assumption in tests of categorical moderator effects is that of a constant between-study variance across all levels of the moderator. Although it rarely receives serious thought, there can be statistical ramifications to upholding this assumption. We propose that researchers should instead default to assuming unequal between-study variances when analysing categorical moderators. To achieve this, we suggest using a mixed-effects location-scale model (MELSM) to allow group-specific estimates for the between-study variance. In two extensive simulation studies, we show that in terms of Type I error and statistical power, little is lost by using the MELSM for moderator tests, but there can be serious costs when an equal variance mixed-effects model (MEM) is used. Most notably, in scenarios with balanced sample sizes or equal between-study variance, the Type I error and power rates are nearly identical between the MEM and the MELSM. On the other hand, with imbalanced sample sizes and unequal variances, the Type I error rate under the MEM can be grossly inflated or overly conservative, whereas the MELSM does comparatively well in controlling the Type I error across the majority of cases. A notable exception where the MELSM did not clearly outperform the MEM was in the case of few studies (e.g., 5). With respect to power, the MELSM had similar or higher power than the MEM in conditions where the latter produced non-inflated Type I error rates. Together, our results support the idea that assuming unequal between-study variances is preferred as a default strategy when testing categorical moderators.

18.
A composite step‐down procedure, in which a set of step‐down tests are summarized collectively with Fisher's combination statistic, was considered to test for multivariate mean equality in two‐group designs. An approximate degrees of freedom (ADF) composite procedure based on trimmed/Winsorized estimators and a non‐pooled estimate of error variance is proposed, and compared to a composite procedure based on trimmed/Winsorized estimators and a pooled estimate of error variance. The step‐down procedures were also compared to Hotelling's T2 and Johansen's ADF global procedure based on trimmed estimators in a simulation study. Type I error rates of the pooled step‐down procedure were sensitive to covariance heterogeneity in unbalanced designs; error rates were similar to those of Hotelling's T2 across all of the investigated conditions. Type I error rates of the ADF composite step‐down procedure were insensitive to covariance heterogeneity and less sensitive to the number of dependent variables when sample size was small than error rates of Johansen's test. The ADF composite step‐down procedure is recommended for testing hypotheses of mean equality in two‐group designs except when the data are sampled from populations with different degrees of multivariate skewness.
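
The Fisher combination statistic used to summarize the step-down tests is simply X² = −2 Σ log pᵢ, referred to a chi-square distribution with 2k degrees of freedom under the global null when the k component tests are independent. A sketch of just this combining step (not the full composite procedure):

```python
import math
from scipy import stats

def fisher_combine(pvals):
    """Fisher's combination of k independent p-values.

    Returns X2 = -2 * sum(log p_i) and its p-value from a
    chi-square distribution with 2k degrees of freedom.
    """
    x2 = -2.0 * sum(math.log(p) for p in pvals)
    df = 2 * len(pvals)
    return x2, stats.chi2.sf(x2, df)
```

A single p-value of 0.5 combines to a p-value of 0.5 (the transformation is exact for k = 1), while several small p-values reinforce each other into a much smaller combined p-value.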

19.
Four applications of permutation tests to the single-mediator model are described and evaluated in this study. Permutation tests work by rearranging data in many possible ways in order to estimate the sampling distribution for the test statistic. The four applications to mediation evaluated here are the permutation test of ab, the permutation joint significance test, and the noniterative and iterative permutation confidence intervals for ab. A Monte Carlo simulation study was used to compare these four tests with the four best available tests for mediation found in previous research: the joint significance test, the distribution of the product test, and the percentile and bias-corrected bootstrap tests. We compared the different methods on Type I error, power, and confidence interval coverage. The noniterative permutation confidence interval for ab was the best performer among the new methods. It successfully controlled Type I error, had power nearly as good as the most powerful existing methods, and had better coverage than any existing method. The iterative permutation confidence interval for ab had lower power than do some existing methods, but it performed better than any other method in terms of coverage. The permutation confidence interval methods are recommended when estimating a confidence interval is a primary concern. SPSS and SAS macros that estimate these confidence intervals are provided.
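
The general shape of a permutation test of ab can be sketched as follows: estimate a (the X → M slope) and b (the M → Y slope adjusting for X) by OLS, then rebuild the null distribution of ab by permuting the mediator, which severs both legs of the indirect path. This is an illustrative variant only; the permutation schemes and confidence-interval constructions evaluated in the study differ in detail:

```python
import numpy as np

def perm_test_ab(x, m, y, n_perm=1000, seed=0):
    """Permutation test of the mediated effect ab (illustrative sketch).

    a and b are OLS slopes; the null distribution of ab is built by
    permuting the mediator.  Returns the observed ab and a two-sided
    permutation p-value.
    """
    rng = np.random.default_rng(seed)

    def ab(m_vec):
        a = np.polyfit(x, m_vec, 1)[0]                 # slope of X -> M
        # slope of M -> Y with X partialled out (two-predictor OLS)
        design = np.column_stack([np.ones_like(x), x, m_vec])
        b = np.linalg.lstsq(design, y, rcond=None)[0][2]
        return a * b

    observed = ab(m)
    null = np.array([ab(rng.permutation(m)) for _ in range(n_perm)])
    p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p

# Simulated data with a genuine indirect effect (a = b = 0.6; illustrative)
rng = np.random.default_rng(1)
x = rng.normal(size=100)
m = 0.6 * x + rng.normal(size=100)
y = 0.6 * m + rng.normal(size=100)
ab_hat, p = perm_test_ab(x, m, y, n_perm=500)
```

With a real indirect effect, the observed ab lies far outside the permutation null distribution and the p-value is small.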

20.
A Monte Carlo simulation was conducted to compare pairwise multiple comparison procedures. The number of means varied from 4 to 8 and the sample sizes varied from 2 to 500. Procedures were evaluated on the basis of Type I errors, any‐pair power and all‐pairs power. Two modifications of the Games and Howell procedure were shown to make it conservative. No procedure was found to be uniformly most powerful. For any‐pair power the Games and Howell procedure was found to be generally most powerful even when applied at more stringent levels to control Type I errors. For all‐pairs power the Peritz procedure applied with modified Brown–Forsythe tests was found to be most powerful in most conditions.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号