Similar Literature
20 similar documents found.
1.
The data obtained from one-way independent groups designs are typically non-normal in form and rarely equally variable across treatment populations (i.e. population variances are heterogeneous). Consequently, the classical test statistic used to assess statistical significance (i.e. the analysis of variance F test) typically provides invalid results (e.g. too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non-normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied to the usual least squares estimators of central tendency and variability, and the Welch test with robust estimators (i.e. trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non-normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non-normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non-normal.
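For readers who want to experiment with the general idea, here is a minimal Python sketch of a parametric bootstrap test for heteroscedastic one-way designs. It pairs Welch's heteroscedastic F statistic with resampling from zero-mean normal distributions that keep each group's estimated variance; the exact statistic of Krishnamoorthy, Lu, and Mathew may differ, so treat this as an illustrative stand-in rather than their procedure.

```python
# A generic parametric-bootstrap sketch, not necessarily the exact
# Krishnamoorthy-Lu-Mathew statistic: Welch's heteroscedastic F is the
# test statistic, and its null distribution is approximated by drawing
# each group from a zero-mean normal with that group's estimated SD.
import numpy as np

def welch_f(groups):
    """Welch's heteroscedastic one-way F statistic."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v
    grand = np.sum(w * m) / np.sum(w)
    a = np.sum(w * (m - grand) ** 2) / (k - 1)
    s = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    return a / (1 + 2 * (k - 2) / (k ** 2 - 1) * s)

def parametric_bootstrap_p(groups, n_boot=5000, seed=0):
    """P-value: share of bootstrap statistics at least as large as observed."""
    rng = np.random.default_rng(seed)
    t_obs = welch_f(groups)
    sds = [np.std(g, ddof=1) for g in groups]
    hits = sum(
        welch_f([rng.normal(0.0, s, len(g)) for s, g in zip(sds, groups)]) >= t_obs
        for _ in range(n_boot))
    return hits / n_boot
```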

2.
Researchers often want to demonstrate a lack of interaction between two categorical predictors on an outcome. To justify a lack of interaction, researchers typically accept the null hypothesis of no interaction from a conventional analysis of variance (ANOVA). This method is inappropriate, as failure to reject the null hypothesis does not provide statistical evidence to support a lack of interaction. This study proposes a bootstrap-based intersection-union test for negligible interaction that provides coherent decisions between the omnibus test and post hoc interaction contrast tests and is robust to violations of the normality and variance homogeneity assumptions. Further, a multiple comparison strategy for testing interaction contrasts following a non-significant omnibus test is proposed. Our simulation study compared the Type I error control, omnibus power and per-contrast power of the proposed approach to the non-centrality-based negligible interaction test of Cheng and Shao (2007, Statistica Sinica, 17, 1441). For 2 × 2 designs, the empirical Type I error rates of the Cheng and Shao test were very close to the nominal α level when the normality and variance homogeneity assumptions were satisfied; however, only our proposed bootstrapping approach was satisfactory under non-normality and/or variance heterogeneity. In general a × b designs, the omnibus Cheng and Shao test is, as expected, the most powerful, but it is not robust to assumption violations and produces incoherent omnibus and interaction contrast decisions, which are not possible with the intersection-union approach.

3.
The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non-normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann-Yuen test of equivalence, which utilizes trimmed means, with those of the previously recommended Schuirmann and Schuirmann-Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann-Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann-Welch tests, and the power of the Schuirmann-Yuen was substantially greater than that of the Schuirmann or Schuirmann-Welch tests when distributions were skewed or outliers were present. The Schuirmann-Yuen test is recommended for assessing clinical significance with normative comparisons.
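A hedged sketch of a Schuirmann-Yuen-style equivalence test follows: two one-sided Yuen tests on trimmed means, using the standard Yuen standard error based on Winsorized variances. The trimming proportion (gamma = 0.2) and the equivalence bounds are illustrative assumptions, not values taken from the article.

```python
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize

def trimmed_se2(x, gamma=0.2):
    """Squared SE of the trimmed mean via the Winsorized variance (Yuen)."""
    n = len(x)
    g = int(np.floor(gamma * n))
    h = n - 2 * g                      # effective sample size after trimming
    sw2 = np.var(np.asarray(winsorize(x, (gamma, gamma))), ddof=1)
    return (n - 1) * sw2 / (h * (h - 1)), h

def schuirmann_yuen(x, y, low, high, gamma=0.2):
    """TOST on trimmed means: equivalence within (low, high) is declared if
    the returned value (the larger one-sided p-value) is below alpha."""
    diff = stats.trim_mean(x, gamma) - stats.trim_mean(y, gamma)
    v1, h1 = trimmed_se2(x, gamma)
    v2, h2 = trimmed_se2(y, gamma)
    se = np.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (h1 - 1) + v2 ** 2 / (h2 - 1))
    p_low = stats.t.sf((diff - low) / se, df)     # H0: difference <= low
    p_high = stats.t.cdf((diff - high) / se, df)  # H0: difference >= high
    return max(p_low, p_high)
```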

4.
Wilcox, Keselman, Muska and Cribbie (2000) found a method for comparing the trimmed means of dependent groups that performed well in simulations, in terms of Type I errors, with a sample size as small as 21. Theory and simulations indicate that little power is lost under normality when using trimmed means rather than untrimmed means, and trimmed means can result in substantially higher power when sampling from a heavy-tailed distribution. However, trimmed means suffer from two practical concerns described in this paper. Replacing trimmed means with a robust M-estimator addresses these concerns, but control over the probability of a Type I error can be unsatisfactory when the sample size is small. Methods based on a simple modification of a one-step M-estimator that address the problems with trimmed means are examined. Several omnibus tests are compared, one of which performed well in simulations, even with a sample size of 11.

5.
A composite step-down procedure, in which a set of step-down tests are summarized collectively with Fisher's combination statistic, was considered to test for multivariate mean equality in two-group designs. An approximate degrees of freedom (ADF) composite procedure based on trimmed/Winsorized estimators and a non-pooled estimate of error variance is proposed, and compared to a composite procedure based on trimmed/Winsorized estimators and a pooled estimate of error variance. The step-down procedures were also compared to Hotelling's T² and Johansen's ADF global procedure based on trimmed estimators in a simulation study. Type I error rates of the pooled step-down procedure were sensitive to covariance heterogeneity in unbalanced designs; error rates were similar to those of Hotelling's T² across all of the investigated conditions. Type I error rates of the ADF composite step-down procedure were insensitive to covariance heterogeneity and less sensitive to the number of dependent variables when sample size was small than error rates of Johansen's test. The ADF composite step-down procedure is recommended for testing hypotheses of mean equality in two-group designs except when the data are sampled from populations with different degrees of multivariate skewness.
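Fisher's combination statistic, which the composite procedure uses to summarize the set of step-down tests, is straightforward to compute. A minimal sketch follows, assuming independent p-values as Fisher's method requires; SciPy's scipy.stats.combine_pvalues offers the same computation.

```python
import numpy as np
from scipy import stats

def fisher_combination_p(pvalues):
    """-2 * sum(ln p) is chi-squared with 2m df when the m p-values are
    independent and their null hypotheses all hold."""
    p = np.asarray(pvalues, dtype=float)
    return stats.chi2.sf(-2.0 * np.log(p).sum(), df=2 * len(p))

# e.g. fisher_combination_p([0.04, 0.20, 0.11])
```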

6.
This article considers the problem of comparing two independent groups in terms of some measure of location. It is well known that with Student's two-independent-sample t test, the actual level of significance can be well above or below the nominal level, confidence intervals can have inaccurate probability coverage, and power can be low relative to other methods. A solution to heterogeneity is Welch's (1938) test, which deals with heteroscedasticity but can have poor power under arbitrarily small departures from normality. Yuen (1974) generalized Welch's test to trimmed means; her method provides improved control over the probability of a Type I error, but problems remain. Transformations for skewness improve matters, but the probability of a Type I error remains unsatisfactory in some situations. We find that a transformation for skewness combined with a bootstrap method improves Type I error control and probability coverage even if sample sizes are small.
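A minimal sketch of the bootstrap side of this idea: Yuen's trimmed-means statistic with a symmetric bootstrap-t p-value, resampling from groups centred at their trimmed means to impose the null. The skewness transformation the article combines with the bootstrap is omitted, so this illustrates the resampling step only.

```python
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize

def yuen_t(x, y, gamma=0.2):
    """Yuen's trimmed-means t statistic for two independent groups."""
    def mean_and_se2(a):
        n = len(a)
        g = int(np.floor(gamma * n))
        h = n - 2 * g
        sw2 = np.var(np.asarray(winsorize(a, (gamma, gamma))), ddof=1)
        return stats.trim_mean(a, gamma), (n - 1) * sw2 / (h * (h - 1))
    mx, vx = mean_and_se2(x)
    my, vy = mean_and_se2(y)
    return (mx - my) / np.sqrt(vx + vy)

def yuen_bootstrap_t_p(x, y, gamma=0.2, n_boot=2000, seed=0):
    """Symmetric bootstrap-t p-value; groups are centred at their trimmed
    means so that the resampling scheme satisfies the null hypothesis."""
    rng = np.random.default_rng(seed)
    t_obs = yuen_t(x, y, gamma)
    xc = np.asarray(x) - stats.trim_mean(x, gamma)
    yc = np.asarray(y) - stats.trim_mean(y, gamma)
    t_star = [yuen_t(rng.choice(xc, len(xc), replace=True),
                     rng.choice(yc, len(yc), replace=True), gamma)
              for _ in range(n_boot)]
    return float(np.mean(np.abs(t_star) >= abs(t_obs)))
```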

7.
Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. This article describes a framework for robust estimation and testing that uses trimmed means with an approximate degrees of freedom heteroscedastic statistic for independent and correlated groups designs to achieve robustness to the biasing effects of nonnormality and variance heterogeneity. The authors describe a nonparametric bootstrap methodology that can provide improved Type I error control. In addition, the authors indicate how researchers can set robust confidence intervals around a robust effect size parameter estimate. In an online supplement, the authors use several examples to illustrate the application of an SAS program to implement these statistical methods.
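As an illustration of the robust confidence intervals mentioned above, here is a small percentile-bootstrap sketch for the difference between two 20% trimmed means. It is not the authors' SAS program; the trimming proportion and bootstrap settings are assumptions.

```python
import numpy as np
from scipy import stats

def trimmed_diff_ci(x, y, gamma=0.2, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the difference of trimmed means."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x), np.asarray(y)
    diffs = [stats.trim_mean(rng.choice(x, len(x), replace=True), gamma)
             - stats.trim_mean(rng.choice(y, len(y), replace=True), gamma)
             for _ in range(n_boot)]
    return np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
```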

8.
Researchers can adopt one of many different measures of central tendency to examine the effect of a treatment variable across groups. These include least squares means, trimmed means, M-estimators and medians. In addition, some methods begin with a preliminary test to determine the shapes of distributions before adopting a particular estimator of the typical score. We compared a number of recently developed adaptive robust methods with respect to their ability to control Type I error and their sensitivity to detect differences between the groups when data were non-normal and heterogeneous, and the design was unbalanced. In particular, two new approaches to comparing the typical score across treatment groups, due to Babu, Padmanabhan, and Puri, were compared to two new methods presented by Wilcox and by Keselman, Wilcox, Othman, and Fradette. The procedures examined generally resulted in good Type I error control and therefore, on the basis of this criterion, it would be difficult to recommend one method over another. However, the power results clearly favour one of the methods presented by Wilcox and Keselman; indeed, in the vast majority of the cases investigated, this most favoured approach had substantially larger power values than the other procedures, particularly when there were more than two treatment groups.

9.
Researchers in the behavioural sciences have been presented with a host of pairwise multiple comparison procedures that attempt to obtain an optimal combination of Type I error control, power, and ease of application. However, these procedures share one important limitation: intransitive decisions. Moreover, they can be characterized as a piecemeal approach to the problem rather than a holistic approach. Dayton has recently proposed a new approach to pairwise multiple comparisons testing that eliminates intransitivity through a model selection procedure. The present study compared the model selection approach (and a protected version) with three powerful and easy-to-use stepwise multiple comparison procedures in terms of the proportion of times that the procedure identified the true pattern of differences among a set of means across several one-way layouts. The protected version of the model selection approach selected the true model a significantly greater proportion of times than the stepwise procedures and, in most cases, was not affected by variance heterogeneity and non-normality.

10.
The goal of this study was to investigate the performance of Hall's transformation of the Brunner-Dette-Munk (BDM) and Welch-James (WJ) test statistics and Box-Cox's data transformation in factorial designs when normality and variance homogeneity assumptions were violated separately and jointly. On the basis of unweighted marginal means, we performed a simulation study to explore the operating characteristics of the methods proposed for a variety of distributions with small sample sizes. Monte Carlo simulation results showed that when data were sampled from symmetric distributions, the error rates of the original BDM and WJ tests were scarcely affected by the lack of normality and homogeneity of variance. In contrast, when data were sampled from skewed distributions, the original BDM and WJ rates were not well controlled. Under such circumstances, the results clearly revealed that Hall's transformation of the BDM and WJ tests provided generally better control of Type I error rates than did the same tests based on Box-Cox's data transformation. Among all the methods considered in this study, we also found that Hall's transformation of the BDM test yielded the best control of Type I errors, although it was often less powerful than either of the WJ tests when both approaches reasonably controlled the error rates.
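Box-Cox's data transformation, one of the two approaches compared above, is available directly in SciPy; a short sketch on illustrative skewed data follows. Hall's transformation is a statistic-level skewness correction and is not shown here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
skewed = rng.lognormal(mean=0.0, sigma=0.8, size=40)  # strictly positive, right-skewed

# Box-Cox chooses lambda by maximum likelihood; the data must be positive.
transformed, lam = stats.boxcox(skewed)
print(f"lambda = {lam:.3f}; skewness {stats.skew(skewed):.2f} -> {stats.skew(transformed):.2f}")
```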

11.
In this journal, Zimmerman (2004, 2011) has discussed preliminary tests that researchers often use to choose an appropriate method for comparing locations when the assumption of normality is doubtful. The conceptual problem with this approach is that such a two-stage process makes both the power and the significance of the entire procedure uncertain, as Type I and Type II errors are possible at both stages. A Type I error at the first stage, for example, will obviously increase the probability of a Type II error at the second stage. Based on the idea of Schmider et al. (2010), which proposes that simulated sets of sample data be ranked with respect to their degree of normality, this paper investigates the relationship between population non-normality and sample non-normality with respect to the performance of the ANOVA, Brown-Forsythe test, Welch test, and Kruskal-Wallis test when used with different distributions, sample sizes, and effect sizes. The overall conclusion is that the Kruskal-Wallis test is considerably less sensitive to the degree of sample normality when populations are distinctly non-normal and should therefore be the primary tool used to compare locations when it is known that populations are not at least approximately normal.
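The Kruskal-Wallis test recommended above is available in SciPy; the following sketch applies it to three illustrative exponential (hence distinctly non-normal) samples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
g1 = rng.exponential(1.0, 25)        # distinctly non-normal populations
g2 = rng.exponential(1.0, 25) + 0.5  # location-shifted group
g3 = rng.exponential(1.0, 25)

h, p = stats.kruskal(g1, g2, g3)     # rank-based; no normality assumption
print(f"H = {h:.2f}, p = {p:.4f}")
```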

12.
There is no formal and generally accepted procedure for choosing an appropriate significance test for sample data when the assumption of normality is doubtful. Various tests of normality that have been proposed over the years have been found to have limited usefulness, and sometimes a preliminary test makes the situation worse. The present paper investigates a specific and easily applied rule for choosing between a parametric and non-parametric test, the Student t test and the Wilcoxon-Mann-Whitney test, that does not require a preliminary significance test of normality. Simulations reveal that the rule, which can be applied to sample data automatically by computer software, protects the Type I error rate and increases power for various sample sizes, significance levels, and non-normal distribution shapes. Limitations of the procedure in the case of heterogeneity of variance are discussed.

13.
Repeated measures analyses of variance are the method of choice in many studies from experimental psychology and the neurosciences. Data from these fields are often characterized by small sample sizes, high numbers of factor levels of the within-subjects factor(s), and nonnormally distributed response variables such as response times. For a design with a single within-subjects factor, we investigated Type I error control in univariate tests with corrected degrees of freedom, the multivariate approach, and a mixed-model (multilevel) approach (SAS PROC MIXED) with Kenward-Roger's adjusted degrees of freedom. We simulated multivariate normal and nonnormal distributions with varied population variance-covariance structures (spherical and nonspherical), sample sizes (N), and numbers of factor levels (K). For normally distributed data, as expected, the univariate approach with Huynh-Feldt correction controlled the Type I error rate with only very few exceptions, even if sample sizes as low as three were combined with high numbers of factor levels. The multivariate approach also controlled the Type I error rate, but it requires N > K. PROC MIXED often showed acceptable control of the Type I error rate for normal data, but it also produced several liberal or conservative results. For nonnormal data, all of the procedures showed clear deviations from the nominal Type I error rate in many conditions, even for sample sizes greater than 50. Thus, none of these approaches can be considered robust if the response variable is nonnormally distributed. The results indicate that both the variance heterogeneity and covariance heterogeneity of the population covariance matrices affect the error rates.

14.
Research problems that require a non-parametric analysis of multifactor designs with repeated measures arise in the behavioural sciences. There is, however, a lack of available procedures in commonly used statistical packages. In the present study, a generalization of the aligned rank test for the two-way interaction is proposed for the analysis of the typical sources of variation in a three-way analysis of variance (ANOVA) with repeated measures. It can be implemented in the usual statistical packages. Its statistical properties are tested by using simulation methods with two sample sizes (n = 30 and n = 10) and three distributions (normal, exponential and double exponential). Results indicate substantial increases in power for non-normal distributions in comparison with the usual parametric tests. Similar levels of Type I error for both parametric and aligned rank ANOVA were obtained with non-normal distributions and large sample sizes. Degrees-of-freedom adjustments for Type I error control in small samples are proposed. The procedure is applied to a case study with 30 participants per group, where it detects gender differences in blind children's linguistic abilities that were not shown previously by other methods.
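The core alignment step is easy to express in code. The sketch below aligns scores for a two-way between-subjects A × B interaction by removing estimated main effects and then ranking; a conventional ANOVA interaction F would then be computed on the ranks. The article's procedure generalizes this idea to three-way repeated measures designs, which this sketch does not cover.

```python
import numpy as np
from scipy import stats

def aligned_interaction_ranks(y, a, b):
    """Rank scores aligned for the A x B interaction in a two-way
    between-subjects layout. `a` and `b` must be 0-based consecutive
    integer factor labels (an assumption of this sketch)."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    b = np.asarray(b)
    grand = y.mean()
    a_eff = np.array([y[a == i].mean() for i in range(a.max() + 1)]) - grand
    b_eff = np.array([y[b == j].mean() for j in range(b.max() + 1)]) - grand
    aligned = y - a_eff[a] - b_eff[b]   # remove main effects; ranks are
    return stats.rankdata(aligned)      # unchanged by the constant offset
```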

15.
A one-way random effects model for trimmed means
The random effects ANOVA model plays an important role in many psychological studies, but the usual model suffers from at least two serious problems. The first is that even under normality, violating the assumption of equal variances can have serious consequences in terms of Type I errors or significance levels, and it can affect power as well. The second and perhaps more serious concern is that even slight departures from normality can result in a substantial loss of power when testing hypotheses. Jeyaratnam and Othman (1985) proposed a method for handling unequal variances, under the assumption of normality, but no results were given on how their procedure performs when distributions are nonnormal. A secondary goal in this paper is to address this issue via simulations. As will be seen, problems arise with both Type I errors and power. Another secondary goal is to provide new simulation results on the Rust-Fligner modification of the Kruskal-Wallis test. The primary goal is to propose a generalization of the usual random effects model based on trimmed means. The resulting test of no differences among J randomly sampled groups has certain advantages in terms of Type I errors, and it can yield substantial gains in power when distributions have heavy tails and outliers. This last feature is very important in applied work because recent investigations indicate that heavy-tailed distributions are common. Included is a suggestion for a heteroscedastic Winsorized analog of the usual intraclass correlation coefficient.

16.
Three approaches to the analysis of main and interaction effect hypotheses in nonorthogonal designs were compared in a 2×2 design for data that were neither normal in form nor equal in variance. The approaches involved either least squares or robust estimators of central tendency and variability and/or a test statistic that either pools or does not pool sources of variance. Specifically, we compared the ANOVA F test using trimmed means and Winsorized variances, the Welch-James test with the usual least squares estimators for central tendency and variability, and the Welch-James test using trimmed means and Winsorized variances. As hypothesized, we found that the latter approach provided excellent Type I error control, whereas the former two did not. Financial support for this research was provided by grants to the first author from the Natural Sciences and Engineering Research Council of Canada (#OGP0015855) and the Social Sciences and Humanities Research Council (#410-95-0006). The authors would like to express their appreciation to the Associate Editor as well as the reviewers who provided valuable comments on an earlier version of this paper.

17.
For one-way fixed effects ANOVA, it is well known that the conventional F test of the equality of means is not robust to unequal variances, and numerous methods have been proposed for dealing with heteroscedasticity. On the basis of extensive empirical evidence of Type I error control and power performance, Welch's procedure is frequently recommended as the major alternative to the ANOVA F test under variance heterogeneity. To enhance its practical usefulness, this paper considers an important aspect of Welch's method in determining the sample size necessary to achieve a given power. Simulation studies are conducted to compare two approximate power functions of Welch's test for their accuracy in sample size calculations over a wide variety of model configurations with heteroscedastic structures. The numerical investigations show that Levy's (1978a) approach is clearly more accurate than the formula of Luh and Guo (2011) for the range of model specifications considered here. Accordingly, computer programs are provided to implement the technique recommended by Levy for power calculation and sample size determination within the context of the one-way heteroscedastic ANOVA model.
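Neither Levy's nor Luh and Guo's approximate power function is reproduced here, but power for Welch's test can always be estimated by simulation, which provides a useful check on any closed-form approximation. A sketch follows, with illustrative means, standard deviations, and group sizes.

```python
import numpy as np
from scipy import stats

def welch_anova_p(groups):
    """P-value of Welch's heteroscedastic one-way ANOVA."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v
    grand = (w * m).sum() / w.sum()
    a = (w * (m - grand) ** 2).sum() / (k - 1)
    s = ((1 - w / w.sum()) ** 2 / (n - 1)).sum()
    f_stat = a / (1 + 2 * (k - 2) / (k ** 2 - 1) * s)
    return stats.f.sf(f_stat, k - 1, (k ** 2 - 1) / (3 * s))

def welch_power(means, sds, ns, alpha=0.05, n_sim=2000, seed=0):
    """Monte Carlo power estimate under normal populations."""
    rng = np.random.default_rng(seed)
    hits = sum(
        welch_anova_p([rng.normal(mu, sd, n)
                       for mu, sd, n in zip(means, sds, ns)]) < alpha
        for _ in range(n_sim))
    return hits / n_sim

# e.g. welch_power(means=[0.0, 0.5, 1.0], sds=[1, 2, 3], ns=[20, 20, 20])
```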

18.
The statistical simulation program DATASIM is designed to conduct large-scale sampling experiments on microcomputers. Monte Carlo procedures are used to investigate the Type I and Type II error rates for statistical tests when one or more assumptions are systematically violated: assumptions, for example, regarding normality, homogeneity of variance or covariance, minimum expected cell frequencies, and the like. In the present paper, we report several initial tests of the data-generating algorithms employed by DATASIM. The results indicate that the uniform and standard normal deviate generators perform satisfactorily. Furthermore, Kolmogorov-Smirnov tests show that the sampling distributions of z, t, F, χ², and r generated by DATASIM simulations follow the appropriate theoretical distributions. Finally, estimates of Type I error rates obtained by DATASIM under various patterns of violations of assumptions are in close agreement with the results of previous analytical and empirical studies. These converging lines of evidence suggest that DATASIM may well prove to be a reliable and productive tool for conducting statistical simulation research.
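The kind of validation described above is easy to reproduce with modern tools: simulate a test statistic under its null hypothesis and compare the empirical distribution to the theoretical one with a Kolmogorov-Smirnov test. A sketch for the one-sample t statistic follows; DATASIM itself is not used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, reps = 15, 5000

# One-sample t statistics simulated under H0: mu = 0, sigma = 1.
samples = rng.normal(0.0, 1.0, size=(reps, n))
t_stats = samples.mean(axis=1) / (samples.std(axis=1, ddof=1) / np.sqrt(n))

# Kolmogorov-Smirnov comparison with the theoretical t distribution (df = n - 1).
d, p = stats.kstest(t_stats, stats.t(df=n - 1).cdf)
print(f"KS D = {d:.4f}, p = {p:.3f}")  # a large p is consistent with theory
```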

19.
When the underlying variances are unknown and/or unequal, using the conventional F test is problematic in the two-factor hierarchical data structure. Prompted by the approximate test statistics (Welch and Alexander-Govern methods), the authors develop four new heterogeneous test statistics to test factor A and factor B nested within A for the unbalanced fixed-effect two-stage nested design under variance heterogeneity. The actual significance levels and statistical power of the test statistics were compared in a simulation study. The results show that the proposed procedures maintain better Type I error rate control and have greater statistical power than those obtained by the conventional F test in various conditions. Therefore, the proposed test statistics are recommended in terms of robustness and easy implementation.
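The nested-design statistics proposed in the article are not in standard libraries, but one of their building blocks, the Alexander-Govern heteroscedastic one-way test, is available in SciPy (version 1.7 or later); a short sketch follows.

```python
import numpy as np
from scipy import stats  # scipy >= 1.7 provides alexandergovern

rng = np.random.default_rng(4)
g1 = rng.normal(0.0, 1.0, 15)
g2 = rng.normal(0.0, 3.0, 25)   # markedly unequal variances
g3 = rng.normal(0.5, 5.0, 10)

res = stats.alexandergovern(g1, g2, g3)  # heteroscedastic one-way test
print(res.statistic, res.pvalue)
```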

20.
Experience with real data indicates that psychometric measures often have heavy-tailed distributions. This is known to be a serious problem when comparing the means of two independent groups because heavy-tailed distributions can have a serious effect on power. Another problem that is common in some areas is outliers. This paper suggests an approach to these problems based on the one-step M-estimator of location. Simulations indicate that the new procedure provides very good control over the probability of a Type I error even when distributions are skewed, have different shapes, and the variances are unequal. Moreover, the new procedure has considerably more power than Welch's method when distributions have heavy tails, and it compares well to Yuen's method for comparing trimmed means. Wilcox's median procedure has about the same power as the proposed procedure, but Wilcox's method is based on a statistic that has a finite sample breakdown point of only 1/n, where n is the sample size.  Comments on other methods for comparing groups are also included.
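A sketch of the one-step M-estimator of location with Huber's psi, following the form given in Wilcox's texts (tuning constant 1.28, normalized MAD); we assume, without having verified against the paper, that this matches the estimator used there.

```python
import numpy as np

def one_step_m(x, k=1.28):
    """One-step M-estimator of location (Huber psi), in the form given in
    Wilcox's texts; assumed here to match the paper's estimator."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    madn = np.median(np.abs(x - med)) / 0.6745      # normalized MAD
    lo, hi = med - k * madn, med + k * madn
    i1 = int(np.sum(x < lo))                        # low outliers flagged
    i2 = int(np.sum(x > hi))                        # high outliers flagged
    middle = x[(x >= lo) & (x <= hi)]
    return (k * madn * (i2 - i1) + middle.sum()) / (len(x) - i1 - i2)
```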
