Similar Literature
20 similar documents found.
1.
We propose a simple modification of Hochberg's step‐up Bonferroni procedure for multiple tests of significance. The proposed procedure is always more powerful than Hochberg's procedure for more than two tests, and is more powerful than Hommel's procedure for three and four tests. A numerical analysis of the new procedure indicates that its Type I error is controlled under independence of the test statistics, at a level equal to or just below the nominal Type I error. Examination of various non‐null configurations of hypotheses shows that the modified procedure has a power advantage over Hochberg's procedure that increases with the number of false hypotheses.
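The abstract does not reproduce the modified critical constants, but Hochberg's original step-up rule that the modification builds on is easy to state: with p-values sorted in ascending order, reject the k smallest hypotheses, where k is the largest i such that p_(i) <= alpha/(m - i + 1). A minimal Python sketch, ours rather than the authors':

```python
import numpy as np

def hochberg(pvals, alpha=0.05):
    """Hochberg's step-up procedure: reject H_(1), ..., H_(k), where k is the
    largest i (p-values sorted ascending) with p_(i) <= alpha / (m - i + 1)."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)            # indices of p-values, ascending
    reject = np.zeros(m, dtype=bool)
    for i in range(m, 0, -1):        # step up from the largest p-value
        if p[order[i - 1]] <= alpha / (m - i + 1):
            reject[order[:i]] = True
            break
    return reject

# Example: four tests at alpha = .05; only the smallest p-value is rejected.
print(hochberg([0.010, 0.049, 0.020, 0.300]))
```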

2.
Categorical moderators are often included in mixed-effects meta-analysis to explain heterogeneity in effect sizes. An assumption in tests of categorical moderator effects is that of a constant between-study variance across all levels of the moderator. Although it rarely receives serious thought, there can be statistical ramifications to upholding this assumption. We propose that researchers should instead default to assuming unequal between-study variances when analysing categorical moderators. To achieve this, we suggest using a mixed-effects location-scale model (MELSM) to allow group-specific estimates for the between-study variance. In two extensive simulation studies, we show that in terms of Type I error and statistical power, little is lost by using the MELSM for moderator tests, but there can be serious costs when an equal variance mixed-effects model (MEM) is used. Most notably, in scenarios with balanced sample sizes or equal between-study variance, the Type I error and power rates are nearly identical between the MEM and the MELSM. On the other hand, with imbalanced sample sizes and unequal variances, the Type I error rate under the MEM can be grossly inflated or overly conservative, whereas the MELSM does comparatively well in controlling the Type I error across the majority of cases. A notable exception where the MELSM did not clearly outperform the MEM was in the case of few studies (e.g., 5). With respect to power, the MELSM had similar or higher power than the MEM in conditions where the latter produced non-inflated Type I error rates. Together, our results support the idea that assuming unequal between-study variances is preferred as a default strategy when testing categorical moderators.
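The MELSM itself is fitted with likelihood machinery that is out of scope for a short example, but the pooled-versus-separate variance contrast at the heart of the paper can be illustrated with a simplified Monte Carlo built on DerSimonian-Laird variance estimates. The subgroup sizes, variance values, and plain z test below are our illustrative assumptions, not the authors' simulation design:

```python
import numpy as np

rng = np.random.default_rng(1)

def dl_tau2(y, v):
    """DerSimonian-Laird estimate of the between-study variance tau^2."""
    w = 1.0 / v
    ybar = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - ybar) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    return max(0.0, (q - (len(y) - 1)) / c)

def moderator_rejected(y1, v1, y2, v2, pooled):
    """z test of equal subgroup mean effects, with either a pooled (MEM-like)
    or group-specific (MELSM-like) between-study variance."""
    if pooled:
        t2a = t2b = dl_tau2(np.concatenate([y1, y2]), np.concatenate([v1, v2]))
    else:
        t2a, t2b = dl_tau2(y1, v1), dl_tau2(y2, v2)
    w1, w2 = 1 / (v1 + t2a), 1 / (v2 + t2b)
    m1 = np.sum(w1 * y1) / np.sum(w1)
    m2 = np.sum(w2 * y2) / np.sum(w2)
    se = np.sqrt(1 / np.sum(w1) + 1 / np.sum(w2))
    return abs(m1 - m2) / se > 1.96

# Imbalanced subgroups (k = 40 vs 8) with unequal tau^2 and no true
# moderator effect: the configuration where the equal-variance model fails.
reps, hits = 2000, {True: 0, False: 0}
v1, v2 = np.full(40, 0.05), np.full(8, 0.05)
for _ in range(reps):
    y1 = rng.normal(0, np.sqrt(v1 + 0.01))   # small tau^2 in the large group
    y2 = rng.normal(0, np.sqrt(v2 + 0.30))   # large tau^2 in the small group
    for pooled in (True, False):
        hits[pooled] += moderator_rejected(y1, v1, y2, v2, pooled)

print("Type I error with pooled tau^2  :", hits[True] / reps)
print("Type I error with separate tau^2:", hits[False] / reps)
```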

3.
Empirical Type I error and power rates were estimated for (a) the doubly multivariate model, (b) the Welch-James multivariate solution developed by Keselman, Carriere and Lix (1993) using Johansen's results (1980), and (c) the multivariate version of the modified Brown-Forsythe (1974) procedure. The performance of these procedures was investigated by testing within-blocks sources of variation in a multivariate split-plot design containing unequal covariance matrices. The results indicate that the doubly multivariate model did not provide effective Type I error control, while the Welch-James procedure provided robust and powerful tests of the within-subjects main effect; however, this approach provided liberal tests of the interaction effect. The results also indicate that the modified Brown-Forsythe procedure provided robust tests of within-subjects main and interaction effects, especially when the design was balanced or when group sizes and covariance matrices were positively paired.

4.
Methods as Tools     
O'Keefe argues that the logic of experiment‐wise error correction is flawed, presenting a number of counterexamples as evidence for his claim. He asserts that there is no consistent principle that discriminates legitimate from absurd uses of this logic. I supply such a principle and defend it with his own counterexamples. In sum, O'Keefe's critique raises important methodological questions and provokes discussion that may help answer them, but goes too far in indicting the logic of experiment‐wise error correction.

5.
A composite step‐down procedure, in which a set of step‐down tests is summarized collectively with Fisher's combination statistic, was considered to test for multivariate mean equality in two‐group designs. An approximate degrees of freedom (ADF) composite procedure based on trimmed/Winsorized estimators and a non‐pooled estimate of error variance is proposed, and compared to a composite procedure based on trimmed/Winsorized estimators and a pooled estimate of error variance. The step‐down procedures were also compared to Hotelling's T² and Johansen's ADF global procedure based on trimmed estimators in a simulation study. Type I error rates of the pooled step‐down procedure were sensitive to covariance heterogeneity in unbalanced designs; error rates were similar to those of Hotelling's T² across all of the investigated conditions. Type I error rates of the ADF composite step‐down procedure were insensitive to covariance heterogeneity and less sensitive to the number of dependent variables when sample size was small than error rates of Johansen's test. The ADF composite step‐down procedure is recommended for testing hypotheses of mean equality in two‐group designs except when the data are sampled from populations with different degrees of multivariate skewness.
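The summarizing device named here, Fisher's combination statistic, is simple: X² = −2 Σ ln p_i is referred to a chi-square distribution with 2k degrees of freedom when the k p-values are independent and their null hypotheses true. A minimal sketch, leaving out the trimmed/Winsorized estimation that distinguishes the proposed procedure:

```python
import numpy as np
from scipy import stats

def fisher_combination(pvals):
    """Fisher's combining statistic: X2 = -2 * sum(ln p_i), compared to a
    chi-square distribution with 2k degrees of freedom."""
    p = np.asarray(pvals, dtype=float)
    x2 = -2.0 * np.sum(np.log(p))
    df = 2 * len(p)
    return x2, stats.chi2.sf(x2, df)

# Three step-down p-values summarized into one composite test:
x2, p = fisher_combination([0.08, 0.01, 0.20])
print(f"X2 = {x2:.2f}, combined p = {p:.4f}")   # X2 = 17.48, p = 0.0077
```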

6.
Thomas F. O'Dea's classic The Mormons identified “Mormonism's encounter with modern secular thought” as perhaps the church's greatest problem. In line with secularization theory, O'Dea predicted an attenuation of traditional Mormonism, and an adaptation and gradual liberalization of Mormon theology as the literal interpretation of Mormon origins “dissolved” in the solvent of modernity. O'Dea's views on the crisis facing Mormonism were based, in part, on ethnographic field work in Salt Lake City in the summer of 1950, as interpreted by a modernist Catholic sociologist. A review of his field notes suggests that key informants who “hosted” much of O'Dea's research activity were liberal Mormon academics who defined the church's traditional theology as a problem. This viewpoint agreed with O'Dea's preconceptions about the “education and apostasy” dilemma. A comparison of O'Dea's published “reading” of his field notes with the notes themselves suggests the plausibility of alternative readings. One such alternative is offered here, an interpretation of the interplay of education, Mormon theology, and Mormon intellectualism drawn from O'Dea's field notes, but with a different emphasis from that of his essay. Finally, reverting to a modernist perspective, I offer some hints from survey research suggesting that the predicted liberalization of Mormon theology has yet to occur.

7.
Type I error is a risk undertaken whenever significance tests are conducted, and the chances of committing a Type I error increase as the number of significance tests increases. But adjusting the alpha level because of the number of tests conducted in a given study has no principled basis, commits one to absurd beliefs and practices, and reduces statistical power. The practice of requiring or employing such adjustments should be abandoned.
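The premise granted on both sides of this debate, that the chance of at least one Type I error grows with the number of tests, is elementary: for k independent tests each at level alpha, P(at least one Type I error) = 1 − (1 − alpha)^k. A brief illustration:

```python
# P(at least one Type I error) across k independent tests at alpha = .05.
alpha = 0.05
for k in (1, 5, 10, 20):
    print(k, round(1 - (1 - alpha) ** k, 3))   # 0.05, 0.226, 0.401, 0.642
```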

8.
Two types of global testing procedures for item fit to the Rasch model were evaluated using simulation studies. The first type incorporates three tests based on first‐order statistics: van den Wollenberg's Q1 test, Glas's R1 test, and Andersen's LR test. The second type incorporates three tests based on second‐order statistics: van den Wollenberg's Q2 test, Glas's R2 test, and a non‐parametric test proposed by Ponocny. The Type I error rates and the power against the violation of parallel item response curves, unidimensionality and local independence were analysed in relation to sample size and test length. In general, the outcomes indicate a satisfactory performance of all tests, except the Q2 test which exhibits an inflated Type I error rate. Further, it was found that both types of tests have power against all three types of model violation. A possible explanation is the interdependencies among the assumptions underlying the model.

9.
Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
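A bias-corrected (though not accelerated) percentile bootstrap for the mediated effect ab can be sketched as follows. The data-generating setup echoes the risky region identified in the paper (small sample, medium a path, null b path), but the sample size, effect sizes, and regression details are our illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def bc_bootstrap_ab(x, m, y, n_boot=2000, alpha=0.05):
    """Bias-corrected percentile bootstrap CI for the mediated effect a*b:
    a is the slope of M on X; b the slope of Y on M controlling for X."""
    n = len(x)

    def ab(idx):
        a = np.polyfit(x[idx], m[idx], 1)[0]
        X = np.column_stack([np.ones(n), x[idx], m[idx]])
        b = np.linalg.lstsq(X, y[idx], rcond=None)[0][2]
        return a * b

    est = ab(np.arange(n))
    boots = np.array([ab(rng.integers(0, n, n)) for _ in range(n_boot)])
    # The bias-correction constant z0 shifts the percentile endpoints.
    z0 = stats.norm.ppf(np.mean(boots < est))
    z = stats.norm.ppf(1 - alpha / 2)
    lo = stats.norm.cdf(2 * z0 - z)
    hi = stats.norm.cdf(2 * z0 + z)
    return np.quantile(boots, [lo, hi])

# Null configuration flagged in the paper: small n, medium a path, b = 0.
n = 50
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = rng.normal(size=n)                 # Y unrelated to M, so a*b = 0
ci = bc_bootstrap_ab(x, m, y)
print("95% BC CI for a*b:", ci)        # a Type I error if 0 lies outside
```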

10.
Many books on statistical methods advocate a ‘conditional decision rule’ when comparing two independent group means. This rule states that the decision as to whether to use a ‘pooled variance’ test that assumes equality of variance or a ‘separate variance’ Welch t test that does not should be based on the outcome of a variance equality test. In this paper, we empirically examine the Type I error rate of the conditional decision rule using four variance equality tests and compare this error rate to the unconditional use of either of the t tests (i.e. irrespective of the outcome of a variance homogeneity test) as well as several resampling‐based alternatives when sampling from 49 distributions varying in skewness and kurtosis. Several unconditional tests including the separate variance test performed as well as or better than the conditional decision rule across situations. These results extend and generalize the findings of previous researchers who have argued that the conditional decision rule should be abandoned.
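The two strategies being compared are easy to contrast in code. The sketch below uses Levene's test as the preliminary variance-equality test, which is only one of the four such tests examined in the paper:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
g1 = rng.normal(0.0, 1.0, size=25)
g2 = rng.normal(0.0, 2.0, size=40)   # unequal population variances

# Conditional decision rule: pick the t test based on a preliminary
# variance-equality test.
_, p_levene = stats.levene(g1, g2)
pool = p_levene > 0.05
_, p_cond = stats.ttest_ind(g1, g2, equal_var=pool)
print("conditional rule:", round(p_cond, 4), "(pooled)" if pool else "(Welch)")

# Unconditional strategy the results favour: always use the Welch test.
_, p_welch = stats.ttest_ind(g1, g2, equal_var=False)
print("unconditional Welch:", round(p_welch, 4))
```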

11.
12.
We examine methods for measuring performance in signal-detection-like tasks when each participant provides only a few observations. Monte Carlo simulations demonstrate that standard statistical techniques applied to a d′ analysis can lead to large numbers of Type I errors (incorrectly rejecting a hypothesis of no difference). Various statistical methods were compared in terms of their Type I and Type II error (incorrectly accepting a hypothesis of no difference) rates. Our conclusions are the same whether these two types of errors are weighted equally or Type I errors are weighted more heavily. The most promising method is to combine an aggregate d′ measure with a percentile bootstrap confidence interval, a computer-intensive nonparametric method of statistical inference. Researchers who prefer statistical techniques more commonly used in psychology, such as a repeated measures t test, should use γ (Goodman & Kruskal, 1954), since it performs slightly better than or nearly as well as d′. In general, when repeated measures t tests are used, γ is more conservative than d′: it makes more Type II errors, but its Type I error rate tends to be much closer to that of the traditional .05 α level. It is somewhat surprising that γ performs as well as it does, given that the simulations that generated the hypothetical data conformed completely to the d′ model. Analyses in which H − FA (hit rate minus false-alarm rate) was used had the highest Type I error rates. Detailed simulation results can be downloaded from www.psychonomic.org/archive/Schooler-BRM-2004.zip.
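For orientation, d′ is the z-transformed hit rate minus the z-transformed false-alarm rate. The log-linear correction in the sketch below, which keeps rates of 0 or 1 finite (a frequent occurrence with few observations), is one common convention and an assumption on our part, not a detail taken from the paper:

```python
from scipy.stats import norm

def dprime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate). The +0.5/+1 log-linear
    correction (an assumption here) keeps rates of 0 or 1 finite."""
    h = (hits + 0.5) / (hits + misses + 1)
    fa = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(h) - norm.ppf(fa)

# A participant with only 8 signal and 8 noise trials:
print(round(dprime(7, 1, 2, 6), 3))   # about 1.56
```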

13.
The goal of this study was to investigate the performance of Hall’s transformation of the Brunner-Dette-Munk (BDM) and Welch-James (WJ) test statistics and Box-Cox’s data transformation in factorial designs when normality and variance homogeneity assumptions were violated separately and jointly. On the basis of unweighted marginal means, we performed a simulation study to explore the operating characteristics of the methods proposed for a variety of distributions with small sample sizes. Monte Carlo simulation results showed that when data were sampled from symmetric distributions, the error rates of the original BDM and WJ tests were scarcely affected by the lack of normality and homogeneity of variance. In contrast, when data were sampled from skewed distributions, the original BDM and WJ rates were not well controlled. Under such circumstances, the results clearly revealed that Hall’s transformation of the BDM and WJ tests provided generally better control of Type I error rates than did the same tests based on Box-Cox’s data transformation. Among all the methods considered in this study, we also found that Hall’s transformation of the BDM test yielded the best control of Type I errors, although it was often less powerful than either of the WJ tests when both approaches reasonably controlled the error rates.
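Hall's transformation of the test statistics is specific to this literature and not reproduced here, but the competing Box-Cox data transformation is widely available; a brief illustration using scipy on hypothetical skewed data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
skewed = rng.lognormal(mean=0.0, sigma=0.8, size=200)   # positively skewed sample

# Box-Cox picks the power lambda maximizing the normal log-likelihood; the
# transformed scores would then be passed to the chosen test statistic.
transformed, lam = stats.boxcox(skewed)
print(f"lambda = {lam:.2f}")
print(f"skewness before = {stats.skew(skewed):.2f}, "
      f"after = {stats.skew(transformed):.2f}")
```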

14.
The important assumption of independent errors should be evaluated routinely in the application of interrupted time-series regression models. The two most frequently recommended tests of this assumption [Mood's runs test and the Durbin-Watson (D-W) bounds test] have several weaknesses. The former has poor small sample Type I error performance and the latter has the bothersome property that results are often declared to be "inconclusive." The test proposed in this article is simple to compute (special software is not required), there is no inconclusive region, an exact p-value is provided, and it has good Type I error and power properties relative to competing procedures. It is shown that these desirable properties hold when design matrices of a specified form are used to model the response variable. A Monte Carlo evaluation of the method, including comparisons with other tests (viz., runs, D-W bounds, and D-W beta), and examples of application are provided.
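The proposed test itself is not specified in the abstract, but the classical Durbin-Watson statistic it is compared against is simple to compute. A sketch, applied to a hypothetical AR(1)-contaminated residual series:

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
    Values near 2 suggest independent errors; d well below 2 suggests
    positive first-order autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Residuals from a fitted interrupted time-series regression would go here;
# an AR(1)-contaminated series stands in for illustration.
rng = np.random.default_rng(5)
e = np.empty(100)
e[0] = rng.normal()
for t in range(1, 100):
    e[t] = 0.6 * e[t - 1] + rng.normal()
print(round(durbin_watson(e), 2))   # well below 2
```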

15.
The Type I error rates and powers of three recent tests for analyzing nonorthogonal factorial designs under departures from the assumptions of homogeneity and normality were evaluated using Monte Carlo simulation. Specifically, this work compared the performance of the modified Brown-Forsythe procedure, the generalization of Box's method proposed by Brunner, Dette, and Munk, and the mixed-model procedure adjusted by the Kenward-Roger solution available in the SAS statistical package. With regard to robustness, the three approaches adequately controlled Type I error when the data were generated from symmetric distributions; however, this study's results indicate that, when the data were extracted from asymmetric distributions, the modified Brown-Forsythe approach controlled the Type I error slightly better than the other procedures. With regard to sensitivity, the higher power rates were obtained when the analyses were done with the MIXED procedure of the SAS program. Furthermore, results also identified that, when the data were generated from symmetric distributions, little power was sacrificed by using the generalization of Box's method in place of the modified Brown-Forsythe procedure.

16.
17.
The statistical simulation program DATASIM is designed to conduct large-scale sampling experiments on microcomputers. Monte Carlo procedures are used to investigate the Type I and Type II error rates for statistical tests when one or more assumptions are systematically violated: assumptions, for example, regarding normality, homogeneity of variance or covariance, minimum expected cell frequencies, and the like. In the present paper, we report several initial tests of the data-generating algorithms employed by DATASIM. The results indicate that the uniform and standard normal deviate generators perform satisfactorily. Furthermore, Kolmogorov-Smirnov tests show that the sampling distributions of z, t, F, χ², and r generated by DATASIM simulations follow the appropriate theoretical distributions. Finally, estimates of Type I error rates obtained by DATASIM under various patterns of violations of assumptions are in close agreement with the results of previous analytical and empirical studies. These converging lines of evidence suggest that DATASIM may well prove to be a reliable and productive tool for conducting statistical simulation research.
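DATASIM is a dedicated program, but the kind of sampling experiment it automates can be sketched generically. The scenario below, a pooled-variance t test with the smaller group given the larger variance, is our illustrative choice of assumption violation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Empirical Type I error of the pooled-variance t test when variance
# homogeneity fails and the smaller group has the larger variance.
n_reps, alpha, rejections = 10000, 0.05, 0
for _ in range(n_reps):
    g1 = rng.normal(0, 4, size=10)    # H0 is true: both population means are 0
    g2 = rng.normal(0, 1, size=40)
    _, p = stats.ttest_ind(g1, g2, equal_var=True)
    rejections += p < alpha
print("empirical Type I error:", rejections / n_reps)   # well above .05
```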

18.
The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non‐normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann–Yuen test of equivalence, which utilizes trimmed means, with that of the previously recommended Schuirmann and Schuirmann–Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann–Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann–Welch tests, and the power of the Schuirmann–Yuen was substantially greater than that of the Schuirmann or Schuirmann–Welch tests when distributions were skewed or outliers were present. The Schuirmann–Yuen test is recommended for assessing clinical significance with normative comparisons.
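Schuirmann's two one-sided tests (TOST) logic is common to all three variants: declare equivalence only if the mean difference is significantly above the lower margin and significantly below the upper margin. A sketch of the Welch variant follows; the recommended Schuirmann–Yuen version additionally substitutes trimmed means and Winsorized variances, which we omit:

```python
import numpy as np
from scipy import stats

def tost_welch(x, y, delta):
    """Schuirmann's two one-sided tests, Welch variant: equivalence is
    declared when both one-sided tests reject at the chosen alpha, i.e.
    when max(p1, p2) < alpha. delta is the equivalence margin."""
    vx, vy = np.var(x, ddof=1), np.var(y, ddof=1)
    nx, ny = len(x), len(y)
    d = np.mean(x) - np.mean(y)
    se = np.sqrt(vx / nx + vy / ny)
    # Welch-Satterthwaite degrees of freedom.
    df = se ** 4 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    p_lower = stats.t.sf((d + delta) / se, df)    # H0: diff <= -delta
    p_upper = stats.t.cdf((d - delta) / se, df)   # H0: diff >= +delta
    return max(p_lower, p_upper)

rng = np.random.default_rng(9)
x = rng.normal(0.0, 1.0, size=30)
y = rng.normal(0.1, 2.0, size=30)    # heteroscedastic groups
print("TOST p:", round(tost_welch(x, y, delta=0.8), 4))
```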

19.
We discuss the statistical testing of three relevant hypotheses involving Cronbach's alpha: one where alpha equals a particular criterion; a second testing the equality of two alpha coefficients for independent samples; and a third testing the equality of two alpha coefficients for dependent samples. For each of these hypotheses, various statistical tests have been proposed. Over the years, these tests have depended on progressively fewer assumptions. We propose a new approach to testing the three hypotheses that relies on even fewer assumptions, is especially suited for discrete item scores, and can be applied easily to tests containing large numbers of items. The new approach uses marginal modelling. We compared the Type I error rate and the power of the marginal modelling approach to several of the available tests in a simulation study using realistic conditions. We found that the marginal modelling approach had the most accurate Type I error rates, whereas the power was similar across the statistical tests.
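For context, Cronbach's alpha for k items is alpha = k/(k−1) × (1 − sum of item variances / variance of the total score). The marginal modelling tests are beyond a few lines, but the coefficient itself is short to compute; the parallel-items data below are simulated for illustration:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_persons x k_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Six hypothetical parallel items: a common true score plus unit-variance noise.
rng = np.random.default_rng(2)
true_score = rng.normal(size=(200, 1))
items = true_score + rng.normal(size=(200, 6))
print(round(cronbach_alpha(items), 3))   # near 6*.5/(1+5*.5) = .857
```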

20.
Random effects meta‐regression is a technique to synthesize results of multiple studies. It allows for a test of an overall effect, as well as for tests of effects of study characteristics, that is, (discrete or continuous) moderator effects. We describe various procedures to test moderator effects: the z, t, likelihood ratio (LR), Bartlett‐corrected LR (BcLR), and resampling tests. We compare the Type I error of these tests, and conclude that the common z test, and to a lesser extent the LR test, do not perform well since they may yield Type I error rates appreciably larger than the chosen alpha. The error rate of the resampling test is accurate, closely followed by the BcLR test. The error rate of the t test is less accurate but arguably tolerable. With respect to statistical power, the BcLR and t tests slightly outperform the resampling test. Therefore, our recommendation is to use either the resampling or the BcLR test. If these statistics are unavailable, then the t test should be used since it is certainly superior to the z test.
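The z and t moderator tests differ only in the reference distribution used for the same Wald-type statistic. A sketch for a single continuous moderator follows; for simplicity it treats τ² as known, whereas in practice τ² must be estimated, the usual source of the z test's liberal behaviour:

```python
import numpy as np
from scipy import stats

def moderator_test(y, v, x, tau2):
    """Wald-type test of one moderator in random-effects meta-regression
    (weights 1/(v_i + tau2)). Returns the z p-value and the t p-value with
    k - 2 degrees of freedom for the same statistic."""
    w = 1.0 / (v + tau2)
    X = np.column_stack([np.ones_like(x), x])
    WX = X * w[:, None]
    cov = np.linalg.inv(X.T @ WX)          # (X'WX)^-1
    beta = cov @ (WX.T @ y)                # weighted least squares
    z = beta[1] / np.sqrt(cov[1, 1])
    k = len(y)
    return 2 * stats.norm.sf(abs(z)), 2 * stats.t.sf(abs(z), k - 2)

rng = np.random.default_rng(4)
k = 12
x = rng.normal(size=k)                     # study-level moderator
v = np.full(k, 0.04)                       # within-study sampling variances
y = rng.normal(0, np.sqrt(v + 0.02))       # no true moderator effect
p_z, p_t = moderator_test(y, v, x, tau2=0.02)
print(f"z test p = {p_z:.3f}, t test p = {p_t:.3f}")   # the t p-value is larger
```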
