Similar Literature
20 similar records found (search time: 31 ms)
1.
When the underlying variances are unknown and/or unequal, using the conventional F test is problematic in the two‐factor hierarchical data structure. Prompted by the approximate test statistics (Welch and Alexander–Govern methods), the authors develop four new heterogeneous test statistics to test factor A and factor B nested within A for the unbalanced fixed‐effect two‐stage nested design under variance heterogeneity. The actual significance levels and statistical power of the test statistics were compared in a simulation study. The results show that the proposed procedures maintain better Type I error rate control and have greater statistical power than those obtained by the conventional F test in various conditions. Therefore, the proposed test statistics are recommended in terms of robustness and easy implementation.
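For the simpler one-way (non-nested) case, heteroscedasticity-robust alternatives of the kind this abstract builds on are available off the shelf. A minimal SciPy sketch contrasting the classical F test with the Alexander–Govern test (the nested-design statistics proposed in the paper itself are not reproduced here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Three groups with equal means but strongly unequal variances
g1 = rng.normal(0, 1, 20)
g2 = rng.normal(0, 3, 10)
g3 = rng.normal(0, 9, 10)

f_res = stats.f_oneway(g1, g2, g3)          # classical F: assumes equal variances
ag_res = stats.alexandergovern(g1, g2, g3)  # heteroscedasticity-robust alternative
print(f"F test p = {f_res.pvalue:.3f}, Alexander-Govern p = {ag_res.pvalue:.3f}")
```

Under variance heterogeneity with unequal group sizes, the two p-values can disagree substantially; the abstract's simulations concern exactly this kind of discrepancy, extended to nested designs.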

2.
In the application of the analysis of variance to data obtained in educational methods experiments which involve several classes of several schools, one assumption is that of homogeneity in the variances of pupil scores from school to school. It is shown that such variances on representative educational achievement tests are heterogeneous. The effects of this heterogeneity upon the F-tests of significance commonly employed in methods experiments are investigated by comparing the actual distribution of F values for a large number of experiments involving marked heterogeneity with a theoretical distribution based on the assumption of homogeneity. Although the findings, which vary somewhat with the type of variance ratio, are not entirely conclusive, they apparently demonstrate that departure from homogeneity does not invalidate the use of the customary F-tests for evaluating results of the typical methods experiment.

3.
The validity conditions for univariate repeated measures designs are described. Attention is focused on the sphericity requirement. For a v degree of freedom family of comparisons among the repeated measures, sphericity exists when all contrasts contained in the v-dimensional space have equal variances. Under nonsphericity, upper and lower bounds on test size and power of a priori, repeated measures, F tests are derived. The effects of nonsphericity are illustrated by means of a set of charts. The charts reveal that small departures from sphericity (.97 < ε < 1.00) can seriously affect test size and power. It is recommended that separate rather than pooled error term procedures be routinely used to test a priori hypotheses. Appreciation is extended to Milton Parnes for his insightful assistance.
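Departures from sphericity are commonly quantified with the Greenhouse–Geisser estimate ε̂, which equals 1 under sphericity and drops toward its lower bound 1/(k − 1) as the contrast variances diverge. A sketch (the function name `gg_epsilon` is ours, and a covariance matrix is assumed given):

```python
import numpy as np
from scipy.linalg import helmert

def gg_epsilon(S):
    """Greenhouse-Geisser sphericity estimate for a k x k covariance matrix S."""
    k = S.shape[0]
    C = helmert(k, full=False)       # (k-1) x k matrix of orthonormal contrasts
    M = C @ S @ C.T                  # covariance of the contrast scores
    return np.trace(M) ** 2 / ((k - 1) * np.trace(M @ M))

S_spherical = np.eye(4)
S_nonspherical = np.diag([1.0, 2.0, 4.0, 8.0])
print(gg_epsilon(S_spherical))      # 1.0: sphericity holds
print(gg_epsilon(S_nonspherical))   # < 1: departure from sphericity
```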

4.
The variable criteria sequential stopping rule (vcSSR) is an efficient way to add sample size to planned ANOVA tests while holding the observed rate of Type I errors, αo, constant. The only difference from regular null hypothesis testing is that criteria for stopping the experiment are obtained from a table based on the desired power, rate of Type I errors, and beginning sample size. The vcSSR was developed using between-subjects ANOVAs, but it should work with p values from any type of F test. In the present study, the αo remained constant at the nominal level when using the previously published table of criteria with repeated measures designs with various numbers of treatments per subject, Type I error rates, values of ρ, and four different sample size models. New power curves allow researchers to select the optimal sample size model for a repeated measures experiment. The criteria held αo constant either when used with a multiple correlation that varied the sample size model and the number of predictor variables, or when used with MANOVA with multiple groups and two levels of a within-subject variable at various levels of ρ. Although not recommended for use with χ² tests such as the Friedman rank ANOVA test, the vcSSR produces predictable results based on the relation between F and χ². Together, the data confirm the view that the vcSSR can be used to control Type I errors during sequential sampling with any t- or F-statistic rather than being restricted to certain ANOVA designs.

5.
Inconsistencies in the research findings on F-test robustness to variance heterogeneity could be related to the lack of a standard criterion to assess robustness or to the different measures used to quantify heterogeneity. In the present paper we use Monte Carlo simulation to systematically examine the Type I error rate of F-test under heterogeneity. One-way, balanced, and unbalanced designs with monotonic patterns of variance were considered. Variance ratio (VR) was used as a measure of heterogeneity (1.5, 1.6, 1.7, 1.8, 2, 3, 5, and 9), the coefficient of sample size variation as a measure of inequality between group sizes (0.16, 0.33, and 0.50), and the correlation between variance and group size as an indicator of the pairing between them (1, .50, 0, −.50, and −1). Overall, the results suggest that in terms of Type I error a VR above 1.5 may be established as a rule of thumb for considering a potential threat to F-test robustness under heterogeneity with unequal sample sizes.
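The three heterogeneity indicators manipulated in this simulation are straightforward to compute for one's own data; a sketch (variable names are ours, the VR > 1.5 threshold is the abstract's rule of thumb):

```python
import numpy as np

variances = np.array([1.0, 2.0, 9.0])     # group variances
n = np.array([10, 20, 30])                # group sizes

vr = variances.max() / variances.min()    # variance ratio (VR)
cv_n = n.std(ddof=0) / n.mean()           # coefficient of sample-size variation
pairing = np.corrcoef(variances, n)[0, 1] # >0: larger groups have larger variances

print(f"VR = {vr:.1f}, CV(n) = {cv_n:.2f}, pairing r = {pairing:.2f}")
if vr > 1.5 and cv_n > 0:
    print("VR > 1.5 with unequal n: potential threat to F-test robustness")
```

Note that positive pairing (large groups with large variances) tends to make the F test conservative, and negative pairing tends to make it liberal, which is why the correlation is tracked separately from VR.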

6.
A great deal of educational and social data arises from cluster sampling designs where clusters involve schools, classrooms, or communities. A mistake that is sometimes encountered in the analysis of such data is to ignore the effect of clustering and analyse the data as if it were based on a simple random sample. This typically leads to an overstatement of the precision of results and too liberal conclusions about precision and statistical significance of mean differences. This paper gives simple corrections to the test statistics that would be computed in an analysis of variance if clustering were (incorrectly) ignored. The corrections are multiplicative factors depending on the total sample size, the cluster size, and the intraclass correlation structure. For example, the corrected F statistic has Fisher's F distribution with reduced degrees of freedom. The corrected statistic reduces to the F statistic computed by ignoring clustering when the intraclass correlations are zero. It reduces to the F statistic computed using cluster means when the intraclass correlations are unity, and it is in between otherwise. A similar adjustment to the usual statistic for testing a linear contrast among group means is described.
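In the simplest case of equal cluster sizes m and a common intraclass correlation ρ, a multiplicative correction of this kind is closely related to the familiar design effect 1 + (m − 1)ρ. The following is a rough sketch only, not the paper's exact correction factors or degree-of-freedom formulas (the df choice below, based on the number of clusters, is our simplifying assumption):

```python
from scipy import stats

def naive_vs_corrected_p(F_naive, df1, n_total, m, rho):
    """Roughly deflate an F statistic that ignored clustering.
    m: common cluster size, rho: intraclass correlation.
    Sketch only: the paper's factors and df adjustments are more refined."""
    deff = 1 + (m - 1) * rho             # design effect
    F_corr = F_naive / deff              # corrected statistic (approximate)
    df2_corr = n_total / m - df1 - 1     # assumed: error df from number of clusters
    p = stats.f.sf(F_corr, df1, df2_corr)
    return F_corr, p

F_corr, p = naive_vs_corrected_p(F_naive=6.0, df1=1, n_total=200, m=20, rho=0.2)
print(F_corr, p)   # the "significant" naive F shrinks once clustering is respected
```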

7.
For one‐way fixed effects ANOVA, it is well known that the conventional F test of the equality of means is not robust to unequal variances, and numerous methods have been proposed for dealing with heteroscedasticity. On the basis of extensive empirical evidence of Type I error control and power performance, Welch's procedure is frequently recommended as the major alternative to the ANOVA F test under variance heterogeneity. To enhance its practical usefulness, this paper considers an important aspect of Welch's method in determining the sample size necessary to achieve a given power. Simulation studies are conducted to compare two approximate power functions of Welch's test for their accuracy in sample size calculations over a wide variety of model configurations with heteroscedastic structures. The numerical investigations show that Levy's (1978a) approach is clearly more accurate than the formula of Luh and Guo (2011) for the range of model specifications considered here. Accordingly, computer programs are provided to implement the technique recommended by Levy for power calculation and sample size determination within the context of the one‐way heteroscedastic ANOVA model.

8.
The data obtained from one‐way independent groups designs are typically non‐normal in form and rarely equally variable across treatment populations (i.e. population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e. the analysis of variance F test) typically provides invalid results (e.g. too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non‐normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied either to the usual least squares estimators of central tendency and variability, or the Welch test with robust estimators (i.e. trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non‐normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non‐normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non‐normal.
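The robust estimators mentioned here, trimmed means and Winsorized variances, are available in SciPy. A minimal illustration (the 20% trimming level is a common choice, not prescribed by the abstract):

```python
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize

x = np.array([2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0, 40.0])  # one gross outlier

mean, var = x.mean(), x.var(ddof=1)
tmean = stats.trim_mean(x, 0.2)       # drop the lowest and highest 20% before averaging
w = np.asarray(winsorize(x, limits=(0.2, 0.2)))  # clamp the tails instead of dropping
wvar = w.var(ddof=1)                  # Winsorized variance: far less outlier-driven

print(f"mean={mean:.2f} trimmed mean={tmean:.2f}")
print(f"variance={var:.2f} Winsorized variance={wvar:.2f}")
```

In a robust Welch-type test the two estimators are used together (trimmed means in the numerator, Winsorized variances in the standard error), as in Yuen's procedure.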

9.
In connection with a least-squares solution for fitting one matrix, A, to another, B, under optimal choice of a rigid motion and a dilation, Schönemann and Carroll suggested two measures of fit: a raw measure, e, and a refined similarity measure, e_s, which is symmetric. Both measures share the weakness of depending upon the norm of the target matrix, B, e.g., e(A, kB) ≠ e(A, B) for k ≠ 1. Therefore, both measures are useless for answering questions of the type: "Does A fit B better than A fits C?". In this note two new measures of fit are suggested which do not depend upon the norms of A and B, which are (0, 1)-bounded, and which, therefore, provide meaningful answers for comparative analyses.
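The norm-dependence complaint is easy to demonstrate, and SciPy's Procrustes routine already returns a disparity that is normalized to be invariant to rescaling the target. This is an analogue of, not the note's own, bounded measures:

```python
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 2))
B = rng.normal(size=(5, 2))

def raw_misfit(A, B):
    """Raw sum-of-squares residual: depends on the norm of the target."""
    return np.sum((A - B) ** 2)

print(raw_misfit(A, 3 * B) == raw_misfit(A, B))   # False: rescaling B changes the fit

# SciPy standardizes both matrices before fitting, so its disparity is
# invariant to rescaling the target -- comparable across different targets
_, _, d1 = procrustes(A, B)
_, _, d2 = procrustes(A, 3 * B)
print(np.isclose(d1, d2))                          # True
```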

10.
Alberic of Paris put forward an argument, 'the most embarrassing of all twelfth-century arguments' according to Christopher Martin, which shows that the connexive principles contradict some other logical principles that have become deeply entrenched in our most widely accepted logical theories. Building upon some of Everett Nelson's ideas, we will show that the steps in Alberic of Paris' argument that should be rejected are precisely the ones that presuppose the validity of schemas that are nowadays taken as some of the most trivial logical truths: (A ∧ B) → A and (A ∧ B) → B, i.e. Simplification.

11.
Analysis of variance (ANOVA), the workhorse analysis of experimental designs, consists of F-tests of main effects and interactions. Yet testing, including traditional ANOVA, has been recently critiqued on a number of theoretical and practical grounds. In light of these critiques, model comparison and model selection serve as an attractive alternative. Model comparison differs from testing in that one can support a null or nested model vis-à-vis a more general alternative by penalizing more flexible models. We argue this ability to support more simple models allows for more nuanced theoretical conclusions than provided by traditional ANOVA F-tests. We provide a model comparison strategy and show how ANOVA models may be reparameterized to better address substantive questions in data analysis.

12.
Inference of variance components in linear mixed modeling (LMM) provides evidence of heterogeneity between individuals or clusters. When only nonnegative variances are allowed, there is a boundary (i.e., 0) in the variances' parameter space, and regular statistical inference procedures for such a parameter can be problematic. The goal of this article is to introduce a practically feasible permutation method to make inferences about variance components while considering the boundary issue in LMM. The permutation tests with different settings (i.e., constrained vs. unconstrained estimation, specific vs. generalized test, different ways of calculating p values, and different ways of permutation) were examined with both normal data and non-normal data. In addition, the permutation tests were compared to likelihood ratio (LR) tests with a mixture of chi-squared distributions as the reference distribution. We found that the unconstrained permutation test with the one-sided p-value approach performed better than the other permutation tests and is a useful alternative when the LR tests are not applicable. An R function is provided to facilitate the implementation of the permutation tests, and a real data example is used to illustrate the application. We hope our results will help researchers choose appropriate tests when testing variance components in LMM.

13.
We show that power and sample size tables developed by Cohen (1988, pp. 289–354, 381–389) produce incorrect estimates for factorial designs: power is underestimated, and sample size is overestimated. The source of this bias is shrinkage in the implied value of the noncentrality parameter, λ, caused by using Cohen's adjustment to n for factorial designs (pp. 365 and 396). The adjustment was intended to compensate for differences in the actual versus presumed (by the tables) error degrees of freedom; however, more accurate estimates are obtained if the tables are used without adjustment. The problems with Cohen's procedure were discovered while testing subroutines in DATASIM 1.2 for computing power and sample size in completely randomized, randomized-blocks, and split-plot factorial designs. The subroutines give the user the ability to generate power and sample size tables that are as easy to use as Cohen's, but that eliminate the conservative bias of his tables. We also implemented several improvements relative to "manual" use of Cohen's tables: (1) Since the user can control the specific values of 1 − β, n, and f used on the rows and columns of the table, interpolation is never required; (2) exact as opposed to approximate solutions for the noncentral F distribution are employed; (3) solutions for factorial designs, including those with repeated measures factors, take into account the actual error degrees of freedom for the effect being tested; and (4) provision is made for the computation of power for applications involving the doubly noncentral F distribution.
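Exact power for a fixed-effects F test comes directly from the noncentral F distribution, which is what makes table-free computation possible. A sketch using SciPy (the convention λ = f²N for Cohen's effect size f is an assumption on our part; conventions for λ vary across texts, which is precisely the kind of discrepancy the abstract describes):

```python
from scipy import stats

def anova_power(f, df1, df2, n_per_cell, n_cells, alpha=0.05):
    """Power of a fixed-effects F test given Cohen's effect size f.
    Assumes noncentrality lambda = f^2 * N with N the total sample size."""
    N = n_per_cell * n_cells
    lam = f ** 2 * N                          # noncentrality parameter
    f_crit = stats.f.ppf(1 - alpha, df1, df2) # critical value under H0
    return stats.ncf.sf(f_crit, df1, df2, lam)

# e.g. a 2x3 factorial with n = 20 per cell, testing the 3-level factor (df1 = 2)
power = anova_power(f=0.25, df1=2, df2=120 - 6, n_per_cell=20, n_cells=6)
print(round(power, 3))
```

Because df2 here is the actual error df of the factorial design, no adjustment to n is needed, in line with point (3) above.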

14.
Research problems that require a non‐parametric analysis of multifactor designs with repeated measures arise in the behavioural sciences. There is, however, a lack of available procedures in commonly used statistical packages. In the present study, a generalization of the aligned rank test for the two‐way interaction is proposed for the analysis of the typical sources of variation in a three‐way analysis of variance (ANOVA) with repeated measures. It can be implemented in the usual statistical packages. Its statistical properties are tested by using simulation methods with two sample sizes (n = 30 and n = 10) and three distributions (normal, exponential and double exponential). Results indicate substantial increases in power for non‐normal distributions in comparison with the usual parametric tests. Similar levels of Type I error for both parametric and aligned rank ANOVA were obtained with non‐normal distributions and large sample sizes. Degrees‐of‐freedom adjustments for Type I error control in small samples are proposed. The procedure is applied to a case study with 30 participants per group where it detects gender differences in linguistic abilities in blind children not shown previously by other methods.

15.
The specification of sample size is an important aspect of the planning of every experiment. When the investigator intends to use the techniques of analysis of variance in the study of treatment effects, he should, in specifying sample size, take into consideration the power of the F tests which will be made. The charts presented in this paper make possible a simple and direct estimate of the sample size required for F tests of specified power.

16.
This note explains an error in Restall's 'Simplified Semantics for Relevant Logics (and some of their rivals)' (Restall, J Philos Logic 22(5):481–511, 1993) concerning the modelling conditions for the axioms of assertion A → ((A → B) → B) (there called c6) and permutation (A → (B → C)) → (B → (A → C)) (there called c7). We show that the modelling conditions for assertion and permutation proposed in 'Simplified Semantics' overgenerate. In fact, they overgenerate so badly that the proposed semantics for the relevant logic R validate the rule of disjunctive syllogism. The semantics provides for no models of R in which the "base point" is inconsistent. This problem is not restricted to 'Simplified Semantics.' The techniques of that paper are used in Graham Priest's textbook An Introduction to Non-Classical Logic (Priest, 2001), which is in wide circulation: it is important to find a solution. In this article, we explain this result, diagnose the mistake in 'Simplified Semantics' and propose two different corrections.

17.
The paper shows how multiple comparison procedures for repeated measures means employing a pooled estimate of error variance must conform to the sphericity assumptions of the design in order to provide a valid test. Since it is highly unlikely that behavioral science data will satisfy this condition, the paper presents a test statistic that, depending upon the design, will provide either an exact or robust test and is generalizable to designs containing any number of repeated factors. Finally, various critical values are enumerated to limit the joint level of significance at α.

18.
When a meta-analysis on results from experimental studies is conducted, differences in the study design must be taken into consideration. A method for combining results across independent-groups and repeated measures designs is described, and the conditions under which such an analysis is appropriate are discussed. Combining results across designs requires that (a) all effect sizes be transformed into a common metric, (b) effect sizes from each design estimate the same treatment effect, and (c) meta-analysis procedures use design-specific estimates of sampling variance to reflect the precision of the effect size estimates.
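Step (a), the common metric, is the crux. One frequently used conversion rescales a repeated-measures standardized mean change into the independent-groups metric using the pre-post correlation r, via d_IG = d_RM·√(2(1 − r)). A sketch (this formula is one published convention; verify it matches your effect-size definitions, and the sampling variances below are illustrative placeholders, not design-specific estimates):

```python
import numpy as np

def rm_to_ig_d(d_rm, r):
    """Convert a repeated-measures d (change-score metric) to the
    independent-groups metric, given pre-post correlation r.
    One common convention -- check against your effect-size definitions."""
    return d_rm * np.sqrt(2 * (1 - r))

def weighted_mean(effects, variances):
    """Inverse-variance weighted mean: more precise estimates count more."""
    w = 1 / np.asarray(variances)
    return np.sum(w * np.asarray(effects)) / np.sum(w)

d_ig = np.array([0.40, 0.55])                    # from independent-groups studies
d_from_rm = rm_to_ig_d(np.array([0.90]), r=0.7)  # repeated-measures study, r = .7
effects = np.concatenate([d_ig, d_from_rm])
est = weighted_mean(effects, variances=[0.04, 0.05, 0.03])  # placeholder variances
print(round(est, 3))
```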

19.
Man-Wai Chan, Psychometrika, 1964, 29(3), 233–240
A model for an experimental design involving a complete set of Latin squares for testing the homogeneity of treatment effects was constructed and analyzed by Gourlay. In his analysis, however, if one or both of the preliminary F-tests are significant, the analysis cannot differentiate. He then suggests the use of a less desirable test which is biased and has fewer degrees of freedom, regardless of the number of replications (the d.f. cannot be increased by increasing the replications). Further, when heterogeneity of variance occurs, Gourlay's test procedures are in general invalid. The present paper reviews Gourlay's analysis and proposes a modified test procedure. My thanks are due to Drs. R. S. Hirsch, R. L. Erdmann, and R. M. Simons of IBM Corporation for many stimulating discussions, and to Professor D. Teichroew of Stanford University for permission to refer to his paper [6] and his assistance. I also wish to thank B. A. Snyder for correcting many linguistic mistakes.

20.
I compared the randomization/permutation test and the F test for a two-cell comparative experiment. I varied (1) the number of observations per cell, (2) the size of the treatment effect, (3) the shape of the underlying distribution of error and, (4) for cases with skewed error, whether or not the skew was correlated with the treatment. With normal error, there was little difference between the tests. When error was skewed, by contrast, the randomization test was more sensitive than the F test, and if the amount of skew was correlated with the treatment, the advantage for the randomization test was both large and positively correlated with the treatment. I conclude that, because the randomization test was never less powerful than the F test, it should replace the F test in routine work.
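A two-cell randomization test is short to implement: re-label observations at random and count how often the shuffled mean difference is at least as extreme as the observed one. A sketch with skewed (exponential) error, as in the scenario described:

```python
import numpy as np
from scipy import stats

def randomization_test(a, b, n_perm=10_000, seed=0):
    """Two-sample randomization test on the absolute difference in means."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([a, b])
    observed = abs(a.mean() - b.mean())
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        count += abs(perm[:len(a)].mean() - perm[len(a):].mean()) >= observed
    return (count + 1) / (n_perm + 1)   # add-one p-value: never exactly zero

rng = np.random.default_rng(42)
a = rng.exponential(1.0, 15)          # skewed error
b = rng.exponential(1.0, 15) + 1.0    # shifted treatment group
p_rand = randomization_test(a, b)
p_t = stats.ttest_ind(a, b).pvalue    # parametric comparison (t, i.e. two-cell F)
print(f"randomization p = {p_rand:.4f}, t test p = {p_t:.4f}")
```

The randomization p-value needs no distributional assumption about the error, which is why it holds up under the skew conditions the abstract simulates.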
