Similar Articles
20 similar articles found.
1.
L. V. Jones and J. W. Tukey (2000) pointed out that the usual 2-sided, equal-tails null hypothesis test at level alpha can be reinterpreted as simultaneous tests of 2 directional inequality hypotheses, each at level alpha/2, and that the maximum probability of a Type I error is alpha/2 if the truth of the null hypothesis is considered impossible. This article points out that in multiple testing with familywise error rate controlled at alpha, the directional error rate (assuming all null hypotheses are false) is greater than alpha/2 and can be arbitrarily close to alpha. Single-step, step-down, and step-up procedures are analyzed, and other error rates, including the false discovery rate, are discussed. Implications for confidence interval estimation and hypothesis testing practices are considered.

2.
Bayes factor approaches for testing interval null hypotheses
Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing the evidence for 2 hypotheses against one another. One issue in hypothesis testing is that constraints may hold only approximately rather than exactly, and the reason for small deviations may be trivial or uninteresting. In the large-sample limit, these uninteresting, small deviations lead to the rejection of a useful constraint. In this article, we develop several Bayes factor 1-sample tests for the assessment of approximate equality and ordinal constraints. In these tests, the null hypothesis covers a small interval of non-zero but negligible effect sizes around 0. These Bayes factors are alternatives to previously developed Bayes factors, which do not allow for interval null hypotheses, and may especially prove useful to researchers who use statistical equivalence testing. To facilitate adoption of these Bayes factor tests, we provide easy-to-use software.

3.
The log-linear model for contingency tables expresses the logarithm of a cell frequency as an additive function of main effects, interactions, etc., in a way formally identical with an analysis of variance model. Exact statistical tests are developed to test hypotheses that specific effects or sets of effects are zero, yielding procedures for exploring relationships among qualitative variables which are suitable for small samples. The tests are analogous to Fisher's exact test for a 2 × 2 contingency table. Given a hypothesis, the exact probability of the obtained table is determined, conditional on fixed marginals or other functions of the cell frequencies. The sum of the probabilities of the obtained table and of all less probable ones is the exact probability to be considered in testing the null hypothesis. Procedures for obtaining exact probabilities are explained in detail, with examples given.
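The exact-probability rule described above can be illustrated in its simplest case, a 2 × 2 table with fixed margins. The function below is an illustrative stdlib-only sketch (with hypothetical counts), not the paper's procedures for general log-linear hypotheses: it enumerates all tables consistent with the margins and sums the probabilities of the observed table and every table no more probable.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Exact two-sided p-value for the 2x2 table [[a, b], [c, d]],
    conditioning on the row and column margins (hypergeometric)."""
    r1, r2 = a + b, c + d              # row totals (fixed)
    c1 = a + c                         # first column total (fixed)
    n = r1 + r2
    denom = comb(n, c1)

    def p_table(x):                    # P(first cell = x) given the margins
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = p_table(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible first-cell counts
    # Sum the probabilities of the obtained table and of all tables
    # no more probable -- the rule described in the abstract.
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs + 1e-12)

print(fisher_exact_2x2(8, 2, 1, 5))  # ≈ 0.035
```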

4.
Issues involved in the evaluation of null hypotheses are discussed. The use of equivalence testing is recommended as a possible alternative to the use of simple t or F tests for evaluating a null hypothesis. When statistical power is low and larger sample sizes are not available or practical, consideration should be given to using one-tailed tests or less conservative levels for determining criterion levels of statistical significance. Effect sizes should always be reported along with significance levels, as both are needed to understand results of research. Probabilities alone are not enough and are especially problematic for very large or very small samples. Pre-existing group differences should be tested and properly accounted for when comparing independent groups on dependent variables. If confirmation of a null hypothesis is expected, potential suppressor variables should be considered. If different methods are used to select the samples to be compared, controls for social desirability bias should be implemented. When researchers deviate from these standards or appear to assume that such standards are unimportant or irrelevant, their results should be deemed less credible than when such standards are maintained and followed. Several examples of recent violations of such standards in family social science, comparing gay, lesbian, bisexual, and transgender families with heterosexual families, are provided. Regardless of their political values or expectations, researchers should strive to test null hypotheses rigorously, in accordance with the best professional standards.

5.
Tryon WW, Lewis C. Psychological Methods, 2008, 13(3): 272-277
Evidence of group matching frequently takes the form of a nonsignificant test of statistical difference. Theoretical hypotheses of no difference are also tested in this way. These practices are flawed in that null hypothesis statistical testing provides evidence against the null hypothesis, and failing to reject H0 is not evidence supportive of it. Tests of statistical equivalence are needed. This article corrects the inferential confidence interval (ICI) reduction factor introduced by W. W. Tryon (2001) and uses it to extend his discussion of statistical equivalence. This method is shown to be algebraically equivalent to D. J. Schuirmann's (1987) use of 2 one-sided t tests, a highly regarded and accepted method of testing for statistical equivalence. The ICI method provides an intuitive graphic method for inferring statistical difference as well as equivalence. Trivial difference occurs when a test of difference and a test of equivalence are both passed. Statistical indeterminacy results when both tests are failed. Hybrid confidence intervals are introduced that impose ICI limits on standard confidence intervals. These intervals are recommended as replacements for error bars because they facilitate inferences.
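Schuirmann's two one-sided tests procedure referenced above can be sketched for the one-sample case. This is an illustrative sketch (the equivalence bounds and the data are hypothetical), not the article's ICI implementation: the mean is declared equivalent to 0 within [low, high] only if both one-sided t tests reject.

```python
import numpy as np
from scipy import stats

def tost_one_sample(x, low, high, alpha=0.05):
    """Schuirmann-style two one-sided t tests (TOST) for one sample:
    conclude equivalence to 0 within [low, high] only if BOTH
    one-sided tests reject at level alpha."""
    x = np.asarray(x, dtype=float)
    n = x.size
    se = x.std(ddof=1) / np.sqrt(n)
    df = n - 1
    p_lower = stats.t.sf((x.mean() - low) / se, df)    # H0: mu <= low
    p_upper = stats.t.cdf((x.mean() - high) / se, df)  # H0: mu >= high
    p = max(p_lower, p_upper)                          # TOST p-value
    return p, p < alpha
```

With data tightly clustered near 0, equivalence is declared for wide bounds but not for bounds much narrower than the standard error.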

6.
Significance testing of null hypotheses is the standard epistemological method for advancing scientific knowledge in psychology, even though it has drawbacks and leads to common inferential mistakes. These mistakes include accepting the null hypothesis when it fails to be rejected, automatically interpreting rejected null hypotheses as theoretically meaningful, and failing to consider the likelihood of Type II errors. Although these mistakes have been discussed repeatedly for decades, there is no evidence that the academic discussion has had an impact. A group of methodologists is proposing a new approach: simply ban significance tests in psychology journals. The impact of a similar ban in public-health and epidemiology journals is reported.

7.
Valid use of the traditional independent samples ANOVA procedure requires that the population variances are equal. Previous research has investigated whether variance homogeneity tests, such as Levene's test, are satisfactory as gatekeepers for identifying when to use or not to use the ANOVA procedure. This research focuses on a novel homogeneity of variance test that incorporates an equivalence testing approach. Instead of testing the null hypothesis that the variances are equal against an alternative hypothesis that the variances are not equal, the equivalence-based test evaluates the null hypothesis that the difference in the variances falls outside or on the border of a predetermined interval against an alternative hypothesis that the difference in the variances falls within the predetermined interval. Thus, with the equivalence-based procedure, the alternative hypothesis is aligned with the research hypothesis (variance equality). A simulation study demonstrated that the equivalence-based test of population variance homogeneity is a better gatekeeper for the ANOVA than traditional homogeneity of variance tests.

8.
According to Bayesians, the null hypothesis significance-testing procedure is not deductively valid because it involves the retention or rejection of the null hypothesis under conditions where the posterior probability of that hypothesis is not known. Other criticisms are that this procedure is pointless and encourages imprecise hypotheses. However, according to non-Bayesians, there is no way of assigning a prior probability to the null hypothesis, and so Bayesian statistics do not work either. Consequently, no procedure has been accepted by both groups as providing a compelling reason to accept or reject hypotheses. The author aims to provide such a method. In the process, the author distinguishes between probability and epistemic estimation and argues that, although both are important in a science that is not completely deterministic, epistemic estimation is most relevant for hypothesis testing. Based on this analysis, the author proposes that hypotheses be evaluated via epistemic ratios and explores the implications of this proposal. One implication is that it is possible to encourage precise theorizing by imposing a penalty for imprecise hypotheses.

9.
Progress in science often comes from discovering invariances in relationships among variables; these invariances often correspond to null hypotheses. As is commonly known, it is not possible to state evidence for the null hypothesis in conventional significance testing. Here we highlight a Bayes factor alternative to the conventional t test that will allow researchers to express preference for either the null hypothesis or the alternative. The Bayes factor has a natural and straightforward interpretation, is based on reasonable assumptions, and has better properties than other methods of inference that have been advocated in the psychological literature. To facilitate use of the Bayes factor, we provide an easy-to-use, Web-based program that performs the necessary calculations.
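The Bayes factor described here is the JZS default Bayes factor of Rouder et al. (2009). A sketch of one common one-sample form, assuming a unit-scale Cauchy prior on effect size and evaluating the marginal likelihood by numerical integration over the prior on g (an illustration, not the authors' Web-based program):

```python
import numpy as np
from scipy import integrate

def jzs_bf01(t, n):
    """JZS Bayes factor BF01 (evidence FOR the null) for a one-sample
    t statistic with sample size n. Values > 1 favor the null
    hypothesis; values < 1 favor the alternative."""
    v = n - 1
    # Likelihood of the data under the point null (delta = 0).
    null_like = (1 + t**2 / v) ** (-(v + 1) / 2)

    def integrand(g):
        # Marginal likelihood under the alternative, integrating over
        # g with an inverse-chi-square(1) prior (equivalent to a
        # Cauchy prior on the effect size).
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))

    alt_like, _ = integrate.quad(integrand, 0, np.inf)
    return null_like / alt_like
```

A t statistic of 0 yields support for the null, while a large t statistic yields support for the alternative.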

10.
In a positive hypothesis test a person generates or examines evidence that is expected to have the property of interest if the hypothesis is correct, whereas in a negative hypothesis test a person generates or examines evidence that is not expected to have the property of interest if the hypothesis is correct. Two experiments assessed the effectiveness of positive versus negative hypothesis tests on inductive and deductive rule learning problems. In Experiment 1 problem solvers induced a rule by proposing hypotheses and selecting evidence in the eight conditions of a factorial design defined by instructions to use a positive or negative hypothesis test on each of trials 1-5, 6-10, and 11-15. Instructions to use positive tests resulted in more examples, fewer strategic hypotheses, and a higher weighted score for five types of hypotheses than instructions to use negative tests. In Experiment 2 problem solvers identified 1 of a possible 1296 correct rules in the deductive rule learning game Mastermind. When problems were classified in the 16 possible combinations of positive or negative hypothesis tests on trials 2, 3, 4, and 5 there were fewer trials to solution for positive tests on each of the four trials and fewer trials to solution with increasing positive tests. We conclude that positive hypothesis tests are generally more effective than negative hypothesis tests in both inductive and deductive rule learning. Copyright 1999 Academic Press.

11.
Psychological researchers in different fields sometimes encounter circular or directional data. Circular data are data measured in the form of angles or two-dimensional orientations. As an example, experiments investigating the development of spatial memory and the influence of visual experience on haptic orientation perception are presented. Three permutation tests are proposed for the evaluation of ordered hypotheses. The quality of the permutation tests is investigated by means of several simulation studies. The results of these studies show the expected increase in power when the permutation tests for ordered hypotheses are compared to a common non-directional test for circular data. The differences in power between the three tests for ordered alternatives are small.

12.
In a rule induction problem positive hypothesis tests select evidence that the tester expects to be an example of the correct rule if the hypothesis is correct, whereas negative hypothesis tests select evidence that the tester expects to be a nonexample if the hypothesis is correct. Previous research indicates the general effectiveness of a positive test strategy for individuals, but there has been very little research with cooperative groups. We extend the analysis of Klayman and Ha (Psychological Review, 1987) of ambiguous verification or conclusive falsification of five possible types of hypotheses by positive and negative tests by emphasizing the importance of further examples following hypothesis tests. In two experiments four-person cooperative groups solved rule induction problems by proposing a hypothesis and selecting evidence to test the hypothesis on each of four arrays on each trial. In different conditions the groups were instructed to use different combinations of positive and negative tests on the four arrays. Positive tests were more likely to lead to further examples than negative tests, and the proportion of correct hypotheses corresponded to the proportion of positive tests, in both experiments. We suggest that positive tests are more effective than negative hypothesis tests in generating further evidence, and thus in inducing the correct rule, in experimental rule induction tasks with a criterion of certainty imposed by the researcher.

13.
Randomization tests are valid alternatives to parametric tests like the t test and analysis of variance when the normality or random sampling assumptions of these tests are violated. Three SPSS programs are listed and described that will conduct approximate randomization tests for testing the null hypotheses that two or more means or distributions are the same or that two variables are independent (i.e., uncorrelated or “randomly associated”). The programs will work on both desktop and mainframe versions of SPSS. Although the SPSS programs are slower on desktop machines than software designed explicitly for randomization tests, these programs bring randomization tests into the reach of researchers who prefer the SPSS computing environment for data analysis.
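An approximate randomization test of the kind these programs implement can be sketched in plain Python (an illustration of the general approach, not the SPSS programs themselves): group labels are repeatedly reshuffled and the observed mean difference is compared against the randomization distribution.

```python
import random

def randomization_test(x, y, n_perm=10_000, seed=1):
    """Approximate randomization test of H0: the two group means are
    equal, using the absolute mean difference as the test statistic."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # random reassignment
        xs, ys = pooled[:len(x)], pooled[len(x):]
        if abs(sum(xs) / len(xs) - sum(ys) / len(ys)) >= observed:
            count += 1
    # Add-one correction so the approximate p-value is never 0.
    return (count + 1) / (n_perm + 1)
```

Identical groups yield a p-value of 1, while well-separated groups yield a small p-value.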

14.
We show how to test hypotheses for coefficient alpha in three different situations: (1) hypothesis tests of whether coefficient alpha equals a prespecified value, (2) hypothesis tests involving two statistically independent sample alphas as may arise when testing the equality of coefficient alpha across groups, and (3) hypothesis tests involving two statistically dependent sample alphas as may arise when testing the equality of alpha across time or when testing the equality of alpha for two test scores within the same sample. We illustrate how these hypotheses may be tested in a structural equation-modeling framework under the assumption of normally distributed responses and also under asymptotically distribution free assumptions. The formulas for the hypothesis tests and computer code are given for four different applied examples. Supplemental materials for this article may be downloaded from http://brm.psychonomic-journals.org/content/supplemental.
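As background for these tests, coefficient alpha itself is computed from the item variances and the variance of the total score. A stdlib-only sketch (the data layout, each inner list holding one item's scores across the same respondents, is an assumption for illustration):

```python
import statistics

def cronbach_alpha(items):
    """Coefficient alpha for k items: each inner list is one item's
    scores across the same respondents."""
    k = len(items)
    item_vars = sum(statistics.variance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]   # per respondent
    return k / (k - 1) * (1 - item_vars / statistics.variance(totals))
```

Perfectly parallel items give alpha = 1; partially consistent items give a value below 1.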

15.
Researchers often have expectations about research outcomes in the form of inequality constraints between, e.g., group means. Consider the example of researchers who investigated the effects of inducing a negative emotional state in aggressive boys. It was expected that highly aggressive boys would, on average, score higher on aggressive responses toward other peers than moderately aggressive boys, who would in turn score higher than nonaggressive boys. In most cases, null hypothesis testing is used to evaluate such hypotheses. We show, however, that hypotheses formulated using inequality constraints between the group means are generally not evaluated properly. The wrong hypothesis is tested, i.e., the null hypothesis that the group means are equal. In this article, we propose an innovative solution to the above-mentioned issues using Bayesian model selection, which we illustrate using a case study.

16.
Researchers studying the movements of the human body often encounter data measured in angles (e.g., angular displacements of joints). The evaluation of these circular data requires special statistical methods. The authors introduce a new test for the analysis of order-constrained hypotheses for circular data. Through this test, researchers can evaluate their expectations regarding the outcome of an experiment directly by representing their ideas in the form of a hypothesis containing inequality constraints. The resulting data analysis is generally more powerful than one using standard null hypothesis testing. Two examples of circular data from human movement science are presented to illustrate the use of the test. Results from a simulation study show that the test performs well.

17.
When participants are asked to respond in the same way to stimuli from different sources (e.g., auditory and visual), responses are often observed to be substantially faster when both stimuli are presented simultaneously (redundancy gain). Different models account for this effect, the two most important being race models and coactivation models. Redundancy gains consistent with the race model have an upper limit, however, which is given by the well-known race model inequality (Miller, 1982). A number of statistical tests have been proposed for testing the race model inequality in single participants and groups of participants. All of these tests use the race model as the null hypothesis, and rejection of the null hypothesis is considered evidence in favor of coactivation. We introduce a statistical test in which the race model prediction is the alternative hypothesis. This test controls the Type I error if a theory predicts that the race model prediction holds in a given experimental condition.
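Miller's (1982) race model inequality bounds the redundant-signals response-time CDF by the sum of the single-signal CDFs, F_AV(t) <= F_A(t) + F_V(t). A minimal empirical check on a grid of time points, with hypothetical response-time samples (an illustration of the inequality itself, not the proposed statistical test):

```python
def ecdf(sample, t):
    """Empirical CDF of a response-time sample at time t."""
    return sum(r <= t for r in sample) / len(sample)

def rmi_max_violation(rt_redundant, rt_a, rt_v, t_grid):
    """Largest value of F_AV(t) - F_A(t) - F_V(t) over the grid; a
    positive value means the race model inequality is violated."""
    return max(ecdf(rt_redundant, t) - ecdf(rt_a, t) - ecdf(rt_v, t)
               for t in t_grid)
```

Redundant-signal responses that are much faster than either single-signal distribution produce a positive violation.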

18.
We propose a simple modification of Hochberg's step‐up Bonferroni procedure for multiple tests of significance. The proposed procedure is always more powerful than Hochberg's procedure for more than two tests, and is more powerful than Hommel's procedure for three and four tests. A numerical analysis of the new procedure indicates that its Type I error is controlled under independence of the test statistics, at a level equal to or just below the nominal Type I error. Examination of various non‐null configurations of hypotheses shows that the modified procedure has a power advantage over Hochberg's procedure which increases in relationship to the number of false hypotheses.
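As a baseline for the proposed modification, the unmodified Hochberg step-up procedure can be sketched as follows (a sketch of the standard procedure, not the new one): the k-th largest p-value is compared with alpha / k, and once one comparison succeeds, that hypothesis and all hypotheses with smaller p-values are rejected.

```python
def hochberg(pvalues, alpha=0.05):
    """Hochberg's step-up procedure: returns a rejection decision for
    each hypothesis, controlling the familywise error rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    reject = [False] * m
    for k, i in enumerate(order, start=1):   # k = 1 for the largest p
        if pvalues[i] <= alpha / k:
            # Reject this hypothesis and all with smaller p-values.
            for j in order[k - 1:]:
                reject[j] = True
            break
    return reject
```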

19.
Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for zero intercept and unit slope (or simply for unit slope when regression is forced through the origin). This paper shows that this approach renders highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and on subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a re-analysis of published data from a signal detection theory experiment with which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradicts the conclusions reached through inadequate application of regression analyses.

20.
When psychologists test a commonsense (CS) hypothesis and obtain no support, they tend to erroneously conclude that the CS belief is wrong. In many such cases it appears, after many years, that the CS hypothesis was valid after all. It is argued that this error of accepting the "theoretical" null hypothesis reflects confusion between the operationalized hypothesis and the theory or generalization that it is designed to test. That is, on the basis of reliable null data one can accept the operationalized null hypothesis (e.g., "A measure of attitude x is not correlated with a measure of behavior y"). In contrast, one cannot generalize from the findings and accept the abstract or theoretical null (e.g., "We know that attitudes do not predict behavior"). The practice of accepting the theoretical null hypothesis hampers research and reduces the trust of the public in psychological research.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号