Similar Literature
20 similar articles found (search time: 15 ms)
1.
A. J. Riopelle (2003) has eloquently demonstrated that the null hypothesis assessed by the t test involves not only mean differences but also error in the estimation of the within-group standard deviation, s. He is correct in his conclusion that the precision of the interpretation of a significant t and the null hypothesis tested is complex, particularly when sample sizes are small. In this article, the author expands on Riopelle's thoughts by comparing t with some equivalent or closely related tests that make the reliance of t on the accurate estimation of error perhaps more salient and by providing a simulation that may address more directly the magnitude of the interpretational problem.

2.
L. V. Jones and J. W. Tukey (2000) pointed out that the usual 2-sided, equal-tails null hypothesis test at level alpha can be reinterpreted as simultaneous tests of 2 directional inequality hypotheses, each at level alpha/2, and that the maximum probability of a Type I error is alpha/2 if the truth of the null hypothesis is considered impossible. This article points out that in multiple testing with familywise error rate controlled at alpha, the directional error rate (assuming all null hypotheses are false) is greater than alpha/2 and can be arbitrarily close to alpha. Single-step, step-down, and step-up procedures are analyzed, and other error rates, including the false discovery rate, are discussed. Implications for confidence interval estimation and hypothesis testing practices are considered.

3.
The author compared simulations of the "true" null hypothesis (z) test, in which sigma was known and fixed, with the t test, in which s, an estimate of sigma, was calculated from the sample because the t test was used to emulate the "true" test. The true null hypothesis test bears exclusively on calculating the probability that a sample distance (mean) is larger than a specified value. The results showed that the value of t was sensitive to sampling fluctuations in both distance and standard error. Large values of t reflect small standard errors when n is small. The value of t achieves sensitivity primarily to distance only when the sample sizes are large. One cannot make a definitive statement about the probability or "significance" of a distance solely on the basis of the value of t.
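A minimal simulation sketch (not the author's original code) of the contrast described above: with sigma known, the z statistic tracks only the sample distance, whereas with sigma estimated from a small sample, extreme t values are often produced by an unusually small sample standard deviation rather than a large distance. The sample size, replication count, and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 1.0, 5, 100_000

x = rng.normal(0.0, sigma, size=(reps, n))
means = x.mean(axis=1)                     # the sample "distance"
sds = x.std(axis=1, ddof=1)                # s, the estimate of sigma

z = means / (sigma / np.sqrt(n))           # "true" test: sigma known and fixed
t = means / (sds / np.sqrt(n))             # t test: sigma estimated from the sample

extreme_t = np.abs(t) > 2.776              # two-tailed .05 critical value, df = 4
extreme_z = np.abs(z) > 1.960

print("mean sample SD overall:         ", round(sds.mean(), 3))
print("mean sample SD when |t| extreme:", round(sds[extreme_t].mean(), 3))
print("mean |distance| when |t| extreme:", round(np.abs(means[extreme_t]).mean(), 3))
print("mean |distance| when |z| extreme:", round(np.abs(means[extreme_z]).mean(), 3))
```

With n = 5, the extreme t values are associated with noticeably smaller sample SDs than average, while extreme z values can only arise from large distances.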

4.
The conventional procedure for null hypothesis significance testing has long been the target of appropriate criticism. A more reasonable alternative is proposed, one that not only avoids the unrealistic postulation of a null hypothesis but also, for a given parametric difference and a given error probability, is more likely to report the detection of that difference.

5.
According to Bayesians, the null hypothesis significance-testing procedure is not deductively valid because it involves the retention or rejection of the null hypothesis under conditions where the posterior probability of that hypothesis is not known. Other criticisms are that this procedure is pointless and encourages imprecise hypotheses. However, according to non-Bayesians, there is no way of assigning a prior probability to the null hypothesis, and so Bayesian statistics do not work either. Consequently, no procedure has been accepted by both groups as providing a compelling reason to accept or reject hypotheses. The author aims to provide such a method. In the process, the author distinguishes between probability and epistemic estimation and argues that, although both are important in a science that is not completely deterministic, epistemic estimation is most relevant for hypothesis testing. Based on this analysis, the author proposes that hypotheses be evaluated via epistemic ratios and explores the implications of this proposal. One implication is that it is possible to encourage precise theorizing by imposing a penalty for imprecise hypotheses.

6.
When psychologists test a commonsense (CS) hypothesis and obtain no support, they tend to erroneously conclude that the CS belief is wrong. In many such cases it appears, after many years, that the CS hypothesis was valid after all. It is argued that this error of accepting the "theoretical" null hypothesis reflects confusion between the operationalized hypothesis and the theory or generalization that it is designed to test. That is, on the basis of reliable null data one can accept the operationalized null hypothesis (e.g., "A measure of attitude x is not correlated with a measure of behavior y"). In contrast, one cannot generalize from the findings and accept the abstract or theoretical null (e.g., "We know that attitudes do not predict behavior"). The practice of accepting the theoretical null hypothesis hampers research and reduces the trust of the public in psychological research.

7.
In his classic article on the fallacy of the null hypothesis in soft psychology [J. Consult. Clin. Psychol. 46 (1978)], Paul Meehl claimed that, in nonexperimental settings, the probability of rejecting the null hypothesis of nil group differences in favor of a directional alternative was 0.50—a value that is an order of magnitude higher than the customary Type I error rate. In a series of real data simulations, using Minnesota Multiphasic Personality Inventory-Revised (MMPI-2) data collected from more than 80,000 individuals, I found strong support for Meehl’s claim.
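A hedged illustration of Meehl's claim with synthetic data (the article itself used real MMPI-2 records, which cannot be reproduced here): when true group differences are small but essentially never exactly zero, and their sign is arbitrary relative to the researcher's directional prediction, large samples reject the nil null in the predicted direction far more often than the nominal error rate and approach 0.50. All parameter values below are made up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_per_group, n_vars, alpha = 10_000, 500, 0.05

# small, nonzero true differences with random sign stand in for the "crud factor"
true_d = rng.normal(0.0, 0.1, n_vars)

g1 = rng.normal(0.0, 1.0, (n_vars, n_per_group))
g2 = rng.normal(true_d[:, None], 1.0, (n_vars, n_per_group))

t, p = stats.ttest_ind(g2, g1, axis=1)
# one-sided test of the directional alternative "group 2 > group 1"
supported = (t > 0) & (p / 2.0 < alpha)
print("rejections in the predicted direction:", supported.mean())  # near .5, far above .025
```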

8.
We introduce a new, readily computed statistic, the counternull value of an obtained effect size, which is the nonnull magnitude of effect size that is supported by exactly the same amount of evidence as supports the null value of the effect size. In other words, if the counternull value were taken as the null hypothesis, the resulting p value would be the same as the obtained p value for the actual null hypothesis. Reporting the counternull, in addition to the p value, virtually eliminates two common errors: (a) equating failure to reject the null with the estimation of the effect size as equal to zero, and (b) taking the rejection of a null hypothesis on the basis of a significant p value to imply a scientifically important finding. In many common situations with a one-degree-of-freedom effect size, the value of the counternull is simply twice the magnitude of the obtained effect size, but the counternull is defined in general, even with multidegree-of-freedom effect sizes, and therefore can be applied when a confidence interval cannot be. The use of the counternull can be especially useful in meta-analyses when evaluating the scientific importance of summary effect sizes.
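The one-degree-of-freedom case is easy to verify numerically. The sketch below is not from the article; it uses known-variance z machinery for a single standardized mean, and the sample size and observed effect size are invented.

```python
import numpy as np
from scipy import stats

n = 25
d_obs = 0.30                      # observed standardized effect size
se = 1.0 / np.sqrt(n)             # SE of a standardized mean, sigma assumed known

# p value against the usual null (effect = 0)
p_null = 2 * stats.norm.sf(abs(d_obs - 0.0) / se)

# counternull: twice the obtained effect size in this one-df case
counternull = 2 * d_obs
p_counternull = 2 * stats.norm.sf(abs(d_obs - counternull) / se)

print(p_null, p_counternull)      # identical: 0 and 2*d_obs are equally supported
```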

9.
Tryon WW, Lewis C. Psychological Methods, 2008, 13(3): 272-277
Evidence of group matching frequently takes the form of a nonsignificant test of statistical difference. Theoretical hypotheses of no difference are also tested in this way. These practices are flawed in that null hypothesis statistical testing provides evidence against the null hypothesis, and failing to reject H0 is not evidence supportive of it. Tests of statistical equivalence are needed. This article corrects the inferential confidence interval (ICI) reduction factor introduced by W. W. Tryon (2001) and uses it to extend his discussion of statistical equivalence. This method is shown to be algebraically equivalent to D. J. Schuirmann's (1987) use of 2 one-sided t tests, a highly regarded and accepted method of testing for statistical equivalence. The ICI method provides an intuitive graphic method for inferring statistical difference as well as equivalence. Trivial difference occurs when a test of difference and a test of equivalence are both passed. Statistical indeterminacy results when both tests are failed. Hybrid confidence intervals are introduced that impose ICI limits on standard confidence intervals. These intervals are recommended as replacements for error bars because they facilitate inferences.
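A minimal sketch of Schuirmann's (1987) two one-sided tests (TOST), the procedure the abstract reports to be algebraically equivalent to the corrected ICI method; it is not an implementation of the ICI graphics themselves. The equivalence bounds of ±0.5 raw units and the simulated data are arbitrary.

```python
import numpy as np
from scipy import stats

def tost_ind(x, y, low, high, alpha=0.05):
    """Declare equivalence if both one-sided t tests reject at level alpha."""
    nx, ny = len(x), len(y)
    diff = np.mean(x) - np.mean(y)
    sp2 = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    se = np.sqrt(sp2 * (1 / nx + 1 / ny))
    df = nx + ny - 2
    p_lower = stats.t.sf((diff - low) / se, df)    # H0: true diff <= low
    p_upper = stats.t.cdf((diff - high) / se, df)  # H0: true diff >= high
    p_tost = max(p_lower, p_upper)
    return p_tost, p_tost < alpha                  # (p value, equivalence declared?)

rng = np.random.default_rng(3)
x = rng.normal(10.0, 1.0, 40)
y = rng.normal(10.1, 1.0, 40)
print(tost_ind(x, y, low=-0.5, high=0.5))
```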

10.
Progress in science often comes from discovering invariances in relationships among variables; these invariances often correspond to null hypotheses. As is commonly known, it is not possible to state evidence for the null hypothesis in conventional significance testing. Here we highlight a Bayes factor alternative to the conventional t test that will allow researchers to express preference for either the null hypothesis or the alternative. The Bayes factor has a natural and straightforward interpretation, is based on reasonable assumptions, and has better properties than other methods of inference that have been advocated in the psychological literature. To facilitate use of the Bayes factor, we provide an easy-to-use, Web-based program that performs the necessary calculations.
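A hedged sketch of a one-sample Bayes factor for the t test in the spirit of this abstract. For simplicity it places a Normal(0, 1) prior on the standardized effect size rather than the JZS (Cauchy) prior used by the authors' Web-based program, so its values will differ from theirs; the function name, the prior standard deviation, and the example inputs are all assumptions.

```python
import numpy as np
from scipy import stats, integrate

def bf01_one_sample(t_obs, n, prior_sd=1.0):
    """Bayes factor in favor of the null: p(t | delta = 0) / p(t | H1)."""
    df = n - 1
    like_null = stats.t.pdf(t_obs, df)                       # central t under the null
    def integrand(delta):
        # noncentral t likelihood averaged over the prior on delta
        return stats.nct.pdf(t_obs, df, delta * np.sqrt(n)) * stats.norm.pdf(delta, 0.0, prior_sd)
    like_alt, _ = integrate.quad(integrand, -np.inf, np.inf)
    return like_null / like_alt

print(bf01_one_sample(t_obs=0.8, n=30))   # > 1: data favor the null
print(bf01_one_sample(t_obs=3.5, n=30))   # < 1: data favor the alternative
```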

11.
The ease with which data can be collected and analyzed via personal computer makes it potentially attractive to “peek” at the data before a target sample size is achieved. This tactic might seem appealing because data collection could be stopped early, which would save valuable resources, if a peek revealed a significant effect. Unfortunately, such data snooping comes with a cost. When the null hypothesis is true, the Type I error rate is inflated, sometimes quite substantially. If the null hypothesis is false, premature significance testing leads to inflated estimates of power and effect size. This program provides simulation results for a wide variety of premature and repeated null hypothesis testing scenarios. It gives researchers the ability to know in advance the consequences of data peeking so that appropriate corrective action can be taken.
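A small simulation in the spirit of the program described (not the program itself): the data are peeked at every 10 observations per group and collection stops at the first significant t test. With the null hypothesis true, the nominal .05 procedure rejects far more often than 5% of the time. The peek interval, maximum sample size, and replication count are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
reps, n_max, step, alpha = 5_000, 100, 10, 0.05

false_alarms = 0
for _ in range(reps):
    x = rng.normal(0.0, 1.0, n_max)
    y = rng.normal(0.0, 1.0, n_max)          # null hypothesis is true
    for n in range(step, n_max + 1, step):   # repeated interim tests
        if stats.ttest_ind(x[:n], y[:n]).pvalue < alpha:
            false_alarms += 1
            break

print("Type I error rate with peeking:", false_alarms / reps)   # well above .05
```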

12.
The author compared simulations of the “true” null hypothesis (z) test, in which σ was known and fixed, with the t test, in which s, an estimate of σ, was calculated from the sample because the t test was used to emulate the “true” test. The true null hypothesis test bears exclusively on calculating the probability that a sample distance (mean) is larger than a specified value. The results showed that the value of t was sensitive to sampling fluctuations in both distance and standard error. Large values of t reflect small standard errors when n is small. The value of t achieves sensitivity primarily to distance only when the sample sizes are large. One cannot make a definitive statement about the probability or “significance” of a distance solely on the basis of the value of t.

13.
Chen argued that the proper null hypothesis for free-choice studies examining shifts in choice was 66.7%. Sagarin and Skowronski (2009) questioned the appropriateness of this value, noting that it was based on an unwarranted assumption that subjects always choose preferred options over less preferred options. In this paper, we respond to the points raised by Chen and Risen (2009), noting that: (a) violations of an additional unwarranted assumption (perfect transitivity) also move the proper null hypothesis towards 50%; (b) the validation of pretest measures would enable researchers to estimate an upper bound on the proper null; (c) the “blind” choice methodology proposed by Sagarin and Skowronski places the null unambiguously at 50%; and (d) Sagarin and Skowronski correctly call for null-hypothesis tests where needed. In the end, we again endorse the idea that this debate is best resolved empirically, but we believe the empirical avenues available are wider than those endorsed by Chen and Risen.

14.
Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for zero intercept and unit slope (or simply for unit slope when regression is forced through the origin). This paper shows that this approach renders highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and on subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a re-analysis of published data from a signal detection theory experiment with which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradicts the conclusions reached through inadequate application of regression analyses.
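A hedged simulation of the kind of inflation the abstract reports. Two measures are generated from the same true scores, so they are equivalent by construction, yet testing the unit-slope null after regressing one on the other rejects far too often because measurement error in the predictor attenuates the slope. The noise levels, sample size, and replication count are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
reps, n, alpha = 2_000, 50, 0.05

rejections = 0
for _ in range(reps):
    true_scores = rng.normal(0.0, 1.0, n)
    x = true_scores + rng.normal(0.0, 0.5, n)   # measure 1
    y = true_scores + rng.normal(0.0, 0.5, n)   # measure 2, equivalent by construction
    res = stats.linregress(x, y)
    t_slope = (res.slope - 1.0) / res.stderr    # test H0: slope = 1
    p = 2 * stats.t.sf(abs(t_slope), n - 2)
    rejections += p < alpha

print("Type I error rate of the unit-slope test:", rejections / reps)  # far above .05
```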

15.
In the psychological literature, there are two seemingly different approaches to inference: that from estimation of posterior intervals and that from Bayes factors. We provide an overview of each method and show that a salient difference is the choice of models. The two approaches as commonly practiced can be unified with a certain model specification, now popular in the statistics literature, called spike-and-slab priors. A spike-and-slab prior is a mixture of a null model, the spike, with an effect model, the slab. The estimate of the effect size here is a function of the Bayes factor, showing that estimation and model comparison can be unified. The salient difference is that common Bayes factor approaches provide for privileged consideration of theoretically useful parameter values, such as the value corresponding to the null hypothesis, while estimation approaches do not. Both approaches, either privileging the null or not, are useful depending on the goals of the analyst.
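A toy spike-and-slab calculation (not the paper's model) for a single normal mean with known sampling variance, illustrating how the posterior probability of the slab and the model-averaged effect estimate are both functions of the Bayes factor. The prior probability of the slab, the slab width, and the example inputs are assumptions.

```python
import numpy as np
from scipy import stats

def spike_and_slab(xbar, se, slab_sd=1.0, prior_p_slab=0.5):
    # marginal likelihood of the sample mean under the spike (effect exactly 0)
    m_spike = stats.norm.pdf(xbar, 0.0, se)
    # marginal likelihood under the slab (effect ~ Normal(0, slab_sd^2))
    m_slab = stats.norm.pdf(xbar, 0.0, np.sqrt(se**2 + slab_sd**2))
    bf_slab = m_slab / m_spike
    post_p_slab = bf_slab * prior_p_slab / (bf_slab * prior_p_slab + (1 - prior_p_slab))
    # posterior mean of the effect: slab posterior mean, shrunk by P(slab | data)
    slab_post_mean = xbar * slab_sd**2 / (slab_sd**2 + se**2)
    return post_p_slab, post_p_slab * slab_post_mean

print(spike_and_slab(xbar=0.10, se=0.15))   # weak evidence: estimate shrunk toward 0
print(spike_and_slab(xbar=0.60, se=0.15))   # strong evidence: little shrinkage
```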

16.
This article concerns acceptance of the null hypothesis that one variable has no effect on another. Despite frequent opinions to the contrary, this null hypothesis can be correct in some situations. Appropriate criteria for accepting the null hypothesis are (1) that the null hypothesis is possible; (2) that the results are consistent with the null hypothesis; and (3) that the experiment was a good effort to find an effect. These criteria are consistent with the meta-rules for psychology. The good-effort criterion is subjective, which is somewhat undesirable, but the alternative—never accepting the null hypothesis—is neither desirable nor practical.

17.
This article discusses alternative procedures to the standard F-test for ANCOVA in case the covariate is measured with error. Both a functional and a structural relationship approach are described. Examples of both types of analysis are given for the simple two-group design. Several cases are discussed and special attention is given to issues of model identifiability. An approximate statistical test based on the functional relationship approach is described. On the basis of Monte Carlo simulation results it is concluded that this testing procedure is to be preferred to the conventional F-test of the ANCOVA null hypothesis. It is shown how the standard null hypothesis may be tested in a structural relationship approach. It is concluded that some knowledge of the reliability of the covariate is necessary in order to obtain meaningful results.

18.
When participants are asked to respond in the same way to stimuli from different sources (e.g., auditory and visual), responses are often observed to be substantially faster when both stimuli are presented simultaneously (redundancy gain). Different models account for this effect, the two most important being race models and coactivation models. Redundancy gains consistent with the race model have an upper limit, however, which is given by the well-known race model inequality (Miller, 1982). A number of statistical tests have been proposed for testing the race model inequality in single participants and groups of participants. All of these tests use the race model as the null hypothesis, and rejection of the null hypothesis is considered evidence in favor of coactivation. We introduce a statistical test in which the race model prediction is the alternative hypothesis. This test controls the Type I error if a theory predicts that the race model prediction holds in a given experimental condition.
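A minimal sketch of the quantity at issue, Miller's (1982) race model inequality F_AV(t) <= F_A(t) + F_V(t), evaluated on a grid of times for one simulated participant. It only computes the observed violation; the article's contribution, a test that treats the race model prediction as the alternative hypothesis, requires additional inferential machinery not shown here. The RT distributions and grid are invented.

```python
import numpy as np

def ecdf(sample, grid):
    """Empirical CDF of the RT sample evaluated at each grid point."""
    sample = np.sort(sample)
    return np.searchsorted(sample, grid, side="right") / len(sample)

rng = np.random.default_rng(2)
rt_a  = rng.normal(420, 50, 200)          # auditory-only RTs (ms), synthetic
rt_v  = rng.normal(440, 50, 200)          # visual-only RTs
rt_av = rng.normal(370, 45, 200)          # redundant-signals RTs

grid = np.linspace(250, 600, 100)
violation = ecdf(rt_av, grid) - (ecdf(rt_a, grid) + ecdf(rt_v, grid))

# positive values mean F_AV(t) exceeds the race model bound, i.e. evidence of coactivation
print("max violation of the race model bound:", violation.max())
```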

19.
Age effects may be global, not local: comment on Fisk and Rogers (1991)
A series of analyses of variance on target search times allowed Fisk and Rogers (1991) to reject the null hypothesis that age had a uniform, additive effect across search conditions. It does not, however, follow that age affected some conditions in an exceptional way, as Fisk and Rogers concluded. Age may have had a uniform but nonadditive effect across conditions. In this article, it is shown that age had a uniform linear, or perhaps slightly curvilinear, effect on search times. This "null hypothesis" adequately accounted for the age effects in all 27 search conditions. Indeed, it accounted for the age effects in 107 conditions abstracted from other visual search studies and for the age effects in 154 conditions abstracted from a miscellaneous collection of nonsearch processing-time studies. The only variation in age outcomes across studies was consistent with sampling error, given the known variance in response times. It is concluded that age is experienced as a generalized slowing of the central nervous system uniformly affecting all information processes.

20.
In this study, we introduce an interval estimation approach based on Bayesian structural equation modeling to evaluate factorial invariance. For each tested parameter, the size of noninvariance with an uncertainty interval (i.e., highest density interval [HDI]) is assessed via Bayesian parameter estimation. By comparing the most credible values (i.e., the 95% HDI) with a region of practical equivalence (ROPE), the Bayesian approach allows researchers to (1) support the null hypothesis of practical invariance, and (2) examine the practical importance of the noninvariant parameter. Compared to the traditional likelihood ratio test, simulation results suggested that the proposed Bayesian approach could offer additional insight into evaluating factorial invariance, thus leading to more informative conclusions. We provide an empirical example to demonstrate the procedures necessary to implement the proposed method in applied research. The importance of and influences on the choice of an appropriate ROPE are discussed.
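A sketch of the decision rule the abstract builds on: compare the 95% HDI of the posterior for a between-group parameter difference with a region of practical equivalence. The posterior draws are simulated here as a stand-in for draws from a Bayesian SEM fit, and the ROPE of ±0.10 is an arbitrary choice.

```python
import numpy as np

def hdi(samples, mass=0.95):
    """Shortest interval containing `mass` of the posterior draws."""
    s = np.sort(samples)
    k = int(np.ceil(mass * len(s)))
    widths = s[k - 1:] - s[:len(s) - k + 1]
    i = np.argmin(widths)
    return s[i], s[i + k - 1]

rng = np.random.default_rng(4)
diff_draws = rng.normal(0.03, 0.02, 20_000)   # posterior of a loading difference (simulated)
lo, hi = hdi(diff_draws)
rope = (-0.10, 0.10)

if rope[0] <= lo and hi <= rope[1]:
    print("HDI inside ROPE: practical invariance supported", (lo, hi))
elif hi < rope[0] or lo > rope[1]:
    print("HDI outside ROPE: practically important noninvariance", (lo, hi))
else:
    print("HDI overlaps ROPE: withhold judgment", (lo, hi))
```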
