Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Confidence intervals (CIs) for means are frequently advocated as alternatives to null hypothesis significance testing (NHST), for which a common theme in the debate is that conclusions from CIs and NHST should be mutually consistent. The authors examined a class of CIs for which the conclusions are said to be inconsistent with NHST in within-subjects designs and a class for which the conclusions are said to be consistent. The difference between them is a difference in models. In particular, the main issue is that the class for which the conclusions are said to be consistent derives from fixed-effects models with subjects fixed, not mixed models with subjects random. Offered is mixed model methodology that has been popularized in the statistical literature and statistical software procedures. Generalizations to different classes of within-subjects designs are explored, and comments on the future direction of the debate on NHST are offered.

2.
Zou GY, Psychological Methods, 2007, 12(4): 399-413
Confidence intervals are widely accepted as a preferred way to present study results. They encompass significance tests and provide an estimate of the magnitude of the effect. However, comparisons of correlations still rely heavily on significance testing. The persistence of this practice is caused primarily by the lack of simple yet accurate procedures that can maintain coverage at the nominal level in a nonlopsided manner. The purpose of this article is to present a general approach to constructing approximate confidence intervals for differences between (a) 2 independent correlations, (b) 2 overlapping correlations, (c) 2 nonoverlapping correlations, and (d) 2 independent R²s. The distinctive feature of this approach is its acknowledgment of the asymmetry of sampling distributions for single correlations. This approach requires only the availability of confidence limits for the separate correlations and, for correlated correlations, a method for taking into account the dependency between correlations. These closed-form procedures are shown by simulation studies to provide very satisfactory results in small to moderate sample sizes. The proposed approach is illustrated with worked examples.
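
For case (a), two independent correlations, the idea of building the interval for the difference from the separate confidence limits can be sketched as follows. This is a minimal illustration, not the article's own code: it assumes Fisher z limits for each correlation, and the function names `fisher_ci` and `diff_independent_correlations_ci` are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Confidence limits for one correlation via the Fisher z transformation."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    z = math.atanh(r)
    half = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)


def diff_independent_correlations_ci(r1, n1, r2, n2, level=0.95):
    """Approximate CI for r1 - r2 from two independent samples.

    Each margin is recovered from the (asymmetric) distance between an
    estimate and its own confidence limit, so the resulting interval does
    not have to be symmetric around r1 - r2.
    """
    l1, u1 = fisher_ci(r1, n1, level)
    l2, u2 = fisher_ci(r2, n2, level)
    d = r1 - r2
    lower = d - math.sqrt((r1 - l1) ** 2 + (u2 - r2) ** 2)
    upper = d + math.sqrt((u1 - r1) ** 2 + (r2 - l2) ** 2)
    return lower, upper


if __name__ == "__main__":
    print(diff_independent_correlations_ci(0.8, 100, 0.2, 100))
```

The overlapping and nonoverlapping cases would additionally need a term for the dependency between the two correlations, which this sketch does not attempt.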

3.
4.
Repeated measures designs are common in experimental psychology. Because of the correlational structure in these designs, the calculation and interpretation of confidence intervals is nontrivial. One solution was provided by Loftus and Masson (Psychonomic Bulletin & Review 1:476-490, 1994). This solution, although widely adopted, has the limitation of implying same-size confidence intervals for all factor levels, and therefore does not allow for the assessment of variance homogeneity assumptions (i.e., the circularity assumption, which is crucial for the repeated measures ANOVA). This limitation and the method's perceived complexity have sometimes led scientists to use a simplified variant, based on a per-subject normalization of the data (Bakeman & McArthur, Behavior Research Methods, Instruments, & Computers 28:584-589, 1996; Cousineau, Tutorials in Quantitative Methods for Psychology 1:42-45, 2005; Morey, Tutorials in Quantitative Methods for Psychology 4:61-64, 2008; Morrison & Weaver, Behavior Research Methods, Instruments, & Computers 27:52-56, 1995). We show that this normalization method leads to biased results and is uninformative with regard to circularity. Instead, we provide a simple, intuitive generalization of the Loftus and Masson method that allows for assessment of the circularity assumption.

5.
Interval estimates (estimates of parameters that include an allowance for sampling uncertainty) have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated samples, on average. The width of confidence intervals is thought to index the precision of an estimate; CIs are thought to be a guide to which parameter values are plausible or reasonable; and the confidence coefficient of the interval (e.g., 95%) is thought to index the plausibility that the true parameter is included in the interval. We show in a number of examples that CIs do not necessarily have any of these properties, and can lead to unjustified or arbitrary inferences. For this reason, we caution against relying upon confidence interval theory to justify interval estimates, and suggest that other theories of interval estimation should be used instead.

6.
Null hypothesis significance testing (NHST) is undoubtedly the most common inferential technique used to justify claims in the social sciences. However, even staunch defenders of NHST agree that its outcomes are often misinterpreted. Confidence intervals (CIs) have frequently been proposed as a more useful alternative to NHST, and their use is strongly encouraged in the APA Manual. Nevertheless, little is known about how researchers interpret CIs. In this study, 120 researchers and 442 students, all in the field of psychology, were asked to assess the truth value of six particular statements involving different interpretations of a CI. Although all six statements were false, both researchers and students endorsed, on average, more than three statements, indicating a gross misunderstanding of CIs. Self-declared experience with statistics was not related to researchers’ performance, and, even more surprisingly, researchers hardly outperformed the students, even though the students had not received any education on statistical inference whatsoever. Our findings suggest that many researchers do not know the correct interpretation of a CI. The misunderstandings surrounding p-values and CIs are particularly unfortunate because they constitute the main tools by which psychologists draw conclusions from data.

7.
If the model for the data is, strictly speaking, incorrect, then how can one test whether the model fits? Standard goodness-of-fit (GOF) tests rely on strictly correct or incorrect models. But in practice the correct model is not assumed to be available. It would still be of interest to determine how good or how bad the approximation is. But how can this be achieved? If it is determined that a model is a good approximation and hence a good explanation of the data, how can reliable confidence intervals be constructed? In this paper, an attempt is made to answer the above questions. Several GOF tests and methods of constructing confidence intervals are evaluated both in a simulation and with real data from the internet-based daily news memory test.

8.
We argue that to best comprehend many data sets, plotting judiciously selected sample statistics with associated confidence intervals can usefully supplement, or even replace, standard hypothesis-testing procedures. We note that most social science statistics textbooks limit discussion of confidence intervals to their use in between-subject designs. Our central purpose in this article is to describe how to compute an analogous confidence interval that can be used in within-subject designs. This confidence interval rests on the reasoning that because between-subject variance typically plays no role in statistical analyses of within-subject designs, it can legitimately be ignored; hence, an appropriate confidence interval can be based on the standard within-subject error term, that is, on the variability due to the subject × condition interaction. Computation of such a confidence interval is simple and is embodied in Equation 2 on p. 482 of this article. This confidence interval has two useful properties. First, it is based on the same error term as is the corresponding analysis of variance, and hence leads to comparable conclusions. Second, it is related by a known factor (√2) to a confidence interval of the difference between sample means; accordingly, it can be used to infer the faith one can put in some pattern of sample means as a reflection of the underlying pattern of population means. These two properties correspond to analogous properties of the more widely used between-subject confidence interval.
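
An interval based on the subject × condition interaction, as described above, might be computed along the following lines for a one-way within-subject design. This is a hedged sketch rather than a reproduction of the article's Equation 2: the function name `loftus_masson_halfwidth` is hypothetical, and the critical t value (for (n - 1)(J - 1) degrees of freedom) is assumed to be looked up by the caller because the Python standard library has no t quantile function.

```python
from math import sqrt


def loftus_masson_halfwidth(data, t_crit):
    """Half-width of a within-subject CI from the subject x condition error term.

    data   : list of rows, one row per subject, one column per condition
    t_crit : critical t value for df = (n - 1) * (J - 1), supplied by the caller
    Returns (condition_means, half_width, df).
    """
    n = len(data)        # number of subjects
    j = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * j)
    subj_means = [sum(row) / j for row in data]
    cond_means = [sum(row[c] for row in data) / n for c in range(j)]

    # Interaction sum of squares: what is left after removing the subject
    # and condition main effects from every observation.
    ss_inter = sum(
        (data[i][c] - subj_means[i] - cond_means[c] + grand) ** 2
        for i in range(n)
        for c in range(j)
    )
    df = (n - 1) * (j - 1)
    ms_inter = ss_inter / df
    return cond_means, t_crit * sqrt(ms_inter / n), df


if __name__ == "__main__":
    scores = [[1, 2], [2, 4], [3, 3]]
    # 4.303 is the two-sided 95% critical t for df = 2.
    print(loftus_masson_halfwidth(scores, t_crit=4.303))
```

The same half-width is attached to every condition mean, and multiplying it by √2 gives the corresponding interval for a difference between two condition means.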

9.
Large-sample confidence intervals (CIs) for reliability, validity, and unattenuated validity are presented. The CI for unattenuated validity is based on the Bonferroni inequality, which relies on one CI for test-retest reliability and one for validity. Covered are four reliability-validity situations: (a) both estimates were from random samples; (b) reliability was from a random sample but validity was from a selected sample; (c) validity was from a random sample but reliability was from a selected sample; and (d) both estimates were from selected samples. All CIs were evaluated by using a simulation. CIs on reliability, validity, or unattenuated validity are accurate as long as the selection ratio is at least 20% and the selected sample size is 100 or larger. When the selection ratio is less than 20%, estimators tend to underestimate their parameters.

10.
Researchers misunderstand confidence intervals and standard error bars   Total citations: 1, self-citations: 0, citations by others: 1
Little is known about researchers' understanding of confidence intervals (CIs) and standard error (SE) bars. Authors of journal articles in psychology, behavioral neuroscience, and medicine were invited to visit a Web site where they adjusted a figure until they judged 2 means, with error bars, to be just statistically significantly different (p < .05). Results from 473 respondents suggest that many leading researchers have severe misconceptions about how error bars relate to statistical significance, do not adequately distinguish CIs and SE bars, and do not appreciate the importance of whether the 2 means are independent or come from a repeated measures design. Better guidelines for researchers and less ambiguous graphical conventions are needed before the advantages of CIs for research communication can be realized.

11.
Calculating and graphing within-subject confidence intervals for ANOVA   Total citations: 1, self-citations: 0, citations by others: 1
The psychological and statistical literature contains several proposals for calculating and plotting confidence intervals (CIs) for within-subjects (repeated measures) ANOVA designs. A key distinction is between intervals supporting inference about patterns of means (and differences between pairs of means, in particular) and those supporting inferences about individual means. In this report, it is argued that CIs for the former are best accomplished by adapting intervals proposed by Cousineau (Tutorials in Quantitative Methods for Psychology, 1, 42–45, 2005) and Morey (Tutorials in Quantitative Methods for Psychology, 4, 61–64, 2008) so that nonoverlapping CIs for individual means correspond to a confidence interval for their difference that does not include zero. CIs for the latter can be accomplished by fitting a multilevel model. In situations in which both types of inference are of interest, the use of a two-tiered CI is recommended. Free, open-source, cross-platform software for such interval estimates and plots (and for some common alternatives) is provided in the form of R functions for one-way within-subjects and two-way mixed ANOVA designs. These functions provide an easy-to-use solution to the difficult problem of calculating and displaying within-subjects CIs.
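
The normalization-based intervals of Cousineau and Morey that this report starts from can be sketched as below. This is an illustrative Python version, not the R functions the report provides: it covers only the per-subject normalization with Morey's J / (J - 1) variance correction, the function name `cousineau_morey_halfwidths` is hypothetical, and the critical t value (df = n - 1) is assumed to be supplied by the caller. The report's further adjustment that ties nonoverlap to the difference between means is not implemented here.

```python
from math import sqrt
from statistics import variance


def cousineau_morey_halfwidths(data, t_crit):
    """Per-condition CI half-widths after per-subject normalization.

    data   : list of rows, one row per subject, one column per condition
    t_crit : critical t value for df = n - 1, supplied by the caller
    """
    n = len(data)
    j = len(data[0])
    grand = sum(sum(row) for row in data) / (n * j)

    # Normalization step: remove each subject's own mean, then add back the
    # grand mean so the condition means are unchanged.
    normalized = [[x - sum(row) / j + grand for x in row] for row in data]

    # Correction step: normalization shrinks the variance, so it is scaled
    # up by J / (J - 1).
    correction = j / (j - 1)

    halfwidths = []
    for c in range(j):
        column = [row[c] for row in normalized]
        se = sqrt(correction * variance(column) / n)
        halfwidths.append(t_crit * se)
    return halfwidths


if __name__ == "__main__":
    scores = [[1, 2], [2, 4], [3, 3]]
    # 4.303 is the two-sided 95% critical t for df = 2.
    print(cousineau_morey_halfwidths(scores, t_crit=4.303))
```

Unlike an interval based on a single pooled error term, these half-widths can differ from one condition to the next.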

12.
Principal covariate regression (PCOVR) is a method for regressing a set of criterion variables with respect to a set of predictor variables when the latter are many in number and/or collinear. This is done by extracting a limited number of components that simultaneously synthesize the predictor variables and predict the criterion ones. So far, no procedure has been offered for estimating statistical uncertainties of the obtained PCOVR parameter estimates. The present paper shows how this goal can be achieved, conditionally on the model specification, by means of the bootstrap approach. Four strategies for estimating bootstrap confidence intervals are derived and their statistical behaviour in terms of coverage is assessed by means of a simulation experiment. Such strategies are distinguished by the use of the varimax and quartimin procedures and by the use of Procrustes rotations of bootstrap solutions towards the sample solution. In general, the four strategies showed appropriate statistical behaviour, with coverage tending to the desired level for increasing sample sizes. The main exception involved strategies based on the quartimin procedure in cases characterized by complex underlying structures of the components. The appropriateness of the statistical behaviour was higher when the proper number of components was extracted.

13.
Loftus and Masson (1994) proposed a method for computing confidence intervals (CIs) in repeated measures (RM) designs and later proposed that RM CIs for factorial designs should be based on number of observations rather than number of participants (Masson & Loftus, 2003). However, determining the correct number of observations for a particular effect can be complicated, given that its value depends on the relation between the effect and the overall design. To address this, we recently defined a general number-of-observations principle, explained why it obtains, and provided step-by-step instructions for constructing CIs for various effect types (Jarmasz & Hollands, 2009). In this note, we provide a brief summary of our approach.

14.
The statistical power of a hypothesis test is closely related to the precision of the accompanying confidence interval. In the case of a z-test, the width of the confidence interval is a function of statistical power for the planned study. If minimum effect size is used in power analysis, the width of the confidence interval is the minimum effect size times a multiplicative factor φ. The index φ, or the precision-to-effect ratio, is a function of the computed statistical power. In the case of a t-test, statistical power affects the probability of achieving a certain width of confidence interval, which is equivalent to the probability of obtaining a certain value of φ. To consider estimate precision in conjunction with statistical power, we can choose a sample size to obtain a desired probability of achieving a short width conditional on the rejection of the null hypothesis.
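
For the z-test case, the link between power and interval width can be made concrete with a short calculation. The sketch below rests on assumptions that the abstract does not spell out: a two-sided test with known standard deviation, a sample size chosen to give exactly the stated power at the minimum effect size, the negligible chance of rejecting in the wrong direction ignored, and φ taken as the full interval width divided by the minimum effect size. The function name `precision_to_effect_ratio` is hypothetical.

```python
from statistics import NormalDist


def precision_to_effect_ratio(power, alpha=0.05):
    """Full CI width divided by the minimum effect size for a planned z-test.

    With standard error SE, the design satisfies
        effect / SE = z_(1 - alpha/2) + z_(power),
    and the full CI width is 2 * z_(1 - alpha/2) * SE, so the ratio depends
    only on alpha and on the planned power.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_power = nd.inv_cdf(power)
    return 2 * z_alpha / (z_alpha + z_power)


if __name__ == "__main__":
    for p in (0.5, 0.8, 0.9, 0.95):
        print(p, round(precision_to_effect_ratio(p), 3))
```

Under these assumptions, a study planned for 80% power yields an interval roughly 1.4 times as wide as the minimum effect size, and higher power narrows the interval relative to the effect.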

15.
Confidence intervals for the mean function of the true proportion score given the observed test score x can be approximated by the Efron, Bayesian, and parametric empirical Bayes (PEB) bootstrap procedures. The similarity of results yielded by all the bootstrap methods suggests the following: the unidentifiability problem of the prior distribution g(·) of the true proportion can be bypassed with respect to the construction of confidence intervals for the mean function, and a beta distribution for g(·) is a reasonable assumption for the test scores in compliance with a negative hypergeometric distribution. The PEB bootstrap, which reflects the construction of Morris intervals, is introduced for computing predictive confidence bands for x. It is noted that the effect of test reliability on the precision of interval estimates varies with the two types of confidence statements concerned. The authors are indebted to the Editor and anonymous reviewers for constructive suggestions and comments. The authors wish to thank Min-Te Chao and Cheng-Der Fuh for some useful suggestions at earlier stages of writing this paper.

16.
A recent paper by Wainer and Thissen has renewed the interest in Gini's mean difference, G, by pointing out its robust characteristics. This note presents distribution-free asymptotic confidence intervals for its population value, γ, in the one-sample case and for the difference Δ = γ1 − γ2 in the two-sample situation. Both procedures are based on a technique of jackknifing U-statistics developed by Arvesen.

17.
Many statistics packages print skewness and kurtosis statistics with estimates of their standard errors. The function most often used for the standard errors (e.g., in SPSS) assumes that the data are drawn from a normal distribution, an unlikely situation. Some textbooks suggest that if the statistic is more than about 2 standard errors from the hypothesized value (i.e., an approximate value for the critical value from the t distribution for moderate or large sample sizes when α = 5%), the hypothesized value can be rejected. This is an inappropriate practice unless the standard error estimate is accurate and the sampling distribution is approximately normal. We show distributions where the traditional standard errors provided by the function underestimate the actual values, often being 5 times too small, and distributions where the function overestimates the true values. Bootstrap standard errors and confidence intervals are more accurate than the traditional approach, although still imperfect. The reasons for this are discussed. We recommend that if you are using skewness and kurtosis statistics based on the 3rd and 4th moments, bootstrapping should be used to calculate standard errors and confidence intervals, rather than using the traditional standard. Software in the freeware R for this article provides these estimates.
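
The recommended bootstrap can be illustrated for skewness with a few lines of Python. This is a simple sketch under stated assumptions, not the article's R software: it uses the moment-based skewness m3 / m2^1.5, a plain percentile interval, and hypothetical function names (`skewness`, `bootstrap_skewness`); the article may use a different bootstrap interval.

```python
import random
from statistics import mean, stdev


def skewness(xs):
    """Moment-based skewness: third central moment over m2 ** 1.5."""
    m = mean(xs)
    n = len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def bootstrap_skewness(xs, n_boot=2000, level=0.95, seed=0):
    """Bootstrap standard error and percentile CI for the skewness of xs."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        sample = rng.choices(xs, k=n)   # resample with replacement
        if len(set(sample)) < 2:        # skip degenerate resamples (zero variance)
            continue
        stats.append(skewness(sample))
    stats.sort()
    m = len(stats)
    tail = (1 - level) / 2
    lower = stats[int(tail * m)]
    upper = stats[min(m - 1, int((1 - tail) * m))]
    return stdev(stats), (lower, upper)


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.expovariate(1.0) for _ in range(200)]   # a skewed sample
    print(skewness(data), bootstrap_skewness(data))
```

Kurtosis can be handled the same way by swapping in a fourth-moment statistic for `skewness`.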

18.
The use of U-statistics based on rank correlation coefficients in estimating the strength of concordance among a group of rankers is examined for cases where the null hypothesis of random rankings is not tenable. The studentized U-statistic is asymptotically distribution-free, and the Student-t approximation is used for small and moderate sized samples. An approximate confidence interval is constructed for the strength of concordance. Monte Carlo results indicate that the Student-t approximation can be improved by estimating the degrees of freedom. Research partially supported by ONR Contract N00014-82-K-0207.

19.
Exploratory factor analysis (EFA) has become a common procedure in educational and psychological research. In the course of performing an EFA, researchers often base the decision of how many factors to retain on the eigenvalues for the factors. However, many researchers do not realize that eigenvalues, like all sample statistics, are subject to sampling error, which means that confidence intervals (CIs) can be estimated for each eigenvalue. In the present article, we demonstrate two methods of estimating CIs for eigenvalues: one based on the mathematical properties of the central limit theorem, and the other based on bootstrapping. References to appropriate SAS and SPSS syntax are included. Supplemental materials for this article may be downloaded from http://brm.psychonomic-journals.org/content/supplemental.

20.
In this study, we analyzed the validity of the conventional 80% power. The minimal sample size and power needed to guarantee non-overlapping (1-alpha)% confidence intervals for population means were calculated. Several simulations indicate that the minimal power for two means (m = 2) to have non-overlapping CIs is .80, for (1-alpha) set to 95%. The minimal power becomes .86 for 99% CIs and .75 for 90% CIs. When multiple means are considered, the required minimal power increases considerably. This increase is even higher when the population means do not increase monotonically. Therefore, the often adopted criterion of a minimal power equal to .80 is not always adequate. Hence, to guarantee that the limits of the CIs do not overlap, most situations require a direct calculation of the minimum number of observations that should enter in a study.
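
The two-mean figures quoted above can be roughly reproduced with a known-variance normal approximation. The sketch below is not the study's simulation. It assumes two independent means with equal standard errors, takes the population difference at which the two expected CIs would just touch, and computes the power of a two-sided z-test of that difference at alpha = 1 - level. The function name `power_at_touching_cis` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_at_touching_cis(level=0.95):
    """Power of a z-test when the population CIs of two means just touch.

    Each mean has standard error s and CI half-width z * s, so the CIs touch
    when the difference equals 2 * z * s. The difference has standard error
    s * sqrt(2), which makes the standardized difference sqrt(2) * z.
    """
    nd = NormalDist()
    z = nd.inv_cdf(0.5 + level / 2)
    standardized_diff = sqrt(2) * z
    # One-tail approximation: rejection in the opposite direction is negligible.
    return nd.cdf(standardized_diff - z)


if __name__ == "__main__":
    for lvl in (0.90, 0.95, 0.99):
        print(lvl, round(power_at_touching_cis(lvl), 3))
```

Under these assumptions the approximation gives about .75, .79, and .86 for 90%, 95%, and 99% intervals, close to the simulated values reported in the abstract.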
