Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Researchers misunderstand confidence intervals and standard error bars   (Cited by: 1; self-citations: 0; citations by others: 1)
Little is known about researchers' understanding of confidence intervals (CIs) and standard error (SE) bars. Authors of journal articles in psychology, behavioral neuroscience, and medicine were invited to visit a Web site where they adjusted a figure until they judged 2 means, with error bars, to be just statistically significantly different (p < .05). Results from 473 respondents suggest that many leading researchers have severe misconceptions about how error bars relate to statistical significance, do not adequately distinguish CIs and SE bars, and do not appreciate the importance of whether the 2 means are independent or come from a repeated measures design. Better guidelines for researchers and less ambiguous graphical conventions are needed before the advantages of CIs for research communication can be realized.

2.
Contrasts of means are often of interest because they describe the effect size among multiple treatments. High-quality inference of population effect sizes can be achieved through narrow confidence intervals (CIs). Given the close relation between CI width and sample size, we propose two methods to plan the sample size for an ANCOVA or ANOVA study, so that a sufficiently narrow CI for the population (standardized or unstandardized) contrast of interest will be obtained. The standard method plans the sample size so that the expected CI width is sufficiently small. Since CI width is a random variable, the expected width being sufficiently small does not guarantee that the width obtained in a particular study will be sufficiently small. An extended procedure ensures with some specified, high degree of assurance (e.g., 90% of the time) that the CI observed in a particular study will be sufficiently narrow. We also discuss the rationale and usefulness of two different ways to standardize an ANCOVA contrast, and compare three types of standardized contrast in the ANCOVA/ANOVA context. All of the methods we propose have been implemented in the freely available MBESS package in R so that they can be easily applied by researchers.

3.
Confidence intervals (CIs) give information about replication, but many researchers have misconceptions about this information. One problem is that the percentage of future replication means captured by a particular CI varies markedly, depending on where in relation to the population mean that CI falls. The authors investigated the distribution of this percentage for σ known and unknown, for various sample sizes, and for robust CIs. The distribution has strong negative skew: Most 95% CIs will capture around 90% or more of replication means, but some will capture a much lower proportion. On average, a 95% CI will include just 83.4% of future replication means. The authors present figures designed to assist understanding of what CIs say about replication, and they also extend the discussion to explain how p values give information about replication.
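
The 83.4% average can be checked by hand in the simplest case the abstract mentions. The sketch below assumes σ known, normal sampling distributions, and a replication with the same sample size as the original study (the equal-n condition is an assumption, not something the abstract states). Under those assumptions the difference between an original mean and a replication mean has standard error √2·σ/√n, so the average capture rate is P(|Z| < 1.96/√2).

```python
from statistics import NormalDist

def mean_capture_rate(confidence: float = 0.95) -> float:
    """Average share of replication means falling inside an original CI.

    Assumes sigma known, normal sampling distributions, and a replication
    with the same sample size as the original study.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + confidence / 2)  # about 1.96 for a 95% CI
    # The original and replication means are independent, so their
    # difference has standard error sqrt(2) * sigma / sqrt(n).
    return 2 * std_normal.cdf(z / 2 ** 0.5) - 1

rate = mean_capture_rate(0.95)
print(f"{rate:.3f}")  # 0.834
```

The result does not depend on σ or n, because both cancel when the CI half-width is divided by the standard error of the difference.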

4.
Calculating and graphing within-subject confidence intervals for ANOVA   (Cited by: 1; self-citations: 0; citations by others: 1)
The psychological and statistical literature contains several proposals for calculating and plotting confidence intervals (CIs) for within-subjects (repeated measures) ANOVA designs. A key distinction is between intervals supporting inference about patterns of means (and differences between pairs of means, in particular) and those supporting inferences about individual means. In this report, it is argued that CIs for the former are best accomplished by adapting intervals proposed by Cousineau (Tutorials in Quantitative Methods for Psychology, 1, 42–45, 2005) and Morey (Tutorials in Quantitative Methods for Psychology, 4, 61–64, 2008) so that nonoverlapping CIs for individual means correspond to a confidence interval for their difference that does not include zero. CIs for the latter can be accomplished by fitting a multilevel model. In situations in which both types of inference are of interest, the use of a two-tiered CI is recommended. Free, open-source, cross-platform software for such interval estimates and plots (and for some common alternatives) is provided in the form of R functions for one-way within-subjects and two-way mixed ANOVA designs. These functions provide an easy-to-use solution to the difficult problem of calculating and displaying within-subjects CIs.

5.
Exploratory factor analysis (EFA) has become a common procedure in educational and psychological research. In the course of performing an EFA, researchers often base the decision of how many factors to retain on the eigenvalues for the factors. However, many researchers do not realize that eigenvalues, like all sample statistics, are subject to sampling error, which means that confidence intervals (CIs) can be estimated for each eigenvalue. In the present article, we demonstrate two methods of estimating CIs for eigenvalues: one based on the mathematical properties of the central limit theorem, and the other based on bootstrapping. References to appropriate SAS and SPSS syntax are included. Supplemental materials for this article may be downloaded from http://brm.psychonomic-journals.org/content/supplemental.

6.
This study presents formulae for the covariances between parameter estimates in a single mediator model. These covariances are necessary to build confidence intervals (CI) for effect size measures in mediation studies. We first analytically derived the covariances between the parameter estimates in a single mediator model. Using the derived covariances, we computed the multivariate‐delta standard errors, and built the 95% CIs for the effect size measures. A simulation study evaluated the accuracy of the standard errors as well as the Type I error, power, and coverage of the CIs using various parameter values and sample sizes. Finally, we presented a numerical example and a SAS MACRO that calculates the CIs for the effect size measures.

7.
When designing a study that uses structural equation modeling (SEM), an important task is to decide an appropriate sample size. Historically, this task is approached from the power analytic perspective, where the goal is to obtain sufficient power to reject a false null hypothesis. However, hypothesis testing only tells if a population effect is zero and fails to address the question about the population effect size. Moreover, significance tests in the SEM context often reject the null hypothesis too easily, and therefore the problem in practice is having too much power instead of not enough power.

An alternative means to infer the population effect is forming confidence intervals (CIs). A CI is more informative than hypothesis testing because a CI provides a range of plausible values for the population effect size of interest. Given the close relationship between CI and sample size, the sample size for an SEM study can be planned with the goal to obtain sufficiently narrow CIs for the population model parameters of interest.

Latent curve models (LCMs) are an application of SEM with mean structure to the study of change over time. The sample size planning method for LCMs from the CI perspective is based on maximum likelihood and the expected information matrix. Given a sample, forming a CI for the model parameter of interest in an LCM requires the sample covariance matrix S, the sample mean vector, and the sample size N. Therefore, the width (w) of the resulting CI can be considered a function of S, the sample mean vector, and N. Inverting the CI formation process gives the sample size planning process. The inverted process requires a proxy for the population covariance matrix Σ, the population mean vector μ, and the desired width ω as input, and it returns N as output. The specification of the input information for sample size planning needs to be performed based on a systematic literature review. In the context of covariance structure analysis, Lai and Kelley (2011) discussed several practical methods to facilitate specifying Σ and ω for the sample size planning procedure.
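
The inversion idea (width as a function of N, then N as a function of the desired width) can be sketched in a much simpler setting than an LCM. The example below is a single-mean analogue under a normal approximation, where w = 2·z·σ/√N, and it is not the authors' maximum likelihood procedure. The function names and the input values (a proxy σ of 1 and a desired width ω of 0.5) are hypothetical.

```python
import math
from statistics import NormalDist

def ci_width(sigma: float, n: int, confidence: float = 0.95) -> float:
    """Full width of a normal-theory CI for one mean: w = 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / math.sqrt(n)

def plan_n_for_width(sigma: float, omega: float, confidence: float = 0.95) -> int:
    """Invert ci_width: smallest N whose CI width is at most omega.

    sigma is a proxy for the population SD supplied by the planner.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((2 * z * sigma / omega) ** 2)

# Hypothetical planning inputs: proxy SD of 1, desired full width of 0.5.
n_planned = plan_n_for_width(sigma=1.0, omega=0.5)
print(n_planned)  # 62
```

In the LCM setting the same logic applies, but the width comes from the expected information matrix implied by Σ and μ rather than from a closed-form expression.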

8.
Interval estimates (estimates of parameters that include an allowance for sampling uncertainty) have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated samples, on average. The width of confidence intervals is thought to index the precision of an estimate; CIs are thought to be a guide to which parameter values are plausible or reasonable; and the confidence coefficient of the interval (e.g., 95%) is thought to index the plausibility that the true parameter is included in the interval. We show in a number of examples that CIs do not necessarily have any of these properties, and can lead to unjustified or arbitrary inferences. For this reason, we caution against relying upon confidence interval theory to justify interval estimates, and suggest that other theories of interval estimation should be used instead.

9.
False memory effects were explored using unrelated list items (e.g., slope, reindeer, corn) that were related to mediators (e.g., ski, sleigh, flake) that all converged upon a single nonpresented critical item (CI; e.g., snow). In Experiment 1, participants completed either an initial recall test or arithmetic problems after study, followed by a final recognition test. Participants did not falsely recall CIs on the initial test; however, false alarms to CIs did occur in recognition, but only following an initial recall test. In Experiment 2, participants were instructed to guess the CI, followed by a recognition test. The results replicated Experiment 1, with an increase in CI false alarms. Experiment 3 controlled for item effects by replacing unrelated recognition items from Experiment 1 with both CIs and list items from nonpresented lists. Once again, CI false alarms were found when controlling for lexical characteristics, demonstrating that mediated false memory is not due simply to item differences.

10.
Impairment on standard tests of delayed recall is often already maximal in the aMCI stage of Alzheimer's Disease. Neuropathological work shows that the neural substrates of memory function continue to deteriorate throughout the progression of the disease, hinting that further changes in memory performance could be tracked by a more sensitive test of delayed recall. Recent work shows that retention in aMCI patients can be raised well above floor when the delay period is devoid of further material - 'Minimal Interference'. This memory enhancement is thought to be the result of improved memory consolidation. Here we used the minimal interference/interference paradigm (word list retention following 10 min of quiet resting vs. picture naming) in a group of 17 AD patients, 25 aMCI patients and 25 controls. We found (1) that retention can be improved significantly by minimal interference in patients with aMCI and patients with mild to moderate AD; (2) that the minimal interference paradigm is sensitive to decline in memory function with disease severity, even when performance on standard tests has reached floor; and (3) that this paradigm can differentiate well (80% sensitivity and 100% specificity) between aMCI patients who progress and do not progress to AD within 2 years. Our findings support the notion that the early memory dysfunction in AD is associated with an increased susceptibility to memory interference and are suggestive of a gradual decline in consolidation capacity with disease progression.

11.
Confidence intervals (CIs) in principal component analysis (PCA) can be based on asymptotic standard errors and on the bootstrap methodology. The present paper offers an overview of possible strategies for bootstrapping in PCA. A motivating example shows that CI estimates for the component loadings using different methods may diverge. We explain that this results from both differences in quality and in perspective on the rotational freedom of the population loadings. A comparative simulation study examines the quality of various estimated component loading CIs. The bootstrap approach is more flexible and generally yields better CIs than the asymptotic approach. However, in the case of a clear simple structure of varimax rotated loadings, one can be confident that the asymptotic estimates are reasonable as well.

12.
Null hypothesis significance testing (NHST) is undoubtedly the most common inferential technique used to justify claims in the social sciences. However, even staunch defenders of NHST agree that its outcomes are often misinterpreted. Confidence intervals (CIs) have frequently been proposed as a more useful alternative to NHST, and their use is strongly encouraged in the APA Manual. Nevertheless, little is known about how researchers interpret CIs. In this study, 120 researchers and 442 students, all in the field of psychology, were asked to assess the truth value of six particular statements involving different interpretations of a CI. Although all six statements were false, both researchers and students endorsed, on average, more than three statements, indicating a gross misunderstanding of CIs. Self-declared experience with statistics was not related to researchers’ performance, and, even more surprisingly, researchers hardly outperformed the students, even though the students had not received any education on statistical inference whatsoever. Our findings suggest that many researchers do not know the correct interpretation of a CI. The misunderstandings surrounding p-values and CIs are particularly unfortunate because they constitute the main tools by which psychologists draw conclusions from data.

13.
Wider use in psychology of confidence intervals (CIs), especially as error bars in figures, is a desirable development. However, psychologists seldom use CIs and may not understand them well. The authors discuss the interpretation of figures with error bars and analyze the relationship between CIs and statistical significance testing. They propose 7 rules of eye to guide the inferential use of figures with error bars. These include general principles: Seek bars that relate directly to effects of interest, be sensitive to experimental design, and interpret the intervals. They also include guidelines for inferential interpretation of the overlap of CIs on independent group means. Wider use of interval estimation in psychology has the potential to improve research communication substantially.
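
A small calculation shows why the overlap of CIs on independent means needs its own guideline. The sketch below is not one of the authors' rules quoted verbatim; it assumes two independent group means with equal standard errors and a large-sample normal approximation, and it asks what two-sided p value corresponds to two 95% CIs that just touch end to end.

```python
from statistics import NormalDist

def p_value_when_cis_touch(confidence: float = 0.95) -> float:
    """Two-sided p for the difference between two independent means whose
    CIs just touch, assuming equal standard errors and normality."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + confidence / 2)
    # Touching intervals: the means differ by 2 * z * SE, while the
    # standard error of the difference is sqrt(2) * SE.
    z_diff = 2 * z / 2 ** 0.5
    return 2 * (1 - std_normal.cdf(z_diff))

p_touch = p_value_when_cis_touch()
print(f"{p_touch:.4f}")
```

Under these assumptions the p value is roughly .006, far below .05, so nonoverlapping 95% CIs are a much stricter criterion than p < .05, and some overlap is compatible with a significant difference.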

14.
Implementation integrity is a potentially critical issue for problem-solving teams (PST) and most response-to-intervention models. The current study hypothesized that providing performance feedback, which has consistently been shown to increase implementation integrity, to PSTs would enhance the procedural integrity of the process. The PSTs for three elementary schools were provided performance feedback with a 20-item checklist created from the literature. A multiple-baseline design across schools revealed an immediate change in level after providing performance feedback. The resulting percentages of non-overlapping data were 90.9%, 90.0%, and 100%. However, PSTs still did not monitor student progress, assess the effectiveness of the intervention, or measure the integrity with which the intervention was implemented even after receiving feedback. Thus, providing performance feedback could be a method to increase the fidelity with which critical components of data-based problem-solving are implemented, but these data suggest the need for additional research.
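
The abstract reports percentages of non-overlapping data without defining the index. A minimal sketch follows, assuming the usual definition for a behavior expected to increase: the share of intervention-phase points that exceed the highest baseline point. The scores are hypothetical checklist counts chosen only to show how a value such as 90.9% (10 of 11 points) can arise; they are not the study's data.

```python
def pnd(baseline, intervention):
    """Percentage of intervention-phase points above the highest baseline point."""
    ceiling = max(baseline)
    above = sum(1 for score in intervention if score > ceiling)
    return 100 * above / len(intervention)

# Hypothetical checklist scores (items completed out of 20), not the study's data.
baseline_scores = [8, 9, 7, 9]
intervention_scores = [15, 16, 9, 17, 16, 18, 17, 16, 18, 17, 18]

pnd_value = pnd(baseline_scores, intervention_scores)
print(round(pnd_value, 1))  # 90.9
```

A point that only ties the baseline maximum (the 9 in the intervention phase here) counts as overlapping in this sketch.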

15.
The effects of tests on learning and forgetting   (Cited by: 1; self-citations: 0; citations by others: 1)
In three experiments, we investigated whether memory tests enhance learning and reduce forgetting more than additional study opportunities do. Subjects learned obscure facts (Experiments 1 and 2) or Swahili-English word pairs (Experiment 3) by either completing a test with feedback (test/study) or receiving an additional study opportunity (study). Recall was tested after 5 min or 1, 2, 7, 14, or 42 days. We explored forgetting by means of an ANOVA and also by fitting a power function to the data. In all three experiments, testing enhanced overall recall more than restudying did. According to the power function, in two out of three experiments, testing also reduced forgetting more than restudying did, although this was not always the case according to the ANOVA. We discuss the implications of these results both for approaches to measuring forgetting and for the use of tests in promoting long-term retention. The stimuli used in these experiments may be found at www.psychonomic.org/archive.
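
The abstract does not say how the power function was parameterized or fitted. One common approach, shown here as an assumption, is to write recall = a·t^(−b) and fit a straight line on log-log axes, where the slope gives the forgetting rate b. The delays are the retention intervals from the abstract; the recall values are synthetic, generated from a = 0.5 and b = 0.1, and are not the study's results.

```python
import math

def fit_power_function(delays, recall):
    """Least-squares fit of recall = a * delay**(-b) via log-log regression."""
    xs = [math.log(t) for t in delays]
    ys = [math.log(r) for r in recall]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

# Retention intervals from the abstract, in days (5 min = 5 / 1440 days).
delays = [5 / 1440, 1, 2, 7, 14, 42]
# Synthetic recall proportions from a = 0.5, b = 0.1 (not the study's data).
recall = [0.5 * t ** -0.1 for t in delays]

a_hat, b_hat = fit_power_function(delays, recall)
print(round(a_hat, 3), round(b_hat, 3))  # 0.5 0.1
```

Comparing b between the test/study and study conditions is one way to express "reduced forgetting" as a single number, which is why it can disagree with an ANOVA on the raw recall scores.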

16.
Confidence intervals (CIs) for means are frequently advocated as alternatives to null hypothesis significance testing (NHST), for which a common theme in the debate is that conclusions from CIs and NHST should be mutually consistent. The authors examined a class of CIs for which the conclusions are said to be inconsistent with NHST in within-subjects designs and a class for which the conclusions are said to be consistent. The difference between them is a difference in models. In particular, the main issue is that the class for which the conclusions are said to be consistent derives from fixed-effects models with subjects fixed, not mixed models with subjects random. Offered is mixed model methodology that has been popularized in the statistical literature and statistical software procedures. Generalizations to different classes of within-subjects designs are explored, and comments on the future direction of the debate on NHST are offered.

17.
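Liu Yanlou (刘彦楼). Acta Psychologica Sinica (《心理学报》), 2022, 54(6): 703-724
The standard errors (SEs; or the variance-covariance matrix) and confidence intervals (CIs) of cognitive diagnosis models have important theoretical and practical value for quantifying the uncertainty of model parameter estimates, testing differential item functioning, comparing models at the item level, validating the Q-matrix, and exploring attribute hierarchies. This study proposes two new methods for computing SEs and CIs: the parallel parametric bootstrap and the parallel nonparametric bootstrap. Simulation studies found that when the model was correctly specified, both methods performed well in computing the SEs and CIs of model parameters under high- and medium-quality item conditions; when the model contained redundant parameters, the SEs and CIs of most permissible model parameters performed well under high- and medium-quality item conditions. An empirical data set was used to demonstrate the value of the new methods and the gains in computational efficiency.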
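
The general nonparametric bootstrap idea behind such SEs and CIs can be sketched in a few lines. The example below is a generic illustration and not the authors' procedure: it uses a sample mean rather than cognitive diagnosis model parameters, it runs serially rather than in parallel, and the function name, the percentile-interval choice, and the scores are all hypothetical.

```python
import random
import statistics

def bootstrap_se_ci(data, statistic, n_boot=2000, confidence=0.95, seed=0):
    """Nonparametric bootstrap SE and percentile CI for a statistic."""
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    se = statistics.stdev(estimates)  # SD of the bootstrap estimates
    alpha = 1 - confidence
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return se, (lower, upper)

# Hypothetical test scores, used only to exercise the function.
scores = [12, 15, 9, 14, 11, 13, 16, 10, 14, 12, 15, 13]
boot_se, (boot_lo, boot_hi) = bootstrap_se_ci(scores, statistics.mean)
print(round(boot_se, 2), boot_lo, boot_hi)
```

Because each bootstrap replicate is independent of the others, the loop over replicates is the natural place to distribute work across processes when the statistic is expensive to compute.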

18.
Taboo words are defined and sanctioned by institutions of power (e.g., religion, media), and prohibitions are reiterated in child-rearing practices. Native speakers acquire folk knowledge of taboo words, but it lacks the complexity that psychological science requires for an understanding of swearing. Misperceptions persist in psychological science and in society at large about how frequently people swear or what it means when they do. Public recordings of taboo words establish the commonplace occurrence of swearing (ubiquity), although frequency data are not always appreciated in laboratory research. A set of 10 words that has remained stable over the past 20 years accounts for 80% of public swearing. Swearing is positively correlated with extraversion and Type A hostility but negatively correlated with agreeableness, conscientiousness, religiosity, and sexual anxiety. The uniquely human facility for swearing evolved and persists because taboo words can communicate emotion information (anger, frustration) more readily than nontaboo words, allowing speakers to achieve a variety of personal and social goals with them (utility). A neuro-psycho-social framework is offered to unify taboo word research. Suggestions for future research are offered.

19.
The purpose of this investigation was to determine how confidence intervals (CIs) for pediatric neuropsychological norms vary as a function of sample size, and to determine optimal sample sizes for normative studies. First, the authors calculated 95% CIs for a set of published pediatric norms for four commonly used neuropsychological instruments. Second, 95% CIs were calculated for varying sample size (from n = 5 to n = 500). Results suggest that some pediatric norms have unacceptably wide CIs, and normative studies ought optimally to use 50 to 75 participants per cell. Smaller sample sizes may lead to overpathologizing results, while the cost of obtaining larger samples may not be justifiable.

20.
Sarnecka BW, Carey S. Cognition, 2008, 108(3): 662-674
This study compared 2- to 4-year-olds who understand how counting works (cardinal-principle-knowers) to those who do not (subset-knowers), in order to better characterize the knowledge itself. New results are that (1) Many children answer the question "how many" with the last word used in counting, despite not understanding how counting works; (2) Only children who have mastered the cardinal principle, or are just short of doing so, understand that adding objects to a set means moving forward in the numeral list whereas subtracting objects means going backward; and finally (3) Only cardinal-principle-knowers understand that adding exactly 1 object to a set means moving forward exactly 1 word in the list, whereas subset-knowers do not understand the unit of change.
