Similar documents
A total of 20 similar documents were retrieved.
1.
Statistical inference plays a key role in scientific research. However, the classical statistical method most commonly used in current research, null hypothesis significance testing (NHST), is often misused or abused by researchers because it is difficult to understand. Some researchers have therefore proposed the Bayes factor as an alternative and/or complementary statistical method. The Bayes factor is an important Bayesian method for model comparison and hypothesis testing, and it can be interpreted as the degree of support for the null hypothesis H0 or the alternative hypothesis H1. Compared with NHST, it has the following advantages: it considers H0 and H1 simultaneously and can therefore quantify support for H0; it is not "severely" biased against H0; it allows the strength of evidence to be monitored as data accumulate; and it is unaffected by the sampling plan. Bayes factors can now be computed conveniently with the open statistical software JASP, and this article demonstrates the approach with a Bayesian t-test. The Bayes factor has important implications for psychological researchers, but users need to pay attention to the reasonableness of the prior distribution and to keep the data-analysis process transparent and open.
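As a rough illustration of the Bayesian t-test described above, the following R sketch uses the BayesFactor package, whose default Cauchy prior on effect size (scale r = 0.707) matches the default reported in JASP. The data vectors and the prior-scale setting are placeholder assumptions for illustration, not values from the article.

# Minimal sketch of a two-sample Bayesian t-test on invented data.
# install.packages("BayesFactor")
library(BayesFactor)

set.seed(1)
group1 <- rnorm(30, mean = 0.4, sd = 1)   # hypothetical treatment scores
group2 <- rnorm(30, mean = 0.0, sd = 1)   # hypothetical control scores

# BF10: evidence for H1 (nonzero effect) relative to H0 (zero effect),
# under the default Cauchy prior on effect size (rscale = "medium", r = 0.707).
bf10 <- ttestBF(x = group1, y = group2, rscale = "medium")
print(bf10)

# BF01 (support for H0 over H1) is simply the reciprocal.
print(1 / bf10)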

2.
Null hypothesis significance testing (NHST) is the most commonly used statistical methodology in psychology. The probability of achieving a value as extreme or more extreme than the statistic obtained from the data is evaluated, and if it is low enough, the null hypothesis is rejected. However, because common experimental practice often clashes with the assumptions underlying NHST, these calculated probabilities are often incorrect. Most commonly, experimenters use tests that assume that sample sizes are fixed in advance of data collection but then use the data to determine when to stop; in the limit, experimenters can use data monitoring to guarantee that the null hypothesis will be rejected. Bayesian hypothesis testing (BHT) provides a solution to these ills because the stopping rule used is irrelevant to the calculation of a Bayes factor. In addition, there are strong mathematical guarantees on the frequentist properties of BHT that are comforting for researchers concerned that stopping rules could influence the Bayes factors produced. Here, we show that these guaranteed bounds have limited scope and often do not apply in psychological research. Specifically, we quantitatively demonstrate the impact of optional stopping on the resulting Bayes factors in two common situations: (1) when the truth is a combination of the hypotheses, such as in a heterogeneous population, and (2) when a hypothesis is composite—taking multiple parameter values—such as the alternative hypothesis in a t-test. We found that, for these situations, while the Bayesian interpretation remains correct regardless of the stopping rule used, the choice of stopping rule can, in some situations, greatly increase the chance of experimenters finding evidence in the direction they desire. We suggest ways to control these frequentist implications of stopping rules on BHT.
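To make the optional-stopping scenario concrete, here is a small, self-contained R simulation (not the authors' code): data arrive in batches, a default Bayes factor t-test is recomputed after each batch, and sampling stops as soon as the Bayes factor crosses a threshold in either direction or a maximum sample size is reached. The thresholds, batch size, and sample sizes are illustrative assumptions.

# Illustrative optional-stopping simulation with a default Bayes factor t-test.
library(BayesFactor)

optional_stop_bf <- function(true_effect = 0, n_min = 10, n_max = 100,
                             batch = 5, threshold = 10) {
  x <- rnorm(n_min, mean = true_effect)
  repeat {
    bf10 <- extractBF(ttestBF(x))$bf        # one-sample BF10 vs. the point null
    if (bf10 >= threshold || bf10 <= 1 / threshold || length(x) >= n_max) {
      return(c(n = length(x), bf10 = bf10))
    }
    x <- c(x, rnorm(batch, mean = true_effect))  # collect another batch and retest
  }
}

set.seed(123)
res <- replicate(200, optional_stop_bf(true_effect = 0))
# Proportion of simulated experiments that stopped with BF10 >= 10
# even though the null is true (the frequentist property discussed above).
mean(res["bf10", ] >= 10)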

3.
Bayes factor approaches for testing interval null hypotheses
Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing evidence for 2 hypotheses against each other. One issue in hypothesis testing is that constraints may hold only approximately rather than exactly, and the reason for small deviations may be trivial or uninteresting. In the large-sample limit, these uninteresting, small deviations lead to the rejection of a useful constraint. In this article, we develop several Bayes factor 1-sample tests for the assessment of approximate equality and ordinal constraints. In these tests, the null hypothesis covers a small interval of non-0 but negligible effect sizes around 0. These Bayes factors are alternatives to previously developed Bayes factors, which do not allow for interval null hypotheses, and may especially prove useful to researchers who use statistical equivalence testing. To facilitate adoption of these Bayes factor tests, we provide easy-to-use software.
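The BayesFactor R package exposes this kind of interval-null comparison directly; the sketch below, with made-up data and an arbitrary interval of negligible effect sizes between -0.1 and 0.1, shows one common way to obtain a Bayes factor for "effect outside the interval" versus "effect inside the interval".

# Interval-null Bayes factor sketch: negligible effect sizes are those in (-0.1, 0.1).
library(BayesFactor)

set.seed(2)
x <- rnorm(50, mean = 0.05)                # hypothetical one-sample data

bf <- ttestBF(x, nullInterval = c(-0.1, 0.1))
# bf[1]: effect size restricted to the interval, compared with the point null
# bf[2]: effect size outside the interval (the complement), compared with the point null
print(bf)

# Bayes factor for "non-negligible effect" vs. "negligible (interval-null) effect":
print(bf[2] / bf[1])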

4.
More than 30 years of research has established psychological hardiness as an important individual resiliency resource. One important question still remaining is whether psychological hardiness can be trained. The present study explored this question longitudinally within the context of a 3-year military academy training program. Cadets from 3 different Norwegian military academies (N = 293) completed hardiness questionnaires during the first week of their training, and then again at the end of each year, resulting in a total of 4 waves of data. Using hierarchical linear modeling, no statistically significant effect of time on hardiness scores was found. The nonsignificant growth parameter was examined further using Bayesian statistics as an indicator of the relative evidence for the null hypothesis of no change over time versus the alternative hypothesis of change. The resulting Bayes factor provided substantial support in our data for the null hypothesis of no hardiness development during the 3-year officer training programs.

5.
The conventional procedure for null hypothesis significance testing has long been the target of appropriate criticism. A more reasonable alternative is proposed, one that not only avoids the unrealistic postulation of a null hypothesis but also, for a given parametric difference and a given error probability, is more likely to report the detection of that difference.

6.
We present a suite of Bayes factor hypothesis tests that allow researchers to grade the decisiveness of the evidence that the data provide for the presence versus the absence of a correlation between two variables. For concreteness, we apply our methods to the recent work of Donnellan et al. (in press) who conducted nine replication studies with over 3,000 participants and failed to replicate the phenomenon that lonely people compensate for a lack of social warmth by taking warmer baths or showers. We show how the Bayes factor hypothesis test can quantify evidence in favor of the null hypothesis, and how the prior specification for the correlation coefficient can be used to define a broad range of tests that address complementary questions. Specifically, we show how the prior specification can be adjusted to create a two-sided test, a one-sided test, a sensitivity analysis, and a replication test.
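A rough R sketch of the two-sided and one-sided correlation tests described above, using the BayesFactor package with simulated data; the prior widths and the positive-only interval are illustrative choices, and the replication-test variant is omitted because it requires the original study's posterior.

# Two-sided and one-sided Bayesian correlation tests (invented data).
library(BayesFactor)

set.seed(3)
x <- rnorm(100)
y <- 0.1 * x + rnorm(100)                  # weak hypothetical association

# Two-sided test: H1 allows any correlation, H0 fixes the correlation at 0.
bf_two_sided <- correlationBF(y, x, rscale = "medium")
print(bf_two_sided)

# One-sided test: H1 restricts the correlation to positive values only.
bf_one_sided <- correlationBF(y, x, rscale = "medium", nullInterval = c(0, 1))
print(bf_one_sided)

# A simple sensitivity analysis: recompute the Bayes factor under several prior widths.
sapply(c(1/3, 1/sqrt(3), 1),
       function(r) extractBF(correlationBF(y, x, rscale = r))$bf)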

7.
In the psychological literature, there are two seemingly different approaches to inference: that from estimation of posterior intervals and that from Bayes factors. We provide an overview of each method and show that a salient difference is the choice of models. The two approaches as commonly practiced can be unified with a certain model specification, now popular in the statistics literature, called spike-and-slab priors. A spike-and-slab prior is a mixture of a null model, the spike, with an effect model, the slab. The estimate of the effect size here is a function of the Bayes factor, showing that estimation and model comparison can be unified. The salient difference is that common Bayes factor approaches provide for privileged consideration of theoretically useful parameter values, such as the value corresponding to the null hypothesis, while estimation approaches do not. Both approaches, either privileging the null or not, are useful depending on the goals of the analyst.
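One way to make the link concrete, in generic notation rather than the article's: under a spike-and-slab prior with prior probabilities \pi_0 on the spike (a point mass at zero) and \pi_1 on the slab, the posterior probability of the slab is a direct function of the Bayes factor, and the model-averaged effect estimate is shrunk toward zero by exactly that probability:

\[
p(\text{slab} \mid y) = \frac{BF_{10}\,\pi_1}{BF_{10}\,\pi_1 + \pi_0},
\qquad
\mathbb{E}[\theta \mid y] = p(\text{slab} \mid y)\,\mathbb{E}[\theta \mid y, \text{slab}],
\]

where BF_{10} is the Bayes factor for the slab (effect) model over the spike (null) model.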

8.
Model selection is a central issue in mathematical psychology. One useful criterion for model selection is generalizability; that is, the chosen model should yield the best predictions for future data. Some researchers in psychology have proposed that the Bayes factor can be used for assessing model generalizability. An alternative method, known as the generalization criterion, has also been proposed for the same purpose. We argue that these two methods address different levels of model generalizability (local and global), and will often produce divergent conclusions. We illustrate this divergence by applying the Bayes factor and the generalization criterion to a comparison of retention functions. The application of alternative model selection criteria will also be demonstrated within the framework of model generalizability.

9.
The Bayesian-frequentist debate typically portrays these statistical perspectives as opposing views. However, both Bayesian and frequentist statisticians have expanded their epistemological basis away from a singular focus on the null hypothesis, to a broader perspective involving the development and comparison of competing statistical/mathematical models. For frequentists, statistical developments such as structural equation modeling and multilevel modeling have facilitated this transition. For Bayesians, the Bayes factor has facilitated this transition. The Bayes factor is treated in articles within this issue of Multivariate Behavioral Research. The current presentation provides brief commentary on those articles and more extended discussion of the transition toward a modern modeling epistemology. In certain respects, Bayesians and frequentists share common goals.

10.
This article concerns acceptance of the null hypothesis that one variable has no effect on another. Despite frequent opinions to the contrary, this null hypothesis can be correct in some situations. Appropriate criteria for accepting the null hypothesis are (1) that the null hypothesis is possible; (2) that the results are consistent with the null hypothesis; and (3) that the experiment was a good effort to find an effect. These criteria are consistent with the meta-rules for psychology. The good-effort criterion is subjective, which is somewhat undesirable, but the alternative—never accepting the null hypothesis—is neither desirable nor practical.

11.
In two studies, we investigated whether using three-dimensional (3D) manipulatives during assessment aided performance on a variety of preschool mathematics tasks compared to pictorial representations. On measures of children's understanding of counting and cardinality (n = 103), there was no difference in performance between manipulatives and pictures, with Bayes factors suggesting moderate evidence in favor of the null hypothesis. On a measure of children's shape identification (n = 93), there was no difference in performance between objects and pictures, with Bayes factors suggesting moderate evidence in favor of the null hypothesis. These results suggest flexibility in the materials that can be used during assessment. Pictures, or 2D renderings of 3D objects, which can be easily printed and reproduced, may be sufficient for assessing counting and shape knowledge without the need for more cumbersome concrete manipulatives.

12.
Nonsignificant results (e.g., p > 0.05) are very common in psychological research and are easily misinterpreted as evidence for accepting the null hypothesis, which may lead to erroneous inferences in group-matching studies or to genuine effects being masked by nonsignificant results from small samples. However, no empirical study in China has yet investigated how prevalent nonsignificant results are or how they are interpreted. This study surveyed 500 empirical articles in Chinese psychology journals, counted the frequency of negative statements about nonsignificant results in their abstracts, assessed the accuracy of the inferences based on those negative statements, and used Bayes factors to re-evaluate the studies whose nonsignificant results reported t values. The results show that 36% of the abstracts mentioned nonsignificant results, containing 236 negative statements in total. Of these, 41% interpreted the nonsignificant result inappropriately (e.g., as support for the null hypothesis). A Bayes factor analysis of the studies reporting t values showed that only 5.1% of the nonsignificant results provided strong evidence for the null hypothesis (BF01 > 10). Compared with a previous survey of international psychology journals (32% of abstracts contained negative statements; 72% of negative statements misinterpreted nonsignificant results), Chinese psychology journals reported nonsignificant results more often and misinterpreted them less often. Nevertheless, researchers in China still need to improve their understanding of nonsignificant results and to adopt statistical methods suited to evaluating them.
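The re-evaluation step described above can be approximated from summary statistics alone. The R sketch below converts a reported t value and group sizes into BF01 with the BayesFactor package's default prior; the numbers are placeholders, and the "BF01 > 10" cut-off follows the abstract.

# Recomputing evidence for the null from a reported nonsignificant t test.
library(BayesFactor)

t_obs <- 1.2    # hypothetical reported t value
n1 <- 25        # hypothetical group sizes
n2 <- 25

bf10 <- ttest.tstat(t = t_obs, n1 = n1, n2 = n2, simple = TRUE)  # BF10 from summary statistics
bf01 <- 1 / bf10                                                 # evidence for H0 over H1

bf01
bf01 > 10       # the "strong evidence for the null" criterion used in the abstract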

13.
The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.
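For the group-comparison part of the analysis described above, a Bayesian repeated-measures ANOVA can be sketched in R with the BayesFactor package. The variable names (net score per block, decision-style group, subject identifier) and the simulated data frame are assumptions for illustration, not the authors' data or code.

# Sketch of a Bayesian repeated-measures ANOVA for IGT net scores (simulated data).
library(BayesFactor)

set.seed(4)
igt <- expand.grid(subject = factor(1:40), block = factor(1:5))
igt$style <- factor(ifelse(as.integer(igt$subject) <= 20, "intuitive", "deliberate"))
igt$netscore <- rnorm(nrow(igt), mean = 2 * as.integer(igt$block))  # learning trend, no group effect

# Compare models with and without decision style, treating subject as a random effect.
bf <- anovaBF(netscore ~ style * block + subject, data = igt, whichRandom = "subject")
print(bf)
# Dividing one element of bf by another compares those two models directly,
# for example a model with the style-by-block interaction against one without it.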

14.
One of the most important methodological problems in psychological research is assessing the reasonableness of null models, which typically constrain a parameter to a specific value such as zero. The Bayes factor has recently been advocated in the statistical and psychological literature as a principled means of measuring the evidence in data for various models, including those where parameters are set to specific values. Yet, it is rarely adopted in substantive research, perhaps because of the difficulties in computation. Fortunately, for this problem, the Savage–Dickey density ratio (Dickey & Lientz, 1970) provides a conceptually simple approach to computing the Bayes factor. Here, we review methods for computing the Savage–Dickey density ratio, and highlight an improved method, originally suggested by Gelfand and Smith (1990) and advocated by Chib (1995), that outperforms those currently discussed in the psychological literature. The improved method is based on conditional quantities, which may be integrated by Markov chain Monte Carlo sampling to estimate Bayes factors. These conditional quantities efficiently utilize all the information in the MCMC chains, leading to accurate estimation of Bayes factors. We demonstrate the method by computing Bayes factors in one-sample and one-way designs, and show how it may be implemented in WinBUGS.
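For reference, the Savage–Dickey density ratio expresses the Bayes factor for a point null H0: theta = theta0, nested within H1, as the ratio of the posterior to the prior density of theta at theta0 under H1 (generic notation, not tied to the article's examples):

\[
BF_{01} = \frac{p(y \mid H_0)}{p(y \mid H_1)} = \frac{p(\theta = \theta_0 \mid y, H_1)}{p(\theta = \theta_0 \mid H_1)}.
\]

Roughly speaking, the improvement discussed above estimates the numerator by averaging the conditional posterior density of theta at theta0, given the other parameters, over the MCMC draws, rather than by smoothing the marginal MCMC samples of theta directly.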

15.
Procedures used for statistical inference are receiving increased scrutiny as the scientific community studies the factors associated with ensuring reproducible research. This note addresses recent negative attention directed at p values, the relationship of confidence intervals and tests, and the role of Bayesian inference and Bayes factors, with an eye toward better understanding these different strategies for statistical inference. We argue that researchers and data analysts too often resort to binary decisions (e.g., whether to reject or accept the null hypothesis) in settings where this may not be required.

16.
In comparing characteristics of independent populations, researchers frequently expect a certain structure of the population variances. These expectations can be formulated as hypotheses with equality and/or inequality constraints on the variances. In this article, we consider the Bayes factor for testing such (in)equality-constrained hypotheses on variances. Application of Bayes factors requires specification of a prior under every hypothesis to be tested. However, specifying subjective priors for variances based on prior information is a difficult task. We therefore consider so-called automatic or default Bayes factors. These methods avoid the need for the user to specify priors by using information from the sample data. We present three automatic Bayes factors for testing variances. The first is a Bayes factor with equal priors on all variances, where the priors are specified automatically using a small share of the information in the sample data. The second is the fractional Bayes factor, where a fraction of the likelihood is used for automatic prior specification. The third is an adjustment of the fractional Bayes factor such that the parsimony of inequality-constrained hypotheses is properly taken into account. The Bayes factors are evaluated by investigating different properties such as information consistency and large sample consistency. Based on this evaluation, it is concluded that the adjusted fractional Bayes factor is generally recommendable for testing equality- and inequality-constrained hypotheses on variances.
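For orientation, O'Hagan's fractional Bayes factor mentioned above uses a fraction b of the likelihood to convert default (possibly improper) priors into data-based priors; in generic notation for hypotheses H_i with priors \pi_i, not the authors' specific parameterization:

\[
FBF_{01}(b) = \frac{q_0(b)}{q_1(b)},
\qquad
q_i(b) = \frac{\int \pi_i(\theta_i)\, f(y \mid \theta_i, H_i)\, d\theta_i}{\int \pi_i(\theta_i)\, f(y \mid \theta_i, H_i)^{b}\, d\theta_i}.
\]

The adjusted variant evaluated in the article then modifies this construction so that one-sided, inequality-constrained hypotheses are properly credited for their parsimony.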

17.
Valid use of the traditional independent samples ANOVA procedure requires that the population variances are equal. Previous research has investigated whether variance homogeneity tests, such as Levene's test, are satisfactory as gatekeepers for identifying when to use or not to use the ANOVA procedure. This research focuses on a novel homogeneity of variance test that incorporates an equivalence testing approach. Instead of testing the null hypothesis that the variances are equal against an alternative hypothesis that the variances are not equal, the equivalence-based test evaluates the null hypothesis that the difference in the variances falls outside or on the border of a predetermined interval against an alternative hypothesis that the difference in the variances falls within the predetermined interval. Thus, with the equivalence-based procedure, the alternative hypothesis is aligned with the research hypothesis (variance equality). A simulation study demonstrated that the equivalence-based test of population variance homogeneity is a better gatekeeper for the ANOVA than traditional homogeneity of variance tests.
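The abstract does not give the test statistic, so the following is only a generic two-group sketch of equivalence testing on a variance ratio (two one-sided F-tests), not the authors' Levene-type procedure; the equivalence bounds are arbitrary placeholders.

# Generic TOST-style equivalence test for the ratio of two normal variances.
# Illustrative only; this is not the procedure from the article above.
var_equivalence_tost <- function(x, y, lower = 1 / 1.5, upper = 1.5, alpha = 0.05) {
  ratio <- var(x) / var(y)
  df1 <- length(x) - 1
  df2 <- length(y) - 1
  # H0a: sigma_x^2 / sigma_y^2 <= lower, rejected for large ratio / lower
  p_lower <- 1 - pf(ratio / lower, df1, df2)
  # H0b: sigma_x^2 / sigma_y^2 >= upper, rejected for small ratio / upper
  p_upper <- pf(ratio / upper, df1, df2)
  p_tost <- max(p_lower, p_upper)          # conclude equivalence if p_tost < alpha
  c(ratio = ratio, p = p_tost, equivalent = p_tost < alpha)
}

set.seed(5)
var_equivalence_tost(rnorm(60), rnorm(60))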

18.
Knowing which properties of visual displays facilitate statistical reasoning bears practical and theoretical implications. Therefore, we studied the effect of one property of visual displays, iconicity (i.e., the resemblance of a visual sign to its referent), on Bayesian reasoning. Two main accounts of statistical reasoning predict different effects of iconicity on Bayesian reasoning. The ecological-rationality account predicts a positive iconicity effect, because more highly iconic signs resemble more individuated objects, which tap better into an evolutionary-designed frequency-coding mechanism that, in turn, facilitates Bayesian reasoning. The nested-sets account predicts a null iconicity effect, because iconicity does not affect the salience of a nested-sets structure—the factor facilitating Bayesian reasoning processed by a general reasoning mechanism. In two well-powered experiments (N = 577), we found no support for a positive iconicity effect across different iconicity levels that were manipulated in different visual displays (meta-analytical overall effect: log OR = -0.13, 95% CI [-0.53, 0.28]). A Bayes factor analysis provided strong evidence in favor of the null hypothesis—the null iconicity effect. Thus, these findings corroborate the nested-sets rather than the ecological-rationality account of statistical reasoning.

19.
We tested whether the acquisition of grapheme-color synesthesia during childhood is related to difficulties in written language learning by measuring whether it is more frequent in 79 children receiving speech and language therapy for such difficulties than in the general population of children (1.3%). By using criteria as similar as possible to those used in the reference study (Simner et al., 2009), we did not identify any synesthete (Bayesian 95% credible interval [0, 4.5]% for a flat prior). The odds of the null model (no difference between 0/79 and 1.3%) over the alternative model are 28 (Bayes factor). A higher prevalence of grapheme-color synesthetes among children with learning difficulties is therefore very unlikely, questioning the hypothesis of a link between synesthesia and difficulties in language acquisition. We also describe the difficulty of diagnosing synesthesia in children and discuss the need for new approaches to do so.
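The two numbers quoted above (the [0, 4.5]% credible interval and the Bayes factor of about 28) can be reproduced, at least approximately, from the reported counts alone. The R sketch below assumes a flat Beta(1, 1) prior on the prevalence, 0 synesthetes out of 79 children, and 1.3% as the point null; it is a reconstruction, not the authors' code.

# Reconstructing the reported interval and Bayes factor from 0 hits in 79 children.
n <- 79
k <- 0
p0 <- 0.013                      # general-population prevalence used as the point null

# Posterior under a flat Beta(1, 1) prior is Beta(k + 1, n - k + 1) = Beta(1, 80).
qbeta(c(0.025, 0.975), k + 1, n - k + 1)   # roughly [0, 0.045], i.e. [0, 4.5]%

# Bayes factor for the point null (prevalence = 1.3%) against a flat-prior alternative:
marginal_null <- dbinom(k, n, p0)
marginal_alt  <- integrate(function(p) dbinom(k, n, p), 0, 1)$value  # equals 1 / (n + 1)
marginal_null / marginal_alt     # approximately 28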

20.
The analysis of R×C contingency tables usually features a test for independence between row and column counts. Throughout the social sciences, the adequacy of the independence hypothesis is generally evaluated by the outcome of a classical p-value null-hypothesis significance test. Unfortunately, however, the classical p-value comes with a number of well-documented drawbacks. Here we outline an alternative, Bayes factor method to quantify the evidence for and against the hypothesis of independence in R×C contingency tables. First we describe different sampling models for contingency tables and provide the corresponding default Bayes factors as originally developed by Gunel and Dickey (Biometrika, 61(3):545–557 (1974)). We then illustrate the properties and advantages of a Bayes factor analysis of contingency tables through simulations and practical examples. Computer code is available online and has been incorporated in the "BayesFactor" R package and the JASP program (jasp-stats.org).
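Since the abstract notes that the method is available in the BayesFactor R package and in JASP, here is a minimal usage sketch with an invented 2 x 3 table; the sampling model (independent multinomial with fixed row margins) and the counts are assumptions chosen for illustration.

# Default Gunel-Dickey Bayes factor for an R x C contingency table (invented counts).
library(BayesFactor)

tab <- matrix(c(12, 18, 30,
                22, 15, 23), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              response = c("low", "medium", "high")))

# Independent multinomial sampling with row totals fixed by design.
bf <- contingencyTableBF(tab, sampleType = "indepMulti", fixedMargin = "rows")
print(bf)       # evidence for dependence (non-independence) over independence

1 / bf          # evidence for independence, if that is the hypothesis of interest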
