19 similar documents found (search time: 125 ms)
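The familiar null hypothesis significance test has been repeatedly challenged and defended, yet its standing remains unshaken, and reporting test results is still routine practice in statistical analysis. Its limitations, however, have prompted researchers to explore additional statistical methods such as interval estimation, effect size analysis, and power analysis. This paper first introduces the relationship between hypothesis testing and confidence intervals, then discusses how statistical power relates to the two types of error rates and to effect size, and finally, having clarified how these methods fit together, provides a workable statistical analysis procedure.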

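Because of the problems with null hypothesis significance testing, many researchers, journal editors, and research societies abroad have in recent years recommended or required that quantitative studies report effect sizes as a supplement to significance test results. In China, however, few scholars in psychology, education, and other social sciences have studied effect sizes specifically. This article discusses the problems with significance testing, the definition and importance of effect sizes, their classification, methods of calculation, and standards for interpretation.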
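As an illustration of the kind of effect size calculation this abstract refers to, the sketch below computes Cohen's d, one common difference-type effect size, for two independent groups. The abstract does not say which effect sizes the article covers in detail, so the choice of Cohen's d, the function name `cohens_d`, and the sample data are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation.

    Hypothetical helper for illustration; assumes two independent groups
    with at least two observations each.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up data: the group means differ by 2 and both sample variances are 2.5.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]
d = cohens_d(treatment, control)
print(round(d, 3))
```

Unlike a p value, d does not grow with sample size, which is why it is reported alongside the test result rather than in place of it.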

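Nonsignificant results (e.g., p > 0.05) are very common in psychological research. They are easily misread as evidence for accepting the null hypothesis, which can lead to erroneous inferences in group-matching studies or to overlooking real effects masked by nonsignificant results from small samples. No empirical study in China, however, has yet surveyed how prevalent nonsignificant results are and how they are interpreted. This study surveyed 500 Chinese empirical psychology articles, counted how often their abstracts contained negative statements related to nonsignificant results, judged and tallied the accuracy of inferences based on those statements, and used Bayes factors to re-evaluate the nonsignificant results in studies that reported t values. The results showed that 36% of abstracts mentioned nonsignificant results, containing 236 negative statements in total. Of these, 41% misinterpreted the nonsignificant result (e.g., as supporting the null hypothesis). Bayes factor analysis of the studies reporting t values showed that only 5.1% of the nonsignificant results provided strong evidence for the null hypothesis (BF01 > 10). Compared with an earlier survey of international psychology journals (32% of abstracts contained negative statements; 72% of negative statements misinterpreted nonsignificant results), Chinese psychology journals reported nonsignificant results at a higher rate and misinterpreted them at a lower rate. Nevertheless, researchers in China still need to improve their understanding of nonsignificant results and to promote statistical methods suited to evaluating them.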
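To make concrete how a reported t value can be turned into a Bayes factor, the sketch below uses the BIC approximation for a one-sample (or paired) t test, BF01 ≈ √n · (1 + t²/df)^(−n/2). This is an assumption chosen for simplicity: the abstract does not state which Bayes factor the study computed, and default Bayes factors with an explicit prior on effect size will give different values. The function name `bf01_bic_one_sample` is hypothetical.

```python
from math import exp, log


def bf01_bic_one_sample(t, n):
    """BIC approximation to BF01 for a one-sample or paired t test.

    Hypothetical helper: t is the reported t statistic, n the sample size.
    Values above 1 favor H0; values below 1 favor H1.
    """
    df = n - 1
    # BF01 = exp((BIC1 - BIC0) / 2), with BIC1 - BIC0 = n*ln(SSE1/SSE0) + ln(n)
    # and SSE0/SSE1 = 1 + t^2/df. Work on the log scale for numerical stability.
    log_bf01 = 0.5 * log(n) - (n / 2) * log(1 + t * t / df)
    return exp(log_bf01)


# A nonsignificant t from a small sample gives only weak support for H0,
# well short of the BF01 > 10 threshold the abstract calls strong evidence.
print(bf01_bic_one_sample(t=1.0, n=25))
```

Under this approximation BF01 cannot exceed √n, so a sample of fewer than about 100 observations can never reach BF01 > 10, however small the observed t.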

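In psychological research, researchers may need to assess whether an effect is absent, but the commonly used null hypothesis significance test cannot provide evidence in support of a null effect. In practice, researchers therefore either avoid cases where p > 0.05 or wrongly take p > 0.05 as support for the null hypothesis. In recent years, equivalence testing, Bayesian estimation, and Bayes factors have increasingly been used to evaluate null effects. This article introduces the principles of these three methods and demonstrates their application through two worked examples. These three methods can help psychology researchers make sound statistical inferences and research decisions in practice.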
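The equivalence-testing idea mentioned here can be sketched as two one-sided tests (TOST): the effect is declared equivalent to zero only if it is significantly above a lower bound and significantly below an upper bound. The version below uses a large-sample normal approximation instead of the t distribution, so it is only a rough sketch; the function name `tost_z`, the equivalence bounds, and the input numbers are assumptions for illustration, not taken from the article.

```python
from statistics import NormalDist


def tost_z(diff, se, low, high, alpha=0.05):
    """Two one-sided z tests for equivalence (large-sample approximation).

    Hypothetical helper: diff is the observed effect, se its standard error,
    and (low, high) the smallest effect sizes of interest.
    Returns the TOST p value and whether equivalence is declared.
    """
    z = NormalDist()
    p_lower = 1 - z.cdf((diff - low) / se)  # H0: true effect <= low
    p_upper = z.cdf((diff - high) / se)     # H0: true effect >= high
    p = max(p_lower, p_upper)               # both one-sided tests must reject
    return p, p < alpha


# Same observed difference, two different choices of equivalence bounds.
print(tost_z(0.0, 0.2, -1.0, 1.0))   # wide bounds: equivalence declared
print(tost_z(0.0, 0.2, -0.1, 0.1))   # narrow bounds: inconclusive
```

The second call shows why the bounds must be fixed in advance: with narrow bounds the same data neither reject the null effect nor support it.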

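As a typical moral emotion, guilt is thought to promote prosocial behavior, yet many studies have found that it does not always do so. To clarify the effect of guilt on prosocial behavior and to analyze possible reasons for the conflicting conclusions, this study used meta-analysis to examine the relationship between trait guilt and prosocial behavior and the effect of state guilt on prosocial behavior. A total of 46 articles with 92 independent samples were included (N = 17248). The results showed that: (1) trait guilt had a moderate positive correlation with prosocial behavior, and the relationship was moderated by the type of prosocial behavior, with trait guilt correlating more strongly with compensation than with donation, helping, or pro-environmental behavior; (2) priming a state of guilt significantly increased individuals' prosocial behavior, although the effect size was small, and the target of the prosocial behavior acted as a moderator, with guilty individuals more willing to act prosocially toward the victim; (3) p-curve analysis found that the p-curves of both meta-analyses were significantly right-skewed, indicating that the relationship between trait guilt and prosocial behavior and the effect of state guilt on prosocial behavior both reflect true effects rather than publication bias or p-hacking.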

郑昊敏  温忠麟  吴艳 《心理科学进展》 (Advances in Psychological Science) 2011,19(12):1868-1878
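Effect sizes make up for the shortcomings of null hypothesis testing in quantifying results. Besides reporting test results, many journals also require research reports to include effect sizes. Effect sizes fall into three broad categories: difference-type, correlation-type, and group-overlap. They may be computed and used differently under different research designs (e.g., single-factor and multifactor between-subjects, within-subjects, and mixed experimental designs) or data conditions (e.g., small samples, heterogeneous variances), but many effect sizes can be converted into one another. We compiled a table to help applied researchers choose an appropriate effect size according to research purpose and research type.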

Testing methods and effect size measures for mediation effects: Review and prospects
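By comparing methods for testing mediation effects and comparing effect size measures, the authors recommend abandoning the requirement that the total effect c be significant as a precondition for testing mediation, abandoning the labels of complete and partial mediation based on the significance of the direct effect c', using the bias-corrected percentile bootstrap to test the mediation effect ab directly, and using mediation effect size measures such as $\kappa^2$ and $R^2_{\text{med}}$ while reporting confidence intervals for the effect size. As an example, the MBESS package in R was used to test the mediation effect and measure its effect size in a survey of firefighters' dietary health. Directions for extending mediation testing methods and effect size measurement are then discussed.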

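Statistical inference plays a key role in scientific research, yet the classical method most commonly used today, null hypothesis significance testing (NHST), is misused or abused by some researchers because it is hard to understand. Some researchers have proposed the Bayes factor as an alternative and/or complementary statistical method. The Bayes factor is an important method in Bayesian statistics for model comparison and hypothesis testing, and it can be interpreted as the degree of support for the null hypothesis H0 or the alternative hypothesis H1. Compared with NHST, it has the following advantages: it considers H0 and H1 together and can be used to support H0, it is not "severely" biased against H0, it allows changes in the strength of evidence to be monitored, and it is unaffected by the sampling plan. Bayes factors can now be computed conveniently with the open statistical software JASP; this article demonstrates with a Bayesian t test. Using Bayes factors is of real importance for psychology researchers, but care is needed to choose a reasonable prior distribution and to keep the data analysis process transparent and open.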

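Using the significance test for a difference between means as an example, this article introduces the basic principles and methods for going on to estimate statistical power and effect size after hypothesis testing has been carried out on experimental data.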
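The link between effect size and power for a mean-difference test can be sketched with a normal approximation. For a two-sided test comparing two independent groups of equal size n with standardized effect d, power is approximately Φ(d√(n/2) − z) + Φ(−d√(n/2) − z), where z is the critical value for the chosen α. The article's own formulas are not given in the abstract, so this approximation, the function name `power_two_sample`, and the example numbers are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of means.

    Hypothetical helper using the normal approximation; d is the
    standardized mean difference and n_per_group the size of each group.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = d * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# A medium effect (d = 0.5) with 64 participants per group.
print(round(power_two_sample(0.5, 64), 2))
```

When d is 0 the function returns α, the Type I error rate, which shows how power, effect size, and the two error rates sit in one formula.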

王阳  温忠麟  付媛姝 《心理科学进展》 (Advances in Psychological Science) 2020,28(11):1961-1969
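Commonly used fit indices for structural equation models have certain limitations: χ² takes the traditional null hypothesis as its target hypothesis and so cannot confirm a model, while descriptive fit indices such as RMSEA and CFI lack inferential statistical properties. Equivalence testing effectively addresses these problems. The article first explains how equivalence testing evaluates the fit of a single model and how it differs from null hypothesis testing, then introduces how equivalence testing analyzes measurement invariance, and then uses empirical data to demonstrate equivalence testing in single-model evaluation and measurement invariance testing, comparing it with traditional model evaluation methods.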

吕小康 《心理科学》 (Journal of Psychological Science) 2012,35(6):1502-1506
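Fisher and Neyman–Pearson, the originators of hypothesis testing, disagreed on many points, including the methodological basis of statistical models, the nature of the two types of error, the understanding of the significance level, and the function of hypothesis testing. As a result, the null hypothesis significance testing model most commonly used in psychological statistics harbors various implicit contradictions, which have given rise to disputes over its application. Psychological statistics needs not only to examine the ambiguities of the existing testing model and propose other complementary modes of statistical inference, but also to reflect on the educational tradition of psychological statistics, so as to build a more open and pluralistic view of statistical application and make psychological statistics serve psychological research better.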

Aim: This paper highlights some of the areas where there are problems with the way that statistics are conducted and reported in psychology journals. Recommendations are given for improving these problems. Sample: The choice of topics is based largely on the questions that authors, reviewers, and editors have asked in recent years. The focus is on null hypothesis significance testing (NHST), choosing a statistical test, and what should be included in results sections. Results: There are several ways to improve how statistics are reported. These should improve both the authors' and the readers' understanding of the data. Conclusions: Psychology as a discipline will improve if the way in which statistics are conducted and reported is improved. This will require effort from authors, scrutiny from reviewers, and stubbornness from editors.

For comparing nested covariance structure models, the standard procedure is the likelihood ratio test of the difference in fit, where the null hypothesis is that the models fit identically in the population. A procedure for determining statistical power of this test is presented where effect size is based on a specified difference in overall fit of the models. A modification of the standard null hypothesis of zero difference in fit is proposed allowing for testing an interval hypothesis that the difference in fit between models is small, rather than zero. These developments are combined yielding a procedure for estimating power of a test of a null hypothesis of small difference in fit versus an alternative hypothesis of larger difference.

Valid use of the traditional independent samples ANOVA procedure requires that the population variances are equal. Previous research has investigated whether variance homogeneity tests, such as Levene's test, are satisfactory as gatekeepers for identifying when to use or not to use the ANOVA procedure. This research focuses on a novel homogeneity of variance test that incorporates an equivalence testing approach. Instead of testing the null hypothesis that the variances are equal against an alternative hypothesis that the variances are not equal, the equivalence-based test evaluates the null hypothesis that the difference in the variances falls outside or on the border of a predetermined interval against an alternative hypothesis that the difference in the variances falls within the predetermined interval. Thus, with the equivalence-based procedure, the alternative hypothesis is aligned with the research hypothesis (variance equality). A simulation study demonstrated that the equivalence-based test of population variance homogeneity is a better gatekeeper for the ANOVA than traditional homogeneity of variance tests.

Tryon WW  Lewis C 《Psychological Methods》2008,13(3):272-277
Evidence of group matching frequently takes the form of a nonsignificant test of statistical difference. Theoretical hypotheses of no difference are also tested in this way. These practices are flawed in that null hypothesis statistical testing provides evidence against the null hypothesis and failing to reject H0 is not evidence supportive of it. Tests of statistical equivalence are needed. This article corrects the inferential confidence interval (ICI) reduction factor introduced by W. W. Tryon (2001) and uses it to extend his discussion of statistical equivalence. This method is shown to be algebraically equivalent with D. J. Schuirmann's (1987) use of 2 one-sided t tests, a highly regarded and accepted method of testing for statistical equivalence. The ICI method provides an intuitive graphic method for inferring statistical difference as well as equivalence. Trivial difference occurs when a test of difference and a test of equivalence are both passed. Statistical indeterminacy results when both tests are failed. Hybrid confidence intervals are introduced that impose ICI limits on standard confidence intervals. These intervals are recommended as replacements for error bars because they facilitate inferences.

Chow SL 《The Behavioral and brain sciences》1998,21(2):169-94; discussion 194-239
The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. 
At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.

Issues involved in the evaluation of null hypotheses are discussed. The use of equivalence testing is recommended as a possible alternative to the use of simple t or F tests for evaluating a null hypothesis. When statistical power is low and larger sample sizes are not available or practical, consideration should be given to using one-tailed tests or less conservative levels for determining criterion levels of statistical significance. Effect sizes should always be reported along with significance levels, as both are needed to understand results of research. Probabilities alone are not enough and are especially problematic for very large or very small samples. Pre-existing group differences should be tested and properly accounted for when comparing independent groups on dependent variables. If confirmation of a null hypothesis is expected, potential suppressor variables should be considered. If different methods are used to select the samples to be compared, controls for social desirability bias should be implemented. When researchers deviate from these standards or appear to assume that such standards are unimportant or irrelevant, their results should be deemed less credible than when such standards are maintained and followed. Several examples of recent violations of such standards in family social science, comparing gay, lesbian, bisexual, and transgender families with heterosexual families, are provided. Regardless of their political values or expectations, researchers should strive to test null hypotheses rigorously, in accordance with the best professional standards.

The practice of statistical inference in psychological research is critically reviewed. Particular emphasis is put on the fast pace of change from the sole reliance on null hypothesis significance testing (NHST) to the inclusion of effect size estimates, confidence intervals, and an interest in the Bayesian approach. We conclude that these developments are helpful for psychologists seeking to extract a maximum of useful information from statistical research data, and that seven decades of criticism against NHST is finally having an effect.
