
1.

Utilisation des intervalles de confiance au lieu des tests de signification : aspects épistémologiques et pratiques

《Psychologie du Travail et des Organisations》2008,14(1):9-42

The use of confidence intervals instead of significance tests is strongly recommended by the fifth edition of the manual of the American Psychological Association (2001). This possibility as well as other improvements in statistical practice are discussed in the framework of the major theoretical options subtending statistical inference and the way they have been applied in psychology for about 50 years. First, the suggestion of a complete ban on statistical testing is examined and rejected. Next, a procedure consisting in measuring the fit of two competing models based on the likelihood ratio is judged interesting and commendable. Finally, the superiority of an approach based on confidence intervals instead of significance tests is assessed and illustrated by its application to an experimental study aiming to demonstrate the absence instead of the presence of an effect of the independent variable.

2.

A Test by Any Other Name: P Values, Bayes Factors, and Statistical Inference

Hal S. Stern 《Multivariate behavioral research》2016,51(1):23-29

Procedures used for statistical inference are receiving increased scrutiny as the scientific community studies the factors associated with ensuring reproducible research. This note addresses recent negative attention directed at p values, the relationship of confidence intervals and tests, and the role of Bayesian inference and Bayes factors, with an eye toward better understanding these different strategies for statistical inference. We argue that researchers and data analysts too often resort to binary decisions (e.g., whether to reject or accept the null hypothesis) in settings where this may not be required.

3.

Generalizations and Extensions of the Probability of Superiority Effect Size Estimator

John Ruscio Benjamin Lee Gera 《Multivariate behavioral research》2013,48(2):208-219

Researchers are strongly encouraged to accompany the results of statistical tests with appropriate estimates of effect size. For 2-group comparisons, a probability-based effect size estimator (A) has many appealing properties (e.g., it is easy to understand, robust to violations of parametric assumptions, insensitive to outliers). We review generalizations of the A statistic to extend its use to applications with discrete data, with weighted data, with k > 2 groups, and with correlated samples. These generalizations are illustrated through reanalyses of data from published studies on sex differences in the acceptance of hypothetical offers of casual sex and in scores on a measure of economic enlightenment, on age differences in reported levels of Authentic Pride, and in differences between the numbers of promises made and kept in romantic relationships. Drawing from research on the construction of confidence intervals for the A statistic, we recommend a bootstrap method that can be used for each generalization of A. We provide a suite of programs that should make it easy to use the A statistic and accompany it with a confidence interval in a wide variety of research contexts.
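
As a rough illustration of the two-group case described in this abstract, the sketch below estimates A as P(X > Y) + 0.5 · P(X = Y) over all cross-group pairs and attaches a simple percentile bootstrap interval. The function names are hypothetical, and the percentile bootstrap is an assumption chosen for brevity; it is not necessarily the bootstrap variant the authors recommend, and it is not their program suite.

```python
import random

def prob_superiority(x, y):
    """Estimate A = P(X > Y) + 0.5 * P(X = Y) over all cross-group pairs."""
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return score / (len(x) * len(y))

def percentile_bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for A (illustrative assumption, see lead-in)."""
    rng = random.Random(seed)
    stats = sorted(
        # Resample each group with replacement, keeping the group sizes fixed.
        prob_superiority(rng.choices(x, k=len(x)), rng.choices(y, k=len(y)))
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

A value of A near 0.5 indicates that neither group tends to score higher; values near 0 or 1 indicate near-complete separation of the two groups.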

4.

Fisher transformation based confidence intervals of correlations in fixed- and random-effects meta-analysis

Thilo Welz Philipp Doebler Markus Pauly 《The British journal of mathematical and statistical psychology》2022,75(1):1-22

Meta-analyses of correlation coefficients are an important technique to integrate results from many cross-sectional and longitudinal research designs. Uncertainty in pooled estimates is typically assessed with the help of confidence intervals, which can double as hypothesis tests for two-sided hypotheses about the underlying correlation. A standard approach to construct confidence intervals for the main effect is the Hedges-Olkin-Vevea Fisher-z (HOVz) approach, which is based on the Fisher-z transformation. Results from previous studies (Field, 2005, Psychol. Meth., 10, 444; Hafdahl and Williams, 2009, Psychol. Meth., 14, 24), however, indicate that in random-effects models the performance of the HOVz confidence interval can be unsatisfactory. To this end, we propose improvements of the HOVz approach, which are based on enhanced variance estimators for the main effect estimate. In order to study the coverage of the new confidence intervals in both fixed- and random-effects meta-analysis models, we perform an extensive simulation study, comparing them to established approaches. Data were generated via a truncated normal and beta distribution model. The results show that our newly proposed confidence intervals based on a Knapp-Hartung-type variance estimator or robust heteroscedasticity consistent sandwich estimators in combination with the integral z-to-r transformation (Hafdahl, 2009, Br. J. Math. Stat. Psychol., 62, 233) provide more accurate coverage than existing approaches in most scenarios, especially in the more appropriate beta distribution simulation model.
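
For orientation, the textbook Fisher-z interval that this abstract builds on can be sketched as follows. This is only the basic fixed-effect version, assuming the usual large-sample variance 1/(n − 3) and a plain tanh back-transformation; it does not implement the Knapp-Hartung-type or sandwich variance estimators, nor the integral z-to-r transformation, that the authors propose. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_z_ci(r, n, level=0.95):
    """Interval for a single correlation r observed in a sample of size n."""
    z = math.atanh(r)                       # Fisher z transformation
    se = 1.0 / math.sqrt(n - 3)             # large-sample standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

def pooled_fixed_effect(rs, ns, level=0.95):
    """Fixed-effect pooled correlation with weights n_i - 3 on the z scale."""
    weights = [n - 3 for n in ns]
    z_bar = sum(w * math.atanh(r) for w, r in zip(weights, rs)) / sum(weights)
    se = 1.0 / math.sqrt(sum(weights))
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    interval = (math.tanh(z_bar - crit * se), math.tanh(z_bar + crit * se))
    return math.tanh(z_bar), interval
```

Because the interval is built on the z scale and then back-transformed, it is asymmetric around r and always stays inside (−1, 1).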

5.

The value of RCT evidence depends on the quality of statistical analysis

Faulkner C Fidler F Cumming G 《Behaviour research and therapy》2008,46(2):270-281

The authors examined statistical practices in 193 randomized controlled trials (RCTs) of psychological therapies published in prominent psychology and psychiatry journals during 1999-2003. Statistical significance tests were used in 99% of RCTs, 84% discussed clinical significance, but only 46% considered, even minimally, statistical power, 31% interpreted effect size and only 2% interpreted confidence intervals. In a second study, 42 respondents to an email survey of the authors of RCTs analyzed in the first study indicated they consider it very important to know the magnitude and clinical importance of the effect, in addition to whether a treatment effect exists. The present authors conclude that published RCTs focus on statistical significance tests ("Is there an effect or difference?"), and neglect other important questions: "How large is the effect?" and "Is the effect clinically important?" They advocate improved statistical reporting of RCTs especially by reporting and interpreting clinical significance, effect sizes and confidence intervals.

6.

The statistical recommendations of the American Psychological Association Publication Manual: Effect sizes, confidence intervals, and meta‐analysis

Fiona Fidler Pav Kalinowski Jerry Lai 《Australian journal of psychology》2012,64(3):138-146

Estimation based on effect sizes, confidence intervals, and meta‐analysis usually provides a more informative analysis of empirical results than does statistical significance testing, which has long been the conventional choice in psychology. The sixth edition of the American Psychological Association Publication Manual now recommends that psychologists should, wherever possible, use estimation and base their interpretation of research results on point and interval estimates. We outline the Manual's recommendations and suggest how they can be put into practice: adopt an estimation framework, starting with the formulation of research aims as ‘How much?’ or ‘To what extent?’ questions. Calculate from your data effect size estimates and confidence intervals to answer those questions, then interpret. Wherever appropriate, use meta‐analysis to integrate evidence over studies. The Manual's recommendations can assist psychologists improve the way they do their statistics and help build a more quantitative and cumulative discipline.

7.

MorePower 6.0 for ANOVA with relational confidence intervals and Bayesian analysis

Jamie I. D. Campbell Valerie A. Thompson 《Behavior research methods》2012,44(4):1255-1265

MorePower 6.0 is a flexible freeware statistical calculator that computes sample size, effect size, and power statistics for factorial ANOVA designs. It also calculates relational confidence intervals for ANOVA effects based on formulas from Jarmasz and Hollands (Canadian Journal of Experimental Psychology 63:124–138, 2009), as well as Bayesian posterior probabilities for the null and alternative hypotheses based on formulas in Masson (Behavior Research Methods 43:679–690, 2011). The program is unique in affording direct comparison of these three approaches to the interpretation of ANOVA tests. Its high numerical precision and ability to work with complex ANOVA designs could facilitate researchers’ attention to issues of statistical power, Bayesian analysis, and the use of confidence intervals for data interpretation. MorePower 6.0 is available at https://wiki.usask.ca/pages/viewpageattachments.action?pageId=420413544.

8.

Abstract: Inference and Interval Estimation for Indirect Effects With Latent Variable Models

Carl F. Falk Jeremy C. Biesanz 《Multivariate behavioral research》2013,48(6)

Models specifying indirect effects (or mediation) and structural equation modeling are both popular in the social sciences. Yet relatively little research has compared methods that test for indirect effects among latent variables and provided precise estimates of the effectiveness of different methods.

This simulation study provides an extensive comparison of methods for constructing confidence intervals and for making inferences about indirect effects with latent variables. We compared the percentile (PC) bootstrap, bias-corrected (BC) bootstrap, bias-corrected accelerated (BC_a) bootstrap, likelihood-based confidence intervals (Neale & Miller, 1997), partial posterior predictive (Biesanz, Falk, and Savalei, 2010), and joint significance tests based on Wald tests or likelihood ratio tests. All models included three reflective latent variables representing the independent, dependent, and mediating variables. The design included the following fully crossed conditions: (a) sample size: 100, 200, and 500; (b) number of indicators per latent variable: 3 versus 5; (c) reliability per set of indicators: .7 versus .9; and (d) 16 different path combinations for the indirect effect (α = 0, .14, .39, or .59; and β = 0, .14, .39, or .59). Simulations were performed using a WestGrid cluster of 1680 3.06GHz Intel Xeon processors running R and OpenMx.

Results based on 1,000 replications per cell and 2,000 resamples per bootstrap method indicated that the BC and BC_a bootstrap methods have inflated Type I error rates. Likelihood-based confidence intervals and the PC bootstrap emerged as methods that adequately control Type I error and have good coverage rates.

9.

The UMP Exact Test and the Confidence Interval for Person Parameters in IRT Models

Xiang Liu Zhuangzhuang Han Matthew S. Johnson 《Psychometrika》2018,83(1):182-202

In educational and psychological measurement when short test forms are used, the asymptotic normality of the maximum likelihood estimator of the person parameter of item response models does not hold. As a result, hypothesis tests or confidence intervals of the person parameter based on the normal distribution are likely to be problematic. Inferences based on the exact distribution, on the other hand, do not suffer from this limitation. However, the computation involved for the exact distribution approach is often prohibitively expensive. In this paper, we propose a general framework for constructing hypothesis tests and confidence intervals for IRT models within the exponential family based on exact distribution. In addition, an efficient branch and bound algorithm for calculating the exact p value is introduced. The type-I error rate and statistical power of the proposed exact test as well as the coverage rate and the lengths of the associated confidence interval are examined through a simulation. We also demonstrate its practical use by analyzing three real data sets.

10.

Calculating and graphing within-subject confidence intervals for ANOVA

Baguley T 《Behavior research methods》2012,44(1):158-175

The psychological and statistical literature contains several proposals for calculating and plotting confidence intervals (CIs) for within-subjects (repeated measures) ANOVA designs. A key distinction is between intervals supporting inference about patterns of means (and differences between pairs of means, in particular) and those supporting inferences about individual means. In this report, it is argued that CIs for the former are best accomplished by adapting intervals proposed by Cousineau (Tutorials in Quantitative Methods for Psychology, 1, 42–45, 2005) and Morey (Tutorials in Quantitative Methods for Psychology, 4, 61–64, 2008) so that nonoverlapping CIs for individual means correspond to a confidence interval for their difference that does not include zero. CIs for the latter can be accomplished by fitting a multilevel model. In situations in which both types of inference are of interest, the use of a two-tiered CI is recommended. Free, open-source, cross-platform software for such interval estimates and plots (and for some common alternatives) is provided in the form of R functions for one-way within-subjects and two-way mixed ANOVA designs. These functions provide an easy-to-use solution to the difficult problem of calculating and displaying within-subjects CIs.
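
The Cousineau and Morey intervals that this abstract starts from can be sketched roughly as below, assuming a complete one-way within-subjects design stored as one row per participant. Each score is normalized by removing the participant's own mean and adding back the grand mean, and the resulting variance is inflated by J/(J − 1) for J conditions. The function name is hypothetical; this is not the author's R code, it stops at the standard errors (a t critical value would still be needed for interval limits), and it omits the further adjustment the abstract describes for making non-overlap correspond to a difference interval.

```python
import math
from statistics import mean, variance

def cousineau_morey_se(data):
    """Per-condition standard errors from participant-normalized scores.

    data: list of rows, one row per participant, one column per condition.
    """
    n = len(data)            # number of participants
    j = len(data[0])         # number of within-subjects conditions
    grand = mean(v for row in data for v in row)
    # Cousineau normalization: remove each participant's mean, restore the grand mean.
    normalized = [[v - mean(row) + grand for v in row] for row in data]
    correction = j / (j - 1)  # Morey's correction factor applied to the variance
    return [
        math.sqrt(correction * variance([row[c] for row in normalized]) / n)
        for c in range(j)
    ]
```

With two conditions, multiplying either standard error by the square root of 2 reproduces the standard error of the paired difference, which is why these intervals speak to patterns of means instead of individual means.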

11.

An inferential confidence interval method of establishing statistical equivalence that corrects Tryon's (2001) reduction factor

Tryon WW Lewis C 《Psychological methods》2008,13(3):272-277

Evidence of group matching frequently takes the form of a nonsignificant test of statistical difference. Theoretical hypotheses of no difference are also tested in this way. These practices are flawed in that null hypothesis statistical testing provides evidence against the null hypothesis and failing to reject H0 is not evidence supportive of it. Tests of statistical equivalence are needed. This article corrects the inferential confidence interval (ICI) reduction factor introduced by W. W. Tryon (2001) and uses it to extend his discussion of statistical equivalence. This method is shown to be algebraically equivalent with D. J. Schuirmann's (1987) use of 2 one-sided t tests, a highly regarded and accepted method of testing for statistical equivalence. The ICI method provides an intuitive graphic method for inferring statistical difference as well as equivalence. Trivial difference occurs when a test of difference and a test of equivalence are both passed. Statistical indeterminacy results when both tests are failed. Hybrid confidence intervals are introduced that impose ICI limits on standard confidence intervals. These intervals are recommended as replacements for error bars because they facilitate inferences.
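
The four outcomes named in this abstract (difference, equivalence, trivial difference, indeterminacy) follow from combining an ordinary two-sided difference test with two one-sided tests against an equivalence margin. The sketch below shows that decision logic only. The function name, the margin `delta`, and the default critical values are assumptions: the defaults are large-sample normal quantiles for a 5% level, and with small samples the caller would pass the matching t quantiles instead. It does not compute the ICI reduction factor itself.

```python
def classify_comparison(diff, se, delta, crit_two_sided=1.96, crit_one_sided=1.645):
    """Combine a difference test with two one-sided tests (TOST).

    diff:  observed mean difference
    se:    standard error of that difference
    delta: equivalence margin (assumed symmetric, > 0)
    """
    # Difference test: is the estimate more than crit_two_sided SEs from zero?
    different = abs(diff) > crit_two_sided * se
    # TOST: both one-sided tests reject, i.e. the one-sided bounds sit inside (-delta, delta).
    equivalent = (diff - crit_one_sided * se > -delta) and (diff + crit_one_sided * se < delta)
    if different and equivalent:
        return "trivial difference"
    if different:
        return "difference"
    if equivalent:
        return "equivalence"
    return "indeterminate"
```

Returning "indeterminate" when both tests fail makes explicit that a nonsignificant difference test alone is not evidence of equivalence.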

12.

A Meta-View of Multivariate Statistical Inference Methods in European Psychology Journals

Lisa L. Harlow Elly Korendijk Ellen L. Hamaker Joop Hox Sunny R. Duerr 《Multivariate behavioral research》2013,48(5):749-774

We investigated the extent and nature of multivariate statistical inferential procedures used in eight European psychology journals covering a range of content (i.e., clinical, social, health, personality, organizational, developmental, educational, and cognitive). Multivariate methods included those found in popular texts that focused on prediction, group difference, and advanced modeling: multiple regression, logistic regression, analysis of covariance, multivariate analysis of variance, factor or principal component analysis, structural equation modeling, multilevel modeling, and other methods. Results revealed that an average of 57% of the articles from these eight journals involved multivariate analyses with a third using multiple regression, 17% using structural modeling, and the remaining methods collectively comprising about 50% of the analyses. The most frequently occurring inferential procedures involved prediction weights, dichotomous p values, figures with data, and significance tests with very few articles involving confidence intervals, statistical mediation, longitudinal analyses, power analysis, or meta-analysis. Contributions, limitations and future directions are discussed.

13.

The life of p: “Just significant” results are on the rise

《Quarterly journal of experimental psychology (2006)》2013,66(12):2303-2309

Null hypothesis significance testing uses the seemingly arbitrary probability of .05 as a means of objectively determining whether a tested effect is reliable. Within recent psychological articles, research has found an overrepresentation of p values around this cut-off. The present study examined whether this overrepresentation is a product of recent pressure to publish or whether it has existed throughout psychological research. Articles published in 1965 and 2005 from two prominent psychology journals were examined. Like previous research, the frequency of p values at and just below .05 was greater than expected compared to p frequencies in other ranges. While this overrepresentation was found for values published in both 1965 and 2005, it was much greater in 2005. Additionally, p values close to but over .05 were more likely to be rounded down to, or incorrectly reported as, significant in 2005 than in 1965. Modern statistical software and an increased pressure to publish may explain this pattern. The problem may be alleviated by reduced reliance on p values and increased reporting of confidence intervals and effect sizes.

14.

How meta-analysis increases statistical power

Cohn LD Becker BJ 《Psychological methods》2003,8(3):243-253

One of the most frequently cited reasons for conducting a meta-analysis is the increase in statistical power that it affords a reviewer. This article demonstrates that fixed-effects meta-analysis increases statistical power by reducing the standard error of the weighted average effect size (T.) and, in so doing, shrinks the confidence interval around T.. Small confidence intervals make it more likely for reviewers to detect nonzero population effects, thereby increasing statistical power. Smaller confidence intervals also represent increased precision of the estimated population effect size. Computational examples are provided for 3 effect-size indices: d (standardized mean difference), Pearson's r, and odds ratios. Random-effects meta-analyses also may show increased statistical power and a smaller standard error of the weighted average effect size. However, the authors demonstrate that increasing the number of studies in a random-effects meta-analysis does not always increase statistical power.
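
The mechanism described in this abstract can be made concrete with the standard inverse-variance formulas for a fixed-effects summary. The sketch assumes each study supplies an effect estimate and its sampling variance, and it uses a normal approximation for both the interval and the power calculation; the function names are hypothetical and the code is not taken from the article.

```python
import math
from statistics import NormalDist

def fixed_effect_summary(effects, variances, level=0.95):
    """Inverse-variance weighted average effect, its standard error, and a CI."""
    weights = [1.0 / v for v in variances]
    average = sum(w * t for w, t in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))      # shrinks as studies are added
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return average, se, (average - crit * se, average + crit * se)

def power_two_sided(true_effect, se, alpha=0.05):
    """Power of a two-sided z test of zero effect, given the standard error."""
    nd = NormalDist()
    crit = nd.inv_cdf(1 - alpha / 2)
    shift = true_effect / se
    return (1 - nd.cdf(crit - shift)) + nd.cdf(-crit - shift)
```

Because every added study increases the sum of the weights, the standard error can only fall under the fixed-effects model, and `power_two_sided` rises accordingly for any nonzero true effect. The same guarantee does not hold once a between-study variance component enters the weights, which matches the abstract's caution about random-effects models.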

15.

Bootstrap Confidence Intervals for Ordinary Least Squares Factor Loadings and Correlations in Exploratory Factor Analysis

Guangjian Zhang Kristopher J. Preacher Shanhong Luo 《Multivariate behavioral research》2013,48(1):104-134

This article is concerned with using the bootstrap to assign confidence intervals for rotated factor loadings and factor correlations in ordinary least squares exploratory factor analysis. Coverage performances of SE-based intervals, percentile intervals, bias-corrected percentile intervals, bias-corrected accelerated percentile intervals, and hybrid intervals are explored using simulation studies involving different sample sizes, perfect and imperfect models, and normal and elliptical data. The bootstrap confidence intervals are also illustrated using a personality data set of 537 Chinese men. The results suggest that the bootstrap is an effective method for assigning confidence intervals at moderately large sample sizes.

16.

RMediation: An R package for mediation analysis confidence intervals

Tofighi D MacKinnon DP 《Behavior research methods》2011,43(3):692-700

This article describes the RMediation package, which offers various methods for building confidence intervals (CIs) for mediated effects. The mediated effect is the product of two regression coefficients. The distribution-of-the-product method has the best statistical performance of existing methods for building CIs for the mediated effect. RMediation produces CIs using methods based on the distribution of product, Monte Carlo simulations, and an asymptotic normal distribution. Furthermore, RMediation generates percentiles, quantiles, and the plot of the distribution and CI for the mediated effect. An existing program, called PRODCLIN, published in Behavior Research Methods, has been widely cited and used by researchers to build accurate CIs. PRODCLIN has several limitations: The program is somewhat cumbersome to access and yields no result for several cases. RMediation described herein is based on the widely available R software, includes several capabilities not available in PRODCLIN, and provides accurate results that PRODCLIN could not.
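
Of the three approaches listed in this abstract, the Monte Carlo one is the easiest to sketch: draw each regression coefficient from a normal distribution centered on its estimate, multiply the draws, and read the interval off the percentiles of the products. The code below is a minimal illustration under the assumption that the two coefficient estimates are independent; the function name and defaults are hypothetical, and it is not the RMediation API.

```python
import random

def monte_carlo_mediation_ci(a, se_a, b, se_b, level=0.95, n_draws=20000, seed=0):
    """Percentile interval for the mediated effect a*b from simulated products.

    Assumes independent, normally distributed estimates of a and b.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(n_draws)
    )
    alpha = 1.0 - level
    lower = products[int((alpha / 2) * (n_draws - 1))]
    upper = products[int((1 - alpha / 2) * (n_draws - 1))]
    return lower, upper
```

The product of two normal variables is generally skewed, so the resulting interval is usually asymmetric around a*b, unlike an interval based on an asymptotic normal approximation.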

17.

Point-biserial correlation: Interval estimation, hypothesis testing, meta-analysis, and sample size determination

Douglas G. Bonett 《The British journal of mathematical and statistical psychology》2020,73(Z1):113-144

The point-biserial correlation is a commonly used measure of effect size in two-group designs. New estimators of point-biserial correlation are derived from different forms of a standardized mean difference. Point-biserial correlations are defined for designs with either fixed or random group sample sizes and can accommodate unequal variances. Confidence intervals and standard errors for the point-biserial correlation estimators are derived from the sampling distributions for pooled-variance and separate-variance versions of a standardized mean difference. The proposed point-biserial confidence intervals can be used to conduct directional two-sided tests, equivalence tests, directional non-equivalence tests, and non-inferiority tests. A confidence interval for an average point-biserial correlation in meta-analysis applications performs substantially better than the currently used methods. Sample size formulas for estimating a point-biserial correlation with desired precision and testing a point-biserial correlation with desired power are proposed. R functions are provided that can be used to compute the proposed confidence intervals and sample size formulas.

18.

SPSS macros to compare any two fitted values from a regression model

Bruce Weaver Sacha Dubois 《Behavior research methods》2012,44(4):1175-1190

In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests, particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
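
The matrix algebra behind such a comparison is compact: if x1 and x2 are the two design rows, d = x1 − x2, b is the coefficient vector and V its estimated covariance matrix, then the difference in fitted values is d'b with standard error sqrt(d'Vd). The sketch below assumes b and V have already been exported from a fitted model; the function name is hypothetical, the code is not the macros' implementation, and the interval uses a normal critical value as a large-sample assumption (an OLS analysis would normally use a t quantile with the residual degrees of freedom).

```python
import math
from statistics import NormalDist

def compare_fitted_values(x1, x2, coefs, cov, level=0.95):
    """Difference between two fitted values, its standard error, and a CI.

    x1, x2: design rows (including the intercept column and any derived terms)
    coefs:  estimated coefficients b
    cov:    estimated covariance matrix V of the coefficients (list of lists)
    """
    d = [a - b for a, b in zip(x1, x2)]                  # contrast vector
    diff = sum(di * bi for di, bi in zip(d, coefs))      # d'b
    k = len(d)
    var = sum(d[i] * cov[i][j] * d[j] for i in range(k) for j in range(k))  # d'Vd
    se = math.sqrt(var)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, se, (diff - crit * se, diff + crit * se)
```

For a quadratic model with columns (1, x, x²), comparing x = 3 with x = 1 means passing the rows [1, 3, 9] and [1, 1, 1]; the covariance between the linear and quadratic coefficients then enters the standard error automatically.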

19.

Exact distributions of intraclass correlation and Cronbach's alpha with Gaussian data and general covariance

Emily O. Kistner Keith E. Muller 《Psychometrika》2004,69(3):459-474

Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact results allow calculating the exact distribution function and other properties of intraclass correlation and Cronbach's alpha, for Gaussian data with any covariance pattern, not just compound symmetry. Probabilities are computed in terms of the distribution function of a weighted sum of independent chi-square random variables. New F approximations for the distribution functions of intraclass correlation and Cronbach's alpha are much simpler and faster to compute than the exact forms. Assuming the covariance matrix is known, the approximations typically provide sufficient accuracy, even with as few as ten observations. Either the exact or approximate distributions may be used to create confidence intervals around an estimate of reliability. Monte Carlo simulations led to a number of conclusions. Correctly assuming that the covariance matrix is compound symmetric leads to accurate confidence intervals, as was expected from previously known results. However, assuming and estimating a general covariance matrix produces somewhat optimistically narrow confidence intervals with 10 observations. Increasing sample size to 100 gives essentially unbiased coverage. Incorrectly assuming compound symmetry leads to pessimistically large confidence intervals, with pessimism increasing with sample size. In contrast, incorrectly assuming general covariance introduces only a modest optimistic bias in small samples. Hence the new methods seem preferable for creating confidence intervals, except when compound symmetry definitely holds.

An earlier version of this paper was submitted in partial fulfillment of the requirements for the M.S. in Biostatistics, and also summarized in a presentation at the meetings of the Eastern North American Region of the International Biometric Society in March, 2001. Kistner's work was supported in part by NIEHS training grant ES07018-24 and NCI program project grant P01 CA47 982-04. She gratefully acknowledges the inspiration of A. Calandra's “Scoring formulas and probability considerations” (Psychometrika, 6, 1–9). Muller's work was supported in part by NCI program project grant P01 CA47 982-04.

20.

Optimal sample sizes for precise interval estimation of Welch’s procedure under various allocation and cost considerations

Shieh G Jan SL 《Behavior research methods》2012,44(1):202-212

Welch’s (Biometrika 29: 350–362, 1938) procedure has emerged as a robust alternative to the Student’s t test for comparing the means of two normal populations with unknown and possibly unequal variances. To facilitate the advocated statistical practice of confidence intervals and further improve the potential applicability of Welch’s procedure, in the present article, we consider exact approaches to optimize sample size determinations for precise interval estimation of the difference between two means under various allocation and cost considerations. The desired precision of a confidence interval is assessed with respect to the control of expected half-width, and to the assurance probability of interval half-width within a designated value. Furthermore, the design schemes in terms of participant allocation and cost constraints include (a) giving the ratio of group sizes, (b) specifying one sample size, (c) attaining maximum precision performance for a fixed cost, and (d) meeting a specified precision level for the least cost. The proposed methods provide useful alternatives to the conventional sample size procedures. Also, the developed programs expand the degree of generality for the existing statistical software packages and can be accessed at brm.psychonomic-journals.org/content/supplemental.