Similar Articles
 20 similar articles found (search time: 31 ms)
1.
Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
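A minimal Python sketch of the kind of Monte Carlo check described above: it estimates the Type I error rate of the bias-corrected (not accelerated) bootstrap test of a mediated effect when the a path is medium-sized, b is zero, and the sample is small. All path values, sample sizes, and replication counts are illustrative choices, not the authors' simulation settings.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def ab_hat(x, m, y):
    """Estimate the mediated effect a*b from two simple OLS fits."""
    a = np.polyfit(x, m, 1)[0]                          # slope of M on X
    design = np.column_stack([np.ones_like(x), m, x])
    b = np.linalg.lstsq(design, y, rcond=None)[0][1]    # slope of Y on M, controlling for X
    return a * b

def bc_bootstrap_rejects(x, m, y, n_boot=1000, alpha=0.05):
    """True if the bias-corrected bootstrap CI for a*b excludes zero."""
    est, n = ab_hat(x, m, y), len(x)
    boot = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)
        boot[i] = ab_hat(x[idx], m[idx], y[idx])
    p_below = np.clip(np.mean(boot < est), 1 / n_boot, 1 - 1 / n_boot)
    z0 = norm.ppf(p_below)                              # bias-correction constant
    lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))
    hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))
    ci_lo, ci_hi = np.quantile(boot, [lo, hi])
    return not (ci_lo <= 0.0 <= ci_hi)

# Null scenario flagged above: small n, medium-sized a path, b = 0 (no mediation)
n, a_path, reps, rejections = 50, 0.39, 200, 0
for _ in range(reps):
    x = rng.normal(size=n)
    m = a_path * x + rng.normal(size=n)
    y = rng.normal(size=n)                              # b = 0, so a*b = 0 under H0
    rejections += bc_bootstrap_rejects(x, m, y)
print("empirical Type I error rate:", rejections / reps)
```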

2.
When the underlying variances are unknown and/or unequal, using the conventional F test is problematic in the two‐factor hierarchical data structure. Prompted by the approximate test statistics (Welch and Alexander–Govern methods), the authors develop four new heterogeneous test statistics to test factor A and factor B nested within A for the unbalanced fixed‐effect two‐stage nested design under variance heterogeneity. The actual significance levels and statistical power of the test statistics were compared in a simulation study. The results show that the proposed procedures maintain better Type I error rate control and have greater statistical power than those obtained by the conventional F test in various conditions. Therefore, the proposed test statistics are recommended in terms of robustness and easy implementation.
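The article's four nested-design statistics are not reproduced here, but the underlying idea, heteroscedastic alternatives to the conventional F test, can be illustrated with SciPy's Alexander–Govern test on a one-way layout with unequal variances; the group sizes and standard deviations below are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# One-way layout with equal means but very unequal variances and unbalanced group sizes:
# under H0 the classical F test can be liberal here, the heteroscedastic test less so.
groups = [rng.normal(0, sd, size=n) for sd, n in [(1, 40), (3, 15), (6, 10)]]

f_stat, f_p = stats.f_oneway(*groups)        # conventional F (assumes equal variances)
ag = stats.alexandergovern(*groups)          # Alexander-Govern heteroscedastic test
print(f"classical F:       F = {f_stat:.2f}, p = {f_p:.3f}")
print(f"Alexander-Govern:  A = {ag.statistic:.2f}, p = {ag.pvalue:.3f}")
```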

3.
The issue of the sample size necessary to ensure adequate statistical power has been the focus of considerable attention in scientific research. Conventional presentations of sample size determination do not consider budgetary and participant allocation scheme constraints, although there is some discussion in the literature. The introduction of additional allocation and cost concerns complicates study design, although the resulting procedure permits a practical treatment of sample size planning. This article presents exact techniques for optimizing sample size determinations in the context of the Welch (1938, Biometrika, 29, 350–362) test of the difference between two means under various design and cost considerations. The allocation schemes include cases in which (1) the ratio of group sizes is given and (2) one sample size is specified. The cost implications suggest optimally assigning subjects (1) to attain maximum power performance for a fixed cost and (2) to meet a designated power level for the least cost. The proposed methods provide useful alternatives to the conventional procedures and can be readily implemented with the developed R and SAS programs that are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
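A rough sketch of the "maximum power for a fixed cost" idea: a grid search over allocations combined with the noncentral-t approximation to the power of Welch's test. The per-group costs, budget, mean difference, and standard deviations are hypothetical, and this is not the exact technique or the R/SAS programs supplied with the article.

```python
import numpy as np
from scipy.stats import nct, t as t_dist

def welch_power(n1, n2, mu_diff, sd1, sd2, alpha=0.05):
    """Approximate two-tailed power of Welch's two-sample t test."""
    se2 = sd1**2 / n1 + sd2**2 / n2
    df = se2**2 / ((sd1**2 / n1)**2 / (n1 - 1) + (sd2**2 / n2)**2 / (n2 - 1))
    ncp = mu_diff / np.sqrt(se2)
    crit = t_dist.ppf(1 - alpha / 2, df)
    return nct.sf(crit, df, ncp) + nct.cdf(-crit, df, ncp)

# Hypothetical unit costs per participant and total budget (not values from the article)
c1, c2, budget = 10.0, 25.0, 2000.0
candidates = []
for n1 in range(2, int(budget // c1)):
    n2 = int((budget - c1 * n1) // c2)       # spend whatever remains on group 2
    if n2 >= 2:
        candidates.append((welch_power(n1, n2, mu_diff=0.5, sd1=1.0, sd2=2.0), n1, n2))
best_power, best_n1, best_n2 = max(candidates)
print(f"max power {best_power:.3f} at n1 = {best_n1}, n2 = {best_n2}")
```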

4.
Daubert required judges to base their decisions about the admissibility of expert witness testimony in large part on the reliability and validity of empirical observations. Because judges have a wide array of duties and may not be equipped to understand the complexities of statistical analysis, some jurists have recommended that court‐appointed experts assist judges in their gatekeeping function. To assist such experts in scrutinizing empirical papers, we propose a Structured Statistical Judgement (SSJ) that takes advantage of advances in the various statistical methods – such as effect sizes that adjust for error – which have allowed researchers to report increasingly more reliable and valid observations. We also include supplementary materials that court‐appointed experts can use both as a codebook to operationalize the SSJ and as a quick reference that will aid consultation with judges. An initial application of the SSJ examined all 93 empirical articles published in Psychology, Public Policy, and Law and Law and Human Behavior in 2015 and resulted in excellent interrater reliability (π = 0.83; π = 0.95; π = 0.97), while also indicating that a majority of the articles failed to include the comprehensive and transparent statistical analysis that would be most useful to courts.
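Assuming the π values reported above are Scott's pi (the abstract does not say which agreement coefficient was used), a small sketch of how such a coefficient is computed for two raters; the ratings below are invented for illustration.

```python
import numpy as np

def scotts_pi(r1, r2):
    """Scott's pi for two raters over nominal categories (simple sketch)."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                                  # observed agreement
    pooled = np.concatenate([r1, r2])
    cats = np.union1d(r1, r2)
    pe = sum(np.mean(pooled == c) ** 2 for c in cats)       # chance agreement, pooled margins
    return (po - pe) / (1 - pe)

rater1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
print(f"Scott's pi = {scotts_pi(rater1, rater2):.2f}")
```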

5.
Two statistics, one recent and one well known, are shown to be equivalent. The recent statistic, p_rep, gives the probability that the sign of an experimental effect is replicable by an experiment of equal power. That statistic is equivalent to the well‐known measure for the area under a receiver operating characteristic (ROC) curve for statistical power against significance level. Both statistics can be seen as exemplifying the area theorem of psychophysics.
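A numerical illustration of the stated equivalence, assuming Killeen's p_rep = Φ(z/√2) for an observed effect of z standard errors and a one-sided z test for the power-versus-alpha ROC; the observed z value is arbitrary.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

z_obs = 2.1   # observed effect in standard-error units (illustrative)

# Killeen's p_rep: probability a same-powered replication gives an effect of the same sign
p_rep = norm.cdf(z_obs / np.sqrt(2))

# Area under the ROC curve of one-sided power against significance level,
# treating the observed z as the true standardized effect
power = lambda alpha: norm.cdf(z_obs - norm.ppf(1 - alpha))
auc, _ = quad(power, 0, 1)

print(f"p_rep = {p_rep:.4f}, ROC area = {auc:.4f}")   # the two agree
```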

6.
Our goal is to provide empirical scientists with practical tools and advice with which to test hypotheses related to individual differences in intra-individual variability using the mixed-effects location-scale model. To that end, we evaluate Type I error rates and power to detect and predict individual differences in intra-individual variability using this model and provide empirically based guidelines for building scale models that include random and/or systematically varying fixed effects. We also provide two power simulation programs that allow researchers to conduct a priori empirical power analyses. Our results aligned with statistical power theory, in that greater power was observed for designs with more individuals, more repeated occasions, greater proportions of variance available to be explained, and larger effect sizes. In addition, our results indicated that Type I error rates were acceptable in situations when individual differences in intra-individual variability were not initially detectable, as well as when the scale-model individual-level predictor explained all initially detectable individual differences in intra-individual variability. We conclude our paper by providing study design and model building advice for those interested in using the mixed-effects location-scale model in practice.

7.
G*Power (Erdfelder, Faul, & Buchner, 1996) was designed as a general stand-alone power analysis program for statistical tests commonly used in social and behavioral research. G*Power 3 is a major extension of, and improvement over, the previous versions. It runs on widely used computer platforms (i.e., Windows XP, Windows Vista, and Mac OS X 10.4) and covers many different statistical tests of the t, F, and χ2 test families. In addition, it includes power analyses for z tests and some exact tests. G*Power 3 provides improved effect size calculators and graphic options, supports both distribution-based and design-based input modes, and offers all types of power analyses in which users might be interested. Like its predecessors, G*Power 3 is free.
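For readers working in Python rather than G*Power, an a priori and a post hoc power analysis for an independent-samples t test can be sketched with statsmodels; the effect size, alpha, target power, and group size below are illustrative.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# A priori: required n per group for d = 0.5, alpha = .05, power = .80
n_needed = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)

# Post hoc: achieved power for n = 30 per group at the same effect size and alpha
achieved = analysis.solve_power(effect_size=0.5, nobs1=30, alpha=0.05)

print(f"a priori n per group: {n_needed:.1f}, post hoc power: {achieved:.3f}")
```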

8.
9.
Implementing large‐scale empirical studies can be very expensive. Therefore, it is useful to optimize study designs without losing statistical power. In this paper, we show how study designs can be improved without changing statistical power by defining power equivalence, a relation between structural equation models (SEMs) that holds true if two SEMs have the same power on a likelihood ratio test to detect a given effect. We show systematic operations of SEMs that maintain power, and give an algorithm that efficiently reduces SEMs to power‐equivalent models with a minimal number of observed parameters. In this way, optimal study designs can be found without reducing statistical power. Furthermore, the algorithm can be used to drastically increase the speed of power computations when using Monte Carlo simulations or approximation methods.

10.
Researchers now know that when theoretical reliability increases, power can increase, decrease, or stay the same. However, no analytic research has examined the relationship of power to the most commonly used type of reliability (internal consistency) and the most commonly used measures of internal consistency, coefficient alpha and ICC(A,k). We examine the relationship between the power of independent samples t tests and internal consistency. We explicate the mathematical model upon which researchers usually calculate internal consistency, one in which total scores are calculated as the sum of observed scores on K measures. Using this model, we derive a new formula for effect size to show that power and internal consistency are influenced by many of the same parameters but not always in the same direction. Changing an experiment in one way (e.g., lengthening the measure) is likely to influence multiple parameters simultaneously; thus, there are no simple relationships between such changes and internal consistency or power. If researchers revise measures to increase internal consistency, this might not increase power. To increase power, researchers should increase sample size, select measures that assess areas where group differences are largest, and use more powerful statistical procedures (e.g., ANCOVA).
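A toy illustration of the point under a simple parallel-items model of my own choosing (not the article's derived effect size formula): adding items raises coefficient alpha steadily, while the standardized group difference on the total score approaches an asymptote.

```python
import numpy as np

# Parallel-items assumption: each of K items = shared true score + independent error,
# and the two groups differ by mu_diff on the shared true score.
sigma_true2, sigma_err2, mu_diff = 1.0, 4.0, 0.5

for K in (1, 2, 5, 10, 20, 50):
    alpha = K * sigma_true2 / (K * sigma_true2 + sigma_err2)   # reliability of the sum score
    d = mu_diff / np.sqrt(sigma_true2 + sigma_err2 / K)        # effect size on the sum score
    print(f"K = {K:2d}: alpha = {alpha:.2f}, d = {d:.3f}")
# alpha climbs toward 1, but d asymptotes at mu_diff / sigma_true = 0.5:
# lengthening the measure raises internal consistency without buying power indefinitely.
```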

11.
Null hypothesis significance testing (NHST) is the most widely accepted and frequently used approach to statistical inference in quantitative communication research. NHST, however, is highly controversial, and several serious problems with the approach have been identified. This paper reviews NHST and the controversy surrounding it. Commonly recognized problems include sensitivity to sample size, the fact that the null hypothesis is usually literally false, unacceptable Type II error rates, and widespread misunderstanding and abuse. Problems associated with the conditional nature of NHST and the failure to distinguish statistical hypotheses from substantive hypotheses are emphasized. Recommended solutions and alternatives are addressed in a companion article.

12.
In a classic 1978 Memory & Cognition article, Geoff Loftus explained why noncrossover interactions are removable. These removable interactions are tied to the scale of measurement for the dependent variable and therefore do not allow unambiguous conclusions about latent psychological processes. In the present article, we present concrete examples of how this insight helps prevent experimental psychologists from drawing incorrect conclusions about the effects of forgetting and aging. In addition, we extend the Loftus classification scheme for interactions to include those on the cusp between removable and nonremovable. Finally, we use various methods (i.e., a study of citation histories, a questionnaire for psychology students and faculty members, an analysis of statistical textbooks, and a review of articles published in the 2008 issue of Psychology and Aging) to show that experimental psychologists have remained generally unaware of the concept of removable interactions. We conclude that there is more to interactions in a 2 × 2 design than meets the eye.
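A minimal numeric example of a removable interaction, with invented 2 × 2 cell means: the difference-of-differences contrast is nonzero on the raw scale but vanishes after a monotone (log) transformation of the dependent variable.

```python
import numpy as np

# Hypothetical 2 x 2 cell means (e.g., response times): rows = age group, cols = task difficulty
means = np.array([[500.0, 600.0],     # young: easy, hard
                  [750.0, 900.0]])    # old:   easy, hard

def interaction(m):
    # difference-of-differences contrast for a 2 x 2 design
    return (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])

print("raw-scale interaction:", interaction(means))          # 50.0 -> apparent interaction
print("log-scale interaction:", interaction(np.log(means)))  # ~0   -> interaction removed
```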

13.
We derive the statistical power functions in multi‐site randomized trials with multiple treatments at each site, using multi‐level modelling. An F statistic is used to test multiple parameters in the multi‐level model instead of the Wald chi square test as suggested in the current literature. The F statistic is shown to be more conservative than the Wald statistic in testing any overall treatment effect among the multiple study conditions. In addition, we improvise an easy way to estimate the non‐centrality parameters for the means comparison t‐tests and the F test, using Helmert contrast coding in the multi‐level model. The variance of treatment means, which is difficult to fathom but necessary for power analysis, is decomposed into intuitive simple effect sizes in the contrast tests. The method is exemplified by a multi‐site evaluation study of the behavioural interventions for cannabis dependence.

14.
Technological developments increasingly permit the collection of longitudinal data sets in which the data structure contains a large number of participants N and a large number of measurement occasions T. Promising new dynamical systems approaches to the analysis of large N, large T data sets have been proposed that utilize both between-subjects and within-subjects information. The COGITO project, begun over a decade ago, is an early large N = 204, large T = 100 study that collected high-quality cognitive and psychosocial data. In this introduction, I describe the COGITO project and conceptual and statistical issues that arise in the analysis of large N, large T data sets. I provide a brief overview of the five papers in the special section which include conceptual pieces, a didactic presentation of a dynamic structural equation approach, and papers reporting new statistical analyses of the COGITO data set to answer substantive questions. Although many challenges remain, these new approaches offer the promise of improving scientific inquiry in the behavioral sciences.

15.
We identify potential problems in the statistical analysis of social cognition model data, with special emphasis on the theories of reasoned action (TRA) and planned behaviour (TPB). Some statistical guidelines are presented for empirical studies of the TRA and the TPB based upon multiple linear regression and structural equation modelling (SEM). If the model is tested using multiple regression, the assumptions of this technique must be considered and variables transformed if necessary. Adjusted R2 (not R2) should be used as a measure of explained variance and semipartial correlations are useful in assessing each component's unique contribution to explained variance. R2 is not an indicator of model adequacy and residuals should be examined. Expectancy-value variables that are the product of expectancy and value measures represent the interaction term in a multiple regression and should not be used. SEM approaches make explicit the assumptions of unidimensionality of constructs in the TRA/TPB, assumptions that might usefully be challenged by competing models with multidimensional constructs. Finally, statistical power and sample size should be considered for both approaches. Inattention to any of these aspects of analysis threatens the validity of TRA/TPB research.
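A short sketch of two of the recommendations above, adjusted R2 and semipartial correlations, using ordinary least squares in statsmodels on simulated data; the variable names and effect sizes are invented stand-ins for TRA/TPB constructs.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
attitude = rng.normal(size=n)
subj_norm = rng.normal(size=n)     # hypothetical "subjective norm" predictor
intention = 0.5 * attitude + 0.3 * subj_norm + rng.normal(size=n)

X = sm.add_constant(np.column_stack([attitude, subj_norm]))
full = sm.OLS(intention, X).fit()
print("R2 =", round(full.rsquared, 3), " adjusted R2 =", round(full.rsquared_adj, 3))

# Semipartial (part) correlation of each predictor = signed sqrt of its unique increment to R2
for j, name in enumerate(["attitude", "subjective norm"], start=1):
    reduced = sm.OLS(intention, np.delete(X, j, axis=1)).fit()
    sr = np.sqrt(full.rsquared - reduced.rsquared) * np.sign(full.params[j])
    print(f"semipartial r for {name}: {sr:.3f}")
```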

16.
According to Wollack and Schoenig (2018, The Sage encyclopedia of educational research, measurement, and evaluation. Thousand Oaks, CA: Sage, 260), benefiting from item preknowledge is one of the three broad types of test fraud that occur in educational assessments. We use tools from constrained statistical inference to suggest a new statistic that is based on item scores and response times and can be used to detect examinees who may have benefited from item preknowledge for the case when the set of compromised items is known. The asymptotic distribution of the new statistic under no preknowledge is proved to be a simple mixture of two χ2 distributions. We perform a detailed simulation study to show that the Type I error rate of the new statistic is very close to the nominal level and that the power of the new statistic is satisfactory in comparison to that of the existing statistics for detecting item preknowledge based on both item scores and response times. We also include a real data example to demonstrate the usefulness of the suggested statistic.

17.
Research problems that require a non‐parametric analysis of multifactor designs with repeated measures arise in the behavioural sciences. There is, however, a lack of available procedures in commonly used statistical packages. In the present study, a generalization of the aligned rank test for the two‐way interaction is proposed for the analysis of the typical sources of variation in a three‐way analysis of variance (ANOVA) with repeated measures. It can be implemented in the usual statistical packages. Its statistical properties are tested by using simulation methods with two sample sizes (n = 30 and n = 10) and three distributions (normal, exponential and double exponential). Results indicate substantial increases in power for non‐normal distributions in comparison with the usual parametric tests. Similar levels of Type I error for both parametric and aligned rank ANOVA were obtained with non‐normal distributions and large sample sizes. Degrees‐of‐freedom adjustments for Type I error control in small samples are proposed. The procedure is applied to a case study with 30 participants per group where it detects gender differences in linguistic abilities in blind children not shown previously by other methods.

18.
Although numerous computer programs for statistical power analysis are available, power is an under-used aspect of experimental analysis, perhaps because of the perceived difficulty of performing the necessary calculations or because existing computer software can be expensive or complicated to learn. For single-degree-of-freedom tests, however, it is possible to calculate power in a straightforward manner using the t distribution. Because these calculations are based on t, they use easily understood and readily available quantities. These calculations can be performed with a desk calculator; we also present a simple-to-use program called MorePower that will perform the necessary calculations. The straightforward nature of the calculations potentially will enable more researchers to consider issues of power when planning and reporting their experiments.

19.
The extent to which rank transformations result in the same statistical decisions as their non‐parametric counterparts is investigated. Simulations are presented using the Wilcoxon–Mann–Whitney test, the Wilcoxon signed‐rank test and the Kruskal–Wallis test, together with the rank transformations and t and F tests corresponding to each of those non‐parametric methods. In addition to Type I errors and power over all simulations, the study examines the consistency of the outcomes of the two methods on each individual sample. The results show how acceptance or rejection of the null hypothesis and differences in p‐values of the test statistics depend in a regular and predictable way on sample size, significance level, and differences between means, for normal and various non‐normal distributions.
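A small demonstration of the comparison above: the Wilcoxon–Mann–Whitney test versus an ordinary t test applied to the pooled ranks of the same two samples. The sample sizes and distributions are arbitrary, chosen only to show that the two p-values are typically close but not identical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
x = rng.exponential(scale=1.0, size=25)
y = rng.exponential(scale=1.5, size=25)

# Non-parametric test
u_stat, p_mw = stats.mannwhitneyu(x, y, alternative="two-sided")

# Rank-transform counterpart: pool, rank, then run an ordinary t test on the ranks
ranks = stats.rankdata(np.concatenate([x, y]))
t_stat, p_rt = stats.ttest_ind(ranks[:25], ranks[25:])

print(f"Wilcoxon-Mann-Whitney p = {p_mw:.4f}")
print(f"t test on ranks       p = {p_rt:.4f}")
```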

20.
The purpose of this study was to examine the quality of quantitative articles published in the Journal of Counseling & Development. Quality concerns arose in regard to omissions of psychometric information of instruments, effect sizes, and statistical power. Type VI and II errors were found. Strengths included stated research questions and appropriateness of analyses. Implications of these results are provided.
