1.

Inference and Interval Estimation for Indirect Effects With Latent Variable Models

Carl F. Falk Jeremy C. Biesanz 《Multivariate behavioral research》2013,48(6)

Models specifying indirect effects (or mediation) and structural equation modeling are both popular in the social sciences. Yet relatively little research has compared methods that test for indirect effects among latent variables and provided precise estimates of the effectiveness of different methods.

This simulation study provides an extensive comparison of methods for constructing confidence intervals and for making inferences about indirect effects with latent variables. We compared the percentile (PC) bootstrap, bias-corrected (BC) bootstrap, bias-corrected accelerated (BC_a) bootstrap, likelihood-based confidence intervals (Neale & Miller, 1997), partial posterior predictive (Biesanz, Falk, and Savalei, 2010), and joint significance tests based on Wald tests or likelihood ratio tests. All models included three reflective latent variables representing the independent, dependent, and mediating variables. The design included the following fully crossed conditions: (a) sample size: 100, 200, and 500; (b) number of indicators per latent variable: 3 versus 5; (c) reliability per set of indicators: .7 versus .9; and (d) 16 different path combinations for the indirect effect (α = 0, .14, .39, or .59; and β = 0, .14, .39, or .59). Simulations were performed using a WestGrid cluster of 1680 3.06GHz Intel Xeon processors running R and OpenMx.

Results based on 1,000 replications per cell and 2,000 resamples per bootstrap method indicated that the BC and BC_a bootstrap methods have inflated Type I error rates. Likelihood-based confidence intervals and the PC bootstrap emerged as methods that adequately control Type I error and have good coverage rates.
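As a rough illustration of the percentile (PC) bootstrap named above, the following sketch resamples cases and takes percentiles of the resampled indirect effect a*b. It is a simplification under stated assumptions: it uses observed variables and ordinary least squares rather than the latent variable models fitted in the study, and the function names (`indirect_effect`, `percentile_bootstrap_ci`) are hypothetical.

```python
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def indirect_effect(x, m, y):
    """Return a*b, with a from m ~ x and b from y ~ m + x (least squares)."""
    sxx, smm, sxm = _cov(x, x), _cov(m, m), _cov(x, m)
    sxy, smy = _cov(x, y), _cov(m, y)
    a = sxm / sxx
    # Partial slope of m in the two-predictor regression of y on m and x.
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b


def percentile_bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for a*b (assumed observed-variable setup)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    lower_index = max(0, round(tail * n_boot))
    upper_index = min(n_boot - 1, round((1 - tail) * n_boot) - 1)
    return estimates[lower_index], estimates[upper_index]
```

An interval that excludes zero would be read as evidence of an indirect effect at the chosen level. The BC and BC_a variants adjust which percentiles are taken and are not shown here.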

2.

The Performance of Methods to Test Upper-Level Mediation in the Presence of Nonnormal Data

Keenan A. Pituch Laura M. Stapleton 《Multivariate behavioral research》2013,48(2):237-267

A Monte Carlo study compared the statistical performance of standard and robust multilevel mediation analysis methods to test indirect effects for a cluster randomized experimental design under various departures from normality. The performance of these methods was examined for an upper-level mediation process, where the indirect effect is a fixed effect and a group-implemented treatment is hypothesized to impact a person-level outcome via a person-level mediator. Two methods, the bias-corrected parametric percentile bootstrap and the empirical-M test, had the best overall performance. Methods designed for nonnormal score distributions exhibited elevated Type I error rates and poorer confidence interval coverage under some conditions. Although preliminary, the findings suggest that new mediation analysis methods may provide for robust tests of indirect effects.

3.

Four applications of permutation methods to testing a single-mediator model

Taylor AB MacKinnon DP 《Behavior research methods》2012,44(3):806-844

Four applications of permutation tests to the single-mediator model are described and evaluated in this study. Permutation tests work by rearranging data in many possible ways in order to estimate the sampling distribution for the test statistic. The four applications to mediation evaluated here are the permutation test of ab, the permutation joint significance test, and the noniterative and iterative permutation confidence intervals for ab. A Monte Carlo simulation study was used to compare these four tests with the four best available tests for mediation found in previous research: the joint significance test, the distribution of the product test, and the percentile and bias-corrected bootstrap tests. We compared the different methods on Type I error, power, and confidence interval coverage. The noniterative permutation confidence interval for ab was the best performer among the new methods. It successfully controlled Type I error, had power nearly as good as the most powerful existing methods, and had better coverage than any existing method. The iterative permutation confidence interval for ab had lower power than do some existing methods, but it performed better than any other method in terms of coverage. The permutation confidence interval methods are recommended when estimating a confidence interval is a primary concern. SPSS and SAS macros that estimate these confidence intervals are provided.
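The general idea of rearranging data to approximate a sampling distribution can be sketched for a single path. The example below is not one of the four procedures evaluated in the study: it is a plain permutation test of one regression slope (for instance, the a path from X to M), and the function names are hypothetical.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=1):
    """Two-sided permutation p value for the slope of y on x."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any x-y association
        if abs(slope(x, shuffled)) >= observed:
            extreme += 1
    # Count the observed arrangement itself so the p value is never zero.
    return (extreme + 1) / (n_perm + 1)
```

Random shuffles stand in for the full set of rearrangements, which is usually too large to enumerate.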

4.

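Point and Interval Estimation of Mediation Effects: Distribution of the Product, Nonparametric Bootstrap, and MCMC Methods (Total citations: 3; self-citations: 0; citations by others: 3)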

Fang Jie (方杰) Zhang Minqiang (张敏强) 《Acta Psychologica Sinica (心理学报)》2012,44(10):1408-1420

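Because the sampling distribution of the mediation effect ab is often not normal, researchers have in recent years proposed three classes of methods that place no restrictions on the sampling distribution of ab and are suitable for small and medium samples: the distribution of the product method, the nonparametric bootstrap, and Markov chain Monte Carlo (MCMC) methods. A simulation study compared the performance of these three classes of methods in mediation analysis. The results showed that: 1) the MCMC method with prior information gave the most accurate point estimate of ab; 2) the MCMC method with prior information had the highest statistical power, but at the cost of underestimating the Type I error rate, while the bias-corrected nonparametric percentile bootstrap had the next-highest power, but at the cost of overestimating the Type I error rate; 3) the MCMC method with prior information gave the most accurate interval estimate of the mediation effect. The results suggest that the MCMC method with prior information is recommended when prior information is available, and the bias-corrected nonparametric percentile bootstrap is recommended when it is not.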

5.

THE ALPHA PERCENTAGE AND EXPERIMENTWISE ERROR RATES IN COMMUNICATION RESEARCH

THOMAS M. STEINFATT 《Human Communication Research》1979,5(4):366-374

Experimentwise error rates of the type proposed by Ryan (1959) are discussed and contrasted with a new measure of the likelihood that the results of a series of significance tests are Type I errors. This new measure, the Alpha Percentage (a%), shares the advantages of experimentwise error rates over individual alpha levels in reducing Type I errors in communication research, but the Alpha Percentage has much greater power than currently used experimentwise error rates to detect significant effects. Four arguments against the use of experimentwise error procedures are discussed and EW, EP, and a% rates are reported for Communication Monographs and Human Communication Research.
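To see why experimentwise rates matter, the short sketch below computes two textbook quantities for k independent tests run at level α: the probability of at least one Type I error, and the expected number of Type I errors. The mapping of these to the EW and EP rates in the abstract is an assumption, and the Alpha Percentage itself is not reproduced because the abstract does not define it. Function names are hypothetical.

```python
def experimentwise_rate(alpha, k):
    """P(at least one Type I error) across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


def per_experiment_rate(alpha, k):
    """Expected number of Type I errors across k tests at level alpha."""
    return alpha * k


# Ten independent tests at alpha = .05
ew = experimentwise_rate(0.05, 10)   # about 0.401
ep = per_experiment_rate(0.05, 10)   # about 0.5
```

Even a modest series of tests pushes the chance of at least one false rejection well above the nominal .05 level.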

6.

Testing for negligible interaction: A coherent and robust approach

Robert A. Cribbie Chantal Ragoonanan Alyssa Counsell 《The British journal of mathematical and statistical psychology》2016,69(2):159-174

Researchers often want to demonstrate a lack of interaction between two categorical predictors on an outcome. To justify a lack of interaction, researchers typically accept the null hypothesis of no interaction from a conventional analysis of variance (ANOVA). This method is inappropriate as failure to reject the null hypothesis does not provide statistical evidence to support a lack of interaction. This study proposes a bootstrap‐based intersection–union test for negligible interaction that provides coherent decisions between the omnibus test and post hoc interaction contrast tests and is robust to violations of the normality and variance homogeneity assumptions. Further, a multiple comparison strategy for testing interaction contrasts following a non‐significant omnibus test is proposed. Our simulation study compared the Type I error control, omnibus power and per‐contrast power of the proposed approach to the non‐centrality‐based negligible interaction test of Cheng and Shao (2007, Statistica Sinica, 17, 1441). For 2 × 2 designs, the empirical Type I error rates of the Cheng and Shao test were very close to the nominal α level when the normality and variance homogeneity assumptions were satisfied; however, only our proposed bootstrapping approach was satisfactory under non‐normality and/or variance heterogeneity. In general a × b designs, although the omnibus Cheng and Shao test, as expected, is the most powerful, it is not robust to assumption violation and results in incoherent omnibus and interaction contrast decisions that are not possible with the intersection–union approach.

7.

Efficiently measuring recognition performance with sparse data

Schooler LJ Shiffrin RM 《Behavior research methods》2005,37(1):3-10

We examine methods for measuring performance in signal-detection-like tasks when each participant provides only a few observations. Monte Carlo simulations demonstrate that standard statistical techniques applied to a d′ analysis can lead to large numbers of Type I errors (incorrectly rejecting a hypothesis of no difference). Various statistical methods were compared in terms of their Type I and Type II error (incorrectly accepting a hypothesis of no difference) rates. Our conclusions are the same whether these two types of errors are weighted equally or Type I errors are weighted more heavily. The most promising method is to combine an aggregate d′ measure with a percentile bootstrap confidence interval, a computer-intensive nonparametric method of statistical inference. Researchers who prefer statistical techniques more commonly used in psychology, such as a repeated measures t test, should use γ (Goodman & Kruskal, 1954), since it performs slightly better than or nearly as well as d′. In general, when repeated measures t tests are used, γ is more conservative than d′: It makes more Type II errors, but its Type I error rate tends to be much closer to that of the traditional .05 α level. It is somewhat surprising that γ performs as well as it does, given that the simulations that generated the hypothetical data conformed completely to the d′ model. Analyses in which H − FA was used had the highest Type I error rates. Detailed simulation results can be downloaded from www.psychonomic.org/archive/Schooler-BRM-2004.zip.
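For readers unfamiliar with d′, the sketch below uses the standard definition d′ = z(H) − z(FA), where H is the hit rate and FA the false-alarm rate. The pooling step is an assumption about what an aggregate measure could look like (summing counts across participants before computing d′); the study's exact aggregation scheme is not given in the abstract. Function names are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate).

    Raises statistics.StatisticsError when either rate is exactly 0 or 1,
    which is common when a participant contributes only a few trials.
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def aggregate_d_prime(participants):
    """Pool (hits, misses, false_alarms, correct_rejections) counts, then compute d'."""
    totals = [sum(counts) for counts in zip(*participants)]
    return d_prime(*totals)
```

Pooling sidesteps the undefined z scores that arise when a single participant has a perfect hit rate or no false alarms, which is one reason sparse data are awkward for per-participant d′.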

8.

A Comparison of Three Methods for Multilevel Mediation Analysis

Fang Jie (方杰) Wen Zhonglin (温忠麟) 《Journal of Psychological Science (心理科学)》2018,(4):962-967

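This study compared the performance of the Bayesian method, the Monte Carlo method, and the parametric bootstrap in 2-1-1 multilevel mediation analysis. The results showed that: 1) the Bayesian method with prior information gave the most accurate point and interval estimates of the mediation effect; 2) the Bayesian method without prior information, the Monte Carlo method, and the bias-corrected and uncorrected parametric bootstrap methods performed comparably on point and interval estimation of the mediation effect, but the Monte Carlo method was slightly better than the other three on Type I error rate and interval width, while the bias-corrected bootstrap was slightly better than the other three on statistical power but worst on Type I error rate. The results suggest that the Bayesian method is recommended when prior information is available, and the Monte Carlo method is recommended when it is not.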
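A minimal sketch of a Monte Carlo interval for an indirect effect, in the commonly described form: draw each path coefficient from a normal distribution centered on its estimate, multiply the draws, and take percentiles of the products. It assumes the two estimates are independent and that their standard errors are already available from a fitted model; the multilevel estimation itself is not shown, and the function name is hypothetical.

```python
import random


def monte_carlo_ci(a_hat, se_a, b_hat, se_b, n_draws=20000, level=0.95, seed=1):
    """Monte Carlo interval for a*b from estimates and standard errors.

    Assumes a_hat and b_hat are independent and approximately normal.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a_hat, se_a) * rng.gauss(b_hat, se_b) for _ in range(n_draws)
    )
    tail = (1 - level) / 2
    lower_index = max(0, round(tail * n_draws))
    upper_index = min(n_draws - 1, round((1 - tail) * n_draws) - 1)
    return products[lower_index], products[upper_index]
```

Unlike the bootstrap, this needs only the estimates and their standard errors, not the raw data, which makes it cheap to apply after a multilevel model has been fitted.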

9.

A Comparison of Parametric and Nonparametric Bootstrap Methods for Simple Mediation Analysis

Fang Jie (方杰) Zhang Minqiang (张敏强) 《Journal of Psychological Science (心理科学)》2013,36(3):722-727

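Data simulation was used to compare (bias-corrected and uncorrected) parametric and nonparametric bootstrap methods in simple mediation analysis. The results showed that: 1) the bias-corrected bootstrap performed better overall than the uncorrected bootstrap, but under some conditions it overestimated the Type I error rate, leading to relatively large confidence interval bias when […]; 2) the parametric bootstrap outperformed the nonparametric bootstrap, and the bias-corrected parametric percentile residual bootstrap had the best overall performance, with the advantages of wide applicability and low dependence on the original sample, making it the most practical.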

10.

Properties of bootstrap tests for N‐of‐1 studies

Sharon X. Lin Leanne Morrison Peter W. F. Smith Charlie Hargood Mark Weal Lucy Yardley 《The British journal of mathematical and statistical psychology》2016,69(3):276-290

N‐of‐1 study designs involve the collection and analysis of repeated measures data from an individual not using an intervention and using an intervention. This study explores the use of semi‐parametric and parametric bootstrap tests in the analysis of N‐of‐1 studies under a single time series framework in the presence of autocorrelation. When the Type I error rates of bootstrap tests are compared to Wald tests, our results show that the bootstrap tests have more desirable properties. We compare the results for normally distributed errors with those for contaminated normally distributed errors and find that, except when there is relatively large autocorrelation, there is little difference between the power of the parametric and semi‐parametric bootstrap tests. We also experiment with two intervention designs: ABAB and AB, and show the ABAB design has more power. The results provide guidelines for designing N‐of‐1 studies, in the sense of how many observations and how many intervention changes are needed to achieve a certain level of power and which test should be performed.

11.

Testing overall and moderator effects in random effects meta‐regression

Hilde M. Huizenga Ingmar Visser Conor V. Dolan 《The British journal of mathematical and statistical psychology》2011,64(1):1-19

Random effects meta‐regression is a technique to synthesize results of multiple studies. It allows for a test of an overall effect, as well as for tests of effects of study characteristics, that is, (discrete or continuous) moderator effects. We describe various procedures to test moderator effects: the z, t, likelihood ratio (LR), Bartlett‐corrected LR (BcLR), and resampling tests. We compare the Type I error of these tests, and conclude that the common z test, and to a lesser extent the LR test, do not perform well since they may yield Type I error rates appreciably larger than the chosen alpha. The error rate of the resampling test is accurate, closely followed by the BcLR test. The error rate of the t test is less accurate but arguably tolerable. With respect to statistical power, the BcLR and t tests slightly outperform the resampling test. Therefore, our recommendation is to use either the resampling or the BcLR test. If these statistics are unavailable, then the t test should be used since it is certainly superior to the z test.

12.

Statistical properties of four effect-size measures for mediation models

Milica Miočević Holly P. O’Rourke David P. MacKinnon Hendricks C. Brown 《Behavior research methods》2018,50(1):285-301

This project examined the performance of classical and Bayesian estimators of four effect size measures for the indirect effect in a single-mediator model and a two-mediator model. Compared to the proportion and ratio mediation effect sizes, standardized mediation effect-size measures were relatively unbiased and efficient in the single-mediator model and the two-mediator model. Percentile and bias-corrected bootstrap interval estimates of ab/s_Y and ab(s_X)/s_Y in the single-mediator model outperformed interval estimates of the proportion and ratio effect sizes in terms of power, Type I error rate, coverage, imbalance, and interval width. For the two-mediator model, standardized effect-size measures were superior to the proportion and ratio effect-size measures. Furthermore, it was found that Bayesian point and interval summaries of posterior distributions of standardized effect-size measures reduced excessive relative bias for certain parameter combinations. The standardized effect-size measures are the best effect-size measures for quantifying mediated effects.

13.

Type I errors and power of the parametric bootstrap goodness‐of‐fit test: Full and limited information

《The British journal of mathematical and statistical psychology》2003,56(2):271-288

In sparse tables for categorical data well‐known goodness‐of‐fit statistics are not chi‐square distributed. A consequence is that model selection becomes a problem. It has been suggested that a way out of this problem is the use of the parametric bootstrap. In this paper, the parametric bootstrap goodness‐of‐fit test is studied by means of an extensive simulation study; the Type I error rates and power of this test are studied under several conditions of sparseness. In the presence of sparseness, models were used that were likely to violate the regularity conditions. Besides bootstrapping the goodness‐of‐fit statistics usually used (full information statistics), corrected versions of these statistics and a limited information statistic are bootstrapped. These bootstrap tests were also compared to an asymptotic test using limited information. Results indicate that bootstrapping the usual statistics fails because these tests are too liberal, and that bootstrapping or asymptotically testing the limited information statistic works better with respect to Type I error and outperforms the other statistics by far in terms of statistical power. The properties of all tests are illustrated using categorical Markov models.

14.

Evaluation of global testing procedures for item fit to the Rasch model

《The British journal of mathematical and statistical psychology》2003,56(1):127-143

Two types of global testing procedures for item fit to the Rasch model were evaluated using simulation studies. The first type incorporates three tests based on first‐order statistics: van den Wollenberg's Q₁ test, Glas's R₁ test, and Andersen's LR test. The second type incorporates three tests based on second‐order statistics: van den Wollenberg's Q₂ test, Glas's R₂ test, and a non‐parametric test proposed by Ponocny. The Type I error rates and the power against the violation of parallel item response curves, unidimensionality and local independence were analysed in relation to sample size and test length. In general, the outcomes indicate a satisfactory performance of all tests, except the Q₂ test which exhibits an inflated Type I error rate. Further, it was found that both types of tests have power against all three types of model violation. A possible explanation is the interdependencies among the assumptions underlying the model.

15.

Heterogeneous heterogeneity by default: Testing categorical moderators in mixed-effects meta-analysis

Josue E. Rodriguez Donald R. Williams Paul-Christian Bürkner 《The British journal of mathematical and statistical psychology》2023,76(2):402-433

Categorical moderators are often included in mixed-effects meta-analysis to explain heterogeneity in effect sizes. An assumption in tests of categorical moderator effects is that of a constant between-study variance across all levels of the moderator. Although it rarely receives serious thought, there can be statistical ramifications to upholding this assumption. We propose that researchers should instead default to assuming unequal between-study variances when analysing categorical moderators. To achieve this, we suggest using a mixed-effects location-scale model (MELSM) to allow group-specific estimates for the between-study variance. In two extensive simulation studies, we show that in terms of Type I error and statistical power, little is lost by using the MELSM for moderator tests, but there can be serious costs when an equal variance mixed-effects model (MEM) is used. Most notably, in scenarios with balanced sample sizes or equal between-study variance, the Type I error and power rates are nearly identical between the MEM and the MELSM. On the other hand, with imbalanced sample sizes and unequal variances, the Type I error rate under the MEM can be grossly inflated or overly conservative, whereas the MELSM does comparatively well in controlling the Type I error across the majority of cases. A notable exception where the MELSM did not clearly outperform the MEM was in the case of few studies (e.g., 5). With respect to power, the MELSM had similar or higher power than the MEM in conditions where the latter produced non-inflated Type I error rates. Together, our results support the idea that assuming unequal between-study variances is preferred as a default strategy when testing categorical moderators.

16.

Testing for independence between pairs of autocorrelated binomial data sequences

David G. Schlundt Clyde P. Donahoe Jr. 《Journal of psychopathology and behavioral assessment》1983,5(4):309-316

A problem arises in analyzing the existence of interdependence between the behavioral sequences of two individuals: tests involving a statistic such as chi-square assume independent observations within each behavioral sequence, a condition which may not exist in actual practice. Using Monte Carlo simulations of binomial data sequences, we found that the use of a chi-square test frequently results in unacceptable Type I error rates when the data sequences are autocorrelated. We compared these results to those from two other methods designed specifically for testing for intersequence independence in the presence of intrasequence autocorrelation. The first method directly tests the intersequence correlation using an approximation of the variance of the intersequence correlation estimated from the sample autocorrelations. The second method uses tables of critical values of the intersequence correlation computed by Nakamura et al. (J. Am. Stat. Assoc., 1976, 71, 214–222). Although these methods were originally designed for normally distributed data, we found that both methods produced much better results than the uncorrected chi-square test when applied to binomial autocorrelated sequences. The superior method appears to be the variance approximation method, which resulted in Type I error rates that were generally less than or equal to 5% when the level of significance was set at .05.

17.

Confidence Intervals for the Probability of Superiority Effect Size Measure and the Area Under a Receiver Operating Characteristic Curve

John Ruscio Tara Mullen 《Multivariate behavioral research》2013,48(2):201-223

It is good scientific practice to report an appropriate estimate of effect size and a confidence interval (CI) to indicate the precision with which a population effect was estimated. For comparisons of 2 independent groups, a probability-based effect size estimator (A) that is equal to the area under a receiver operating characteristic curve and closely related to the popular Wilcoxon-Mann-Whitney nonparametric statistical tests has many appealing properties (e.g., easy to understand, robust to violations of parametric assumptions, insensitive to outliers). We performed a simulation study to compare 9 analytic and 3 empirical (bootstrap) methods for constructing a CI for A that can yield very different CIs for the same data. The experimental design crossed 6 factors to yield a total of 324 cells representing challenging but realistic data conditions. Results were examined using several criteria, with emphasis placed on the extent to which observed CI coverage probabilities approximated nominal levels. Based on the simulation study results, the bias-corrected and accelerated bootstrap method is recommended for constructing a CI for the A statistic; bootstrap methods also provided the least biased and most accurate standard error of A. An empirical illustration examining score differences on a citation-based index of scholarly impact across faculty at low-ranked versus high-ranked research universities underscores the importance of choosing an appropriate CI method.
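The A statistic can be computed directly from its usual definition: the proportion of cross-group pairs in which the first group's score is higher, with ties counted as half. The sketch below shows only the point estimate. The confidence interval methods compared in the study, including the recommended bias-corrected and accelerated bootstrap, are not implemented here, and the function name is hypothetical.

```python
def probability_of_superiority(group1, group2):
    """A = P(X > Y) + 0.5 * P(X = Y) over all cross-group pairs."""
    wins = sum(1 for x in group1 for y in group2 if x > y)
    ties = sum(1 for x in group1 for y in group2 if x == y)
    return (wins + 0.5 * ties) / (len(group1) * len(group2))


# Identical groups give A = 0.5; complete separation gives A = 1.0.
a_equal = probability_of_superiority([1, 2, 3], [1, 2, 3])
a_separated = probability_of_superiority([4, 5], [1, 2])
```

Because A depends only on the ordering of scores, a single extreme value changes it no more than any other value on the same side of the comparison.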

18.

Statistical simulation on microcomputers

Drake R. Bradley Michael W. Senko Foster A. Stewart 《Behavior research methods》1990,22(2):236-246

The statistical simulation program DATASIM is designed to conduct large-scale sampling experiments on microcomputers. Monte Carlo procedures are used to investigate the Type I and Type II error rates for statistical tests when one or more assumptions are systematically violated: assumptions, for example, regarding normality, homogeneity of variance or covariance, minimum expected cell frequencies, and the like. In the present paper, we report several initial tests of the data-generating algorithms employed by DATASIM. The results indicate that the uniform and standard normal deviate generators perform satisfactorily. Furthermore, Kolmogorov-Smirnov tests show that the sampling distributions of z, t, F, χ², and r generated by DATASIM simulations follow the appropriate theoretical distributions. Finally, estimates of Type I error rates obtained by DATASIM under various patterns of violations of assumptions are in close agreement with the results of previous analytical and empirical studies. These converging lines of evidence suggest that DATASIM may well prove to be a reliable and productive tool for conducting statistical simulation research.
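The kind of sampling experiment described here can be sketched in a few lines. This is not DATASIM's algorithm: it is a simplified illustration that estimates the Type I error rate of a two-sided one-sample z test (known standard deviation, true mean of zero), once with normal data and once with skewed data. All names and settings are hypothetical.

```python
import random
from statistics import NormalDist


def type1_error_rate(draw, sigma, n=20, reps=5000, alpha=0.05, seed=1):
    """Share of replications in which a two-sided z test rejects a true null.

    draw(rng) must return one observation from a population with mean 0
    and standard deviation sigma.
    """
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample_mean = sum(draw(rng) for _ in range(n)) / n
        z = sample_mean / (sigma / n ** 0.5)
        if abs(z) > critical:
            rejections += 1
    return rejections / reps


# Normality holds: the estimate should sit near the nominal .05.
normal_rate = type1_error_rate(lambda r: r.gauss(0, 1), sigma=1.0)
# Normality violated: exponential data shifted to have mean 0 and SD 1.
skewed_rate = type1_error_rate(lambda r: r.expovariate(1.0) - 1.0, sigma=1.0)
```

Comparing the two estimated rates against the nominal α is the basic logic behind studies of robustness to assumption violations.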

19.

Testing hypotheses involving Cronbach's alpha using marginal models

Renske E. Kuijpers L. Andries van der Ark Marcel A. Croon 《The British journal of mathematical and statistical psychology》2013,66(3):503-520

We discuss the statistical testing of three relevant hypotheses involving Cronbach's alpha: one where alpha equals a particular criterion; a second testing the equality of two alpha coefficients for independent samples; and a third testing the equality of two alpha coefficients for dependent samples. For each of these hypotheses, various statistical tests have been proposed. Over the years, these tests have depended on progressively fewer assumptions. We propose a new approach to testing the three hypotheses that relies on even fewer assumptions, is especially suited for discrete item scores, and can be applied easily to tests containing large numbers of items. The new approach uses marginal modelling. We compared the Type I error rate and the power of the marginal modelling approach to several of the available tests in a simulation study using realistic conditions. We found that the marginal modelling approach had the most accurate Type I error rates, whereas the power was similar across the statistical tests.
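For reference, the coefficient under test can be computed from its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The sketch below covers only this point estimate. The marginal modelling tests proposed in the paper are not shown, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Three respondents, three perfectly consistent items: alpha is 1 (up to rounding).
alpha_perfect = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

The input layout (rows are respondents, columns are items) is an assumption of this sketch.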

20.

Assessing Mediational Models: Testing and Interval Estimation for Indirect Effects

Jeremy C. Biesanz Carl F. Falk Victoria Savalei 《Multivariate behavioral research》2013,48(4):661-701

Theoretical models specifying indirect or mediated effects are common in the social sciences. An indirect effect exists when an independent variable's influence on the dependent variable is mediated through an intervening variable. Classic approaches to assessing such mediational hypotheses (Baron & Kenny, 1986; Sobel, 1982) have in recent years been supplemented by computationally intensive methods such as bootstrapping, the distribution of the product methods, and hierarchical Bayesian Markov chain Monte Carlo (MCMC) methods. These different approaches for assessing mediation are illustrated using data from Dunn, Biesanz, Human, and Finn (2007). However, little is known about how these methods perform relative to each other, particularly in more challenging situations, such as with data that are incomplete and/or nonnormal. This article presents an extensive Monte Carlo simulation evaluating a host of approaches for assessing mediation. We examine Type I error rates, power, and coverage. We study normal and nonnormal data as well as complete and incomplete data. In addition, we adapt a method, recently proposed in statistical literature, that does not rely on confidence intervals (CIs) to test the null hypothesis of no indirect effect. The results suggest that the new inferential method, the partial posterior p value, slightly outperforms existing ones in terms of maintaining Type I error rates while maximizing power, especially with incomplete data. Among confidence interval approaches, the bias-corrected accelerated (BC_a) bootstrapping approach often has inflated Type I error rates and inconsistent coverage and is not recommended. In contrast, the bootstrapped percentile confidence interval and the hierarchical Bayesian MCMC method perform best overall, maintaining Type I error rates, exhibiting reasonable power, and producing stable and accurate coverage rates.