期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A robust approach for analyzing unbalanced factorial designs with fixed levels

Guillermo Vallejo Manuel Ato M. Paula Fernández 《Behavior research methods》2010,42(2):607-617

The goal of this study was to investigate the performance of Hall’s transformation of the Brunner-Dette-Munk (BDM) and Welch-James (WJ) test statistics and Box-Cox’s data transformation in factorial designs when normality and variance homogeneity assumptions were violated separately and jointly. On the basis of unweighted marginal means, we performed a simulation study to explore the operating characteristics of the methods proposed for a variety of distributions with small sample sizes. Monte Carlo simulation results showed that when data were sampled from symmetric distributions, the error rates of the original BDM and WJ tests were scarcely affected by the lack of normality and homogeneity of variance. In contrast, when data were sampled from skewed distributions, the original BDM and WJ rates were not well controlled. Under such circumstances, the results clearly revealed that Hall’s transformation of the BDM and WJ tests provided generally better control of Type I error rates than did the same tests based on Box-Cox’s data transformation. Among all the methods considered in this study, we also found that Hall’s transformation of the BDM test yielded the best control of Type I errors, although it was often less powerful than either of the WJ tests when both approaches reasonably controlled the error rates. 相似文献

2.

A closer look at the effect of preliminary goodness-of-fit testing for normality for the one-sample t-test

Rochon J Kieser M 《The British journal of mathematical and statistical psychology》2011,64(3):410-426

Student's one-sample t-test is a commonly used method when inference about the population mean is made. As advocated in textbooks and articles, the assumption of normality is often checked by a preliminary goodness-of-fit (GOF) test. In a paper recently published by Schucany and Ng it was shown that, for the uniform distribution, screening of samples by a pretest for normality leads to a more conservative conditional Type I error rate than application of the one-sample t-test without preliminary GOF test. In contrast, for the exponential distribution, the conditional level is even more elevated than the Type I error rate of the t-test without pretest. We examine the reasons behind these characteristics. In a simulation study, samples drawn from the exponential, lognormal, uniform, Student's t-distribution with 2 degrees of freedom (t(2) ) and the standard normal distribution that had passed normality screening, as well as the ingredients of the test statistics calculated from these samples, are investigated. For non-normal distributions, we found that preliminary testing for normality may change the distribution of means and standard deviations of the selected samples as well as the correlation between them (if the underlying distribution is non-symmetric), thus leading to altered distributions of the resulting test statistics. It is shown that for skewed distributions the excess in Type I error rate may be even more pronounced when testing one-sided hypotheses. 相似文献

3.

认知诊断测验中基于信息矩阵的多群组DIF检验

孙小坚刘彦楼王诗梦辛涛宋乃庆周蔓《心理科学》2022,45(3):710-717

基于改进的Wald统计量,将适用于两群组的DIF检测方法拓展至多群组的项目功能差异（DIF）检验;改进的Wald统计量将分别通过计算观察信息矩阵（Obs）和经验交叉相乘信息矩阵（XPD）而得到。模拟研究探讨了此二者与传统计算方法在多个群组下的DIF检验情况,结果表明：（1）Obs和XPD的一类错误率明显低于传统方法,DINA模型估计下Obs和XPD的一类错误率接近理论水平;（2）样本量和DIF量较大时,Obs和XPD具有与传统Wald统计量大体相同的统计检验力。相似文献

4.

Bootstrapping to test for nonzero population correlation coefficients using univariate sampling

Beasley WH Deshea L Toothaker LE Mendoza JL Bard DE Rodgers JL 《心理学方法》2007,12(4):414-433

This article proposes 2 new approaches to test a nonzero population correlation (rho): the hypothesis-imposed univariate sampling bootstrap (HI) and the observed-imposed univariate sampling bootstrap (OI). The authors simulated correlated populations with various combinations of normal and skewed variates. With alpha set=.05, N> or =10, and rho< or =0.4, empirical Type I error rates of the parametric r and the conventional bivariate sampling bootstrap reached .168 and .081, respectively, whereas the largest error rates of the HI and the OI were .079 and .062. On the basis of these results, the authors suggest that the OI is preferable in alpha control to parametric approaches if the researcher believes the population is nonnormal and wishes to test for nonzero rhos of moderate size. 相似文献

5.

Comparing means from nonnormal distributions: The bisquare-weighted analysis of variance

Rebecca Anne Regeth W. Wren Stine 《Behavior research methods》1998,30(4):707-712

A new procedure analogous to the analysis of variance (ANOVA), called the bisquare-weighted ANOVA (bANOVA), is described. When a traditional ANOVA is calculated, using samples from a distribution with heavy tails, the Type I error rates remain in check, but the Type II error rates increase, relative to those across samples from a normal distribution. The bANOVA is robust with respect to deviations from a normal distribution, maintaining high power with normal and heavy-tailed distributions alike. The more popular rank ANOVA (rANOVA) is also described briefly. However, the rANOVA is not as robust to large deviations from normality as is the bANOVA, and it generates high Type I error rates when applied to three-way designs. 相似文献

6.

Comparing dependent correlations

Wilcox RR Tian T 《The Journal of general psychology》2008,135(1):105-112

In a recent article in The Journal of General Psychology, J. B. Hittner, K. May, and N. C. Silver (2003) described their investigation of several methods for comparing dependent correlations and found that all can be unsatisfactory, in terms of Type I errors, even with a sample size of 300. More precisely, when researchers test at the .05 level, the actual Type I error probability can exceed .10. The authors of this article extended J. B. Hittner et al.'s research by considering a variety of alternative methods. They found 3 that avoid inflating the Type I error rate above the nominal level. However, a Monte Carlo simulation demonstrated that when the underlying distribution of scores violated the assumption of normality, 2 of these methods had relatively low power and had actual Type I error rates well below the nominal level. The authors report comparisons with E. J. Williams' (1959) method. 相似文献

7.

On Selecting Tests for Equality of Two Normal Mean Vectors

《Multivariate behavioral research》2013,48(4):533-548

The conventional approach for testing the equality of two normal mean vectors is to test first the equality of covariance matrices, and if the equality assumption is tenable, then use the two-sample Hotelling T ² test. Otherwise one can use one of the approximate tests for the multivariate Behrens–Fisher problem. In this article, we study the properties of the Hotelling T ² test, the conventional approach, and one of the best approximate invariant tests (Krishnamoorthy & Yu, 2004) for the Behrens–Fisher problem. Our simulation studies indicated that the conventional approach often leads to inflated Type I error rates. The approximate test not only controls Type I error rates very satisfactorily when covariance matrices were arbitrary but was also comparable with the T ² test when covariance matrices were equal. 相似文献

8.

Testing for adverse impact when sample size is small

Collins MW Morris SB 《The Journal of applied psychology》2008,93(2):463-471

Adverse impact evaluations often call for evidence that the disparity between groups in selection rates is statistically significant, and practitioners must choose which test statistic to apply in this situation. To identify the most effective testing procedure, the authors compared several alternate test statistics in terms of Type I error rates and power, focusing on situations with small samples. Significance testing was found to be of limited value because of low power for all tests. Among the alternate test statistics, the widely-used Z-test on the difference between two proportions performed reasonably well, except when sample size was extremely small. A test suggested by G. J. G. Upton (1982) provided slightly better control of Type I error under some conditions but generally produced results similar to the Z-test. Use of the Fisher Exact Test and Yates's continuity-corrected chi-square test are not recommended because of overly conservative Type I error rates and substantially lower power than the Z-test. 相似文献

9.

认知诊断模型中项目水平模型比较统计量的健壮性

刘彦楼张倩萌郑宗军尹昊《心理科学》2005,(5):1251-1259

使用模拟研究方法比较了以往研究中提出的基于观察信息矩阵、三明治矩阵的Wald（分别表示为W_Obs、W_Sw）、似然比（Likelihood Ratio）统计量以及新提出的基于经验交叉相乘信息矩阵的Wald统计量（W_XPD）在模型——数据失拟条件下进行项目水平上模型比较时的表现。结果显示：（1）W_Sw的一类错误控制率有很强的健壮性。（2）W_XPD在Q矩阵错误设定的大多数条件下的表现优于W_Sw。结论：模型—数据拟合良好时可以使用W_Sw进行项目水平上的模型比较,当模型与数据失拟时W_XPD可能是更好的选择。相似文献

10.

认知诊断模型中项目水平模型比较统计量的健壮性

刘彦楼张倩萌郑宗军尹昊《心理科学》2019,(5):1251-1259

使用模拟研究方法比较了以往研究中提出的基于观察信息矩阵、三明治矩阵的Wald（分别表示为W_Obs、W_Sw）、似然比（Likelihood Ratio）统计量以及新提出的基于经验交叉相乘信息矩阵的Wald统计量（W_XPD）在模型——数据失拟条件下进行项目水平上模型比较时的表现。结果显示：（1）W_Sw的一类错误控制率有很强的健壮性。（2）W_XPD在Q矩阵错误设定的大多数条件下的表现优于W_Sw。结论：模型—数据拟合良好时可以使用W_Sw进行项目水平上的模型比较,当模型与数据失拟时W_XPD可能是更好的选择。相似文献

11.

A practical method for analyzing factorial designs with heteroscedastic data

Vallejo G Ato M Fernández MP Livacic-Rojas PE 《Psychological reports》2008,102(3):643-656

The Type I error rates and powers of three recent tests for analyzing nonorthogonal factorial designs under departures from the assumptions of homogeneity and normality were evaluated using Monte Carlo simulation. Specifically, this work compared the performance of the modified Brown-Forsythe procedure, the generalization of Box's method proposed by Brunner, Dette, and Munk, and the mixed-model procedure adjusted by the Kenward-Roger solution available in the SAS statistical package. With regard to robustness, the three approaches adequately controlled Type I error when the data were generated from symmetric distributions; however, this study's results indicate that, when the data were extracted from asymmetric distributions, the modified Brown-Forsythe approach controlled the Type I error slightly better than the other procedures. With regard to sensitivity, the higher power rates were obtained when the analyses were done with the MIXED procedure of the SAS program. Furthermore, results also identified that, when the data were generated from symmetric distributions, little power was sacrificed by using the generalization of Box's method in place of the modified Brown-Forsythe procedure. 相似文献

12.

A Monte Carlo evaluation of tests for comparing dependent correlations

Hittner JB May K Silver NC 《The Journal of general psychology》2003,130(2):149-168

The authors conducted a Monte Carlo simulation of 8 statistical tests for comparing dependent zero-order correlations. In particular, they evaluated the Type I error rates and power of a number of test statistics for sample sizes (Ns) of 20, 50, 100, and 300 under 3 different population distributions (normal, uniform, and exponential). For the Type I error rate analyses, the authors evaluated 3 different magnitudes of the predictor-criterion correlations (rho(y,x1) = rho(y,x2) = .1, .4, and .7). For the power analyses, they examined 3 different effect sizes or magnitudes of discrepancy between rho(y,x1) and rho(y,x2) (values of .1, .3, and .6). They conducted all of the simulations at 3 different levels of predictor intercorrelation (rho(x1,x2) = .1, .3, and .6). The results indicated that both Type I error rate and power depend not only on sample size and population distribution, but also on (a) the predictor intercorrelation and (b) the effect size (for power) or the magnitude of the predictor-criterion correlations (for Type I error rate). When the authors considered Type I error rate and power simultaneously, the findings suggested that O. J. Dunn and V. A. Clark's (1969) z and E. J. Williams's (1959) t have the best overall statistical properties. The findings extend and refine previous simulation research and as such, should have greater utility for applied researchers. 相似文献

13.

A comparison of methods to test mediation and other intervening variable effects 总被引：11，自引：0，他引：11

MacKinnon DP Lockwood CM Hoffman JM West SG Sheets V 《心理学方法》2002,7(1):83-104

A Monte Carlo study compared 14 methods to test the statistical significance of the intervening variable effect. An intervening variable (mediator) transmits the effect of an independent variable to a dependent variable. The commonly used R. M. Baron and D. A. Kenny (1986) approach has low statistical power. Two methods based on the distribution of the product and 2 difference-in-coefficients methods have the most accurate Type I error rates and greatest statistical power except in 1 important case in which Type I error rates are too high. The best balance of Type I error and statistical power across all cases is the test of the joint significance of the two effects comprising the intervening variable effect. 相似文献

14.

改进的认知诊断模型项目功能差异检验方法——基于观察信息矩阵的Wald统计量

刘彦楼辛涛李令青田伟刘笑笑《心理学报》2016,(5):588-598

Hou,de la Torre和Nandakumar(2014)提出可以使用Wald统计量检验DIF,但其结果的一类错误率存在过度膨胀的问题。本研究中提出了一个使用观察信息矩阵进行计算的改进后的Wald统计量。结果表明:(1)使用观察信息矩阵计算的这一改进后的Wald统计量在DIF检验中具有良好的一类错误控制率,尤其是在项目具有较高区分能力的时候,解决了以往研究中一类错误率过度膨胀的问题。(2)随着样本量的增加以及DIF量的增大,使用观察信息矩阵计算Wald统计量的统计检验力也在增加。相似文献

15.

Estimating Latent Variable Interactions With Nonnormal Observed Data: A Comparison of Four Approaches

Heining Cham Stephen G. West Yue Ma Leona S. Aiken 《Multivariate behavioral research》2013,48(6):840-876

A Monte Carlo simulation was conducted to investigate the robustness of 4 latent variable interaction modeling approaches (Constrained Product Indicator [CPI], Generalized Appended Product Indicator [GAPI], Unconstrained Product Indicator [UPI], and Latent Moderated Structural Equations [LMS]) under high degrees of nonnormality of the observed exogenous variables. Results showed that the CPI and LMS approaches yielded biased estimates of the interaction effect when the exogenous variables were highly nonnormal. When the violation of nonnormality was not severe (normal; symmetric with excess kurtosis < 1), the LMS approach yielded the most efficient estimates of the latent interaction effect with the highest statistical power. In highly nonnormal conditions, the GAPI and UPI approaches with maximum likelihood (ML) estimation yielded unbiased latent interaction effect estimates, with acceptable actual Type I error rates for both the Wald and likelihood ratio tests of interaction effect at N ≥ 500. An empirical example illustrated the use of the 4 approaches in testing a latent variable interaction between academic self-efficacy and positive family role models in the prediction of academic performance. 相似文献

16.

Randomization test for coupled data

Bruno D. Zumbo 《Attention, perception & psychophysics》1996,58(3):471-478

Coupled data arise in perceptual research when subjects are contributing two scores to the data pool. These two scores, it can be reasonably argued, cannot be assumed to be independent of one another; therefore, special treatment is needed when performing statistical inference. This paper shows how the Type I error rate of randomization-based inference is affected by coupled data. It is demonstrated through Monte Carlo simulation that a randomization test behaves much like its parametric counterpart except that, for the randomization test, a negative correlation results in an inflation in the Type I error rate. A new randomization test, the couplet-referenced randomization test, is developed and shown to work for sample sizes of 8 or more observations. An example is presented to demonstrate the computation and interpretation of the new randomization test. 相似文献

17.

Omnibus hypothesis testing in dominance-based ordinal multiple regression

Long JD 《心理学方法》2005,10(3):329-351

Often quantitative data in the social sciences have only ordinal justification. Problems of interpretation can arise when least squares multiple regression (LSMR) is used with ordinal data. Two ordinal alternatives are discussed, dominance-based ordinal multiple regression (DOMR) and proportional odds multiple regression. The Q2 statistic is introduced for testing the omnibus null hypothesis in DOMR. A simulation study is discussed that examines the actual Type I error rate and power of Q2 in comparison to the LSMR omnibus F test under normality and non-normality. Results suggest that Q2 has favorable sampling properties as long as the sample size-to-predictors ratio is not too small, and Q2 can be a good alternative to the omnibus F test when the response variable is non-normal. 相似文献

18.

Evaluating clinical significance: Incorporating robust statistics with normative comparison tests

Katrina van Wieringen Robert A. Cribbie 《The British journal of mathematical and statistical psychology》2014,67(2):213-230

The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non‐normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann–Yuen test of equivalence, which utilizes trimmed means, with that of the previously recommended Schuirmann and Schuirmann–Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann–Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann–Welch tests, and the power of the Schuirmann–Yuen was substantially greater than that of the Schuirmann or Schuirmann–Welch tests when distributions were skewed or outliers were present. The Schuirmann–Yuen test is recommended for assessing clinical significance with normative comparisons. 相似文献

19.

三类多层中介效应分析方法比较

方杰温忠麟《心理科学》2018,(4):962-967

比较了贝叶斯法、Monte Carlo法和参数Bootstrap法在2-1-1多层中介分析中的表现。结果发现：1)有先验信息的贝叶斯法的中介效应点估计和区间估计都最准确;2)无先验信息的贝叶斯法、Monte Carlo法、偏差校正和未校正的参数Bootstrap法的中介效应点估计和区间估计表现相当,但Monte Carlo法在第Ⅰ类错误率和区间宽度指标上表现略优于其他三种方法,偏差校正的Bootstrap法在统计检验力上表现略优于其他三种方法,但在第Ⅰ类错误率上表现最差;结果表明,当有先验信息时,推荐使用贝叶斯法;当先验信息不可得时,推荐使用Monte Carlo法。相似文献

20.

Robust step‐down tests for multivariate independent group designs

《The British journal of mathematical and statistical psychology》2007,60(2):245-265

A composite step‐down procedure, in which a set of step‐down tests are summarized collectively with Fisher's combination statistic, was considered to test for multivariate mean equality in two‐group designs. An approximate degrees of freedom (ADF) composite procedure based on trimmed/Winsorized estimators and a non‐pooled estimate of error variance is proposed, and compared to a composite procedure based on trimmed/Winsorized estimators and a pooled estimate of error variance. The step‐down procedures were also compared to Hotelling's T² and Johansen's ADF global procedure based on trimmed estimators in a simulation study. Type I error rates of the pooled step‐down procedure were sensitive to covariance heterogeneity in unbalanced designs; error rates were similar to those of Hotelling's T² across all of the investigated conditions. Type I error rates of the ADF composite step‐down procedure were insensitive to covariance heterogeneity and less sensitive to the number of dependent variables when sample size was small than error rates of Johansen's test. The ADF composite step‐down procedure is recommended for testing hypotheses of mean equality in two‐group designs except when the data are sampled from populations with different degrees of multivariate skewness. 相似文献