Similar Literature
20 similar articles found (search time: 15 ms)
1.
In conventional frequentist power analysis, one often uses an effect size estimate, treats it as if it were the true value, and ignores uncertainty in the effect size estimate for the analysis. The resulting sample sizes can vary dramatically depending on the chosen effect size value. To resolve the problem, we propose a hybrid Bayesian power analysis procedure that models uncertainty in the effect size estimates from a meta-analysis. We use observed effect sizes and prior distributions to obtain the posterior distribution of the effect size and model parameters. Then, we simulate effect sizes from the obtained posterior distribution. For each simulated effect size, we obtain a power value. With an estimated power distribution for a given sample size, we can estimate the probability of reaching a power level or higher and the expected power. With a range of planned sample sizes, we can generate a power assurance curve. Both the conventional frequentist and our Bayesian procedures were applied to conduct prospective power analyses for two meta-analysis examples (testing standardized mean differences in example 1 and Pearson's correlations in example 2). The advantages of our proposed procedure are demonstrated and discussed.
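
The simulation step is easy to sketch. Below is a minimal illustration in Python, assuming the posterior for the standardized mean difference has already been summarized as a normal distribution (its mean and standard deviation here are hypothetical, as are the candidate sample sizes). Power for each posterior draw is computed exactly from the noncentral t distribution for a two-sided, two-sample t test with equal group sizes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical posterior for the mean standardized mean difference,
# summarized by its mean and standard deviation.
deltas = rng.normal(0.40, 0.12, size=5000)   # posterior draws

alpha = 0.05
for n in (50, 100, 150):                     # planned size per group
    df = 2 * n - 2
    nc = deltas * np.sqrt(n / 2.0)           # noncentrality per draw
    tcrit = stats.t.ppf(1 - alpha / 2, df)
    # Exact two-sided power of the two-sample t test for each draw.
    power = stats.nct.sf(tcrit, df, nc) + stats.nct.cdf(-tcrit, df, nc)
    print(f"n={n}: expected power={power.mean():.3f}, "
          f"P(power >= .80)={np.mean(power >= 0.80):.3f}")
```

Repeating this over a grid of planned sample sizes traces out the power assurance curve described in the abstract.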

2.
The use of effect sizes and associated confidence intervals in all empirical research has been strongly emphasized by journal publication guidelines. To help advance theory and practice in the social sciences, this article describes an improved procedure for constructing confidence intervals of the standardized mean difference effect size between two independent normal populations with unknown and possibly unequal variances. The presented approach has advantages over the existing formula in both theoretical justification and computational simplicity. In addition, simulation results show that the suggested one- and two-sided confidence intervals are more accurate in achieving the nominal coverage probability. The proposed estimation method provides a feasible alternative to the most commonly used measure of Cohen's d and the corresponding interval procedure when the assumption of homogeneous variances is not tenable. To further improve the potential applicability of the suggested methodology, the sample size procedures for precise interval estimation of the standardized mean difference are also delineated. The desired precision of a confidence interval is assessed with respect to the control of expected width and to the assurance probability of interval width within a designated value. Supplementary computer programs are developed to aid in the usefulness and implementation of the introduced techniques.
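
The article's improved interval is not reproduced in the abstract. As a rough stand-in, here is a generic large-sample (delta-method) interval for a standardized mean difference whose standardizer is the square root of the average of the two variances, a form that accommodates unequal variances; the data are simulated, and the variance expression is the standard delta-method approximation under normality, not the authors' formula.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(0.5, 1.0, 35)          # hypothetical group 1
y = rng.normal(0.0, 2.0, 50)          # hypothetical group 2 (larger variance)

n1, n2 = len(x), len(y)
s1, s2 = np.var(x, ddof=1), np.var(y, ddof=1)
s_star2 = (s1 + s2) / 2.0             # average-variance standardizer
d = (np.mean(x) - np.mean(y)) / np.sqrt(s_star2)

# Delta-method variance of d under normality: the first term comes from
# the mean difference, the second from the estimated variances.
var_d = (s1 / n1 + s2 / n2) / s_star2 \
      + d**2 * (s1**2 / (n1 - 1) + s2**2 / (n2 - 1)) / (8 * s_star2**2)

z = stats.norm.ppf(0.975)
ci = (d - z * np.sqrt(var_d), d + z * np.sqrt(var_d))
print(round(d, 3), tuple(round(v, 3) for v in ci))
```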

3.
A hybrid procedure for number correct scoring is proposed. The proposed scoring procedure is based on both classical true-score theory (CTT) and multidimensional item response theory (MIRT). Specifically, the hybrid scoring procedure uses test item weights based on MIRT and the total test scores are computed based on CTT. Thus, what makes the hybrid scoring method attractive is that this method accounts for the dimensionality of the test items while test scores remain easy to compute. Further, the hybrid scoring does not require large sample sizes once the item parameters are known. Monte Carlo techniques were used to compare and contrast the proposed hybrid scoring method with three other scoring procedures. Results indicated that all scoring methods in this study generated estimated and true scores that were highly correlated. However, the hybrid scoring procedure had significantly smaller error variances between the estimated and true scores relative to the other procedures.
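
The abstract does not give the exact MIRT-based weights, so the sketch below uses one plausible choice purely for illustration: each item's multidimensional discrimination (the norm of its slope vector). The CTT part is then just a weighted number-correct total.

```python
import numpy as np

# Hypothetical MIRT discrimination parameters for 5 items on 2 dimensions.
A = np.array([[1.2, 0.1], [0.9, 0.4], [0.3, 1.1], [0.5, 0.8], [1.0, 0.2]])

# One illustrative weighting: each item's multidimensional discrimination
# (Euclidean norm of its slope vector), rescaled to sum to the test length.
w = np.linalg.norm(A, axis=1)
w = w * len(w) / w.sum()

# CTT-style total score: weighted number correct.
responses = np.array([1, 0, 1, 1, 0])   # hypothetical 0/1 item responses
score = float(w @ responses)
print(round(score, 2))
```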

4.
One of the main objectives in meta-analysis is to estimate the overall effect size by calculating a confidence interval (CI). The usual procedure consists of assuming a standard normal distribution and a sampling variance defined as the inverse of the sum of the estimated weights of the effect sizes. But this procedure does not take into account the uncertainty due to the fact that the heterogeneity variance (τ²) and the within-study variances have to be estimated, leading to CIs that are too narrow, with the consequence that the actual coverage probability is smaller than the nominal confidence level. In this article, the performances of 3 alternatives to the standard CI procedure are examined under a random-effects model and 8 different τ² estimators to estimate the weights: the t distribution CI, the weighted variance CI (with an improved variance), and the recently proposed quantile approximation method. The results of a Monte Carlo simulation showed that the weighted variance CI outperformed the other methods regardless of the τ² estimator, the value of τ², the number of studies, and the sample size.
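
A compact sketch of the contrast between the standard CI and the t-distribution CI, using the DerSimonian–Laird τ² estimator (one of the eight estimator choices); the effect sizes and variances are hypothetical, and the weighted variance CI's improved variance is not reproduced here.

```python
import numpy as np
from scipy import stats

# Hypothetical per-study effect sizes and within-study variances.
y = np.array([0.30, 0.12, 0.55, 0.20, 0.41, 0.08])
v = np.array([0.020, 0.035, 0.050, 0.015, 0.040, 0.025])
k = len(y)

# DerSimonian-Laird estimate of the heterogeneity variance tau^2.
w = 1.0 / v
ybar_fixed = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - ybar_fixed) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - (k - 1)) / c)

# Random-effects weights, pooled mean, and its standard error.
w_star = 1.0 / (v + tau2)
mu = np.sum(w_star * y) / np.sum(w_star)
se = np.sqrt(1.0 / np.sum(w_star))

z = stats.norm.ppf(0.975)
t = stats.t.ppf(0.975, k - 1)
print("standard (z) CI: ", (mu - z * se, mu + z * se))
print("t-distribution CI:", (mu - t * se, mu + t * se))  # wider when k is small
```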

5.
One important concept of experimental design is the random assignment of participants to experimental groups. This randomization process is used to prevent selection bias, as well as to provide a strong basis for a cause-and-effect relationship between the independent variable(s) and the dependent variable(s). With small sample sizes, simple randomization may not produce groups that are equal at baseline on one or more of the variables, so more restricted types of randomization, such as stratified permuted-block randomization, can be used. A program was written to calculate the probability that simple randomization will not lead to equality between groups at baseline, and an example of stratified permuted-block randomization was then examined. The findings suggest that for certain variables commonly measured in motor-learning experiments, there is a relatively high probability that groups will not be equal at baseline after simple randomization. This observation reflects the small sample sizes usually found in the motor-learning literature. Stratified permuted-block randomization, however, does lead to greater equality among groups. Implications for researchers are discussed, and a flowchart is proposed to help researchers decide whether to use simple or stratified randomization.
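
A minimal sketch of permuted-block randomization for two groups; the block size and seeding are illustrative. Under stratification, the same routine would simply be run separately within each stratum (e.g., each baseline-skill level).

```python
import random

def permuted_block_randomization(n_participants, block_size=4, seed=0):
    """Assign participants to two groups (A/B) in shuffled blocks so that
    group sizes stay balanced within every block."""
    assert block_size % 2 == 0
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = ["A", "B"] * (block_size // 2)
        rng.shuffle(block)          # random order within the block
        assignments.extend(block)
    return assignments[:n_participants]

print(permuted_block_randomization(10))
```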

6.
Research problems that require a non-parametric analysis of multifactor designs with repeated measures arise in the behavioural sciences. There is, however, a lack of available procedures in commonly used statistical packages. In the present study, a generalization of the aligned rank test for the two-way interaction is proposed for the analysis of the typical sources of variation in a three-way analysis of variance (ANOVA) with repeated measures. It can be implemented in the usual statistical packages. Its statistical properties are tested by using simulation methods with two sample sizes (n = 30 and n = 10) and three distributions (normal, exponential and double exponential). Results indicate substantial increases in power for non-normal distributions in comparison with the usual parametric tests. Similar levels of Type I error for both parametric and aligned rank ANOVA were obtained with non-normal distributions and large sample sizes. Degrees-of-freedom adjustments for Type I error control in small samples are proposed. The procedure is applied to a case study with 30 participants per group where it detects gender differences in linguistic abilities in blind children not shown previously by other methods.
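
The three-way generalization is not spelled out in the abstract, but the two-way building block is easy to sketch: subtract the estimated main effects, rank the aligned scores, and run an ordinary ANOVA on the ranks, interpreting only the interaction term. The sketch below uses hypothetical between-subjects data for brevity; the repeated measures version aligns within the appropriate error strata.

```python
import numpy as np
import pandas as pd
from scipy.stats import rankdata
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)
# Hypothetical 2x3 design, n = 10 per cell, skewed responses.
df = pd.DataFrame({
    "a": np.repeat(["a1", "a2"], 30),
    "b": np.tile(np.repeat(["b1", "b2", "b3"], 10), 2),
})
df["y"] = rng.exponential(1.0, size=60)

# Align for the interaction: remove estimated main effects, keep the rest.
grand = df["y"].mean()
df["aligned"] = (df["y"]
                 - df.groupby("a")["y"].transform("mean")
                 - df.groupby("b")["y"].transform("mean")
                 + grand)
df["r"] = rankdata(df["aligned"])

# Standard ANOVA on the ranks; only the interaction term is interpreted.
fit = smf.ols("r ~ C(a) * C(b)", data=df).fit()
print(anova_lm(fit, typ=2))
```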

7.
The point-biserial correlation is a commonly used measure of effect size in two-group designs. New estimators of point-biserial correlation are derived from different forms of a standardized mean difference. Point-biserial correlations are defined for designs with either fixed or random group sample sizes and can accommodate unequal variances. Confidence intervals and standard errors for the point-biserial correlation estimators are derived from the sampling distributions for pooled-variance and separate-variance versions of a standardized mean difference. The proposed point-biserial confidence intervals can be used to conduct directional two-sided tests, equivalence tests, directional non-equivalence tests, and non-inferiority tests. A confidence interval for an average point-biserial correlation in meta-analysis applications performs substantially better than the currently used methods. Sample size formulas for estimating a point-biserial correlation with desired precision and testing a point-biserial correlation with desired power are proposed. R functions are provided that can be used to compute the proposed confidence intervals and sample size formulas.
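
For the basic pooled-variance case, the point-biserial correlation follows directly from the two-sample t statistic; a quick sketch with simulated data shows the relation and its equivalent form in terms of Cohen's d. The separate-variance versions and the proposed intervals live in the article's R functions, which are not reproduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
g1 = rng.normal(0.5, 1.0, 40)   # hypothetical group 1
g2 = rng.normal(0.0, 1.0, 60)   # hypothetical group 2

# Point-biserial correlation via its exact relation to the pooled t:
# r = t / sqrt(t^2 + df).
t, _ = stats.ttest_ind(g1, g2)          # pooled-variance t test
df = len(g1) + len(g2) - 2
r_pb = t / np.sqrt(t**2 + df)

# Equivalently, from Cohen's d: r = d / sqrt(d^2 + df*(1/n1 + 1/n2)).
d = t * np.sqrt(1/len(g1) + 1/len(g2))
r_check = d / np.sqrt(d**2 + df * (1/len(g1) + 1/len(g2)))
print(round(r_pb, 4), round(r_check, 4))  # identical
```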

8.
We examined nine adaptive methods of trimming, that is, methods that empirically determine when data should be trimmed and the amount to be trimmed from the tails of the empirical distribution. Over the 240 empirical values collected for each method investigated, in which we varied the total percentage of data trimmed, sample size, degree of variance heterogeneity, pairing of variances and group sizes, and population shape, one method resulted in exceptionally good control of Type I errors. However, under less extreme cases of non-normality and variance heterogeneity a number of methods exhibited reasonably good Type I error control. With regard to the power to detect non-null treatment effects, we found that the choice among the methods depended on the degree of non-normality and variance heterogeneity. Recommendations are offered.
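
None of the nine rules is specified in the abstract. To convey the idea only, here is a hypothetical rule that sets the trimming proportion from the sample's tail weight; the kurtosis thresholds are invented for illustration and are not from the paper.

```python
import numpy as np
from scipy import stats

def adaptive_trimmed_mean(x):
    """Hypothetical adaptive rule: trim more as the tails get heavier.
    The thresholds on excess kurtosis are illustrative only."""
    g2 = stats.kurtosis(x)           # excess kurtosis (0 for a normal)
    if g2 < 1:
        prop = 0.0                   # light tails: no trimming
    elif g2 < 4:
        prop = 0.10                  # moderate tails: 10% per tail
    else:
        prop = 0.20                  # heavy tails: 20% per tail
    return stats.trim_mean(x, proportiontocut=prop), prop

rng = np.random.default_rng(11)
x = rng.standard_t(df=3, size=50)    # heavy-tailed sample
print(adaptive_trimmed_mean(x))
```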

9.
A procedure for generating multivariate nonnormal distributions is proposed. Our procedure generates average values of intercorrelations much closer to population parameters than competing procedures for skewed and/or heavy tailed distributions and for small sample sizes. Also, it eliminates the necessity of conducting a factorization procedure on the population correlation matrix that underlies the random deviates, and it is simpler to code in a programming language (e.g., FORTRAN). Numerical examples demonstrating the procedures are given. Monte Carlo results indicate our procedure yields excellent agreement between population parameters and average values of intercorrelation, skew, and kurtosis.

10.
The database for the Army Selection and Classification Project (Project A) contains two major samples, referred to as the concurrent validation sample and the longitudinal validation sample. The former was drawn from a cohort that joined the Army in 1983/84, and the latter from a cohort that entered in 1986/87. This paper describes the database resulting from the concurrent sample. The sampling procedure, the distribution of sample sizes over jobs, the total array of variables, and the data collection procedures are described. Also discussed are the extensive data editing procedures that were used to deal with missing data.

11.
There is no formal and generally accepted procedure for choosing an appropriate significance test for sample data when the assumption of normality is doubtful. Various tests of normality that have been proposed over the years have been found to have limited usefulness, and sometimes a preliminary test makes the situation worse. The present paper investigates a specific and easily applied rule for choosing between a parametric and non-parametric test, the Student t test and the Wilcoxon-Mann-Whitney test, that does not require a preliminary significance test of normality. Simulations reveal that the rule, which can be applied to sample data automatically by computer software, protects the Type I error rate and increases power for various sample sizes, significance levels, and non-normal distribution shapes. Limitations of the procedure in the case of heterogeneity of variance are discussed.
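
The paper's specific rule is not given in the abstract; the sketch below shows the general shape of such a rule, with an invented skewness cutoff standing in for the actual criterion.

```python
import numpy as np
from scipy import stats

def auto_two_sample_test(x, y, skew_cut=0.5):
    """Hypothetical selection rule (not the paper's actual rule): use
    Student's t unless the pooled, centered sample looks clearly
    non-normal by a simple skewness cutoff."""
    pooled = np.concatenate([x - np.mean(x), y - np.mean(y)])
    if abs(stats.skew(pooled)) <= skew_cut:
        return ("t", *stats.ttest_ind(x, y))
    return ("wilcoxon-mann-whitney",
            *stats.mannwhitneyu(x, y, alternative="two-sided"))

rng = np.random.default_rng(5)
print(auto_two_sample_test(rng.exponential(1.0, 30), rng.exponential(1.5, 30)))
```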

12.
A composite step-down procedure, in which a set of step-down tests are summarized collectively with Fisher's combination statistic, was considered to test for multivariate mean equality in two-group designs. An approximate degrees of freedom (ADF) composite procedure based on trimmed/Winsorized estimators and a non-pooled estimate of error variance is proposed, and compared to a composite procedure based on trimmed/Winsorized estimators and a pooled estimate of error variance. The step-down procedures were also compared to Hotelling's T² and Johansen's ADF global procedure based on trimmed estimators in a simulation study. Type I error rates of the pooled step-down procedure were sensitive to covariance heterogeneity in unbalanced designs; error rates were similar to those of Hotelling's T² across all of the investigated conditions. Type I error rates of the ADF composite step-down procedure were insensitive to covariance heterogeneity and less sensitive to the number of dependent variables when sample size was small than error rates of Johansen's test. The ADF composite step-down procedure is recommended for testing hypotheses of mean equality in two-group designs except when the data are sampled from populations with different degrees of multivariate skewness.
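
The combining device itself is standard: under the joint null, Fisher's statistic X = -2 Σ ln pᵢ follows a χ² distribution with 2m degrees of freedom. A quick check with hypothetical step-down p-values:

```python
import numpy as np
from scipy import stats

# Hypothetical p-values from a set of m step-down tests.
p = np.array([0.04, 0.20, 0.01, 0.33])

# Fisher's combination statistic: X = -2 * sum(ln p) ~ chi^2 with 2m df
# when all m component null hypotheses are true.
X = -2.0 * np.sum(np.log(p))
p_combined = stats.chi2.sf(X, df=2 * len(p))

# scipy provides the same computation directly:
stat, p_scipy = stats.combine_pvalues(p, method="fisher")
print(round(p_combined, 4), round(p_scipy, 4))  # identical
```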

13.
The current paper proposes a solution that generalizes the ideas of Brown and Forsythe to the problem of testing hypotheses in two-way classification designs with a heteroscedastic error structure. Unlike the standard analysis of variance, the proposed approach does not require the homogeneity assumption. A comprehensive simulation study, in which the cell sample sizes, the relationship between cell sizes and unequal variances, the degree of variance heterogeneity, and the population distribution shape were systematically manipulated, shows that the proposed approximation is generally robust when the normality and variance-homogeneity assumptions are jointly violated.
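
For reference, the one-way Brown–Forsythe statistic that the paper generalizes can be sketched directly; the two-way extension and its critical values follow the authors' approximation, which is not reproduced here.

```python
import numpy as np
from scipy import stats

def brown_forsythe_means(*groups):
    """One-way Brown-Forsythe test for equal means under unequal variances
    (the building block the paper generalizes to two-way designs)."""
    n = np.array([len(g) for g in groups])
    m = np.array([np.mean(g) for g in groups])
    s2 = np.array([np.var(g, ddof=1) for g in groups])
    N = n.sum()
    grand = np.sum(n * m) / N

    # F* statistic: between-group sum of squares over a variance-weighted
    # denominator that does not assume homogeneity.
    F = np.sum(n * (m - grand) ** 2) / np.sum((1 - n / N) * s2)

    # Satterthwaite-type denominator degrees of freedom.
    c = (1 - n / N) * s2 / np.sum((1 - n / N) * s2)
    df_den = 1.0 / np.sum(c ** 2 / (n - 1))
    df_num = len(groups) - 1
    return F, stats.f.sf(F, df_num, df_den)

rng = np.random.default_rng(2)
print(brown_forsythe_means(rng.normal(0, 1, 20), rng.normal(0, 3, 15),
                           rng.normal(0.5, 2, 25)))
```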

14.
This study examined the degree to which outliers were present in a convenience sample of published single-case research. Using a procedure for analyzing single-case data (Allison & Gorman, Behaviour Research and Therapy, 31, 621–631, 1993), this study compared the effect of outliers using ordinary least squares (OLS) regression to a robust regression method and attempted to answer four questions: (1) To what degree does outlier detection vary from OLS to robust regression? (2) How much do effect sizes differ from OLS to robust regression? (3) Are the differences produced by robust regression in more or less agreement with visual judgments of treatment effectiveness? (4) What is a typical range of effect sizes for robust regression versus OLS regression for data from "effective interventions"? Results suggest that outliers are common in single-case data. The effects of outliers in single-case data are explored, and the implications for researchers and practitioners using single-case designs are discussed.
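
A toy illustration of the OLS-versus-robust contrast on single-case-like data, using Huber M-estimation from statsmodels as the robust method (the particular robust regression method used in the study may differ):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical single-case series: baseline (phase 0) then treatment
# (phase 1), with one injected outlier in the baseline.
y = np.array([3., 4., 3., 15., 4., 9., 10., 11., 10., 12.])  # 15 is the outlier
phase = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
X = sm.add_constant(phase)

ols = sm.OLS(y, X).fit()
rob = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()  # Huber M-estimation

# The phase coefficient is the estimated treatment effect; the outlier
# inflates the baseline level under OLS and shrinks the apparent effect.
print("OLS effect:   ", round(ols.params[1], 2))
print("robust effect:", round(rob.params[1], 2))
```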

15.
16.
Repeated measures analyses of variance are the method of choice in many studies in experimental psychology and the neurosciences. Data from these fields are often characterized by small sample sizes, high numbers of factor levels of the within-subjects factor(s), and nonnormally distributed response variables such as response times. For a design with a single within-subjects factor, we investigated Type I error control in univariate tests with corrected degrees of freedom, the multivariate approach, and a mixed-model (multilevel) approach (SAS PROC MIXED) with Kenward–Roger's adjusted degrees of freedom. We simulated multivariate normal and nonnormal distributions with varied population variance–covariance structures (spherical and nonspherical), sample sizes (N), and numbers of factor levels (K). For normally distributed data, as expected, the univariate approach with Huynh–Feldt correction controlled the Type I error rate with only very few exceptions, even when sample sizes as low as three were combined with high numbers of factor levels. The multivariate approach also controlled the Type I error rate, but it requires N > K. PROC MIXED often showed acceptable control of the Type I error rate for normal data, but it also produced several liberal or conservative results. For nonnormal data, all of the procedures showed clear deviations from the nominal Type I error rate in many conditions, even for sample sizes greater than 50. Thus, none of these approaches can be considered robust if the response variable is nonnormally distributed. The results indicate that both the variance heterogeneity and the covariance heterogeneity of the population covariance matrices affect the error rates.
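
The Greenhouse–Geisser sphericity correction at the heart of the univariate approach is easy to compute from the sample covariance matrix of the repeated measures. A sketch with simulated data (Huynh–Feldt applies a further small-sample bias adjustment to this ε̂, not shown):

```python
import numpy as np
from scipy.linalg import helmert

rng = np.random.default_rng(9)
N, K = 12, 4                      # small N, K levels of the within factor
Y = rng.normal(size=(N, K)) + rng.normal(size=(N, 1))  # correlated columns

S = np.cov(Y, rowvar=False)       # K x K sample covariance matrix
C = helmert(K, full=False)        # (K-1) x K orthonormal contrast matrix
M = C @ S @ C.T

# Greenhouse-Geisser epsilon: 1/(K-1) <= eps <= 1, with eps = 1 under
# sphericity; the univariate F test is then referred to F with
# eps*(K-1) and eps*(K-1)*(N-1) degrees of freedom.
eps_gg = np.trace(M) ** 2 / ((K - 1) * np.trace(M @ M))
print(round(eps_gg, 3))
```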

17.
Although the use of the standardized mean difference in meta-analysis is appealing for several reasons, there are some drawbacks. In this article, we focus on the following problem: a precision-weighted mean of the observed effect sizes results in a biased estimate of the mean standardized mean difference. This bias is due to the fact that the weight given to an observed effect size depends on that observed effect size. In order to eliminate the bias, Hedges and Olkin (1985) proposed using the mean effect size estimate to calculate the weights. We propose a third alternative for calculating the weights: using empirical Bayes estimates of the effect sizes. In a simulation study, these three approaches are compared, with the mean squared error (MSE) as the criterion for evaluating the resulting estimates of the mean effect size. For a meta-analytic dataset with a small number of studies, the MSE is usually smallest when the ordinary procedure is used, whereas for a moderate or large number of studies, the procedures yielding the best results are the empirical Bayes procedure and the procedure of Hedges and Olkin, respectively.
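
The source of the bias and the Hedges–Olkin fix can be shown in a few lines, using the standard large-sample variance of the standardized mean difference. The data are hypothetical, and the empirical Bayes variant is only indicated in a comment.

```python
import numpy as np

# Hypothetical observed standardized mean differences and per-group n's.
d = np.array([0.45, 0.10, 0.70, 0.25, 0.38])
n1 = n2 = np.full(5, 20)
N = n1 + n2

def var_d(delta):
    # Large-sample variance of d (Hedges & Olkin form), evaluated at `delta`.
    return (n1 + n2) / (n1 * n2) + delta**2 / (2 * N)

def weighted_mean(delta_for_weights):
    w = 1.0 / var_d(delta_for_weights)
    return np.sum(w * d) / np.sum(w)

# 1) Ordinary: each study's own d in its weight (the source of the bias,
#    since large observed d's get systematically smaller weights).
m_ordinary = weighted_mean(d)
# 2) Hedges & Olkin: a common mean estimate in every weight.
m_ho = weighted_mean(np.full(5, d.mean()))
print(round(m_ordinary, 4), round(m_ho, 4))
# 3) The empirical Bayes variant would shrink each d toward the mean
#    before computing the weights (not shown here).
```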

18.
In this paper, the performance of six types of techniques for comparisons of means is examined. These six emerge from the distinction between the method employed (hypothesis testing, model selection using information criteria, or Bayesian model selection) and the set of hypotheses that is investigated (a classical, exploration-based set of hypotheses containing equality constraints on the means, or a theory-based limited set of hypotheses with equality and/or order restrictions). A simulation study is conducted to examine the performance of these techniques. We demonstrate that, if one has specific, a priori specified hypotheses, confirmation (i.e., investigating theory-based hypotheses) has advantages over exploration (i.e., examining all possible equality-constrained hypotheses). Furthermore, examining reasonable order-restricted hypotheses has more power to detect the true effect/non-null hypothesis than evaluating only equality restrictions. Additionally, when investigating more than one theory-based hypothesis, model selection is preferred over hypothesis testing. Because of the first two results, we further examine the techniques that are able to evaluate order restrictions in a confirmatory fashion by examining their performance when the homogeneity of variance assumption is violated. Results show that the techniques are robust to heterogeneity when the sample sizes are equal. When the sample sizes are unequal, the performance is affected by heterogeneity. The size and direction of the deviations from the baseline, where there is no heterogeneity, depend on the effect size (of the means) and on the trend in the group variances with respect to the ordering of the group sizes. Importantly, the deviations are less pronounced when the group variances and sizes exhibit the same trend (e.g., are both increasing with group number).

19.
In a variety of measurement situations, the researcher may wish to compare the reliabilities of several instruments administered to the same sample of subjects. This paper presents eleven statistical procedures that test the equality of m coefficient alphas when the sample alpha coefficients are dependent. Several of the procedures are derived in detail, and numerical examples are given for two. Since all of the procedures depend on approximate asymptotic results, Monte Carlo methods are used to assess their accuracy for sample sizes of 50, 100, and 200. Both control of Type I error and power are evaluated by computer simulation. Two of the procedures are unable to control Type I errors satisfactorily. The remaining nine procedures perform properly, but three are somewhat superior in power and Type I error control. A more detailed version of this paper is also available.
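
For concreteness, here is a sketch of coefficient alpha computed for two hypothetical instruments taken by the same subjects, which is exactly the dependent-alphas situation the eleven procedures address (the procedures themselves are not reproduced here):

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (n_subjects, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(4)
true_score = rng.normal(size=(100, 1))
# Two hypothetical 6-item instruments given to the same 100 subjects.
test_a = true_score + rng.normal(scale=0.8, size=(100, 6))
test_b = true_score + rng.normal(scale=1.2, size=(100, 6))
print(round(cronbach_alpha(test_a), 3), round(cronbach_alpha(test_b), 3))
# Testing H0: alpha_a = alpha_b requires a dependent-alphas procedure,
# since both alphas are estimated from the same subjects.
```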

20.
Yuan, Ke-Hai, Bentler, Peter M., & Chan, Wai. Psychometrika, 2004, 69(3), 421–436.
Data in social and behavioral sciences typically possess heavy tails. Structural equation modeling is commonly used in analyzing interrelations among variables of such data. Classical methods for structural equation modeling fit a proposed model to the sample covariance matrix, which can lead to very inefficient parameter estimates. By fitting a structural model to a robust covariance matrix for data with heavy tails, one generally gets more efficient parameter estimates. Because many robust procedures are available, we propose using the empirical efficiency of a set of invariant parameter estimates in identifying an optimal robust procedure. Within the class of elliptical distributions, analytical results show that the robust procedure leading to the most efficient parameter estimates also yields a most powerful test statistic. Examples illustrate the merit of the proposed procedure. The relevance of this procedure to data analysis in a broader context is noted.
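
As one member of the class of robust procedures, here is a sketch that replaces the sample covariance matrix with a minimum covariance determinant (MCD) estimate from scikit-learn before an SEM would be fit; the paper's empirical-efficiency criterion for choosing among robust procedures is not reproduced.

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(6)
# Heavy-tailed multivariate data: normals scaled into a t distribution
# with 3 degrees of freedom via chi-square mixing.
Z = rng.normal(size=(300, 4))
scale = np.sqrt(3.0 / rng.chisquare(3, size=(300, 1)))
X = Z * scale

S_classical = np.cov(X, rowvar=False)
S_robust = MinCovDet(random_state=0).fit(X).covariance_

# An SEM would then be fit to S_robust instead of S_classical; with heavy
# tails the robust matrix is far less driven by extreme observations.
print(np.round(np.diag(S_classical), 2))
print(np.round(np.diag(S_robust), 2))
```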
