1.
Jiin‐Huarng Guo Dr Wei‐Ming Luh 《The British journal of mathematical and statistical psychology》2009,62(2):283-298
When planning a study, sample size determination is one of the most important tasks facing the researcher. The size will depend on the purpose of the study, the cost limitations, and the nature of the data. By specifying the standard deviation ratio and/or the sample size ratio, the present study considers the problem of heterogeneous variances and non‐normality for Yuen's two‐group test and develops sample size formulas to minimize the total cost or maximize the power of the test. For a given power, the sample size allocation ratio can be manipulated so that the proposed formulas can minimize the total cost, the total sample size, or the sum of total sample size and total cost. On the other hand, for a given total cost, the optimum sample size allocation ratio can maximize the statistical power of the test. After the sample size is determined, the present simulation applies Yuen's test to the sample generated, and then the procedure is validated in terms of Type I errors and power. Simulation results show that the proposed formulas can control Type I errors and achieve the desired power under the various conditions specified. Finally, the implications for determining sample sizes in experimental studies and future research are discussed.
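The cost-minimizing allocation described in this abstract can be illustrated with a minimal sketch. It is not the authors' formula: it assumes a plain normal approximation for the difference of two untrimmed means, whereas the paper works with Yuen's test (trimmed means and Winsorized variances). The function name `cost_optimal_sizes` and all parameter names are hypothetical. Under these assumptions, the total cost `cost1 * n1 + cost2 * n2` for a fixed power is smallest when `n1 / n2 = (sd1 / sd2) * sqrt(cost2 / cost1)`.

```python
import math
from statistics import NormalDist


def cost_optimal_sizes(sd1, sd2, cost1, cost2, delta, alpha=0.05, power=0.80):
    """Group sizes (n1, n2) for a two-sided test of a mean difference `delta`.

    Assumption: normal approximation with known, possibly unequal standard
    deviations; this is a simplification, not the Yuen-based formula.
    """
    # Sum of the two standard-normal quantiles that set Type I error and power.
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Allocation ratio n1 / n2 that minimizes cost1 * n1 + cost2 * n2.
    ratio = (sd1 / sd2) * math.sqrt(cost2 / cost1)
    # Solve sd1^2 / n1 + sd2^2 / n2 = (delta / z)^2 with n1 = ratio * n2.
    n2 = (sd1 ** 2 / ratio + sd2 ** 2) * (z / delta) ** 2
    n1 = ratio * n2
    return math.ceil(n1), math.ceil(n2)
```

With equal standard deviations and equal unit costs, `cost_optimal_sizes(1, 1, 1, 1, 0.5)` returns 63 per group. Making one group more variable or cheaper to sample shifts the allocation toward that group.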
2.
Jiin‐Huarng Guo Dr Wei‐Ming Luh 《The British journal of mathematical and statistical psychology》2009,62(2):417-425
The factorial 2 × 2 fixed‐effect ANOVA is a procedure used frequently in scientific research to test mean differences between‐subjects in all of the groups. But if the assumption of homogeneity is violated, the test for the row, column, and the interaction effect might be invalid or less powerful. Therefore, for planning research in the case of unknown and possibly unequal variances, it is worth developing a sample size formula to obtain the desired power. This article suggests a simple formula to determine the sample size for 2 × 2 fixed‐effect ANOVA for heterogeneous variances across groups. We use the approximate Welch t test and consider the variance ratio to derive the formula. The sample size determination requires two‐step iterations but the approximate sample sizes needed for the main effect and the interaction effect can be determined separately with the specified power. The present study also provides an example and a SAS program to facilitate the calculation process.
3.
Satoshi Usami 《Behavior research methods》2014,46(2):346-356
Hierarchical data sets arise when the data for lower units (e.g., individuals such as students, clients, and citizens) are nested within higher units (e.g., groups such as classes, hospitals, and regions). In data collection for experimental research, estimating the required sample size beforehand is a fundamental question for obtaining sufficient statistical power and precision of the focused parameters. The present research extends previous research from Heo and Leon (2008) and Usami (2011b), by deriving closed-form formulas for determining the required sample size to test effects in experimental research with hierarchical data, and by focusing on both multisite-randomized trials (MRTs) and cluster-randomized trials (CRTs). These formulas consider both statistical power and the width of the confidence interval of a standardized effect size, on the basis of estimates from a random-intercept model for three-level data that considers both balanced and unbalanced designs. These formulas also address some important results, such as the lower bounds of the needed units at the highest levels.
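The idea of a lower bound on the number of highest-level units can be sketched with the familiar two-level design effect. This is a simplification offered only for illustration: the abstract concerns three-level random-intercept models, whose closed-form formulas are not reproduced here. The function names are hypothetical, `n_individual` stands for the per-arm sample size that an individually randomized trial would need, and `icc` is the intraclass correlation.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from sampling whole clusters of equal size."""
    return 1 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual, cluster_size, icc):
    """Clusters per arm in a balanced two-level cluster-randomized trial."""
    inflated_n = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated_n / cluster_size)


def min_clusters_per_arm(n_individual, icc):
    """Limit of the cluster count as the cluster size grows without bound.

    inflated_n / cluster_size tends to n_individual * icc, so no cluster
    size, however large, brings the requirement below this value.
    """
    return math.ceil(n_individual * icc)
```

For example, with `n_individual = 63` and `icc = 0.05`, clusters of 20 require 7 clusters per arm, and even arbitrarily large clusters still require at least 4.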
4.
Jiin-huarng Guo Wei-ming Luh 《The British journal of mathematical and statistical psychology》2020,73(2):316-332
The equality of two group variances is frequently tested in experiments. However, criticisms of null hypothesis statistical testing on means have recently arisen and there is interest in other types of statistical tests of hypotheses, such as superiority/non-inferiority and equivalence. Although these tests have become more common in psychology and social sciences, the corresponding sample size estimation for these tests is rarely discussed, especially when the sampling unit costs are unequal or group sizes are unequal for two groups. Thus, for finding optimal sample size, the present study derived an initial allocation by approximating the percentiles of an F distribution with the percentiles of the standard normal distribution and used the exhaustion algorithm to select the best combination of group sizes, thereby ensuring the resulting power reaches the designated level and is maximal with a minimal total cost. In this manner, optimization of sample size planning is achieved. The proposed sample size determination has a wide range of applications and is efficient in terms of Type I errors and statistical power in simulations. Finally, an illustrative example from a report by the Health Survey for England, 1995–1997, is presented using hypertension data. For ease of application, four R Shiny apps are provided and benchmarks for setting equivalence margins are suggested.
5.
Philip H. Ramsey Patricia P. Ramsey 《The British journal of mathematical and statistical psychology》2008,61(1):115-131
A Monte Carlo simulation was conducted to compare five pairwise multiple comparison procedures. The number of means varied from 4 to 6 and the sample size ratio varied from 1 to 60. Procedures were evaluated on the basis of Type I errors, any‐pair power and all‐pairs power. Four procedures were shown to be conservative, while the fifth provided adequate control of Type I errors only for restricted values of sample size ratios. No procedure was found to be uniformly most powerful. The Tukey‐Kramer procedure was found to provide the best any‐pair power provided it is applied without requiring a significant overall F test. In most cases, the Hayter‐Fisher modification of the Tukey‐Kramer was found to provide very good any‐pair power and to be uniformly more powerful than the Tukey‐Kramer when a significant overall F test is required. A partition‐based version of Peritz's method usually provided the greatest all‐pairs power. A modification of the Shaffer‐Welsch was found to be useful in certain conditions.
6.
The use of hierarchical data (also called multilevel data or clustered data) is common in behavioural and psychological research when data of lower-level units (e.g., students, clients, repeated measures) are nested within clusters or higher-level units (e.g., classes, hospitals, individuals). Over the past 25 years we have seen great advances in methods for computing the sample sizes needed to obtain the desired statistical properties for such data in experimental evaluations. The present research provides closed-form and iterative formulas for sample size determination that can be used to ensure the desired width of confidence intervals for hierarchical data. Formulas are provided for a four-level hierarchical linear model that assumes slope variances and inclusion of covariates under both balanced and unbalanced designs. In addition, we address several mathematical properties relating to sample size determination for hierarchical data via the standard errors of experimental effect estimates. These include the relative impact of several indices (e.g., random intercept or slope variance at each level) on standard errors, asymptotic standard errors, minimum required values at the highest level, and generalized expressions of standard errors for designs with any-level randomization under any number of levels. In particular, information on the minimum required values will help researchers to minimize the risk of conducting experiments that are statistically unlikely to show the presence of an experimental effect.
7.
Arnold D. Well Alexander Pollatsek Susan J. Boyce 《Organizational behavior and human decision processes》1990,47(2)
In the first three experiments, we attempted to learn more about subjects' understanding of the importance of sample size by systematically changing aspects of the problems we gave to subjects. In a fourth study, understanding of the effects of sample size was tested as subjects went through a computer-assisted training procedure that dealt with random sampling and the sampling distribution of the mean. Subjects used sample size information more appropriately for problems that were stated in terms of the accuracy of the sample average or the center of the sampling distribution than for problems stated in terms of the tails of the sampling distribution. Apparently, people understand that the means of larger samples are more likely to resemble the population mean but not the implications of this fact for the variability of the mean. The fourth experiment showed that although instruction about the sampling distribution of the mean led to better understanding of the effects of sample size, subjects were still unable to make correct inferences about the variability of the mean. The appreciation that people have for some aspects of the law of large numbers does not seem to result from an in-depth understanding of the relation between sample size and variability.
8.
G. Shieh 《Behavior research methods》2013,45(4):955-967
The use of effect sizes and associated confidence intervals in all empirical research has been strongly emphasized by journal publication guidelines. To help advance theory and practice in the social sciences, this article describes an improved procedure for constructing confidence intervals of the standardized mean difference effect size between two independent normal populations with unknown and possibly unequal variances. The presented approach has advantages over the existing formula in both theoretical justification and computational simplicity. In addition, simulation results show that the suggested one- and two-sided confidence intervals are more accurate in achieving the nominal coverage probability. The proposed estimation method provides a feasible alternative to the most commonly used measure of Cohen’s d and the corresponding interval procedure when the assumption of homogeneous variances is not tenable. To further improve the potential applicability of the suggested methodology, the sample size procedures for precise interval estimation of the standardized mean difference are also delineated. The desired precision of a confidence interval is assessed with respect to the control of expected width and to the assurance probability of interval width within a designated value. Supplementary computer programs are developed to aid in the usefulness and implementation of the introduced techniques.
9.
A recent trend in the psychological literature has been to include measures of effect size when reporting probability values. The several measures of effect size associated with the Student t test for two independent samples are appropriate only when the variances are homogeneous. In this paper, commonly used measures of effect size are considered and compared, using four data sets. A chance-corrected measure of effect size is provided for two or more treatment groups characterized by either homogeneous or heterogeneous variances.
10.
Philip H. Ramsey Patricia P. Ramsey 《The British journal of mathematical and statistical psychology》2009,62(2):263-281
A Monte Carlo simulation was conducted to compare pairwise multiple comparison procedures. The number of means varied from 4 to 8 and the sample sizes varied from 2 to 500. Procedures were evaluated on the basis of Type I errors, any‐pair power and all‐pairs power. Two modifications of the Games and Howell procedure were shown to make it conservative. No procedure was found to be uniformly most powerful. For any‐pair power, the Games and Howell procedure was found to be generally most powerful even when applied at more stringent levels to control Type I errors. For all‐pairs power, the Peritz procedure applied with modified Brown–Forsythe tests was found to be most powerful in most conditions.
11.
The authors discuss potential confusion in conducting primary studies and meta-analyses on the basis of differences between groups. First, the authors show that a formula for the sampling error of the standardized mean difference (d) that is based on equal group sample sizes can produce substantially biased results if applied with markedly unequal group sizes. Second, the authors show that the same concerns are present when primary analyses or meta-analyses are conducted with point-biserial correlations, as the point-biserial correlation (r) is a transformation of d. Third, the authors examine the practice of correcting a point-biserial r for unequal sample sizes and note that such correction would also increase the sampling error of the corrected r. Correcting rs for unequal sample sizes, but using the standard formula for sampling error in uncorrected r, can result in bias. The authors offer a set of recommendations for conducting meta-analyses of group differences.
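The first point of this abstract can be illustrated numerically. The sketch below uses the common large-sample approximation for the sampling variance of d and the shortcut obtained by setting both group sizes to half the total. Whether these are exactly the formulas the authors examine is an assumption, and the function names are hypothetical.

```python
def var_d_general(d, n1, n2):
    """Large-sample sampling variance of d for arbitrary group sizes."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def var_d_equal_n(d, n_total):
    """Shortcut that is only valid when both groups have n_total / 2 cases."""
    return (4 / n_total) * (1 + d ** 2 / 8)
```

With d = 0.5 and 100 cases in total, the two functions agree for a 50/50 split (about 0.041). For a 10/90 split the general formula gives about 0.112, so the equal-n shortcut understates the sampling variance by a factor of more than two.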
12.
Show‐Li Jan Gwowen Shieh 《The British journal of mathematical and statistical psychology》2014,67(1):72-93
For one‐way fixed effects ANOVA, it is well known that the conventional F test of the equality of means is not robust to unequal variances, and numerous methods have been proposed for dealing with heteroscedasticity. On the basis of extensive empirical evidence of Type I error control and power performance, Welch's procedure is frequently recommended as the major alternative to the ANOVA F test under variance heterogeneity. To enhance its practical usefulness, this paper considers an important aspect of Welch's method in determining the sample size necessary to achieve a given power. Simulation studies are conducted to compare two approximate power functions of Welch's test for their accuracy in sample size calculations over a wide variety of model configurations with heteroscedastic structures. The numerical investigations show that Levy's (1978a) approach is clearly more accurate than the formula of Luh and Guo (2011) for the range of model specifications considered here. Accordingly, computer programs are provided to implement the technique recommended by Levy for power calculation and sample size determination within the context of the one‐way heteroscedastic ANOVA model.
13.
Douglas Elliffe Martin Elliffe 《Journal of the experimental analysis of behavior》2019,111(2):342-358
We advocate for rank‐permutation tests as the best choice for null‐hypothesis significance testing of behavioral data, because these tests require neither distributional assumptions about the populations from which our data were drawn nor the measurement assumption that our data are measured on an interval scale. We provide an algorithm that enables exact‐probability versions of such tests without recourse to either large‐sample approximation or resampling approaches. We particularly consider a rank‐permutation test for monotonic trend, and provide an extension of this test that allows unequal number of data points, or observations, for each subject. We provide an extended table of critical values of the test statistic for this test, and both a spreadsheet implementation and an Oracle® Java Web Start application to generate other critical values at https://sites.google.com/a/eastbayspecialists.co.nz/rank-permutation/ .
14.
Sergio Chrisopoulos Maureen F. Dollard Anthony H. Winefield Christian Dormann 《Journal of Occupational & Organizational Psychology》2010,83(1):17-37
Research into work stress has attempted to identify job resources that can moderate the effects of job demands on strain. The recently developed triple‐match principle (TMP) proposes that job demands, resources, and strain can be conceptualized as being composed of cognitive, emotional, and physical dimensions. When a psychological imbalance is induced by job demands, individuals activate corresponding resources to reduce the effects of the demands. A closer match occurs when the resources are processed in the same psychological domain as the demands. The further away from a match, the less likely an interactive effect becomes. Put simply, the likelihood of finding an interactive effect between job demands and job resources is greatest when demands, resources, and strain are based on qualitatively similar dimensions (i.e. cognitive, emotional, and physical). For example, emotional support from colleagues is likely to buffer the effects of emotional demands on emotional exhaustion. The TMP was tested in a sample of 179 Australian police officers in a two‐wave longitudinal study. The likelihood of finding an interactive effect was related to the degree of match between job demands, job resources, and strain with 33.3% of triple‐match interactions significant, 22.2% when there was a double‐match, and 0.0% when there was no match. These findings lend support to the TMP as a guiding framework, for research, to explore possible interactive effects in work stress research, and for practice, to inform interventions matching resources to occupational demands, to offset strain.
15.
A comparison of two‐stage procedures for testing least‐squares coefficients under heteroscedasticity
Marie Ng Rand R. Wilcox 《The British journal of mathematical and statistical psychology》2011,64(2):244-258
This study explores the performance of several two‐stage procedures for testing ordinary least‐squares (OLS) coefficients under heteroscedasticity. A test of the usual homoscedasticity assumption is carried out in the first stage of the procedure. Subsequently, a test of the regression coefficients is chosen and performed in the second stage. Three recently developed methods for detecting heteroscedasticity are examined. In addition, three heteroscedastic robust tests of OLS coefficients are considered. A major finding is that performing a test of heteroscedasticity prior to applying a heteroscedastic robust test can lead to poor control over Type I errors.
16.
17.
RIKKE LAMBEK ANEGEN TRILLINGSGAARD BJÖRN KADESJÖ DORTE DAMM PER HOVE THOMSEN 《Scandinavian journal of psychology》2010,51(6):540-547
Lambek, R., Trillingsgaard, A., Kadesjö, B., Damm, D. & Thomsen, P. H. (2010). Gender differences on the Five to Fifteen questionnaire in a non‐referred sample with inattention and hyperactivity‐impulsivity and a clinic‐referred sample with hyperkinetic disorder. Scandinavian Journal of Psychology 51, 540–547. The aim of the present study was to examine gender differences in children with inattention, hyperactivity, and impulsivity on the Five to Fifteen (FTF) parent questionnaire. First, non‐referred girls (n = 43) and boys (n = 51) with problems of attention and hyperactivity‐impulsivity and then clinic‐referred girls (n = 35) and boys (n = 66) with hyperkinetic disorder (HKD) were compared on the FTF. Results suggested that non‐referred boys were more hyperactive‐impulsive than non‐referred girls, whereas clinic‐referred boys and girls with HKD were more similar than dissimilar on the FTF questionnaire. Secondly, it was examined whether the application of gender mixed norms versus gender specific norms would result in varying proportions of clinic‐referred children with HKD being identified as impaired on the subdomains of the FTF questionnaire. Based on results it was concluded that the use of a gender mixed normative sample may lead to overestimation of impairment in boys with HKD, but the type of sample applied to define impairment on the FTF should depend on the purpose for applying the questionnaire.
18.
Tingting Zhao Xianghua Luo Haitao Chu Chap T. Le Leonard H. Epstein Janet L. Thomas 《Journal of the experimental analysis of behavior》2016,106(3):242-253
The Cigarette Purchase Task is a behavioral economic assessment tool designed to measure the relative reinforcing efficacy of cigarette smoking across different prices. An exponential demand equation has become a standard model for analyzing purchase task data, but its utility is compromised by its inability to accommodate values of zero consumption. We propose a two‐part mixed effects model that keeps the same exponential demand equation for modeling nonzero consumption values, while providing a logistic regression for the binary outcome of zero versus nonzero consumption. Therefore, the proposed model can accommodate zero consumption values and retain the features of the exponential demand equation at the same time. As a byproduct, the logistic regression component of the proposed model provides a new demand index, the “derived breakpoint”, for the price above which a subject is more likely to be abstinent than to be smoking. We apply the proposed model to data collected at baseline from college students (N = 1,217) enrolled in a randomized clinical trial utilizing financial incentives to motivate tobacco cessation. Monte Carlo simulations showed that the proposed model provides better fits than an existing model. We note that the proposed methodology is applicable to other purchase task data, for example, drugs of abuse.
19.
《Journal of Occupational & Organizational Psychology》2006,79(1):23-35
The level structure of West's (1990) four‐factor model of team climate for innovation was assessed by means of multi‐level confirmatory factor analysis (MCFA). The sample consisted of 1,487 individuals (195 teams) from a wide range of professions. Results showed that a considerable portion of the variance in the data was explained on the team level with intra‐class correlations ranging from .30 to .39. Furthermore, the results demonstrated that the overall measurement model fitted the data well at both the team and individual levels, while the factor loadings were slightly different across the levels with item loadings showing partial invariance. Results from confirmatory factor analyses conducted on separate levels, however, showed that the four‐factor model displayed the best fit to the data for both individual and team levels. A second‐order one‐factor model also fitted the data well on both levels. The results indicate that the team climate for innovation model can be used as a team‐level consensus model of team climate for innovation.
20.
Yung‐Fong Hsu Ching‐Lan Chin 《The British journal of mathematical and statistical psychology》2014,67(2):266-283
The family of (non‐parametric, fixed‐step‐size) adaptive methods, also known as ‘up–down’ or ‘staircase’ methods, has been used extensively in psychophysical studies for threshold estimation. Extensions of adaptive methods to non‐binary responses have also been proposed. An example is the three‐category weighted up–down (WUD) method (Kaernbach, 2001) and its four‐category extension (Klein, 2001). Such an extension, however, is somewhat restricted, and in this paper we discuss its limitations. To facilitate the discussion, we characterize the extension of WUD by an algorithm that incorporates response confidence into a family of adaptive methods. This algorithm can also be applied to two other adaptive methods, namely Derman's up–down method and the biased‐coin design, which are suitable for estimating any threshold quantiles. We then discuss via simulations of the above three methods the limitations of the algorithm. To illustrate, we conduct a small‐scale experiment using the extended WUD under different response confidence formats to evaluate the consistency of threshold estimation.
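As background for the weighted up–down idea, the following minimal sketch shows the equilibrium rule for the ordinary two-response (correct/incorrect) case only. The three‐ and four‐category extensions and the confidence-based algorithm discussed in the abstract are not reproduced here. The function names are hypothetical, and the sketch assumes the stimulus level is lowered after a correct response and raised after an incorrect one.

```python
def wud_step_up(step_down, target):
    """Up-step size that balances `step_down` at the target proportion correct.

    Equilibrium condition: target * step_down == (1 - target) * step_up.
    """
    return step_down * target / (1 - target)


def expected_drift(p_correct, step_down, target):
    """Mean change in stimulus level per trial when P(correct) = p_correct."""
    step_up = wud_step_up(step_down, target)
    # The level goes up after an error and down after a correct response.
    return (1 - p_correct) * step_up - p_correct * step_down
```

For a 75% target, the up-step is three times the down-step. The expected drift is zero when performance equals the target, negative when performance is better (the track moves toward harder levels), and positive when it is worse, which is why such a track settles around the targeted threshold.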