Similar literature
1.
Adverse impact evaluations often call for evidence that the disparity between groups in selection rates is statistically significant, and practitioners must choose which test statistic to apply in this situation. To identify the most effective testing procedure, the authors compared several alternate test statistics in terms of Type I error rates and power, focusing on situations with small samples. Significance testing was found to be of limited value because of low power for all tests. Among the alternate test statistics, the widely used Z-test on the difference between two proportions performed reasonably well, except when sample size was extremely small. A test suggested by G. J. G. Upton (1982) provided slightly better control of Type I error under some conditions but generally produced results similar to the Z-test. Use of the Fisher Exact Test and Yates's continuity-corrected chi-square test is not recommended because of overly conservative Type I error rates and substantially lower power than the Z-test.
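
As an illustration of the Z-test on the difference between two proportions mentioned in this abstract, the following is a minimal sketch using the standard pooled-variance form. The function name and the example counts are hypothetical and are not taken from the study; the abstract does not state which exact variant of the test the authors used.

```python
import math


def two_proportion_z_test(pass_a, n_a, pass_b, n_b):
    """Pooled Z-test on the difference between two selection rates.

    pass_a, pass_b: number of applicants selected in each group (hypothetical inputs).
    n_a, n_b: total number of applicants in each group.
    Returns (z, two_sided_p_value).
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    # Pooled selection rate under the null hypothesis of equal rates.
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero: everyone or no one was selected")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(30, 50, 15, 50)
    print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

With very small groups the normal approximation behind this statistic becomes unreliable, which is consistent with the abstract's caveat about extremely small samples.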

2.
A Monte Carlo simulation was conducted to compare five pairwise multiple comparison procedures. The number of means varied from 4 to 6 and the sample size ratio varied from 1 to 60. Procedures were evaluated on the basis of Type I errors, any‐pair power and all‐pairs power. Four procedures were shown to be conservative, while the fifth provided adequate control of Type I errors only for restricted values of sample size ratios. No procedure was found to be uniformly most powerful. The Tukey‐Kramer procedure was found to provide the best any‐pair power provided it is applied without requiring a significant overall F test. In most cases, the Hayter‐Fisher modification of the Tukey‐Kramer was found to provide very good any‐pair power and to be uniformly more powerful than the Tukey‐Kramer when a significant overall F test is required. A partition‐based version of Peritz's method usually provided the greatest all‐pairs power. A modification of the Shaffer‐Welsch was found to be useful in certain conditions.

3.
Sequential rules are explored in the context of null hypothesis significance testing. Several studies have demonstrated that the fixed-sample stopping rule, in which the sample size used by researchers is determined in advance, is less practical and less efficient than sequential stopping rules. It is proposed that a sequential stopping rule called CLAST (composite limited adaptive sequential test) is a superior variant of COAST (composite open adaptive sequential test), a sequential rule proposed by Frick (1998). Simulation studies are conducted to test the efficiency of the proposed rule in terms of sample size and power. Two statistical tests are used: the one-tailed t test of mean differences with two matched samples, and the chi-square independence test for twofold contingency tables. The results show that the CLAST rule is more efficient than the COAST rule and reflects more realistically the practice of experimental psychology researchers.

4.
Factorial experimental designs have many potential advantages for behavioral scientists. For example, such designs may be useful in building more potent interventions by helping investigators to screen several candidate intervention components simultaneously and to decide which are likely to offer greater benefit before evaluating the intervention as a whole. However, sample size and power considerations may challenge investigators attempting to apply such designs, especially when the population of interest is multilevel (e.g., when students are nested within schools, or when employees are nested within organizations). In this article, we examine the feasibility of factorial experimental designs with multiple factors in a multilevel, clustered setting (i.e., of multilevel, multifactor experiments). We conduct Monte Carlo simulations to demonstrate how design elements, such as the number of clusters, the number of lower-level units, and the intraclass correlation, affect power. Our results suggest that multilevel, multifactor experiments are feasible for factor-screening purposes because of the economical properties of complete and fractional factorial experimental designs. We also discuss resources for sample size planning and power estimation for multilevel factorial experiments. These results are discussed from a resource management perspective, in which the goal is to choose a design that maximizes the scientific benefit using the resources available for an investigation.
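
To make concrete how the number of clusters, the number of lower-level units, and the intraclass correlation can affect power, the sketch below uses the textbook design-effect approximation, 1 + (m - 1) × ICC, together with a normal approximation to power. This is not the authors' Monte Carlo procedure: it assumes a single two-arm comparison with randomization at the cluster level and equal cluster sizes, and all function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def design_effect(units_per_cluster, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (units_per_cluster - 1) * icc


def effective_sample_size(n_clusters, units_per_cluster, icc):
    """Total sample size deflated by the design effect."""
    total = n_clusters * units_per_cluster
    return total / design_effect(units_per_cluster, icc)


def approximate_power(effect_size, n_clusters, units_per_cluster, icc, alpha=0.05):
    """Normal-approximation power for a two-arm comparison of means.

    Assumes clusters are split evenly between two arms and that
    effect_size is a standardized mean difference (an assumption of
    this sketch, not something stated in the abstract).
    """
    n_eff = effective_sample_size(n_clusters, units_per_cluster, icc)
    # Standard error of a standardized difference with n_eff / 2 per arm is sqrt(4 / n_eff).
    noncentrality = effect_size * math.sqrt(n_eff / 4)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Ignores the negligible probability of rejecting in the wrong direction.
    return NormalDist().cdf(noncentrality - z_crit)


if __name__ == "__main__":
    for icc in (0.0, 0.05, 0.10, 0.20):
        power = approximate_power(0.3, n_clusters=30, units_per_cluster=20, icc=icc)
        print(f"ICC = {icc:.2f}: approximate power = {power:.2f}")
```

Under these assumptions, raising the intraclass correlation shrinks the effective sample size, so adding clusters tends to help power more than adding units within clusters.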

5.
Contrasts of means are often of interest because they describe the effect size among multiple treatments. High-quality inference of population effect sizes can be achieved through narrow confidence intervals (CIs). Given the close relation between CI width and sample size, we propose two methods to plan the sample size for an ANCOVA or ANOVA study, so that a sufficiently narrow CI for the population (standardized or unstandardized) contrast of interest will be obtained. The standard method plans the sample size so that the expected CI width is sufficiently small. Since CI width is a random variable, the expected width being sufficiently small does not guarantee that the width obtained in a particular study will be sufficiently small. An extended procedure ensures with some specified, high degree of assurance (e.g., 90% of the time) that the CI observed in a particular study will be sufficiently narrow. We also discuss the rationale and usefulness of two different ways to standardize an ANCOVA contrast, and compare three types of standardized contrast in the ANCOVA/ANOVA context. All of the methods we propose have been implemented in the freely available MBESS package in R so that they can be easily applied by researchers.
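
The link between CI width and sample size described in this abstract can be sketched with a deliberately simplified calculation. The code below assumes an unstandardized contrast in a balanced one-way design with a known common standard deviation and uses a normal quantile in place of the t quantile, so it will slightly understate the required sample size. It is not the MBESS implementation and does not cover the assurance-based extension; the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_for_ci_width(coefs, sigma, width, conf_level=0.95):
    """Smallest per-group n so that the (approximate) full CI width <= width.

    coefs: contrast coefficients, e.g. [1, -1] for a two-group difference.
    sigma: assumed known common within-group standard deviation.
    width: desired full width of the confidence interval.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    sum_sq = sum(c * c for c in coefs)
    # Full width = 2 * z * sigma * sqrt(sum(c^2) / n); solve for n and round up.
    return math.ceil((2 * z * sigma) ** 2 * sum_sq / width ** 2)


if __name__ == "__main__":
    for w in (1.0, 0.5, 0.25):
        n = n_per_group_for_ci_width([1, -1], sigma=1.0, width=w)
        print(f"target width {w}: n per group = {n}")
```

Because the width scales with 1/sqrt(n), halving the target width roughly quadruples the required sample size per group.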

6.
Lai K, Kelley K 《Psychological Methods》2011,16(2):127-148
In addition to evaluating a structural equation model (SEM) as a whole, often the model parameters are of interest and confidence intervals for those parameters are formed. Given a model with a good overall fit, it is entirely possible for the targeted effects of interest to have very wide confidence intervals, thus giving little information about the magnitude of the population targeted effects. With the goal of obtaining sufficiently narrow confidence intervals for the model parameters of interest, sample size planning methods for SEM are developed from the accuracy in parameter estimation approach. One method plans for the sample size so that the expected confidence interval width is sufficiently narrow. An extended procedure ensures that the obtained confidence interval will be no wider than desired, with some specified degree of assurance. A Monte Carlo simulation study was conducted that verified the effectiveness of the procedures in realistic situations. The methods developed have been implemented in the MBESS package in R so that they can be easily applied by researchers.

7.
The factorial 2 × 2 fixed‐effect ANOVA is a procedure used frequently in scientific research to test mean differences between‐subjects in all of the groups. But if the assumption of homogeneity is violated, the test for the row, column, and the interaction effect might be invalid or less powerful. Therefore, for planning research in the case of unknown and possibly unequal variances, it is worth developing a sample size formula to obtain the desired power. This article suggests a simple formula to determine the sample size for 2 × 2 fixed‐effect ANOVA for heterogeneous variances across groups. We use the approximate Welch t test and consider the variance ratio to derive the formula. The sample size determination requires two‐step iterations but the approximate sample sizes needed for the main effect and the interaction effect can be determined separately with the specified power. The present study also provides an example and a SAS program to facilitate the calculation process.

8.
The reporting and interpretation of effect size estimates are widely advocated in many academic journals of psychology and related disciplines. However, such concern has not been adequately addressed for analyses involving interactions between categorical and continuous variables. For the purpose of improving current practice, this article presents fundamental features and theoretical developments for the variance of standardized slopes as a desirable standardized effect size measure for the degree of disparity between several slope coefficients. To estimate the effect size, a consistent and nearly unbiased estimator is described and a simple refinement is emphasized for extreme situations whenever appropriate. The essential problems of power and sample size calculations for testing the equality of slope coefficients are also considered. According to the analytic justification and empirical assessment, the exact approach has a clear advantage over the approximate methods. Both SAS and R computer codes are provided to facilitate practical accessibility of the proposed techniques in interaction studies.

9.
In this article, we demonstrate that planning tasks enhance recall when the context of planning (a) is self-referential and (b) draws on familiar scenarios represented in episodic memory. Specifically, we show that when planning tasks are sorted according to the degree to which they evoke memories of personally familiar scenarios (e.g., planning a picnic), recall is reliably superior to tasks that fail to do so (e.g., planning an Arctic trek). We discuss the implications of these findings for planning tasks and their relation to episodic memory.

10.
Developmental studies have provided mixed evidence with regard to the question of whether children consider sample size and sample diversity in their inductive generalizations. Results from four experiments with 105 undergraduates, 105 school-age children (M = 7.2 years), and 105 preschoolers (M = 4.9 years) showed that preschoolers made a higher rate of projections from large samples than from small samples when samples were diverse (Experiments 1 and 3) but not when samples were homogeneous (Experiment 4) and not when the task required a choice between two samples (Experiment 2). Furthermore, when a property occurred in large and diverse samples, preschoolers exhibited a broad pattern of projection, generalizing the property to items from categories not represented in the evidence. In contrast, adults followed a normative pattern of induction and never attributed properties to items from categories not represented in the evidence. School-age children showed a mixed pattern of results.

11.
MOSTELLER F 《Psychometrika》1951,16(2):207-218
A test of goodness of fit is developed for Thurstone's method of paired comparisons, Case V. The test involves the computation of $\chi^2 = n \sum (\theta'' - \theta')^2 / 821$, where $n$ is the number of observations per pair, and $\theta''$ and $\theta'$ are the angles obtained by applying the inverse sine transformation to the fitted and the observed proportions respectively. The number of degrees of freedom is $(k-1)(k-2)/2$. This research was performed in the Laboratory of Social Relations under a grant made available to Harvard University by the RAND Corporation under the Department of the Air Force, Project RAND.
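
The goodness-of-fit computation in this abstract can be sketched as follows. The sketch assumes the statistic takes the form χ² = n Σ(θ″ − θ′)² / 821, with the angles in degrees obtained as arcsin(√p) and the sum running over all k(k − 1)/2 pairs; the constant 821 is approximately (180/π)²/4, the large-sample variance factor of the arcsine transform in degrees. The function names and example proportions are hypothetical, and fitting the Case V scale values themselves is outside the scope of this sketch.

```python
import math


def arcsine_angle_degrees(p):
    """Inverse sine transformation of a proportion, in degrees."""
    return math.degrees(math.asin(math.sqrt(p)))


def paired_comparison_chi_square(fitted, observed, n, k):
    """Chi-square goodness of fit for k stimuli compared in all pairs.

    fitted, observed: proportions for each of the k(k-1)/2 pairs, in the same order.
    n: number of observations per pair.
    Returns (chi_square, degrees_of_freedom).
    """
    n_pairs = k * (k - 1) // 2
    if len(fitted) != n_pairs or len(observed) != n_pairs:
        raise ValueError(f"expected {n_pairs} proportions for k = {k}")
    total = sum(
        (arcsine_angle_degrees(f) - arcsine_angle_degrees(o)) ** 2
        for f, o in zip(fitted, observed)
    )
    chi_square = n * total / 821
    df = (k - 1) * (k - 2) // 2
    return chi_square, df


if __name__ == "__main__":
    fitted = [0.60, 0.75, 0.65]
    observed = [0.58, 0.80, 0.62]
    chi2, df = paired_comparison_chi_square(fitted, observed, n=100, k=3)
    print(f"chi-square = {chi2:.3f} on {df} degree(s) of freedom")
```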

12.
In a variety of measurement situations, the researcher may wish to compare the reliabilities of several instruments administered to the same sample of subjects. This paper presents eleven statistical procedures which test the equality of m coefficient alphas when the sample alpha coefficients are dependent. Several of the procedures are derived in detail, and numerical examples are given for two. Since all of the procedures depend on approximate asymptotic results, Monte Carlo methods are used to assess the accuracy of the procedures for sample sizes of 50, 100, and 200. Both control of Type I error and power are evaluated by computer simulation. Two of the procedures are unable to control Type I errors satisfactorily. The remaining nine procedures perform properly, but three are somewhat superior in power and Type I error control. A more detailed version of this paper is also available.

13.
The specification of sample size is an important aspect of the planning of every experiment. When the investigator intends to use the techniques of analysis of variance in the study of treatment effects, he should, in specifying sample size, take into consideration the power of the F tests which will be made. The charts presented in this paper make possible a simple and direct estimate of the sample size required for F tests of specified power.

14.
The use of hierarchical data (also called multilevel data or clustered data) is common in behavioural and psychological research when data of lower-level units (e.g., students, clients, repeated measures) are nested within clusters or higher-level units (e.g., classes, hospitals, individuals). Over the past 25 years we have seen great advances in methods for computing the sample sizes needed to obtain the desired statistical properties for such data in experimental evaluations. The present research provides closed-form and iterative formulas for sample size determination that can be used to ensure the desired width of confidence intervals for hierarchical data. Formulas are provided for a four-level hierarchical linear model that assumes slope variances and inclusion of covariates under both balanced and unbalanced designs. In addition, we address several mathematical properties relating to sample size determination for hierarchical data via the standard errors of experimental effect estimates. These include the relative impact of several indices (e.g., random intercept or slope variance at each level) on standard errors, asymptotic standard errors, minimum required values at the highest level, and generalized expressions of standard errors for designs with any-level randomization under any number of levels. In particular, information on the minimum required values will help researchers to minimize the risk of conducting experiments that are statistically unlikely to show the presence of an experimental effect.

15.
This article considers the problem of power and sample size calculations for normal outcomes within the framework of multivariate linear models. The emphasis is placed on the practical situation in which not only are the values of response variables for each subject available only after the observations are made, but the levels of explanatory variables also cannot be predetermined before data collection. Using analytic justification, it is shown that the proposed methods extend the existing approaches to accommodate the extra variability and arbitrary configurations of the explanatory variables. The major modification involves the noncentrality parameters associated with the F approximations to the transformations of Wilks likelihood ratio, Pillai trace and Hotelling-Lawley trace statistics. A treatment of multivariate analysis of covariance models is employed to demonstrate the distinct features of the proposed extension. Monte Carlo simulation studies are conducted to assess the accuracy using a child’s intellectual development model. The results update and expand upon current work in the literature. The author wishes to thank the associate editor and the referees for comments which improve the paper considerably. This research was partially supported by a grant from the Natural Science Council of Taiwan.

16.
17.
Although it has been suggested that the delayed realization of intended actions should benefit from appropriate intention planning, empirical evidence on this issue is scarce. In three experiments, we examined whether and which planning aids provided in the intention formation phase affect delayed intention realization in young and old adults. One finding was that intention planning directly affected delayed intention realization: instructing participants to include the cue for appropriate intention initiation in their plans benefited delayed performance. Another finding was that older adults' performance was improved when they were guided in structuring their plan in combination with guidance in implementing this plan after a delay. In sum, the results point to the importance of plan-related factors for understanding the delayed realization of intended actions.

18.
The use of effect sizes and associated confidence intervals in all empirical research has been strongly emphasized by journal publication guidelines. To help advance theory and practice in the social sciences, this article describes an improved procedure for constructing confidence intervals of the standardized mean difference effect size between two independent normal populations with unknown and possibly unequal variances. The presented approach has advantages over the existing formula in both theoretical justification and computational simplicity. In addition, simulation results show that the suggested one- and two-sided confidence intervals are more accurate in achieving the nominal coverage probability. The proposed estimation method provides a feasible alternative to the most commonly used measure of Cohen’s d and the corresponding interval procedure when the assumption of homogeneous variances is not tenable. To further improve the potential applicability of the suggested methodology, the sample size procedures for precise interval estimation of the standardized mean difference are also delineated. The desired precision of a confidence interval is assessed with respect to the control of expected width and to the assurance probability of interval width within a designated value. Supplementary computer programs are developed to aid in the usefulness and implementation of the introduced techniques.

19.
Two experiments examined whether dispositional attributions are sensitive to the sample size of the evidence indicating a given level of covariation between person and behavior. Participants were given high or low levels of covariation (i.e. consensus and distinctiveness), and the acquisition of dispositional attributions was monitored by requesting dispositional trait ratings at fixed intervals. The results showed that dispositional attributions were sensitive to sample size, and increased given more evidence on high person‐behavior covariation while they decreased given more evidence on low person‐behavior covariation. Additional analyses suggested that in making dispositional inferences (e.g. about the actor), there was a slight preference for agreement information (e.g. low distinctiveness) over difference information (e.g. low consensus). The effects of sample size are inconsistent with current statistical or probabilistic models of covariation, but are in line with connectionist networks using an error‐correcting learning algorithm. Copyright © 2003 John Wiley & Sons, Ltd.

20.