Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Contrasts and correlations in effect-size estimation (total citations: 4; self-citations: 0; citations by others: 4)
This article describes procedures for presenting standardized measures of effect size when contrasts are used to ask focused questions of data. The simplest contrasts consist of comparisons of two samples (e.g., based on the independent t statistic). Useful effect-size indices in this situation are members of the g family (e.g., Hedges's g and Cohen's d) and the Pearson r. We review expressions for calculating these measures and for transforming them back and forth, and describe how to adjust formulas for obtaining g or d from t, or r from g, when the sample sizes are unequal. The real-life implications of d or g calculated from t become problematic when there are more than two groups, but the correlational approach is adaptable and interpretable, although more complex than in the case of two groups. We describe a family of four conceptually related correlation indices: the alerting correlation, the contrast correlation, the effect-size correlation, and the BESD (binomial effect-size display) correlation. These last three correlations are identical in the simple setting of only two groups, but differ when there are more than two groups.
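As a rough illustration of the two-sample conversions this abstract mentions, the sketch below derives a standardized mean difference and a correlation from an independent-samples t statistic. The function names are hypothetical, and the formulas are the standard textbook identities (g = t·sqrt(1/n1 + 1/n2) and r = t/sqrt(t² + df)); they are assumed here rather than quoted from the article.

```python
import math

def g_from_t(t, n1, n2):
    """Standardized mean difference from an independent-samples t.

    Assumes the pooled SD uses the n1 + n2 - 2 denominator, so that
    t = (M1 - M2) / (S_pooled * sqrt(1/n1 + 1/n2)). Works for unequal n.
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

def r_from_t(t, n1, n2):
    """Point-biserial correlation from the same t; keeps the sign of t."""
    df = n1 + n2 - 2
    return t / math.sqrt(t * t + df)

if __name__ == "__main__":
    # Hypothetical example values: t = 2.0 with 10 cases per group.
    print(round(g_from_t(2.0, 10, 10), 4))
    print(round(r_from_t(2.0, 10, 10), 4))
```

With unequal groups the same g formula applies unchanged, because the sample sizes enter only through the term 1/n1 + 1/n2.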

2.
The increased use of effect sizes in single studies and meta-analyses raises new questions about statistical inference. Choice of an effect-size index can have a substantial impact on the interpretation of findings. The authors demonstrate the issue by focusing on two popular effect-size measures, the correlation coefficient and the standardized mean difference (e.g., Cohen's d or Hedges's g), both of which can be used when one variable is dichotomous and the other is quantitative. Although the indices are often practically interchangeable, differences in sensitivity to the base rate or variance of the dichotomous variable can alter conclusions about the magnitude of an effect depending on which statistic is used. Because neither statistic is universally superior, researchers should explicitly consider the importance of base rates to formulate correct inferences and justify the selection of a primary effect-size statistic.
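The base-rate sensitivity described in this abstract can be sketched numerically. The example assumes the common large-sample conversion r = d / sqrt(d² + 1/(p(1 − p))), where p is the proportion of cases in one group; the function name and the example values are hypothetical, and the formula is not taken from the abstract itself.

```python
import math

def r_from_d(d, p):
    """Approximate point-biserial r for a given d and base rate p (0 < p < 1).

    Assumption: large-sample form, with p the proportion of cases in one
    of the two groups of the dichotomous variable.
    """
    return d / math.sqrt(d * d + 1.0 / (p * (1.0 - p)))

if __name__ == "__main__":
    # The same d = 0.5 gives a smaller r as the groups become more unequal.
    for p in (0.5, 0.3, 0.1):
        print(p, round(r_from_d(0.5, p), 3))
```

Under this assumed formula, d is unaffected by p while r shrinks as p moves away from .5, which is one way the two indices can lead to different conclusions about the same data.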

3.
The authors used two analyses developed within the framework of the uncontrolled manifold hypothesis to quantify multimuscle synergies during voluntary body sway: analysis of intertrial variance and analysis of motor equivalence with respect to the center of pressure (COP) trajectory. Participants performed voluntary sway tasks in the anteroposterior direction at 0.33 and 0.66 Hz. Muscle groups were identified in the space of muscle activations and used as elemental variables in the synergy analyses. Changing mechanical and vision feedback–based constraints led to significant changes in indices of sway performance such as COP deviations in the uninstructed, mediolateral direction and indices of spontaneous postural sway. In contrast, there were no significant effects on synergy indices. These findings show that the neural control of performance and of its stability may involve different control variables and neurophysiological structures. There were strong correlations between the indices of motor equivalence and those computed using the intercycle variance analysis. This result is potentially important for studies of patients with movement disorders who may be unable to perform multiple trials (cycles) at any given task, making analysis of motor equivalence of single trials a viable alternative to explore changes in stability of actions.

4.
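Test reliability is an important indicator of test quality, and reliability deserves the same attention in cognitive diagnostic assessment. Existing methods for computing reliability in cognitive diagnosis all rest on one assumption: that an examinee's posterior probability distribution and marginal probabilities are identical across two administrations of the test. This assumption is too strong, because it ignores the random error between the two administrations. Based on bootstrap sampling, two types of indices for attribute reliability and pattern reliability are proposed: the product-moment correlation method and the modified consistency method. A simulation study compared the new and existing methods under different numbers of attributes, correlations among attributes, and numbers of items, and empirical data from the English proficiency certification examination (ECPE) and a fraction subtraction test were used to verify the feasibility of the new methods. Finally, factors affecting reliability estimation are discussed.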

5.
In linear regression, the most appropriate standardized effect size for individual independent variables having an arbitrary metric remains open to debate, despite researchers typically reporting a standardized regression coefficient. Alternative standardized measures include the semipartial correlation, the improvement in the squared multiple correlation, and the squared partial correlation. No arguments based on either theoretical or statistical grounds for preferring one of these standardized measures have been mounted in the literature. Using a Monte Carlo simulation, the performance of interval estimators for these effect-size measures was compared in a 5-way factorial design. Formal statistical design methods assessed both the accuracy and robustness of the four interval estimators. The coverage probability of a large-sample confidence interval for the semipartial correlation coefficient derived from Aloe and Becker was highly accurate and robust in 98% of instances. It was better in small samples than the Yuan-Chan large-sample confidence interval for a standardized regression coefficient. It was also consistently better than both a bootstrap confidence interval for the improvement in the squared multiple correlation and a noncentral interval for the squared partial correlation.

6.
This project examined the performance of classical and Bayesian estimators of four effect size measures for the indirect effect in a single-mediator model and a two-mediator model. Compared to the proportion and ratio mediation effect sizes, standardized mediation effect-size measures were relatively unbiased and efficient in the single-mediator model and the two-mediator model. Percentile and bias-corrected bootstrap interval estimates of ab/s_Y and ab(s_X)/s_Y in the single-mediator model outperformed interval estimates of the proportion and ratio effect sizes in terms of power, Type I error rate, coverage, imbalance, and interval width. For the two-mediator model, standardized effect-size measures were superior to the proportion and ratio effect-size measures. Furthermore, it was found that Bayesian point and interval summaries of posterior distributions of standardized effect-size measures reduced excessive relative bias for certain parameter combinations. The standardized effect-size measures are the best effect-size measures for quantifying mediated effects.

7.
8.
The goal of this study was to determine whether cluster analysis could be used to identify distinct subgroups of text message users based on behavioral economic indices of demand for text messaging. Cluster analysis is an analytic technique that attempts to categorize cases based on similarities across selected variables. Participants completed a questionnaire about mobile phone usage and a hypothetical texting demand task in which they indicated their likelihood of paying an extra charge to continue to send text messages. A hierarchical cluster analysis was conducted on behavioral economic indices, such as demand intensity, demand elasticity, breakpoint, and the maximum expenditure. With the cluster analysis, we identified 3 subgroups of text message users. The groups were characterized by (a) high intensity and low elasticity, (b) high intensity and medium elasticity, and (c) low intensity and high elasticity. In a demonstration of convergent validity, there were statistically significant and conceptually meaningful differences across the subgroups in various measures of mobile phone use and text messaging. Cluster analysis is a useful tool for identifying and profiling distinct, practically meaningful groups based on behavioral indices and could provide a framework for targeting interventions more efficiently.

9.
10.
Evidence for a bilingual advantage in executive control has led to the suggestion that being bilingual might protect against late-life cognitive decline. We assessed the performance of socially homogeneous groups of older (≥60 years) bilingual Welsh/English (n = 50) and monolingual English (n = 49) speakers on a range of executive control tasks yielding 17 indices for comparison. Effect sizes (>.2) favoured monolinguals on 10 indices, with negligible differences observed on the remaining 7 indices. Univariate analyses indicated that monolinguals performed significantly better on 2 of 17 indices. Multivariate analysis indicated no significant overall differences between the two groups in performance on executive tasks. Older Welsh bilinguals do not show a bilingual advantage in executive control, and where differences are observed, these tend to favour monolinguals. A possible explanation may lie in the nature of the socio-linguistic context and its influence on cognitive processing in the bilingual group.

11.
David Sohn. Sex Roles, 1982, 8(4): 345-357
An effect-size analysis of the findings on sex differences in the use of achievement self-attributions was performed to determine if there were relationships between these two variables that accounted for more than 5% of the variance. The effect-size index used was an estimate of omega squared (ω²). Two kinds of sex difference effects were examined: (a) the main effect for sex and (b) the simple sex difference effects for success and failure, respectively. With the exception of luck attributions for success, whose ω² was .01, all ω²s were less than .01. It was concluded that the studies surveyed provided no evidence of the existence of consequential relationships between sex, achievement, and self-attributions.

12.
13.
We investigated age-related differences in finger coordination during rotational hand actions. Two hypotheses based on earlier studies were tested: higher safety margins and lower synergy indices were expected in the elderly. Young and elderly subjects held a handle instrumented with five six-component force sensors and performed discrete accurate pronation and supination movements. The weight of the system was counterbalanced with another load. Indices of synergies stabilizing salient performance variables, such as total normal force, total tangential force, moments produced by these forces, and total moment of force were computed at two levels of a hypothetical control hierarchy, at the virtual finger-thumb level and at the individual finger level. At each level, synergy indices reflected the normalized difference between the sum of the variances of elemental variables and variance of their combined output, both computed at comparable phases over repetitive trials. The elderly group performed the task slower and showed lower safety margins for the thumb during the rotation phase. Overall, the synergy indices were not lower in the elderly group. In several cases, these indices were significantly higher in the elderly than in the younger participants. Hence, both main hypotheses have been falsified. We interpret the unexpectedly low safety margins in the elderly as resulting from several factors such as increased force variability, impaired feed-forward control, and the fact that there was no danger of dropping the object. Our results suggest that in some natural tasks, such as the one used in this study, healthy elderly persons show no impairment, as compared to younger persons, in their ability to organize digits into synergies stabilizing salient performance variables.

14.
Students in four sections of an undergraduate educational course (two large and two small sections) took out-of-class practice exams prior to actual exams for each of five course units. Each course unit consisted of five class sessions focusing on a specific developmental theme. Some sections received practice-exam credit based on the number of items completed, whereas other sections received practice-exam credit based on the number of items answered accurately. The contingencies were applied only to the practice exams. A two-way MANOVA included two independent variables (practice-exam contingency and group size) and two dependent variables (practice-exam performance and unit-exam performance). The analysis revealed a main effect for both independent variables across both dependent variables, with students performing better under the accuracy than the completion contingency and better in the small than the large groups. One exception to this overall pattern was a non-significant difference between the large and small groups on the practice exams across both contingencies.

15.
During the past 25 years, Davis's social decision scheme (SDS) model, designed to clarify how individual-level characteristics combine to create group-level products, has had a major impact on small group research. Using formal, mathematical models, Davis and his colleagues first construct predictions, or theoretical baselines, about group products based on assumptions about members' characteristics and interactions and then compare these predictions to the performances of real groups. The SDS approach has been valuable in clarifying how group performance is affected by such variables as task characteristics, group size, individual differences (e.g., member status), procedural factors (e.g., straw polling, agendas), and temporal changes in social parameters.

16.
How meta-analysis increases statistical power (total citations: 1; self-citations: 0; citations by others: 1)
One of the most frequently cited reasons for conducting a meta-analysis is the increase in statistical power that it affords a reviewer. This article demonstrates that fixed-effects meta-analysis increases statistical power by reducing the standard error of the weighted average effect size (T.) and, in so doing, shrinks the confidence interval around T. Small confidence intervals make it more likely for reviewers to detect nonzero population effects, thereby increasing statistical power. Smaller confidence intervals also represent increased precision of the estimated population effect size. Computational examples are provided for 3 effect-size indices: d (standardized mean difference), Pearson's r, and odds ratios. Random-effects meta-analyses also may show increased statistical power and a smaller standard error of the weighted average effect size. However, the authors demonstrate that increasing the number of studies in a random-effects meta-analysis does not always increase statistical power.
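The mechanism described here, a standard error that shrinks as studies are pooled, can be sketched with the usual inverse-variance fixed-effects estimator. The function name, the example effect sizes, and the 1.96 critical value for a 95% interval are assumptions for illustration; they are not the article's computational examples.

```python
import math

def fixed_effect_summary(effects, variances, z=1.96):
    """Inverse-variance weighted mean effect, its standard error, and a CI.

    Assumption: each study's sampling variance is known, and all studies
    estimate one common population effect (the fixed-effects model).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)  # shrinks as more weight is added
    return mean, se, (mean - z * se, mean + z * se)

if __name__ == "__main__":
    # Hypothetical studies: each has sampling variance 0.04 (SE = 0.2).
    two = fixed_effect_summary([0.2, 0.4], [0.04, 0.04])
    four = fixed_effect_summary([0.2, 0.4, 0.3, 0.3], [0.04] * 4)
    print(two)
    print(four)
```

Because every added study contributes positive weight, the pooled standard error in this fixed-effects sketch can only decrease. Under a random-effects model the between-study variance also enters each weight, which is consistent with the abstract's caution that power does not always increase there.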

17.
A method is proposed for constructing indices as linear functions of variables such that the reliability of the compound score is maximized. Reliability is defined in the framework of latent variable modeling [i.e., item response theory (IRT)] and optimal weights of the components of the index are found by maximizing the posterior variance relative to the total latent variable variance. Three methods for estimating the weights are proposed. The first is a likelihood-based approach, that is, marginal maximum likelihood (MML). The other two are Bayesian approaches based on Markov chain Monte Carlo (MCMC) computational methods. One is based on an augmented Gibbs sampler specifically targeted at IRT, and the other is based on a general purpose Gibbs sampler such as implemented in OpenBUGS and JAGS. Simulation studies are presented to demonstrate the procedure and to compare the three methods. Results are very similar, so practitioners may prefer the easily accessible latter method. A real-data set pertaining to the 28-joint Disease Activity Score is used to show how the methods can be applied in a complex measurement situation with multiple time points and mixed data formats.

18.
Adapting Edgington's [J. Psychol. 90 (1975) 57] randomly determined intervention start-point model, Levin and Wampold [Sch. Psychol. Quart. 14 (1999) 59] proposed a set of nonparametric randomization tests for analyzing the data from single-case designs. In the present study, the performance of Levin and Wampold's four basic tests (independent start-point general and comparative effectiveness, simultaneous start-point general and comparative effectiveness) was examined with respect to their Type I error rates and statistical power. Of Levin and Wampold's four tests, all except the independent start-point comparative effectiveness test maintained their empirical Type I error rates and had acceptable power at larger sample-size and effect-size combinations. The one-tailed comparative intervention effectiveness test for the independent start-point model was found to be too liberal, in that it did not maintain its Type I error rate. Although a two-tailed application of that test was found to be conservative at longer series lengths, it had acceptable power at larger sample-size and effect-size combinations. The results support the utility of a versatile new class of single-case designs that permit both within- and between-unit statistical assessments of intervention effectiveness.

19.
The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.

20.
Meta-analytic structural equation modeling (MASEM) is increasingly applied to advance theories by synthesizing existing findings. MASEM essentially consists of two stages. In Stage 1, a pooled correlation matrix is estimated based on the reported correlation coefficients in the individual studies. In Stage 2, a structural model (such as a path model) is fitted to explain the pooled correlations. Frequently, the individual studies do not provide all the correlation coefficients between the research variables. In this study, we modify the currently optimal MASEM method to deal with missing correlation coefficients, and compare its performance with existing methods. This study is the first to evaluate the performance of fixed-effects MASEM methods under different levels of missing correlation coefficients. We found that the often-used univariate methods performed very poorly, while the multivariate methods performed well overall.
