Similar articles
 20 similar articles found (search time: 31 ms)
1.
The use of effect sizes and associated confidence intervals in all empirical research has been strongly emphasized by journal publication guidelines. To help advance theory and practice in the social sciences, this article describes an improved procedure for constructing confidence intervals of the standardized mean difference effect size between two independent normal populations with unknown and possibly unequal variances. The presented approach has advantages over the existing formula in both theoretical justification and computational simplicity. In addition, simulation results show that the suggested one- and two-sided confidence intervals are more accurate in achieving the nominal coverage probability. The proposed estimation method provides a feasible alternative to the most commonly used measure of Cohen’s d and the corresponding interval procedure when the assumption of homogeneous variances is not tenable. To further improve the potential applicability of the suggested methodology, the sample size procedures for precise interval estimation of the standardized mean difference are also delineated. The desired precision of a confidence interval is assessed with respect to the control of expected width and to the assurance probability of interval width within a designated value. Supplementary computer programs are developed to aid in the usefulness and implementation of the introduced techniques.

2.
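Estimating the variability of variance components based on generalizability theory   Total citations: 2 (self-citations: 0, citations by others: 2)
黎光明 (Li Guangming), 张敏强 (Zhang Minqiang) 《心理学报》 (Acta Psychologica Sinica) 2009,41(9):889-901
Generalizability theory is widely used in psychological and educational measurement, and variance component estimation is the key step in any generalizability analysis. Because variance component estimates are subject to sampling error, their variability needs to be examined. Using Monte Carlo data simulation under the normal distribution, this study examines how different methods affect the estimation of the variability of variance components in generalizability theory. The results show that the Jackknife method is not advisable for estimating the variability of variance components. When the "divide-and-conquer" strategy of the Bootstrap method is not adopted, the Traditional method and the MCMC method with prior information show, overall, clear advantages in estimating the two variability measures considered, namely standard errors and confidence intervals.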

3.
In a survey of journal articles, test manuals, and test critique books, the author found that a mean sample size (N) of 260 participants had been used for reliability studies on 742 tests. The distribution was skewed because the median sample size for the total sample was only 90. The median sample sizes for the internal consistency, retest, and interjudge reliabilities were 182, 64, and 36, respectively. The author presented sample size statistics for the various internal consistency methods and types of tests. In general, the author found that the sample sizes that were used in the internal consistency studies were too small to produce sufficiently precise reliability coefficients, which in turn could cause imprecise estimates of examinee true-score confidence intervals. The results also suggest that larger sample sizes have been used in the last decade compared with those that were used in earlier decades.

4.
The credible intervals that people set around their point estimates are typically too narrow (cf. Lichtenstein, Fischhoff, & Phillips, 1982). That is, a set of many such intervals does not contain the actual values of the criterion variables as often as it should given the probability assigned to this event for each estimate. The typical interpretation of such data is that people are overconfident about the accuracy of their judgments. This paper presents data from two studies showing the typical levels of overconfidence for individual estimates of unknown quantities. However, data from the same subjects on a different measure of confidence for the same items, their own global assessment for the set of multiple estimates as a whole, showed significantly lower levels of confidence and overconfidence than their average individual assessment for items in the set. It is argued that the event and global assessments of judgment quality are fundamentally different and are affected by unique psychological processes. Finally, we discuss the implications of a difference between confidence in single and multiple estimates for confidence research and theory.

5.
On the Johnson-Neyman technique and some extensions thereof   Total citations: 1 (self-citations: 0, citations by others: 1)
The Johnson-Neyman technique is a statistical tool used most frequently in educational and psychological applications. This paper starts by briefly reviewing the Johnson-Neyman technique and suggesting when it should and should not be used; then several different modifications and extensions of the Johnson-Neyman technique, all of them conceptually simple, are proposed. The close relation between confidence intervals and regions of significance of the Johnson-Neyman type is pointed out. The problem of what to do when more than two groups are being compared is considered. The situation of more than one criterion variable is also considered. This research was supported in part by Educational Testing Service, and in part by the Mathematics Division of the Air Force Office of Scientific Research.

6.
The point-biserial correlation is a commonly used measure of effect size in two-group designs. New estimators of point-biserial correlation are derived from different forms of a standardized mean difference. Point-biserial correlations are defined for designs with either fixed or random group sample sizes and can accommodate unequal variances. Confidence intervals and standard errors for the point-biserial correlation estimators are derived from the sampling distributions for pooled-variance and separate-variance versions of a standardized mean difference. The proposed point-biserial confidence intervals can be used to conduct directional two-sided tests, equivalence tests, directional non-equivalence tests, and non-inferiority tests. A confidence interval for an average point-biserial correlation in meta-analysis applications performs substantially better than the currently used methods. Sample size formulas for estimating a point-biserial correlation with desired precision and testing a point-biserial correlation with desired power are proposed. R functions are provided that can be used to compute the proposed confidence intervals and sample size formulas.
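As a rough illustration of how a point-biserial correlation relates to a standardized mean difference, the sketch below applies the familiar textbook conversion r = d / sqrt(d^2 + 1/(pq)), where p and q are the two group proportions. This is not one of the new estimators the abstract proposes; the function name and the `p` parameter are assumptions made for the example.

```python
import math


def d_to_point_biserial(d, p=0.5):
    """Convert a standardized mean difference d into a point-biserial r.

    Textbook conversion only (hypothetical helper, not the article's
    estimator). p is the proportion of cases in one of the two groups.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    # With equal group sizes, 1 / (p * q) equals 4.
    return d / math.sqrt(d * d + 1.0 / (p * q))


if __name__ == "__main__":
    print(d_to_point_biserial(0.8))          # equal group sizes
    print(d_to_point_biserial(0.8, p=0.2))   # unbalanced groups
```

The conversion shows why the same d yields a smaller r when the groups are unbalanced, which is one reason the group-size design matters for this effect size.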

7.
In an attempt to elucidate the nature of the subject’s strategy in a two-interval forced-choice auditory detection task, event-related potentials were studied at two intensities which yielded mean accuracies of 82% and 98%. Subjects reported the observation interval in which they judged the signal to be present and the confidence of the judgment. Principal components varimax analyses yielded four components: a CZ maximal P300, a Slow Wave, a slow negative shift, and a late negative component. The P300 amplitude findings suggest that different strategies are utilized for high-confidence and low-confidence detections. At high confidence, P300 amplitude is large for the observation interval in which the signal is presented, indicating a strategy involving serial independent detection. However, the P300 latency findings at high confidence suggest that absence of the signal in the first observation interval is nonetheless noted: P300 latency in response to signal presence is shorter for the second observation interval than for the first observation interval. At low confidence, P300 is small or absent for both observation intervals, indicating a deferred decision, presumably arrived at through comparison of the two percepts.

8.
E. Maris 《Psychometrika》1998,63(1):65-71
In the context of conditional maximum likelihood (CML) estimation, confidence intervals can be interpreted in three different ways, depending on the sampling distribution under which these confidence intervals contain the true parameter value with a certain probability. These sampling distributions are (a) the distribution of the data given the incidental parameters, (b) the marginal distribution of the data (i.e., with the incidental parameters integrated out), and (c) the conditional distribution of the data given the sufficient statistics for the incidental parameters. Results on the asymptotic distribution of CML estimates under sampling scheme (c) can be used to construct asymptotic confidence intervals using only the CML estimates. This is not possible for the results on the asymptotic distribution under sampling schemes (a) and (b). However, it is shown that the conditional asymptotic confidence intervals are also valid under the other two sampling schemes. I am indebted to Theo Eggen, Norman Verhelst and one of Psychometrika's reviewers for their helpful comments.

9.
The aim of this article is to provide empirical psychometric evidence of the (longitudinal) predictive validity of a learning potential measure, the Learning Potential Computerised Adaptive Test (LPCAT), in comparison with standard static tests, with school aggregate results as the criterion measure. Participants were 79 boys (mean age 12.44, SD = 0.44) and 72 girls (mean age 11.18, SD = 0.42) attending two private schools. Correlation and regression analyses were used to evaluate the predictive validity of the learning potential and standard test scores for school aggregate academic results as the criterion measure. Results indicate that learning potential scores were statistically significant predictors of aggregate academic results and provided results that were comparable to those of the standard test results, providing empirical support for the use of learning potential tests in mainstream educational settings.

10.
Tasks used to examine short-term memory (STM) in animals have almost exclusively required retention of visual cues. To determine if haptic information can be retained, three rhesus monkeys were trained to perform, using only haptic cues, a simultaneous (SMS) and a delayed matching-to-sample (DMS) task. On each trial, the monkeys felt and responded to a sample stimulus on a centrally located manipulandum. They were then presented two comparison stimuli located on both sides of the central manipulandum. A response matching the comparison stimulus with the sample stimulus was reinforced. In SMS a mean of 2,725 trials was required to reach a criterion of 90% correct. As in DMS performance for visual cues, in haptic DMS the monkeys were capable of above-chance responding at retention intervals of greater than 1 min. This haptic DMS task should be useful for testing STM models, for examining the physiological basis of STM, and for examining drug effects.

11.
Three experiments were conducted to measure the sensitivity of two Ss to the odor of butanol. In the first two experiments, the method of double random, yes-no staircases was used. A practice effect of over a log10 unit in millimoles decrease in apparent threshold was observed in both Ss. Consistent shifts in the response criterion were induced when Ss were paid to meet an arbitrarily determined physical criterion. In Experiment 3, the confidence rating procedure was used. Results at eight different signal intensities are of the form predicted by signal detection theory. d’ is shown to be related to signal strength by a power function with a slope of about .30, which suggests that the olfactory transducer compresses sensory input produced by weak concentrations of butanol.

12.
The standard Pearson correlation coefficient, r, is a biased estimator of the population correlation coefficient, ρ(XY) , when predictor X and criterion Y are indirectly range-restricted by a third variable Z (or S). Two correction algorithms, Thorndike's (1949) Case III, and Schmidt, Oh, and Le's (2006) Case IV, have been proposed to correct for the bias. However, to our knowledge, the two algorithms did not provide a procedure to estimate the associated standard error and confidence intervals. This paper suggests using the bootstrap procedure as an alternative. Two Monte Carlo simulations were conducted to systematically evaluate the empirical performance of the proposed bootstrap procedure. The results indicated that the bootstrap standard error and confidence intervals were generally accurate across simulation conditions (e.g., selection ratio, sample size). The proposed bootstrap procedure can provide a useful alternative for the estimation of the standard error and confidence intervals for the correlation corrected for indirect range restriction.
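The bootstrap idea in this abstract can be sketched in a few lines. The example below is only a minimal percentile bootstrap for a paired-sample statistic, with the uncorrected Pearson r as the default; it does not implement the Case III or Case IV corrections, which are assumed to be supplied by the reader through the hypothetical `stat` argument. All function and parameter names are illustrative.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Plain (uncorrected) Pearson correlation of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance")
    return sxy / math.sqrt(sxx * syy)


def bootstrap_se_ci(x, y, stat=pearson_r, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap standard error and CI for stat(x, y).

    Cases (x_i, y_i) are resampled jointly with replacement. A corrected
    correlation could be passed as `stat` (assumption: same call signature).
    """
    rng = random.Random(seed)
    n = len(x)
    reps = []
    while len(reps) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            reps.append(stat([x[i] for i in idx], [y[i] for i in idx]))
        except ValueError:
            continue  # skip degenerate resamples with zero variance
    reps.sort()
    lower = reps[int((alpha / 2) * n_boot)]
    upper = reps[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.stdev(reps), (lower, upper)


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.gauss(0, 1) for _ in range(50)]
    y = [a + rng.gauss(0, 1) for a in x]
    se, (lo, hi) = bootstrap_se_ci(x, y)
    print(pearson_r(x, y), se, lo, hi)
```

Resampling whole cases rather than x and y separately preserves the dependence between the variables, which is what the interval is meant to describe.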

13.
A psychophysical approach was used to obtain judgments of visual extent under three conditions. In two conditions a comparison stimulus at each of two distances was matched in size to a standard which varied in distance. Stimuli were presented on a well-lighted table and were judged by two observers under Objective instructions. Both the standard and comparison were located in either a frontal or longitudinal plane. In a third condition relative distance estimates were given of two stimuli which varied in their relative positions along the table. The mean results for all conditions were described as a power function of physical stimulus measures. The exponent was greater than 1.0 for frontal size and usually less than 1.0 for flat size and distance. The position of the comparison affected the magnitude of the exponents to a lesser degree. These findings have relevance for interpretations of size and distance judgments.

14.
This article presents a generalization of the Score method of constructing confidence intervals for the population proportion (E. B. Wilson, 1927) to the case of the population mean of a rating scale item. A simulation study was conducted to assess the properties of the Score confidence interval in relation to the traditional Wald (A. Wald, 1943) confidence interval under a variety of conditions, including sample size, number of response options, extremeness of the population mean, and kurtosis of the response distribution. The results of the simulation study indicated that the Score interval usually outperformed the Wald interval, suggesting that the Score interval is a viable method of constructing confidence intervals for the population mean of a rating scale item.
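For orientation, the sketch below contrasts the two intervals in the setting where they are usually introduced, a single population proportion. It shows the classic Wilson score interval and the Wald interval only; the generalization to the mean of a rating scale item described in the abstract is not reproduced here. Function names and defaults are assumptions made for the example.

```python
import math
from statistics import NormalDist


def _z(confidence):
    """Two-sided standard normal critical value."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = _z(confidence)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def wald_interval(successes, n, confidence=0.95):
    """Wald interval: estimate plus or minus z times its estimated SE."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = _z(confidence)
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


if __name__ == "__main__":
    for k in (0, 1, 5):
        print(k, wilson_interval(k, 10), wald_interval(k, 10))
```

With zero successes the Wald interval collapses to a single point, whereas the score interval still has positive width; this kind of behaviour at the extremes is where the two methods differ most.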

15.
Wallsten and Gonzalez-Vallejo (1994) developed the Stochastic Judgment Model to account for true-false judgment and response processes in a single well-defined knowledge domain. This paper generalizes the model to a four-category rating task that encompasses two knowledge domains simultaneously. It then applies the model to an experiment in which Ph.D. students in history and English literature rated confidence in the truth of statements in both domains, and also decided which statement within a pair consisting of one from each domain was more likely true. Constrained versions of the general model fit the rating data very well and accurately predicted the pair-comparison (PC) choices. The results suggest that (a) the mean distance between the true and false statement distributions of confidence was greater in the better known domain; (b) judged confidence variability is greater in the domain of greater knowledge; while simultaneously (c) criterion variability is constant across domains; (d) the extreme response criteria are located symmetrically around the central one; which (e) is located to yield the usual bias to call statements true. Finally, cross-domain PC choices were very well predicted by assuming that respondents judged only the statement in the single domain they knew better and not well predicted by the more common assumption that they compare their levels of confidence in the two statements. Implications for the underlying cognitive processes are discussed including the effects of expertise.

16.
This paper documents a very pervasive underconfidence bias in the area of sensory discrimination. In order to account for this phenomenon, a subjective distance theory of confidence in sensory discrimination is proposed. This theory, based on the law of comparative judgment and the assumption of confidence as an increasing function of the perceived distance between stimuli, predicts underconfidence, that is, that people should perform better than they express in their confidence assessments. Due to the fixed sensitivity of the sensory system, this underconfidence bias is practically impossible to avoid. The results of Experiment 1 confirmed the prediction of underconfidence with the help of present-day calibration methods and indicated a good quantitative fit of the theory. The results of Experiment 2 showed that prolonged experience of outcome feedback (160 trials) had no effect on underconfidence. It is concluded that the subjective distance theory provides a better explanation of the underconfidence phenomenon than do previous accounts in terms of subconscious processes.

17.
A number of researchers (e.g. Kerr, 1978; Walsh, Russell, Imanaka, & James, 1979) have previously demonstrated interference between location and distance information in motor short-term memory. This interference manifests itself in a characteristic pattern of undershooting and overshooting, with reproduction movement location being drawn in the direction of criterion movement distance and, conversely, the distance of reproduction movements being influenced by the terminal location of the criterion movement. We investigated the effects of different cognitive strategies upon the appearance of this location-distance interference during the reproduction of movement location (Experiment 1) and distance (Experiments 2 and 3) in a linear arm positioning task. Experiment 1 compared performance in location reproduction between two strategy groups differing in the availability of explicit information about the change in starting position. The characteristic undershooting-overshooting interference pattern was observed for the group without the explicit information about the change in starting position but disappeared for the group in which explicit information about the change in starting position was provided. Experiment 2 examined the systematic undershooting-overshooting pattern in distance reproduction for a location strategy (involving some extrapolation of the start and end locations), a counting strategy, and a distance sense strategy (involving the use of visual imagery). The systematic response bias pattern disappeared when the subjects used a location strategy but was clearly observed for the subjects using the other two strategies. This finding was generally confirmed by Experiment 3, which showed a typical undershooting-overshooting pattern in distance reproduction for a counting/distance sense strategy but not for two location strategies (a general location and an explicit location strategy). 
The location strategies differed in the availability of explicit information about starting and end locations for both the criterion and reproduction movements. The results from these three experiments indicate that explicit information about the start and/or end locations prevents the usual interference between location and distance information from arising in movement reproduction. The notions of automatic and controlled processing and cerebral hemispheric specialization are discussed as potential explanations of these results and of the interference typically observed in motor short-term memory between distance and location information.

18.
Differential thresholds for tempi (with interonset intervals ranging from 100 to 1,500 msec) were measured using an adaptive 2IFC paradigm for several types of auditory sequences. In Experiment 1, the number of intervals in an isochronous sequence was varied to compare the sensitivity for single intervals with that for sequences of two to six intervals. Mean relative just noticeable differences (JNDs) decreased as the number of intervals increased (single intervals = 6%, two intervals = 4%, four intervals = 3.2%, six intervals = 3%) and were optimal at intermediate tempi for both sequences and single intervals (as low as 1.5% in the range between 300 and 800 msec). In Experiment 2, the sensitivity for different types of irregular sequences was studied. Globally, JNDs for irregular sequences were of an intermediate level between that observed for single intervals and that observed for regular sequences. However, the closer a sequence was to regularity, the lower its relative JND. Experiment 3 demonstrated that musicians were more sensitive than nonmusicians to changes in tempo, and this was true for single intervals and for regular and irregular sequences, demonstrating the role of training on these abilities. The results are discussed in terms of possible underlying mechanisms, in particular those providing a mental representation of the mean and dispersion of successive interval durations.

19.
The numerical distance effect (NDE) is one of the most robust effects in the study of numerical cognition. However, the validity and reliability of distance effects across different formats and paradigms has not been assessed. Establishing whether the distance effect is both reliable and valid has important implications for the use of this paradigm to index the processing and representation of numerical magnitude in both behavioral and neuroimaging studies. In light of this, we examine the reliability and validity of frequently employed variants (and one new variant) of the numerical comparison task: two symbolic comparison variants and two nonsymbolic comparison variants. The results of two experiments demonstrate that measures of the NDE that use nonsymbolic stimuli are far more reliable than measures of the NDE that use symbolic stimuli. With respect to correlations between measures, we find evidence that the NDE that arises using symbolic stimuli is uncorrelated with the NDE that is elicited by using nonsymbolic stimuli. Results are discussed with respect to their implications for the use of the NDE as a metric of numerical processing and representation in research with both children and adults.

20.
We argue that to best comprehend many data sets, plotting judiciously selected sample statistics with associated confidence intervals can usefully supplement, or even replace, standard hypothesis-testing procedures. We note that most social science statistics textbooks limit discussion of confidence intervals to their use in between-subject designs. Our central purpose in this article is to describe how to compute an analogous confidence interval that can be used in within-subject designs. This confidence interval rests on the reasoning that because between-subject variance typically plays no role in statistical analyses of within-subject designs, it can legitimately be ignored; hence, an appropriate confidence interval can be based on the standard within-subject error term, that is, on the variability due to the subject × condition interaction. Computation of such a confidence interval is simple and is embodied in Equation 2 on p. 482 of this article. This confidence interval has two useful properties. First, it is based on the same error term as is the corresponding analysis of variance, and hence leads to comparable conclusions. Second, it is related by a known factor (√2) to a confidence interval of the difference between sample means; accordingly, it can be used to infer the faith one can put in some pattern of sample means as a reflection of the underlying pattern of population means. These two properties correspond to analogous properties of the more widely used between-subject confidence interval.
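The reasoning in this abstract can be made concrete with a small sketch. Assuming a complete one-way within-subject layout (rows are subjects, columns are conditions), the half-width below is computed as t_crit × sqrt(MS_S×C / n), where MS_S×C is the subject × condition interaction mean square and n is the number of subjects. The abstract's Equation 2 is not reproduced in this listing, so this form is an assumption about what it contains; the function name is hypothetical, and the critical t value is passed in by the caller rather than computed.

```python
import math


def within_subject_ci(data, t_crit):
    """Condition means and a CI half-width from the subject x condition error.

    data:   list of rows, one per subject, each with one score per condition.
    t_crit: two-sided critical t for (n - 1) * (k - 1) degrees of freedom,
            supplied by the caller (e.g. from a t table).
    """
    n, k = len(data), len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a complete n x k table with n, k >= 2")
    grand = sum(sum(row) for row in data) / (n * k)
    subj_means = [sum(row) / k for row in data]
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    # Interaction residual: score minus subject and condition effects.
    ss_inter = sum(
        (data[i][j] - subj_means[i] - cond_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_inter = ss_inter / ((n - 1) * (k - 1))
    return cond_means, t_crit * math.sqrt(ms_inter / n)


if __name__ == "__main__":
    scores = [[10, 13, 13], [6, 8, 8], [11, 14, 14], [22, 23, 25]]
    # 2.447 is the tabled two-sided .05 critical t for 6 degrees of freedom.
    means, half = within_subject_ci(scores, t_crit=2.447)
    print(means, half)
```

Because stable differences between subjects are removed before the error term is formed, subjects who differ greatly in overall level but show the same pattern across conditions produce a narrow interval.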


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号