Similar Articles
20 similar articles retrieved (search time: 31 ms)
1.
Signal detection theory offers several indexes of sensitivity (d′, Az, and A′) that are appropriate for two-choice discrimination when data consist of one hit rate and one false alarm rate per condition. These measures require simplifying assumptions about how target and lure evidence is distributed. We examine three statistical properties of these indexes: accuracy (good agreement between the parameter and the sampling distribution mean), precision (small variance of the sampling distribution), and robustness (small influence of violated assumptions on accuracy). We draw several conclusions from the results. First, a variety of parameters (sample size, degree of discriminability, and magnitude of hits and false alarms) influence statistical bias in these indexes. Comparing conditions that differ in these parameters entails discrepancies that can be reduced by increasing N. Second, unequal variance of the evidence distributions produces significant bias that cannot be reduced by increasing N—a serious drawback to the use of these sensitivity indexes when variance is unknown. Finally, their relative statistical performances suggest that Az is preferable to A′.
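For concreteness, the two sensitivity indexes named above can be computed from a single hit rate and false alarm rate. A minimal Python sketch, using the equal-variance z-transform for d′ and the standard nonparametric formula for A′ (the example rates are illustrative, not taken from the paper):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    # d' = z(H) - z(F), assuming equal-variance normal evidence distributions
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def a_prime(hit_rate, fa_rate):
    # Nonparametric sensitivity index A', valid for above-chance data (H >= F)
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5  # chance performance (also avoids division by zero at h = f = 1)
    if h < f:
        raise ValueError("this A' formula assumes H >= F")
    return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))

print(d_prime(0.8, 0.2))  # ~1.683
print(a_prime(0.8, 0.2))  # 0.875
```

With H = .8 and F = .2, both indexes summarize the same pair of rates under different distributional compromises, which is exactly what the robustness analyses above probe.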

2.
For a discrimination experiment, a plot of the hit rate against the false-alarm rate--the ROC curve--summarizes performance across a range of confidence levels. In many content areas, ROCs are well described by a normal-distribution model and the z-transformed hit and false-alarm rates are approximately linearly related. We examined the sampling distributions of three parameters of this model when applied to a ratings procedure: the area under the ROC (Az), the normalized difference between the means of the underlying signal and noise distributions (da), and the slope of the ROC on z-coordinates (s). Statistical bias (the degree to which the mean of the sampling distribution differed from the true value) was trivial for Az, small but noticeable for da, and substantial for s. Variability of the sampling distributions decreased with the number of trials and was also affected by the number of response categories available to the participant and by the overall sensitivity level. Figures in the article and tables available on line can be used to construct confidence intervals around ROC statistics and to test statistical hypotheses.
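Under the normal model, the three parameters are related in closed form once the z-ROC line z(H) = a + b·z(F) has been fit: the slope is s = b, da = a·√(2/(1+b²)), and Az = Φ(da/√2) = Φ(a/√(1+b²)). A sketch (the fitted intercept and slope are assumed inputs; they would come from a regression on the z-transformed rating data):

```python
import math
from statistics import NormalDist

def zroc_stats(intercept, slope):
    """da and Az from a fitted z-ROC line z(H) = intercept + slope * z(F).

    da = a * sqrt(2 / (1 + b^2));  Az = Phi(da / sqrt(2)) = Phi(a / sqrt(1 + b^2)).
    """
    da = intercept * math.sqrt(2.0 / (1.0 + slope ** 2))
    az = NormalDist().cdf(da / math.sqrt(2.0))
    return da, az

da, az = zroc_stats(intercept=1.5, slope=0.8)
```

A slope below 1 (as here) corresponds to a signal distribution with larger variance than the noise distribution, the typical finding in ratings data.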

3.
Killeen and Fetterman's (1988) behavioral theory of animal timing predicts that decreases in the rate of reinforcement should produce decreases in the sensitivity (A') of temporal discriminations and a decrease in miss and correct rejection rates (decrease in bias toward "long" responses). Eight rats were trained on a 10- versus 0.1-s temporal discrimination with an intertrial interval of 5 s and were subsequently tested on probe days on the same discrimination with intertrial intervals of 1, 2.5, 5, 10, or 20 s. The rate of reinforcement declined for all animals as intertrial interval increased. Although sensitivity (A') decreased with increasing intertrial interval, all rats showed an increase in bias to make long responses.

4.
In theory, the greatest lower bound (g.l.b.) to reliability is the best possible lower bound to the reliability based on single test administration. Yet the practical use of the g.l.b. has been severely hindered by sampling bias problems. It is well known that the g.l.b. based on small samples (even a sample of one thousand subjects is not generally enough) may severely overestimate the population value, and a statistical treatment of the bias has so far been missing. The only results obtained to date concern the asymptotic variance of the g.l.b. and of its numerator (the maximum possible error variance of a test), based on first order derivatives and the assumption of multivariate normality. The present paper extends these results by offering explicit expressions for the second order derivatives. This yields a closed form expression for the asymptotic bias of both the g.l.b. and its numerator, under the assumptions that the rank of the reduced covariance matrix is at or above the Ledermann bound, and that the nonnegativity constraints on the diagonal elements of the matrix of unique variances are inactive. It is also shown that, when the reduced rank is at its highest possible value (i.e., the number of variables minus one), the numerator of the g.l.b. is asymptotically unbiased, and the asymptotic bias of the g.l.b. is negative. The latter results are contrary to common belief, but apply only to cases where the number of variables is small. The asymptotic results are illustrated by numerical examples. This research was supported by grant DMI-9713878 from the National Science Foundation.

5.
An experiment assessed the effect of subliminally embedded visual material on an auditory detection task. 22 women and 19 men were presented tachistoscopically with words designated as "emotional" or "neutral" on the basis of prior GSRs and a Word Rating List under four conditions: (a) Unembedded Neutral, (b) Embedded Neutral, (c) Unembedded Emotional, and (d) Embedded Emotional. On each trial subjects made forced choices concerning the presence or absence of an auditory tone (1000 Hz) at threshold level; hit and false alarm rates were used to compute non-parametric indices for sensitivity (A') and response bias (B"). While overall analyses of variance yielded no significant differences, further examination of the data suggests the presence of subliminally "receptive" and "non-receptive" subpopulations.

6.
This paper is a presentation of the statistical sampling theory of stepped-up reliability coefficients when a test has been divided into any number of equivalent parts. Maximum-likelihood estimators of the reliability are obtained and shown to be biased. Their sampling distributions are derived and form the basis of the definition of new unbiased estimators with known sampling distributions. These unbiased estimators have a smaller sampling variance than the maximum-likelihood estimators and are, because of this and some other favorable properties, recommended for general use. On the basis of the variances of the unbiased estimators the gain in accuracy in estimating reliability connected with further division of a test can be expressed explicitly. The limits of these variances and thus the limits of accuracy of estimation are derived. Finally, statistical small sample tests of the reliability coefficient are outlined. This paper also covers the sampling distribution of Cronbach's coefficient alpha.
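Coefficient alpha for a test split into k equivalent parts is α = k/(k−1) · (1 − Σ s²ᵢ / s²_total), where s²ᵢ are the part variances and s²_total is the variance of the total scores. A self-contained sketch (the sample data are invented for illustration):

```python
def cronbach_alpha(parts):
    """parts: list of k lists, each holding one test part's scores across n subjects.

    alpha = k/(k-1) * (1 - sum(part variances) / variance of total scores),
    using sample variances (denominator n - 1).
    """
    k = len(parts)
    n = len(parts[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(part[i] for part in parts) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(p) for p in parts) / var(totals))

alpha = cronbach_alpha([[2, 4, 3, 5], [3, 5, 4, 6], [2, 5, 5, 6]])
```

Splitting the test into more parts changes both the estimate and, as the abstract notes, the precision with which reliability is estimated.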

7.
Most inter-rater reliability studies using nominal scales suggest the existence of two populations of inference: the population of subjects (collection of objects or persons to be rated) and that of raters. Consequently, the sampling variance of the inter-rater reliability coefficient can be seen as a result of the combined effect of the sampling of subjects and raters. However, all inter-rater reliability variance estimators proposed in the literature only account for the subject sampling variability, ignoring the extra sampling variance due to the sampling of raters, even though the latter may be the largest of the variance components. Such variance estimators make statistical inference possible only to the subject universe. This paper proposes variance estimators that will make it possible to infer to both universes of subjects and raters. The consistency of these variance estimators is proved as well as their validity for confidence interval construction. These results are applicable only to fully crossed designs where each rater must rate each subject. A small Monte Carlo simulation study is presented to demonstrate the accuracy of large-sample approximations on reasonably small samples.
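The paper's point, that rater sampling adds variance which subject-only estimators ignore, can be illustrated by resampling both dimensions of a fully crossed design. The sketch below bootstraps subjects and raters jointly for a simple pairwise-agreement statistic; it illustrates the idea only, and is not the authors' closed-form variance estimators:

```python
import random

def pairwise_agreement(ratings):
    """ratings[s][r]: category assigned to subject s by rater r (fully crossed).
    Returns the mean proportion of agreeing rater pairs per subject."""
    total = 0.0
    for row in ratings:
        pairs = [(i, j) for i in range(len(row)) for j in range(i + 1, len(row))]
        total += sum(row[i] == row[j] for i, j in pairs) / len(pairs)
    return total / len(ratings)

def two_way_bootstrap(ratings, n_boot=2000, seed=0):
    """Resample subjects AND raters with replacement, so the resulting interval
    reflects both sources of sampling variance."""
    rng = random.Random(seed)
    n_subj, n_rater = len(ratings), len(ratings[0])
    stats = []
    for _ in range(n_boot):
        subj = [rng.randrange(n_subj) for _ in range(n_subj)]
        rats = [rng.randrange(n_rater) for _ in range(n_rater)]
        stats.append(pairwise_agreement([[ratings[s][r] for r in rats] for s in subj]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot)]
```

Dropping the rater resampling line recovers the subject-only bootstrap the paper criticizes, and typically yields a narrower (overconfident) interval.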

8.
We report analytical and computational investigations into the effects of base time on the diagnosticity of two popular theoretical tools in the redundant signals literature: (1) the race model inequality and (2) the capacity coefficient. We show analytically and without distributional assumptions that the presence of base time decreases the sensitivity of both of these measures to model violations. We further use simulations to investigate the statistical power of model selection tools based on the race model inequality, both with and without base time. Base time decreases statistical power, and biases the race model test toward conservatism. The magnitude of this biasing effect increases as we increase the proportion of total reaction time variance contributed by base time. We marshal empirical evidence to suggest that the proportion of reaction time variance contributed by base time is relatively small, and that the effects of base time on the diagnosticity of our model-selection tools are therefore likely to be minor. However, uncertainty remains concerning the magnitude and even the definition of base time. Experimentalists should continue to be alert to situations in which base time may contribute a large proportion of the total reaction time variance.
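The race model inequality referred to here is Miller's bound F_AB(t) ≤ F_A(t) + F_B(t), where F_AB is the redundant-target reaction time distribution and F_A, F_B are the single-target distributions. A distribution-free check from raw reaction times via empirical CDFs (a positive maximum indicates a violation; base time, as the abstract discusses, is not modeled here):

```python
import bisect

def ecdf(sample):
    """Empirical CDF: F(t) = proportion of observations <= t."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: bisect.bisect_right(xs, t) / n

def race_violation(rt_redundant, rt_a, rt_b):
    """Largest value of F_AB(t) - (F_A(t) + F_B(t)) across observed times.
    A positive result rejects the whole class of race models."""
    F_ab, F_a, F_b = ecdf(rt_redundant), ecdf(rt_a), ecdf(rt_b)
    ts = sorted(set(rt_redundant) | set(rt_a) | set(rt_b))
    return max(F_ab(t) - (F_a(t) + F_b(t)) for t in ts)
```

In practice the test is run on quantile estimates with a statistical correction; this raw-ECDF version is the bare logic of the inequality.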

9.
The importance of accurate estimation and of powerful statistical tests is widely recognized but has rarely been acknowledged in practice in the social and behavioral sciences. This is especially true for estimation and testing when one is dealing with multilevel designs, not least because approximating accuracy and power is more complex due to having multiple variances and research units at several levels. The complexity further increases for imbalanced designs, often necessitating simulation studies that perform accuracy and power calculations. However, we show, using such simulation studies, that the distortion of balance can be ignored in most cases, making efficiency studies simpler and the use of existing software valid. An exception is suggested for imbalanced data from a large majority of small groups. Furthermore, an empirical sampling distribution of variance parameters may show substantial skewness and kurtosis, depending on the number of groups and, for the random slope, depending also on the group’s size, adding another caveat to the recommendation to ignore imbalance.

10.
The statistical simulation program DATASIM is designed to conduct large-scale sampling experiments on microcomputers. Monte Carlo procedures are used to investigate the Type I and Type II error rates for statistical tests when one or more assumptions are systematically violated: assumptions, for example, regarding normality, homogeneity of variance or covariance, minimum expected cell frequencies, and the like. In the present paper, we report several initial tests of the data-generating algorithms employed by DATASIM. The results indicate that the uniform and standard normal deviate generators perform satisfactorily. Furthermore, Kolmogorov-Smirnov tests show that the sampling distributions of z, t, F, χ2, and r generated by DATASIM simulations follow the appropriate theoretical distributions. Finally, estimates of Type I error rates obtained by DATASIM under various patterns of violations of assumptions are in close agreement with the results of previous analytical and empirical studies. These converging lines of evidence suggest that DATASIM may well prove to be a reliable and productive tool for conducting statistical simulation research.
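The kind of sampling experiment DATASIM automates can be sketched in a few lines: simulate a null-true design that violates an assumption (here, unequal variances combined with unequal n for a pooled-variance t test) and count rejections. The sample sizes, standard deviations, and critical value below are illustrative choices, not DATASIM defaults:

```python
import math
import random

def pooled_t(x, y):
    """Student's two-sample t statistic assuming equal variances."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    sx2 = sum((v - mx) ** 2 for v in x) / (nx - 1)
    sy2 = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp2 = ((nx - 1) * sx2 + (ny - 1) * sy2) / (nx + ny - 2)
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))

def type1_rate(n1=10, n2=40, sd1=4.0, sd2=1.0, reps=5000, crit=2.0106, seed=1):
    """Both groups share mean 0, so every rejection is a Type I error.
    crit is approximately t(.975, df=48); the small group gets the large
    variance, which makes the pooled test liberal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, sd1) for _ in range(n1)]
        y = [rng.gauss(0, sd2) for _ in range(n2)]
        hits += abs(pooled_t(x, y)) > crit
    return hits / reps
```

With these settings the empirical rate lands well above the nominal .05, reproducing the classic pattern that such simulation programs are built to expose.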

11.
The suicide rate among psychiatrists revisited
A review of the literature which examined the suicide rate among psychiatrists and other doctors was made. Particular attention was given to statistical and methodological problems. Common problems include small sample sizes, inappropriate comparisons, lack of controls for age, sex, or other relevant factors, extrapolating rates from a level per 10,000 to a level per 100,000, and inclusion of a number of unwarranted assumptions. The review did not find evidence that the suicide rate among psychiatrists is higher than that of the population as a whole; nor is there any evidence that the rates of any medical specialty are above average, controlling for the relevant variables. The materials reviewed included all published studies. In order adequately to assess the suicide rate among psychiatrists, a systematic and extensive study must be made, controlling for the relevant methodological variables.

12.
Experiments often produce a hit rate and a false alarm rate in each of two conditions. These response rates are summarized into a single-point sensitivity measure such as d', and t tests are conducted to test for experimental effects. Using large-scale Monte Carlo simulations, we evaluate the Type I error rates and power that result from four commonly used single-point measures: d', A', percent correct, and gamma. We also test a newly proposed measure called gammaC. For all measures, we consider several ways of handling cases in which false alarm rate = 0 or hit rate = 1. The results of our simulations indicate that power is similar for these measures but that the Type I error rates are often unacceptably high. Type I errors are minimized when the selected sensitivity measure is theoretically appropriate for the data.
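As a concrete sketch of the comparison, the four single-point measures can be computed from one condition's frequency counts. The log-linear (add 0.5 to each cell) correction shown is just one of the conventions for handling H = 1 or F = 0 that such simulations compare, not necessarily the one the paper recommends:

```python
from statistics import NormalDist

def corrected_rates(hits, misses, fas, crs):
    # Log-linear correction: add 0.5 to every cell so H = 1 and F = 0 never occur
    h = (hits + 0.5) / (hits + misses + 1)
    f = (fas + 0.5) / (fas + crs + 1)
    return h, f

def measures(hits, misses, fas, crs):
    h, f = corrected_rates(hits, misses, fas, crs)
    z = NormalDist().inv_cdf
    d_prime = z(h) - z(f)
    pc = (hits + crs) / (hits + misses + fas + crs)       # uncorrected percent correct
    a_prime = 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f)) if h >= f else 0.5
    gamma = (h - f) / (h + f - 2 * h * f)                 # 2x2 gamma (Yule's Q) from H and F
    return {"d'": d_prime, "A'": a_prime, "pc": pc, "gamma": gamma}

m = measures(hits=20, misses=0, fas=0, crs=20)  # a perfect-scoring condition
```

The perfect-scoring example is exactly the boundary case the simulations worry about: without a correction, d' would be undefined there.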

13.
Most indexes of item validity and difficulty vary systematically with changes in the mean and variance of the group. Formulas are presented showing how certain item parameters will vary with these alterations in group mean and variance. Item parameters are also suggested which should remain invariant under such changes. These parameters are developed under two different assumptions: first, the assumption that the total distribution of the item ability variable is normal, and, second, that the distribution of the item ability variable for each array of the explicit selection variable is normal. The writer wishes to acknowledge helpful discussions of this paper with Paul Horst and Herbert S. Sichel who have worked on various aspects of the problem of invariant item parameters.

14.
15.
Knowledge monitoring predicts academic outcomes in many contexts. However, measures of knowledge monitoring accuracy are often incomplete. In the current study, a measure of students’ ability to discriminate known from unknown information as a component of knowledge monitoring was considered. Undergraduate students’ knowledge monitoring accuracy was assessed and used to predict final exam scores in a specific course. It was found that gamma, a measure commonly used as the measure of knowledge monitoring accuracy, accounted for a small, but significant amount of variance in academic performance whereas the discrimination and bias indexes combined to account for a greater amount of variance in academic performance.
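For a 2×2 knowledge-monitoring table (judgment: known/unknown, crossed with performance: correct/incorrect), gamma reduces to Yule's Q. The discrimination and bias definitions below are common conventions chosen for illustration; they are assumptions here, not necessarily the exact indexes the study used:

```python
def knowledge_monitoring(a, b, c, d):
    """2x2 cell counts: a = judged known & correct, b = judged known & incorrect,
    c = judged unknown & correct, d = judged unknown & incorrect.

    gamma: Goodman-Kruskal gamma, which for a 2x2 table equals Yule's Q.
    discrimination: P(correct | judged known) - P(correct | judged unknown).
    bias: proportion judged known minus proportion actually correct
          (positive = overconfidence). Both conventions are illustrative.
    """
    n = a + b + c + d
    gamma = (a * d - b * c) / (a * d + b * c)
    discrimination = a / (a + b) - c / (c + d)
    bias = (a + b) / n - (a + c) / n
    return gamma, discrimination, bias
```

A student who says "known" for everything gets an undefined or degenerate gamma but a clearly interpretable bias, which is one reason the combined indexes can carry more predictive variance.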

16.
The current study examined accuracy and bias in perceptions of a partner’s daily approach and avoidance sacrifice motives, associations of bias with daily sacrificer and perceiver relationship quality, and moderators of accuracy (gender, relationship length, daily stress). 94 cohabiting couples completed daily measures of sacrifice (N = 375 days), stress, relationship satisfaction, and closeness. People showed evidence of accuracy for partner sacrifice motives, but accuracy for approach sacrifice depended on relationship length. People underperceived partner sacrifice motives; underperception of approach sacrifice was associated with greater sacrificer closeness, while overperception of approach sacrifice and overperception of avoidance sacrifice were associated with greater perceiver closeness, although these effects were small. Gender and stress did not moderate tracking accuracy or mean-level bias.

17.
A series of studies on part-whole free recall led to the conclusion that learning part of a list before learning the entire list produces negative transfer late in learning. The statistical evidence for this conclusion is shown to depend upon assumptions about (1) the asymptotic level reached and (2) the relative magnitude of the variance between conditions as compared to the variance of Ss within conditions. Evidence concerning these assumptions is reviewed, and it is argued that there was insufficient evidence to support a conclusion of negative transfer in part-whole free recall.

18.
Production, verification, and priming of multiplication facts
In the arithmetic-verification procedure, subjects are presented with a simple equation (e.g., 4 × 8 = 24) and must decide quickly whether it is true or false. The prevailing model of arithmetic verification holds that the presented answer (e.g., 24) has no direct effect on the speed and accuracy of retrieving an answer to the problem. It follows that models of the retrieval stage based on verification are also valid models of retrieval in the production task, in which subjects simply retrieve and state the answer to a given problem. Results of two experiments using single-digit multiplication problems challenge these assumptions. It is argued that the presented answer in verification functions as a priming stimulus and that on “true” verification trials the effects of priming are sufficient to distort estimates of problem difficulty and to mask important evidence about the nature of the retrieval process. It is also argued that the priming of false answers that have associative links to a presented problem induces interference that disrupts both speed and accuracy of retrieval. The results raise questions about the interpretation of verification data and offer support for a network-interference theory of the mental processes underlying simple multiplication.

19.
Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. This article describes a framework for robust estimation and testing that uses trimmed means with an approximate degrees of freedom heteroscedastic statistic for independent and correlated groups designs in order to achieve robustness to the biasing effects of nonnormality and variance heterogeneity. The authors describe a nonparametric bootstrap methodology that can provide improved Type I error control. In addition, the authors indicate how researchers can set robust confidence intervals around a robust effect size parameter estimate. In an online supplement, the authors use several examples to illustrate the application of an SAS program to implement these statistical methods.
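The building blocks of the trimmed-means framework are the trimmed mean itself and the winsorized variance used in Yuen-type standard errors. A minimal sketch with the conventional 20% trimming (the example data are invented; the paper's own implementation is in SAS):

```python
def trimmed_mean(xs, prop=0.2):
    """Mean after removing the lowest and highest `prop` of the observations
    (g = floor(n * prop) values from each tail)."""
    xs = sorted(xs)
    g = int(len(xs) * prop)
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)

def winsorized_variance(xs, prop=0.2):
    """Sample variance after replacing each trimmed tail value with the
    nearest retained value; this is the variance that enters Yuen-style
    standard errors for trimmed means."""
    xs = sorted(xs)
    n = len(xs)
    g = int(n * prop)
    w = [xs[g]] * g + xs[g:n - g] + [xs[n - g - 1]] * g
    m = sum(w) / n
    return sum((v - m) ** 2 for v in w) / (n - 1)

print(trimmed_mean([1, 2, 3, 4, 100]))  # 3.0 -- the outlier 100 is trimmed away
```

The ordinary mean of the same sample is 22, which is the kind of distortion under small departures from normality that motivates the framework.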

20.
In a recent paper, Bedrick derived the asymptotic distribution of Lord's modified sample biserial correlation estimator and studied its efficiency for bivariate normal populations. We present a more detailed examination of the properties of Lord's estimator and several competitors, including Brogden's estimator. We show that Lord's estimator is more efficient for three nonnormal distributions than a generalization of Pearson's sample biserial estimator. In addition, Lord's estimator is reasonably efficient relative to the maximum likelihood estimator for these distributions. These conclusions are consistent with Bedrick's results for the bivariate normal distribution. We also study the small sample bias and variance of Lord's estimator, and the coverage properties of several confidence interval estimates. The author would like to thank the referees for several suggestions that improved the paper.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)