Similar Articles
20 similar articles found.
1.
Distribution-free tests of stochastic dominance for small samples
One variable is said to “stochastically dominate” another if the probability of observations smaller than x is greater for one variable than the other, for all x. Inferring stochastic dominance from data samples is important for many applications of econometrics and experimental psychology, but little is known about the performance of existing inferential methods. Through simulation, we show that three of the most widely used inferential methods are inadequate for use in small samples of the size commonly encountered in many applications (up to 400 observations from each distribution). We develop two new inferential methods that perform very well in a limited, but practically important, case where the two variables are guaranteed not to be equal in distribution. We also show that extensions of these new methods, and an improved version of an existing method, perform quite well in the original, unlimited case.
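As a rough illustration of the definition only (not of the inferential methods evaluated in the paper), the following sketch compares empirical CDFs of two simulated samples; the normal distributions, the shift of 0.3, and the sample size of 200 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples in the size range discussed in the abstract (up to ~400 each).
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = rng.normal(loc=0.3, scale=1.0, size=200)

# Evaluate both empirical CDFs on the pooled support.
grid = np.sort(np.concatenate([x, y]))
Fx = np.searchsorted(np.sort(x), grid, side="right") / x.size
Fy = np.searchsorted(np.sort(y), grid, side="right") / y.size

# Following the abstract's wording: observations smaller than any threshold are
# more probable under X, i.e. Fx >= Fy everywhere, so Y dominates X in the sample.
print("Fx >= Fy everywhere (sample evidence that Y dominates X):",
      bool(np.all(Fx >= Fy)))
print("max CDF gap:", float(np.max(np.abs(Fx - Fy))))
```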

2.
In a comparison of 2 treatments, if outcome scores are denoted by X in 1 condition and by Y in the other, stochastic equality is defined as P(X < Y) = P(X > Y). Tests of stochastic equality can be affected by characteristics of the distributions being compared, such as heterogeneity of variance. Thus, various robust tests of stochastic equality have been proposed and are evaluated here using a Monte Carlo study with sample sizes ranging from 10 to 30. Three robust tests are identified that perform well in Type I error rates and power except when extremely skewed data co-occur with very small n. When tests of stochastic equality might be preferred to tests of means is also considered.
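The abstract does not name the specific robust tests it evaluates; as one commonly used robust test of the hypothesis P(X < Y) = P(X > Y), the following sketch applies SciPy's Brunner-Munzel test to simulated heteroscedastic data. The sample sizes and variances are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Small samples (n between 10 and 30, as in the Monte Carlo study) with unequal variances.
x = rng.normal(loc=0.0, scale=1.0, size=15)
y = rng.normal(loc=0.0, scale=3.0, size=25)

# Brunner-Munzel test of stochastic equality; unlike the Wilcoxon rank-sum test
# it does not assume equal variances in the two groups.
stat, p = stats.brunnermunzel(x, y)
print(f"Brunner-Munzel statistic = {stat:.3f}, p = {p:.3f}")

# Sample estimate of the stochastic-superiority probability P(X < Y) + 0.5 * P(X = Y).
p_hat = np.mean(x[:, None] < y[None, :]) + 0.5 * np.mean(x[:, None] == y[None, :])
print(f"estimated P(X < Y) + 0.5 * P(X = Y) = {p_hat:.3f}")
```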

3.
4.
Suppose a collection of standard tests is given to all subjects in a random sample, but a different new test is given to each group of subjects in nonoverlapping subsamples. A simple method is developed for displaying the information that the data set contains about the correlational structure of the new tests. This is possible to some extent, even though each subject takes only one new test. The method uses plausible values of the partial correlations among the new tests given the standard tests in order to generate plausible simple correlations among the new tests and plausible multiple correlations between composites of the new tests and the standard tests. The real data example included suggests that the method can be useful in practical problems.
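The abstract gives no formulas; purely as a hedged illustration of the underlying identity in the simplest case of a single standard test X, the sketch below converts a plausible partial correlation between two new tests Y and Z (given X) into the implied simple correlation. The value of .6 for each new test's correlation with X is an assumption for illustration.

```python
import numpy as np

def implied_simple_correlation(r_xy: float, r_xz: float, r_yz_given_x: float) -> float:
    """Simple correlation between new tests Y and Z implied by a plausible
    partial correlation r(Y,Z | X) and the estimable correlations with a
    single standard test X (illustrative one-predictor case only)."""
    return r_yz_given_x * np.sqrt((1 - r_xy**2) * (1 - r_xz**2)) + r_xy * r_xz

# Hypothetical values: each new test correlates .6 with the standard test.
for r_partial in (0.0, 0.3, 0.6):
    r_yz = implied_simple_correlation(0.6, 0.6, r_partial)
    print(f"plausible r(Y,Z|X) = {r_partial:.1f}  ->  implied r(Y,Z) = {r_yz:.3f}")
```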

5.
6.
7.
8.
When sample observations are not independent, the variance estimate in the denominator of the Student t statistic is altered, inflating the value of the test statistic and resulting in far too many Type I errors. Furthermore, how much the Type I error probability exceeds the nominal significance level is an increasing function of sample size. If N is quite large, in the range of 100 to 200 or larger, small apparently inconsequential correlations that are unknown to a researcher, such as .01 or .02, can have substantial effects and lead to false reports of statistical significance when effect size is zero.
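A minimal simulation sketch of the effect described, using a one-sample t test on observations that share a small exchangeable correlation; the correlation of .02, the sample sizes, and the number of simulations are assumptions chosen only to reproduce the qualitative pattern.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def rejection_rate(n: int, rho: float, n_sims: int = 5000, alpha: float = 0.05) -> float:
    """Type I error rate of the one-sample t test when observations share
    a small exchangeable correlation rho (the true mean is zero)."""
    # Exchangeable correlation: x_i = sqrt(rho) * common + sqrt(1 - rho) * unique_i.
    common = rng.standard_normal((n_sims, 1))
    unique = rng.standard_normal((n_sims, n))
    x = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * unique
    t = stats.ttest_1samp(x, popmean=0.0, axis=1)
    return float(np.mean(t.pvalue < alpha))

for n in (20, 100, 200):
    print(f"n = {n:3d}, rho = .02: Type I error rate ~ {rejection_rate(n, 0.02):.3f}")
```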

9.
Consider an old test X consisting of s sections and two new tests Y and Z similar to X, consisting of p and q sections respectively. All subjects are given test X plus two variable sections from either test Y or Z. Different pairings of variable sections are given to each subsample of subjects. We present a method of estimating the covariance matrix of the combined test (X_1, ..., X_s, Y_1, ..., Y_p, Z_1, ..., Z_q) and describe an application of these estimation techniques to linear, observed-score test equating. The author is indebted to Paul W. Holland and Donald B. Rubin for their encouragement and many helpful comments and suggestions that contributed significantly to the development of this paper. This research was supported by the Program Statistics Research Project of the ETS Research Statistics Group.
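The covariance estimation itself depends on the section pairings in the design, but the linear observed-score equating it feeds into has a standard form; a minimal sketch under assumed (hypothetical) summary statistics for the two forms.

```python
import numpy as np

def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Linear observed-score equating: map a score x on form X to the Y scale
    by matching means and standard deviations (summary statistics assumed known,
    e.g. estimated from the combined-test covariance matrix)."""
    return mean_y + (sd_y / sd_x) * (np.asarray(x, dtype=float) - mean_x)

# Hypothetical summary statistics for two test forms.
print(linear_equate([40, 50, 60], mean_x=50.0, sd_x=10.0, mean_y=52.0, sd_y=9.0))
```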

10.
“Improper linear models” (see Dawes, Am. Psychol. 34:571–582, 1979), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of “improper” linear models as “proper” statistical models with a single predictor. We derive the upper bound on the mean squared error of this estimator and demonstrate that it has less variance than ordinary least squares estimates. We examine common choices of the weighting vector used in the literature, e.g., single variable heuristics and equal weighting, and illustrate their performance in various test cases.
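A minimal simulation sketch contrasting an equal-weighting ("improper") composite, fit with a single scaling coefficient as in the recasting described above, against ordinary least squares in a small sample; the data-generating weights, sample sizes, and noise level are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

n, p = 30, 5
beta = np.full(p, 0.5)                        # roughly equal true weights (assumed)
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

X_test = rng.standard_normal((2000, p))
y_test = X_test @ beta + rng.standard_normal(2000)

# "Proper" model: ordinary least squares on all p predictors.
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
mse_ols = np.mean((y_test - X_test @ b_ols) ** 2)

# "Improper" model: a unit-weight composite (predictors already on a common scale
# here), with one free scaling coefficient fit by regressing y on the composite.
composite = X.sum(axis=1)
scale = np.dot(composite, y) / np.dot(composite, composite)
mse_equal = np.mean((y_test - scale * X_test.sum(axis=1)) ** 2)

print(f"test MSE, OLS:           {mse_ols:.3f}")
print(f"test MSE, equal weights: {mse_equal:.3f}")
```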

11.
12.
Five different ability estimators, maximum likelihood (MLE), weighted likelihood (WLE), Bayesian modal (BME), expected a posteriori (EAP), and the standardized number-right score (Z), were used as scores for conventional multiple-choice tests. The bias, standard error, and reliability of the five ability estimators were evaluated using Monte Carlo estimates of the unknown conditional means and variances of the estimators. The results indicated that ability estimates based on BME, EAP, or WLE were reasonably unbiased for the range of abilities corresponding to the difficulty of a test, and that their standard errors were relatively small. Also, they were as reliable as the old standby, the number-right score.
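A minimal sketch of one of the listed estimators, the expected a posteriori (EAP) estimate, computed by numerical quadrature under a Rasch (one-parameter logistic) model with a standard normal prior; the item difficulties and the response pattern are hypothetical.

```python
import numpy as np

def eap_ability(responses, difficulties, n_nodes=61):
    """EAP estimate of ability under a Rasch model with a N(0, 1) prior."""
    theta = np.linspace(-4, 4, n_nodes)                      # quadrature nodes
    prior = np.exp(-0.5 * theta**2)                          # unnormalized N(0, 1)
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulties[None, :])))
    like = np.prod(np.where(responses[None, :] == 1, p, 1.0 - p), axis=1)
    post = like * prior
    post /= post.sum()
    return float(np.sum(theta * post))

difficulties = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # hypothetical item difficulties
responses = np.array([1, 1, 1, 0, 0])                   # hypothetical response pattern
print(f"EAP ability estimate: {eap_ability(responses, difficulties):.3f}")
```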

13.
Logistic regression has probably been underutilized in clinical investigations of personality because of its relatively recent development (dictated by the need for computer programs to obtain maximum likelihood estimates) and the fact that its use has been largely confined to the fields of biostatistics, epidemiology, and economics. Its use should be given serious consideration when the outcome of interest is dichotomous (or polychotomous) in nature and the predictors of interest may be categorical or continuous. The logit transformation is quite tractable mathematically, and it embodies the notion of threshold, which may have relevance for many of the variables that are of interest to investigators in the field of personality. Furthermore, investigators with experience in multiple linear regression or contingency table analysis should have little trouble transitioning to logistic regression. Logistic regression programs are readily available in the major statistical packages, all of which provide fairly standard output.
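A minimal sketch of the kind of analysis advocated: a dichotomous outcome regressed on one continuous and one categorical predictor by maximum likelihood, here via statsmodels; the variable names and simulated effect sizes are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)

# Hypothetical data: a continuous trait score and a two-level categorical predictor.
n = 200
trait = rng.normal(size=n)
group = rng.integers(0, 2, size=n)
log_odds = -0.5 + 1.2 * trait + 0.8 * group          # assumed true log-odds
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

df = pd.DataFrame({"y": y, "trait": trait, "group": group})

# Logistic regression fit by maximum likelihood; coefficients are on the
# logit (log-odds) scale, so exponentiating gives odds ratios.
model = smf.logit("y ~ trait + C(group)", data=df).fit(disp=False)
print(model.summary())
print("odds ratios:\n", np.exp(model.params))
```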

14.
Familiarity with a word can be divided into two main components: familiarity with the form of the word (due to both its lexicality and its specific form) and familiarity with its meaning. In this study, ratings of familiarity were compared for words whose meaning was unknown to participants (UM words), for words of known meaning (KM words), and for unknown words (U words). Linguistic and experiential frequencies were equivalent. Rated familiarity was lower for UM than KM words and even lower for U words. Next, we built pseudowords from these stimuli by changing one letter and submitted them to two familiarity rating tasks that differed in the nature of the additional stimuli: either only nonwords or nonwords plus words. It was assumed that familiarity ratings would be lower for pseudowords built from UM words than for pseudowords built from KM words. The data were consistent with this assumption, and ratings depended on the initial categories of stimuli. These results support the view that usual word familiarity has two components, familiarity with form and familiarity with meaning, and a double source, processing of word form and processing of word meaning. The full set of these materials and norms may be downloaded from www.psychonomic.org/archive.

15.
Traditionally, two distinct approaches have been employed for exploratory factor analysis: maximum likelihood factor analysis and principal component analysis. A third alternative, called regularized exploratory factor analysis, was introduced recently in the psychometric literature. Small sample size is an important issue that has received considerable discussion in the factor analysis literature. However, little is known about the differential performance of these three approaches to exploratory factor analysis in a small sample size scenario. A simulation study and an empirical example demonstrate that regularized exploratory factor analysis may be recommended over the two traditional approaches, particularly when sample sizes are small (below 50) and the sample covariance matrix is near singular.
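The sketch below illustrates only the regularization idea, not the specific regularized exploratory factor analysis estimator studied in the paper: with fewer than 50 observations on many variables the sample covariance matrix is near singular, and adding a ridge term (the tuning constant 0.1 is an arbitrary assumption) stabilizes a one-factor extraction.

```python
import numpy as np

rng = np.random.default_rng(5)

# Small sample (n = 30) on many variables (k = 40) -> singular sample covariance.
n, k = 30, 40
X = rng.standard_normal((n, k)) + rng.standard_normal((n, 1))  # one common factor
S = np.cov(X, rowvar=False)
print("smallest eigenvalue of S:", np.linalg.eigvalsh(S).min())

# Ridge-regularized covariance (lambda is an assumed tuning constant).
lam = 0.1
S_reg = S + lam * np.eye(k)

# One-factor, principal-axis-style extraction from the regularized matrix:
# loadings = leading eigenvector scaled by the square root of its eigenvalue.
vals, vecs = np.linalg.eigh(S_reg)
loadings = vecs[:, -1] * np.sqrt(vals[-1])
print("first-factor loadings (regularized):", np.round(loadings, 2))
```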

16.
In this article, we present a Bayes factor solution for inference in multiple regression. Bayes factors are principled measures of the relative evidence from data for various models or positions, including models that embed null hypotheses. In this regard, they may be used to state positive evidence for a lack of an effect, which is not possible in conventional significance testing. One obstacle to the adoption of Bayes factors in psychological science is a lack of guidance and software. Recently, Liang, Paulo, Molina, Clyde, and Berger (2008; Mixtures of g-priors for Bayesian variable selection, Journal of the American Statistical Association, 103, 410-423) developed computationally attractive default Bayes factors for multiple regression designs. We provide a web applet for convenient computation, along with guidance and context for use of these priors. We discuss the interpretation and advantages of the advocated Bayes factor evidence measures.
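A hedged numerical sketch of the type of default Bayes factor referenced (not the authors' web applet): the Bayes factor for a regression model against the intercept-only null under the Zellner-Siow prior on g, computed by one-dimensional integration following Liang et al. (2008). The R-squared, n, and p values in the example are hypothetical.

```python
import numpy as np
from scipy import integrate, special

def bf_regression_zs(r_squared: float, n: int, p: int) -> float:
    """Default Bayes factor for a regression with p predictors against the
    intercept-only null, under the Zellner-Siow prior on g (numerical
    integral over g, following Liang et al., 2008)."""
    def integrand(g):
        if g <= 0.0:
            return 0.0
        prior = np.sqrt(n / 2.0) / special.gamma(0.5) * g ** (-1.5) * np.exp(-n / (2.0 * g))
        return ((1.0 + g) ** ((n - 1 - p) / 2.0)
                * (1.0 + g * (1.0 - r_squared)) ** (-(n - 1) / 2.0)
                * prior)
    bf, _ = integrate.quad(integrand, 0.0, np.inf)
    return bf

# Hypothetical example: R^2 = .30 with 3 predictors and n = 50 observations.
print(f"BF10 vs. the null model: {bf_regression_zs(0.30, n=50, p=3):.2f}")
```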

17.
18.
Baumgartner, Weiss, and Schindler (1998) introduced a novel non-parametric test for the two-sample comparison that is superior to commonly used tests such as the Wilcoxon rank-sum test. A modification of the novel test statistic can be used for one-sided comparisons based on ordinal data. Such comparisons frequently occur in psychological research, and the Wilcoxon test is often recommended for their analysis. Here, the two tests were compared in a simulation study. According to this study, the tests have a similar Type I error rate, but the modified Baumgartner-Weiss-Schindler test is more powerful than the Wilcoxon test.
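An illustrative sketch comparing the two tests on hypothetical ordinal ratings; the Baumgartner-Weiss-Schindler test is assumed to be available as scipy.stats.bws_test (present in recent SciPy releases), and the default two-sided permutation version is used here rather than the one-sided modification discussed in the abstract.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Hypothetical ordinal data (e.g., 5-point ratings) from two conditions.
a = rng.integers(1, 6, size=25)
b = np.clip(rng.integers(1, 6, size=25) + rng.integers(0, 2, size=25), 1, 5)

# Wilcoxon rank-sum / Mann-Whitney U test, the commonly recommended analysis.
u = stats.mannwhitneyu(a, b, alternative="two-sided")
print(f"Wilcoxon/Mann-Whitney: U = {u.statistic:.1f}, p = {u.pvalue:.3f}")

# Baumgartner-Weiss-Schindler test (permutation version, two-sided by default).
bws = stats.bws_test(a, b)
print(f"BWS test: B = {bws.statistic:.3f}, p = {bws.pvalue:.3f}")
```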

19.
Finite sample inference procedures are considered for analyzing the observed scores on a multiple choice test with several items, where, for example, the items are dissimilar, or the item responses are correlated. A discrete p-parameter exponential family model leads to a generalized linear model framework and, in a special case, a convenient regression of true score upon observed score. Techniques based upon the likelihood function, Akaike's information criterion (AIC), an approximate Bayesian marginalization procedure based on conditional maximization (BCM), and simulations for exact posterior densities (importance sampling) are used to facilitate finite sample investigations of the average true score, individual true scores, and various probabilities of interest. A simulation study suggests that, when the examinees come from two different populations, the exponential family can adequately generalize Duncan's beta-binomial model. Extensions to regression models, the classical test theory model, and empirical Bayes estimation problems are mentioned. The Duncan, Keats, and Matsumura data sets are used to illustrate potential advantages and flexibility of the exponential family model, and the BCM technique. The authors wish to thank Ella Mae Matsumura for her data set and helpful comments, Frank Baker for his advice on item response theory, Hirotugu Akaike and Taskin Atilgan for helpful discussions regarding AIC, Graham Wood for his advice concerning the class of all binomial mixture models, Yiu Ming Chiu for providing useful references and information on tetrachoric models, and the Editor and two referees for suggesting several references and alternative approaches.
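A minimal sketch of the simplest special case mentioned, Duncan's beta-binomial model for number-correct scores, fit by maximum likelihood with SciPy; the 20-item test length and the score vector are hypothetical.

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical number-correct scores on a 20-item multiple-choice test.
n_items = 20
scores = np.array([12, 15, 9, 18, 14, 11, 16, 13, 17, 10, 15, 14, 12, 19, 8])

def neg_log_lik(log_params):
    a, b = np.exp(log_params)                      # keep both beta parameters positive
    return -np.sum(stats.betabinom.logpmf(scores, n_items, a, b))

res = optimize.minimize(neg_log_lik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)
print(f"beta-binomial MLE: a = {a_hat:.2f}, b = {b_hat:.2f}")
print(f"estimated mean true score: {n_items * a_hat / (a_hat + b_hat):.2f}")
```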

20.