To date, virtually all techniques appropriate for ordinal data are based on the uniform probability distribution over the permutations. In this paper we introduce and examine an alternative probability model for the distribution of ordinal data. Preliminary to deriving the expectations of Spearman's rho and Kendall's tau under this model, we show how to compute certain conditional expectations of rho and tau under the uniform distribution. The alternative probability model is then applied to ordinal test theory, and the calculation of true scores and test reliability is discussed.
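As a point of reference for the uniform baseline mentioned in this abstract, the following minimal sketch enumerates all permutations of a small ranking and averages Spearman's rho and Kendall's tau against the identity ordering. The function names are illustrative assumptions, and the sketch covers only the unconditional uniform case, not the alternative model or the conditional expectations the paper derives.

```python
from fractions import Fraction
from itertools import permutations


def spearman_rho(perm):
    """Spearman's rho between the identity ranking 1..n and `perm` (no ties)."""
    n = len(perm)
    d2 = sum((i - p) ** 2 for i, p in enumerate(perm, start=1))
    return 1 - Fraction(6 * d2, n * (n * n - 1))


def kendall_tau(perm):
    """Kendall's tau between the identity ranking 1..n and `perm` (no ties)."""
    n = len(perm)
    score = 0  # concordant pairs minus discordant pairs
    for i in range(n):
        for j in range(i + 1, n):
            score += 1 if perm[j] > perm[i] else -1
    return Fraction(score, n * (n - 1) // 2)


def uniform_expectation(stat, n):
    """Average `stat` over all n! permutations, each weighted equally."""
    perms = list(permutations(range(1, n + 1)))
    return sum(stat(p) for p in perms) / len(perms)
```

Exact fractions are used so that the averages can be compared without rounding error; enumeration is only practical for small n.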

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e‐rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability.

For ordinal measurement the concept of an individual propensity distribution is developed. For any given individual the mean of this distribution is his true score, for which estimation procedures are discussed. Two measures of individual dispersion are considered and their distributions derived in the null case. These measures are shown to be counterparts at the individual level of Kendall's tau and Spearman's rho. Estimation of the two dispersion measures from sample data is investigated, and the relation of these estimates to the variance of the individual propensity distribution is derived.

A test theory using only ordinal assumptions is presented. It is based on the idea that the test items are a sample from a universe of items. The sum across items of the ordinal relations for a pair of persons on the universe items is analogous to a true score. Using concepts from ordinal multiple regression, it is possible to estimate the tau correlations of test items with the universe order from the taus among the test items. These in turn permit the estimation of the tau of total score with the universe. It is also possible to estimate the odds that the direction of a given observed score difference is the same as that of the true score difference. The estimates of the correlations between items and universe and between total score and universe are found to agree well with the actual values in both real and artificial data. Part of this paper was presented at the June, 1989, Meeting of the Psychometric Society. The authors wish to thank several reviewers for their suggestions. This research was mainly done while the second author was a University Fellow at the University of Southern California.

The concept of an ordinal instrumental probabilistic comparison is introduced. It relies on an ordinal scale given a priori and on the concept of stochastic dominance. It is used to define a weakly independently ordered system, or isotonic ordinal probabilistic (ISOP) model, which allows the construction of separate sample-free ordinal scales on a set of subjects and a set of items. The ISOP-model is a common nonparametric theoretical structure for unidimensional models for quantitative, ordinal and dichotomous variables. Fundamental theorems on dichotomous and polytomous weakly independently ordered systems are derived. It is shown that the raw score system has the same formal properties as the latent system, and therefore the latter can be tested at the observed empirical level. I wish to thank 3 reviewers and 2 editors who contributed a lot to the readability and precision of the article.

In many psychological studies, in particular those conducted by experience sampling, mental states are measured repeatedly for each participant. Such a design allows for regression models that separate between- from within-person, or trait-like from state-like, components of association between two variables. But these models are typically designed for continuous variables, whereas mental state variables are most often measured on an ordinal scale. In this paper we develop a model for disaggregating between- from within-person effects of one ordinal variable on another. As in standard ordinal regression, our model posits a continuous latent response whose value determines the observed response. We allow the latent response to depend nonlinearly on the trait and state variables, but impose a novel penalty that shrinks the fit towards a linear model on the latent scale. A simulation study shows that this penalization approach is effective at finding a middle ground between an overly restrictive linear model and an overfitted nonlinear model. The proposed method is illustrated with an application to data from the experience sampling study of Baumeister et al. (2020, Personality and Social Psychology Bulletin, 46, 1631).
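The latent-response idea that this abstract attributes to standard ordinal regression can be sketched as follows. This is a minimal illustration assuming a probit link with a standard normal error, so that P(Y = k) = Φ(τ_k − η) − Φ(τ_{k−1} − η); the names `eta` and `thresholds` are hypothetical, and the paper's penalized nonlinear predictor is not reproduced here.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordinal_probit_probs(eta, thresholds):
    """Category probabilities for a latent response eta + e, with e ~ N(0, 1).

    `thresholds` must be strictly increasing; K thresholds give K + 1
    ordered categories, indexed 0..K.
    """
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [std_normal_cdf(c - eta) for c in cuts]
    # Probability mass of the latent response between adjacent cut points.
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]
```

Raising `eta` moves probability mass toward the higher categories, which is how a predictor acts on an ordinal outcome through the latent scale.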

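徐芃, 祁禄, 熊健, 叶浩生. Acta Psychologica Sinica (《心理学报》), 2015, 47(12): 1520-1528
Ordinal variables are ubiquitous in psychological phenomena and psychological data. A comprehensive ordinal-variable regression model can reasonably explain and predict psychological phenomena that follow the "mirror pattern" and the "funnel model". First, nonparametric tests are used for a preliminary dimension reduction of the influencing factors. Second, ordinal probit regression is used to assess the contribution of the factors that remain after reduction, which further screens for effective indicators that reach statistical significance. Finally, a logistic regression model is used to give a sufficiently informative explanation and prediction of whether a particular psychological phenomenon occurs. An empirical fit of this comprehensive ordinal regression model to the prediction of university graduates' satisfaction with their quality of work life confirms its applied value in the analysis of psychological phenomena and psychological data.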

Ordinal predictors are commonly used in regression models. They are often incorrectly treated as either nominal or metric, thus under- or overestimating the information contained. Such practices may lead to worse inference and predictions compared to methods which are specifically designed for this purpose. We propose a new method for modelling ordinal predictors that applies in situations in which it is reasonable to assume their effects to be monotonic. The parameterization of such monotonic effects is realized in terms of a scale parameter b representing the direction and size of the effect and a simplex parameter modelling the normalized differences between categories. This ensures that predictions increase or decrease monotonically, while changes between adjacent categories may vary across categories. This formulation generalizes to interaction terms as well as multilevel structures. Monotonic effects may be applied not only to ordinal predictors, but also to other discrete variables for which a monotonic relationship is plausible. In simulation studies we show that the model is well calibrated and, if there is monotonicity present, exhibits predictive performance similar to or even better than other approaches designed to handle ordinal predictors. Using Stan, we developed a Bayesian estimation method for monotonic effects which allows us to incorporate prior information and to check the assumption of monotonicity. We have implemented this method in the R package brms, so that fitting monotonic effects in a fully Bayesian framework is now straightforward.
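The parameterization described in this abstract, a scale parameter b combined with a simplex of normalized differences between adjacent categories, might be sketched as below. The function name, the coding of categories as 0..D, and the convention that b equals the total change from the lowest to the highest category are assumptions made for illustration; the published method and the brms implementation may scale b differently.

```python
def monotonic_effect(x, b, zeta):
    """Contribution of an ordinal predictor with categories 0..D, D = len(zeta).

    b    -- direction and size of the effect (here: total change from
            category 0 to category D; this scaling is an assumption)
    zeta -- simplex: non-negative weights summing to 1, one per step
            between adjacent categories
    """
    if any(z < 0 for z in zeta) or abs(sum(zeta) - 1.0) > 1e-9:
        raise ValueError("zeta must be a simplex")
    if not 0 <= x <= len(zeta):
        raise ValueError("x must lie in 0..len(zeta)")
    # Accumulate the share of the total effect reached by category x.
    return b * sum(zeta[:x])
```

Because every element of `zeta` is non-negative, the returned values are monotonic in `x` for any sign of `b`, while the step sizes between adjacent categories are free to differ.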

This paper proposes a general approach to accounting for individual differences in the extreme response style in statistical models for ordered response categories. This approach uses a hierarchical ordinal regression modeling framework with heterogeneous thresholds structures to account for individual differences in the response style. Markov chain Monte Carlo algorithms for Bayesian inference for models with heterogeneous thresholds structures are discussed in detail. A simulation and two examples based on ordinal probit models are given to illustrate the proposed methodology. The simulation and examples also demonstrate that failing to account for individual differences in the extreme response style can have adverse consequences for statistical inferences. The author is grateful to Ulf Böckenholt, an associate editor, and three anonymous reviewers for helpful comments, and Kristine Kuhn and Kshiti Joshi for providing the data.

In classical test theory, a high-reliability test always leads to a precise measurement. However, when it comes to the prediction of test scores, it is not necessarily so. Based on a Bayesian statistical approach, we predicted the distributions of test scores for a new subject, a new test, and a new subject taking a new test. Under some reasonable conditions, the predicted means, variances, and covariances of predicted scores were obtained and investigated. We found that high test reliability did not necessarily lead to small variances or covariances. For a new subject, higher test reliability led to larger predicted variances and covariances, because high test reliability enabled a more accurate prediction of test score variances. Regarding a new subject taking a new test, in this study, higher test reliability led to a large variance when the sample size was smaller than half the number of tests. Classical test theory is reanalyzed from the viewpoint of predictions and some suggestions are made.

In a recent article, Fagot proposed a generalized family of coefficients of relational agreement for multiple judges, focusing on the concept of empirically meaningful relationships. In this paper an ordinal coefficient of relational agreement, based on ranking data, is presented as a special case of the generalized family. It is shown that the proposed ordinal coefficient encompasses other ordinal coefficients, such as the Kendall coefficient of concordance, the average Spearman rank-order coefficient, and intraclass correlation based on ranks. It is also shown that the Kendall coefficient of concordance, corrected for chance agreement, is equivalent to the ordinal coefficient proposed in this paper.
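Two of the coefficients named in this abstract can be computed directly from untied rankings, which makes their standard relationship easy to check: for m judges, the average pairwise Spearman coefficient equals (mW − 1)/(m − 1), where W is the Kendall coefficient of concordance. The sketch below is illustrative only; the function names are assumptions and it does not implement the coefficient proposed in the paper.

```python
from fractions import Fraction
from itertools import combinations


def kendall_w(rankings):
    """Kendall coefficient of concordance for m untied rankings of n items."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = Fraction(m * (n + 1), 2)
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m * m * (n ** 3 - n))


def spearman_rho(r1, r2):
    """Spearman rank-order coefficient between two untied rankings."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - Fraction(6 * d2, n * (n * n - 1))


def average_spearman(rankings):
    """Mean Spearman coefficient over all pairs of judges."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)
```

For example, with three judges ranking four items as (1, 2, 3, 4), (2, 1, 3, 4) and (1, 3, 2, 4), both sides of the identity can be evaluated exactly because the functions return fractions.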

With random assignment to treatments and standard assumptions, either a one-way ANOVA of post-test scores or a two-way, repeated measures ANOVA of pre- and post-test scores provides a legitimate test of the equal treatment effect null hypothesis for the latent variable. In an ANCOVA for pre- and post-test variables X and Y, which are ordinal measures of the corresponding latent variables, random assignment and standard assumptions ensure the legitimacy of inferences about the equality of treatment effects on the latent variable. Sample estimates of adjusted Y treatment means are ordinal estimators of adjusted post-test means on the latent variable.

Four rats pressed levers and received food pellets under fixed-interval reinforcement schedules of 20, 60, and 180 seconds. The number of responses in each interval was recorded. From these data, the probability of reinforcement was determined as a function of response count. These functions were generally increasing. This finding is consistent with previous suggestions that increasing response rates within fixed intervals may be a function of response count in addition to or instead of elapsed or remaining time.

The simultaneous and nonparametric estimation of latent abilities and item characteristic curves is considered. The asymptotic properties of ordinal ability estimation and kernel smoothed nonparametric item characteristic curve estimation are investigated under very general assumptions on the underlying item response theory model as both the test length and the sample size increase. A large deviation probability inequality is stated for ordinal ability estimation. The mean squared error of kernel smoothed item characteristic curve estimates is studied and a strong consistency result is obtained showing that the worst case error in the item characteristic curve estimates over all items and ability levels converges to zero with probability equal to one.

Item response theory posits local independence, or conditional independence of item responses given item parameters and examinee proficiency parameters. The usual definition of local independence, however, addresses the context of fixed tests, and initially appears to yield incorrect response-pattern probabilities in the context of adaptive testing. The paradox is resolved by introducing additional notation to deal with the item selection mechanism. We are grateful to Charlie Lewis, Ming-Mei Wang, and Pao-Kuei Wu for discussions on this topic, and to the Editor, the reviewers, and Howard Wainer for helpful comments on an earlier version of the paper. The first author's work was supported in part by the National Center for Research on Evaluation, Standards, Student Testing (CRESST), Educational Research and Development Program, cooperative agreement number R117G10027 and CFDA catalog number 84.117G, as administered by the Office of Educational Research and Improvement, U.S. Department of Education.

Score tests for identifying locally dependent item pairs have been proposed for binary item response models. In this article, both the bifactor and the threshold shift score tests are generalized to the graded response model. For the bifactor test, the generalization is straightforward; it adds one secondary dimension associated only with one pair of items. For the threshold shift test, however, multiple generalizations are possible: in particular, conditional, uniform, and linear shift tests are discussed in this article. Simulation studies show that all of the score tests have accurate Type I error rates given large enough samples, although their small‐sample behaviour is not as good as that of Pearson's X² and M₂ as proposed in other studies for the purpose of local dependence (LD) detection. All score tests have the highest power to detect the LD which is consistent with their parametric form, and in this case they are uniformly more powerful than X² and M₂; even wrongly specified score tests are more powerful than X² and M₂ in most conditions. An example using empirical data is provided for illustration.

Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken (1971) and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four classical lower bounds to reliability. Finally, recommendations are given concerning the use of these estimation methods. The authors are grateful for constructive comments from the reviewers and from Charles Lewis.

Exact conditional tests of independence in cross-classification tables are formulated based on the χ² statistic and statistics with stronger operational interpretations, such as some nominal and ordinal measures of association. Guidelines for the table dimensions and sample sizes for which the tests are economically implemented on a computer are given. Some selected sample sizes and marginal distributions are used in a numerical comparison between the significance levels of the approximate and exact conditional tests based on the χ² statistic. The authors are grateful for the suggestions of the referees and for computer funding provided by the Northeast Regional Data Center at the University of Florida.

Assuming a nonparametric family of item response theory models, a theory-based procedure for testing the hypothesis of unidimensionality of the latent space is proposed. The asymptotic distribution of the test statistic is derived assuming unidimensionality, thereby establishing an asymptotically valid statistical test of the unidimensionality of the latent trait. Based upon a new notion of dimensionality, the test is shown to have asymptotic power 1. A 6300-trial Monte Carlo study using published item parameter estimates of widely used standardized tests indicates conservative adherence to the nominal level of significance and statistical power averaging 81 out of 100 rejections for examinee sample sizes and psychological test lengths often incurred in practice. The referees' comments were remarkably detailed and greatly enhanced the writeup and sensitized the author to certain pertinent issues. Discussions with Fritz Drasgow, Lloyd Humphreys, Dennis Jennings, Brian Junker, Robert Linn, Ratna Nandakumar, and Robin Shealy were also very useful. This research was supported by the Office of Naval Research under grant N00014-84-K-0186; NR 150-533, and by the National Science Foundation under grant DMS 85-03321.
