Found 20 similar articles (search time: 15 ms)
1.
It is often considered desirable to have the same ordering of the items by difficulty across different levels of the trait or ability. Such an ordering is an invariant item ordering (IIO). An IIO facilitates the interpretation of test results. For dichotomously scored items, earlier research surveyed the theory and methods of an invariant ordering in a nonparametric IRT context. Here the focus is on polytomously scored items, and both nonparametric and parametric IRT models are considered. The absence of the IIO property in two nonparametric polytomous IRT models is discussed, and two nonparametric models are discussed that imply an IIO. A method is proposed that can be used to investigate whether empirical data imply an IIO. Furthermore, only two parametric polytomous IRT models are found to imply an IIO. These are the rating scale model (Andrich, 1978) and a restricted rating scale version of the graded response model (Muraki, 1990). Well-known models, such as the partial credit model (Masters, 1982) and the graded response model (Samejima, 1969), do not imply an IIO.
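As a hedged illustration of the IIO idea in the abstract above, the following minimal sketch checks numerically that items following a rating scale model keep the same ordering of expected item scores at every trait level. The item locations, thresholds, and function names are hypothetical and chosen only for this example; the check is not the investigation method the abstract proposes.

```python
import math

def rsm_expected_score(theta, location, thresholds):
    """Expected item score under a rating scale model (illustrative parameterization).

    theta: latent trait value; location: item location;
    thresholds: category thresholds shared by all items.
    """
    # Cumulative logits: z_0 = 0, z_k = z_{k-1} + (theta - location - tau_k)
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - location - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return sum(k * w for k, w in enumerate(weights)) / total

# Hypothetical item locations and shared thresholds (assumptions for this sketch).
locations = [-1.0, 0.0, 0.8]
thresholds = [-0.7, 0.1, 0.6]
thetas = [-3.0 + 0.5 * i for i in range(13)]

def item_order(theta):
    """Item indices sorted from lowest to highest expected score at theta."""
    scores = [rsm_expected_score(theta, d, thresholds) for d in locations]
    return sorted(range(len(locations)), key=lambda i: scores[i])

# Collect the distinct orderings seen across the grid of trait values.
orders = {tuple(item_order(t)) for t in thetas}
print(orders)
```

Because every item shares the same thresholds, each expected score is the same increasing function of theta minus the item location, so a single ordering appears across the whole grid.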
2.
A note on monotonicity of item response functions for ordered polytomous item response theory models
Hyeon-Ah Kang Ya-Hui Su Hua-Hua Chang 《The British journal of mathematical and statistical psychology》2018,71(3):523-535
A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.
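The monotonicity claim in the abstract above can be spot-checked numerically. The sketch below is an illustration under assumed parameter values, not the paper's proof: it evaluates the expected item score of one generalized partial credit item on a grid of θ values and checks that it increases. The discrimination, step parameters, and function names are hypothetical.

```python
import math

def gpcm_category_probs(theta, a, steps):
    """Category probabilities for one item under the generalized partial credit model.

    theta: latent trait; a: discrimination (> 0); steps: step parameters b_1..b_m.
    """
    # Cumulative logits: z_0 = 0, z_k = z_{k-1} + a * (theta - b_k)
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_item_score(theta, a, steps):
    """Item response function: expected score, the sum of k * P(X = k | theta)."""
    probs = gpcm_category_probs(theta, a, steps)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical item with deliberately disordered step parameters.
a, steps = 1.2, [0.5, -0.5, 1.0]
thetas = [-4.0 + 0.1 * i for i in range(81)]
scores = [expected_item_score(t, a, steps) for t in thetas]

# Every score on the grid should exceed the one before it.
is_increasing = all(x < y for x, y in zip(scores, scores[1:]))
print(is_increasing)
```

The individual category curves need not be monotone, but the expected score still rises with θ in this example; a grid check like this can only illustrate the property, not establish it.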
3.
Anton K. Formann 《Psychometrika》1988,53(1):45-62
Starting from perfectly discriminating nonmonotone dichotomous items, a class of probabilistic models with or without response errors and with or without intrinsically unscalable respondents is described. All these models can be understood as simply restricted latent class analysis. Thus, the estimation and identifiability of the parameters (class sizes and item latent probabilities) as well as the chi-squared goodness-of-fit tests (Pearson and likelihood-ratio) are free of problems. The applicability of the proposed variants of latent class models is demonstrated on real attitudinal data. This research was supported by the Kulturamt der Stadt Wien, Magistratsabteilung 7. The author wishes to thank the editor, Ivo W. Molenaar, as well as Clifford C. Clogg and the anonymous reviewers for their valuable comments on the earlier drafts of this paper.
4.
M. J. H. van Onna 《Psychometrika》2002,67(4):519-538
In a latent class IRT model in which the latent classes are ordered on one dimension, the class-specific response probabilities are subject to inequality constraints. The number of these inequality constraints increases dramatically with the number of response categories per item, if assumptions like monotonicity or double monotonicity of the cumulative category response functions are postulated. A Markov chain Monte Carlo method, the Gibbs sampler, can sample from the multivariate posterior distribution of the parameters under the constraints. Bayesian model selection can be done by posterior predictive checks and Bayes factors. A simulation study is done to evaluate results of the application of these methods to ordered latent class models in three realistic situations. Also, an example of the presented methods is given for existing data with polytomous items. It can be concluded that the Bayesian estimation procedure can handle the inequality constraints on the parameters very well. However, the application of Bayesian model selection methods requires more research.
5.
6.
Xuelan Qiu Jimmy de la Torre 《The British journal of mathematical and statistical psychology》2023,76(3):491-512
The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.
7.
The authors describe and use four methods for detecting Differential Item Functioning in polytomous items: Mantel, Generalized Mantel-Haenszel (GMH), Ordinal Logistic Regression (RLO), and Discriminant Logistic Regression (RLD). For each procedure, the theoretical model and the measure of effect size are described. The data from the "Reading Comprehension Test" from the PISA 2000 evaluation program were analyzed using a cross-validation design. Two booklets were independently evaluated in the American and Spanish samples. Adopting as decision rule the significance of the statistical test and the measurement of the effect size, agreement among the evaluated procedures was total for two of the analyzed items.
8.
9.
Francesco Bartolucci 《Psychometrika》2007,72(2):141-157
We illustrate a class of multidimensional item response theory models in which the items are allowed to have different discriminating power and the latent traits are represented through a vector having a discrete distribution. We also show how the hypothesis of unidimensionality may be tested against a specific bidimensional alternative by using a likelihood ratio statistic between two nested models in this class. For this aim, we also derive an asymptotically equivalent Wald test statistic which is faster to compute. Moreover, we propose a hierarchical clustering algorithm which can be used, when the dimensionality of the latent structure is completely unknown, for dividing items into groups referred to different latent traits. The approach is illustrated through a simulation study and an application to a dataset collected within the National Assessment of Educational Progress, 1996. The author would like to thank the Editor, an Associate Editor and three anonymous referees for stimulating comments. I also thank L. Scaccia, F. Pennoni and M. Lupparelli for having done part of the simulations.
10.
Brian W. Junker 《Psychometrika》1991,56(2):255-278
A definition of essential independence is proposed for sequences of polytomous items. For items satisfying the reasonable assumption that the expected amount of credit awarded increases with examinee ability, we develop a theory of essential unidimensionality which closely parallels that of Stout. Essentially unidimensional item sequences can be shown to have a unique (up to change-of-scale) dominant underlying trait, which can be consistently estimated by a monotone transformation of the sum of the item scores. In more general polytomous-response latent trait models (with or without ordered responses), an M-estimator based upon maximum likelihood may be shown to be consistent for θ under essentially unidimensional violations of local independence and a variety of monotonicity/identifiability conditions. A rigorous proof of this fact is given, and the standard error of the estimator is explored. These results suggest that ability estimation methods that rely on the summation form of the log likelihood under local independence should generally be robust under essential independence, but standard errors may vary greatly from what is usually expected, depending on the degree of departure from local independence. An index of departure from local independence is also proposed. This work was supported in part by Office of Naval Research Grant N00014-87-K-0277 and National Science Foundation Grant NSF-DMS-88-02556. The author is grateful to William F. Stout for many helpful comments, and to an anonymous reviewer for raising the questions addressed in section 2. A preliminary version of section 6 appeared in the author's Ph.D. thesis.
11.
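This paper proposes a person-fit statistic, R, for polytomously scored items, examines its performance in detecting six common aberrant response patterns (cheating, guessing, random responding, carelessness, creative responding, and mixed aberrance), and compares it with the standardized log-likelihood statistic lzp. The results show that: (1) when the coverage of aberrant responses is low and the aberrance type is cheating or guessing, the detection rate of R is significantly higher than that of lzp; (2) the detection rates of both statistics rise as test length and the degree of examinee aberrance increase; (3) under some conditions, R and lzp perform similarly. An empirical data analysis further illustrates how the R statistic is used, and the results also indicate that R has good prospects for application.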
12.
Stochastic ordering using the latent trait and the sum score in polytomous IRT models (total citations: 1; self-citations: 0; citations by others: 1)
In a restricted class of item response theory (IRT) models for polytomous items the unweighted total score has monotone likelihood ratio (MLR) in the latent trait. MLR implies two stochastic ordering (SO) properties, denoted SOM and SOL, which are both weaker than MLR, but very useful for measurement with IRT models. Therefore, these SO properties are investigated for a broader class of IRT models for which the MLR property does not hold. In this study, first a taxonomy is given for nonparametric and parametric models for polytomous items based on the hierarchical relationship between the models. Next, it is investigated which models have the MLR property and which have the SO properties. It is shown that all models in the taxonomy possess the SOM property. However, counterexamples illustrate that many models do not, in general, possess the even more useful SOL property. Hemker's research was supported by the Netherlands Research Council, Grant 575-67-034. Junker's research was supported in part by the National Institutes of Health, Grant CA54852, and by the National Science Foundation, Grant DMS-94.04438.
13.
A person-fit index for polytomous Rasch models, latent class models, and their mixture generalizations
A normally distributed person-fit index is proposed for detecting aberrant response patterns in latent class models and mixture distribution IRT models for dichotomous and polytomous data. This article extends previous work on the null distribution of person-fit indices for the dichotomous Rasch model to a number of models for categorical data. A comparison of two different approaches to handle the skewness of the person-fit index distribution is included. Major parts of this paper were written while the first author worked at the Institute for Science Education, Kiel, Germany. Any opinions expressed in this paper are those of the authors and not necessarily of Educational Testing Service. The results presented in this paper were improved by valuable comments from J. Rost, K. Yamamoto, N.D. Verhelst, E. Bedrick and two anonymous reviewers.
14.
When modeling the relationship between two nominal categorical variables, it is often desirable to include covariates to understand how individuals differ in their response behavior. Typically, however, not all the relevant covariates are available, with the result that the measured variables cannot fully account for the associations between the nominal variables. Under the assumption that the observed and unobserved variables follow a homogeneous conditional Gaussian distribution, this paper proposes RC(M) regression models to decompose the residual associations between the polytomous variables. Based on Goodman's (1979, 1985) RC(M) association model, a distinctive feature of RC(M) regression models is that they facilitate the joint estimation of effects due to manifest and omitted (continuous) variables without requiring numerical integration. The RC(M) regression models are illustrated using data from the High School and Beyond study (Tatsuoka & Lohnes, 1988). This article was accepted for publication when Willem J. Heiser was the Editor of Psychometrika. This research was supported by grants from the National Science Foundation (#SBR96-17510 and #SBR94-09531) and the Bureau of Educational Research at the University of Illinois. We thank Jee-Seon Kim for comments and computational assistance.
15.
Generating items during testing: Psychometric issues and models (total citations: 2; self-citations: 0; citations by others: 2)
Susan E. Embretson 《Psychometrika》1999,64(4):407-433
On-line item generation is becoming increasingly feasible for many cognitive tests. Item generation seemingly conflicts with the well established principle of measuring persons from items with known psychometric properties. This paper examines psychometric principles and models required for measurement from on-line item generation. Three psychometric issues are elaborated for item generation. First, design principles to generate items are considered. A cognitive design system approach is elaborated and then illustrated with an application to a test of abstract reasoning. Second, psychometric models for calibrating generating principles, rather than specific items, are required. Existing item response theory (IRT) models are reviewed and a new IRT model that includes the impact on item discrimination, as well as difficulty, is developed. Third, the impact of item parameter uncertainty on person estimates is considered. Results from both fixed content and adaptive testing are presented. This article is based on the Presidential Address Susan E. Embretson gave on June 26, 1999 at the 1999 Annual Meeting of the Psychometric Society held at the University of Kansas in Lawrence, Kansas. (Editor)
16.
《The British journal of mathematical and statistical psychology》2006,59(2):379-395
This paper proposes two unidimensional item response theory (IRT) models for analysing normative forced‐choice personality items. Both models are derived from a common theoretical framework and arise as a result of different assumptions regarding the mechanism of choice. The simplest mechanism gives rise to the one‐parameter normal‐ogive model. The second mechanism gives rise to a new IRT model, which is closely related to the Coombs–Zinnes probabilistic unfolding model. The second model is compared theoretically to the normal‐ogive model in terms of item characteristic curves and amount of item information. Next, procedures for estimating the respondent and the item parameters in the second model are described. Finally, both models are empirically compared by using two well‐known personality measures.
17.
A two-stage procedure is developed for analyzing structural equation models with continuous and polytomous variables. At the first stage, the maximum likelihood estimates of the thresholds, polychoric covariances and variances, and polyserial covariances are simultaneously obtained with the help of an appropriate transformation that significantly simplifies the computation. An asymptotic covariance matrix of the estimates is also computed. At the second stage, the parameters in the structural covariance model are obtained via the generalized least squares approach. Basic statistical properties of the estimates are derived and some illustrative examples and a small simulation study are reported. This research was supported in part by a research grant DA01070 from the U. S. Public Health Service. We are indebted to several referees and the editor for very valuable comments and suggestions for improvement of this paper. The computing assistance of King-Hong Leung and Man-Lai Tang is also gratefully acknowledged.
18.
This paper discusses the application of a class of Rasch models to situations where test items are grouped into subsets and the common attributes of items within these subsets bring into question the usual assumption of conditional independence. The models are all expressed as particular cases of the random coefficients multinomial logit model developed by Adams and Wilson. This formulation allows a very flexible approach to the specification of alternative models, and makes model testing particularly straightforward. The use of the models is illustrated using item bundles constructed in the framework of the SOLO taxonomy of Biggs and Collis. The work of both authors was supported by fellowships from the National Academy of Education Spencer Fellowship.
19.
Paul R. Rosenbaum 《Psychometrika》1984,49(3):425-435
When item characteristic curves are nondecreasing functions of a latent variable, the conditional or local independence of item responses given the latent variable implies nonnegative conditional covariances between all monotone increasing functions of a set of item responses given any function of the remaining item responses. This general result provides a basis for testing the conditional independence assumption without first specifying a parametric form for the nondecreasing item characteristic curves. The proposed tests are simple, have known asymptotic null distributions, and possess certain optimal properties. In an example, the conditional independence hypothesis is rejected for all possible forms of monotone item characteristic curves. The author acknowledges Paul W. Holland for valuable conversations on the subject of this paper; Henry Braun and Fred Lord for comments at a presentation on this subject which led to improvements in the paper; Carl H. Haag for permission to use the data in §4; Bruce Kaplan for assistance with computing; and two referees for helpful suggestions. Requests for reprints should be sent to Paul R. Rosenbaum
20.
Non‐ignorable missingness item response theory models for choice effects in examinee‐selected items
Chen‐Wei Liu Wen‐Chung Wang 《The British journal of mathematical and statistical psychology》2017,70(3):499-524
Examinee‐selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non‐ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two‐dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non‐ignorable and to determine how to apply the new model to the data collected. Two follow‐up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non‐ignorable missing data were mistakenly treated as ignorable.