The PARELLA model is a probabilistic parallelogram model that can be used for the measurement of latent attitudes or latent preferences. The data analyzed are the dichotomous responses of persons to stimuli, with a one (zero) indicating agreement (disagreement) with the content of the stimulus. The model provides a unidimensional representation of persons and items. The response probabilities are a function of the distance between person and stimulus: the smaller the distance, the larger the probability that a person will agree with the content of the stimulus. An estimation procedure based on expectation maximization and marginal maximum likelihood is developed and the quality of the resulting parameter estimates is evaluated. I gratefully acknowledge Ivo Molenaar and Wijbrandt van Schuur for their advice and encouragement during the course of the investigation, Derk-Jan Kiewiet, who constructed the program for the ML estimator for the person parameter, and Anne Boomsma, Wendy Post, Tom Snijders, and David Thissen for their comments on smaller aspects of the investigation.
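
The distance-based response rule described in this abstract can be illustrated with a small Python sketch. The abstract does not state the response function, so the single-peaked form used here, 1 / (1 + |theta - delta|^(2 * gamma)), is an assumption made for illustration, as are the function name `agree_probability` and the parameter names `theta` (person location), `delta` (stimulus location), and `gamma` (steepness).

```python
def agree_probability(theta, delta, gamma=1.0):
    """Probability of agreeing with a stimulus (illustrative, assumed form).

    theta: person location on the latent continuum (hypothetical name)
    delta: stimulus location on the same continuum (hypothetical name)
    gamma: positive steepness parameter (hypothetical name)
    """
    distance = abs(theta - delta)
    # The probability is 1 when person and stimulus coincide and
    # decreases as the distance between them grows.
    return 1.0 / (1.0 + distance ** (2.0 * gamma))


if __name__ == "__main__":
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(theta, agree_probability(theta, delta=0.0))
```

Under this assumed form the curve is symmetric around the stimulus location, which is what distinguishes an unfolding (proximity) model from a cumulative model such as the Rasch model.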

This article proposes a general mixture item response theory (IRT) framework that allows for classes of persons to differ with respect to the type of processes underlying the item responses. Through the use of mixture models, nonnested IRT models with different structures can be estimated for different classes, and class membership can be estimated for each person in the sample. If researchers are able to provide competing measurement models, this mixture IRT framework may help them deal with some violations of measurement invariance. To illustrate this approach, we consider a two-class mixture model, where a person’s responses to Likert-scale items containing a neutral middle category are either modeled using a generalized partial credit model, or through an IRTree model. In the first model, the middle category (“neither agree nor disagree”) is taken to be qualitatively similar to the other categories, and is taken to provide information about the person’s endorsement. In the second model, the middle category is taken to be qualitatively different and to reflect a nonresponse choice, which is modeled using an additional latent variable that captures a person’s willingness to respond. The mixture model is studied using simulation studies and is applied to an empirical example.

It is common practice in IRT to consider items as fixed and persons as random. Both continuous and categorical person parameters are most often random variables, whereas for items only continuous parameters are used and they are commonly of the fixed type, although exceptions occur. It is shown in the present article that random item parameters make sense theoretically, and that in practice the random item approach is promising for handling several issues, such as the measurement of persons, the explanation of item difficulties, and troubleshooting with respect to DIF. In correspondence with these issues, three parts are included. All three rely on the Rasch model as the simplest model to study, and the same data set is used for all applications. First, it is shown that the Rasch model with fixed persons and random items is an interesting measurement model, both in theory and for its goodness of fit. Second, the linear logistic test model with an error term is introduced, so that the explanation of the item difficulties based on the item properties does not need to be perfect. Finally, two more models are presented: the random item profile model (RIP) and the random item mixture model (RIM). In the RIP, DIF is not considered a discrete phenomenon, and when a robust regression approach based on the RIP difficulties is applied, quite good DIF identification results are obtained. In the RIM, no prior anchor sets are defined, but instead a latent DIF class of items is used, so that posterior anchoring is realized (anchoring based on the item mixture). It is shown that both approaches are promising for the identification of DIF.

This article describes a generalized longitudinal mixture item response theory (IRT) model that allows for detecting latent group differences in item response data obtained from electronic learning (e-learning) environments or other learning environments that result in large numbers of items. The described model can be viewed as a combination of a longitudinal Rasch model, a mixture Rasch model, and a random-item IRT model, and it includes some features of the explanatory IRT modeling framework. The model assumes the possible presence of latent classes in item response patterns, due to initial person-level differences before learning takes place, to latent class-specific learning trajectories, or to a combination of both. Moreover, it allows for differential item functioning over the classes. A Bayesian model estimation procedure is described, and the results of a simulation study are presented that indicate that the parameters are recovered well, particularly for conditions with large item sample sizes. The model is also illustrated with an empirical sample data set from a Web-based e-learning environment.

As a method to ascertain person and item effects in psycholinguistics, a generalized linear mixed effect model (GLMM) with crossed random effects has met limitations in handling serial dependence across persons and items. This paper presents an autoregressive GLMM with crossed random effects that accounts for variability in lag effects across persons and items. The model is shown to be applicable to intensive binary time series eye-tracking data when researchers are interested in detecting experimental condition effects while controlling for previous responses. In addition, a simulation study shows that ignoring lag effects can lead to biased estimates and underestimated standard errors for the experimental condition effects.

A Rasch model for partial credit scoring
A unidimensional latent trait model for responses scored in two or more ordered categories is developed. This “Partial Credit” model is a member of the family of latent trait models which share the property of parameter separability and so permit “specifically objective” comparisons of persons and items. The model can be viewed as an extension of Andrich's Rating Scale model to situations in which ordered response alternatives are free to vary in number and structure from item to item. The difference between the parameters in this model and the “category boundaries” in Samejima's Graded Response model is demonstrated. An unconditional maximum likelihood procedure for estimating the model parameters is developed. Preparation of this paper was supported by grants from the Spencer Foundation and the National Institute for Justice. I would like to thank Professor Benjamin D. Wright of the University of Chicago for his very kind help with the various drafts of this paper.
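
As a minimal sketch of how category probabilities arise in a partial credit model, the following Python function assumes the usual step-difficulty parameterization, in which the probability of scoring in category x is proportional to the exponential of the sum of (theta - delta_j) over the first x steps. The function name `pcm_probabilities` and the argument names are hypothetical and not taken from the abstract.

```python
import math


def pcm_probabilities(theta, deltas):
    """Category probabilities for one item under a partial credit model.

    theta:  person parameter (hypothetical name)
    deltas: step difficulties delta_1..delta_m for this item; the item
            then has m + 1 ordered categories scored 0..m
    """
    # Cumulative sums of (theta - delta_j); category 0 uses the empty sum 0.
    cumulative = [0.0]
    for delta in deltas:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(cumulative)
    weights = [math.exp(value - largest) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # Items may differ in their number of categories.
    print(pcm_probabilities(0.5, [-1.0, 0.0, 1.5]))  # four categories
    print(pcm_probabilities(0.5, [0.2]))             # two categories
```

Because `deltas` can have a different length for every item, the sketch reflects the point made in the abstract that the number and structure of response alternatives may vary from item to item. With a single step, the function reduces to the dichotomous Rasch model.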

A multinormal partial credit model for factor analysis of polytomously scored items with ordered response categories is derived using an extension of the Dutch Identity (Holland in Psychometrika 55:5–18, 1990). In the model, latent variables are assumed to have a multivariate normal distribution conditional on unweighted sums of item scores, which are sufficient statistics. Attention is paid to maximum likelihood estimation of item parameters, multivariate moments of latent variables, and person parameters. It is shown that the maximum likelihood estimates can be found without the use of numerical integration techniques. More general models are discussed which can be used for testing the model, and it is shown how models with different numbers of latent variables can be tested against each other. In addition, multi-group extensions are proposed, which can be used for testing both measurement invariance and latent population differences. Models and procedures discussed are demonstrated in an empirical data example.

A matrix of the responses of persons (rows) to items (columns) can be analyzed for persons as the dual of the usual analyses for items. In personality tests, the stability of items varies with mean distance between the item and the points for persons, and the dual holds for persons. For items, the item-test correlation varies somewhat with stability and with the frequency of appropriate response processes (i.e., subjects answering the item as the experimenter intended them to), but stability and appropriateness are not correlated. For persons, the person-group correlation is independent of stability and appropriateness, these latter variables being correlated. Thus the pattern of relationships among item indices is different from that for the dual indices for people. Persons being complex, they approach personality test items in diverse ways about which little is known today.

Rasch models are characterised by sufficient statistics for all parameters. In the Rasch unidimensional model for two ordered categories, the parameterisation of the person and item is symmetrical and it is readily established that the total scores of a person and item are sufficient statistics for their respective parameters. In contrast, in the unidimensional polytomous Rasch model for more than two ordered categories, the parameterisation is not symmetrical. Specifically, each item has a vector of item parameters, one for each category, and each person only one person parameter. In addition, different items can have different numbers of categories and, therefore, different numbers of parameters. The sufficient statistic for the parameters of an item is itself a vector. In estimating the person parameters in presently available software, these sufficient statistics are not used to condition out the item parameters. This paper derives a conditional, pairwise, pseudo-likelihood and constructs estimates of the parameters of any number of persons which are independent of all item parameters and of the maximum scores of all items. It also shows that these estimates are consistent. Although Rasch’s original work began with equating tests using test scores, and not with items of a test, the polytomous Rasch model has not been applied in this way. Operationally, this is because the current approaches, in which item parameters are estimated first, cannot handle test data where there may be many scores with zero frequencies. A small simulation study shows that, when using the estimation equations derived in this paper, such a property of the data is no impediment to the application of the model at the level of tests. This opens up the possibility of using the polytomous Rasch model directly in equating test scores.
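
The sufficiency claim for the dichotomous case can be made concrete with a short derivation. The notation is introduced here for illustration and does not come from the abstract: \theta_p is the person parameter, \beta_i the item parameter, x_{pi} the response of person p to item i, k the number of items, and r_p the person's total score. Local independence across items is assumed.

```latex
% Dichotomous Rasch model for a single response
P(X_{pi} = x_{pi} \mid \theta_p, \beta_i)
  = \frac{\exp\{x_{pi}(\theta_p - \beta_i)\}}{1 + \exp(\theta_p - \beta_i)},
  \qquad x_{pi} \in \{0, 1\}.

% Probability of the full response vector of person p
P(\mathbf{x}_p \mid \theta_p, \boldsymbol{\beta})
  = \frac{\exp\bigl(r_p \theta_p - \sum_{i=1}^{k} x_{pi} \beta_i\bigr)}
         {\prod_{i=1}^{k} \{1 + \exp(\theta_p - \beta_i)\}},
  \qquad r_p = \sum_{i=1}^{k} x_{pi}.
```

The person parameter enters the second expression only through r_p, so by the factorization criterion the total score is sufficient for \theta_p. The same argument applied to a column of the data matrix gives the item total score as the sufficient statistic for \beta_i, which is the symmetry the abstract refers to.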

The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.

This study linked nonlinear profile analysis (NPA) of dichotomous responses with an existing family of item response theory models and generalized latent variable models (GLVM). The NPA method offers several benefits over previous internal profile analysis methods: (a) NPA is estimated with maximum likelihood in a GLVM framework rather than relying on the choice of different dissimilarity measures that produce different results, (b) item and person parameters are computed during the same estimation step with an appropriate distribution for dichotomous variables, (c) the model estimates profile coordinate standard errors, and (d) additional individual-level variables can be included to model relationships with the profile parameters. An application examined experimental differences in topographic map comprehension among 288 subjects. The model produced a measure of overall test performance or comprehension in addition to pattern variables that measured the correspondence between subject response profiles and an item difficulty profile and an item-discrimination profile. The findings suggested that subjects who used 3-dimensional maps tended to correctly answer more items in addition to correctly answering items that were more discriminating indicators of map comprehension. The NPA analysis was also compared with results from a multidimensional item response theory model.

Generating items during testing: Psychometric issues and models
On-line item generation is becoming increasingly feasible for many cognitive tests. Item generation seemingly conflicts with the well established principle of measuring persons from items with known psychometric properties. This paper examines psychometric principles and models required for measurement from on-line item generation. Three psychometric issues are elaborated for item generation. First, design principles to generate items are considered. A cognitive design system approach is elaborated and then illustrated with an application to a test of abstract reasoning. Second, psychometric models for calibrating generating principles, rather than specific items, are required. Existing item response theory (IRT) models are reviewed and a new IRT model that includes the impact on item discrimination, as well as difficulty, is developed. Third, the impact of item parameter uncertainty on person estimates is considered. Results from both fixed content and adaptive testing are presented. Editor's note: This article is based on the Presidential Address Susan E. Embretson gave on June 26, 1999 at the 1999 Annual Meeting of the Psychometric Society held at the University of Kansas in Lawrence, Kansas.

The many null distributions of person fit indices
This paper deals with the situation of an investigator who has collected the scores of n persons on a set of k dichotomous items, and wants to investigate whether the answers of all respondents are compatible with the one-parameter logistic test model of Rasch. Contrary to the standard analysis of the Rasch model, where all persons are kept in the analysis and badly fitting items may be removed, this paper studies the alternative model in which a small minority of persons has an answer strategy not described by the Rasch model. Such persons are called anomalous or aberrant. From the response vectors consisting of k symbols each equal to 0 or 1, it is desired to classify each respondent as either anomalous or as conforming to the model. As this model is probabilistic, such a classification will possibly involve false positives and false negatives. Both for the Rasch model and for other item response models, the literature contains several proposals for a person fit index, which expresses for each individual the plausibility that his/her behavior follows the model. The present paper argues that such indices can only provide a satisfactory solution to the classification problem if their statistical distribution is known under the null hypothesis that all persons answer according to the model. This distribution, however, turns out to be rather different for different values of the person's latent trait value. This value will be called ability parameter, although our results are equally valid for Rasch scales measuring other attributes. As the true ability parameter is unknown, one can only use its estimate in order to obtain an estimated person fit value and an estimated null hypothesis distribution. The paper describes three specifications for the latter: assuming that the true ability equals its estimate, integrating across the ability distribution assumed for the population, and conditioning on the total score, which is in the Rasch model the sufficient statistic for the ability parameter. Classification rules for aberrance will be worked out for each of the three specifications. Depending on test length, item parameters and desired accuracy, they are based on the exact distribution, its Monte Carlo estimate and a new and promising approximation based on the moments of the person fit statistic. Results for the likelihood person fit statistic are given in detail; the methods could also be applied to other fit statistics. A comparison of the three specifications results in the recommendation to condition on the total score, as this avoids some problems of interpretation that affect the other two specifications. The authors express their gratitude to the reviewers and to many colleagues for comments on an earlier version.
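
The idea of a likelihood-based person fit statistic with a Monte Carlo null distribution can be sketched in Python. This sketch covers only the first of the three specifications named in the abstract (treating the ability value as known), not the conditional approach the abstract recommends. The function names, the item difficulties, and the number of simulations are all hypothetical choices made for illustration, and the statistic is taken here to be the plain log-likelihood of the response vector under the Rasch model.

```python
import math
import random


def rasch_p(theta, beta):
    """Probability of a correct (1) response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def loglik_fit(responses, theta, betas):
    """Log-likelihood of a 0/1 response vector, used as a person fit index."""
    total = 0.0
    for x, beta in zip(responses, betas):
        p = rasch_p(theta, beta)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


def monte_carlo_p_value(responses, theta, betas, n_sim=2000, seed=0):
    """Share of simulated model-conforming vectors that fit no better.

    Small values suggest an aberrant response pattern. The ability theta
    is treated as known, which is a simplifying assumption.
    """
    rng = random.Random(seed)
    observed = loglik_fit(responses, theta, betas)
    at_or_below = 0
    for _ in range(n_sim):
        simulated = [1 if rng.random() < rasch_p(theta, b) else 0 for b in betas]
        if loglik_fit(simulated, theta, betas) <= observed:
            at_or_below += 1
    return at_or_below / n_sim


if __name__ == "__main__":
    betas = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical item difficulties
    print(monte_carlo_p_value([1, 1, 1, 0, 0], 0.0, betas))  # easy items right
    print(monte_carlo_p_value([0, 0, 1, 1, 1], 0.0, betas))  # easy items wrong
```

The second pattern, which fails the easy items and passes the hard ones, receives a much lower log-likelihood and therefore a much smaller simulated p-value than the first. Rerunning the sketch with a different `theta` shows how the null distribution shifts with ability, which is the dependence the abstract highlights.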

Two assumptions that are relevant to many applications using item response theory are the assumptions of monotonicity (M) and invariant item ordering (IIO). A latent class model is proposed for ordinal items with inequality constraints on the class-specific item means. This model is used as a tool for testing for violations of M and IIO. A Gibbs sampling scheme is used for estimating the model parameters. It is shown that the deviance information criterion can be used as an overall test of M and IIO, while posterior predictive checks can be used to test these assumptions at the item level. A real data application illustrates a model-fitting strategy for detecting items that violate M and IIO.

Lord and Wingersky have developed a method for computing the asymptotic variance-covariance matrix of maximum likelihood estimates for item and person parameters under some restrictions on the estimates which are needed in order to fix the latent scale. The method is tedious, but can be simplified for the Rasch model when one is only interested in the item parameters. This is demonstrated here under a suitable restriction on the item parameter estimates.

This paper presents an explanatory multidimensional multilevel random item response model and its application to reading data with multilevel item structure. The model includes multilevel random item parameters that allow consideration of variability in item parameters at both item and item group levels. Item-level random item parameters were included to model unexplained variance remaining when item-related covariates were used to explain variation in item difficulties. Item group-level random item parameters were included to model dependency in item responses among items having the same item stem. Using the model, this study examined the dimensionality of a person’s word knowledge, termed lexical representation, and how aspects of morphological knowledge contributed to lexical representations for different persons, items, and item groups.

A nonlinear mixed model framework for item response theory
Mixed models take the dependency between observations based on the same cluster into account by introducing one or more random effects. Common item response theory (IRT) models introduce latent person variables to model the dependence between responses of the same participant. Assuming a distribution for the latent variables, these IRT models are formally equivalent to nonlinear mixed models. It is shown how a variety of IRT models can be formulated as particular instances of nonlinear mixed models. The unifying framework offers the advantage that relations between different IRT models become explicit and that it is rather straightforward to see how existing IRT models can be adapted and extended. The approach is illustrated with a self-report study on anger.
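
The equivalence stated in this abstract can be illustrated with the simplest case, a Rasch model with a normally distributed person effect. The notation is an assumption made here for illustration: Y_{pi} is the response of person p to item i, \theta_p the random person effect, \beta_i the fixed item effect, N the number of persons, and I the number of items. The `\operatorname` command requires the amsmath package.

```latex
% Random-intercept logistic model: the Rasch model with a random person effect
\operatorname{logit} P(Y_{pi} = 1 \mid \theta_p) = \theta_p - \beta_i,
\qquad \theta_p \sim N(0, \sigma_\theta^2).

% Marginal likelihood, with the person effect integrated out
L(\boldsymbol{\beta}, \sigma_\theta^2)
  = \prod_{p=1}^{N} \int \prod_{i=1}^{I}
      \pi_{pi}^{\,y_{pi}} (1 - \pi_{pi})^{1 - y_{pi}}
      \, \phi(\theta_p; 0, \sigma_\theta^2) \, d\theta_p,
\qquad
\pi_{pi} = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)}.
```

Read this way, persons are the clusters, the item responses are the repeated observations within a cluster, and \theta_p is the random effect that induces the dependence between responses of the same participant. Models with item discriminations or guessing parameters make the predictor nonlinear in the parameters, which is why the general framework is a nonlinear rather than a generalized linear mixed model.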

The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify latent traits probed by test items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and each latent trait has an item that is only related to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.
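
For orientation, the two ingredients named in this abstract can be written out. The notation is assumed here and may differ from the paper's: y_{ij} is the response of person i to item j, \boldsymbol{\theta}_i the vector of latent traits, \mathbf{a}_j the item's loading (discrimination) vector, b_j its intercept, \hat{L} the maximized observed-data likelihood, d the number of free parameters, and N the number of persons.

```latex
% Multidimensional two-parameter logistic (M2PL) item response function
P(y_{ij} = 1 \mid \boldsymbol{\theta}_i)
  = \frac{\exp(\mathbf{a}_j^{\top} \boldsymbol{\theta}_i + b_j)}
         {1 + \exp(\mathbf{a}_j^{\top} \boldsymbol{\theta}_i + b_j)}.

% Bayesian information criterion for a candidate model
\mathrm{BIC} = -2 \log \hat{L} + d \log N.
```

In this framing, latent variable selection amounts to deciding which entries of each \mathbf{a}_j are nonzero. Every candidate pattern of zeros defines a model with its own d, and the selected model is the one with the smallest BIC.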

Jin, Ick Hoon; Jeon, Minjeong. Psychometrika, 2019, 84(1): 236-260

Item response theory (IRT) is one of the most widely utilized tools for item response analysis; however, local item and person independence, which is a critical assumption for IRT, is often violated in real testing situations. In this article, we propose a new type of analytical approach for item response data that does not require standard local independence assumptions. By adapting a latent space joint modeling approach, our proposed model can estimate pairwise distances to represent the item and person dependence structures, from which item and person clusters in latent spaces can be identified. We provide an empirical data analysis to illustrate an application of the proposed method. A simulation study is provided to evaluate the performance of the proposed method in comparison with existing methods.


In this paper it is shown that under the random effects generalized partial credit model for the measurement of a single latent variable by a set of polytomously scored items, the joint marginal probability distribution of the item scores has a closed-form expression in terms of item category location parameters, parameters that characterize the distribution of the latent variable in the subpopulation of examinees with a zero score on all items, and item-scaling parameters. Due to this closed-form expression, all parameters of the random effects generalized partial credit model can be estimated using marginal maximum likelihood estimation without assuming a particular distribution of the latent variable in the population of examinees and without using numerical integration. Also due to this closed-form expression, new special cases of the random effects generalized partial credit model can be identified. In addition to these new special cases, a slightly more general model than the random effects generalized partial credit model is presented. This slightly more general model is called the extended generalized partial credit model. Attention is paid to maximum likelihood estimation of the parameters of the extended generalized partial credit model and to assessing the goodness of fit of the model using generalized likelihood ratio tests. Attention is also paid to person parameter estimation under the random effects generalized partial credit model. It is shown that expected a posteriori estimates can be obtained for all possible score patterns. A simulation study is carried out to show the usefulness of the proposed models compared to the standard models that assume normality of the latent variable in the population of examinees. In an empirical example, some of the procedures proposed are demonstrated.
