Similar literature
 Found 20 similar articles; search took 15 ms
1.
Jesper Tijmstra, Maria Bolsinova. Psychometrika, 2019, 84(3): 846-869

The assumption of latent monotonicity is made by all common parametric and nonparametric polytomous item response theory models and is crucial for establishing an ordinal level of measurement of the item score. Three forms of latent monotonicity can be distinguished: monotonicity of the cumulative probabilities, of the continuation ratios, and of the adjacent-category ratios. Observable consequences of these different forms of latent monotonicity are derived, and Bayes factor methods for testing these consequences are proposed. These methods allow for the quantification of the evidence both in favor and against the tested property. Both item-level and category-level Bayes factors are considered, and their performance is evaluated using a simulation study. The methods are applied to an empirical example consisting of a 10-item Likert scale to investigate whether a polytomous item scoring rule results in item scores that are of ordinal level measurement.
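
For reference, the three forms of latent monotonicity named in this abstract are usually written as below. This is a sketch in standard notation rather than a quotation from the article: the item score X with categories 0, ..., m and the latent trait θ are assumed symbols.

```latex
% Assumed notation: item score X \in \{0,\dots,m\}, latent trait \theta.
% Each form requires the stated probability to be nondecreasing in \theta
% for every x = 1,\dots,m.
\begin{align*}
\text{cumulative probabilities:} \quad & P(X \ge x \mid \theta) \\
\text{continuation ratios:} \quad & P(X \ge x \mid X \ge x-1,\ \theta) \\
\text{adjacent-category ratios:} \quad & P(X = x \mid X \in \{x-1, x\},\ \theta)
\end{align*}
```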


2.
In a latent class IRT model in which the latent classes are ordered on one dimension, the class specific response probabilities are subject to inequality constraints. The number of these inequality constraints increases dramatically with the number of response categories per item, if assumptions like monotonicity or double monotonicity of the cumulative category response functions are postulated. A Markov chain Monte Carlo method, the Gibbs sampler, can sample from the multivariate posterior distribution of the parameters under the constraints. Bayesian model selection can be done by posterior predictive checks and Bayes factors. A simulation study is done to evaluate results of the application of these methods to ordered latent class models in three realistic situations. Also, an example of the presented methods is given for existing data with polytomous items. It can be concluded that the Bayesian estimation procedure can handle the inequality constraints on the parameters very well. However, the application of Bayesian model selection methods requires more research.

3.
It is shown that a unidimensional monotone latent variable model for binary items implies a restriction on the relative sizes of item correlations: The negative logarithm of the correlations satisfies the triangle inequality. This inequality is not implied by the condition that the correlations are nonnegative, the criterion that coefficient H exceeds 0.30, or manifest monotonicity. The inequality implies both a lower bound and an upper bound for each correlation between two items, based on the correlations of those two items with every possible third item. It is discussed how this can be used in Mokken’s (A theory and procedure of scale-analysis, Mouton, The Hague, 1971) scale analysis.
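
The restriction in this abstract can be checked directly on a correlation matrix. The sketch below is illustrative only: the function name `triangle_violations` and the example matrices are hypothetical, and it assumes all inter-item correlations are strictly positive. Under that assumption, the triangle inequality on the negative logarithms, -log r_ij <= -log r_ik - log r_kj, is equivalent to r_ij >= r_ik * r_kj.

```python
import itertools


def triangle_violations(corr, tol=1e-12):
    """List triples (i, j, k) where r_ij < r_ik * r_kj.

    For strictly positive correlations this is the same as a violation of
    -log r_ij <= -log r_ik + -log r_kj (the triangle inequality).
    `corr` is a square, symmetric list of lists; positivity is assumed.
    """
    n = len(corr)
    violations = []
    for i, j in itertools.combinations(range(n), 2):
        for k in range(n):
            if k in (i, j):
                continue
            # Lower bound on r_ij implied by third item k.
            if corr[i][j] + tol < corr[i][k] * corr[k][j]:
                violations.append((i, j, k))
    return violations


# Hypothetical example matrices (not data from the article).
consistent = [[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]]
inconsistent = [[1.0, 0.1, 0.8],
                [0.1, 1.0, 0.8],
                [0.8, 0.8, 1.0]]

print(triangle_violations(consistent))
print(triangle_violations(inconsistent))
```

In the second matrix, items 0 and 1 both correlate 0.8 with item 2 but only 0.1 with each other, which falls below the lower bound 0.8 × 0.8 = 0.64. The upper bound mentioned in the abstract follows from the same check with the roles of the items exchanged.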

4.
Usually, methods for detection of differential item functioning (DIF) compare the functioning of items across manifest groups. However, the manifest groups with respect to which the items function differentially may not necessarily coincide with the true source of the bias. It is expected that DIF detection under a model that includes a latent DIF variable is more sensitive to this source of bias. In a simulation study, it is shown that a mixture item response theory model, which includes a latent grouping variable, performs better in identifying DIF items than DIF detection methods using manifest variables only. The difference between manifest and latent DIF detection increases as the correlation between the manifest variable and the true source of the DIF becomes smaller. Different sample sizes, relative group sizes, and significance levels are studied. Finally, an empirical example demonstrates the detection of heterogeneity in a minority sample using a latent grouping variable. Manifest and latent DIF detection methods are applied to a Vocabulary test of the General Aptitude Test Battery (GATB).

5.
In this paper it will be shown that a certain class of constrained latent class models may be interpreted as a special case of nonparametric multidimensional item response models. The parameters of this latent class model will be estimated using an application of the Gibbs sampler. It will be illustrated that the Gibbs sampler is an excellent tool if inequality constraints have to be taken into consideration when making inferences. Model fit will be investigated using posterior predictive checks. Checks for manifest monotonicity, the agreement between the observed and expected conditional association structure, marginal local homogeneity, and the number of latent classes will be presented. This paper is supported by grant S40-645 of the Dutch Organization for Scientific Research (NWO).

6.
This article describes a generalized longitudinal mixture item response theory (IRT) model that allows for detecting latent group differences in item response data obtained from electronic learning (e-learning) environments or other learning environments that result in large numbers of items. The described model can be viewed as a combination of a longitudinal Rasch model, a mixture Rasch model, and a random-item IRT model, and it includes some features of the explanatory IRT modeling framework. The model assumes the possible presence of latent classes in item response patterns, due to initial person-level differences before learning takes place, to latent class-specific learning trajectories, or to a combination of both. Moreover, it allows for differential item functioning over the classes. A Bayesian model estimation procedure is described, and the results of a simulation study are presented that indicate that the parameters are recovered well, particularly for conditions with large item sample sizes. The model is also illustrated with an empirical sample data set from a Web-based e-learning environment.

7.
A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.
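
The monotone relation between τ and θ can be illustrated numerically for one of the five models, the graded response model. The sketch below is not the article's proof: it assumes the logistic form of the model, positive discrimination parameters, and made-up item parameters, and the helper names `grm_expected_score` and `true_score` are hypothetical. It relies on the identity that the expected score of an item scored 0, ..., m equals the sum of its cumulative probabilities P(X >= k | θ) for k = 1, ..., m.

```python
import math


def grm_expected_score(theta, a, thresholds):
    """Expected item score under a logistic graded response model.

    P(X >= k | theta) = 1 / (1 + exp(-a * (theta - b_k))), and the expected
    score is the sum of these cumulative probabilities over the thresholds.
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds)


def true_score(theta, items):
    """Test-level true score tau(theta): sum of expected item scores."""
    return sum(grm_expected_score(theta, a, bs) for a, bs in items)


# Hypothetical item parameters: (discrimination a > 0, ordered thresholds b_k).
items = [
    (1.2, [-1.0, 0.0, 1.0]),
    (0.8, [-0.5, 0.5]),
    (1.5, [-1.5, -0.5, 0.5, 1.5]),
]

thetas = [t / 2 for t in range(-8, 9)]          # grid from -4.0 to 4.0
taus = [true_score(theta, items) for theta in thetas]
for theta, tau in zip(thetas, taus):
    print(f"theta = {theta:+.1f}  tau = {tau:.3f}")
```

Because every cumulative probability increases strictly in θ when a > 0, τ increases strictly along the grid and stays between 0 and the maximum total score (here 3 + 2 + 4 = 9).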

8.
A structural multilevel model is presented where some of the variables cannot be observed directly but are measured using tests or questionnaires. Observed dichotomous or ordinal polytomous response data serve to measure the latent variables using an item response theory model. The latent variables can be defined at any level of the multilevel model. A Bayesian procedure, Markov chain Monte Carlo (MCMC), to estimate all parameters simultaneously is presented. It is shown that certain model checks and model comparisons can be done using the MCMC output. The techniques are illustrated using a simulation study and an application involving students' achievements on a mathematics test and test results regarding management characteristics of teachers and principals.

9.
Lihua Yao. Psychometrika, 2012, 77(3): 495-523
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure of item pools, the population distribution of the simulees, the number of items selected, and the content area. The existing procedures such as Volume (Segall in Psychometrika, 61:331–354, 1996), Kullback–Leibler information (Veldkamp & van der Linden in Psychometrika 67:575–588, 2002), Minimize the error variance of the linear combination (van der Linden in J. Educ. Behav. Stat. 24:398–412, 1999), and Minimum Angle (Reckase in Multidimensional item response theory, Springer, New York, 2009) are compared to a new procedure, Minimize the error variance of the composite score with the optimized weight, proposed for the first time in this study. The intent is to find an item selection procedure that yields higher precisions for both the domain and composite abilities and a higher percentage of selected items from the item pool. The comparison is performed by examining the absolute bias, correlation, test reliability, time used, and item usage. Three sets of item pools are used with the item parameters estimated from real live CAT data. Results show that Volume and Minimum Angle performed similarly, balancing information for all content areas, while the other three procedures performed similarly, with a high precision for both domain and overall scores when selecting items with the required number of items for each domain. The new item selection procedure has the highest percentage of item usage. Moreover, for the overall score, it produces similar or even better results compared to those from the method that selects items favoring the general dimension using the general model (Segall in Psychometrika 66:79–97, 2001); the general dimension method has low precision for the domain scores. In addition to the simulation study, the mathematical theories for certain procedures are derived. The theories are confirmed by the simulation applications.

10.
Two assumptions that are relevant to many applications using item response theory are the assumptions of monotonicity (M) and invariant item ordering (IIO). A latent class model is proposed for ordinal items with inequality constraints on the class-specific item means. This model is used as a tool for testing for violations of M and IIO. A Gibbs sampling scheme is used for estimating the model parameters. It is shown that the deviance information criterion can be used as an overall test of M and IIO, while posterior predictive checks can be used to test these assumptions at the item level. A real data application illustrates a model-fitting strategy for detecting items that violate M and IIO.

11.
Item responses that do not fit an item response theory (IRT) model may cause the latent trait value to be inaccurately estimated. In the past two decades several statistics have been proposed that can be used to identify nonfitting item score patterns. These statistics all yield scalar values. Here, the use of the person response function (PRF) for identifying nonfitting item score patterns was investigated. The PRF is a function and can be used for diagnostic purposes. First, the PRF is defined in a class of IRT models that imply an invariant item ordering. Second, a person-fit method proposed by Trabin & Weiss (1983) is reformulated in a nonparametric IRT context assuming invariant item ordering, and statistical theory proposed by Rosenbaum (1987a) is adapted to test locally whether a PRF is nonincreasing. Third, a simulation study was conducted to compare the use of the PRF with the person-fit statistic ZU3. It is concluded that the PRF can be used as a diagnostic tool in person-fit research. The authors are grateful to Coen A. Bernaards for preparing the figures used in this article, and to Wilco H.M. Emons for checking the calculations.

12.
The identifiability of item response models with nonparametrically specified item characteristic curves is considered. Strict identifiability is achieved, with a fixed latent trait distribution, when only a single set of item characteristic curves can possibly generate the manifest distribution of the item responses. When item characteristic curves belong to a very general class, this property cannot be achieved. However, for assessments with many items, it is shown that all models for the manifest distribution have item characteristic curves that are very near one another and pointwise differences between them converge to zero at all values of the latent trait as the number of items increases. An upper bound for the rate at which this convergence takes place is given. The main result provides theoretical support to the practice of nonparametric item response modeling, by showing that models for long assessments have the property of asymptotic identifiability. The research was partially supported by the National Institute of Health grant R01 CA81068-01.

13.
In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic classification models (DCMs). DCMs are a newer class of psychometric models that are designed to classify examinees according to levels of categorical latent traits. We examined the invariance property for general DCMs using the log-linear cognitive diagnosis model (LCDM) framework. We conducted a simulation study to examine the degree to which theoretical invariance of LCDM classifications and item parameter estimates can be observed under various sample and test characteristics. Results illustrated that LCDM classifications and item parameter estimates show clear invariance when adequate model data fit is present. To demonstrate the implications of this important property, we conducted additional analyses to show that using pre-calibrated tests to classify examinees provided consistent classifications across calibration samples with varying mastery profile distributions and across tests with varying difficulties.

14.
Ick Hoon Jin, Minjeong Jeon. Psychometrika, 2019, 84(1): 236-260

Item response theory (IRT) is one of the most widely utilized tools for item response analysis; however, local item and person independence, which is a critical assumption for IRT, is often violated in real testing situations. In this article, we propose a new type of analytical approach for item response data that does not require standard local independence assumptions. By adapting a latent space joint modeling approach, our proposed model can estimate pairwise distances to represent the item and person dependence structures, from which item and person clusters in latent spaces can be identified. We provide an empirical data analysis to illustrate an application of the proposed method. A simulation study is provided to evaluate the performance of the proposed method in comparison with existing methods.


15.
Differential item functioning (DIF), referring to between-group variation in item characteristics above and beyond the group-level disparity in the latent variable of interest, has long been regarded as an important item-level diagnostic. The presence of DIF impairs the fit of the single-group item response model being used, and calls for either model modification or item deletion in practice, depending on the mode of analysis. Methods for testing DIF with continuous covariates, rather than categorical grouping variables, have been developed; however, they are restrictive in parametric forms, and thus are not sufficiently flexible to describe complex interaction among latent variables and covariates. In the current study, we formulate the probability of endorsing each test item as a general bivariate function of a unidimensional latent trait and a single covariate, which is then approximated by a two-dimensional smoothing spline. The accuracy and precision of the proposed procedure are evaluated via Monte Carlo simulations. If anchor items are available, we propose an extended model that simultaneously estimates item characteristic functions (ICFs) for anchor items, ICFs conditional on the covariate for non-anchor items, and the latent variable density conditional on the covariate, all using regression splines. A permutation DIF test is developed, and its performance is compared to the conventional parametric approach in a simulation study. We also illustrate the proposed semiparametric DIF testing procedure with an empirical example.

16.
The sum score is often used to order respondents on the latent trait measured by the test. Therefore, it is desirable that under the chosen model the sum score stochastically orders the latent trait. It is known that unlike dichotomous item response theory (IRT) models, most polytomous IRT models do not imply stochastic ordering. It is unknown, however, (1) whether stochastic ordering is often or rarely violated and (2) whether violations yield a serious problem for practical data analysis. These are the central issues of this paper. First, some unanswered questions that pertain to polytomous IRT models implying stochastic ordering were investigated. Second, simulation studies were conducted to evaluate stochastic ordering in practical situations. It was found that for most polytomous IRT models that do not imply stochastic ordering, the sum score can be used safely to order respondents on the latent trait. The author would like to thank Klaas Sijtsma for commenting on earlier drafts of this paper.

17.
The purpose of this paper is to introduce a new method for fitting item response theory models with the latent population distribution estimated from the data using splines. A spline-based density estimation system provides a flexible alternative to existing procedures that use a normal distribution, or a different functional form, for the population distribution. A simulation study shows that the new procedure is feasible in practice, and that when the latent distribution is not well approximated as normal, two-parameter logistic (2PL) item parameter estimates and expected a posteriori scores (EAPs) can be improved over what they would be with the normal model. An example with real data compares the new method and the extant empirical histogram approach.

18.
An item response theory (IRT) model is used as a measurement error model for the dependent variable of a multilevel model. The dependent variable is latent but can be measured indirectly by using tests or questionnaires. The advantage of using latent scores as dependent variables of a multilevel model is that it offers the possibility of modelling response variation and measurement error and separating the influence of item difficulty and ability level. The two‐parameter normal ogive model is used for the IRT model. It is shown that the stochastic EM algorithm can be used to estimate the parameters which are close to the maximum likelihood estimates. This algorithm is easily implemented. The estimation procedure will be compared to an implementation of the Gibbs sampler in a Bayesian framework. Examples using real data are given.

19.
Multi‐group latent growth modelling in the structural equation modelling framework has been widely utilized for examining differences in growth trajectories across multiple manifest groups. Despite its usefulness, the traditional maximum likelihood estimation for multi‐group latent growth modelling is not feasible when one of the groups has no response at any given data collection point, or when all participants within a group have the same response at one of the time points. In other words, multi‐group latent growth modelling requires a complete covariance structure for each observed group. The primary purpose of the present study is to show how to circumvent these data problems by developing a simple but creative approach using an existing estimation procedure for growth mixture modelling. A Monte Carlo simulation study was carried out to see whether the modified estimation approach provided tangible results and to see how these results were comparable to the standard multi‐group results. The proposed approach produced results that were valid and reliable under the mentioned problematic data conditions. We also present a real data example and demonstrate that the proposed estimation approach can be used for the chi‐square difference test to check various types of measurement invariance as conducted in a standard multi‐group analysis.

20.
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response patterns. It also yielded more DIF items with larger effect sizes and more consistent item response patterns by substantive aspects (e.g., reading comprehension processes and cognitive complexity of items). Based on our findings, we suggest empirically evaluating the homogeneity assumption in international assessments because international populations cannot be assumed to have homogeneous item response patterns. Otherwise, differences in response patterns within these populations may be under-detected when conducting manifest DIF analyses. Detecting differences in item responses across international examinee populations has implications on the generalizability and meaningfulness of DIF findings as they apply to heterogeneous examinee subgroups.
