Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
In this paper, optimal designs will be derived for estimating the ability parameters of the Rasch model when difficulty parameters are known. It is well established that a design is locally D-optimal if the ability and difficulty coincide. But locally optimal designs require that the ability parameters to be estimated are known. To attenuate this very restrictive assumption, prior knowledge on the ability parameter may be incorporated within a Bayesian approach. Several symmetric weight distributions, e.g., uniform, normal and logistic distributions, will be considered. Furthermore, maximin efficient designs are developed where the minimal efficiency is maximized over a specified range of ability parameters.
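The local D-optimality statement in this abstract can be illustrated with a minimal sketch. It assumes the standard dichotomous Rasch parametrization, in which the Fisher information about the ability from one item is p(1 - p); the function names and the grid of candidate difficulties are hypothetical and not taken from the paper.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def rasch_information(theta, b):
    """Fisher information about ability theta from one item of difficulty b."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

# Hypothetical setting: one known ability and a grid of candidate difficulties.
theta = 0.5
difficulties = [theta + d / 10.0 for d in range(-30, 31)]

# The candidate with the largest information is the locally optimal item.
best_b = max(difficulties, key=lambda b: rasch_information(theta, b))

# Efficiency of a mismatched item relative to the optimum (information 0.25).
efficiency_off_by_one = rasch_information(theta, theta + 1.0) / 0.25
```

On this grid the search selects the difficulty equal to the ability, and an item that is off by one logit retains only part of the maximal information. This dependence on the unknown ability is what the Bayesian and maximin criteria in the abstract are meant to address.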

2.
Analysing ordinal data is becoming increasingly important in psychology, especially in the context of item response theory. The generalized partial credit model (GPCM) is probably the most widely used ordinal model and has found application in many large-scale educational assessment studies such as PISA. In the present paper, optimal test designs are investigated for estimating persons’ abilities with the GPCM for calibrated tests when item parameters are known from previous studies. We find that local optimality may be achieved by assigning non-zero probability only to the first and last categories independently of a person's ability. That is, when using such a design, the GPCM reduces to the dichotomous two-parameter logistic (2PL) model. Since locally optimal designs require the true ability to be known, we consider alternative Bayesian design criteria using weight distributions over the ability parameter space. For symmetric weight distributions, we derive necessary conditions for the optimal one-point design of two response categories to be Bayes optimal. Furthermore, we discuss examples of common symmetric weight distributions and investigate under what circumstances the necessary conditions are also sufficient. Since the 2PL model is a special case of the GPCM, all of these results hold for the 2PL model as well.

3.
Count data naturally arise in several areas of cognitive ability testing, such as processing speed, memory, verbal fluency, and divergent thinking. Contemporary count data item response theory models, however, are not flexible enough, especially to account for over- and underdispersion at the same time. For example, the Rasch Poisson counts model (RPCM) assumes equidispersion (conditional mean and variance coincide) which is often violated in empirical data. This work introduces the Conway–Maxwell–Poisson counts model (CMPCM) that can handle underdispersion (variance lower than the mean), equidispersion, and overdispersion (variance larger than the mean) in general and specifically at the item level. A simulation study revealed satisfactory parameter recovery at moderate sample sizes and mostly unbiased standard errors for the proposed estimation approach. In addition, plausible empirical reliability estimates resulted, while those based on the RPCM were biased downwards (underdispersion) and biased upwards (overdispersion) when the simulation model deviated from equidispersion. Finally, verbal fluency data were analysed and the CMPCM with item-specific dispersion parameters fitted the data best. Dispersion parameter estimates indicated underdispersion for three out of four items. Overall, these findings indicate the feasibility and importance of the suggested flexible count data modelling approach.
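The three dispersion regimes named in this abstract can be checked numerically with a small sketch. It assumes the standard Conway-Maxwell-Poisson parametrization with weights proportional to lam^y / (y!)^nu, which may differ from the parametrization used in the paper; `cmp_moments` and the truncation point `max_count` are hypothetical choices.

```python
import math

def cmp_moments(lam, nu, max_count=200):
    """Mean and variance of a Conway-Maxwell-Poisson distribution.

    Assumed parametrization: P(Y = y) is proportional to lam**y / (y!)**nu.
    The infinite normalizing sum is truncated at max_count (an assumption
    that is adequate for the small rates used below).
    """
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_count + 1)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    z = sum(w)
    mean = sum(y * wy for y, wy in enumerate(w)) / z
    second = sum(y * y * wy for y, wy in enumerate(w)) / z
    return mean, second - mean ** 2

mean_eq, var_eq = cmp_moments(3.0, 1.0)        # nu = 1: ordinary Poisson
mean_under, var_under = cmp_moments(3.0, 2.0)  # nu > 1: underdispersion
mean_over, var_over = cmp_moments(3.0, 0.5)    # nu < 1: overdispersion
```

With nu = 1 the mean and variance coincide, as the RPCM assumes. Moving nu above or below 1 produces a variance smaller or larger than the mean, respectively.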

4.
In this paper, the efficiency of conditional maximum likelihood (CML) and marginal maximum likelihood (MML) estimation of the item parameters of the Rasch model in incomplete designs is investigated. The use of the concept of F-information (Eggen, 2000) is generalized to incomplete testing designs. The scaled determinant of the F-information matrix is used as a scalar measure of information contained in a set of item parameters. In this paper, the relation between the normalization of the Rasch model and this determinant is clarified. It is shown that comparing estimation methods with the defined information efficiency is independent of the chosen normalization. The generalization of the method to other models than the Rasch model is discussed. In examples, information comparisons are conducted. It is found that for both CML and MML some information is lost in all incomplete designs compared to complete designs. A general result is that with increasing test booklet length the efficiency of an incomplete design, compared to a complete design, is increasing, as is the efficiency of CML compared to MML. The main difference between CML and MML is seen in the effect of the length of the test booklet. It will be demonstrated that with very small booklets, there is a substantial loss in information (about 35%) with CML estimation, while this loss is only about 10% in MML estimation. However, with increasing test length, the differences between CML and MML quickly disappear.

5.
A logistic regression model is suggested for estimating the relation between a set of manifest predictors and a latent trait assumed to be measured by a set of k dichotomous items. Usually the estimated subject parameters of latent trait models are biased, especially for short tests. Therefore, the relation between a latent trait and a set of predictors should not be estimated with a regression model in which the estimated subject parameters are used as a dependent variable. Direct estimation of the relation between the latent trait and one or more independent variables is suggested instead. Estimation methods and test statistics for the Rasch model are discussed and the model is illustrated with simulated and empirical data.

6.
He, Yinhong; Chen, Ping. Psychometrika, 2020, 85(1): 35-55

The maintenance of an item bank is essential for continuously implementing adaptive tests. Calibration of new items online provides an opportunity to efficiently replenish items for the operational item bank. In this study, a new optimal design for online calibration (referred to as D-c) is proposed by incorporating the idea of the original D-optimal design into the reformed D-optimal design proposed by van der Linden and Ren (Psychometrika 80:263–288, 2015) (denoted as D-VR design). To deal with the dependence of design criteria on the unknown item parameters of new items, Bayesian versions of the locally optimal designs (e.g., D-c and D-VR) are put forward by adding prior information to the new items. In the simulation implementation of the locally optimal designs, five calibration sample sizes were used to obtain different levels of estimation precision for the initial item parameters, and two approaches were used to obtain the prior distributions in Bayesian optimal designs. Results showed that the D-c design performed well and retired a smaller number of new items than the D-VR design at almost all levels of examinee sample size; the Bayesian version of D-c using the prior obtained from the operational items worked better than that using the default priors in BILOG-MG and PARSCALE; and Bayesian optimal designs generally outperformed locally optimal designs when the initial item parameters of the new items were poorly estimated.


7.
We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities’ distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general distribution generating the abilities, even for a finite number of probes. For the semiparametric Rasch model, only a finite number of properties of the general abilities’ distribution can be identified by a finite number of items, which are completely characterized. The full identification of the semiparametric Rasch model can be only achieved when an infinite number of items is available. The results are illustrated using simulated data.

8.
Infrequent count data in psychological research are commonly modelled using zero-inflated Poisson regression. This model can be viewed as a latent mixture of an "always-zero" component and a Poisson component. Hurdle models are an alternative class of two-component models that are seldom used in psychological research, but clearly separate the zero counts and the non-zero counts by using a left-truncated count model for the latter. In this tutorial we revisit both classes of models, and discuss model comparisons and the interpretation of their parameters. As illustrated with an example from relational psychology, both types of models can easily be fitted using the R-package pscl.
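The structural difference between the two model classes can be sketched through their probability mass functions. This is an illustration only, without covariates and with a Poisson count component assumed for the hurdle model; it is not the pscl interface, and all function and parameter names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of count k, computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson: mixture of an always-zero and a Poisson component.

    Zeros come from both components, so P(0) exceeds pi_zero.
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base

def hurdle_pmf(k, p_zero, lam):
    """Hurdle model: all zeros come from one component, and positive counts
    follow a zero-truncated Poisson distribution."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

# Both functions define proper distributions (sums truncated at 100 terms).
zip_total = sum(zip_pmf(k, 0.3, 2.0) for k in range(100))
hurdle_total = sum(hurdle_pmf(k, 0.3, 2.0) for k in range(100))
```

With the same numerical value for the zero parameter, the zero-inflated model assigns more mass to zero than the hurdle model, because its Poisson component also produces zeros. This is one reason the parameters of the two models are interpreted differently.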

9.
Various item response theory (IRT) models can be used in educational and psychological measurement to analyze test data. One of the major drawbacks of these models is that efficient parameter estimation can only be achieved with very large data sets. Therefore, it is often worthwhile to search for designs of the test data that in some way will optimize the parameter estimates. The results from the statistical theory on optimal design can be applied for efficient estimation of the parameters. A major problem in finding an optimal design for IRT models is that the designs are only optimal for a given set of parameters, that is, they are locally optimal. Locally optimal designs can be constructed with a sequential design procedure. In this paper minimax designs are proposed for IRT models to overcome the problem of local optimality. Minimax designs are compared to sequentially constructed designs for the two-parameter logistic model and the results show that minimax designs can be nearly as efficient as sequentially constructed designs.

10.
Ul Hassan, Mahmood; Miller, Frank. Psychometrika, 2019, 84(4): 1101-1128

Item calibration is a technique to estimate characteristics of questions (called items) for achievement tests. In computerized tests, item calibration is an important tool for maintaining, updating and developing new items for an item bank. To efficiently sample examinees with specific ability levels for this calibration, we use optimal design theory assuming that the probability of answering correctly follows an item response model. Locally optimal unrestricted designs usually have a few design points for ability. In practice, it is hard to sample examinees from a population with these specific ability levels due to unavailability or limited availability of examinees. To counter this problem, we use the concept of optimal restricted designs and show that this concept fits item calibration naturally. We prove an equivalence theorem needed to verify optimality of a design. Locally optimal restricted designs provide intervals of ability levels for optimal calibration of an item. When assuming a two-parameter logistic model, several scenarios with D-optimal restricted designs are presented for calibration of a single item and simultaneous calibration of several items. These scenarios show that the naive way to sample examinees around unrestricted design points is not optimal.


11.
Loglinear Rasch model tests   (total citations: 1; self-citations: 0; citations by others: 1)
Existing statistical tests for the fit of the Rasch model have been criticized, because they are only sensitive to specific violations of its assumptions. Contingency table methods using loglinear models have been used to test various psychometric models. In this paper, the assumptions of the Rasch model are discussed and the Rasch model is reformulated as a quasi-independence model. The model is a quasi-loglinear model for the incomplete subgroup × score × item 1 × item 2 × ... × item k contingency table. Using ordinary contingency table methods the Rasch model can be tested generally or against less restrictive quasi-loglinear models to investigate specific violations of its assumptions.

12.
Five members of the Rasch family of latent trait models which have appeared more or less independently in the literature are brought together and identified as one model. In addition to sharing the distinguishing characteristic of the dichotomous Rasch model (separable person and item parameters and hence sufficient statistics), all five models share a common algebraic form and have as their basic element the fundamental process defined by Rasch's simple logistic expression. In these models, the sufficient statistics for person and item parameters are counts of events constructed to be indicative of the variable being measured, and the measures they enable are ‘fundamental’.

13.
When using linear models for cluster-correlated or longitudinal data, a common modeling practice is to begin by fitting a relatively simple model and then to increase the model complexity in steps. New predictors might be added to the model, or a more complex covariance structure might be specified for the observations. When fitting models for binary or ordered-categorical outcomes, however, comparisons between such models are impeded by the implicit rescaling of the model estimates that takes place with the inclusion of new predictors and/or random effects. This paper presents an approach for putting the estimates on a common scale to facilitate relative comparisons between models fit to binary or ordinal outcomes. The approach is developed for both population-average and unit-specific models.

14.
Consideration will be given to a model developed by Rasch that assumes scores observed on some types of attainment tests can be regarded as realizations of a Poisson process. The parameter of the Poisson distribution is assumed to be a product of two other parameters, one pertaining to the ability of the subject and a second pertaining to the difficulty of the test. Rasch's model is expanded by assuming a prior distribution, with fixed but unknown parameters, for the subject parameters. The test parameters are considered fixed. Secondly, it will be shown how additional between- and within-subjects factors can be incorporated. Methods for testing the fit and estimating the parameters of the model will be discussed, and illustrated by empirical examples.

15.
Historically, the Poisson process has been the “benchmark” model for many stochastic processes. When event counts from a particular process fail to be described adequately by a Poisson distribution, a researcher may turn to generalized Poisson processes to model the empirical data more accurately. Two common generalizations are the (1) heterogeneous Poisson process, in which the process rate is a Gamma random variable, and (2) contagious Poisson process, in which the process rate depends linearly on the current state of the process. Paradoxically, both the heterogeneous and contagious processes yield the same theoretical distribution for the number of events that occur in a single interval of time. Consequently, distinguishing between these two using event counts can be difficult. This situation is discussed, first reviewing the models and distribution, then giving strategies for choosing between them using event count data, and demonstrating these techniques on episodes of hospitalization for schizophrenia.
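For the heterogeneous case, the shared count distribution is the negative binomial, a standard result for a Poisson rate that follows a Gamma distribution. The sketch below checks this numerically for a unit time interval; the contagious process is not simulated here. The shape/rate parametrization, the function names, and the midpoint-rule settings are assumptions made for illustration.

```python
import math

def negbin_pmf(k, shape, rate):
    """Negative binomial pmf written in terms of the Gamma shape and rate."""
    log_p = (math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
             + shape * math.log(rate / (rate + 1.0))
             - k * math.log(rate + 1.0))
    return math.exp(log_p)

def mixed_poisson_pmf(k, shape, rate, upper=60.0, steps=60000):
    """Poisson pmf averaged over a Gamma(shape, rate) distributed process rate.

    Uses a plain midpoint rule on (0, upper); both settings are hypothetical
    and suit moderate parameter values with shape >= 1.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = k * math.log(lam) - lam - math.lgamma(k + 1)
        log_gamma = (shape * math.log(rate) + (shape - 1.0) * math.log(lam)
                     - rate * lam - math.lgamma(shape))
        total += math.exp(log_poisson + log_gamma)
    return total * h

closed_form = negbin_pmf(3, 2.0, 1.0)
numerical = mixed_poisson_pmf(3, 2.0, 1.0)
```

The two values agree up to integration error. Since the contagious process leads to the same marginal distribution according to the abstract, single-interval counts alone cannot separate the two mechanisms.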

16.
When can a single variable be more accurate in binary choice than multiple sources of information? We derive analytically the probability that a single variable (SV) will correctly predict one of two choices when both criterion and predictor are continuous variables. We further provide analogous derivations for multiple regression (MR) and equal weighting (EW) and specify the conditions under which the models differ in expected predictive ability. Key factors include variability in cue validities, intercorrelation between predictors, and the ratio of predictors to observations in MR. Theory and simulations are used to illustrate the differential effects of these factors. Results directly address why and when “one-reason” decision making can be more effective than analyses that use more information. We thus provide analytical backing to intriguing empirical results that, to date, have lacked theoretical justification. There are predictable conditions for which one should expect “less to be more.”

17.
Conjunctive item response models are introduced such that (a) sufficient statistics for latent traits are not necessarily additive in item scores; (b) items are not necessarily locally independent; and (c) existing compensatory (additive) item response models including the binomial, Rasch, logistic, and general locally independent model are special cases. Simple estimates and hypothesis tests for conjunctive models are introduced and evaluated as well. Conjunctive models are also identified with cognitive models that assume the existence of several individually necessary component processes for a global ability. It is concluded that conjunctive models and methods may show promise for constructing improved tests and uncovering conjunctive cognitive structure. It is also concluded that conjunctive item response theory may help to clarify the relationships between local dependence, multidimensionality, and item response function form. I appreciate the many helpful suggestions that were given by the reviewers and Ivo Molenaar.

18.
Methods for the identification of differential item functioning (DIF) in Rasch models are typically restricted to the case of two subgroups. A boosting algorithm is proposed that is able to handle the more general setting where DIF can be induced by several covariates at the same time. The covariates can be both continuous and (multi-)categorical, and interactions between covariates can also be considered. The method works for a general parametric model for DIF in Rasch models. Since the boosting algorithm selects variables automatically, it is able to detect the items which induce DIF. It is demonstrated that boosting competes well with traditional methods in the case of subgroups. The method is illustrated by an extensive simulation study and an application to real data.

19.
The well-known Rasch model is generalized to a multicomponent model, so that observations of component events are not needed to apply the model. It is shown that the generalized model has retained the property of the specific objectivity of the Rasch model. For a restricted variant of the model, maximum likelihood estimates of its parameters and a statistical test of the model are given. The results of an application to a mathematics test involving six components are described.

20.