Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item and ability parameters. Simulated data sets were analyzed via two joint and two marginal Bayesian estimation procedures. The marginal Bayesian estimation procedures yielded consistently smaller root mean square differences than the joint Bayesian estimation procedures for item and ability estimates. As the sample size and test length increased, the four Bayes procedures yielded essentially the same result. The authors wish to thank the Editor and anonymous reviewers for their insightful comments and suggestions.
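
As a rough illustration of the quantities named in this abstract, the following sketch computes the two-parameter logistic response probability and a root mean square difference between estimated and true parameter values. The function names `p_2pl` and `rmsd` and the parameter symbols `a` (discrimination) and `b` (difficulty) are assumptions made for illustration; the abstract does not give the authors' exact parameterization or estimation code.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed logistic form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def rmsd(estimates, true_values):
    """Root mean square difference between estimates and the generating values."""
    if len(estimates) != len(true_values):
        raise ValueError("sequences must have the same length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, true_values)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # An examinee whose ability equals the item difficulty has probability 0.5.
    print(p_2pl(theta=0.0, a=1.2, b=0.0))
    # Compare hypothetical difficulty estimates against their generating values.
    print(rmsd([0.1, -0.4, 1.2], [0.0, -0.5, 1.0]))
```

In a simulation study of this kind, `rmsd` would be evaluated separately for item and ability estimates under each estimation procedure.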

2.
Item response theory (IRT) models are now in common use for the analysis of dichotomous item responses. This paper examines the sampling theory foundations for statistical inference in these models. The discussion includes: some history on the stochastic subject versus the random sampling interpretations of the probability in IRT models; the relationship between three versions of maximum likelihood estimation for IRT models; estimating $\theta$ versus estimating $\theta$-predictors; IRT models and loglinear models; the identifiability of IRT models; and the role of robustness and Bayesian statistics from the sampling theory perspective. A presidential address can serve many different functions. This one is a report of investigations I started at least ten years ago to understand what IRT was all about. It is a decidedly one-sided view, but I hope it stimulates controversy and further research. I have profited from discussions of this material with many people including: Brian Junker, Charles Lewis, Nicholas Longford, Robert Mislevy, Ivo Molenaar, Donald Rock, Donald Rubin, Lynne Steinberg, Martha Stocking, William Stout, Dorothy Thayer, David Thissen, Wim van der Linden, Howard Wainer, and Marilyn Wingersky. Of course, none of them is responsible for any errors or misstatements in this paper. The research was supported in part by the Cognitive Science Program, Office of Naval Research under Contract No. N00014-87-K-0730 and by the Program Statistics Research Project of Educational Testing Service.

3.
A goodness of fit test presented by Andersen is shown to be incorrect. The correct test is described and a re-analysis of Andersen's data is provided.

4.
Latent trait models for binary responses to a set of test items are considered from the point of view of estimating latent trait parameters $\theta = (\theta_1, \ldots, \theta_n)$ and item parameters $\beta = (\beta_1, \ldots, \beta_k)$, where $\beta_j$ may be vector valued. With $\theta$ considered a random sample from a prior distribution with parameter $\phi$, the estimation of $(\phi, \beta)$ is studied under the theory of the EM algorithm. An example and computational details are presented for the Rasch model. This work was supported by Contract No. N00014-81-K-0265, Modification No. P00002, from Personnel and Training Research Programs, Psychological Sciences Division, Office of Naval Research. The authors wish to thank an anonymous reviewer for several valuable suggestions.

5.
Bayes modal estimation in item response models
This article describes a Bayesian framework for estimation in item response models, with two-stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one-, two-, and three-parameter binary logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data. This research was supported by a grant from the Spencer Foundation, Chicago, IL. Comments and suggestions on earlier drafts by Charles Lewis, Frederic Lord, Rosenbaum, James Ramsey, Hiroshi Watanabe, the editor, and two anonymous referees are gratefully acknowledged.

6.
In this paper, the efficiency of conditional maximum likelihood (CML) and marginal maximum likelihood (MML) estimation of the item parameters of the Rasch model in incomplete designs is investigated. The use of the concept of F-information (Eggen, 2000) is generalized to incomplete testing designs. The scaled determinant of the F-information matrix is used as a scalar measure of information contained in a set of item parameters. In this paper, the relation between the normalization of the Rasch model and this determinant is clarified. It is shown that comparing estimation methods with the defined information efficiency is independent of the chosen normalization. The generalization of the method to models other than the Rasch model is discussed. In examples, information comparisons are conducted. It is found that for both CML and MML some information is lost in all incomplete designs compared to complete designs. A general result is that with increasing test booklet length the efficiency of an incomplete design, compared to a complete design, increases, as does the efficiency of CML compared to MML. The main difference between CML and MML is seen in the effect of the length of the test booklet. It will be demonstrated that with very small booklets, there is a substantial loss in information (about 35%) with CML estimation, while this loss is only about 10% in MML estimation. However, with increasing test length, the differences between CML and MML quickly disappear.

7.
Consideration will be given to a model developed by Rasch that assumes scores observed on some types of attainment tests can be regarded as realizations of a Poisson process. The parameter of the Poisson distribution is assumed to be a product of two other parameters, one pertaining to the ability of the subject and a second pertaining to the difficulty of the test. Rasch's model is expanded by assuming a prior distribution, with fixed but unknown parameters, for the subject parameters. The test parameters are considered fixed. Secondly, it will be shown how additional between- and within-subjects factors can be incorporated. Methods for testing the fit and estimating the parameters of the model will be discussed, and illustrated by empirical examples.
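
The multiplicative structure described in this abstract can be written out as follows. The symbols $X_{vi}$ (score of subject $v$ on test $i$), $\theta_v$ (subject parameter), and $\sigma_i$ (test parameter) are notation assumed here for illustration; the abstract does not state the authors' symbols or the form of the prior distribution.

```latex
% Poisson model with a product-form rate (notation assumed, not taken from the paper)
\[
  X_{vi} \sim \operatorname{Poisson}(\lambda_{vi}),
  \qquad
  \lambda_{vi} = \theta_v \, \sigma_i ,
\]
\[
  \Pr(X_{vi} = x) = \frac{(\theta_v \sigma_i)^{x}}{x!}\, e^{-\theta_v \sigma_i},
  \qquad x = 0, 1, 2, \ldots
\]
```

Under the extension described, the $\theta_v$ would be treated as draws from a prior distribution whose parameters are estimated, while the $\sigma_i$ remain fixed.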

8.
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood estimation methods (conditional, marginal, and joint). Three information criteria fit indices (Akaike information criterion, Bayesian information criterion, and sample size adjusted BIC) were used in a simulation study and an empirical study. Findings of this study showed that the spurious latent class problem was observed with marginal maximum likelihood and joint maximum likelihood estimations. However, conditional maximum likelihood estimation showed no over-extraction problem with non-normal ability distributions.
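
The three fit indices named in this abstract can be sketched from their usual textbook definitions. The function name `information_criteria` and the example numbers are hypothetical, and the sample size adjustment `(n + 2) / 24` is the commonly used form of the adjusted BIC; the abstract does not confirm which exact variant the study used.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Return AIC, BIC, and sample-size adjusted BIC for a fitted model.

    loglik   -- maximized log-likelihood
    n_params -- number of freely estimated parameters
    n_obs    -- sample size
    """
    deviance = -2.0 * loglik
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_obs)
    # Commonly used adjustment: replace n with (n + 2) / 24 in the penalty.
    sabic = deviance + n_params * math.log((n_obs + 2.0) / 24.0)
    return aic, bic, sabic


if __name__ == "__main__":
    # Hypothetical one-class and two-class fits to the same data set.
    for label, ll, k in [("1 class", -5230.4, 20), ("2 classes", -5201.9, 41)]:
        aic, bic, sabic = information_criteria(ll, k, n_obs=1000)
        print(label, round(aic, 1), round(bic, 1), round(sabic, 1))
```

The number of latent classes is typically chosen as the solution with the smallest value of the index, which is how spurious extra classes would show up in a study like this one.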

9.
A logistic regression model is suggested for estimating the relation between a set of manifest predictors and a latent trait assumed to be measured by a set of k dichotomous items. Usually the estimated subject parameters of latent trait models are biased, especially for short tests. Therefore, the relation between a latent trait and a set of predictors should not be estimated with a regression model in which the estimated subject parameters are used as a dependent variable. Direct estimation of the relation between the latent trait and one or more independent variables is suggested instead. Estimation methods and test statistics for the Rasch model are discussed and the model is illustrated with simulated and empirical data.

10.
In this paper it is shown that under the random effects generalized partial credit model for the measurement of a single latent variable by a set of polytomously scored items, the joint marginal probability distribution of the item scores has a closed-form expression in terms of item category location parameters, parameters that characterize the distribution of the latent variable in the subpopulation of examinees with a zero score on all items, and item-scaling parameters. Due to this closed-form expression, all parameters of the random effects generalized partial credit model can be estimated using marginal maximum likelihood estimation without assuming a particular distribution of the latent variable in the population of examinees and without using numerical integration. Also due to this closed-form expression, new special cases of the random effects generalized partial credit model can be identified. In addition to these new special cases, a slightly more general model than the random effects generalized partial credit model is presented. This slightly more general model is called the extended generalized partial credit model. Attention is paid to maximum likelihood estimation of the parameters of the extended generalized partial credit model and to assessing the goodness of fit of the model using generalized likelihood ratio tests. Attention is also paid to person parameter estimation under the random effects generalized partial credit model. It is shown that expected a posteriori estimates can be obtained for all possible score patterns. A simulation study is carried out to show the usefulness of the proposed models compared to the standard models that assume normality of the latent variable in the population of examinees. In an empirical example, some of the procedures proposed are demonstrated.

11.
When modeling the relationship between two nominal categorical variables, it is often desirable to include covariates to understand how individuals differ in their response behavior. Typically, however, not all the relevant covariates are available, with the result that the measured variables cannot fully account for the associations between the nominal variables. Under the assumption that the observed and unobserved variables follow a homogeneous conditional Gaussian distribution, this paper proposes RC(M) regression models to decompose the residual associations between the polytomous variables. Based on Goodman's (1979, 1985) RC(M) association model, a distinctive feature of RC(M) regression models is that they facilitate the joint estimation of effects due to manifest and omitted (continuous) variables without requiring numerical integration. The RC(M) regression models are illustrated using data from the High School and Beyond study (Tatsuoka & Lohnes, 1988). This article was accepted for publication when Willem J. Heiser was the Editor of Psychometrika. This research was supported by grants from the National Science Foundation (#SBR96-17510 and #SBR94-09531) and the Bureau of Educational Research at the University of Illinois. We thank Jee-Seon Kim for comments and computational assistance.

12.
Item response theory models posit latent variables to account for regularities in students' performances on test items. Wilson's "Saltus" model extends the ideas of IRT to development that occurs in stages, where expected changes can be discontinuous, show different patterns for different types of items, or even exhibit reversals in probabilities of success on certain tasks. Examples include Piagetian stages of psychological development and Siegler's rule-based learning. This paper derives marginal maximum likelihood (MML) estimation equations for the structural parameters of the Saltus model and suggests a computing approximation based on the EM algorithm. For individual examinees, empirical Bayes probabilities of learning-stage are given, along with proficiency parameter estimates conditional on stage membership. The MML solution is illustrated with simulated data and an example from the domain of mixed number subtraction. The authors' names appear in alphabetical order. We would like to thank Karen Draney for computer programming, Kikumi Tatsuoka for allowing us to use the mixed-number subtraction data, and Eric Bradlow, Chan Dayton, Kikumi Tatsuoka, and four anonymous referees for helpful suggestions. The first author's work was supported by Contract No. N00014-88-K-0304, R&T 4421552, from the Cognitive Sciences Program, Cognitive and Neural Sciences Division, Office of Naval Research, and by the Program Research Planning Council of Educational Testing Service. The second author's work was supported by a National Academy of Education Spencer Fellowship and by a Junior Faculty Research Grant from the Committee on Research, University of California at Berkeley. A copy of the Saltus computer program can be obtained from the second author.

13.
The achievement level is a variable measured with error that can be estimated by means of the Rasch model. Teacher grades also measure the achievement level but they are expressed on a different scale. This paper proposes a method for combining these two scores to obtain a synthetic measure of the achievement level based on the theory developed for regression with covariate measurement error. In particular, the focus is on ordinal scaled grades, using the SIMEX method for measurement error correction. The result is a measure comparable across subjects with smaller measurement error variance. An empirical application illustrates the method.

14.
A method is proposed for constructing indices as linear functions of variables such that the reliability of the compound score is maximized. Reliability is defined in the framework of latent variable modeling [i.e., item response theory (IRT)] and optimal weights of the components of the index are found by maximizing the posterior variance relative to the total latent variable variance. Three methods for estimating the weights are proposed. The first is a likelihood-based approach, that is, marginal maximum likelihood (MML). The other two are Bayesian approaches based on Markov chain Monte Carlo (MCMC) computational methods. One is based on an augmented Gibbs sampler specifically targeted at IRT, and the other is based on a general purpose Gibbs sampler such as implemented in OpenBUGS and JAGS. Simulation studies are presented to demonstrate the procedure and to compare the three methods. Results are very similar, so practitioners may prefer the last method, which is the most easily accessible. A real data set pertaining to the 28-joint Disease Activity Score is used to show how the methods can be applied in a complex measurement situation with multiple time points and mixed data formats.

15.
This paper evaluated multilevel reliability measures in two-level nested designs (e.g., students nested within teachers) within an item response theory framework. A simulation study was implemented to investigate the behavior of the multilevel reliability measures and the uncertainty associated with the measures in various multilevel designs regarding the number of clusters, cluster sizes, and intraclass correlations (ICCs), and in different test lengths, for two parameterizations of multilevel item response models with separate item discriminations or the same item discrimination over levels. Marginal maximum likelihood estimation (MMLE)-multiple imputation and Bayesian analysis were employed to evaluate the accuracy of the multilevel reliability measures and the empirical coverage rates of Monte Carlo (MC) confidence or credible intervals. Considering the accuracy of the multilevel reliability measures and the empirical coverage rate of the intervals, the results lead us to generally recommend MMLE-multiple imputation. In the model with separate item discriminations over levels, marginally acceptable accuracy of the multilevel reliability measures and empirical coverage rate of the MC confidence intervals were found only under a limited condition (200 clusters, a cluster size of 30, an ICC of .2, and 40 items) with MMLE-multiple imputation. In the model with the same item discrimination over levels, the accuracy of the multilevel reliability measures and the empirical coverage rate of the MC confidence intervals were acceptable in all multilevel designs we considered with 40 items under MMLE-multiple imputation. We discuss these findings and provide guidelines for reporting multilevel reliability measures.

16.
Lord and Wingersky have developed a method for computing the asymptotic variance-covariance matrix of maximum likelihood estimates for item and person parameters under some restrictions on the estimates which are needed in order to fix the latent scale. The method is tedious, but can be simplified for the Rasch model when one is only interested in the item parameters. This is demonstrated here under a suitable restriction on the item parameter estimates.

17.
This paper is concerned with the analysis of structural equation models with polytomous variables. A computationally efficient three-stage estimator of the thresholds and the covariance structure parameters, based on partition maximum likelihood and generalized least squares estimation, is proposed. An example is presented to illustrate the method. This research was supported in part by a research grant DA01070 from the U.S. Public Health Service. The production assistance of Julie Speckart is gratefully acknowledged.

18.
Equivalence of marginal likelihood of the two-parameter normal ogive model in item response theory (IRT) and factor analysis of dichotomized variables (FA) was formally proved. The basic result on the dichotomous variables was extended to multicategory cases, both ordered and unordered categorical data. Pair comparison data arising from multiple-judgment sampling were discussed as a special case of the unordered categorical data. A taxonomy of data for the IRT and FA models was also attempted. The work reported in this paper has been supported by Grant A6394 to the first author from the Natural Sciences and Engineering Research Council of Canada.
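
For the unidimensional dichotomous case, the correspondence between the two parameterizations is usually stated as below. The notation ($a_j$, $b_j$ for the normal ogive discrimination and difficulty, $\lambda_j$ and $\tau_j$ for the factor loading and threshold, unit-variance latent response) is assumed here for illustration and may differ from the paper's own symbols and sign conventions.

```latex
% Normal ogive IRT model (assumed notation)
\[
  \Pr(x_j = 1 \mid \theta) = \Phi\bigl(a_j(\theta - b_j)\bigr)
\]
% Factor model for an underlying continuous response, dichotomized at \tau_j
\[
  y_j^{*} = \lambda_j \theta + \varepsilon_j,
  \qquad \operatorname{Var}(y_j^{*}) = 1,
  \qquad x_j = 1 \iff y_j^{*} > \tau_j
\]
% Parameter correspondence
\[
  a_j = \frac{\lambda_j}{\sqrt{1 - \lambda_j^{2}}},
  \qquad
  b_j = \frac{\tau_j}{\lambda_j}
\]
```

The second display implies $\Pr(x_j = 1 \mid \theta) = \Phi\bigl((\lambda_j\theta - \tau_j)/\sqrt{1 - \lambda_j^2}\bigr)$, which matches the first display term by term under the stated correspondence.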

19.
Applications of item response theory, which depend upon its parameter invariance property, require that parameter estimates be unbiased. A new method, weighted likelihood estimation (WLE), is derived, and proved to be less biased than maximum likelihood estimation (MLE) with the same asymptotic variance and normal distribution. WLE removes the first order bias term from MLE. Two Monte Carlo studies compare WLE with MLE and Bayesian modal estimation (BME) of ability in conventional tests and tailored tests, assuming the item parameters are known constants. The Monte Carlo studies favor WLE over MLE and BME on several criteria over a wide range of the ability scale.
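
To make the idea of a bias-corrected estimating equation concrete, here is a minimal sketch of weighted likelihood estimation of ability, specialized to the Rasch model with known item difficulties. The specialization to the Rasch model, the bisection solver, and all function names are assumptions made for illustration; the abstract itself treats item response models more generally and does not describe an algorithm.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct response for difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def wle_equation(theta, responses, difficulties):
    """Likelihood score plus the weighting term J / (2 I) for the Rasch model."""
    ps = [rasch_p(theta, b) for b in difficulties]
    score = sum(x - p for x, p in zip(responses, ps))
    info = sum(p * (1.0 - p) for p in ps)                    # test information I
    j = sum(p * (1.0 - p) * (1.0 - 2.0 * p) for p in ps)     # J for the Rasch model
    return score + j / (2.0 * info)


def wle_theta(responses, difficulties, lo=-10.0, hi=10.0, tol=1e-8):
    """Solve the weighted likelihood equation by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if wle_equation(mid, responses, difficulties) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    items = [-1.5, -0.5, 0.0, 0.5, 1.5]
    print(wle_theta([1, 1, 0, 1, 0], items))
    # A perfect score has no finite maximum likelihood estimate,
    # but the weighted equation still has a finite root.
    print(wle_theta([1, 1, 1, 1, 1], items))
```

Setting the `j / (2.0 * info)` term to zero recovers the ordinary maximum likelihood equation, which makes the two estimators easy to compare in a small simulation.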

20.
A method for joint analysis of reaction times and same-different judgments is discussed. A set of stimuli is assumed to have some parametric representation which uniquely defines dissimilarities between the stimuli. Those dissimilarities are then related to the observed reaction times and same-different judgments through a model of psychological processes. Three representation models of dissimilarities are considered: the Minkowski power distance model, the linear model, and Tversky's feature matching model. Maximum likelihood estimation procedures are developed and implemented in the form of a FORTRAN program. An example is given to illustrate the kind of analyses that can be performed by the proposed method. The work reported in this paper is supported by Grant A6394 to the first author from the Natural Sciences and Engineering Research Council of Canada. Portions of this study have been presented at the Psychometric Society meeting in Chapel Hill, N.C., in May, 1981. We thank Tony Marley, Jim Ramsay and anonymous reviewers for their helpful comments. MAXRT, a computer program which performs the computations described in this paper, may be obtained by writing to the first author.
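
Of the three representation models listed, the Minkowski power distance is the simplest to write down. The sketch below uses the standard definition; the function name `minkowski_distance` and the example coordinates are hypothetical, and this is not the MAXRT program mentioned in the abstract.

```python
def minkowski_distance(x, y, p=2.0):
    """Minkowski power distance between two stimulus coordinate vectors.

    p = 1 gives the city-block metric and p = 2 the Euclidean metric.
    """
    if len(x) != len(y):
        raise ValueError("stimulus vectors must have the same dimension")
    if p < 1.0:
        raise ValueError("p must be at least 1 for a proper metric")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


if __name__ == "__main__":
    s1, s2 = (0.0, 0.0), (3.0, 4.0)
    print(minkowski_distance(s1, s2, p=1.0))  # city-block distance
    print(minkowski_distance(s1, s2, p=2.0))  # Euclidean distance
```

In a model of the kind described, the stimulus coordinates and possibly the exponent would be parameters estimated from the reaction times and same-different judgments.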
