Huynh Huynh. Psychometrika, 1994, 59(1): 111-119
Given a Masters partial credit item with n known step difficulties, conditions are stated for the existence of a set of (locally) independent Rasch binary items such that their raw score and the partial credit raw score have identical probability density functions. The conditions are those for the existence of n positive values with predetermined elementary symmetric functions and include the requirement that the n step difficulties form an increasing sequence.
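
The easy direction of this equivalence can be checked numerically. The sketch below assumes the standard parameterizations, which the abstract does not spell out: Rasch binary items with difficulties b_i and a partial credit item with step difficulties δ_k. Under those assumptions the step difficulties follow from the elementary symmetric functions γ_k of ε_i = exp(-b_i) as δ_k = log(γ_{k-1} / γ_k). All function names and numeric values are illustrative.

```python
import math

def elementary_symmetric(eps):
    """Return [gamma_0, ..., gamma_n] for the positive values in eps."""
    gamma = [1.0]
    for e in eps:
        new = [0.0] * (len(gamma) + 1)
        for k, g in enumerate(gamma):
            new[k] += g          # this item contributes score 0
            new[k + 1] += e * g  # this item contributes score 1
        gamma = new
    return gamma

def steps_from_binary(difficulties):
    """Step difficulties of the partial credit item implied by binary items."""
    gamma = elementary_symmetric([math.exp(-b) for b in difficulties])
    return [math.log(gamma[k - 1] / gamma[k]) for k in range(1, len(gamma))]

def pcm_probs(theta, steps):
    """Partial credit category probabilities P(X = k | theta)."""
    logits = [0.0]
    for d in steps:
        logits.append(logits[-1] + theta - d)
    weights = [math.exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def binary_sum_probs(theta, difficulties):
    """Raw-score distribution of independent Rasch binary items."""
    dist = [1.0]
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)
            new[k + 1] += q * p
        dist = new
    return dist

difficulties = [-1.0, 0.0, 1.5]  # illustrative binary item difficulties
steps = steps_from_binary(difficulties)
```

Comparing `pcm_probs(theta, steps)` with `binary_sum_probs(theta, difficulties)` at any `theta` should give the same distribution, and `steps` comes out as an increasing sequence, in line with the stated requirement.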

For each Rasch (Masters) partial credit item, there exists a set of independent Rasch binary and indecomposable trinary items for which the sum of the scores and the partial credit score have identical probability density functions. If each indecomposable trinary item is further expressed as the sum of two binary items, then the binary items are positively dependent and cannot both be of the Rasch type. This paper was written while the author was working with Steve Ferrara and Hillary Michaels on some technical aspects of the Maryland School Performance Assessment Program. The author had been puzzled by the fact that most MSPAP assessment items have three or fewer score categories. With a psychometric justification now being apparent, this paper is dedicated to both of them.

Methods for the identification of differential item functioning (DIF) in Rasch models are typically restricted to the case of two subgroups. A boosting algorithm is proposed that is able to handle the more general setting where DIF can be induced by several covariates at the same time. The covariates can be both continuous and (multi‐)categorical, and interactions between covariates can also be considered. The method works for a general parametric model for DIF in Rasch models. Since the boosting algorithm selects variables automatically, it is able to detect the items which induce DIF. It is demonstrated that boosting competes well with traditional methods in the case of subgroups. The method is illustrated by an extensive simulation study and an application to real data.

The item response function (IRF) for a polytomously scored item is defined as a weighted sum of the item category response functions (ICRF, the probability of getting a particular score for a randomly sampled examinee of ability θ). This paper establishes the correspondence between an IRF and a unique set of ICRFs for two of the most commonly used polytomous IRT models (the partial credit models and the graded response model). Specifically, a proof of the following assertion is provided for these models: If two items have the same IRF, then they must have the same number of categories; moreover, they must consist of the same ICRFs. As a corollary, for the Rasch dichotomous model, if two tests have the same test characteristic function (TCF), then they must have the same number of items. Moreover, for each item in one of the tests, an item in the other test with an identical IRF must exist. Theoretical as well as practical implications of these results are discussed. This research was supported by Educational Testing Service Allocation Projects No. 79409 and No. 79413. The authors wish to thank John Donoghue, Ming-Mei Wang, Rebecca Zwick, and Zhiliang Ying for their useful comments and discussions. The authors also wish to thank three anonymous reviewers for their comments.
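
As a small illustration of the definition, the sketch below computes the ICRFs of a partial credit item and combines them into an IRF. It assumes that the weights in the weighted sum are the category scores 0, 1, ..., m, so that the IRF is the expected item score; the abstract does not state the weights, and the step difficulties are made up.

```python
import math

def pcm_icrfs(theta, steps):
    """Category response probabilities P(X = k | theta) for k = 0..m."""
    logits = [0.0]
    for d in steps:
        logits.append(logits[-1] + theta - d)
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def irf(theta, steps):
    """Expected item score: sum over k of k * P(X = k | theta)."""
    return sum(k * p for k, p in enumerate(pcm_icrfs(theta, steps)))

steps = [-0.5, 0.4, 1.2]  # illustrative step difficulties
curve = [irf(t / 2.0, steps) for t in range(-8, 9)]  # theta from -4 to 4
```

Under this assumption the IRF rises monotonically from near 0 to near m as θ increases, since its derivative with respect to θ is the variance of the item score.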

A loglinear IRT model is proposed that relates polytomously scored item responses to a multidimensional latent space. The analyst may specify a response function for each response, indicating which latent abilities are necessary to arrive at that response. Each item may have a different number of response categories, so that free response items are more easily analyzed. Conditional maximum likelihood estimates are derived and the models may be tested generally or against alternative loglinear IRT models. Hank Kelderman is currently affiliated with Vrije Universiteit, Amsterdam. We thank Linda Vodegel-Matzen of the Division of Developmental Psychology of the University of Amsterdam for making available the data used in the example in this article.

Recent detection methods for differential item functioning (DIF) include approaches such as Rasch Trees, DIFlasso, GPCMlasso and Item Focussed Trees, all of which, in contrast to well-established methods, can handle metric covariates inducing DIF. A new estimation method is meant to address their downsides by combining three central virtues: the use of conditional likelihood for estimation, the incorporation of linear influence of metric covariates on item difficulty, and the possibility to detect different DIF types: certain items showing DIF, certain covariates inducing DIF, or certain covariates inducing DIF in certain items. Each of the approaches mentioned lacks two of these aspects. We introduce a method for DIF detection which, first, uses the conditional likelihood for estimation combined with group lasso penalization for item or variable selection and L1 penalization for interaction selection; second, incorporates linear effects instead of approximating them through step functions; and third, makes it possible to investigate any of the three DIF types. The method is described theoretically, and challenges in implementation are discussed. A dataset is analysed for all DIF types and shows comparable results between methods. Simulation studies per DIF type reveal competitive performance of cmlDIFlasso, particularly when selecting interactions in case of large sample sizes and numbers of parameters. Coupled with low computation times, cmlDIFlasso seems a worthwhile option for applied DIF detection.
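
The "linear influence of metric covariates on item difficulty" can be pictured with a minimal sketch. It assumes a Rasch model in which the difficulty of an item is shifted by a linear combination of person covariates, with item-specific coefficients `gamma`. This form, the function name and all values are assumptions for illustration and are not taken from the cmlDIFlasso implementation.

```python
import math

def dif_rasch_prob(theta, beta, gamma, x):
    """P(correct) when item difficulty shifts linearly with covariates x."""
    shift = sum(g * v for g, v in zip(gamma, x))
    return 1.0 / (1.0 + math.exp(-(theta - beta - shift)))

# Hypothetical item: no effect of the first covariate, DIF in the second
# (a standardized metric covariate such as age).
gamma_item = [0.0, 0.4]
p_low = dif_rasch_prob(0.0, 0.2, gamma_item, [1.0, -1.0])
p_high = dif_rasch_prob(0.0, 0.2, gamma_item, [1.0, 1.0])

# With all coefficients at zero the covariates no longer matter.
p_no_dif_a = dif_rasch_prob(0.0, 0.2, [0.0, 0.0], [1.0, -1.0])
p_no_dif_b = dif_rasch_prob(0.0, 0.2, [0.0, 0.0], [1.0, 1.0])
```

In such a model, two persons with the same ability but different covariate values have different success probabilities on the item whenever `gamma` is non-zero, so a penalty that sets coefficients exactly to zero leaves the non-zero ones as indicators of DIF.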

Loglinear unidimensional and multidimensional Rasch models are considered for the analysis of repeated observations of polytomous indicators with ordered response categories. Reparameterizations and parameter restrictions are provided which facilitate specification of a variety of hypotheses about latent processes of change. Models of purely quantitative change in latent traits are proposed as well as models including structural change. A conditional likelihood ratio test is presented for the comparison of unidimensional and multiple scales Rasch models. In the context of longitudinal research, this renders possible the statistical test of homogeneity of change against subject-specific change in latent traits. Applications to two empirical data sets illustrate the use of the models. The author is greatly indebted to Ulf Böckenholt, Rolf Langeheine, and several anonymous reviewers for many helpful suggestions.

In this paper we derive optimal designs for the Rasch Poisson counts model and its extended version of the (generalized) negative binomial counts model incorporating several binary predictors for the difficulty parameter. To efficiently estimate the regression coefficients of the predictors, locally D-optimal designs are developed. After an introduction to the Rasch Poisson counts model and its extension, we will specify these models as particular generalized linear models. Based on this embedding, optimal designs for both models including several binary explanatory variables will be presented. To this end, we will derive conditions on the effect sizes for certain designs to be locally D-optimal. Finally, it is pointed out that the results derived for the Rasch Poisson models can be applied to more general Poisson regression models, which should receive more attention in future psychological research.
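
To make the D-criterion concrete, the sketch below treats the simplest case, which is much narrower than the paper's setting: a Poisson model with log link and a single binary predictor x, with the ability held fixed and absorbed into the intercept. These simplifications and the parameter values are assumptions. A design puts weight w on x = 0 and 1 - w on x = 1, and the code searches a grid for the weight that maximizes the determinant of the Fisher information.

```python
import math

def information_det(w, beta0, beta1):
    """Determinant of the Fisher information for the design {x=0: w, x=1: 1-w}."""
    lam0 = math.exp(beta0)          # Poisson mean at x = 0
    lam1 = math.exp(beta0 + beta1)  # Poisson mean at x = 1
    # M = w * lam0 * f(0) f(0)^T + (1 - w) * lam1 * f(1) f(1)^T, f(x) = (1, x)
    m00 = w * lam0 + (1.0 - w) * lam1
    m01 = (1.0 - w) * lam1
    m11 = (1.0 - w) * lam1
    return m00 * m11 - m01 * m01

grid = [i / 100.0 for i in range(101)]
best_w = max(grid, key=lambda w: information_det(w, 0.5, -1.0))
```

In this two-point case the determinant simplifies to w(1 - w)·λ0·λ1, so equal weights are D-optimal whatever the effect size. Conditions on the effect sizes, as mentioned in the abstract, only come into play with several predictors, where a design need not use every combination of predictor values.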

Loglinear Rasch model tests
Existing statistical tests for the fit of the Rasch model have been criticized, because they are only sensitive to specific violations of its assumptions. Contingency table methods using loglinear models have been used to test various psychometric models. In this paper, the assumptions of the Rasch model are discussed and the Rasch model is reformulated as a quasi-independence model. The model is a quasi-loglinear model for the incomplete subgroup × score × item 1 × item 2 × ... × item k contingency table. Using ordinary contingency table methods the Rasch model can be tested generally or against less restrictive quasi-loglinear models to investigate specific violations of its assumptions.

A fundamental assumption of most IRT models is that items measure the same unidimensional latent construct. For the polytomous Rasch model, two ways of testing this assumption against specific multidimensional alternatives are discussed: first, a marginal approach assuming a multidimensional parametric latent variable distribution, and second, a conditional approach with no distributional assumptions about the latent variable. The second approach generalizes the Martin-Löf test for the dichotomous Rasch model in two ways: to polytomous items and to a test against an alternative that may have more than two dimensions. A study on occupational health is used to motivate and illustrate the methods. The authors would like to thank Niels Keiding, Klaus Larsen and the anonymous reviewers for valuable comments on a previous version of this paper. This research was supported by a grant from the Danish Research Academy and by a general research grant from Quality Metric, Inc.

Analysing ordinal data is becoming increasingly important in psychology, especially in the context of item response theory. The generalized partial credit model (GPCM) is probably the most widely used ordinal model and has found application in many large-scale educational assessment studies such as PISA. In the present paper, optimal test designs are investigated for estimating persons’ abilities with the GPCM for calibrated tests when item parameters are known from previous studies. We find that local optimality may be achieved by assigning non-zero probability only to the first and last categories independently of a person's ability. That is, when using such a design, the GPCM reduces to the dichotomous two-parameter logistic (2PL) model. Since locally optimal designs require the true ability to be known, we consider alternative Bayesian design criteria using weight distributions over the ability parameter space. For symmetric weight distributions, we derive necessary conditions for the optimal one-point design of two response categories to be Bayes optimal. Furthermore, we discuss examples of common symmetric weight distributions and investigate under what circumstances the necessary conditions are also sufficient. Since the 2PL model is a special case of the GPCM, all of these results hold for the 2PL model as well.

Item bundles
An item bundle is a small group of multiple choice items that share a common reading passage or graph, or a small group of matching items that share distractors. Item bundles are easily identified by paging through a copy of a test. Bundled items may violate the latent conditional independence assumption of unidimensional item response theory (IRT), but such a violation would not typically suggest the existence of a new fundamental human ability to read one specific reading passage or to interpret one specific graph. It is important, therefore, to have theoretical concepts and empirical checks that distinguish between, on the one hand, anticipated violations of latent conditional independence within item bundles, and, on the other hand, violations that cannot be attributed to idiosyncratic features of test format and instead suggest departures from unidimensionality. To this end, two theorems on unidimensional IRT are extended to describe observable item response distributions when there is conditional independence between but not necessarily within item bundles. The author is grateful to Ivo Molenaar and the referees for many helpful suggestions, and to D. Thayer for assistance with computing.

This paper investigates the psychometric properties of the values in action (VIA) character strengths (Peterson & Seligman, 2004). A sample of 904 South African undergraduate students (female=77%, male=23%, black=70%, mean age=21.07 years, SD age=2.73 years) was assessed using a 380-item questionnaire that included the items from the international personality item pool (IPIP) values in action (VIA) measure of 24 character strengths as well as additional items based on the underlying theory of the particular constructs. Responses were analysed with the Rasch rating scale model. Reliability coefficients were computed for the retained scale items. The majority (21) of the scales demonstrated satisfactory Rasch model fit and good reliability of scores. The finding that a large proportion of strengths exhibited differential item functioning for at least one of (1) gender, (2) ethnicity and (3) home language group challenges the assumption that character strengths are necessarily accultural, indicating qualitative distinctions in construct conceptualisations and measurement as a function of emic factors.

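Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and eliminate rater effects and thereby improve the reliability of subjective scoring. This paper introduces the theoretical and applied framework of the many-facet Rasch model, describes typical applications from abroad, and discusses the conditions under which the model can be applied.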

Extensions of the partial credit model
The partial credit model, developed by Masters (1982), is a unidimensional latent trait model for responses scored in two or more ordered categories. In the present paper some extensions of the model are presented. First, a marginal maximum likelihood estimation procedure is developed which allows for incomplete data and linear restrictions on both the item and the population parameters. Secondly, two statistical tests for evaluating model fit are presented: the former test has power against violation of the assumption about the ability distribution, the latter test offers the possibility of identifying specific items that do not fit the model. The authors are indebted to Professor Wim van der Linden and Huub Verstralen for their helpful comments.

In item response models of the Rasch type (Fischer & Molenaar, 1995), item parameters are often estimated by the conditional maximum likelihood (CML) method. This paper addresses the loss of information in CML estimation by using the information concept of F-information (Liang, 1983). This concept makes it possible to specify the conditions for no loss of information and to define a quantification of information loss. For the dichotomous Rasch model, the derivations will be given in detail to show the use of the F-information concept for making comparisons between different estimation methods. It is shown that by using CML for item parameter estimation, some information is almost always lost. But compared to JML (joint maximum likelihood) as well as to MML (marginal maximum likelihood) the loss is very small. The reported efficiency in the use of information of CML relative to JML and to MML in several comparisons is always larger than 93%, and in tests with a length of 20 items or more, larger than 99%.
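
The property that CML relies on can be verified directly for a short test. The sketch below assumes the dichotomous Rasch model with made-up item difficulties and computes, by brute force, the probability of a response pattern given its raw score at two very different ability values. It illustrates the conditioning step only, not the F-information calculations of the paper.

```python
import math
from itertools import product

def pattern_prob(theta, betas, pattern):
    """Probability of a 0/1 response pattern under the Rasch model."""
    prob = 1.0
    for b, x in zip(betas, pattern):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        prob *= p if x == 1 else 1.0 - p
    return prob

def conditional_prob(theta, betas, pattern):
    """P(pattern | raw score), computed by enumeration at a given theta."""
    r = sum(pattern)
    same_score = [q for q in product((0, 1), repeat=len(betas)) if sum(q) == r]
    total = sum(pattern_prob(theta, betas, q) for q in same_score)
    return pattern_prob(theta, betas, pattern) / total

betas = [-0.8, 0.1, 0.9]  # illustrative item difficulties
pattern = (1, 0, 1)
low = conditional_prob(-1.0, betas, pattern)
high = conditional_prob(2.0, betas, pattern)
```

The two conditional probabilities agree up to rounding because the ability parameter cancels once the raw score is fixed. This is why CML can estimate item parameters without any assumption about the ability distribution, at the price of the information carried by the raw score itself.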

A method for analyzing test item responses is proposed to examine differential item functioning (DIF) in multiple-choice items through a combination of the usual notion of DIF for correct/incorrect responses and information about DIF contained in each of the alternatives. The proposed method uses incomplete latent class models to examine whether DIF is caused by the attractiveness of the alternatives, difficulty of the item, or both. DIF with respect to either known or unknown subgroups can be tested by a likelihood ratio test that is asymptotically distributed as a chi-square random variable.

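An experimental Rasch analysis of scoring in HSK subjective tests
Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and eliminate rater effects and thereby improve the reliability of subjective scoring. This paper introduces the theoretical and applied framework of the many-facet Rasch model, designs an application framework based on the model for quality control of scoring in HSK subjective tests, and validates it experimentally with HSK essay scoring data.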

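Using the many-facet Rasch model (MFRM) from item response theory (IRT), this study analyses assessment results from the leaderless group discussion (LGD) exercise used in assessment centers. It examines examinee ability levels, rater severity, internal consistency of ratings, dimension difficulty and rating categories, and then discusses the various biases. Analysing personnel assessment results with the MFRM gives a deeper understanding of true differences in examinee ability, helps distinguish dimension difficulty and reveals sources of assessment error. This in turn supports better construction of assessment exercises, evaluation or diagnosis of rater qualification, and a closer match between assessment dimensions and assessment purposes, and it offers a distinctive perspective for extending the application of item response theory in personnel assessment.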

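The Rosenberg Self-Esteem Scale (RSES) was administered to 425 university students, and the Rasch model from item response theory was used to analyse item statistics and to test for DIF. The results show that the RSES is unidimensional, with a scale reliability of 0.84. Apart from item 8, the items show good fit, and the scale is better suited to distinguishing individuals with medium and lower levels of self-esteem. The DIF analysis found DIF on items 1 and 5, with male students showing higher self-esteem than female students. Compared with classical test theory, analysing the RSES with the Rasch model has advantages and provides a basis for further refinement and use of the scale.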
