Similar literature
1.
The item response function (IRF) for a polytomously scored item is defined as a weighted sum of the item category response functions (ICRFs; each ICRF is the probability that a randomly sampled examinee of ability θ obtains a particular score). This paper establishes the correspondence between an IRF and a unique set of ICRFs for two of the most commonly used polytomous IRT models (the partial credit models and the graded response model). Specifically, a proof of the following assertion is provided for these models: If two items have the same IRF, then they must have the same number of categories; moreover, they must consist of the same ICRFs. As a corollary, for the Rasch dichotomous model, if two tests have the same test characteristic function (TCF), then they must have the same number of items. Moreover, for each item in one of the tests, an item in the other test with an identical IRF must exist. Theoretical as well as practical implications of these results are discussed. This research was supported by Educational Testing Service Allocation Projects No. 79409 and No. 79413. The authors wish to thank John Donoghue, Ming-Mei Wang, Rebecca Zwick, and Zhiliang Ying for their useful comments and discussions. The authors also wish to thank three anonymous reviewers for their comments.
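
The "weighted sum" definition can be made concrete with a small sketch. It assumes the partial credit parameterization with step parameters `deltas` and category scores 0, 1, ..., m as the weights; the function names `pcm_icrfs` and `irf` are illustrative and do not come from the paper.

```python
import math

def pcm_icrfs(theta, deltas):
    """Category probabilities P_0..P_m of a partial credit item (assumed parameterization)."""
    # Exponent of category k is the cumulative sum of (theta - delta_j) for j <= k;
    # category 0 has exponent 0.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + (theta - delta))
    peak = max(exponents)  # subtract the maximum before exponentiating for stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def irf(theta, deltas):
    """Item response function: sum over categories of score k times P_k(theta)."""
    return sum(k * p for k, p in enumerate(pcm_icrfs(theta, deltas)))
```

With a single step parameter the item is dichotomous and `irf` reduces to the Rasch success probability, which is the setting of the corollary about test characteristic functions.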

2.
The new R package flirt is introduced for flexible item response theory (IRT) modeling of psychological, educational, and behavior assessment data. flirt integrates a generalized linear and nonlinear mixed modeling framework with graphical model theory. The graphical model framework allows for efficient maximum likelihood estimation. The key feature of flirt is its modular approach to facilitate convenient and flexible model specifications. Researchers can construct customized IRT models by simply selecting various modeling modules, such as parametric forms, number of dimensions, item and person covariates, person groups, link functions, etc. In this paper, we describe major features of flirt and provide examples to illustrate how flirt works in practice.

3.
In this paper it is shown that under the random effects generalized partial credit model for the measurement of a single latent variable by a set of polytomously scored items, the joint marginal probability distribution of the item scores has a closed-form expression in terms of item category location parameters, parameters that characterize the distribution of the latent variable in the subpopulation of examinees with a zero score on all items, and item-scaling parameters. Due to this closed-form expression, all parameters of the random effects generalized partial credit model can be estimated using marginal maximum likelihood estimation without assuming a particular distribution of the latent variable in the population of examinees and without using numerical integration. Also due to this closed-form expression, new special cases of the random effects generalized partial credit model can be identified. In addition to these new special cases, a slightly more general model than the random effects generalized partial credit model is presented. This slightly more general model is called the extended generalized partial credit model. Attention is paid to maximum likelihood estimation of the parameters of the extended generalized partial credit model and to assessing the goodness of fit of the model using generalized likelihood ratio tests. Attention is also paid to person parameter estimation under the random effects generalized partial credit model. It is shown that expected a posteriori estimates can be obtained for all possible score patterns. A simulation study is carried out to show the usefulness of the proposed models compared to the standard models that assume normality of the latent variable in the population of examinees. In an empirical example, some of the procedures proposed are demonstrated.

4.

Objective

Variability in infant sleep and negative affective behavior (NAB) is a developmental phenomenon that has long been of interest to researchers and clinicians. However, analyses and delineation of such temporal patterns were often limited to basic statistical approaches, which may prevent adequate identification of meaningful variation within these patterns. Modern statistical procedures such as additive models may detect specific patterns of temporal variation in infant behavior more effectively.

Method

One hundred and twenty-one mothers were asked to record different behaviors of their healthy 4- to 44-week-old infants in diaries for three consecutive days. Circadian patterns as well as individual trajectories and day-to-day variability of infant sleep and NAB were modeled with generalized linear models (GLMs) including a linear and quadratic polynomial for time, a GLM with a polynomial of the 8th order, a GLM with a harmonic function, a generalized linear mixed model (GLMM) with a polynomial of the 8th order, a generalized additive model, and a generalized additive mixed model (GAMM).

Results

The semi-parametric GAMM was found to fit the infant sleep data better than any of the parametric models used. The GLMM with a polynomial of the 8th order and the GAMM modeled temporal patterns of infant NAB equally well, although the GLMM exhibited a slightly better model fit while the GAMM was easier to interpret. Besides the well-known evening clustering in infant NAB, we found a significant second peak in NAB around midday that was not affected by the constant decline in the amounts of NAB across the 3-day study period.

Conclusion

Using advanced statistical procedures (GAMM and GLMM) even small variations and phenomena in infant behavior can be reliably detected. Future studies investigating variability and temporal patterns in infant variables may benefit from these statistical approaches.

5.
When scaling data using item response theory, valid statements based on the measurement model are only permissible if the model fits the data. Most item fit statistics used to assess the fit between observed item responses and the item responses predicted by the measurement model show significant weaknesses, such as the dependence of fit statistics on sample size and number of items. In order to assess the size of misfit and to thus use the fit statistic as an effect size, dependencies on properties of the data set are undesirable. The present study describes a new approach and empirically tests it for consistency. We developed an estimator of the distance between the predicted item response functions (IRFs) and the true IRFs by semiparametric adaptation of IRFs. For the semiparametric adaptation, the approach of extended basis functions due to Ramsay and Silverman (2005) is used. The IRF is defined as the sum of a linear term and a more flexible term constructed via basis function expansions. The group lasso method is applied as a regularization of the flexible term, and determines whether all parameters of the basis functions are fixed at zero or freely estimated. Thus, the method serves as a selection criterion for items that should be adjusted semiparametrically. The distance between the predicted and semiparametrically adjusted IRF of misfitting items can then be determined by describing the fitting items by the parametric form of the IRF and the misfitting items by the semiparametric approach. In a simulation study, we demonstrated that the proposed method delivers satisfactory results in large samples (i.e., N ≥ 1,000).

6.
We propose a generalization of the speed–accuracy response model (SARM) introduced by Maris and van der Maas (Psychometrika 77:615–633, 2012). In these models, the scores that result from a scoring rule that incorporates both the speed and accuracy of item responses are modeled. Our generalization is similar to that of the one-parameter logistic (or Rasch) model to the two-parameter logistic (or Birnbaum) model in item response theory. An expectation–maximization (EM) algorithm for estimating model parameters and standard errors was developed. Furthermore, methods to assess model fit are provided in the form of generalized residuals for item score functions and saddlepoint approximations to the density of the sum score. The presented methods were evaluated in a small simulation study, the results of which indicated good parameter recovery and reasonable type I error rates for the residuals. Finally, the methods were applied to two real data sets. It was found that the two-parameter SARM showed improved fit compared to the one-parameter SARM in both data sets.

7.
The partial credit model is considered under the assumption of a certain linear decomposition of the item × category parameters (indexed by item i and category h) into basic parameters (indexed by j). This model is referred to as the linear partial credit model. A conditional maximum likelihood algorithm for estimation of the basic parameters is presented, based on (a) recurrences for the combinatorial functions involved, and (b) a quasi-Newton approach, the so-called Broyden-Fletcher-Goldfarb-Shanno (BFGS) method; (a) guarantees numerically stable results, while (b) avoids the direct computation of the Hessian matrix, yet produces a sequence of certain positive definite matrices B_k, k = 1, 2, ..., converging to the asymptotic variance-covariance matrix of the estimated basic parameters. The practicality of these numerical methods is demonstrated both by means of simulations and of an empirical application to the measurement of treatment effects in patients with psychosomatic disorders. The authors thank one anonymous reviewer for his constructive comments. Moreover, they thankfully acknowledge financial support by the Österreichische Nationalbank (Austrian National Bank) under Grant No. 3720.
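
The abstract does not spell out its recurrences, so the following is only a generic sketch of the kind of item-by-item recurrence commonly used for the combinatorial functions in conditional maximum likelihood. The function name `score_weights` and the input layout (one list of positive category weights per item, with weight 1 for category 0) are assumptions made for illustration.

```python
def score_weights(item_eps):
    """Combinatorial functions gamma[r] for every total score r.

    gamma[r] is the sum, over all response patterns whose category scores
    add up to r, of the product of the chosen category weights.
    item_eps[i][h] is the (positive) weight of category h of item i,
    with item_eps[i][0] == 1 for the zero category.
    """
    gamma = [1.0]  # with no items, only the total score 0 is possible
    for eps in item_eps:
        # Adding one item extends the score range by its highest category.
        updated = [0.0] * (len(gamma) + len(eps) - 1)
        for r, g in enumerate(gamma):
            for h, e in enumerate(eps):
                updated[r + h] += g * e  # only sums of positive terms
        gamma = updated
    return gamma
```

Because every update adds products of positive numbers and never subtracts, a recurrence of this form avoids cancellation error, which is one plausible reading of the stability claim in (a).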

8.
A model‐based procedure for assessing the extent to which missing data can be ignored and handling non‐ignorable missing data is presented. The procedure is based on item response theory modelling. As an example, the approach is worked out in detail in conjunction with item response data modelled using the partial credit and generalized partial credit models. Simulation studies are carried out to assess the extent to which the bias caused by ignoring the missing‐data mechanism can be reduced. Finally, the feasibility of the procedure is demonstrated using data from a study to calibrate a medical disability scale.

9.
A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.
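
For the graded response model the monotonicity can be illustrated numerically. The sketch below assumes the usual logistic form with a positive discrimination `a` and ordered `thresholds`, and uses the identity that the expected score equals the sum of the cumulative probabilities P(X ≥ k). The parameter values are arbitrary and the check is an illustration on a grid, not the paper's proof.

```python
import math

def grm_expected_score(theta, a, thresholds):
    """Expected item score under an assumed logistic graded response model."""
    # E[X | theta] = sum over k >= 1 of P(X >= k | theta)
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds)

# Evaluate on a grid of theta values from -4 to 4 (illustrative parameters).
grid = [x / 10 for x in range(-40, 41)]
scores = [grm_expected_score(t, 1.3, [-1.0, 0.2, 1.5]) for t in grid]
is_increasing = all(low < high for low, high in zip(scores, scores[1:]))
```

Each cumulative probability is strictly increasing in θ whenever `a` is positive, so their sum is as well, which is why `is_increasing` holds on any grid.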

10.
A multinormal partial credit model for factor analysis of polytomously scored items with ordered response categories is derived using an extension of the Dutch Identity (Holland in Psychometrika 55:5–18, 1990). In the model, latent variables are assumed to have a multivariate normal distribution conditional on unweighted sums of item scores, which are sufficient statistics. Attention is paid to maximum likelihood estimation of item parameters, multivariate moments of latent variables, and person parameters. It is shown that the maximum likelihood estimates can be found without the use of numerical integration techniques. More general models are discussed which can be used for testing the model, and it is shown how models with different numbers of latent variables can be tested against each other. In addition, multi-group extensions are proposed, which can be used for testing both measurement invariance and latent population differences. Models and procedures discussed are demonstrated in an empirical data example.

11.
A new item response theory (IRT) model with a tree structure has been introduced for modeling item response processes with a tree structure. In this paper, we present a generalized item response tree model with a flexible parametric form, dimensionality, and choice of covariates. The utilities of the model are demonstrated with two applications in psychological assessments for investigating Likert scale item responses and for modeling omitted item responses. The proposed model is estimated with the freely available R package flirt (Jeon et al., 2014b).

12.
In a broad class of item response theory (IRT) models for dichotomous items the unweighted total score has monotone likelihood ratio (MLR) in the latent trait. In this study, it is shown that for polytomous items MLR holds for the partial credit model and a trivial generalization of this model. MLR does not necessarily hold if the slopes of the item step response functions vary over items, item steps, or both. MLR holds neither for Samejima's graded response model, nor for nonparametric versions of these three polytomous models. These results are surprising in the context of Grayson's and Huynh's results on MLR for nonparametric dichotomous IRT models, and suggest that establishing stochastic ordering properties for nonparametric polytomous IRT models will be much harder. Hemker's research was supported by the Netherlands Research Council, Grant 575-67-034. Junker's research was supported in part by the National Institutes of Health, Grant CA54852, and by the National Science Foundation, Grant DMS-94.04438.

13.
For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ₂ − δ₁ < 4 ln 2 (with δ₁ and δ₂ the two item step parameters) and bimodal otherwise. The locations and values of the maxima are derived. Furthermore, it is demonstrated that the value of the maximum is decreasing in δ₂ − δ₁. Consequently, the maximum of a unimodal item information function is always larger than the maximum of a bimodal one, and similarly for the item discrimination function. The work reported herein was partially supported under the National Assessment of Educational Progress (Grant No. R999G30002; CFDA No. 84.999G) as administered by the Office of Educational Research and Improvement, US Department of Education.
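
The 4 ln 2 threshold (about 2.77) can be checked numerically. The sketch below takes the relevant quantity to be the gap between the two step parameters, here `delta2 - delta1`, and uses the fact that with unit discrimination the item information of a partial credit item equals the conditional variance of the item score. The function names, the grid, and the parameter values are assumptions chosen for illustration.

```python
import math

def trinary_pcm_information(theta, delta1, delta2):
    """Item information of a three-category partial credit item at theta."""
    e1 = theta - delta1            # exponent of category 1
    e2 = e1 + (theta - delta2)     # exponent of category 2
    peak = max(0.0, e1, e2)        # stabilise the exponentials
    weights = [math.exp(-peak), math.exp(e1 - peak), math.exp(e2 - peak)]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = probs[1] + 2.0 * probs[2]
    # Information = variance of the item score given theta.
    return sum(p * (k - mean) ** 2 for k, p in enumerate(probs))

def count_modes(delta1, delta2, lo=-10.0, hi=10.0, steps=4001):
    """Count strict local maxima of the information function on a grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    info = [trinary_pcm_information(t, delta1, delta2) for t in grid]
    return sum(
        1 for i in range(1, steps - 1) if info[i - 1] < info[i] > info[i + 1]
    )
```

For example, step parameters −1 and 1 give a gap of 2, below the threshold, while −2.5 and 2.5 give a gap of 5, above it, so `count_modes` should report one and two peaks respectively.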

14.
高旭亮, 汪大勋, 王芳, 蔡艳, 涂冬波. Acta Psychologica Sinica (《心理学报》), 2019, 51(12): 1386-1397
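Building on the partial credit model, this paper proposes a General Partial Credit Diagnostic Model (GPCDM). Compared with existing polytomous models that follow the partial credit approach, namely the GDM (von Davier, 2008) and PC-DINA (de la Torre, 2012), the GPCDM allows a more flexible definition of the Q-matrix and places fewer constraints on the item parameters. A Monte Carlo study showed that the RMSE of the GPCDM parameter estimates ranged from 0.015 to 0.043, indicating acceptable estimation accuracy. An application to empirical TIMSS (2007) data showed that the GPCDM fit the data better than the GDM and PC-DINA models and also yielded better diagnostic results. In sum, this study provides a polytomous cognitive diagnosis model with fewer constraints and greater capability.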

15.
A person fit test based on the Lagrange multiplier test is presented for three item response theory models for polytomous items: the generalized partial credit model, the sequential model, and the graded response model. The test can also be used in the framework of multidimensional ability parameters. It is shown that the Lagrange multiplier statistic can take both the effects of estimation of the item parameters and the estimation of the person parameters into account. The Lagrange multiplier statistic has an asymptotic χ²-distribution. The Type I error rate and power are investigated using simulation studies. Results show that test statistics that ignore the effects of estimation of the persons’ ability parameters have decreased Type I error rates and power. Incorporating a correction to account for the effects of the estimation of the persons’ ability parameters results in acceptable Type I error rates and power characteristics; incorporating a correction for the estimation of the item parameters has very little additional effect. It is investigated to what extent the three models give comparable results, both in the simulation studies and in an example using data from the NEO Personality Inventory-Revised.

16.
Abstract

In this paper, we apply Vuong’s general approach of model selection to the comparison of nested and non-nested unidimensional and multidimensional item response theory (IRT) models. Vuong’s approach of model selection is useful because it allows for formal statistical tests of both nested and non-nested models. However, only the test of non-nested models has been applied in the context of IRT models to date. After summarizing the statistical theory underlying the tests, we investigate the performance of all three distinct Vuong tests in the context of IRT models using simulation studies and real data. In the non-nested case we observed that the tests can reliably distinguish between the graded response model and the generalized partial credit model. In the nested case, we observed that the tests typically perform as well as or sometimes better than the traditional likelihood ratio test. Based on these results, we argue that Vuong’s approach provides a useful set of tools for researchers and practitioners to effectively compare competing nested and non-nested IRT models.
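
As a rough illustration of the non-nested case only, the sketch below computes the standard Vuong z statistic from per-person log-likelihood contributions of two fitted models. The inputs `loglik_a` and `loglik_b` are hypothetical lists that would come from whatever software fitted the models; the distinguishability test and the nested test discussed in the abstract are not covered here.

```python
import math

def vuong_z(loglik_a, loglik_b):
    """Vuong z statistic for two non-nested models.

    loglik_a[i] and loglik_b[i] are the log-likelihood contributions of
    respondent i under model A and model B (hypothetical inputs).
    """
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    mean = sum(diffs) / n
    # Variance of the pointwise log-likelihood ratios (divisor n).
    variance = sum((d - mean) ** 2 for d in diffs) / n
    return math.sqrt(n) * mean / math.sqrt(variance)
```

Under the null hypothesis that both models are equally close to the data-generating process, this statistic is asymptotically standard normal; large positive values favour model A and large negative values favour model B.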

17.
Probabilistic models with one or more latent variables are designed to report on a corresponding number of skills or cognitive attributes. Multidimensional skill profiles offer additional information beyond what a single test score can provide, if the reported skills can be identified and distinguished reliably. Many recent approaches to skill profile models are limited to dichotomous data and have made use of computationally intensive estimation methods such as Markov chain Monte Carlo, since standard maximum likelihood (ML) estimation techniques were deemed infeasible. This paper presents a general diagnostic model (GDM) that can be estimated with standard ML techniques and applies to polytomous response variables as well as to skills with two or more proficiency levels. The paper uses one member of a larger class of diagnostic models, a compensatory diagnostic model for dichotomous and partial credit data. Many well‐known models, such as univariate and multivariate versions of the Rasch model and the two‐parameter logistic item response theory model, the generalized partial credit model, as well as a variety of skill profile models, are special cases of this GDM. In addition to an introduction to this model, the paper presents a parameter recovery study using simulated data and an application to real data from the field test for TOEFL® Internet‐based testing.

18.
The process-component approach has become quite popular for examining many psychological concepts. A typical example is the model with internal restrictions on item difficulty (MIRID) described by Butter (1994) and Butter, De Boeck, and Verhelst (1998). This study proposes a hierarchical generalized random-situation random-weight MIRID. The proposed model is more flexible for formulating endogenous latent variables within a multilevel framework, allowing the analysis of polytomous data with complex models (e.g., including item discriminations, random situations, random weights, and heteroskedasticity). The parameters in the proposed model can be estimated using the computer program WinBUGS, which adopts Markov Chain Monte Carlo algorithms. To illustrate the application of the proposed model, a real data set about guilt is analyzed and a comparison of MIRIDs for various conditions is conducted.

19.
A probabilistic choice model is developed for paired comparisons data about psychophysical stimuli. The model is based on Thurstone's Law of Comparative Judgment Case V and assumes that each stimulus is measured on a small number of physical variables. The utility of a stimulus is related to its values on the physical variables either by means of an additive univariate spline model or by means of a multivariate spline model. In the additive univariate spline model, a separate univariate spline transformation is estimated for each physical dimension and the utility of a stimulus is assumed to be an additive combination of these transformed values. In the multivariate spline model, the utility of a stimulus is assumed to be a general multivariate spline function in the physical variables. The use of B splines for estimating the transformation functions is discussed and it is shown how B splines can be generalized to the multivariate case by using as basis functions tensor products of the univariate basis functions. A maximum likelihood estimation procedure for the Thurstone Case V model with spline transformation is described and applied for illustrative purposes to various artificial and real data sets. Finally, the model is extended using a latent class approach to the case where there are unreplicated paired comparisons data from a relatively large number of subjects drawn from a heterogeneous population. An EM algorithm for estimating the parameters in this extended model is outlined and illustrated on some real data. The first author is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek. The authors are indebted to Ulf Böckenholt and Yoshio Takane for useful comments on an earlier draft of this paper.
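
The Case V choice rule underlying the model can be sketched as follows. It assumes independent, normally distributed discriminal processes with a common standard deviation `sigma`, so that the difference of two processes has standard deviation `sigma` times the square root of 2. The function name and the default `sigma=1.0` are illustrative, and the spline transformation that produces the utilities is not shown.

```python
import math

def case_v_choice_probability(utility_i, utility_j, sigma=1.0):
    """Probability that stimulus i is preferred to stimulus j under Case V."""
    # Standardised utility difference.
    z = (utility_i - utility_j) / (sigma * math.sqrt(2.0))
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Only utility differences enter the probability, so the utility scale has no fixed origin, and `sigma` sets the unit of measurement.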
