Similar literature
Found 20 similar articles (search time: 15 ms)
1.
An item response theory (IRT) model is used as a measurement error model for the dependent variable of a multilevel model. The dependent variable is latent but can be measured indirectly by using tests or questionnaires. The advantage of using latent scores as dependent variables of a multilevel model is that it offers the possibility of modelling response variation and measurement error and separating the influence of item difficulty and ability level. The two‐parameter normal ogive model is used for the IRT model. It is shown that the stochastic EM algorithm can be used to obtain parameter estimates that are close to the maximum likelihood estimates. This algorithm is easily implemented. The estimation procedure will be compared to an implementation of the Gibbs sampler in a Bayesian framework. Examples using real data are given.
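As a rough illustration of the two-parameter normal ogive model named in this abstract, the sketch below computes the probability of a correct response as Φ(aθ − b). The slope-intercept parameterization, the function name, and the example values are assumptions made here for illustration; the abstract does not specify them.

```python
from statistics import NormalDist


def normal_ogive_2p(theta: float, a: float, b: float) -> float:
    """P(correct) = Phi(a * theta - b), an assumed slope-intercept form."""
    return NormalDist().cdf(a * theta - b)


# Hypothetical item with slope 1.2 and intercept-style difficulty 0.5.
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(normal_ogive_2p(theta, a=1.2, b=0.5), 3))
```

The probability increases with ability θ, and the slope a controls how sharply it rises.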

2.
It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between the latent variables and dichotomous observed variables, which may be responses to tests or questionnaires. It will be shown that the multilevel model with measurement error in the observed predictor variables can be estimated in a Bayesian framework using Gibbs sampling. In this article, handling measurement error via the normal ogive model is compared with alternative approaches using the classical true score model. Examples using real data are given. This paper is part of the dissertation by Fox (2001) that won the 2002 Psychometric Society Dissertation Award.

3.
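Hierarchical linear modeling is an advanced statistical method for handling hierarchically structured data, and item response theory is a modern measurement theory for measuring examinee ability precisely. Multilevel item response theory combines the two by nesting an item response model within a hierarchical linear model, which makes it possible to estimate item parameters and ability parameters at different levels and yields more precise estimates of regression coefficients and error variances. The authors outline the development of multilevel item response theory, review its applications in psychological and educational measurement with respect to differential item functioning, test equating, and school effectiveness research, summarize its value, and discuss directions for future research.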

4.
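Liu Hongyun, Luo Fang. Acta Psychologica Sinica (《心理学报》), 2008, 40(1): 92-100
The authors briefly introduce the multilevel item response model, examine the relationship between multilevel item response theory and conventional item response theory, derive the relationship between the parameters of the multilevel item response model and those of the conventional item response model, and discuss extensions of the multilevel item response model. Using a real example, they apply the multilevel item response model to analyze the characteristics of test items, to test the effects of individual-level and group-level predictors on the ability parameter, and to analyze differential item functioning. Finally, the article discusses the strengths and limitations of multilevel item response theory models.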

5.
In Item Response Theory (IRT), item characteristic curves (ICCs) are illustrated through logistic models or normal ogive models, and the probability that examinees give the correct answer is usually a monotonically increasing function of their ability parameters. However, since only limited patterns of shapes can be obtained from logistic models or normal ogive models, there is a possibility that the model applied does not fit the data. As a result, the existing method can be rejected because it cannot deal with various item response patterns. To overcome these problems, we propose a new semiparametric IRT model using a Dirichlet process mixture logistic distribution. Our method does not rely on assumptions but only requires that the ICCs be a monotonically nondecreasing function; that is, our method can deal with more types of item response patterns than the existing methods, such as the one-parameter normal ogive models or the two- or three-parameter logistic models.

6.
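When measuring latent traits with a hierarchical structure, standard item response models estimate item and ability parameters with low efficiency. Multidimensional item response models are efficient in estimating first-order traits but do not take the hierarchy of traits into account, so they are not suited to hierarchically structured traits. Higher-order item response models, by contrast, not only estimate item and ability parameters efficiently and accurately but also yield higher-order and lower-order traits simultaneously. Existing higher-order item response models include the higher-order DINA model, the higher-order two-parameter normal ogive hierarchical model, the higher-order logistic model, the higher-order item response model for polytomous scoring, and the higher-order testlet model. Future research should attend to multilevel higher-order item response models, higher-order item response models with within-item multidimensionality, and higher-order cognitive diagnosis models.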

7.
This paper proposes two unidimensional item response theory (IRT) models for analysing normative forced‐choice personality items. Both models are derived from a common theoretical framework and arise as a result of different assumptions regarding the mechanism of choice. The simplest mechanism gives rise to the one‐parameter normal‐ogive model. The second mechanism gives rise to a new IRT model, which is closely related to the Coombs–Zinnes probabilistic unfolding model. The second model is compared theoretically to the normal‐ogive model in terms of item characteristic curves and amount of item information. Next, procedures for estimating the respondent and the item parameters in the second model are described. Finally, both models are empirically compared by using two well‐known personality measures.

8.
Relations are examined between latent trait and latent class models for item response data. Conditions are given for the two-latent class and two-parameter normal ogive models to agree, and relations between their item parameters are presented. Generalizations are then made to continuous models with more than one latent trait and discrete models with more than two latent classes, and methods are presented for relating latent class models to factor models for dichotomized variables. Results are illustrated using data from the Law School Admission Test, previously analyzed by several authors.

9.
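Item analysis of the Chinese Soldier Personality Questionnaire using item response theory   Total citations: 4 (self-citations: 0; citations by others: 4)
Item response theory (IRT) was used to conduct an item analysis of the Chinese Soldier Personality Questionnaire (CSPQ). The CSPQ was administered by computer to 100,523 young men of eligible age. A random sample of 2,676 men whose standard scores were below 70 on every dimension formed the qualified group; 274 men who scored above 70 on any dimension and were judged unqualified in interviews with professionals formed the unqualified group; and 221 male patients of comparable age with schizophrenia in remission, drawn from psychiatric hospitals, formed the psychiatric group and also completed the CSPQ. The data were analyzed with the IRT-based two-parameter logistic model. Both before and after items with discrimination parameters outside the interval (0.30, 4.00) were removed, examinees' ability estimates correlated significantly with their standard scores. In the IRT analysis, the curves for the psychiatric group closely matched those for the unqualified group. The results indicate that, at essentially the same measurement precision, applying IRT can reduce the number of items administered, improve testing efficiency, and distinguish examinees' trait levels somewhat more precisely.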
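The item screening described in this abstract can be sketched as follows. This is a minimal illustration, not the study's procedure: the 2PL form without a scaling constant, the function names, and the item parameter values are all assumptions; only the interval (0.30, 4.00) comes from the abstract.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of endorsing an item: 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def screen_items(items, low=0.30, high=4.00):
    """Keep items whose discrimination a lies strictly inside (low, high)."""
    return {name: (a, b) for name, (a, b) in items.items() if low < a < high}


# Hypothetical (a, b) estimates; these are not values from the study.
items = {"item1": (0.15, -0.4), "item2": (1.10, 0.2), "item3": (4.60, 1.0)}
kept = screen_items(items)
print(sorted(kept))
```

Ability estimates computed from the retained items can then be compared with those from the full item set, which is the before-and-after comparison the abstract reports.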

10.
For testlet response data, traditional item response theory (IRT) models are often not appropriate due to local dependence among items within a common testlet. Several testlet‐based IRT models have been developed to model examinees' responses. In this paper, a new two‐parameter normal ogive testlet response theory (2PNOTRT) model for dichotomous items is proposed by introducing testlet discrimination parameters. A Bayesian model parameter estimation approach via a data augmentation scheme is developed. Simulations are conducted to evaluate the performance of the proposed 2PNOTRT model. The results indicated that the estimation of item parameters is satisfactory overall from the viewpoint of convergence. Finally, the proposed 2PNOTRT model is applied to a set of real testlet data.

11.
Equivalence of marginal likelihood of the two-parameter normal ogive model in item response theory (IRT) and factor analysis of dichotomized variables (FA) was formally proved. The basic result on the dichotomous variables was extended to multicategory cases, both ordered and unordered categorical data. Pair comparison data arising from multiple-judgment sampling were discussed as a special case of the unordered categorical data. A taxonomy of data for the IRT and FA models was also attempted. The work reported in this paper has been supported by Grant A6394 to the first author from the Natural Sciences and Engineering Research Council of Canada.
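The standard single-factor correspondence behind this kind of equivalence can be sketched as follows. The notation is an assumption made here, since the abstract gives none: the underlying continuous variable is y* = λθ + ε with θ ~ N(0, 1) and ε ~ N(0, 1 − λ²), and the observed item is y = 1 when y* > τ.

```latex
\begin{aligned}
P(y = 1 \mid \theta) &= P(\varepsilon > \tau - \lambda\theta)
  = \Phi\!\left(\frac{\lambda\theta - \tau}{\sqrt{1-\lambda^{2}}}\right)
  = \Phi\bigl(a(\theta - b)\bigr), \\
a &= \frac{\lambda}{\sqrt{1-\lambda^{2}}}, \qquad b = \frac{\tau}{\lambda}.
\end{aligned}
```

Under these assumptions the factor loading λ and threshold τ map one-to-one onto the discrimination a and difficulty b of the two-parameter normal ogive model.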

12.
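Ding Shuliang, Luo Fen, Dai Haiqi, Zhu Wei. Acta Psychologica Sinica (《心理学报》), 2007, 39(4): 730-736
Within the IRT framework, a unidimensional two-parameter logistic multiple-attempt, multiple-item (MAMI) test model is developed for 0-1 scoring. Compared with the multiple-attempt, single-item (MASI) model proposed by Spray, MAMI is a more refined model, has a wider range of application, and uses a different estimation method, obtaining item parameters with the EM algorithm. Monte Carlo simulation results show that, compared with increasing the number of test items correspondingly, the MAMI model gives the same precision for ability estimates but higher precision for item parameter estimates. Compared with increasing the number of examinees correspondingly, it gives the same precision for item parameter estimates but higher precision for ability estimates. These findings indicate that, under certain conditions, if answers may be revised and cumulative scoring is used, the precision of ability estimation can match that obtained by doubling the number of items even though the test length is unchanged, and the precision of item parameter estimation also improves. The findings are relevant not only to the assessment of skills and cognitive abilities but also to how the data are processed.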

13.
Differential item functioning (DIF) is a pernicious statistical issue that can mask true group differences on a target latent construct. A considerable amount of research has focused on evaluating methods for testing DIF, such as using likelihood ratio tests in item response theory (IRT). Most of this research has focused on the asymptotic properties of DIF testing, in part because many latent variable methods require large samples to obtain stable parameter estimates. Much less research has evaluated these methods in small sample sizes despite the fact that many social and behavioral scientists frequently encounter small samples in practice. In this article, we examine the extent to which model complexity (the number of model parameters estimated simultaneously) affects the recovery of DIF in small samples. We compare three models that vary in complexity: logistic regression with sum scores, the 1-parameter logistic IRT model, and the 2-parameter logistic IRT model. We expected that logistic regression with sum scores and the 1-parameter logistic IRT model would more accurately estimate DIF because these models yield more stable estimates despite being misspecified. Indeed, a simulation study and an empirical example of adolescent substance use show that, even when data are generated from (or assumed to follow) a 2-parameter logistic IRT model, using parsimonious models in small samples leads to more powerful tests of DIF while adequately controlling for Type I error. We also provide evidence for minimum sample sizes needed to detect DIF, and we evaluate whether applying corrections for multiple testing is advisable. Finally, we provide recommendations for applied researchers who conduct DIF analyses in small samples.

14.
In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic classification models (DCMs). DCMs are a newer class of psychometric models that are designed to classify examinees according to levels of categorical latent traits. We examined the invariance property for general DCMs using the log-linear cognitive diagnosis model (LCDM) framework. We conducted a simulation study to examine the degree to which theoretical invariance of LCDM classifications and item parameter estimates can be observed under various sample and test characteristics. Results illustrated that LCDM classifications and item parameter estimates show clear invariance when adequate model data fit is present. To demonstrate the implications of this important property, we conducted additional analyses to show that using pre-calibrated tests to classify examinees provided consistent classifications across calibration samples with varying mastery profile distributions and across tests with varying difficulties.

15.
The purpose of this paper is to introduce a new method for fitting item response theory models with the latent population distribution estimated from the data using splines. A spline-based density estimation system provides a flexible alternative to existing procedures that use a normal distribution, or a different functional form, for the population distribution. A simulation study shows that the new procedure is feasible in practice, and that when the latent distribution is not well approximated as normal, two-parameter logistic (2PL) item parameter estimates and expected a posteriori scores (EAPs) can be improved over what they would be with the normal model. An example with real data compares the new method and the extant empirical histogram approach.

16.
A method of estimating item response theory (IRT) equating coefficients by the common-examinee design with the assumption of the two-parameter logistic model is provided. The method uses the marginal maximum likelihood estimation, in which individual ability parameters in a common-examinee group are numerically integrated out. The abilities of the common examinees are assumed to follow a normal distribution but with an unknown mean and standard deviation on one of the two tests to be equated. The distribution parameters are jointly estimated with the equating coefficients. Further, the asymptotic standard errors of the estimates of the equating coefficients and the parameters for the ability distribution are given. Numerical examples are provided to show the accuracy of the method.
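To make the role of equating coefficients concrete, the sketch below applies the usual linear scale transformation for the two-parameter logistic model and checks that response probabilities are unchanged. It is an illustration under assumed notation (slope A and intercept B, hypothetical parameter values) and does not implement the marginal maximum likelihood estimation the abstract describes.

```python
import math


def p_2pl(theta, a, b):
    """2PL response probability: 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def to_target_scale(a, b, A, B):
    """Item parameters on the target scale: a / A and A * b + B."""
    return a / A, A * b + B


# Hypothetical equating coefficients and item parameters.
A, B = 1.25, -0.30
a_x, b_x, theta_x = 0.9, 0.4, 1.1
a_y, b_y = to_target_scale(a_x, b_x, A, B)
theta_y = A * theta_x + B  # ability on the target scale

# The two probabilities agree up to rounding, since a_y * (theta_y - b_y) = a_x * (theta_x - b_x).
print(p_2pl(theta_x, a_x, b_x), p_2pl(theta_y, a_y, b_y))
```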

17.
Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when applied with traditional IRT models. Additionally, a nonimputation direct likelihood analysis has been shown to be an effective tool for handling missing observations in clustered data settings. This study investigates the performance of six simple imputation methods, which have been found to be useful in other IRT contexts, versus a direct likelihood analysis, in multilevel data from educational settings. Multilevel item response data were simulated on the basis of two empirical data sets, and some of the item scores were deleted, such that they were missing either completely at random or simply at random. An explanatory IRT model was used for modeling the complete, incomplete, and imputed data sets. We showed that direct likelihood analysis of the incomplete data sets produced unbiased parameter estimates that were comparable to those from a complete data analysis. Multiple-imputation approaches of the two-way mean and corrected item mean substitution methods displayed varying degrees of effectiveness in imputing data that in turn could produce unbiased parameter estimates. The simple random imputation, adjusted random imputation, item means substitution, and regression imputation methods seemed to be less effective in imputing missing item scores in multilevel data settings.

18.
A Bayesian procedure to estimate the three-parameter normal ogive model and a generalization of the procedure to a model with multidimensional ability parameters are presented. The procedure is a generalization of a procedure by Albert (1992) for estimating the two-parameter normal ogive model. The procedure supports analyzing data from multiple populations and incomplete designs. It is shown that restrictions can be imposed on the factor matrix for testing specific hypotheses about the ability structure. The technique is illustrated using simulated and real data. The authors would like to thank Norman Verhelst for his valuable comments and ACT, CITO group and SweSAT for the use of their data.
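The data augmentation idea behind Albert's (1992) Gibbs sampler for the two-parameter normal ogive model can be sketched in one step: each binary response is paired with a latent normal variable whose sign matches the response. The code below shows only that single step, under assumed names and parameter values; it is not the three-parameter or multidimensional procedure of this paper, and the full sampler would also update the item and ability parameters.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def draw_latent(y, mean, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y == 1, else to z <= 0.

    Uses inverse-CDF sampling; intended for moderate values of `mean`.
    """
    cut = _STD_NORMAL.cdf(-mean)            # P(z <= 0) before truncation
    lo, hi = (cut, 1.0) if y == 1 else (0.0, cut)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)     # keep inv_cdf's argument inside (0, 1)
    return mean + _STD_NORMAL.inv_cdf(u)


rng = random.Random(0)
a, b, theta = 1.0, 0.2, 0.5                 # hypothetical current parameter values
z1 = draw_latent(1, a * theta - b, rng)     # latent value for a correct response
z0 = draw_latent(0, a * theta - b, rng)     # latent value for an incorrect response
print(z1, z0)
```

Conditional on the latent values, the remaining parameter updates reduce to normal linear-model draws, which is what makes the scheme convenient.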

19.
A nonlinear mixed model framework for item response theory   Total citations: 1 (self-citations: 0; citations by others: 1)
Mixed models take the dependency between observations based on the same cluster into account by introducing 1 or more random effects. Common item response theory (IRT) models introduce latent person variables to model the dependence between responses of the same participant. Assuming a distribution for the latent variables, these IRT models are formally equivalent to nonlinear mixed models. It is shown how a variety of IRT models can be formulated as particular instances of nonlinear mixed models. The unifying framework offers the advantage that relations between different IRT models become explicit and that it is rather straightforward to see how existing IRT models can be adapted and extended. The approach is illustrated with a self-report study on anger.
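As one concrete instance of this equivalence, the Rasch model can be written as a logistic mixed model with a random person intercept. The choice of the Rasch model and the notation (person p, item i, person variance σ²) are assumptions made here for illustration; the abstract does not single out a particular model.

```latex
\operatorname{logit} P(Y_{pi} = 1 \mid \theta_p) = \theta_p - \beta_i,
\qquad \theta_p \sim N(0, \sigma_\theta^{2})
```

Here the item difficulties β_i play the role of fixed effects and the latent person variable θ_p is the random effect shared by all responses of the same participant.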

20.
In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.
