首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Various different item response theory (IRT) models can be used in educational and psychological measurement to analyze test data. One of the major drawbacks of these models is that efficient parameter estimation can only be achieved with very large data sets. Therefore, it is often worthwhile to search for designs of the test data that in some way will optimize the parameter estimates. The results from the statistical theory on optimal design can be applied for efficient estimation of the parameters.A major problem in finding an optimal design for IRT models is that the designs are only optimal for a given set of parameters, that is, they are locally optimal. Locally optimal designs can be constructed with a sequential design procedure. In this paper minimax designs are proposed for IRT models to overcome the problem of local optimality. Minimax designs are compared to sequentially constructed designs for the two parameter logistic model and the results show that minimax design can be nearly as efficient as sequentially constructed designs.  相似文献   

2.
Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.  相似文献   

3.
应用项目反应理论对《中国士兵人格问卷》的项目分析   总被引:4,自引:0,他引:4  
采用项目反应理论(IRT)对《中国士兵人格问卷》进行项目分析。计算机呈现中国士兵人格问卷(CSPQ)对100,523名适龄男性青年进行测验,随机抽取2676名任一维度标准分均低于70的定为合格组;将任一维度大于70分并经专业人员访谈不合格的274名定为不合格组;从精神病院抽取男性年龄相当的221名缓解期精神分裂症患者定为精神病组,并完成CSPQ测验。运用基于IRT的双参数Logistic模型进行分析;结果发现,区分度参数超过区间(0.30,4.00)的条目删除前后,被试的能力值与标准分均存在显著相关;精神病组的测验分数经IRT分析,图形曲线与不合格组有高度吻合。研究结果说明,在测验精度基本相同的条件下,应用IRT可以减少施测条目,提高测验效率,可在一定程度上更精确地区分被试的特质水平  相似文献   

4.
It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, that may be defined at any level of an hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between the latent variables and dichotomous observed variables, which may be responses to tests or questionnaires. It will be shown that the multilevel model with measurement error in the observed predictor variables can be estimated in a Bayesian framework using Gibbs sampling. In this article, handling measurement error via the normal ogive model is compared with alternative approaches using the classical true score model. Examples using real data are given.This paper is part of the dissertation by Fox (2001) that won the 2002 Psychometric Society Dissertation Award.  相似文献   

5.
Item responses that do not fit an item response theory (IRT) model may cause the latent trait value to be inaccurately estimated. In the past two decades several statistics have been proposed that can be used to identify nonfitting item score patterns. These statistics all yieldscalar values. Here, the use of the person response function (PRF) for identifying nonfitting item score patterns was investigated. The PRF is afunction and can be used for diagnostic purposes. First, the PRF is defined in a class of IRT models that imply an invariant item ordering. Second, a person-fit method proposed by Trabin & Weiss (1983) is reformulated in a nonparametric IRT context assuming invariant item ordering, and statistical theory proposed by Rosenbaum (1987a) is adapted to test locally whether a PRF is nonincreasing. Third, a simulation study was conducted to compare the use of the PRF with the person-fit statistic ZU3. It is concluded that the PRF can be used as a diagnostic tool in person-fit research.The authors are grateful to Coen A. Bernaards for preparing the figures used in this article, and to Wilco H.M. Emons for checking the calculations.  相似文献   

6.
7.
An IRT model with a parameter-driven process for change is proposed. Quantitative differences between persons are taken into account by a continuous latent variable, as in common IRT models. In addition, qualitative interindividual differences and autodependencies are accounted for by assuming within-subject variability with respect to the parameters of the IRT model. In particular, the parameters of the IRT model are governed by an unobserved or “hidden'” homogeneous Markov process. The model includes the mixture linear logistic test model (Mislevy & Verhelst, 1990), the mixture Rasch model (Rost, 1990), and the Saltus model (Wilson, 1989) as specific instances. The model is applied to a longitudinal experiment on discontinuity in conservation acquisition (van der Maas, 1993). Frank Rijmen was supported by the Fund for Scientific Research Flanders (FWO), the GOA/2000/02 granted by the Katholieke Universiteit Leuven to Paul De Boeck and Iven Van Mechelen, and the PDM/02/067 granted by the Katholieke Universiteit Leuven to Paul De Boeck.  相似文献   

8.
Bayesian estimation of a multilevel IRT model using gibbs sampling   总被引:3,自引:0,他引:3  
In this article, a two-level regression model is imposed on the ability parameters in an item response theory (IRT) model. The advantage of using latent rather than observed scores as dependent variables of a multilevel model is that it offers the possibility of separating the influence of item difficulty and ability level and modeling response variation and measurement error. Another advantage is that, contrary to observed scores, latent scores are test-independent, which offers the possibility of using results from different tests in one analysis where the parameters of the IRT model and the multilevel model can be concurrently estimated. The two-parameter normal ogive model is used for the IRT measurement model. It will be shown that the parameters of the two-parameter normal ogive model and the multilevel model can be estimated in a Bayesian framework using Gibbs sampling. Examples using simulated and real data are given.  相似文献   

9.
EDITOR'S NOTE     
In the framework of a linear logistic testing model, Mislevy, Sheehan, and Wingersky (1993) showed how to incorporate collateral information in estimating item parameters required for test equating. The purpose of the study was to explore the feasibility of applying this method to equate tests constructed for college entrance examination by comparing its results with those of the item response theory (IRT) true-score equating. Overall, the equating results based on collateral information are relatively comparable with those of IRT equating. In terms of R2's, the prediction equations for item characteristics are good to excellent. The significant levels of correlation coefficients between IRT calibrated b (difficulty level) and predicted b parameters range from around .01 to .05. The goodness of fit of true-score test characteristic curves (TCCs) based on collateral information to IRT true-score TCCs are excellent. Results of the study are discussed in light of factors that may affect the validity of using collateral information in test equating.  相似文献   

10.
This paper proposes two unidimensional item response theory (IRT) models for analysing normative forced‐choice personality items. Both models are derived from a common theoretical framework and arise as a result of different assumptions regarding the mechanism of choice. The simplest mechanism gives rise to the one‐parameter normal‐ogive model. The second mechanism gives rise to a new IRT model, which is closely related to the Coombs–Zinnes probabilistic unfolding model. The second model is compared theoretically to the normal‐ogive model in terms of item characteristic curves and amount of item information. Next, procedures for estimating the respondent and the item parameters in the second model are described. Finally, both models are empirically compared by using two well‐known personality measures.  相似文献   

11.
刘红云  骆方 《心理学报》2008,40(1):92-100
作者简要介绍了多水平项目反应模型,对多水平项目反应理论与通常项目反应理论之间的关系进行了探讨,得到了多水平项目反应模型参数与通常项目反应模型参数之间的关系,并讨论了多水平项目反应模型的推广模型。通过一个实际例子,用多水平项目反应模型对测验中项目的特征进行分析;检验个体水平和组水平预测变量对能力参数的影响;对项目功能差异进行分析。最后文章就多水平项目反应理论模型的优势与不足进行了讨论  相似文献   

12.
Information entropy and mutual information were investigated in discrete movement aiming tasks over a wide range of spatial (20-160 mm) and temporal (250-1250 ms) constraints. Information entropy was calculated using two distinct analyses: (1) with no assumption on the nature of the data distribution; and (2) assuming the data have a normal distribution. The two analyses showed different results in the estimate of entropy that also changed as a function of task goals, indicating that the movement trajectory data were not from a normal distribution. It was also found that the information entropy of the discrete aiming movements was lower than the task defined indices of difficulty (ID) that were selected for the congruence with Fitts' law. Mutual information between time points of the trajectory was strongly influenced by the average movement velocity and the acceleration/deceleration segments of the movement. The entropy analysis revealed structure to the variability of the movement trajectory and outcome that has been masked by the traditional distributional analyses of discrete aiming movements.  相似文献   

13.
In Item Response Theory (IRT), item characteristic curves (ICCs) are illustrated through logistic models or normal ogive models, and the probability that examinees give the correct answer is usually a monotonically increasing function of their ability parameters. However, since only limited patterns of shapes can be obtained from logistic models or normal ogive models, there is a possibility that the model applied does not fit the data. As a result, the existing method can be rejected because it cannot deal with various item response patterns. To overcome these problems, we propose a new semiparametric IRT model using a Dirichlet process mixture logistic distribution. Our method does not rely on assumptions but only requires that the ICCs be a monotonically nondecreasing function; that is, our method can deal with more types of item response patterns than the existing methods, such as the one-parameter normal ogive models or the two- or three-parameter logistic models.  相似文献   

14.
Abstract

Differential item functioning (DIF) is a pernicious statistical issue that can mask true group differences on a target latent construct. A considerable amount of research has focused on evaluating methods for testing DIF, such as using likelihood ratio tests in item response theory (IRT). Most of this research has focused on the asymptotic properties of DIF testing, in part because many latent variable methods require large samples to obtain stable parameter estimates. Much less research has evaluated these methods in small sample sizes despite the fact that many social and behavioral scientists frequently encounter small samples in practice. In this article, we examine the extent to which model complexity—the number of model parameters estimated simultaneously—affects the recovery of DIF in small samples. We compare three models that vary in complexity: logistic regression with sum scores, the 1-parameter logistic IRT model, and the 2-parameter logistic IRT model. We expected that logistic regression with sum scores and the 1-parameter logistic IRT model would more accurately estimate DIF because these models yielded more stable estimates despite being misspecified. Indeed, a simulation study and empirical example of adolescent substance use show that, even when data are generated from / assumed to be a 2-parameter logistic IRT, using parsimonious models in small samples leads to more powerful tests of DIF while adequately controlling for Type I error. We also provide evidence for minimum sample sizes needed to detect DIF, and we evaluate whether applying corrections for multiple testing is advisable. Finally, we provide recommendations for applied researchers who conduct DIF analyses in small samples.  相似文献   

15.
本文首次提出使用广义线性混合模型(Generalized Linear Mixed Model, GLMM)对概化理论(GT)和项目反应理论(IRT)进行统合,即在一次统计中就能同时获得GT和IRT所需要的估计结果。模拟研究结果显示:相比于传统的GT方差分量估计方法——期望均值平方(Expected Mean Squares, EMS),GLMM可以获得更准确的方差分量、G系数和Φ系数,而且GLMM获得的题目难度参数估计精度优于传统Rasch模型。实证研究展示GLMM在实际心理测量数据分析中的应用。  相似文献   

16.
多维项目反应理论等级反应模型   总被引:2,自引:0,他引:2  
杜文久  肖涵敏 《心理学报》2012,44(10):1402-1407
基于因子分析和单维项目反应理论的多维项目反应理论是测量理论的新发展方向之一。但是, 多维项目反应理论仍处于不成熟的发展阶段, 多数研究也只是以二级评分为主。本文首先介绍了逻辑斯蒂形式的多维等级反应模型, 并以二维等级反应模型为例, 分析了模型的数学函数图像及其性质。然后, 推导出了多维等级反应模型的项目信息函数, 并结合实例进行了讨论。进一步地, 本文阐述了使用联合极大似然估计和马尔科夫链蒙特卡洛方法估计多维等级反应模型参数的思想。最后, 指出了一些有待研究的问题。  相似文献   

17.
ABSTRACT Correlational and factor‐analytic methods indicate that abnormal and normal personality constructs may be tapping the same underlying latent trait. However, they do not systematically demonstrate that measures of abnormal personality capture more extreme ranges of the latent trait than measures of normal range personality. Item Response Theory (IRT) methods, in contrast, do provide this information. In the present study, we use IRT methods to evaluate the range of the latent trait assessed with a normal personality measure and a measure of psychopathy as one example of an abnormal personality construct. Contrary to the expectation that the measure of psychopathy would be more extreme than the measure of normal personality traits, the measures overlapped substantially in terms of the regions of the latent trait for which they provide information. Moreover, both types of inventories were limited in terms of measurement bandwidth, such that they did not provide information across the entire latent trait continuum. Implications and future directions are discussed.  相似文献   

18.
Item response theory (IRT) models are the central tools in modern measurement and advanced psychometrics. We offer a MATLAB IRT modeling (IRTm) toolbox that is freely available and that follows an explicit design matrix approach, giving the end user control and flexibility in building a model that goes beyond standard models, such as the Rasch model (Rasch, 1960) and the two-parameter logistic model. As such, IRTm allows for a large variety of unidimensional IRT models for binary responses, the incorporation of additional person and item information, and deviations from common model assumptions. An exclusive key feature of the toolbox is the inclusion of copula IRT models to handle local item dependencies. Two appendixes for this report, containing example code and information on the general copula IRT in IRTm, may be downloaded from brm.psychonomic-journals.org/content/supplemental.  相似文献   

19.
Woods CM 《心理学方法》2006,11(3):253-270
Popular methods for fitting unidimensional item response theory (IRT) models to data assume that the latent variable is normally distributed in the population of respondents, but this can be unreasonable for some variables. Ramsay-curve IRT (RC-IRT) was developed to detect and correct for this nonnormality. The primary aims of this article are to introduce RC-IRT less technically than it has been described elsewhere; to evaluate RC-IRT for ordinal data via simulation, including new approaches for model selection; and to illustrate RC-IRT with empirical examples. The empirical examples demonstrate the utility of RC-IRT for real data, and the simulation study indicates that when the latent distribution is skewed, RC-IRT results can be more accurate than those based on the normal model. Along with a plot of candidate curves, the Hannan-Quinn criterion is recommended for model selection.  相似文献   

20.
This paper first discusses the relationship between Kullback–Leibler information (KL) and Fisher information in the context of multi-dimensional item response theory and is further interpreted for the two-dimensional case, from a geometric perspective. This explication should allow for a better understanding of the various item selection methods in multi-dimensional adaptive tests (MAT) which are based on these two information measures. The KL information index (KI) method is then discussed and two theorems are derived to quantify the relationship between KI and item parameters. Due to the fact that most of the existing item selection algorithms for MAT bear severe computational complexity, which substantially lowers the applicability of MAT, two versions of simplified KL index (SKI), built from the analytical results, are proposed to mimic the behavior of KI, while reducing the overall computational intensity.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号