期刊界 All Journals

1.

Robust Measurement via A Fused Latent and Graphical Item Response Theory Model

Yunxiao Chen Xiaoou Li Jingchen Liu Zhiliang Ying 《Psychometrika》2018,83(3):538-562

Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike classical test theory, IRT models aggregate the item level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption not likely to be satisfied in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach by adding a graphical component to a multidimensional IRT model that can offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for the parameter estimation and structure learning of the local dependence. This approach can substantially improve the measurement, given no prior information on the local dependence structure. The model can be applied to measure both a unidimensional latent trait and multidimensional latent traits.

2.

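A Review of the Development and Applications of Multilevel IRT

Liu Hui Jian Xiaozhu Zhang Minqiang Xiong Yuexin 《Advances in Psychological Science》2012,20(4):627-632

Hierarchical linear modeling is an advanced statistical method for data with a hierarchical structure, and item response theory is a modern measurement theory for precisely measuring examinees' abilities. Multilevel item response theory combines the two by nesting an item response model within a hierarchical linear model. This allows item parameters and ability parameters at different levels to be estimated, and it yields more precise estimates of regression coefficients and error variances. The authors outline the development of multilevel item response theory, review its applications in psychological and educational measurement with respect to differential item functioning, test equating, and school effectiveness research, summarize its value, and discuss future research trends.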

3.

Joint Maximum Likelihood Estimation for High-Dimensional Exploratory Item Factor Analysis

Chen Yunxiao Li Xiaoou Zhang Siliang 《Psychometrika》2019,84(1):124-146

Joint maximum likelihood (JML) estimation is one of the earliest approaches to fitting item response theory (IRT) models. This procedure treats both the item and person parameters as unknown but fixed model parameters and estimates them simultaneously by solving an optimization problem. However, the JML estimator is known to be asymptotically inconsistent for many IRT models when the sample size goes to infinity and the number of items is kept fixed. Consequently, in the psychometrics literature, this estimator is less preferred than the marginal maximum likelihood (MML) estimator. In this paper, we re-investigate the JML estimator for high-dimensional exploratory item factor analysis, from both statistical and computational perspectives. In particular, we establish a notion of statistical consistency for a constrained JML estimator, under an asymptotic setting in which both the numbers of items and people grow to infinity and many responses may be missing. A parallel computing algorithm is proposed for this estimator that can scale to very large datasets. Via simulation studies, we show that when the dimensionality is high, the proposed estimator yields similar or even better results than those from the MML estimator, but can be obtained computationally much more efficiently. An illustrative real data example is provided based on the revised version of Eysenck’s Personality Questionnaire (EPQ-R).


4.

Assessing Item Fit for Unidimensional Item Response Theory Models Using Residuals from Estimated Item Response Functions

Shelby J. Haberman Sandip Sinharay Kyong Hee Chon 《Psychometrika》2013,78(3):417-440

Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

5.

Log-Multiplicative Association Models as Item Response Models

Carolyn J. Anderson Hsiu-Ting Yu 《Psychometrika》2007,72(1):5-23

Log-multiplicative association (LMA) models, which are special cases of log-linear models, have interpretations in terms of latent continuous variables. Two theoretical derivations of LMA models based on item response theory (IRT) arguments are presented. First, we show that Anderson and colleagues (Anderson & Vermunt, 2000; Anderson & Böckenholt, 2000; Anderson, 2002), who derived LMA models from statistical graphical models, made assumptions equivalent to those of Holland (1990) when deriving models for the manifest probabilities of response patterns based on an IRT approach. We also present a second derivation of LMA models where item response functions are specified as functions of rest-scores. These various connections provide insights into the behavior of LMA models as item response models and point out philosophical issues with the use of LMA models as item response models. We show that even for short tests, LMA and standard IRT models yield very similar to nearly identical results when data arise from standard IRT models. Log-multiplicative association models can be used as item response models and do not require numerical integration for estimation.

6.

Specifying Ability Growth Models Using a Multidimensional Item Response Model for Repeated Measures Categorical Ordinal Item Response Data

Insu Paek Zhen Li Hyun-Jeong Park 《Multivariate behavioral research》2016,51(4):569-580

When categorical ordinal item response data are collected over multiple timepoints from a repeated measures design, an item response theory (IRT) modeling approach whose unit of analysis is an item response is suitable. This study proposes a few longitudinal IRT models and illustrates how a popular compensatory multidimensional IRT model can be utilized to formulate such longitudinal IRT models, which permits an investigation of ability growth at both individual and population levels. The equivalence of an existing multidimensional IRT model and those longitudinal IRT models is also elaborated so that one can make use of an existing multidimensional IRT model to implement the longitudinal IRT models.

7.

The Assessment of Dimensionality for Use in Item Response Theory

《Multivariate behavioral research》2013,48(4):765-792

The application of item response theory (IRT) models requires the identification of the data's dimensionality. A popular method for determining the number of latent dimensions is the factor analysis of a correlation matrix. Unlike factor analysis, which is based on a linear model, IRT assumes a nonlinear relationship between item performance and ability. Because multidimensional scaling (MDS) assumes a monotonic relationship, this method may be useful for the assessment of a data set's dimensionality for use with IRT models. This study compared MDS, exploratory and confirmatory factor analysis (EFA and CFA, respectively) in the assessment of the dimensionality of data sets which had been generated to be either one- or two-dimensional. In addition, the data sets differed in the degree of interdimensional correlation and in the number of items defining a dimension. Results showed that MDS and CFA were able to correctly identify the number of latent dimensions for all data sets. In general, EFA was able to correctly identify the data's dimensionality, except for data whose interdimensional correlation was high.

8.

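Application of Multilevel Item Response Theory Models in Test Development

Liu Hongyun Luo Fang 《Acta Psychologica Sinica》2008,40(1):92-100

The authors briefly introduce the multilevel item response model, examine the relationship between multilevel item response theory and conventional item response theory, derive the relationship between the parameters of the multilevel item response model and those of the conventional item response model, and discuss generalizations of the multilevel item response model. Using a real-data example, the multilevel item response model is applied to analyze the characteristics of the test items, to test the effects of individual-level and group-level predictors on the ability parameter, and to analyze differential item functioning. Finally, the strengths and limitations of multilevel item response theory models are discussed.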

9.

Testing Differential Item Functioning in Small Samples

William C. M. Belzak 《Multivariate behavioral research》2020,55(5):722-747

Abstract

Differential item functioning (DIF) is a pernicious statistical issue that can mask true group differences on a target latent construct. A considerable amount of research has focused on evaluating methods for testing DIF, such as using likelihood ratio tests in item response theory (IRT). Most of this research has focused on the asymptotic properties of DIF testing, in part because many latent variable methods require large samples to obtain stable parameter estimates. Much less research has evaluated these methods in small sample sizes despite the fact that many social and behavioral scientists frequently encounter small samples in practice. In this article, we examine the extent to which model complexity—the number of model parameters estimated simultaneously—affects the recovery of DIF in small samples. We compare three models that vary in complexity: logistic regression with sum scores, the 1-parameter logistic IRT model, and the 2-parameter logistic IRT model. We expected that logistic regression with sum scores and the 1-parameter logistic IRT model would more accurately estimate DIF because these models yielded more stable estimates despite being misspecified. Indeed, a simulation study and empirical example of adolescent substance use show that, even when data are generated from / assumed to be a 2-parameter logistic IRT, using parsimonious models in small samples leads to more powerful tests of DIF while adequately controlling for Type I error. We also provide evidence for minimum sample sizes needed to detect DIF, and we evaluate whether applying corrections for multiple testing is advisable. Finally, we provide recommendations for applied researchers who conduct DIF analyses in small samples.
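
As a rough illustration of two of the models compared in this abstract, the sketch below writes out the 1-parameter and 2-parameter logistic item response functions. The function names irf_1pl and irf_2pl and the example parameter values are assumptions made for illustration; they do not come from the article.

```python
import math


def irf_2pl(theta, a, b):
    """2-parameter logistic model: probability of endorsing an item with
    discrimination a and difficulty b at latent trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def irf_1pl(theta, b):
    """1-parameter logistic model: the 2PL with discrimination fixed at 1."""
    return irf_2pl(theta, 1.0, b)


# Hypothetical item with difficulty 0.5 and discrimination 1.7.
print(irf_1pl(0.5, 0.5))       # theta equals b, so the probability is one half
print(irf_2pl(1.5, 1.7, 0.5))  # above b, the steeper 2PL curve lies higher
```

The 1PL estimates one parameter per item and the 2PL estimates two, which is the sense of model complexity the abstract refers to.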

10.

Nonparametric item response theory axioms and properties under nonlinearity and their exemplification with knowledge space theory

Ali Ünlü 《Journal of mathematical psychology》2007,51(6):383-400

This paper investigates the dichotomous Mokken nonparametric item response theory (IRT) axioms and properties under incomparabilities among latent trait values and items. Generalized equivalents of the unidimensional nonparametric IRT axioms and properties are formulated for nonlinear (quasi-ordered) person and indicator spaces. It is shown that monotone likelihood ratio (MLR) for the total score variable and nonlinear latent trait implies stochastic ordering (SO) of the total score variable, but may fail to imply SO of the nonlinear latent trait. The reason for this and conditions under which the implication holds are specified, based on a new, simpler proof of the fact that in the unidimensional case MLR implies SO. The approach is applied in knowledge space theory (KST), a combinatorial test theory. This leads to a (tentative) Mokken-type nonparametric axiomatization in the currently parametric theory of knowledge spaces. The nonparametric axiomatization is compared with the assumptions of the parametric basic local independence model which is fundamental in KST. It is concluded that this paper may provide a first step toward a basis for a possible fusion of the two split directions of psychological test theories IRT and KST.

11.

Model Selection of Nested and Non-Nested Item Response Models Using Vuong Tests

Lennart Schneider R. Philip Chalmers Rudolf Debelak Edgar C. Merkle 《Multivariate behavioral research》2020,55(5):664-684

Abstract

In this paper, we apply Vuong’s general approach of model selection to the comparison of nested and non-nested unidimensional and multidimensional item response theory (IRT) models. Vuong’s approach of model selection is useful because it allows for formal statistical tests of both nested and non-nested models. However, only the test of non-nested models has been applied in the context of IRT models to date. After summarizing the statistical theory underlying the tests, we investigate the performance of all three distinct Vuong tests in the context of IRT models using simulation studies and real data. In the non-nested case we observed that the tests can reliably distinguish between the graded response model and the generalized partial credit model. In the nested case, we observed that the tests typically perform as well as or sometimes better than the traditional likelihood ratio test. Based on these results, we argue that Vuong’s approach provides a useful set of tools for researchers and practitioners to effectively compare competing nested and non-nested IRT models.

12.

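Mixture IRT Latent Models and the Trajectory of Their Applications

Wang Xia Tan Guohua Wang Xu Zhang Minqiang Luo Cong 《Advances in Psychological Science》2014,22(3):540-548

Item response theory is a modern measurement theory for measuring examinees' latent traits, and latent class analysis is a model-based technique for classifying latent traits. Mixture item response theory combines item response theory with latent class analysis and can simultaneously classify examinees and quantify their latent traits. After explaining the concepts and principles of mixture item response theory, the paper introduces several common mixture models, including MRM, mNRM, and mPCM, together with their parameter estimation methods. It then reviews the development of their applications in psychological testing with respect to the classification of psychological and behavioral characteristics, the detection of differential item functioning, and the evaluation of test validity.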

13.

The Spiritual Transcendence Index: An Item Response Theory Analysis

Alexis D. Abernethy Seong-Hyeon Kim 《The International journal for the psychology of religion》2013,23(4):240-256

ABSTRACT

In an attempt to measure understudied dimensions of spirituality, recent efforts have focused on the transcendent dimension of spirituality. The Spiritual Transcendence Index (STI) was developed to assess a perceived experience of the sacred that affects one’s ability to transcend life’s difficulties. The main focus of the current study was to investigate the psychometric properties of the STI by utilizing the microscopic item-level examination tools unique to item response theory (IRT), as well as its scale-level exploration devices for psychometric properties of an assessment measure. IRT analyses were conducted to investigate the STI’s psychometric properties across samples (N = 712) including how well the measure assesses the latent construct, spiritual transcendence, from the low to high range of the construct. The findings confirm that the 8-item index is a single factor that assesses the latent construct, spiritual transcendence. Instead of the original 6-category version, these findings support a 4-category response version; the 3 categories of disagreement may be collapsed into a single category. These findings not only inform the refinement of the STI but also highlight an important psychometric approach for the refinement of spirituality/religiousness measures, especially those with ceiling effects.

14.

Item Characteristic Curve Estimation of Signal Detection Theory-Based Personality Data: A Two-Stage Approach to Item Response Modeling

《International Journal of Testing》2013,13(2):189-213

Signal Detection Theory (SDT; MacMillan & Creelman, 1991) is a method of data collection that has been used for several years, which describes the decision-making strategies of individuals. However, its use has been largely restricted to experiments involving sensation and perception. The Overclaiming Questionnaire (OCQ; Paulhus & Bruce, 1990) is a scale that has been developed to measure intellectual ability and personality, using SDT as a guideline. Although the scale has been successful in measuring human characteristics such as narcissism and intelligence, it is still unclear how to measure the characteristics of the various stimuli used (e.g., item difficulty, item discrimination, etc.). In some ways, this is a direct consequence of the general lack of research involved in item parameter estimation in the field of SDT. Using the OCQ, this article presents a graphical and nonparametric form of item response modeling to address this issue. In many ways, the approach is influenced by and structured around item response theory (IRT; Hambleton, Swaminathan, & Rogers, 1991). The general features of both SDT and IRT are described. Results suggest that this method is indeed a reasonable approach to describing item functioning, and there are several advantages to using this method over traditional IRT methods. Furthermore, SDT appears to be a fruitful approach to assessing intelligence, ability, and other psychological constructs, with advantages over traditional approaches. Overall, the results provide interesting implications for item selection and test development in several scientific and academic fields.

15.

The person response function as a tool in person-fit research

Klaas Sijtsma Rob R. Meijer 《Psychometrika》2001,66(2):191-207

Item responses that do not fit an item response theory (IRT) model may cause the latent trait value to be inaccurately estimated. In the past two decades several statistics have been proposed that can be used to identify nonfitting item score patterns. These statistics all yield scalar values. Here, the use of the person response function (PRF) for identifying nonfitting item score patterns was investigated. The PRF is a function and can be used for diagnostic purposes. First, the PRF is defined in a class of IRT models that imply an invariant item ordering. Second, a person-fit method proposed by Trabin & Weiss (1983) is reformulated in a nonparametric IRT context assuming invariant item ordering, and statistical theory proposed by Rosenbaum (1987a) is adapted to test locally whether a PRF is nonincreasing. Third, a simulation study was conducted to compare the use of the PRF with the person-fit statistic ZU3. It is concluded that the PRF can be used as a diagnostic tool in person-fit research. The authors are grateful to Coen A. Bernaards for preparing the figures used in this article, and to Wilco H.M. Emons for checking the calculations.

16.

Analyzing Longitudinal Item Response Data via the Pairwise Fitting Method

Zhi-Hui Fu Jian Tao Ning-Zhong Shi Ming Zhang Nan Lin 《Multivariate behavioral research》2013,48(4):669-690

Multidimensional item response theory (MIRT) models can be applied to longitudinal educational surveys where a group of individuals are administered different tests over time with some common items. However, computational problems typically arise as the dimension of the latent variables increases. This is especially true when the latent variable distribution cannot be integrated out analytically, as with MIRT models for binary data. In this article, based on the pseudolikelihood theory, we propose a pairwise modeling strategy to estimate item and population parameters in longitudinal studies. Our pairwise method effectively reduces the dimensionality of the problem and hence is applicable to longitudinal IRT data with high-dimensional latent variables, which are challenging for classical methods. And in the low-dimensional case, our simulation study shows that it performs comparably with the classical methods. We further illustrate the implementation of the pairwise method using a development study of mathematics levels of junior high school students in which the response data are collected from 65 classes of 8 schools from 4 different school districts in China.

17.

A nonlinear mixed model framework for item response theory

Rijmen F Tuerlinckx F De Boeck P Kuppens P 《Psychological Methods》2003,8(2):185-205

Mixed models take the dependency between observations based on the same cluster into account by introducing 1 or more random effects. Common item response theory (IRT) models introduce latent person variables to model the dependence between responses of the same participant. Assuming a distribution for the latent variables, these IRT models are formally equivalent with nonlinear mixed models. It is shown how a variety of IRT models can be formulated as particular instances of nonlinear mixed models. The unifying framework offers the advantage that relations between different IRT models become explicit and that it is rather straightforward to see how existing IRT models can be adapted and extended. The approach is illustrated with a self-report study on anger.
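
A minimal worked instance of the equivalence this abstract describes, assuming the Rasch model with a normally distributed person parameter. The symbols Y_{pi} (response of person p to item i), \theta_p, \beta_i, and \sigma^2 are notation chosen here for illustration, not taken from the article.

```latex
\operatorname{logit} \Pr(Y_{pi} = 1 \mid \theta_p) = \theta_p - \beta_i,
\qquad \theta_p \sim N(0, \sigma^2)
```

In this form \theta_p acts as a random intercept for person p and the item difficulties \beta_i act as fixed effects, so the model reads as a logistic mixed model with responses clustered within persons.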

18.

Bayes Factors for Evaluating Latent Monotonicity in Polytomous Item Response Theory Models

Tijmstra Jesper Bolsinova Maria 《Psychometrika》2019,84(3):846-869

The assumption of latent monotonicity is made by all common parametric and nonparametric polytomous item response theory models and is crucial for establishing an ordinal level of measurement of the item score. Three forms of latent monotonicity can be distinguished: monotonicity of the cumulative probabilities, of the continuation ratios, and of the adjacent-category ratios. Observable consequences of these different forms of latent monotonicity are derived, and Bayes factor methods for testing these consequences are proposed. These methods allow for the quantification of the evidence both in favor and against the tested property. Both item-level and category-level Bayes factors are considered, and their performance is evaluated using a simulation study. The methods are applied to an empirical example consisting of a 10-item Likert scale to investigate whether a polytomous item scoring rule results in item scores that are of ordinal level measurement.
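
The three forms of latent monotonicity named in this abstract can be written out as follows. This is a sketch under the usual textbook definitions, which are assumed here rather than quoted from the article; X_i denotes the score on item i with categories 0, ..., m, and \theta the latent trait.

```latex
\begin{align*}
\text{cumulative probability:} \quad
  & \Pr(X_i \ge x \mid \theta) \\
\text{continuation ratio:} \quad
  & \Pr(X_i \ge x \mid X_i \ge x - 1,\ \theta) \\
\text{adjacent-category ratio:} \quad
  & \Pr(X_i = x \mid X_i \in \{x - 1, x\},\ \theta)
\end{align*}
```

Each form requires the corresponding expression to be nondecreasing in \theta for every x = 1, ..., m.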


19.

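Testlet Response Theory and Its Application to a Secondary School English Examination

Tian Wenna Zhang Minqiang Hu Xiaotian Liang Shuyi Zhang Nannan Huang Muhui 《Psychological Exploration》2014,34(5):441-445

Testlets are a common item format in many tests. Because the items within a testlet depend on one another to some degree, they violate the local independence assumption, and estimating parameters with an item response model can produce substantial bias. Testlet response theory incorporates the interaction between examinee and testlet into the model, which addresses the dependence among items. The authors review the development, basic principles, and related research of testlet response theory and apply it to a secondary school English examination. A comparison with item response theory shows that (1) the parameter estimates from the testlet response model and the item response model are strongly correlated, especially for the ability and difficulty parameters; and (2) the confidence intervals from the testlet response model are narrower than those from the item response model for every parameter, that is, the testlet response model estimates the parameters more precisely.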

20.

Constant latent odds-ratios models and the mantel-haenszel null hypothesis

David J. Hessen 《Psychometrika》2005,70(3):497-516

In the present paper, a new family of item response theory (IRT) models for dichotomous item scores is proposed. Two basic assumptions define the most general model of this family. The first assumption is local independence of the item scores given a unidimensional latent trait. The second assumption is that the odds-ratios for all item-pairs are constant functions of the latent trait. Since the latter assumption is characteristic of the whole family, the models are called constant latent odds-ratios (CLORs) models. One nonparametric special case and three parametric special cases of the general CLORs model are shown to be generalizations of the one-parameter logistic Rasch model. For all CLORs models, the total score (the unweighted sum of the item scores) is shown to be a sufficient statistic for the latent trait. In addition, conditions under the general CLORs model are studied for the investigation of differential item functioning (DIF) by means of the Mantel-Haenszel procedure. This research was supported by the Dutch Organization for Scientific Research (NWO), grant number 400-20-026.
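
The sufficiency property mentioned in this abstract can be checked numerically for the Rasch model, the simplest member of the family. The sketch below is an illustration only: the function names and the three item difficulties are hypothetical, and it covers the Rasch special case, not the general CLORs model. It computes the probability of a response pattern conditional on its total score and shows that the result does not change with the latent trait.

```python
import itertools
import math


def rasch_prob(theta, b):
    """Rasch probability of a positive response to an item with difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def pattern_prob(pattern, theta, difficulties):
    """Probability of a full 0/1 response pattern under local independence."""
    prob = 1.0
    for x, b in zip(pattern, difficulties):
        p = rasch_prob(theta, b)
        prob *= p if x == 1 else 1.0 - p
    return prob


def conditional_given_total(pattern, theta, difficulties):
    """Probability of the pattern given its total score (sum of item scores)."""
    total = sum(pattern)
    same_total = (
        q for q in itertools.product((0, 1), repeat=len(difficulties))
        if sum(q) == total
    )
    denom = sum(pattern_prob(q, theta, difficulties) for q in same_total)
    return pattern_prob(pattern, theta, difficulties) / denom


# Hypothetical difficulties for a three-item test.
difficulties = [-1.0, 0.0, 1.5]
low = conditional_given_total((1, 0, 1), -1.0, difficulties)
high = conditional_given_total((1, 0, 1), 2.0, difficulties)
print(abs(low - high) < 1e-12)  # the conditional probability is free of theta
```

Because the conditional distribution of the pattern given the total score does not involve the latent trait, the total score carries all the information the responses hold about it.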